Configuring a PostgreSQL DB with your Dockerized FastAPI App

undraw svg icon

Welcome to Part 2 of Up and Running with FastAPI. If you missed part 1, you can find it here.

This series is focused on building a full-stack application with the FastAPI framework. The app allows users to post requests to have their residence cleaned, and other users can select a cleaning project for a given hourly rate.

Last time we left off, we dockerized our fastapi backend and setup a router with a single dummy endpoint. The goal of this post will be to implement a data persistence layer. Our application will be powered by the de-facto open-source, relational database - PostgreSQL.

We'll spend the entirety of this post setting up our database, configuring our db environment, and getting migrations in place. In the next post, we'll flesh out our router with the appropriate endpoints, and test our endpoints using Pytest and Docker.

There is a LOT of setup that needs to happen before we do any serious development, so be warned. This part can be painful.

Setting up PostgreSQL with Docker

Anyone who's set up postgres on their machine knows that it can be a pain in the ass at times. What makes Docker so great is how seamlessly we can integrate any database into our currently running application.

We'll begin by configuring docker to create a container housing our postgres instance. Update your docker-compose.yml file to look like this:

docker-compose.yml
version: '3.7'
services:
server:
build:
context: ./backend
dockerfile: Dockerfile
volumes:
- ./backend/:/backend/
- /var/run/docker.sock:/var/run/docker.sock
command: uvicorn app.api.server:app --reload --workers 1 --host 0.0.0.0 --port 8000
env-file:
- ./backend/.env
ports:
- 8000:8000
depends_on:
- db
db:
image: postgres:12.1-alpine
volumes:
- postgres_data:/var/lib/postgresql/data/
env_file:
- ./backend/.env
ports:
- 5432:5432
volumes:
postgres_data:

A few things going on here. We've added a new service called db and we've pulled down the standard postgres 12.1-alpine image that comes prepackaged with everything we need for our postgres database.

We're also adding a postgres_data volume to our container. Volumes exist so that our containers can maintain state between runs. We wouldn't want to drop our database each time we killed our running container.

We'll need a few additional packages to interact with our database, setup tables, and run migrations. We'll need to add each of those to our requirements.txt file like so

requirements.txt
# app
fastapi==0.55.1
uvicorn==0.11.3
pydantic==1.4
email-validator==1.1.1
# db
databases[postgresql]==0.3.1
SQLAlchemy==1.3.16
alembic==1.4.2

We've now added new dependencies, so we'll need to rebuild our containers. Once we do, we'll be able to make full use of these libraries:

  • databases - An async interface to a number of popular databases.
  • SQLAlchemy - The most well known SQL toolkit for Python. Many apps employ this package as an ORM, but we'll writing our database queries in raw SQL, so we'll use it simply for managing database tables.
  • alembic - A lightweight database migration tool that works well with SQLAlchemy.

All that's left now is to build our new containers. Execute the following command:

docker-compose up --build

Once the build process is finished, you should see your container running with a nice warning about no password being set for the database. Take note of that - we're going to come back and fix it later on.

In another tab, run docker ps to see a list of running Docker processes. Do you see two containers - one for your server and one for your database? If so, congrats. You've got your very own PostgreSQL database.

FastAPI Config

It's best practice in any application to store sensitive information as environment variables, and make sure they're not checked into git. FastAPI and the underlying framework it's built on - Starlette - make this relatively straightforward. First, head into the .gitignore file we made earlier and add the following:

.gitignore
# Byte-compiled files
__pycache__/
# Environment files
.env

Then go ahead and initialize git, and make the first commit.

git init
git add .
git commit -m "Dockerized FastAPI app with postgres."

With that out of the way, we can start adding config data to our .env file.

SECRET_KEY=supersecret
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_SERVER=db
POSTGRES_PORT=5432
POSTGRES_DB=postgres

Each time we clone this repo, this file will have to be recreated, so it's often a good idea to make a .env.template file with the names (NOT VALUES) of the environment variables the repo requires.

We're using default for postgres user, password, and db - but don't worry. We'll update those later as well.

Now create a config.py file in the core directory.

touch backend/app/core/config.py

Inside that file, add the following:

core/config.py
from databases import DatabaseURL
from starlette.config import Config
from starlette.datastructures import Secret
config = Config(".env")
PROJECT_NAME = "phresh"
VERSION = "1.0.0"
API_PREFIX = "/api"
SECRET_KEY = config("SECRET_KEY", cast=Secret, default="CHANGEME")
POSTGRES_USER = config("POSTGRES_USER", cast=str)
POSTGRES_PASSWORD = config("POSTGRES_PASSWORD", cast=Secret)
POSTGRES_SERVER = config("POSTGRES_SERVER", cast=str, default="db")
POSTGRES_PORT = config("POSTGRES_PORT", cast=str, default="5432")
POSTGRES_DB = config("POSTGRES_DB", cast=str)
DATABASE_URL = config(
"DATABASE_URL",
cast=DatabaseURL,
default=f"postgresql://{POSTGRES_USER}:{POSTGRES_PASSWORD}@{POSTGRES_SERVER}:{POSTGRES_PORT}/{POSTGRES_DB}"
)

What we've done here is compose a number of environment variables into a single database connection string that we'll use to link our FastAPI app to our postgres db. Starlette provides a Config object that allows us to specify a file - .env - to look for environment variables in. We then create some constants that we'll use accross our app, cast them to the appropriate data types, and give defaults where we might want them.

NOTE: Any config variable that does not have a default MUST be provided a value in the .env file or an error will be thrown.

As we develop this backend, we'll revisit the config.py file regularly.

Connecting FastAPI to PostgreSQL

We can now use the connection string we've created to connect to our database using FastAPI's startup event.

Let's start by creating a db directory to house all database-related code and a tasks.py file to go along with it.

mkdir backend/app/db
touch backend/app/db/__init__.py backend/app/db/tasks.py

The tasks file we've created will establish our database connection and handle any additional configuration we require. We'll also create a tasks file in the core directory, to wrap the startup and shutdown events for our app.

touch backend/app/core/tasks.py

Open up the core/tasks.py file and add the following:

core/tasks.py
from typing import Callable
from fastapi import FastAPI
from app.db.tasks import connect_to_db, close_db_connection
def create_start_app_handler(app: FastAPI) -> Callable:
async def start_app() -> None:
await connect_to_db(app)
return start_app
def create_stop_app_handler(app: FastAPI) -> Callable:
async def stop_app() -> None:
await close_db_connection(app)
return stop_app

We've defined two functions that will run when our application starts up and when our application shuts down. Each one returns an async function that's responsible for creating our database connection and shutting it down. We can add other tasks here later, but this is all we need for now. We're importing the connect_to_db and close_db_connection functions from our db/tasks.py file, though we haven't defined those yet.

Let's do that now.

db/tasks.py
from fastapi import FastAPI
from databases import Database
from app.core.config import DATABASE_URL
import logging
logger = logging.getLogger(__name__)
async def connect_to_db(app: FastAPI) -> None:
database = Database(DATABASE_URL, min_size=2, max_size=10) # these can be configured in config as well
try:
await database.connect()
app.state._db = database
except Exception as e:
logger.warn("--- DB CONNECTION ERROR ---")
logger.warn(e)
logger.warn("--- DB CONNECTION ERROR ---")
async def close_db_connection(app: FastAPI) -> None:
try:
await app.state._db.disconnect()
except Exception as e:
logger.warn("--- DB DISCONNECT ERROR ---")
logger.warn(e)
logger.warn("--- DB DISCONNECT ERROR ---")

In the connect_to_db function we're using the databases package to establish a connection to our postgresql db with the database url string we configured in our core/config.py file. We also add key word arguments for the minimum and maximum number of connections to have at a given time. Don't worry too much about that for now.

Then, we wait until the connection successfully completes, and we attach it as a _db key to the state attribute on our FastAPI app. When the app shuts down, we disconnect from the database to clean things up nicely.

Now let's register those event handlers on our actual application in the server.py file. We'll also replace some of the original metadata with constants from our config file.

server.py
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from app.core import config, tasks
from app.api.routes import router as api_router
def get_application():
app = FastAPI(title=config.PROJECT_NAME, version=config.VERSION)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
app.add_event_handler("startup", tasks.create_start_app_handler(app))
app.add_event_handler("shutdown", tasks.create_stop_app_handler(app))
app.include_router(api_router, prefix="/api")
return app
app = get_application()

First, we'll stop our running container by hitting CONTROL+C, and then spin everything up again with docker-compose up.

If all goes well, you should see a log with the following:

server_1 | INFO: Connected to database postgresql://postgres:********@db:5432/postgres

And boom! The database is fired up and ready to go.

Configuring Database Tables and Migrations with SQL Alchemy and Alembic

Migrations serve as documentation on how a database has been created/changed over time. Though there are advantages to creating multiple migration files, we're going to start with one large migration file and modify it incrementally as we build our application.

Setting up migrations is also a pain, so bear with me here.

Inside our db directory, we're going to create two more directories - migrations and repositories. We're also going to create some alembic-specific files for our migrations, and a base.py file in repositories.

mkdir backend/app/db/migrations backend/app/db/repositories
touch backend/app/db/migrations/script.py.mako backend/app/db/migrations/env.py
touch backend/app/db/repositories/__init__.py backend/app/db/repositories/base.py

If you refer to the Alembic documentation, you'll notice that we've laid down the foundation for setting up a migration environment. The script.py.mako file is a Mako template, will instruct alembic on how to generate new migration scripts, and the env.py file is a Python script that is run whenever the alembic migration tool is invoked.

We're going to need to specify a directory that our migrations should live in, so let's do that now.

mkdir backend/app/db/migrations/versions

On top of that, we need to configure our alembic environment - usually done in an alembic.ini file located in our base directory: backend.

touch backend/alembic.ini

The contents of the alembic.ini file are mostly taken directly from the generic example in the documentation.

alembic.ini
# A generic, single database configuration.
[alembic]
# path to migration scripts
script_location = ./app/db/migrations
# version location specification; this defaults
# to alembic/versions. When using multiple version
# directories, initial revisions must be specified with --version-path
version_locations = ./app/db/migrations/versions
[loggers]
keys = root,sqlalchemy,alembic
[handlers]
keys = console
[formatters]
keys = generic
[logger_root]
level = WARN
handlers = console
qualname =
[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine
[logger_alembic]
level = INFO
handlers =
qualname = alembic
[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic
[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S

Don't worry too much about what you see here. A majority of the file refers mostly to how alembic should log information to the terminal. The main thing we've changed is the script_location, which specifies where our Mako template script is.

The script.py.mako file should look like this:

"""${message}
Revision ID: ${up_revision}
Revises: ${down_revision | comma,n}
Create Date: ${create_date}
"""
from alembic import op
import sqlalchemy as sa
${imports if imports else ""}
# revision identifiers, used by Alembic
revision = ${repr(up_revision)}
down_revision = ${repr(down_revision)}
branch_labels = ${repr(branch_labels)}
depends_on = ${repr(depends_on)}
def upgrade() -> None:
${upgrades if upgrades else "pass"}
def downgrade() -> None:
${downgrades if downgrades else "pass"}

Our template file will be used to create migration scripts whenever we ask alembic to generate them. Don't worry if this looks foreign, it'll make more sense once we put it to use.

And finally, the env.py file.

db/migrations/env.py
import pathlib
import sys
import alembic
from sqlalchemy import engine_from_config, pool
from logging.config import fileConfig
import logging
# we're appending the app directory to our path here so that we can import config easily
sys.path.append(str(pathlib.Path(__file__).resolve().parents[3]))
from app.core.config import DATABASE_URL # noqa
# Alembic Config object, which provides access to values within the .ini file
config = alembic.context.config
# Interpret the config file for logging
fileConfig(config.config_file_name)
logger = logging.getLogger("alembic.env")
def run_migrations_online() -> None:
"""
Run migrations in 'online' mode
"""
connectable = config.attributes.get("connection", None)
config.set_main_option("sqlalchemy.url", str(DATABASE_URL))
if connectable is None:
connectable = engine_from_config(
config.get_section(config.config_ini_section),
prefix="sqlalchemy.",
poolclass=pool.NullPool,
)
with connectable.connect() as connection:
alembic.context.configure(
connection=connection,
target_metadata=None
)
with alembic.context.begin_transaction():
alembic.context.run_migrations()
def run_migrations_offline() -> None:
"""
Run migrations in 'offline' mode.
"""
alembic.context.configure(url=str(DATABASE_URL))
with alembic.context.begin_transaction():
alembic.context.run_migrations()
if alembic.context.is_offline_mode():
logger.info("Running migrations offline")
run_migrations_offline()
else:
logger.info("Running migrations online")
run_migrations_online()

Oh wow. That was a lot. If you already have some experience setting up migrations, much of this will look standard to you. If you're new to this world, don't try to understand it all at once. The best way to wrap your head around what's going on here is to use it. And that's exactly what we're about to do.

Migrating Our Database

Let's get some practice executing commands in a running docker container. We'll do it a bunch, so get familiar with it.

First, open up a new terminal window or tab. Then get the IDs of all running processes like so:

docker ps

You should see an output that starts with something that looks like the following:

CONTAINER ID IMAGE
a123bc007edf phresh_server
867g5309hijk postgres:12.1-alpine

Copy the ID of the container running our server. In this example that would be a123bc007edf. Now we can start executing bash commands as the container's root use by typing:

docker exec -it a123bc007edf bash

The -it part indicates we want to use the container interactively so we can run commands and inspect the container. By default, Docker containers run and exit as soon as they've finished executing.

Inside the bash shell, we can start exploring a little bit.

ls

We should see all of the files and directories that Docker has copied into our container.

Dockerfile alembic.ini app requirements.txt tests

Feel free to explore the app directory and make sure everything is as it seems.

Now we can start using alembic commands right in the shell. We'll start by generating our first migration script:

alembic revision -m "create main tables"

If all is well with our setup, we should see a file has been created inside our db/migrations/versions directory. The two main functions - upgrade and downgrade - will be used to create and drop database tables, execute SQL commands, and each time we migrate. If all is well, the file will resemble the following:

"""create main tables
Revision ID: 12345678654
Revises:
Create Date: 2020-05-05 10:41:35.468471
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic
revision = '12345678654'
down_revision = None
branch_labels = None
depends_on = None
def upgrade() -> None:
pass
def downgrade() -> None:
pass

The first thing we'll do is define an additional function for creating our cleanings table. Then, we'll run that function inside of the upgrade function and create an opposite action in the downgrade function.

"""create main tables
Revision ID: 12345678654
Revises:
Create Date: 2020-05-05 10:41:35.468471
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic
revision = '12345678654'
down_revision = None
branch_labels = None
depends_on = None
def create_cleanings_table() -> None:
op.create_table(
"cleanings",
sa.Column("id", sa.Integer, primary_key=True),
sa.Column("name", sa.Text, nullable=False, index=True),
sa.Column("description", sa.Text, nullable=True),
sa.Column("cleaning_type", sa.Text, nullable=False, server_default="spot_clean"),
sa.Column("price", sa.Numeric(10, 2), nullable=False),
)
def upgrade() -> None:
create_cleanings_table()
def downgrade() -> None:
op.drop_table("cleanings")

We have 5 columns for each cleaning - id, name, description, cleaning_type, and price.

  • The id column will be the primary key column - an auto-incrementing integer that uniquely identifies each entry.
  • The name column is self-explanatory. We set index=True to make lookups by name faster.
  • The description column allows users to describe the job requirements and is nullable - meaning we don't need a value here when creating a cleaning.
  • The cleaning_type column will differentiate the intensity of the cleaning being requested. We set the default to "spot_clean".
  • The price column allows users to specify a per-hour price that will be paid to cleaners who accept the job.

We use SQLAlchemy to create the different data types for each column. Ids are represented by integers. Name, description, and cleaning_type are represented by text. Price is represented by a floating point number, with 2 decimal places.

Go ahead and run the migrations against the database.

alembic upgrade head

We should see output indicating that alembic is working. If done correctly, there should be a line that looks like this:

INFO [alembic.runtime.migration] Running upgrade -> 12345678654, create main tables

Interacting with the database

Ok, almost done.

To make sure that our database has been modified correctly, we'll execute psql in the db container.

docker-compose exec db psql -h localhost -U postgres --dbname=postgres

We enter the container running our db service with the -h flag for host and -U flag for user.

Now we can enter commands directly into the terminal to interact with our database.

Here's a few psql commands that can be useful:

  • \l - list all databases
  • \d+ - list all tables (relations) in the current database
  • \c postgres - connect to the postgres database
  • \d cleanings - describe the cleanings table and the associated columns

Start by running \d cleanings. If the migration worked, we expect the cleanings table to exist and the 5 columns with the correct datatypes to be displayed in the output.

We can also execute SQL queries directly against the database. Try the following:

SELECT id, name, price FROM cleanings;

We should see the result of the query shown in the terminal consisting of the columns we selected and 0 rows matching the query. That's because we haven't added data yet. We'll get to that next.

Wrapping Up and Resources

At this point, we have a working database and are ready to start populating it with real data. This post has gone on long enough, so we'll wait until the next one to hammer out those details.