Alexandre Jacques

Originally published at alexandremjacques.github.io

Django (DRF) 12 Factor App with examples

I've been working with Django (especially with DRF) for a while now. Deploying Django apps is the one thing
that bothers me the most. It's hard and does not have a pattern. So I decided to build a guide here to keep notes for
myself and maybe help anyone who is struggling with the lack of patterns out there.

What is the Twelve-Factor App, and why use it?

The Twelve-Factor App is a methodology for building SaaS apps. The main concern is to keep
everything as isolated and as secure as possible. It was first presented by Heroku in 2011 and is still referenced as a
good guide for best practices in building and deploying apps.

Tools used in this guide:

  • Gitlab CI
  • VIM (my main editor)
  • Django and friends (special mention to python-decouple)
  • Assumes the project structure below (which I try to keep for all my projects)

    ├── Dockerfile
    ├── apps
    │   ├── app1
    │   └── app2
    ├── config
    │   ├── settings.py
    │   ├── urls.py
    │   └── wsgi.py
    ├── docker-compose.yml
    ├── docker-entrypoint.sh
    ├── manage.py
    ├── requirements
    │   ├── dev-requirements.in
    │   └── requirements.in
    └── requirements.txt
    

So, without further ado, to the guide!

Each factor title has a link to its explanation on the original site. Remember: all things explained here can be
ported to other CI/CD environments and to other languages.

1. Codebase - One codebase tracked in revision control, many deploys

My approach is to use Gitflow. There are some variations on this, but since Gitflow envisions different branches for
different phases of your app, it's easy to create git hooks to start a build and/or a deploy.

Since I'm using Gitlab CI, hooks are "translated" to the only tag in the .gitlab-ci.yml configuration file. Examining
the example file below, notice the only tag: it tells Gitlab CI to only execute that job on commits to the listed
branches:

image: gitlab/dind

stages:
  - build
  - deploy

run_build:
  stage: build
  only:
    - master
    - develop
  script:
    # placeholder - the real build commands go here
    - echo "building"

run_deploy:
  stage: deploy
  only:
    - master
  script:
    # placeholder - the real deploy commands go here
    - echo "deploying"

2. Dependencies - Explicitly declare and isolate dependencies

Recently I migrated my projects to pip-tools. Before the migration I was using plain pip, but the workflow for managing
development/production dependencies was not working for me. So now I have a requirements/ directory in my project
root with dev-requirements.in and requirements.in.

The first line of my dev-requirements.in file includes requirements.in.

-r requirements.in
flake8
isort
...

So, for local development I can run:

pip-compile requirements/dev-requirements.in -o ./requirements.txt
pip-sync

And have both development and production dependencies. In production, my pipeline runs a slightly different pip-compile:

pip-compile requirements/requirements.in -o requirements.txt
pip-sync

And there you have it: no development dependencies.

3. Config - Store config in the environment

Most of the "juice" of this workflow relies on this item. I struggled for a while to find some kind of solution for
Django settings.py environment separation. The great majority of them involve different files for each environment. But those
solutions always end up with sensitive information (SECRET_KEYs, database passwords, third-party app keys, etc.) in your version
control system.

As I mentioned in the beginning of this guide, python-decouple comes to the rescue!

There are other packages that do a similar job - it's just a personal preference to use python-decouple.

The way Decouple works is simple: it tries to find the declared keys in the following places, in this order:

  1. Environment variables
  2. .ini or .env files
  3. Default argument passed to config

So, to illustrate, take this config/settings.py file excerpt:

import os

from decouple import Csv, config

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

DEBUG = config('DEBUG', default=False, cast=bool)
ALLOWED_HOSTS = config('ALLOWED_HOSTS', cast=Csv())
SECRET_KEY = config('SECRET_KEY')

If you have DEBUG=True in your environment (exported on your Linux/macOS system or declared in your Windows preferences), python-decouple will read it, cast it to a boolean and inject it
into your settings.py. If it cannot find the DEBUG key in your environment, it will look for it in a .env (or .ini) file.

In development, I keep a .env file in the root of my project. The .env file format is just key/value pairs:

DEBUG=True
ALLOWED_HOSTS=localhost,127.0.0.1
SECRET_KEY=secret

In production, I take advantage of Gitlab's CI environment settings. It allows me to declare the same key/value pairs
and makes them available to the build environment:

(Screenshot: GitLab CI environment variables settings)

This way, no database passwords, secrets or app keys end up in your source control (to avoid deploying your .env
file to production, add it to your .gitignore - even though it wouldn't be a big deal, as python-decouple gives precedence to
environment variables).

4. Backing services - Treat backing services as attached resources

Implementing #4 is easy if you implement #3. Since every external configuration, URL address, port binding, etc. should
be in environment variables (or a .env file, for that matter), it's already treated as an attached resource. It's quite
simple to verify that in my case: I just push code to my remote master branch and wait for Gitlab CI to do its job. No need to change
anything in code or config files. Just push code.
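
To make the attachment concrete, here's a minimal sketch of how the database (the classic backing service) can be
declared in config/settings.py with python-decouple, using the same DATABASE_* keys that show up later in my
docker-compose.yml. The PostgreSQL backend is an assumption for illustration - swap the ENGINE for whatever database
you actually use.

from decouple import config

DATABASES = {
    'default': {
        # The backend is an assumption for this example; use your own.
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': config('DATABASE_NAME'),
        'USER': config('DATABASE_USER'),
        'PASSWORD': config('DATABASE_PASSWORD'),
        'HOST': config('DATABASE_HOST'),
        'PORT': config('DATABASE_PORT', default='5432'),
    }
}

Pointing the app at a different database server (local, VPS, managed cloud instance) then becomes a matter of changing
environment variables, with no code changes at all.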

5. Build, release, run - Strictly separate build and run stages

As described in factors #1 and #3, part of this problem is already solved. Still, there's an inherent problem since I'm
using Docker/docker-compose to deploy my apps. Due to the way Docker does things, passing environment variable values to
the build process is a little tricky.

The secret is knowing how the Dockerfile and the docker-compose environment key work.

In production, I don't use virtualenv. It wouldn't make much sense since, by definition, a container is already an isolated
"environment". Given that, my Dockerfile is usually pretty straightforward:

FROM python:3.8

ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1

RUN mkdir /code
WORKDIR /code/
ADD . /code/

RUN pip install --upgrade pip && \
    pip install pip-tools && \
    pip-compile -q /code/requirements/requirements.in --output-file /code/requirements.txt && \
    pip install --no-cache-dir -r /code/requirements.txt

EXPOSE 8001

For everything to work right, my docker-compose.yml file looks like the following:

version: '3'

services:
  api:
    image: image_name
    build: .
    ports:
      - "8001:8001"
    volumes:
      - /var/www/sample_project/media:/code/media
    entrypoint: sh docker-entrypoint.sh
    environment:
      - DJANGO_SETTINGS_MODULE=config.settings
      - SECRET_KEY=${S_KEY}
      - DEBUG=${DEBUG}
      - ALLOWED_HOSTS=${ALLOWED_HOSTS}
      - DATABASE_NAME=${DB_NAME}
      - DATABASE_USER=${DB_USER}
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - DATABASE_HOST=${DB_HOST}
      - DATABASE_PORT=${DB_PORT}

That means: docker-compose reads the value of, for example, ${S_KEY} from the environment variables set in Gitlab CI
and sets it in the container's environment under the key SECRET_KEY. And that happens to be the exact key python-decouple
looks for in my settings.py file.

To simplify even more:

CI environment variable --> Docker environment variable --> settings.py

To complete the process and follow the factor #5 premises, your build pipeline should create a tag in your source control
before cloning/checking out the code and building it. This way you can track how and when a release was generated.

6. Processes - Execute the app as one or more stateless processes

This one has little to do with how we build the app and much more with how we architect it. As I mentioned before, I
mainly work with DRF (Django REST Framework) to build REST APIs for web and mobile clients.

This usually means that I rely on JWT tokens to keep "sessions" on my apps. Since no sticky sessions are needed and
my APIs don't hold any meaningful state, I'm covered on this item. But, in case I needed some kind of server-side state, I
once relied on Redis for that. Just remember to keep the Redis server address, port and credentials in your configuration
environment (.env file and/or environment variables), as in the sketch below.
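
For illustration, a minimal sketch of what that Redis-backed setup could look like, with every connection detail coming
from the environment. The REDIS_URL key is just a name I picked for this example, and the backend shown is the one built
into Django 4.0+; older versions would need a third-party package such as django-redis.

from decouple import config

# REDIS_URL is a hypothetical key, e.g. redis://:password@redis-host:6379/0,
# set in the environment or in the .env file.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.redis.RedisCache',
        'LOCATION': config('REDIS_URL'),
    }
}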

All that means that, if needed, this whole setup can scale horizontally by just spawning more processes (containers in
this case).

7. Port binding - Export services via port binding

As seen in my Dockerfile and my docker-compose.yml files, I expose port 8001 to the host operating system. That
means my services are accessible through that port. Although it's possible and valid to keep things this way, it's usual to
have a proxy (a reverse proxy) in front of my services.

For that I have to configure two things: a Gunicorn WSGI server and an Nginx proxy.

The first one, the Gunicorn server, is configured by the docker-entrypoint.sh script (notice that this script is
called by docker-compose.yml in the entrypoint key):

#!/bin/bash
# Collect static files and apply pending migrations before starting the app
python manage.py collectstatic --noinput
python manage.py migrate

# Start Gunicorn bound to port 8001 with 4 worker processes
gunicorn -b 0.0.0.0:8001 -w 4 config.wsgi:application

This means that, on execution, docker-compose builds the container and runs this script to start it.
Gunicorn binds on port 8001, which is exposed to the host operating system on the same port. If needed, we could
change that by changing the ports in the docker-compose.yml file.

The second step is to configure a proxy server. My setup already has an Nginx server running on a VPS instance, and it's
not containerized. That's by design, since I can move the API container to a cloud provider and have another kind of
reverse proxy pointing at my API container (and that's why my docker-compose doesn't start a proxy service alongside the
API container).

Configuring Nginx as a reverse proxy is very straightforward:

server {
   listen 80;
   listen [::]:80;
   server_name api.server.com;

   location / {
       proxy_read_timeout 360;
       proxy_set_header Host $host;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
       proxy_pass http://127.0.0.1:8001;
   }
}

And that's it. Now requests made to port 80 get proxied to 8001 (the container!).

8. Concurrency - Scale out via the process model

As stated before, this whole setup is based on Docker and is stateless. That means it's scalable via processes. I'm not
an expert on cloud but I bet that, with minimal changes to port bindings (and container network definitions), this setup can run on Kubernetes or some sort
of container orchestrator.

9. Disposability - Maximize robustness with fast startup and graceful shutdown

Again, since factor #8 is covered, this one is easy. Kill and restart the process, or any container process, and you still have a
new container ready to receive requests. Of course, this spawning of containers must be managed by some kind of container
manager.

I just have to check how Gunicorn handles SIGTERM for graceful shutdowns.
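
For the record, Gunicorn shuts down gracefully on SIGTERM, waiting up to graceful_timeout seconds (30 by default) for
in-flight requests to finish before killing its workers. To make that explicit, the command line in docker-entrypoint.sh
could be moved into a gunicorn.conf.py (Gunicorn config files are plain Python) and started with
gunicorn -c gunicorn.conf.py config.wsgi:application. A sketch, not something my current setup uses:

# gunicorn.conf.py - equivalent to "-b 0.0.0.0:8001 -w 4", plus an explicit
# graceful shutdown window for SIGTERM.
bind = '0.0.0.0:8001'
workers = 4
graceful_timeout = 30  # seconds to wait for workers to finish after SIGTERM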

10. Dev/prod parity - Keep development, staging, and production as similar as possible

I don't use Docker on my development machine, but I could, just as easily as in production. In fact, during the process of
building this setup, I had to validate my ideas and configurations, and for that I relied on running Docker and
docker-compose locally.

Notice that my database is not started inside the container either. That's also by design: my production setup has a
dedicated database server machine, and to keep environment parity I also have a database installation on my local
machine (but this could be another local container - I just happen to have an old installation already running). This
keeps things as similar as possible to my production environment and my configuration "compliant" with factor #4.

11. Logs - Treat logs as event streams

That's the one missing piece in my setup: I just don't log. I mean, on my development machine I log some stuff for
debugging purposes, but in production I don't have any kind of application logs. I can, in fact, peek inside the container
and see the container and Gunicorn logs, but that's all.

Ideally I should use a third-party log service but, for now, it's still too expensive for my budget.
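
A cheap first step towards this factor is to treat stdout as the log stream and let Docker (or whatever runs the
container) capture it. Here's a minimal sketch of a Django LOGGING setting in config/settings.py doing exactly that,
with the level coming from the environment (the LOG_LEVEL key is an assumption for this example):

from decouple import config

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        # Stream everything to stdout so the container runtime collects it.
        'console': {'class': 'logging.StreamHandler'},
    },
    'root': {
        'handlers': ['console'],
        # LOG_LEVEL is a hypothetical key; falls back to INFO if not set.
        'level': config('LOG_LEVEL', default='INFO'),
    },
}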

12. Admin processes - Run admin/management tasks as one-off processes

This one I achieve with the help of Docker. We can easily run one-off management commands inside the container, e.g.:

docker exec -it <container id> python manage.py makemigrations

It may vary depending on the chosen Docker image but the idea is that, even with a leaner image (like an *-alpine one),
you can install the needed tools by changing the Dockerfile.
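
For recurring one-off tasks, Django's custom management commands fit this factor nicely, since they run against the same
codebase and config as the app itself. A tiny sketch - the command name and app path are made up for illustration:

# apps/app1/management/commands/cleanup_tokens.py - hypothetical example
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = 'One-off maintenance task'

    def handle(self, *args, **options):
        # Real cleanup logic would go here.
        self.stdout.write(self.style.SUCCESS('Done.'))

It would then be run the same way as the migration example above:

docker exec -it <container id> python manage.py cleanup_tokens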

Conclusion

I find it nice to have some kind of best practices already defined and battle tested. Applying them to my stack was a
great way to learn some new and useful stuff. It's definitely not the end of it; surely I can evolve this into something
better.

And by putting it together I hope I can help, or give some ideas to, someone in trouble trying to figure some of this
out.
