zchtodd

Posted on Mar 25, 2023 • Originally published at circumeo.app

Dockerizing Django: How to Create a Consistent Development Environment for Your Team

#devops #beginners #python #productivity

In this tutorial I'll show how you can create a shared and repeatable development environment that your entire team can use
while working on a Django application. We'll use Docker to containerize not only your Django application, but a PostgreSQL database,
Celery instances, and a Redis cache server. Along the way, we'll use best practices to improve developer productivity and keep configuration
maintainable.

Setting Up Docker

This article will assume we're using Ubuntu 22, but don't worry if you're on Windows or Mac. Installation instructions for all platforms
can be found on the Docker website.

To get started installing Docker for Ubuntu, we'll first need to make sure apt has the packages it needs to communicate with repositories over HTTPS.

sudo apt-get update
sudo apt-get install \
    ca-certificates \
    curl \
    gnupg \
    lsb-release

Next we'll need the official GPG key from Docker to verify the installation.

sudo mkdir -m 0755 -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

Now we'll add Docker as a repository source.

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

We should be able to issue an apt update to begin working with the new repository and then install Docker.

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Once Docker is installed, you should be able to run the following to download and run a test image.

sudo docker run hello-world

It would be nicer to run docker without root, so let's add a docker user group for that purpose. We'll also go ahead
and add the current user to that group.

sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker

Containerizing Django

As a first step, we'll focus on writing a Dockerfile for the Django container. Dockerfiles are essentially a recipe for building
an environment that has exactly what's required for an application, no more, and no less.

Let's take a look at the Dockerfile for Django that we'll use.

FROM python:3.11-alpine

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

To break down what's happening here, let's start with the first line.

The FROM statement defines the image that we're
inheriting from, or building off of. In this case, we are starting with an image built to provide Python 3.11. The alpine
suffix means that the image is very bare-bones (i.e. it won't have many common programs such as bash) and so takes up much less
disk space than some other images. This can become important, for instance, if you're running tests on a platform like
GitHub, where the image will be frequently downloaded.

The WORKDIR command sets the current directory and also creates that directory if it doesn't exist yet. Consequently,
the COPY command on the next line is relative to the /app directory we set just above. The COPY command copies
the requirements.txt file from the host machine into the Docker image. As soon as that's available, we run pip install
to make sure all the dependencies are available.

Finally, we run one more COPY command to copy over all code in the current directory to the image.

At this point, you might wonder why we copied requirements.txt specifically when we're just going to copy everything a few
steps later.

Why not write the Dockerfile like this?

FROM python:3.11-alpine

WORKDIR /app

COPY . .
RUN pip install --no-cache-dir -r requirements.txt

EXPOSE 8000
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

To be sure, this would work just fine!

The issue lies in how Docker rebuilds images. To be efficient, Docker builds images as a stack of cached layers. Each command
that we run creates a new layer. When rebuilding, Docker will reuse cached layers when possible, but once one layer changes, every layer after it has to be rebuilt as well. The Dockerfile layout above, therefore, would require
rerunning pip install when any code is touched. Copying requirements.txt and running pip install above the last COPY allows Docker to cache the
dependencies as a separate layer, meaning we can change application code without triggering a time-consuming pip install process.
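
To make the caching behavior concrete, here is a toy model in Python. It is only a sketch of the idea, not how Docker actually computes its cache keys: each step's key is derived from the previous key, the instruction text, and the contents of any copied files. The file name views.py and the version strings are made up for illustration.

```python
import hashlib


def layer_keys(steps, file_hashes):
    """Return one cache key per step in a toy model of a layered build cache.

    Each key depends on the previous key, the instruction text, and (for COPY
    steps) the content hashes of the files that step copies.
    """
    keys = []
    parent = ""
    for instruction, copied_files in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for name in sorted(copied_files):
            h.update(name.encode())
            h.update(file_hashes[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def reused_layers(steps, before, after):
    """For each step, report whether its cache key survives the file change."""
    old = layer_keys(steps, before)
    new = layer_keys(steps, after)
    return [a == b for a, b in zip(old, new)]


# Dependencies are copied and installed before the application code.
CACHE_FRIENDLY = [
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install --no-cache-dir -r requirements.txt", []),
    ("COPY . .", ["requirements.txt", "views.py"]),
]

# Everything is copied first, then dependencies are installed.
CACHE_HOSTILE = [
    ("COPY . .", ["requirements.txt", "views.py"]),
    ("RUN pip install --no-cache-dir -r requirements.txt", []),
]

before = {"requirements.txt": "v1", "views.py": "v1"}
after = {"requirements.txt": "v1", "views.py": "v2"}  # only app code changed

print(reused_layers(CACHE_FRIENDLY, before, after))  # [True, True, False]
print(reused_layers(CACHE_HOSTILE, before, after))   # [False, False]
```

In the first layout the pip install step keeps its key when only application code changes; in the second, the changed COPY step sits in front of it, so the install step is invalidated too.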

There are a few pitfalls to watch out for with the COPY command. While we want to copy all the code, we don't want to pull in
unnecessary files and blow up the size of the final image. In addition to wasting disk space and taking longer to build, this can also
cause frequent rebuilds if constantly changing files are pulled in. Luckily, this is exactly what the .dockerignore file helps with.

The syntax of the .dockerignore file is similar to .gitignore, if you've written one of those, with one difference worth knowing: patterns are matched relative to the root of the build context, so a leading **/ is needed to match files in any subdirectory. Any file that matches a pattern in
.dockerignore will be ignored, so we can use COPY without fear of dragging in unnecessary files.

Here's an example of how we can prevent compiled Python bytecode from ending up inside the image. We'll also ignore the .git directory,
since we definitely don't need that living inside the image either.

**/*.pyc
**/__pycache__
.git

Assuming you have a Django project and a Dockerfile similar to the above, we should be ready to try starting a Django container.
You can build the image like below, using the -t or tag argument to give the image a name.

docker build -t myawesomeapp .

You can run the Docker container and expose port 8000 to test it once the image has finished building.

docker run -p 8000:8000 myawesomeapp

With the container up and running, the Django welcome page should display when you visit http://localhost:8000 in the browser.

Building a Simple Compose File

Having the app containerized is a great first step, but there's a lot more to a development environment, like the database, caching server, and static file server.
Although Django can serve static files, I find it simpler to go ahead and mirror the production setup, with a server such as Caddy in front for static files.

We'll start with a basic docker-compose.yml that contains just the app service.

services:
   app:
     build: .
     command: python manage.py runserver 0.0.0.0:8000
     restart: unless-stopped
     volumes:
       - $PWD:/app
     ports:
       - 8000:8000

The service is named app in this example, but it could be named anything as long as it's valid YAML syntax.

The build key tells Compose to build the image from the Dockerfile in the given directory (here, the project root) rather than pulling a prebuilt image. Compose builds the image the first time the service starts; after a change such as adding a new module to requirements.txt, the image has to be rebuilt, which is covered in the next section. The command key will override whatever CMD is specified in the Dockerfile. We don't have to specify command if a CMD was given in the Dockerfile, but I like to include it here for clarity.

Restart defines what we'd like to happen if the container stops. By default, a container that stops is not restarted. Here we use unless-stopped to declare that we'd like the container to restart unless it was manually stopped. Most importantly, this means that our services will restart after a machine reboot, which is probably what we want in a development environment.

Volumes give a container access to host directories. The left side of the volume assignment maps to the host, while the right defines where the volume should appear within the container. The $PWD is just a short-hand for the directory where docker-compose is running so that we don't have to hard-code that path, which is likely different for every developer on the team.

But what use are volumes? Mapping the code into the container is very common in development setups. Otherwise, the code is frozen in time as of the moment we built the image. Without volumes, we'd have to rebuild the image every time we changed the code.

Lastly, we have the ports key. Just as we did when launching the container manually, we're specifying that port 8000 on the host should forward traffic to port 8000 within the container.

Building and Running the App Service

With our first docker-compose.yml file written, we should be ready to start up the app.

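Run the command docker-compose up within the Django project root to start the app container. (If you installed the Compose plugin as shown earlier rather than the standalone docker-compose binary, the equivalent command is docker compose up, with a space; the same applies to the other Compose commands in this tutorial.) Since it's the first time starting the app, Compose will build an image before running the app. Once the image is built you should be left with the app running and reachable on port 8000. I prefer running Compose in the background, which you can do with the docker-compose up -d command.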

If you add a new app to the Django project and include a "hello world" view, you'll notice that the view is automatically available, without needing to rebuild the container. This is due to the volume that we defined in the docker-compose.yml file.

On the other hand, if we add djangorestframework to the requirements.txt, that won't be available until we rebuild. This is because while our code is mapped into the container, the directories where Python stores third-party code are not. We can run the docker-compose build command to rebuild the image, taking into account new additions to the requirements.txt file. This will, however, leave the old container running if we started Compose in the background. To combine building and running the container, you can use the docker-compose up --build command.

Adding the Database Service

Now we'll add a PostgreSQL instance to the docker-compose.yml file. We'd like the data to be persisted so that it will survive the container being removed, and we also need the app server to start after the database has initialized.

Let's take a look at the updated file.

services:
   app:
     build: .
     command: python manage.py runserver 0.0.0.0:8000
     restart: unless-stopped
     depends_on:
       - database
     volumes:
       - $PWD:/app
     env_file:
       - app.env
     ports:                                                 
       - 8000:8000 

   database:
     image: postgres:13
     restart: unless-stopped
     environment:
       - POSTGRES_USER=postgres
       - POSTGRES_PASSWORD=example_password
       - POSTGRES_DB=example_db
     volumes:
       - ./postgres-data:/var/lib/postgresql/data
     ports:
       - 5432:5432

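There are a few new concepts here. First, depends_on tells Compose to start the database container before the app container. Note that it only waits for the container to start, not for PostgreSQL to be ready to accept connections. Next, we're controlling what environment variables will exist inside the containers with the env_file and environment keys. The two function very similarly, but env_file pulls the values from a separate file. This can be nice when you have several services each using an overlapping set of variables.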

Here's what the app.env file referenced above should contain at this point:

POSTGRES_USER=postgres
POSTGRES_PASSWORD=example_password
POSTGRES_HOST=database
POSTGRES_PORT=5432
POSTGRES_DB=example_db

One interesting thing to note here is that the POSTGRES_HOST is set to database to match the service name. Services are automatically reachable from one container to another, with each service name added as a DNS entry.

We should now be able to modify the Django settings.py file to connect to the database.

import os  # at the top of settings.py, if not already imported

# Connection details come from the variables defined in app.env.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ["POSTGRES_DB"],
        "USER": os.environ["POSTGRES_USER"],
        "PASSWORD": os.environ["POSTGRES_PASSWORD"],
        "HOST": os.environ["POSTGRES_HOST"],
        "PORT": os.environ["POSTGRES_PORT"],
    }
}
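
If a variable is missing from app.env, indexing os.environ directly fails with a bare KeyError. As an optional variation, here is a small sketch of a helper that reports every missing variable at once. The function name database_settings and the fallback port are my own choices for illustration, not something Django provides.

```python
import os

# Variables that must be present; POSTGRES_PORT falls back to a default below.
REQUIRED_VARS = ("POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_HOST")


def database_settings(environ=os.environ):
    """Build the DATABASES dict from environment variables (hypothetical helper)."""
    missing = [name for name in REQUIRED_VARS if name not in environ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": environ["POSTGRES_DB"],
            "USER": environ["POSTGRES_USER"],
            "PASSWORD": environ["POSTGRES_PASSWORD"],
            "HOST": environ["POSTGRES_HOST"],
            # Assumption: default to PostgreSQL's standard port when unset.
            "PORT": environ.get("POSTGRES_PORT", "5432"),
        }
    }
```

In settings.py this would be used as DATABASES = database_settings().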

Assuming you've run docker-compose up -d again to create the database, it should be possible to run the Django database migrations to populate the initial schema.

docker exec -it tutorial_app_1 python manage.py migrate

The name of your app container might be different, so you should run docker ps first to see what name you'll need to use.
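
One thing to keep in mind: the short form of depends_on used here only orders container startup, so the migrate command can fail if it runs before PostgreSQL is accepting connections. Below is a minimal sketch of a helper that waits for a TCP port to open. The name wait_for_port and the timeout values are assumptions for illustration, and an open port is only a rough signal that the database is ready.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Inside the app container this could be called as wait_for_port(os.environ["POSTGRES_HOST"], int(os.environ["POSTGRES_PORT"])) before running migrations.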

Adding the Caddy Service

It's typically not recommended to run Django with the development server facing the Internet, so we'll use Caddy to serve static files and pass requests to Django. As a bonus, the Caddy webserver will handle SSL setup, interacting with Let's Encrypt on our behalf when serving a public domain.

Let's take a look at the updated docker-compose.yml and then dive into what's changed.

services:
   app:
     build: .
     command: gunicorn tutorial.wsgi:application -w 4 -b 0.0.0.0:8000
     restart: unless-stopped
     depends_on:
       - database
     volumes:
       - $PWD:/app
     env_file:
       - app.env
     ports:                                                 
       - 8000:8000    

   caddy:
     image: caddy:2.4.5-alpine
     restart: unless-stopped
     ports:
       - "80:80"
       - "443:443"
     volumes:
       - ./Caddyfile:/etc/caddy/Caddyfile:ro
       - ./caddy_data:/data
       - ./caddy_config:/config
       - ./static:/static

   database:
     image: postgres:13
     restart: unless-stopped
     environment:
       - POSTGRES_USER=postgres
       - POSTGRES_PASSWORD=example_password
       - POSTGRES_DB=example_db
     volumes:
       - ./postgres-data:/var/lib/postgresql/data
     ports:
       - 5432:5432

The app service is now using Gunicorn instead of the built-in Django development server. The Caddy server has volumes for its config file and for access to a directory where we'll put static files. Caddy will use the caddy_data and caddy_config volumes to store some of its own internal state, which means we can lose the container and recreate it without any problems.

In order to switch to using Gunicorn, you'll need to add gunicorn to requirements.txt and issue a docker-compose build command.

Now let's examine the Caddyfile configuration.

localhost {
  handle_path /static/* {
    root * /static
    file_server
  }

  @app {
    not path /static/*
  }

  reverse_proxy @app {
    to app:8000
  }
}

This config proxies requests to all URLs except those beginning with /static over to the app service. If you recall from earlier, we use the name app here because it matches the service name in the docker-compose.yml file. Caddy handles all requests to /static with the file_server directive, so that Django is never involved in serving static files.
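
For Caddy to find anything under /static, Django has to collect its static files into the directory that the caddy service mounts. Here is a minimal sketch of the relevant settings, assuming the project lives at /app inside the container (the WORKDIR from the Dockerfile). A generated settings.py usually defines BASE_DIR already, in which case that value should be reused.

```python
from pathlib import Path

# Assumption: the project root is /app inside the app container. If your
# settings.py already defines BASE_DIR, keep that definition instead.
BASE_DIR = Path("/app")

# URL prefix that Caddy intercepts with the handle_path /static/* block.
STATIC_URL = "/static/"

# Directory that collectstatic writes to. Because the project directory is
# mounted at /app, this appears as ./static on the host, which is the same
# directory the caddy service mounts at /static.
STATIC_ROOT = BASE_DIR / "static"
```

With these in place, running python manage.py collectstatic inside the app container should populate the directory that Caddy serves.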

Once these updates are in place, you should be able to run docker-compose up -d to get Caddy running.

If you go to localhost in your browser, you'll notice that the locally issued certificate is not trusted. Normally, Caddy will add its own root certificate to the system trust store. Since we're running Caddy inside a container, it can't make that update on the host. If you like, you can copy the root certificate out of Caddy's data directory in order to avoid the browser warnings.

sudo cp ./caddy_data/caddy/pki/authorities/local/root.crt ./

For Firefox, for example, you can go to Settings, Privacy & Security, and click View Certificates. Under the Authorities tab, you should see an Import button. Import the root.crt file we copied out of the caddy_data directory. With that added, visiting localhost should no longer raise the SSL warning.

Adding Celery Instances

Now we'll dig into adding Celery instances so we can do some background job processing with our new project.

Celery requires a broker and a result backend and supports multiple destinations for both. Redis is a popular choice here, and we'll use it as the broker and result backend for simplicity's sake. Adding Redis to the existing docker-compose.yml is straightforward. I'll show that below while omitting what we've already written, since it won't change.

  redis:
    image: 'redis:7.0'
    restart: unless-stopped
    environment:
      - ALLOW_EMPTY_PASSWORD=yes
    ports:
      - 6379:6379

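A note on ALLOW_EMPTY_PASSWORD: this variable is read by the Bitnami Redis image rather than the official redis image used above, which ignores it and accepts unauthenticated connections by default. Either way, running Redis without a password should be limited to development.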

With redis added to the environment, we'll need to update the app.env file to give the app and worker containers enough information to connect. Add the following to the app.env to make the environment variables available to the containers.

CELERY_BROKER_URL=redis://redis:6379/0
CELERY_RESULT_BACKEND=redis://redis:6379/0

Typically when building an app that uses Celery, I like to separate jobs into different types that are assigned to their own queues. This offers a number of advantages, such as isolating trouble with one job type to a single queue, instead of causing problems across the entire app.

Having several queues requires that we spin up a container per queue. Normally, this would result in duplication, as we'd have to redefine much of the same configuration over and over. Fortunately, we can use YAML anchors and aliases, together with the merge key (<<), to avoid this issue.

Here's an updated docker-compose.yml that features two Celery instances.

x-worker-opts: &worker-opts
  build: .
  restart: unless-stopped
  volumes:
    - $PWD:/app
  env_file:
    - app.env
  depends_on:
    - redis

services:
  tutorial-worker1:
    command: tools/start_celery.sh -Q queue1 --concurrency=1
    <<: *worker-opts

  tutorial-worker2:
    command: tools/start_celery.sh -Q queue2 --concurrency=1
    <<: *worker-opts

The start_celery.sh shell script contains the rather lengthy full command to start Celery. The command is wrapped with watchmedo, a command-line utility from the watchdog package, which will reload the Celery instance any time a Python file changes inside the project directory.

#!/bin/sh
watchmedo auto-restart --directory=./ --pattern=*.py --recursive -- celery -A tutorial worker "$@" --loglevel=info

The shell syntax "$@" expands to all of the arguments passed to the script, inserted at that position. In this case, that means the -Q queue argument and the --concurrency setting from the Compose file end up in the celery command.
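
The queue names passed with -Q only matter if tasks are actually routed to them. Here is a minimal sketch of the matching Django settings. It assumes the Celery app loads its configuration from Django settings with the CELERY namespace, as described in Celery's Django guide, and the task module names emails.tasks and reports.tasks are hypothetical.

```python
import os

# Connection URLs come from app.env; the fallbacks mirror the values shown there.
CELERY_BROKER_URL = os.environ.get("CELERY_BROKER_URL", "redis://redis:6379/0")
CELERY_RESULT_BACKEND = os.environ.get("CELERY_RESULT_BACKEND", "redis://redis:6379/0")

# Hypothetical task modules: each group of tasks is sent to one of the queues
# that the worker containers consume with -Q.
CELERY_TASK_ROUTES = {
    "emails.tasks.*": {"queue": "queue1"},
    "reports.tasks.*": {"queue": "queue2"},
}
```

With routes like these, a slow task in one module would only back up its own queue, while the worker for the other queue keeps processing.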

Next Steps

There is of course still much more to cover when it comes to Docker, such as adapting this setup to work with a CI/CD pipeline. Luckily, Docker and Compose have features that make running the same app in different environments fairly straightforward.

For now though, I hope this tutorial has been useful in setting up a development environment for your Django app.

Happy coding!
