As a Django application developer, you have probably encountered situations where you would like to be able to perform periodic asynchronous background tasks. This comes in handy if you want to run background checks, send notifications or build caches.
Motivation
My first choice was to install django-celery-beat. I was quite satisfied with the result: I was able to dynamically configure my periodic tasks according to the user configuration, because Celery reads the execution configuration from the database.
On the other hand, my application now had a dependency on Celery (which has to run as a separate service) and a Redis server (which Celery uses as a message broker).
The containerization of the application became awkward. I was not sure whether I was supposed to include the Celery service inside the container or as a dependency using docker-compose. I also thought it was ridiculous to need a Redis instance only to perform periodic tasks, and I wanted to reduce the size of the supervisord configuration.
I wanted to keep the flexibility but reduce the number of dependencies and the amount of configuration boilerplate.
Solution
I got rid of Celery in the applications where it was not necessary (I was not using the rest of the great Celery features). I just wanted to load the periodicity from django.conf.settings or from the database.
I use Alpine Linux as the base image for my applications. The base operating system can already handle the execution of periodic tasks: it's good old crond. The configuration is described in the Alpine Linux docs.
I created a simple Django management command called setup in my application. Let us also assume I have another Django management command called popularity which is supposed to run every five minutes. In this example I will read the configuration from the Django settings variable CRON_JOBS, which can look like this:
CRON_JOBS = {
    'popularity': '*/5 * * * *'
}
The variable consists of Django management command name and periodicity pairs (the schedule uses the standard five-field crontab syntax: minute, hour, day of month, month, day of week). The setup command will create a crond configuration from it using the python-crontab library. It is supposed to be run before the first start and every time the configuration changes (keep in mind that since you are already creating the CRON job rules from a management command, you can use the ORM to read the configuration from the database).
# management/commands/setup.py
from crontab import CronTab
from django.conf import settings
from django.core.management import BaseCommand


class Command(BaseCommand):
    help = 'Installs CRON jobs according to settings.CRON_JOBS'

    def handle(self, *args, **options):
        # Open the root crontab and drop any previously installed jobs
        cron = CronTab(tabfile='/etc/crontabs/root', user=True)
        cron.remove_all()

        # Create one job per management command
        for command, schedule in settings.CRON_JOBS.items():
            job = cron.new(command='cd /usr/src/app && python3 manage.py {}'.format(command), comment=command)
            job.setall(schedule)
            job.enable()

        cron.write()
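For the CRON_JOBS example above, the generated /etc/crontabs/root should contain roughly the following entry (python-crontab writes the comment at the end of the line):
*/5 * * * * cd /usr/src/app && python3 manage.py popularity # popularity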
Keep in mind that if you are loading the configuration from the database, you have to apply the change once again by running the setup management command. You can achieve this using the call_command method.
from django.core import management

def save_config():
    # do whatever you want
    management.call_command('setup')
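If the schedule lives in the database instead of the settings, one way to trigger the reinstallation automatically is a pair of signal receivers. This is only a sketch: the PeriodicTask model below is a hypothetical example, not part of the application described here.
# models.py - hypothetical sketch, PeriodicTask is not part of the original application
from django.core import management
from django.db import models
from django.db.models.signals import post_delete, post_save
from django.dispatch import receiver


class PeriodicTask(models.Model):
    command = models.CharField(max_length=64, unique=True)  # management command name
    schedule = models.CharField(max_length=32)  # crontab expression, e.g. '*/5 * * * *'
    enabled = models.BooleanField(default=True)


@receiver(post_save, sender=PeriodicTask)
@receiver(post_delete, sender=PeriodicTask)
def reinstall_cron_jobs(sender, **kwargs):
    # Rebuild /etc/crontabs/root from the current database state
    management.call_command('setup')
The setup command would then iterate over PeriodicTask.objects.filter(enabled=True) instead of settings.CRON_JOBS.items().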
Creating the container
As I said before, my Django applications are based on Alpine Linux containers and are executed from the entrypoint.sh, which is responsible for (in the mentioned order):
- executing migrations,
- executing our setup management command, which creates the initial CRON job configuration,
- and initializing the supervisord service (which will manage the gunicorn and crond services).
supervisord configuration
I use supervisord to manage the execution of the gunicorn application server and the crond service.
If the application is located in the /usr/src/app directory and gunicorn is installed in /root/.local/bin/gunicorn, the supervisor.conf could look like this:
[supervisord]
nodaemon=true
[program:gunicorn]
directory=/usr/src/app
command=/root/.local/bin/gunicorn -b 0.0.0.0:8000 -w 4 my_app.wsgi --log-level=debug --log-file=/var/log/gunicorn.log
autostart=true
autorestart=true
priority=900
[program:cron]
directory=/usr/src/app
command=crond -f
autostart=true
autorestart=true
priority=500
stdout_logfile=/var/log/cron.std.log
stderr_logfile=/var/log/cron.err.log
If you are interested in the details of the configuration, don't hesitate to ask me in the comments or check the supervisord documentation.
Dockerfile
The minimal Dockerfile has to contain at least:
- copying the application source code,
- installing the dependencies,
- copying the configuration,
- executing the entry-point.
FROM alpine:3.15
WORKDIR /usr/src/app
# Copy source
COPY . .
# Dependencies (py3-pip provides pip3, postgresql-client provides psql for entrypoint.sh)
RUN apk add --no-cache python3 py3-pip postgresql-client supervisor
RUN pip3 install --user gunicorn
RUN pip3 install --user -r requirements.txt
# Configuration
COPY conf/supervisor.conf /etc/supervisord.conf
# Execution
RUN chmod +x conf/entrypoint.sh
CMD ["conf/entrypoint.sh"]
entrypoint.sh
#!/bin/sh
# Wait until PostgreSQL is ready (psql comes from the postgresql-client package)
until PGPASSWORD=$DATABASE_PASSWORD psql -h "$DATABASE_HOST" -U "$DATABASE_USER" -c '\q'; do
    >&2 echo "Postgres is unavailable - sleeping"
    sleep 1
done
# Execute migrations
python3 manage.py migrate
# Execute our setup management command which installs CRON jobs
python3 manage.py setup
# Execute the supervisord service; exec replaces the shell so supervisord receives container signals
exec supervisord -c /etc/supervisord.conf
For a complete example, check the EvilFlowersCatalog/EvilFlowersCatalog repository.
Top comments (2)
Can it be used only with Docker, without supervisord?
Hi @andreesnavarroo, first of all: sorry I hadn't noticed your comment. To answer your question: sure, there are multiple strategies for achieving it without supervisord. You can use OpenRC, for example. You still need to run crond somehow; check the Alpine Linux FAQ for a brief example. I can write a small blog post about it if you wish. The solution depends on the base operating system you are running in your container.