Django Database Migrations: A Comprehensive Overview
by Damian Hites
The Django web framework is designed to work with an SQL-based relational database backend, most commonly PostgreSQL or MySQL. If you’ve never worked directly with a relational database before, learning to manage how your data is stored and accessed, and to keep it consistent with your application code, is an important skill to master.
You’ll need a contract between your database schema (how your data is laid out in your database) and your application code, so that when your application tries to access data, the data is where your application expects it to be. Django provides an abstraction for managing this contract in its ORM (Object-Relational Mapping).
Over your application’s lifetime, it’s very likely that your data needs will change. When this happens, your database schema will probably need to change as well. Effectively, your contract (in Django’s case, your Models) will need to change to reflect the new agreement, and before you can run the application, the database will need to be migrated to the new schema.
Django’s ORM comes with a system for managing these migrations to simplify the process of keeping your application code and your database schema in sync.
Django’s migration tool simplifies the manual nature of the migration process described above while taking care of tracking your migrations and the state of your database. Let’s take a look at the three-step migration process with Django’s migration tool.
In Django, the contract between your database schema and your application code is defined using the Django ORM. You define a data model using Django ORM’s models and your application code interfaces with that data model.
When you need to add data to the database or change the way the data is structured, you simply create a new model or modify an existing model in some way. Then you can make the required changes to your application code and update your unit tests, which should verify your new contract (given sufficient test coverage).
Django maintains the contract largely through its migration tool. Once you make changes to your models, Django has a simple command that will detect those changes and generate migration files for you.
Finally, Django has another simple command that will apply any unapplied migrations to the database. Run this command any time you are deploying your code to the production environment. Ideally, you’ll have deploy scripts that would run the migration command right before pushing your new code live.
Django takes care of tracking migrations for you. Each generated migration file has a unique name that serves as an identifier. Django maintains a database table of applied migrations: when a migration is applied, it is recorded there, which ensures that only unapplied migrations are run.
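To make the tracking idea concrete, here is a minimal sketch of the bookkeeping using Python’s built-in sqlite3 module. It is a simplified stand-in, not Django’s implementation: the applied_migrations table, the helper functions, and the migration names are all hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical migration names; in a real project these come from the
# files in each app's migrations/ directory.
AVAILABLE = [("blog", "0001_initial"), ("blog", "0002_add_feedback")]


def ensure_tracking_table(conn):
    """Create a simplified stand-in for a migration-tracking table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations ("
        "app TEXT NOT NULL, name TEXT NOT NULL, "
        "applied TEXT DEFAULT CURRENT_TIMESTAMP, "
        "UNIQUE (app, name))"
    )


def pending_migrations(conn, available):
    """Return the migrations from `available` that are not recorded yet."""
    rows = conn.execute("SELECT app, name FROM applied_migrations").fetchall()
    applied = set(rows)
    return [m for m in available if m not in applied]


def apply_pending(conn, available):
    """Record every pending migration and return the ones that were run."""
    ran = []
    for app, name in pending_migrations(conn, available):
        # A real runner would execute the migration's operations here.
        conn.execute(
            "INSERT INTO applied_migrations (app, name) VALUES (?, ?)",
            (app, name),
        )
        ran.append((app, name))
    conn.commit()
    return ran


conn = sqlite3.connect(":memory:")
ensure_tracking_table(conn)
first_run = apply_pending(conn, AVAILABLE)   # both migrations are recorded
second_run = apply_pending(conn, AVAILABLE)  # nothing is left to run
```

Running the same step twice is safe because the second pass finds every migration already recorded, which is the property the tracking table is there to provide.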
The migration files that Django generates should be included in the same commit as their corresponding application code so that the code is never out of sync with your database schema.
Django has the ability to roll back to a previous migration. The auto-generated operations feature built-in support for reversing an operation. In the case of a custom operation, it’s on you to make sure the operation can be reversed to ensure that this functionality is always available.
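The idea of pairing every operation with its reverse can be sketched in plain Python. This is a toy model under stated assumptions, not Django’s API: the Operation class, the add_column and roll_back helpers, and the notes table are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Schema = dict  # maps a table name to its list of column names


@dataclass
class Operation:
    """A toy migration step with an optional reverse."""
    name: str
    forwards: Callable[[Schema], None]
    backwards: Optional[Callable[[Schema], None]] = None

    @property
    def reversible(self) -> bool:
        return self.backwards is not None


def add_column(table: str, column: str) -> Operation:
    """Build an operation that knows how to undo itself."""
    return Operation(
        name=f"add {table}.{column}",
        forwards=lambda schema: schema[table].append(column),
        backwards=lambda schema: schema[table].remove(column),
    )


def roll_back(schema: Schema, operations: list) -> None:
    """Undo operations in reverse order; refuse if any step lacks a reverse."""
    irreversible = [op.name for op in operations if not op.reversible]
    if irreversible:
        raise RuntimeError(f"cannot reverse: {irreversible}")
    for op in reversed(operations):
        op.backwards(schema)


# Hypothetical table used only for this demonstration.
schema = {"notes": ["id", "text"]}
ops = [add_column("notes", "created_on")]
for op in ops:
    op.forwards(schema)
after_forwards = list(schema["notes"])
roll_back(schema, ops)
after_rollback = list(schema["notes"])
```

An operation built without a backwards function makes roll_back refuse to run at all, which mirrors why a custom operation needs a reverse if rollbacks are to stay available.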
Now that we have a basic understanding of how migrations are handled in Django, let’s look at a simple example of migrating an application from one state to the next. Let’s assume we have a Django project for our blog and we want to make some changes.
First, we want to allow for our posts to be edited before publishing to the blog. Second, we want to allow people to give feedback on each post, but we want to give them a curated list of options for that feedback. In anticipation of those options changing, we want to define them in our database rather than in the application code.
For the purposes of demonstration, we’ll set up a very basic Django project called foo:
django-admin startproject foo
Within that project, we’ll set up our blogging application. From inside the project’s base directory:
./manage.py startapp blog
Register our new application with our project in foo/settings.py by adding blog to INSTALLED_APPS:
INSTALLED_APPS = [
    ...
    'blog',
]
In blog/models.py we can define our initial data model:
from django.db import models


class Post(models.Model):
    slug = models.SlugField(max_length=50, unique=True)
    title = models.CharField(max_length=50)
    body = models.TextField()
In our simple application, the only model we have represents a blog post. It has a slug for uniquely identifying the post, a title, and the body of the post.
Now that we have our initial data model defined, we can generate the migrations that will set up our database:
./manage.py makemigrations
Notice that the output of this command indicates that a new migration file was created at blog/migrations/0001_initial.py containing a command to create the Post model.
If we open the migration file, it will look something like this:
# Generated by Django 2.2 on 2019-04-21 18:04

from django.db import migrations, models


class Migration(migrations.Migration):

    initial = True

    dependencies = [
    ]

    operations = [
        migrations.CreateModel(
            name='Post',
            fields=[
                ('id', models.AutoField(
                    auto_created=True,
                    primary_key=True,
                    serialize=False,
                    verbose_name='ID'
                )),
                ('slug', models.SlugField(unique=True)),
                ('title', models.CharField(max_length=50)),
                ('body', models.TextField()),
            ],
        ),
    ]
Most of the migration’s contents are pretty easy to make sense of. This initial migration was auto-generated, has no dependencies, and has a single operation: create the Post Model.
Now let’s set up an initial SQLite database with our data model:
./manage.py migrate
The default Django configuration uses SQLite3, so the above command generates a file called db.sqlite3 in your project’s root directory. Using the SQLite3 command line interface, you can inspect the contents of the database and of certain tables.
To enter the SQLite3 command line tool, run:
sqlite3 db.sqlite3
Once in the tool, list all tables generated by your initial migration:
sqlite> .tables
Django comes with a number of initial models that will result in database tables, but the two that we care about right now are blog_post, the table corresponding to our Post Model, and django_migrations, the table Django uses to track migrations.
Still in the SQLite3 command line tool, you can print the contents of the django_migrations table:
sqlite> select * from django_migrations;
This will show all migrations that have run for your application. If you look through the list, you’ll find a record indicating that the 0001_initial migration was run for the blog application. This is how Django knows that your migration has been applied.
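The same check can be made from Python instead of the SQLite3 shell. The sketch below uses only the standard library and reads the app and name columns of django_migrations; the applied_migrations helper is a hypothetical name, not part of Django.

```python
import sqlite3


def applied_migrations(db_path, app_label):
    """Return the names of migrations recorded for `app_label`, oldest first."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM django_migrations WHERE app = ? ORDER BY id",
            (app_label,),
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Called from the project’s root directory as applied_migrations("db.sqlite3", "blog"), this should return a list that includes 0001_initial once the initial migration has been applied.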
Now that the initial application is set up, let’s make changes to the data model. First, we’ll add a field called published_on to our Post Model. This field will be nullable. When we want to publish something, we can simply indicate when it was published.
Our Post Model will now be:
from django.db import models


class Post(models.Model):
    slug = models.SlugField(max_length=50, unique=True)
    title = models.CharField(max_length=50)
    body = models.TextField()
    published_on = models.DateTimeField(null=True, blank=True)
Next, we want to add support for accepting feedback on our posts. We want two models here: one for tracking the options we display to people, and one for tracking the actual responses:
from django.conf import settings
from django.db import models


class FeedbackOption(models.Model):
    slug = models.SlugField(max_length=50, unique=True)
    option = models.CharField(max_length=50)


class PostFeedback(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        related_name='feedback',
        on_delete=models.CASCADE
    )
    post = models.ForeignKey(
        'Post',
        related_name='feedback',
        on_delete=models.CASCADE
    )
    option = models.ForeignKey(
        'FeedbackOption',
        related_name='feedback',
        on_delete=models.CASCADE
    )
With our model changes done, let’s generate our new migrations:
./manage.py makemigrations
Notice that this time, the output indicates a new migration file, blog/migrations/0002_auto_<YYYYMMDD>_<...>.py, with the following changes:
- Create model FeedbackOption
- Add field published_on to post
- Create model PostFeedback
These are the three changes that we introduced to our data model.
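To get a feel for how changes like these can be detected, here is a deliberately simplified sketch that compares two snapshots of a data model. It is not how Django’s detection is implemented, and it only notices new models and new fields; the diff_models function and the dictionary representation of the model state are assumptions made for illustration.

```python
def diff_models(old, new):
    """Describe the changes needed to turn the `old` state into `new`.

    Each state maps a model name to the set of its field names.
    """
    changes = []
    for model in sorted(new):
        if model not in old:
            changes.append(f"Create model {model}")
            continue
        # Only additions are handled in this sketch.
        for field in sorted(new[model] - old[model]):
            changes.append(f"Add field {field} to {model.lower()}")
    return changes


before = {"Post": {"id", "slug", "title", "body"}}
after = {
    "Post": {"id", "slug", "title", "body", "published_on"},
    "FeedbackOption": {"id", "slug", "option"},
    "PostFeedback": {"id", "user", "post", "option"},
}
changes = diff_models(before, after)
```

For the two snapshots of our blog models, changes holds one entry for each of the three changes we made. A real tool also has to handle removals, renames, and altered field options, which this sketch ignores.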
Now, if we go ahead and open the generated file, it will look something like this:
# Generated by Django 2.2 on 2019-04-21 19:31

from django.conf import settings
from django.db import migrations, models
import django.db.models.deletion


class Migration(migrations.Migration):

    dependencies = [
        migrations.swappable_dependency(settings.AUTH_USER_MODEL),
        ('blog', '0001_initial'),
    ]

    operations = [
        migrations.CreateModel(
            name='FeedbackOption',
            fields=[
                ('id', models.AutoField(
                    auto_created=True,
                    primary_key=True,
                    serialize=False,
                    verbose_name='ID'
                )),
                ('slug', models.SlugField(unique=True)),
                ('option', models.CharField(max_length=50)),
            ],
        ),
        migrations.AddField(
            model_name='post',
            name='published_on',
            field=models.DateTimeField(blank=True, null=True),
        ),
        migrations.CreateModel(
            name='PostFeedback',
            fields=[
                ('id', models.AutoField(
                    auto_created=True,
                    primary_key=True,
                    serialize=False,
                    verbose_name='ID'
                )),
                ('option', models.ForeignKey(
                    on_delete=django.db.models.deletion.CASCADE,
                    related_name='feedback',
                    to='blog.FeedbackOption'
                )),
                ('post', models.ForeignKey(
                    on_delete=django.db.models.deletion.CASCADE,
                    related_name='feedback',
                    to='blog.Post'
                )),
                ('user', models.ForeignKey(
                    on_delete=django.db.models.deletion.CASCADE,
                    related_name='feedback',
                    to=settings.AUTH_USER_MODEL
                )),
            ],
        ),
    ]
Similar to our first migration file, each operation maps to changes that we made to the data model. The main differences to note are the dependencies. Django has detected that our change relies on the first migration in the blog application and, since we depend on the auth user model, that is marked as a dependency as well.
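Declared dependencies like these are what allow migrations to be applied in a safe order. As a rough illustration, the sketch below orders three migrations with the standard library’s graphlib module (Python 3.9+). It is a simplified model rather than Django’s own loader, and the labels are assumptions: ("auth", "__first__") stands in for whichever migration defines the user model, and "0002_auto" is shortened from the generated file name.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each migration maps to the set of migrations it depends on.
dependencies = {
    ("auth", "__first__"): set(),
    ("blog", "0001_initial"): set(),
    ("blog", "0002_auto"): {("auth", "__first__"), ("blog", "0001_initial")},
}

# static_order() yields every migration after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
```

Whatever order the two independent migrations come out in, the second blog migration always lands after both of them.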
Now that we have our migrations generated, we can apply the migrations:
./manage.py migrate
The output tells us that the latest generated migration has been applied. If we inspect our modified SQLite database, we’ll see that our new migration is recorded in the django_migrations table, the new tables are present, and our new field on the Post Model is reflected in the blog_post table.
Now, if we were to deploy our changes to production, the application code and database would be updated, and we would be running the new version of our application.
In this particular example, the blog_feedbackoption table (generated by our migration) will be empty when we push our code change. If our interface has been updated to surface these options, there is a chance that we forget to populate them when we push. Even if we don’t forget, we have the same problem as before: new objects are created in the database while the new application code is deploying, so there is a window, however brief, during which the interface would show a blank list of options.
To help in scenarios where the required data is somewhat tied to the application code or to changes in the data model, Django provides a utility for making data migrations. These are migration operations that simply change the data in the database rather than the table structure.
Let’s say we want to have the following feedback options: Interesting, Mildly Interesting, Not Interesting and Boring. We could put our data migration in the same migration file that we generated previously, but let’s create another migration file specifically for this data migration...
... check out the code on Kite's blog! Continue with "Bonus: Data Migrations"
Damian Hites is the CTO of Sylo, which is looking to improve Social Media Marketing by offering 3rd party trusted measurement. He has 10+ years of experience writing software and leading teams.