Recently, I've been working on a legacy project that uses Django for its backend services. In this project, models are indexed with a CharField as the primary key, which also serves as the model's ID. This ID is manually defined by the user each time, and the user must check whether it already exists before importing it into our application. This approach hurts the user experience. Therefore, one of the tasks I was assigned was to create a new incremental integer ID that aligns with the latest Django best practices.
After looking through recent Django reports and discussions on various Django developer forums, I concluded that the best type of incremental integer to use as a primary key (PK) for this data is the AutoField. This type is highly recommended because it requires no manual intervention: the integer increments automatically without needing a separate function to handle it.
Adding this field to our models hasn't been smooth sailing, given that we already have existing data and face restrictions on database access. Additionally, this type of field has strict requirements. In this post, I'm going to show you how this can be done, using a simple example project inspired by my real case and focusing only on the models.py, admin.py, and migration files.
Initial state
Let's start by getting familiar with the example project. Imagine we have a blog application, with the backend service responsible for managing blog comments. For the purpose of this tutorial, we'll concentrate on a specific model named Comment. We'll begin by analyzing the initial definition of this model in the models.py file.
models.py
from django.db import models


class Comment(models.Model):
    old_id = models.CharField(max_length=1000, blank=True, unique=True, primary_key=True)
    blog_id = models.IntegerField(blank=True, null=False)
    user_id = models.IntegerField(blank=True, null=False)
    content = models.TextField(blank=True, null=False)

    def __str__(self):
        return self.content
The code above shows the models.py file for the Comment model. This model currently has four fields:
- old_id, a CharField that is being used as the primary key but will be replaced with a new ID;
- blog_id, which links the comment to the corresponding blog post using its numeric ID;
- user_id, which identifies the user who wrote the comment by their numeric ID;
- and content, which holds the text of the comment.
For the admin.py file, we only have our model added to the admin page using admin.site.register, as shown below. This file will remain unchanged.
admin.py
from django.contrib import admin
from .models import Comment
admin.site.register(Comment)
When we run the application and navigate to the Comment model section in the admin page (e.g., http://www.localhost:8000/admin/myapp/comment/), we can see that there are already three entries in this model, as illustrated below.
Preparation Guidelines : Tips
Before moving forward, I’d like to share with you these three important tips to help you avoid serious errors.
Backup data
Taking a data backup before any operation in development is crucial, especially when there is only a single environment (development, test, and production combined). This ensures that if something goes wrong during the operation, such as data corruption or accidental deletion, you can quickly restore the original state, minimizing downtime and preventing potential loss of valuable information. It acts as a safeguard against unforeseen issues and helps maintain the integrity of the environment.
In Django, you can back up and restore data directly without using the database engine’s command line interface. Here’s how you can do it:
To back up your data, use the following command:
backup command
python manage.py dumpdata > backup.json
And to restore the data, use this command:
restore command
python manage.py loaddata backup.json
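Before touching the schema, it can be worth checking that the backup file is actually readable. The following is a minimal sketch, assuming the default JSON output of dumpdata (a list of records, each with a "model" key); the helper name summarize_fixture is hypothetical and not part of Django.

```python
import json
from collections import Counter


def summarize_fixture(path):
    """Return a Counter mapping each model label to its number of records."""
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)
    # Each dumpdata record carries a "model" key such as "myapp.comment".
    return Counter(record["model"] for record in records)
```

Calling summarize_fixture("backup.json") and comparing the counts with what the admin page shows gives a quick sanity check that the dump is complete.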
Enable DEBUG Mode for effective troubleshooting
Setting DEBUG = True in the project's settings.py file is important during these operations because it enables detailed error messages and debugging information, which are essential for identifying and fixing issues quickly. It provides visibility into the application's behavior, making it easier to trace problems. However, it should only be used in a non-production environment, as it can expose sensitive information.
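One way to reduce the risk of leaving DEBUG on by accident is to drive it from an environment variable. This is only a sketch: the variable name DJANGO_DEBUG and the helper env_flag are assumptions made for illustration, not Django conventions.

```python
import os


def env_flag(name, default=False):
    """Interpret an environment variable as a boolean flag."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


# In settings.py: DEBUG stays off unless DJANGO_DEBUG is explicitly set.
DEBUG = env_flag("DJANGO_DEBUG")
```

With this in place, debugging is enabled only in shells where the variable has been set, for example while running the migration steps locally.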
Shutting Down Django Server
Turning off the Django application server during migration operations is crucial to prevent users from interacting with the database while it's being altered. This reduces the risk of data corruption or conflicts that can arise if the application tries to read or write data during the migration process. So ensuring the server is down during migrations helps maintain the integrity of the database and ensures that the migration completes without issues.
Operation steps
Now that we have a clearer understanding of the goal, let’s dive into the steps we must take to achieve it. Be sure not to skip any instructions, as that could lead to some errors.
Instructions 1 : Add Integer field to your model
In our model definition (models.py file), we are going to add an IntegerField, which will take the name of the new ID we want to introduce. In our case, we'll name it new_id.
After making this change, your models.py file should look like this:
models.py
from django.db import models


class Comment(models.Model):
    new_id = models.IntegerField(default=0, unique=True)  # This is the new field
    old_id = models.CharField(max_length=1000, blank=True, unique=True, primary_key=True)
    blog_id = models.IntegerField(blank=True, null=False)
    user_id = models.IntegerField(blank=True, null=False)
    content = models.TextField(blank=True, null=False)

    def __str__(self):
        return self.content
Run the makemigrations command. This will generate a migration file with an AddField operation, as shown below.
0011_comment_new_id.py
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0010_comment'),
    ]

    operations = [
        migrations.AddField(
            model_name='comment',
            name='new_id',
            field=models.IntegerField(default=0, unique=True),
            preserve_default=False,
        ),
    ]
Generate two empty migration files for the same app by running makemigrations myapp --empty twice (where myapp is the name of the application where the Comment model is defined). It's recommended to rename the migration files to give them meaningful names, as shown in the examples below.
Now, we'll copy the AddField operation from the first of the three newly generated migration files into the last one (the third file) and change AddField to AlterField (don't forget to import models). The result in our case looks like this:
0013_add_new_id_field.py
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0012_remove_new_id_null'),
    ]

    operations = [
        migrations.AlterField(
            model_name='comment',
            name='new_id',
            field=models.IntegerField(default=0, unique=True),
            preserve_default=False,
        ),
    ]
In the first migration file, change unique=True to null=True. This creates an intermediate nullable field and defers the unique constraint until we've populated all rows with unique values. The first migration file should look similar to this:
0011_comment_new_id.py
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0010_comment'),
    ]

    operations = [
        migrations.AddField(
            model_name='comment',
            name='new_id',
            field=models.IntegerField(default=0, null=True),
            preserve_default=False,
        ),
    ]
In the first empty migration file (the second of the three new migration files), we'll define a function that will be used inside a RunPython operation to generate a unique integer value for each existing row. For example:
0012_remove_new_id_null.py
from django.db import migrations


def gen_new_id(apps, schema_editor):
    # Use the historical model provided by the migration state.
    MyModel = apps.get_model("myapp", "Comment")
    # Assign 0, 1, 2, ... so that every existing row gets a distinct value.
    for i, row in enumerate(MyModel.objects.all()):
        row.new_id = i
        row.save(update_fields=["new_id"])
        print(f"Change row new_id with the content {row.content} to {i}")


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0011_comment_new_id'),
    ]

    operations = [
        migrations.RunPython(gen_new_id, reverse_code=migrations.RunPython.noop),
    ]
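The reason for splitting the change into three migrations can be illustrated at the database level. The sketch below uses Python's built-in sqlite3 module rather than Django migrations, with a simplified comment table; the table layout, index name, and function names are assumptions for illustration only. It shows that enforcing uniqueness right after adding a column with a shared default fails, while the add, populate, then constrain order succeeds.

```python
import sqlite3


def make_demo_table():
    """Create an in-memory table with three existing rows, like the example project."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comment (old_id TEXT PRIMARY KEY, content TEXT)")
    conn.executemany(
        "INSERT INTO comment (old_id, content) VALUES (?, ?)",
        [("a", "first"), ("b", "second"), ("c", "third")],
    )
    return conn


def naive_single_step_fails(conn):
    """Return True if enforcing uniqueness fails when every row shares the default 0."""
    conn.execute("ALTER TABLE comment ADD COLUMN new_id INTEGER DEFAULT 0")
    try:
        conn.execute("CREATE UNIQUE INDEX comment_new_id_uniq ON comment (new_id)")
    except sqlite3.DatabaseError:
        return True
    return False


def add_unique_id_in_three_steps(conn):
    """Add the column, populate distinct values, then enforce uniqueness."""
    # Step 1 (like AddField with null=True): nullable column, no unique constraint yet.
    conn.execute("ALTER TABLE comment ADD COLUMN new_id INTEGER")
    # Step 2 (like RunPython): give every existing row a distinct value.
    old_ids = [row[0] for row in conn.execute("SELECT old_id FROM comment ORDER BY old_id")]
    for i, old_id in enumerate(old_ids):
        conn.execute("UPDATE comment SET new_id = ? WHERE old_id = ?", (i, old_id))
    # Step 3 (like AlterField with unique=True): the constraint can now be applied.
    conn.execute("CREATE UNIQUE INDEX comment_new_id_uniq ON comment (new_id)")
    return [row[0] for row in conn.execute("SELECT new_id FROM comment ORDER BY new_id")]
```

The three functions mirror the three migration files: the nullable AddField, the RunPython data migration, and the final AlterField that restores unique=True.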
Now apply the migrations directly with the migrate command.
N.B.: Don't forget to update the migration names in each dependencies list.
N.B.: Make sure that the preserve_default option in both the first and last migration operations is set to False.
You should see output similar to this in your terminal:
Instructions 2 : Replace IntegerField with AutoField
Up to this point, we've only assigned a unique integer to all existing rows of our Comment model. The upcoming instructions will guide us in achieving our goal: converting the new_id field type into an AutoField.
To do that, we need to change the type of the new_id field from models.IntegerField(default=0, unique=True) to models.AutoField(unique=True, null=False, primary_key=True). Additionally, ensure that old_id is no longer the primary key of this model (i.e., the primary_key option should not be present). The final result should look like this:
models.py
from django.db import models


class Comment(models.Model):
    new_id = models.AutoField(unique=True, null=False, primary_key=True)  # This is the new field
    old_id = models.CharField(max_length=1000, blank=True, unique=True)
    blog_id = models.IntegerField(blank=True, null=False)
    user_id = models.IntegerField(blank=True, null=False)
    content = models.TextField(blank=True, null=False)

    def __str__(self):
        return self.content
Now we can apply the migration commands: makemigrations and migrate. We must ensure that the last migration file contains only two AlterField operations. After running the application server, navigate to any entry of the Comment model in the admin page. You should find that the new field (new_id) is not displayed like the other fields, indicating that it has become an auto-incrementing field and the primary key for the Comment table.
Our final step to get the application running smoothly is to try adding a new row to the model. Initially, this will cause an error because Django doesn't yet know where to start assigning the AutoField value. It will attempt to assign the value 1, which already exists for the second row.
To fix this, simply return to the admin page and try adding a new row again.
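Why a retry can work depends on the database backend. The toy simulation below is not Django code; it assumes a backend whose sequence keeps advancing even after a failed insert (PostgreSQL sequences behave this way), and the names ToySequence and insert_with_retries are hypothetical. With existing IDs 0, 1, and 2, as produced by the enumerate call in the data migration, it counts how many attempts are needed before a free value is reached.

```python
class ToySequence:
    """Counter that advances on every call, like a database sequence."""

    def __init__(self, start=1):
        self._next = start

    def nextval(self):
        value = self._next
        self._next += 1
        return value


def insert_with_retries(existing_ids, sequence, max_attempts=10):
    """Return (new_id, attempts) for the first sequence value not already in use."""
    for attempt in range(1, max_attempts + 1):
        candidate = sequence.nextval()
        if candidate not in existing_ids:
            existing_ids.add(candidate)
            return candidate, attempt
        # A collision here corresponds to the duplicate-key error seen in the admin.
    raise RuntimeError("no free id found within max_attempts")
```

Under these assumptions, several attempts may be needed on a table with many rows. Django's sqlsequencereset management command, which prints SQL for resetting sequences for a given app, may be a more direct fix in that situation.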
Now, if we go back to the model section in the Django admin page, we can see the magic: the row we just tried to add should be there, which means it has been successfully added.
Then, if you want to keep only the new ID field, you can remove the old ID field (old_id) from the Comment model definition and then apply the migration commands.
Conclusion
In this article, we explored the process of converting an existing CharField primary key into an AutoField in a Django model while preserving the integrity of the existing data. By carefully following each step, we were able to successfully implement an auto-incrementing integer primary key, which streamlines future data management and improves overall performance.
This method is essential for maintaining a clean and efficient database structure, especially when working with legacy projects and restricted database access. By adopting these practices, you can ensure that your Django applications remain scalable and adaptable as they evolve.
I hope this tutorial has provided you with valuable insights and practical steps for managing similar tasks in your own projects. If you have any questions or thoughts about the topic, please feel free to share them in the comments section below.