What is UUID?
UUID (universally unique identifier) is a 128-bit label, generated with the use of the standardized algorithm, almost impossible to duplicate, which makes it a perfect identifier. It’s represented as a string that contains five groups of 32 hexadecimal digits separated by hyphens. Here is an example:
f6a7f195-8220-408c-9906-7395b870db61
Why it’s so useful?
The question arises as to why UUID is better than the typical sequential integer Primary Key (PK) generated by the database, which seems to be handled efficiently.
We have been using integer IDs for many years, as this is what databases and Django does by default. We decided to move to UUID for a few reasons.
Firstly, UUIDs are unpredictable, unlike standard integer IDs. This lack of predictability enhances security.
Secondly, consider a system usable both online and offline, where users can create new content. When the user reconnects to the internet, the new instances are merged into the database. With conventional auto-incrementing primary keys, there's a significant chance of conflicts. However, by employing UUIDs as primary keys, the likelihood of encountering such conflicts is nearly eliminated, providing a robust solution to potential problems.
Aside from that UUIDs are less prone to human error when writing queries and working with code, it's hard to spot that a wrong field has been used in a JOIN statement and using integer IDs is less likely to cause an error in cases like that. With UUIDs it's less likely.
How to migrate my data from ID to UUID while ensuring both the wolves are satisfied and the sheep remain unharmed
So, what should you do if you decide that UUIDs are necessary and want to migrate your database objects to use UUIDs as primary keys? If you’re in the early stages, without a production environment, the transition is relatively straightforward. However, if your system is already in production, hosting millions of objects with complex relationships, you’ll need to carefully consider this decision. The process can be challenging and require significant effort. But don’t worry, I’ll guide you through it to make it as smooth as possible. 🙂
Happy scenario
To illustrate the problem, imagine we have the following Order model, which represents an order placed by a customer in an online store.
class Order(models.Model):
id = models.PositiveIntegerField(unique=True)
created_at = models.DateTimeField(auto_now_add=True)
total_price = models.DecimalField(
max_digits=10,
decimal_places=2,
)
currency = models.CharField(
max_length=3,
)
The most straightforward solution, that probably came to your mind, is to add a new UUID field that will be the new primary key (PK). Let’s name it token. Populate it with unique values and make this field the new PK. Let's illustrate this process using the Order model as an example.
Note: All examples in this article are for illustration purposes; production systems will need solutions that scale with locking and batch updates.
-
Migration for adding new
tokenfield:
from django.db import migrations, models import uuid class Migration(migrations.Migration): dependencies = [ ('order', '0135_alter_order_options'), ] operations = [ migrations.AddField( model_name='order', name='token', field=models.UUIDField(null=True, unique=True, blank=True), ) ] -
Custom migration that populates the
tokenfield of existing instances with uniqueUUIDvalues.
from django.db import migrations from django.contrib.postgres.functions import RandomUUID def set_order_token_values(apps, _schema_editor): Order = apps.get_model("order", "Order") Order.objects.filter(token__isnull=True).update( token=RandomUUID() ) class Migration(migrations.Migration): dependencies = [ ('order', '0136_order_token'), ] operations = [ migrations.RunPython( set_order_token_values, migrations.RunPython.noop, ), ] -
Changing token into the primary key
from django.db import migrations, models import uuid class Migration(migrations.Migration): dependencies = [ ("order", "0137_fulfil_order_token"), ] operations = [ migrations.AlterField( model_name="order", name="token", field=models.UUIDField( default=uuid.uuid4, editable=False, primary_key=True, serialize=False, unique=True, ), ), ]
And... this will work, but EXCLUSIVELY when your model contains ONLY one to many relations.
Now, consider that the Order model has a many-to-one relationship with the Invoice model, where each Invoice represents a generated invoice.
class Invoice(models.Model):
order = models.ForeignKey(
Order,
related_name="invoices",
null=True,
on_delete=models.SET_NULL,
)
In this case, the migration would fail with the following error:
django.db.utils.InternalError: cannot drop constraint order_ordereditem_pkey on table order_order because other objects depend on it.
Why is that?
The reason is that your related models will still have primary key columns pointing to the obsolete pk values, which will no longer exist. If things seem a bit foggy, no worries! Check out the next section that explains in detail how relationships are established in the database.
How the relations are defined in your database?
Before we go on let me explain how the relations that you define in ORM are implemented in the database. If it’s clear to you, you can skip this section and go directly to So what is the problem?
The relational database consists of tables with columns and constraints that specify the rules for the data in tables. All limitations are ensured by the constraints, like uniqueness, value requirement, primary keys, and relations.
- Many-to-one
In the many-to-one relation, we have two models where the entity of model A can have multiple associations with entities of model B, but the entity of model B can have only one association with the entity of model A.
In the Django framework, it’s defined with the use of ForeignKey in model B, which points the model A. Under the hood, the new column is added to the model B, and the foreign key constraint is created. The column name is the field name with _id ending and will keep the ids of associated entities. The foreign key constraint is just the rule of the database table (model A) that says the values in a given column must match the values (in our case the id) of a column in some other table (model A). The referenced column (in our case is on mode a) must contain only unique values. So the foreign key constraint depends on the unique or primary key constraint from the relational table.
- One-to-one
In one-to-one relationship, the entity of model A can have only one association with entities of model B, and the same rule applies to model A in relation to model B. In the Django framework, it’s defined with the use of the OneToOne field on any of those models. Under the hood, the same things happened as in a many-to-one relationship, but also the unique constraint for the newly created column is added, to ensure that only one entity of model A will exist with the given field value of the pointed column in table B.
- Many-to-many
In many-to-many relation entities of model A can have multiple associations with entities of model B, and model B can have multiple associations with entities of model A. In Django framework, it’s defined with the use of the ManyToMany field on any model. Under the hood, this case is a little bit more complex than the previous ones. As we can have multiple connections from both sides of the relationship, we cannot just create a new column on one of the tables. Instead, a new table is created that keeps the ids from both sides of relations. The new table contains columns, one for each site of relations, that keep the corresponding entity field value. To ensure references, for each field the foreign key constraint is created. Additionally, a unique constraint is added to ensure that there is only one row responsible for the relationship between every two instances.
So what is the problem?
As we learned in the previous section, relationships based on foreign key constraints utilize a primary key or unique constraint. Modifying the model's pk to UUID involves recreating the primary key, which deletes the old constraint and creates a new one. If you try to change the primary key, you might encounter errors such as:
django.db.utils.InternalError: cannot drop constraint order_ordereditem_pkey on table order_order because other objects depend on it.
This error implies that the primary key constraint of the id field, which defines foreign key relations, cannot be dropped.
That's why we had to update the relations before modifying the pk.
The solution in a nutshell
To solve the issue, we will introduce a new field called old_id to keep a copy of the old Order id. Then, we will redirect our existing relationships to this new field, and create another field on related models to store the new UUID values (that will become the pk values) of related instances. After changing the ID to UUID in our target model, on the related models, we'll transform the fields that store related instances' UUID values into relational fields, and remove the old relational fields pointing to old_ids. This way, we’ll maintain the same relationships, but with pointing to new UUID values. Sounds complicated? Let’s explore the whole process in the example to make it clear.
💡 Solution Shortcut
- On the target model (e.g.
Order) create the new fields, one the UUID field (e.g.tokenfield) that will become a PK, and one for keeping the old ID (e.g.old_idfield) - Populate those fields
- Redirect each existing relation to the field with the old id (
old_idfield) - On each relation model create a new field that will keep the new UUID values of the related instances (e.g.
order_token) - Change the
UUIDfield into the primary key. - On each relation model, change the field that keeps new UUID values (
order_token) into the relation field (e.g.ForeginKey) - On each relation model, remove the old relation field
- On each relation model, rename the new relation field to the initial name
Preparation
Before delving into managing relationships, let's add an old_id field to store a copy of the previous integer primary key. We also need a token field, which will later serve as our new primary key. Let’s update the changes introduced in our happy scenario.
-
The first step is adding a new nullable
tokenandold_idfields.Here are the changes in the
models.pyfile:
class Order(models.Model): token = models.UUIDField(null=True, unique=True) old_id = models.PositiveIntegerField(unique=True, null=True, blank=True)And corresponding data migration:
from django.db import migrations, models import uuid class Migration(migrations.Migration): dependencies = [ ('order', '0135_alter_order_options'), ] operations = [ migrations.AddField( model_name='order', name='token', field=models.UUIDField(null=True, unique=True, blank=True), ), migrations.AddField( model_name='order', name='old_id', field=models.PositiveIntegerField(blank=True, null=True), ), ] -
Next, besides populating the
tokenfield of existing instances with uniqueUUIDvalues, we'll also copy the currentidto theold_idfield using a custom migration.
from django.db import migrations from django.contrib.postgres.functions import RandomUUID def set_order_token_values(apps, _schema_editor): Order = apps.get_model("order", "Order") Order.objects.filter(token__isnull=True).update( token=RandomUUID() ) def set_order_old_id(apps, schema_editor): Order = apps.get_model("order", "Order") Order.objects.all().update(old_id=F("id")) class Migration(migrations.Migration): dependencies = [ ('order', '0136_order_token_and_old_id'), ] operations = [ migrations.RunPython( set_order_token_values, reverse_code=migrations.RunPython.noop, ), migrations.RunPython( set_order_old_id, reverse_code=migrations.RunPython.noop ), ]
Handling database relations
Now, we can move to the question: what is the easiest and safest way to update the model relations?
We need to handle all relations: many to one, many to many and one to one. In all cases, we can split the changes into two parts: before model id migration to UUID, and after.
Now, let's examine each step for each relation type in detail.
Few tips at the beginning:
- Make sure that at each stage of the migration, you hold the identifier of the related objects that will allow to recreate the relationships.
- Always create a full backup and test the code before running the migrations. If something goes wrong, it might be challenging or even impossible to undo.
- Be patient. If the model you're migrating has numerous relations, preparing everything might take some time, but don’t give up it’s doable!
💡 Our goal in this data migration is to preserve the references between objects.
Before UUID migration
This phase comprises two steps:
- Redirect each existing relation to the
old_idfield. - For each relation model, create and populate a new field (
order_token) that will store the newUUIDvalues of the relatedorderinstances.
Let’s see how to apply those steps to both many-to-one, one-to-one, and many-to-many relations.
- Many-to-one and one-to-one relationship
The many-to-one and one-to-one relations are similar. The only difference is that the one-to-one relation has an additional unique constraint. Therefore, the method for handling these relations is the same.
In those cases, we need to prepare migrations on the model where the relation is defined. Let’s see the process on an example of Invoice that has Many-to-one relation with Order model:
-
The first step is to define the
UUIDfield that will hold the newUUIDvalues of related objects. In the example below, we have theInvoicemodel that is related toOrder. We're adding a new field,order_token, which will be filled with the newtokenvalue of the correspondingOrderinstances.
class Invoice(models.Model): order = models.ForeignKey( Order, related_name="invoices", null=True, blank=True, on_delete=models.SET_NULL, ) order_token = models.UUIDField(null=True)The migration as follows:
from django.db import migrations, models class Migration(migrations.Migration): dependencies = [ ("invoice", "0006_invoiceevent_app"), ] operations = [ migrations.AddField( model_name="invoice", name="order_token", field=models.UUIDField(null=True), ), ] -
The next step is to populate this new field with corresponding
tokenvalues. We could write a Python function for this, but it could be challenging to write an efficient one for a large dataset. The quickest solution is to write a simple SQL operation:
from django.db import migrations class Migration(migrations.Migration): dependencies = [ ("invoice", "0007_invoice_order_token"), ("order", "0137_fulfil_order_token_and_old_id"), ] operations = [ migrations.RunSQL( """ UPDATE invoice_invoice SET order_token = ( SELECT token FROM order_order WHERE invoice_invoice.order_id = order_order.id ) WHERE order_id IS NOT NULL; """, reverse_sql=migrations.RunSQL.noop, ), ]So, what's going on here? We're updating the
order_tokencolumn in theinvoice_invoicetable, which corresponds to theInvoicemodel. For all instances where theOrderrelation exists, we populate the neworder_tokenfield with thetokenvalue from the correspondingOrderinstance. Thistokenvalue is sourced from theorder_ordertable, where theidmatches theorder_idfrominvoice_invoicetable.Also, note that the dependencies list is extended with the order migration. We need to ensure that this migration is applied after the migration that populates the new
tokenfield in the order model.💡 To write an SQL operation, it's essential to know the table and column names. In Django, the table name is formed by joining the app name and model name with an underscore. If you are unsure about your table name, you can always check it in your database shell.
-
The final step is to change the current relation field to point to the
old_idfield. This action will drop the existing foreign key constraint that relies on the primary key constraint, and create a new one for theold_idfield. You can see this change in themodels.pyfile:
class Invoice(models.Model): order = models.ForeignKey( Order, related_name="invoices", null=True, blank=True, on_delete=models.SET_NULL, to_field="old_id", ) order_token = models.UUIDField(null=True)And corresponding migration:
from django.db import migrations, models import django.db.models.deletion class Migration(migrations.Migration): dependencies = [ ("invoice", "0008_fulfill_invoice_order_token"), ] operations = [ migrations.AlterField( model_name="invoice", name="order", field=models.ForeignKey( null=True, on_delete=django.db.models.deletion.SET_NULL, related_name="invoices", to="order.order", to_field="old_id", ), ), ] Be sure to prepare these migrations for all models that depend on a changing model.
- Many-to-many relations
For many_to_many relations, we need to perform similar steps but operate on the cross-reference table created for this relation. This involves writing an SQL operation to execute the changes.
We’ll analyze the steps in the example of the relation between GiftCard and Order.
class Order(models.Model):
gift_cards = models.ManyToManyField(
GiftCard,
blank=True,
related_name="orders"
)
-
Similar to the previous steps, firstly we will add a new column to store the new
tokenvalues.
ALTER TABLE order_order_gift_cards ADD COLUMN order_token uuid; -
Next, we aim to fill the newly added column with corresponding values. In this case, we don't need to check if the column is empty since all instances of cross-reference table must have a value.
UPDATE order_order_gift_cards SET order_token = ( SELECT token FROM order_order WHERE order_order_gift_cards.order_id = order_order.id ); -
Now, we can set the condition that the new column
order_tokenmust not be null:
ALTER TABLE order_order_gift_cards ALTER COLUMN order_token SET NOT NULL; -
The next step is to remove the old foreign constraint for the column that points to the current
idfield. You need to find the name of this constraint. As before, you can check it in thepsqlcommand line or the administration platform - pgAdmin.
ALTER TABLE order_order_gift_cards DROP CONSTRAINT order_order_gift_cards_order_id_ce5608c4_fk_order_order_id; -
The final step involves creating a new foreign key constraint for our column that stores token values.
ALTER TABLE order_order_gift_cards ADD CONSTRAINT order_order_gift_cards_order_id_fk_order_order_old_id FOREIGN KEY (order_id) REFERENCES order_order (old_id);
Now, let's consolidate everything into the migration:
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("order", "0137_fulfil_order_token_and_old_id"),
]
operations = [
migrations.RunSQL(
"""
ALTER TABLE order_order_gift_cards
ADD COLUMN order_token uuid;
UPDATE order_order_gift_cards
SET order_token = (
SELECT token
FROM order_order
WHERE order_order_gift_cards.order_id = order_order.id
);
ALTER TABLE order_order_gift_cards
ALTER COLUMN order_token SET NOT NULL;
ALTER TABLE order_order_gift_cards
DROP CONSTRAINT
order_order_gift_cards_order_id_ce5608c4_fk_order_order_id;
ALTER TABLE order_order_gift_cards
ADD CONSTRAINT order_order_gift_cards_order_id_fk_order_order_old_id
FOREIGN KEY (order_id) REFERENCES order_order (old_id);
""",
reverse_sql=migrations.RunSQL.noop,
),
]
As before, remember to create corresponding migrations for all many-to-many relationships linked to the model you're modifying.
Afterward, we are prepared to change the primary key to the UUID value.
Set UUID column as the primary key
Modifying the model pk also involves two steps, but both can be executed within a single migration.
-
First, we aim to convert our
tokenvalue into a primary key. Django will automatically eliminate the oldidfield along with the primary key constraint. The modifications in the models.py file will appear as follows:
class Order(models.Model): token = models.UUIDField( primary_key=True, editable=False, unique=True, default=uuid4 ) old_id = models.PositiveIntegerField( unique=True, null=True, blank=True ) -
Next, we will rename the
tokentoidto achieve the target state:
class Order(models.Model): id = models.UUIDField( primary_key=True, editable=False, unique=True, default=uuid4 ) old_id = models.PositiveIntegerField( unique=True, null=True, blank=True )Here is the migration that will combine both changes:
from django.db import migrations, models import uuid class Migration(migrations.Migration): dependencies = [ ("order", "0138_alter_order_gift_cards"), ("invoice", "0008_fulfill_invoice_order_token"), ] operations = [ migrations.RemoveField( model_name="order", name="id", ), migrations.AlterField( model_name="order", name="token", field=models.UUIDField( default=uuid.uuid4, editable=False, primary_key=True, serialize=False, unique=True, ), ), migrations.RenameField( model_name="order", old_name="token", new_name="id", ), ]
What’s important, the migration's dependencies list must include all migrations that set token values on new temporary fields and redirect relation to old_id. In this context, the required migration to include in the dependencies list is 0008_fulfill_invoice_order_token from the invoice app.
After UUID migration
Now we are almost done. The last step is to rewrite the relations to point at the new primary key field instead of old_id.
The steps for this phase applied to the relation models are:
- Change the field that keeps new UUID values (
order_token) into the relation field -ForeginKey - Remove the old relation field
- Rename the new relation field to the initial name
Let’s analyze those in our example.
- Many-to-one and one-to-one relationship
The steps for rewriting the many-to-one and one-to-one relationships can be combined into one migration. Each step will be explained separately, and then we'll consolidate them into a single migration.
-
Firstly, our goal is to convert all temporary fields storing instances'
tokenvalues into relational fields. Specifically, we will convert theInvoice.order_tokenUUIDfield into aForeignKey, as follows:
class Invoice(models.Model): order = models.ForeignKey( Order, related_name="invoices", null=True, blank=True, on_delete=models.SET_NULL, to_field="old_id", ) order_token = models.ForeignKey( Order, null=False, blank=False, on_delete=models.CASCADE, )Migration operation:
migrations.AlterField( model_name="invoice", name="order_token", field=models.ForeignKey( on_delete=django.db.models.deletion.CASCADE, to="order.order" ), ) -
The next step is to delete the previous relation field which was pointing to
old_id. It's no longer needed since we now have a field pointing to the newOrderprimary key. As a result, theInvoicemodel will only retain theorder_tokenfield.
class Invoice(models.Model): order_token = models.ForeignKey( Order, null=False, blank=False, on_delete=models.CASCADE, )Migration operation:
migrations.RemoveField( model_name="invoice", name="order", ) -
Next, we will rename the
order_tokenfield to match the name of the field we previously removed. So we are changingorder_tokentoorderfield.
class Invoice(models.Model): order = models.ForeignKey( Order, null=False, blank=False, on_delete=models.CASCADE, )Migration operation:
migrations.RenameField( model_name="invoice", old_name="order_token", new_name="order", ) -
The final step involves modifying our new relational field to match the original one. In our case, the only change is to set the proper
related_namevalue.
class Invoice(models.Model): order = models.ForeignKey( Order, null=False, blank=False, on_delete=models.CASCADE, related_name="invoices", )Migration operation:
migrations.AlterField( model_name="invoice", name="order", field=models.ForeignKey( on_delete=django.db.models.deletion.CASCADE, related_name="invoices", to="order.order", ), )
Bringing it all together, the migration will look as follows:
from django.db import migrations, models
import django.db.models.deletion
class Migration(migrations.Migration):
dependencies = [
("order", "0139_update_order_pk"),
("invoice", "0009_alter_invoice_order"),
]
operations = [
migrations.AlterField(
model_name="invoice",
name="order_token",
field=models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE, to="order.order"
),
),
migrations.RemoveField(
model_name="invoice",
name="order",
),
migrations.RenameField(
model_name="invoice",
old_name="order_token",
new_name="order",
),
migrations.AlterField(
model_name="invoice",
name="order",
field=models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
related_name="invoices",
to="order.order",
),
),
]
Naturally, our migration must depend on the migration from the order module that changes id to UUID. If not, we will encounter an error during the migration process.
💡 Remember to apply the changes to all corresponding models.
- Many-to-many relations
Changes to many-to-many relationships, as before, are more complex. We need to perform the same changes as for many-to-one relationships, but in the proxy table using SQL operations.
-
First, we will add the
ForeignKeyconstraint to our field that stores thetokenvalue of correspondingorderinstances. Since theidhas already been migrated toUUID, we will point to theidfield of theOrdermodel:
ALTER TABLE order_order_gift_cards ADD CONSTRAINT order_order_gift_cards_order_token_fk_order_order_id FOREIGN KEY (order_token) REFERENCES order_order (id); -
The next step is to remove the constraint that we previously created, which points to the
old_idfield.
ALTER TABLE order_order_gift_cards DROP CONSTRAINT order_order_gift_cards_order_id_fk_order_order_old_id; -
Now, we can delete the column that holds the old ID values. We wouldn't be able to do this without the previous step.
ALTER TABLE order_order_gift_cards DROP COLUMN order_id; -
Afterward, rename the column that points to the
UUIDvalues to the initial name.
ALTER TABLE order_order_gift_cards RENAME COLUMN order_token TO order_id; -
The final step involves renaming the constraint that was created in
step 1to match the new column name:
ALTER TABLE order_order_gift_cards RENAME CONSTRAINT order_order_gift_cards_order_token_fk_order_order_id TO order_order_gift_cards_order_id_ce5608c4_fk_order_order_id;
Bringing it all together, the migration will look as follows:
class Migration(migrations.Migration):
dependencies = [
("order", "0139_update_order_pk"),
]
operations = [
# rewrite order - gift cards relation - all operations are performed on
# order_order_gift_cards table which is responsible for order-gift cards
# many to many relation
# - add fk constraint to order id
# - delete constraint to order old_id
# - delete order_id column
# - rename order_token to order_id
# - rename newly added constraint
migrations.RunSQL(
"""
ALTER TABLE order_order_gift_cards
ADD CONSTRAINT order_order_gift_cards_order_token_fk_order_order_id
FOREIGN KEY (order_token) REFERENCES order_order (id);
ALTER TABLE order_order_gift_cards
DROP CONSTRAINT order_order_gift_cards_order_id_fk_order_order_old_id;
ALTER TABLE order_order_gift_cards
DROP COLUMN order_id;
ALTER TABLE order_order_gift_cards
RENAME COLUMN order_token TO order_id;
ALTER TABLE order_order_gift_cards
RENAME CONSTRAINT order_order_gift_cards_order_token_fk_order_order_id
TO order_order_gift_cards_order_id_ce5608c4_fk_order_order_id;
""",
reverse_sql=migrations.RunSQL.noop,
)
]
Again, remember to include the migrations responsible for changing the primary key in your dependencies list.
And that's it. We have returned to the primary stage with all the relations intact, but with the model id changed to UUID. ✨
At the end
At the end, you can discard old_id if it's not needed. Also, reconsider the default ordering. If your default ordering uses pk, it will no longer work as before, since the instances won't be sorted in the creation order due to the random value of UUID. If you wish to maintain the order of instance creation, consider adding a creation date to your model, if one doesn't exist, and sort by this value.
The field responsible for the creation date can be added in the same step where token and old_id are added. To preserve the old order, fill the instances with the current date and time, replacing the seconds with the old id value.
Conclusion
As you can see, there are lots of steps to do, but I hope that this tutorial helps you go smoothly through this process with an understanding of each step.
You can also check our https://github.com/saleor/saleor repository for real-life examples. We performed such operations on Order, OrderLine, and CheckoutLine models.






Top comments (0)