DEV Community

Save your Django models using update_fields for better performance

CH S Sankalp jonna
✍️ Writing sankalpjonna.com/all-posts • 👨‍💻 Building delightchat.io for D2C brands on Shopify •
Originally published at sankalpjonna.com ・2 min read

The Django ORM is designed to turn the rows of your database tables into objects that can then conform to object oriented principles. This makes it very easy to create, update and delete entries from your database.

However, there are certain advantages to using raw queries instead of an ORM. For instance, when you update a row in your table, you might want to update only a subset of its columns rather than all of them.

By default, calling save() on an existing Django model object writes all of its columns every single time, whether or not their values changed. To prevent this, you must be explicit about which fields to write.

What save() does internally

Consider a Django model called Record which has the following fields:

from django.db import models

class Record(models.Model):
  # id will be created automatically
  name = models.CharField(max_length=255)
  created_at = models.DateTimeField(auto_now_add=True)
  is_deleted = models.BooleanField(default=False)

If you would like to update the name of a record, you might do something like this:

>>> record = Record.objects.get(id=1)
>>> record.name = "new record name"
>>> record.save()

If you turn on query logging in the underlying database, which in my case is Postgres, you can see the query that actually runs:

UPDATE "record"
SET    "name" = 'new record name',
       "created_at" = '2021-06-12T15:09:05.019020+00:00' :: timestamptz,
       "is_deleted" = FALSE
WHERE  "record"."id" = 1
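Besides database-side logs, Django can also log the SQL it sends. The fragment below is a minimal sketch for settings.py, not the setup used in this post: it routes Django's django.db.backends logger to the console. The handler name "console" is an arbitrary choice, and Django only emits these messages when DEBUG = True.

```python
# settings.py (fragment): print every SQL statement Django executes.
# Assumption: DEBUG = True, since Django only logs queries in debug mode.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # "console" is an arbitrary handler name chosen for this sketch.
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        # Django logs each executed query on this logger at DEBUG level.
        "django.db.backends": {
            "handlers": ["console"],
            "level": "DEBUG",
        },
    },
}
```

With this in place, a save() call in the shell should print its UPDATE statement, which makes it easy to compare the queries shown in this post.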

Writing all three columns may not seem like a big deal, but what if your model had 20 fields and you called save() on it very frequently?

At a certain scale, a query that updates all of your columns every time you call save() can start causing you some unnecessary overhead.

Why is the overhead unnecessary? Because it can be prevented with a simple tweak.

Use update_fields in save()

To update only specific columns, name them explicitly in the update_fields parameter when calling the save() method.

>>> record = Record.objects.get(id=1)
>>> record.name = "new record name"
>>> record.save(update_fields=['name'])

The underlying query now becomes:

UPDATE "record"
SET    "name" = 'new record name'
WHERE  "record"."id" = 1 

You can also choose to update multiple columns by passing more field names in the update_fields list. 
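If you do not want to hard-code that list, one option is to compare a snapshot of the field values taken before your changes with the values afterwards. The helper below is a hypothetical sketch (changed_fields is not part of Django), and it works on plain dictionaries so that it runs on its own.

```python
def changed_fields(before, after):
    """Return the names of fields whose values differ between two snapshots.

    Both arguments are plain dicts mapping field name -> value.
    This is a hypothetical helper, not a Django API.
    """
    return [
        name
        for name, value in after.items()
        if name not in before or before[name] != value
    ]


# Snapshots of the same record before and after some edits.
before = {"name": "old name", "is_deleted": False}
after = {"name": "new record name", "is_deleted": True}

fields = changed_fields(before, after)

# With a real model instance, the list could then be passed on:
# record.save(update_fields=fields)
```

If nothing changed, the helper returns an empty list. Django skips the save entirely when update_fields is an empty iterable, so no query is issued in that case.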

The resulting query is smaller and touches only the columns you intended to change, which can save you some database overhead.

TL;DR

If you use the save() method with the intention of updating some specific columns in your database row, explicitly mention those fields by using the update_fields parameter and calling the save() method like this:

obj.save(update_fields=['field_1', 'field_2']) as opposed to just obj.save()

This will save you some database overhead by making the underlying query more efficient.

Discussion (3)

Paolo Calvi

There is another important reason to use update_fields:
Using the model mentioned above, let's assume that one user sets the is_deleted flag to True and another user changes the name of the record.
Let's assume that the two events happen in that order in two separate processes. With a generic save(), the first operation will be unintentionally reverted by the second, because the second process still holds is_deleted=False in memory and writes that value back.
Clearly, a project with a high chance of concurrent operations on the same records should consider more robust conflict management, but in most cases writing only what needs to be updated removes most of the chances of unwanted side effects.
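The scenario in this comment can be reproduced without Django. The sketch below is an illustration under simplifying assumptions: it uses Python's built-in sqlite3 module, a single connection stands in for the two processes, and the helper names (load, write, simulate) are made up for this example.

```python
import sqlite3


def make_db():
    """Create an in-memory table holding one record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record (id INTEGER PRIMARY KEY, name TEXT, is_deleted INTEGER)"
    )
    conn.execute("INSERT INTO record (id, name, is_deleted) VALUES (1, 'old name', 0)")
    return conn


def load(conn):
    """Read the row into a dict, like a process holding an object in memory."""
    name, is_deleted = conn.execute(
        "SELECT name, is_deleted FROM record WHERE id = 1"
    ).fetchone()
    return {"name": name, "is_deleted": is_deleted}


def write(conn, row, fields):
    """Write only the listed columns of the in-memory row back to the table."""
    # Column names are hard-coded by the caller, so the f-string is safe here.
    assignments = ", ".join(f"{field} = ?" for field in fields)
    conn.execute(
        f"UPDATE record SET {assignments} WHERE id = 1",
        [row[field] for field in fields],
    )


def simulate(second_writer_fields):
    """Two users read the same row, then write one after the other."""
    conn = make_db()
    user_a = load(conn)  # both users read before either one writes
    user_b = load(conn)

    user_a["is_deleted"] = 1
    write(conn, user_a, ["is_deleted"])

    user_b["name"] = "new record name"  # user_b still holds is_deleted = 0
    write(conn, user_b, second_writer_fields)
    return load(conn)


full_write = simulate(["name", "is_deleted"])  # like a plain save()
partial_write = simulate(["name"])             # like save(update_fields=['name'])
```

In the full write, the record ends up with is_deleted back at 0, so the first user's change is lost. In the partial write, both changes survive.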

CH S Sankalp jonna Author

That's true. I hadn't considered this particular use case.

Joost Helberg

What if there are triggers? Policies? Do not use an ORM. ORMs do not scale: 1,000s maybe, 10,000s maybe, but not to where an RDBMS gets useful.