DEV Community

Quoc-Hung Hoang
Quoc-Hung Hoang

Posted on • Edited on

Upgrade django queries with F() expression

TL; DR

When you reference to a model field for read/write operations, let's use F() expression

  • Help refer to model field directly in the database, no need to load it into Python memory -> save queries.
  • Can help avoid race condition or dirty read.
  • Need to refresh_from_db after query because Python only knows about SQL expression instead of actual result.

Bulk update

Suppose the government in your country raise tax rate by 5% which makes you have to raise your listing product price by 20%. What would your django query look like ?

class Product(models.Model):

    name = models.TextField()
    price = models.DecimalField()
    in_stock = models.IntegerField(
        help_text="Number of items available in inventory"
    )
Enter fullscreen mode Exit fullscreen mode

A naive implementation of updating multiple products may be like this:

products = Product.objects.all()
for product in products:
    product.price *= 1.2
    product.save()
Enter fullscreen mode Exit fullscreen mode

In this case, you are doing SELECT price FROM product then UPDATE product SET price = new_value WHERE condition each record. It means 2 query (1 for READ and 1 for WRITE) for each object.

Think of it more carefully, we can realize that the new price is relative to the current price no matter what it is. Intuitively, we want to reference to price field of Product model when running update process.

And here it comes, F() expression. The Django official doc states:

An F() object represents the value of a model field, transformed value of a model field, or annotated column.
It makes it possible to refer to model field values and perform database operations using them without actually having to pull them out of the database into Python memory.

Let's try the problem with F() and update() queryset method

from django.db.models import F

Product.objects.update(price=F("price")*1.2)
Enter fullscreen mode Exit fullscreen mode

Although the above query looks like a normal Python assignment of value to an instance attribute, in fact it is a SQL expression. This expression instruct database to multiply the price field in database by 120 percent.

New price value is based on current price value so we don't need to load it into Python memory. That's why F() comes into play.

Update a single object

Let's say you want to update in_stock field after every order payment is completed.

A naive implementation may be like this:

def process_payment(product: Product):
    with transaction.atomic():
        payment = Payment.objects.create(product=product)
        product.in_stock = product.in_stock - 1
        product.save(update_fields=["in_stock"])
Enter fullscreen mode Exit fullscreen mode

So what's the problem ?
Let's imagine there are multiple users trying to make orders for a product, the scenario looks like this:

Process 1 Process 2 in_stock
Select in_stock -> 5 5
Select in_stock -> 5 5
Update in_stock = 5-1 4
Update in_stock = 5-1 4

In this case, two processes are updating product.in_stock at the same time but in_stock value just decrease by 1. That's incorrect.

The main issue is that you decrease in_stock based on what you fetched, what if you give database an instruction to update in_stock based on what is currently stored ?

def process_payment(product: Product):
    with transaction.atomic():
        payment = Payment.objects.create(product=product)
        product.in_stock = F("in_stock") - 1
        product.save(update_fields=["in_stock"]])
Enter fullscreen mode Exit fullscreen mode

The difference between two approach is quite small but let's look at SQL generated by update commands:

The naive approach:

UPDATE product_product
SET in_stock = 4
WHERE id = 262;
Enter fullscreen mode Exit fullscreen mode

This will decrease the quantity = 4 regardless of the current value of in_stock in database

The F() approach:

UPDATE product_product
SET in_stock = in_stock + 1
WHERE id = 262;
Enter fullscreen mode Exit fullscreen mode

The quantity of product with id 262 will reduce by 1 and not set by a fixed value. This is how to use an F expression to solve the race condition problem.

Note

The F() object which is assign to model field persist after saving model instance and will be applied on each save() so we need to refresh_from_db to get the updated instace.

Try to read an instance without refreshing from database may lead to unexpected result:

In [12]: product = Product.objects.get(id=262)

In [13]: product.in_stock = F("in_stock") - 1

In [14]: product.save()

In [15]: product.in_stock
Out[15]: <CombinedExpression: F(in_stock) - Value(1)>

In [16]: 
Enter fullscreen mode Exit fullscreen mode

Summary

Throughout the article, we pointed out two use cases of F() expression

  • Reduce the number of queries some operations require by getting the database, rather than Python, to do work.
  • Avoid race condition when two process retrieve and update the same instance.

Top comments (0)