DEV Community

Lucas Magnum
Lucas Magnum

Posted on

#DjangoTip: Select & Prefetch Related

Today the #DjangoTip will be about using select_related and prefetch_related to improve our queries performance.

You can clone this project from [http://github.com/LucasMagnum/django-tip-02](http://github.com/LucasMagnum/django-tip-02) and follow the examples below.

It takes time to fully understand how select_related and prefetch_related works, so keep looking for other resources and I hope this post helps you to understand a bit more. Code and have fun ❤

Before we start, let me ask you one question; Did you clapped today? I would love to hear your claps and read your feedback!

Product model

This will be the model used in all examples below, so we can always come back here and read it again.

https://github.com/LucasMagnum/django-tip-02/blob/master/app/products/models.py

It’s pretty simple, right? The Product has a foreign key relationship with the Category and the Category has one many to many relationships with itself.

Let’s start, shall we?

List of products

Let’s start creating a list of products function, this function will return one list of dicts with the fields: id, title and category name:

https://github.com/LucasMagnum/django-tip-02/blob/master/app/products/queries.py#L128

For test our function, we need products into our database. We have one command to load products into the database, just execute the load_products command and 500 products and 50 categories will be inserted into our database.

python app/manage.py load\_products
python app/manage.py load_products

Now we open our shell running python app/manage.py shell and test our function:

We have all our products, but what’s the problem with this function?
Let’s see how many queries this function does into the database:

Note: I created one debugger decorator to see how many queries each example takes, you don’t have to understand how it works, just have in mind that it shows the number of the queries and the time in seconds for each example :p

[https://gist.github.com/LucasMagnum/81a122b7317baccf7c6bb4c9c549c928](https://gist.github.com/LucasMagnum/81a122b7317baccf7c6bb4c9c549c928)
https://gist.github.com/LucasMagnum/81a122b7317baccf7c6bb4c9c549c928

Wow! 501 queries for that simple piece of code?? What happened?

For each time we access one foreign key that is not in cache, another
query will be made to retrieve the value.

In our case, we accessing the category inside a loop, one query will be made for each time the loop executes.

for product in product_qs:
   products.append({
       'id': product.id,
       'title': product.title,
       'category': product.category.name
   })

Enter fullscreen mode Exit fullscreen mode

To avoid this high number of queries when accessing foreign keys or one to one fields we can use the select_related method.

Select related

We can use the select_related method to load the foreign key values and cache those results for each object. By default Django ORM doesn’t make a JOIN for our results, so each time we need to access the foreign key another query will be made.

Let’s change our code to use the select_related with our products queryset:

https://github.com/LucasMagnum/django-tip-02/blob/master/app/products/queries.py#L148

Now we can verify that these two functions return the same values and see how many queries the products_list_select_related does:

[https://gist.github.com/LucasMagnum/5d5e54e80a7ec4b1c1b399e65081a4c4](https://gist.github.com/LucasMagnum/5d5e54e80a7ec4b1c1b399e65081a4c4)
https://gist.github.com/LucasMagnum/5d5e54e80a7ec4b1c1b399e65081a4c4

Congrats! Now we did only 1 query instead of 501, it’s a real improvement :D

List of categories

It was an amazing improvement on the last function, now we will create a list of categories function, this will help us to understand the behavior of the prefetch_related method.

Let’s create our categories list function to retrieve all categories and their subcategories. This function should return the name, is_active and a subcategories list for each category.

https://github.com/LucasMagnum/django-tip-02/blob/master/app/products/queries.py#L7

If we run this function in the debugger mode, we will see how many queries this function is doing:

[https://gist.github.com/LucasMagnum/b35cb3d6158644ebbc40ca7e345896ed](https://gist.github.com/LucasMagnum/b35cb3d6158644ebbc40ca7e345896ed)
https://gist.github.com/LucasMagnum/b35cb3d6158644ebbc40ca7e345896ed

Ok, we saw this problem before, we just need to add the select_related and we will end up with just one query, right? Err, No!

If we try to change the categories_qs to use select_related we will receive an error from Django:

Exception when trying to use select\_related in a many to many field
Exception when trying to use select_related in a many to many field

We can not use the select_related with many to many relationships, do you remember? To improve our queries we need to use a new method called prefetch_related .

Prefetch related

When we are using many to many relationships we could use prefetch_related method to booster our queries =D

Let’s create another function using the prefetch_related to see our improvements:

Categories list using prefetch related to improve our queries.

We just changed the line 5 to use prefetch_related on subcategories field. Now let’s see how many queries this new function is doing:

[https://gist.github.com/LucasMagnum/efe74895552ebd24c5897db90b81712a](https://gist.github.com/LucasMagnum/efe74895552ebd24c5897db90b81712a)
https://gist.github.com/LucasMagnum/efe74895552ebd24c5897db90b81712a

Good job! We improved the performance of our query, but wait for a second, why we have two queries?

Different from select_related the prefetch_related made the JOIN using Python rather than in the database.

In this case, Django made two queries and then join the results into one queryset for us.

Let’s create a new function that will return only the active subcategories using the prefetch_related:

Categories list using the prefetch related function.

Now, when we run the debugger for this function, we see:

[https://gist.github.com/LucasMagnum/f5022fc09830a21c534e8ca1f2b8bd92](https://gist.github.com/LucasMagnum/f5022fc09830a21c534e8ca1f2b8bd92)
https://gist.github.com/LucasMagnum/f5022fc09830a21c534e8ca1f2b8bd92

WOW! We increased the number of queries, how it’s possible?????
Calm down, there is a reason for this:

When we use the prefetch related we are saying to Django we want all the results to be JOINED, but when we use the filter(is_active=True) we are changing the primary query and then Django doesn’t JOIN the right results for us.

That’s why we have 52 queries, 51 queries iterating over the categories and 1 query to get all the results in prefetch.

Let’s see how to solve this problem using the new Prefetch introduced recently by Django (;

Using Prefetch with to_attr

Prefetch using to_attr to return prefetch related values

There’s two changes in this new function, I changed the categories_qs query to use the custom Prefetch and changed the loop to use a new attribute active_subcategories.

Let’s understand what it is doing:

Category.objects.prefetch_related(
    Prefetch(
         'subcategories',
          queryset=Category.objects.filter(is_active=True),
          to_attr='active_subcategories'
    )
)
Enter fullscreen mode Exit fullscreen mode

We are prefetching all values from subcategoriesfield, using the Category.objects.filter(is_active=True) as base queryset and passing telling Django we want all the prefetched values into the attribute active_subcategories. This will create an attribute active_subcategories for each category returned by our queryset.

We can see it in our django shell :

[https://gist.github.com/LucasMagnum/a13afb6ad657136d7c17f5e53d2a6f68](https://gist.github.com/LucasMagnum/a13afb6ad657136d7c17f5e53d2a6f68)
https://gist.github.com/LucasMagnum/a13afb6ad657136d7c17f5e53d2a6f68

As we can see, when we try to access the active_subcategories without making the prefetch related we will receive an error.

We need to use the new Prefetch to set the new attribute for our use.

Now we will make a new query to prefetch all the results;

YEES!! Now we can understand a bit better how the new Prefetch works.

Let’s debug our function now:

[https://gist.github.com/LucasMagnum/f6060fe8bc0a1390aed0e27439eb029b](https://gist.github.com/LucasMagnum/f6060fe8bc0a1390aed0e27439eb029b)
https://gist.github.com/LucasMagnum/f6060fe8bc0a1390aed0e27439eb029b

Great!! Now we have only two queries and not 52 =D

Let’s how we could achieve the same result without using the to_attr parameter.

Using Prefetch without to_attr

If we don’t want to create a new attribute, we can reuse the attribute from the field and just change our base query:

https://gist.github.com/LucasMagnum/21074818968d99d7507d24526fa63818

Here, we are using the Prefetch without specifying an attribute.
When we are accessing the category.subcategories we are using the values from the Prefetch, so we don’t need to filter the is_active here because the results will come filtered =)

And if we debug this function, we still have only two queries:

[https://gist.github.com/LucasMagnum/5bda88c72978ad4679b8062a5cd7d6b0](https://gist.github.com/LucasMagnum/5bda88c72978ad4679b8062a5cd7d6b0)
https://gist.github.com/LucasMagnum/5bda88c72978ad4679b8062a5cd7d6b0

We can check if our functions are returning the same values:

[https://gist.github.com/LucasMagnum/65425f73807da299b08acac8ef393daa](https://gist.github.com/LucasMagnum/65425f73807da299b08acac8ef393daa)
https://gist.github.com/LucasMagnum/65425f73807da299b08acac8ef393daa

FANTASTIC!!! Everything is working and I hope that you learned a bit more about the new Prefetch today =)

TL;DR;

  • Select related & Prefetch related are performance boosters
  • Select related does one query to JOIN the tables
  • Select related could be used with ForeignKey and OneToOneField
  • Select related doesn’t work with ManyToMany fields
  • Prefetch related makes two queries and do the JOIN using Python
  • Prefetch related loses it effect when you change the base query

Useful links

See you later, keep coding and have fun ❤

Top comments (0)