DEV Community

loading...

Write better queries using values_list in Django

CH S Sankalp jonna
✍️ Writing sankalpjonna.com/all-posts • 👨‍💻 Building delightchat.io for D2C brands on Shopify •
Originally published at sankalpjonna.com ・3 min read

The Django ORM is great but it may not always be a good idea to use it in its most primitive state as it could lead to inefficient queries.

For instance, consider a database model called Record that consists of id, name, created_at and an is_deleted flag indicating whether it has been deleted or not.

from django.db import models

class Record(models.Model):
  # id will be created automatically
  name = models.CharField(max_length=255)
  created_at = models.DateTimeField(auto_now_add=True)
  is_deleted = models.BooleanField(default=False)
Enter fullscreen mode Exit fullscreen mode

When you query using Record.objects.filter(is_deleted=False), the underlying database query looks something like this:

SELECT id,
       name,
       created_at,
       is_deleted
FROM   records
WHERE  NOT is_deleted; 
Enter fullscreen mode Exit fullscreen mode

Not only is this querying all fields of the table but it returns a queryset of model instances like this:

>>> Record.objects.filter(is_deleted=False)
<QuerySet [<Record: Record object (1)>, <Record: Record object (2)>, <Record: Record object (3)>]>
Enter fullscreen mode Exit fullscreen mode

In most cases you do not require all of the fields of your model. What is even worse is that if your Record model consists of 10 fields, this query will end up fetching all of them from the database making your query significantly slower.

Fortunately there are two very straightforward ways to make this query efficient. You could either use values or values_list.

Using values() to retrieve specific DB columns

Let’s say you want to display the names of all the records that have not been deleted. You can retrieve only the name column from the database by making a slight modification to the ORM query: 

>>> Record.objects.filter(is_deleted=False).values('name')
<QuerySet [{'name': 'First record'}, {'name': 'Second Record'}, {'name': 'Third Record'}]>
Enter fullscreen mode Exit fullscreen mode

As you can see, instead of returning a list of Record objects, this queryset returns a list of dictionaries that consist of just the name field. The underlying query now becomes:

SELECT name
FROM   records
WHERE  NOT is_deleted; 
Enter fullscreen mode Exit fullscreen mode

You can fetch more columns if you wish by providing the list of columns that you need in the values() method.

>>> Record.objects.filter(is_deleted=False).values('id', 'name')
<QuerySet [{'id': 1, 'name': 'First record'}, {'id': 2, 'name': 'Second Record'}, {'id': 3, 'name': 'Third Record'}]>
Enter fullscreen mode Exit fullscreen mode

Using values_list() to retrieve specific DB columns

This works pretty much the same way as values() except that it returns a tuple instead of a dictionary:

>>> Record.objects.filter(is_deleted=False).values_list('name')
<QuerySet [('First record',), ('Second Record',), ('Third Record',)]>
>>>
>>> Record.objects.filter(is_deleted=False).values_list('id', 'name')
<QuerySet [(1, 'First record'), (2, 'Second Record'), (3, 'Third Record')]>
>>>
Enter fullscreen mode Exit fullscreen mode

The nifty thing about values_list() is that if you only need to pass in a single field, you can use the flat=True parameter to return a list of single values instead of tuples.

>>> Record.objects.filter(is_deleted=False).values_list('name',flat=True)
<QuerySet ['First record', 'Second Record', 'Third Record']>
Enter fullscreen mode Exit fullscreen mode




Closing notes

In terms of performance, there is no difference between values() and values_list(). 

They are both intended as optimizations for retrieving a subset of data while avoiding creation of model instances and making the database query more efficient by only selecting specific fields.

The decision of which one of these to use depends on what kind of data structure you prefer to work with. 

I usually find a dictionary more intuitive than a tuple and hence prefer values() over values_list(). However if I only need one field from the query I prefer to use values_list() with flat=True.

Originally posted on my blog

Discussion (2)

Collapse
hanpari profile image
Pavel Morava

I believe that for obtaining specific column for database one can use the method only.

Official documentation mentions certain optimisation but only for records with no foreign keys when using values and valuelist methods.

Collapse
sankalpjonna profile image
CH S Sankalp jonna Author

I suppose if there are foreign keys, fetching each record would involve joining of two tables. Instead if you only request for specific columns, that join would be prevented