Arjun Adhikari

Posted on May 15, 2021 • Edited on May 23, 2022

Django Models Anti Patterns

#python #django #webdev #tutorial

Hello pals,

While working with Django, we all write code that does the job, but some code may be performing excessive computations or operations that we are unaware of. These operations may be ineffective and/or counterproductive in practice.

Here, I am going to mention some anti-patterns in Django models.

A. Using len(queryset) instead of queryset.count()

Django querysets are lazily evaluated, which means that no records are read from the database until we actually interact with the data.
len(queryset) counts the records in Python, at the application level. To do that, every record must first be fetched from the database, which is a computationally heavy operation.

queryset.count(), by contrast, calculates the count at the database level and returns only that number.

For a model Post:

from django.db import models

class Post(models.Model):
    author = models.CharField(max_length=100)
    title = models.CharField(max_length=200)
    content = models.TextField()

If we use len(queryset), Django runs something like SELECT * FROM post, loads every record into the queryset, and then Python computes its length much as it would for a list. Imagine the waste of downloading many records only to check the length and throw them away at the end! If we need the records anyway after reading the length, though, len(queryset) can be a valid choice.

If we use queryset.count(), Django runs something like SELECT COUNT(*) FROM post at the database level. Only a single number comes back, which makes the code quicker and reduces the load on the database.
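To make the difference concrete, here is a minimal sketch that uses the standard library's sqlite3 module instead of the Django ORM, so it runs without a project. The table layout and the sample rows are assumptions for illustration, and the two queries only approximate the SQL Django would generate.

```python
import sqlite3

# In-memory stand-in for the Post table (table and column names are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post (id INTEGER PRIMARY KEY, author TEXT, title TEXT, content TEXT)"
)
conn.executemany(
    "INSERT INTO post (author, title, content) VALUES (?, ?, ?)",
    [("Arjun", f"Post {i}", "x" * 1000) for i in range(500)],
)

# Roughly what len(queryset) costs: every row and column is pulled into Python.
rows = conn.execute("SELECT * FROM post").fetchall()
count_in_python = len(rows)

# Roughly what queryset.count() costs: the database returns a single integer.
(count_in_db,) = conn.execute("SELECT COUNT(*) FROM post").fetchone()

print(count_in_python, count_in_db)
```

Both approaches agree on the number, but the first one transfers 500 full records (including the large content column) just to discard them.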

B. Using queryset.count() instead of queryset.exists()

While we have been praising queryset.count() for getting the length of a queryset, it can be needlessly expensive when all we want to know is whether the queryset contains anything. For the same model Post, when we want to check if there are any posts written by the author Arjun, we may do something like:

from django.db.models import QuerySet

posts_by_arjun: QuerySet = Post.objects.filter(author__iexact='Arjun')

if posts_by_arjun.count() > 0:
    print('Arjun writes posts here.')
else:
    print("Arjun doesn't write posts here.")

posts_by_arjun.count() performs an SQL operation that has to count every matching row in the table. But if we are only interested in whether Arjun writes posts here or not, more efficient code would be:

from django.db.models import QuerySet

posts_by_arjun: QuerySet = Post.objects.filter(author__iexact='Arjun')

if posts_by_arjun.exists():
    print('Arjun writes posts here.')
else:
    print("Arjun doesn't write posts here.")

posts_by_arjun.exists() returns a bool that tells us whether at least one result exists. It reads at most a single record in the most optimized way (removing ordering and clearing any user-defined select_related() or distinct() methods).

Also, checking the existence / truthiness of a queryset like this is inefficient:

from django.db.models import QuerySet

posts_by_arjun: QuerySet = Post.objects.filter(author__iexact='Arjun')

if posts_by_arjun:
    print('Arjun writes posts here.')
else:
    print("Arjun doesn't write posts here.")

This does a fine job of checking whether there are any posts by Arjun, but evaluating a queryset in a boolean context fetches all of the matching records, which is computationally heavy for a large number of records. Hence, queryset.exists() is encouraged for checking the existence / truthiness of querysets.
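The contrast between an existence check and a count can be sketched at the SQL level. This example again uses sqlite3 rather than Django; the helper names author_exists and author_count, the table layout, and the LOWER() comparison standing in for author__iexact are all assumptions, and the queries only approximate what the ORM emits.

```python
import sqlite3

# In-memory stand-in for the Post table (names are assumed for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, author TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO post (author, title) VALUES (?, ?)",
    [("Arjun", f"Post {i}") for i in range(1000)],
)


def author_exists(name):
    # Approximation of exists(): select a constant and stop at the first match.
    row = conn.execute(
        "SELECT 1 FROM post WHERE LOWER(author) = LOWER(?) LIMIT 1", (name,)
    ).fetchone()
    return row is not None


def author_count(name):
    # Approximation of count(): the database must tally every matching row.
    (total,) = conn.execute(
        "SELECT COUNT(*) FROM post WHERE LOWER(author) = LOWER(?)", (name,)
    ).fetchone()
    return total


print(author_exists("arjun"), author_exists("Nobody"), author_count("ARJUN"))
```

With LIMIT 1 the database can stop as soon as it finds one post by the author, whereas the count has to visit all 1000 matching rows before it can answer.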

C. Using signals excessively

Django signals are great for triggering jobs based on events. But they have a limited set of valid use cases and shouldn't be used excessively. Before reaching for a signal, think about the alternatives within your codebase and, if possible, place the logic in your models themselves.
Signals are not executed asynchronously. There is no background thread or worker to execute them. If you want a background worker to do the job for you, try using Celery.
Because signal handlers tend to be spread over separate files in a larger project, they can be hard to trace for someone who has just joined the company, and that's not great. django-debug-toolbar does help somewhat in tracing the triggered signals, though.
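To illustrate what "not executed asynchronously" means in practice, here is a small pure-Python sketch. TinySignal is a hypothetical stand-in written for this example, not Django's implementation; it only mimics the idea that send() calls each receiver in turn, in the caller's thread, before returning.

```python
import time


class TinySignal:
    """Hypothetical, simplified stand-in for a signal (not Django's class)."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        # Receivers run one after another in the calling thread.
        return [(r, r(sender=sender, **kwargs)) for r in self._receivers]


events = []
post_saved = TinySignal()


def slow_receiver(sender, **kwargs):
    # Simulates slow work, such as sending an email.
    time.sleep(0.05)
    events.append("receiver finished")


post_saved.connect(slow_receiver)


def save_post():
    events.append("save started")
    post_saved.send(sender="Post")
    events.append("save returned")


save_post()
print(events)
```

The save does not return until the slow receiver has finished, so any slow work inside a receiver delays the request that triggered it.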

Let's create a scenario where we want to keep the record of Post writings in a separate model PostWritings.

class PostWritings(models.Model):
    author = models.CharField(max_length=100, unique=True)
    posts_written = models.PositiveIntegerField(default=0)

If we want to automatically update the PostWritings record for an author whenever a record is created in the Post model, we can achieve the task with or without signals.

A. With Signals

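from django.db.models import F
from django.db.models.signals import post_save
from django.dispatch import receiver
from .models import Post, PostWritings

@receiver(post_save, sender=Post)
def post_writing_handler(sender, instance, created, **kwargs):
    # Only count newly created posts, not edits to existing ones.
    if created:
        writing, _ = PostWritings.objects.get_or_create(author=instance.author)
        # Model instances have no update(); go through a queryset so the
        # increment happens in the database.
        PostWritings.objects.filter(pk=writing.pk).update(posts_written=F('posts_written') + 1)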

B. Without Signals

We need to override the save() method for Post model.

from django.db import models
from django.db.models import F

class Post(models.Model):
    author = models.CharField(max_length=100)
    title = models.CharField(max_length=200)
    content = models.TextField()

    def save(self, *args, **kwargs):
        # Overridden method.
        # A post without a primary key has not been saved yet, so it is new.
        created = self.pk is None
        super().save(*args, **kwargs)
        if created:
            writing, _ = PostWritings.objects.get_or_create(author=self.author)
            # Model instances have no update(); increment through a queryset.
            PostWritings.objects.filter(pk=writing.pk).update(posts_written=F('posts_written') + 1)

Because the same job can be accomplished without signals, the code is easier to trace and unnecessary event triggers are avoided.

If the save() method here feels hard to read, breaking the code up always helps. Let's do that.

from django.db import models
from django.db.models import F

class Post(models.Model):
    author = models.CharField(max_length=100)
    title = models.CharField(max_length=200)
    content = models.TextField()

    def _update_post_writing(self, created=False, author=None):
        # Increment the author's counter only when a new post was created.
        if author is not None and created:
            writing, _ = PostWritings.objects.get_or_create(author=author)
            # Model instances have no update(); increment through a queryset.
            PostWritings.objects.filter(pk=writing.pk).update(posts_written=F('posts_written') + 1)

    def save(self, *args, **kwargs):
        # Overridden method.
        author = self.author
        # Check before saving: once saved, the post always has a primary key.
        created = self.pk is None
        super().save(*args, **kwargs)
        self._update_post_writing(created, author)
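The F('posts_written') + 1 expression used in these examples asks the database to do the arithmetic, instead of reading the value into Python and writing it back. Here is a rough sketch of the equivalent SQL using sqlite3; the table name post_writings and the helper record_new_post are assumptions made for this illustration.

```python
import sqlite3

# In-memory stand-in for the PostWritings table (names are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post_writings ("
    "author TEXT UNIQUE NOT NULL, "
    "posts_written INTEGER NOT NULL DEFAULT 0)"
)


def record_new_post(author):
    # Rough equivalent of get_or_create(): ensure a row exists for the author.
    conn.execute(
        "INSERT OR IGNORE INTO post_writings (author) VALUES (?)", (author,)
    )
    # Rough equivalent of update(posts_written=F('posts_written') + 1):
    # the increment is computed inside the database.
    conn.execute(
        "UPDATE post_writings SET posts_written = posts_written + 1 WHERE author = ?",
        (author,),
    )


for _ in range(3):
    record_new_post("Arjun")
record_new_post("Rishit")

counts = dict(conn.execute("SELECT author, posts_written FROM post_writings"))
print(counts)
```

Because the current value is never read into Python, two saves happening at the same time cannot overwrite each other's increment with a stale number.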

Looks like we've learned how to mitigate some Django model anti-patterns. For now, thank you everyone for having me here. I'll be continuing with more stuff about Django very soon.
You can also find me on GitHub. Till then keep coding :)

Top comments (2)

Rishit Chaudhary • May 16 '21 • Edited

Really interesting article!

I had a question: In the example shared describing the replacement of signals with overriding of the save() method:
How do we implement a similar approach to decrement the number of articles written by an author, in the case where a post is deleted by an author?

Thanks

Edit: Fixed markdown syntax

Arjun Adhikari • May 17 '21

For that, we can override the model's delete() method, create a _delete_post_writing helper, and decrement the count there.