DEV Community

Cover image for Find and Fix N+1 Queries in Django Using AppSignal
Nik Tomazic for AppSignal

Posted on • Originally published at blog.appsignal.com

Find and Fix N+1 Queries in Django Using AppSignal

In this article, you'll learn about N+1 queries, how to detect them with AppSignal, and how to fix them to speed up your Django apps significantly.

We'll start with the theoretical aspects and then move on to practical examples. The practical examples will mirror scenarios you might encounter in a production environment.

Let's get started!

What Are N+1 Queries?

The N+1 query problem is a prevalent performance issue in web applications that interact with a database. These queries can cause significant bottlenecks, which intensify as your database grows.

The problem occurs when you retrieve a collection of objects and then access the related objects for each item in the collection. For instance, fetching a list of books requires a single query (1 query), but accessing the author for each book triggers an additional query for every item (N queries).

N+1 problems can also occur when creating or updating data in a database. For example, iterating through a loop to create or update objects individually, rather than using methods like bulk_create() or bulk_update(), can result in excessive queries.

N+1 queries are highly inefficient because executing numerous small queries is significantly slower and more resource-intensive than consolidating operations into fewer, larger queries.

Django's default QuerySet behavior can inadvertently lead to N+1 issues, especially if you're unaware of how QuerySets work. Querysets in Django are lazy, meaning no database queries are executed until the QuerySet is evaluated.

Prerequisites

Ensure you have:

Note: The source code for this project can be found in the appsignal-django-n-plus-one GitHub repository.

Project Setup

We'll work with a book management web app. The web app is built to demonstrate the N+1 query problem and how to resolve it.

Start by cloning the base branch of the GitHub repo:

$ git clone git@github.com:duplxey/appsignal-django-n-plus-one.git \
    --single-branch --branch base && cd appsignal-django-n-plus-one
Enter fullscreen mode Exit fullscreen mode

Next, create and activate a virtual environment:

$ python3 -m venv venv && source venv/bin/activate
Enter fullscreen mode Exit fullscreen mode

Install the requirements:

(venv)$ pip install -r requirements.txt
Enter fullscreen mode Exit fullscreen mode

Migrate and populate the database:

(venv)$ python manage.py migrate
(venv)$ python manage.py populate_db
Enter fullscreen mode Exit fullscreen mode

Lastly, start the development server:

(venv)$ python manage.py runserver
Enter fullscreen mode Exit fullscreen mode

Open your favorite web browser and navigate to http://localhost:8000/books. The web app should return a JSON list of 500 books from the database.

The Django admin site is accessible at http://localhost:8000/admin. The admin credentials are:

user: username
pass: password
Enter fullscreen mode Exit fullscreen mode

Install AppSignal for Django

To install AppSignal on your Django project, follow the official docs:

Ensure everything works by restarting the development server:

(venv)$ python manage.py runserver
Enter fullscreen mode Exit fullscreen mode

Your app should automatically send a demo error to AppSignal. From this point forward, all your errors will be sent to AppSignal. Additionally, AppSignal will monitor your app's performance and detect any issues.

Web App Logic

The prerequisite to fixing N+1 queries is understanding your app's database schema. Pay close attention to your models' relationships: they can help you pinpoint potential N+1 problems.

Models

The web app has two models โ€” Author and Book โ€” which share a one-to-many (1:M) relationship. This means each book is associated with a single author, while an author can be linked to multiple books.

Both models have a to_dict() method for serializing model instances to JSON. On top of that, the Book model uses deep serialization (serializing the book as well as the book's author).

The models are defined in books/models.py:

# books/models.py
class Author(models.Model):
    first_name = models.CharField(max_length=64)
    last_name = models.CharField(max_length=64)
    birth_date = models.DateField()

    def full_name(self):
        return f"{self.first_name} {self.last_name}"

    def to_dict(self):
        return {
            "id": self.id,
            "first_name": self.first_name,
            "last_name": self.last_name,
            "birth_date": self.birth_date,
        }

    def __str__(self):
        return f"{self.first_name} {self.last_name}"


class Book(models.Model):
    title = models.CharField(max_length=128)
    author = models.ForeignKey(
        to=Author,
        related_name="books",
        on_delete=models.CASCADE,
    )
    summary = models.TextField(max_length=512, blank=True, null=True)
    isbn = models.CharField(max_length=13, unique=True, help_text="ISBN-13")
    published_at = models.DateField()

    def to_dict(self):
        return {
            "id": self.id,
            "title": self.title,
            "author": self.author.to_dict(),
            "summary": self.summary,
            "isbn": self.isbn,
            "published_at": self.published_at,
        }

    def __str__(self):
        return f"{self.author}: {self.title}"
Enter fullscreen mode Exit fullscreen mode

They are then registered for the Django admin site in books/admin.py, like so:

# books/admin.py
class BookInline(admin.TabularInline):
    model = Book
    extra = 0


class AuthorAdmin(admin.ModelAdmin):
    list_display = ["full_name", "birth_date"]
    inlines = [BookInline]


class BookAdmin(admin.ModelAdmin):
    list_display = ["title", "author", "published_at"]


admin.site.register(Author, AuthorAdmin)
admin.site.register(Book, BookAdmin)
Enter fullscreen mode Exit fullscreen mode

Notice that AuthorAdmin uses BookInline to display the author's books within the author's admin page.

Views

The web app provides the following endpoints:

  1. /books/ returns the list of books
  2. /books/<book_id>/ returns a specific book
  3. /books/by-authors/ returns a list of books grouped by authors
  4. /books/authors/ returns the list of authors
  5. /books/authors/<author_id>/ returns a specific author

The links above are clickable if you have the development web server running.

And they're defined in books/views.py like so:

# books/views.py
def book_list_view(request):
    books = Book.objects.all()
    return JsonResponse(
        {
            "count": books.count(),
            "results": [book.to_dict() for book in books],
        }
    )


def book_details_view(request, book_id):
    try:
        book = Book.objects.get(id=book_id)
        return JsonResponse(book.to_dict())
    except Book.DoesNotExist:
        return JsonResponse({"error": "Book not found"}, status=404)


def book_by_author_list_view(request):
    try:
        authors = Author.objects.all()
        return JsonResponse(
            {
                "count": authors.count(),
                "results": [
                    {
                        "author": author.to_dict(),
                        "books": [book.to_dict() for book in author.books.all()],
                    }
                    for author in authors
                ],
            }
        )
    except Author.DoesNotExist:
        return JsonResponse({"error": "Author not found"}, status=404)


def author_list_view(request):
    authors = Author.objects.all()
    return JsonResponse(
        {
            "count": authors.count(),
            "results": [author.to_dict() for author in authors],
        }
    )


def author_details_view(request, author_id):
    try:
        author = Author.objects.get(id=author_id)
        return JsonResponse(author.to_dict())
    except Author.DoesNotExist:
        return JsonResponse({"error": "Author not found"}, status=404)
Enter fullscreen mode Exit fullscreen mode

Great, you now know how the web app works!

In the next section, we'll benchmark our app to detect N+1 queries with AppSignal and then modify the code to eliminate them.

Detect N+1 Queries in Your Django App with AppSignal

Detecting performance issues with AppSignal is easy. All you have to do is use/test the app as you normally would (for example, perform end-user testing by visiting all the endpoints and validating the responses).

When an endpoint is hit, AppSignal will create a performance report for it and group all related visits together. Each visit will be recorded as a sample in the endpoint's report.

Detect N+1 Queries in Views

Firstly, visit all your app's endpoints to generate the performance reports:

  1. /books/
  2. /books/<book_id>/
  3. /books/by-authors/
  4. /books/authors/
  5. /books/authors/<author_id>/

Next, let's use the AppSignal dashboard to analyze slow endpoints.

Example 1: One-To-One Relationship (select_related())

Navigate to your AppSignal app and select Performance > Issue list on the sidebar. Then click Mean to sort the issues by descending mean response time.

AppSignal Performance Issue List

Click on the slowest endpoint (books/) to view its details.

AppSignal Performance Issue List Details

Looking at the latest sample, we can see that this endpoint returns a response in 1090 milliseconds. The group breakdown shows that SQLite takes 651 milliseconds while Django takes 439.

This indicates a problem because an endpoint as simple as this shouldn't take as long.

To get more details on what happened, select Samples in the sidebar and then the latest sample.

AppSignal Performance Issue List Details Samples

Scroll down to the Event Timeline to see what SQL queries got executed.

AppSignal Performance Sample Events Timeline

Hovering over the query.sql text displays the actual SQL query.

More than 1000 queries were executed:

SELECT * FROM "books_book";
SELECT * FROM "books_author" WHERE "books_author"."id" = 3; -- 3= 1st book's author id
SELECT * FROM "books_author" WHERE "books_author"."id" = 6; -- 6= 2nd book's author id
...
SELECT * FROM "books_author" WHERE "books_author"."id" = n; -- n= n-th book's author id
Enter fullscreen mode Exit fullscreen mode

These are a clear sign of N+1 queries. The first query fetched a book (1), and each subsequent query fetched the book's author's details (N).

To fix it, navigate to books/views.py and modify book_list_view() like so:

# books/views.py
def book_list_view(request):
    books = Book.objects.all().select_related("author")  # modified
    return JsonResponse(
        {
            "count": books.count(),
            "results": [book.to_dict() for book in books],
        }
    )
Enter fullscreen mode Exit fullscreen mode

By utilizing Django's select_related() method, we select the additional related object data (i.e., author) in the initial query. The ORM will now leverage a SQL join, and the final query will look something like this:

SELECT * FROM "books_book"
    INNER JOIN "books_author" ON ("books_book"."author_id" = "books_author"."id")
Enter fullscreen mode Exit fullscreen mode

Wait for the development server to restart and retest the affected endpoint.

AppSignal After Fix Benchmark

After benchmarking again, the response time goes from 1090 to 45, and the number of queries lowers from 1024 to 2. This is a 24x and 512x improvement, respectively.

Example 2: Many-To-One Relationship (prefetch_related())

Next, let's look at the second slowest endpoint (books/by-authors/).

Use the dashboard as we did in the previous step to inspect the endpoint's SQL queries. You'll notice a similar but less severe N+1 pattern with this endpoint.

This endpoint's performance is less severe because Django is smart enough to cache the frequently executed SQL queries, i.e., repeatedly fetching the author of a book. Check out the official docs to learn more about Django caching.

Let's utilize prefetch_related() in books/views.py to speed up the endpoint:

# books/views.py
def book_by_author_list_view(request):
    try:
        authors = Author.objects.all().prefetch_related("books")  # modified
        return JsonResponse(
            {
                "count": authors.count(),
                "results": [
                    {
                        "author": author.to_dict(),
                        "books": [book.to_dict() for book in author.books.all()],
                    }
                    for author in authors
                ],
            }
        )
    except Author.DoesNotExist:
        return JsonResponse({"error": "Author not found"}, status=404)
Enter fullscreen mode Exit fullscreen mode

In the previous section, we used the select_related() method to handle a one-to-one relationship (each book has a single author). However, in this case, we're handling a one-to-many relationship (an author can have multiple books), so we must use prefetch_related().

The difference between these two methods is that select_related() works on the SQL level, while prefetch_related() optimizes on the Python level. The latter method can also be used for many-to-many relationships.

For more information, check out Django's official docs on prefetch_related().

After benchmarking, the response time goes from 90 to 44 milliseconds, and the number of queries lowers from 32 to 4.

Detect N+1 Queries in Django Admin

Discovering N+1 queries in the Django admin site works similarly.

First, log in to your admin site and generate performance reports (for example, create a few authors or books, update, and delete them).

Next, navigate to your AppSignal app dashboard, this time filtering the issues by admin:

AppSignal Performance Issue List Admin

In my case, the two slowest endpoints are:

  1. /admin/login
  2. /admin/books/author/<object_id>

We can't do much about /admin/login, since it's entirely handled by Django, so let's focus on the second slowest endpoint. Inspecting it will reveal an N+1 query problem. The author is fetched separately for each book.

To fix this, override get_queryset() in BookInline to fetch author details in the initial query:

# books/admin.py
class BookInline(admin.TabularInline):
    model = Book
    extra = 0

    def get_queryset(self, request):
        queryset = super().get_queryset(request)
        return queryset.select_related("author")
Enter fullscreen mode Exit fullscreen mode

Benchmark once again and verify that the number of queries has decreased.

Wrapping Up

In this post, we've discussed detecting and fixing N+1 queries in Django using AppSignal.

Leveraging what you've learned here can help you significantly speed up your Django web apps.

The two most essential methods to keep in mind are select_related() and prefetch_related(). The first is used for one-to-one relationships, and the second is for one-to-many and many-to-many relationships.

Happy coding!

P.S. If you'd like to read Python posts as soon as they get off the press, subscribe to our Python Wizardry newsletter and never miss a single post!

Top comments (0)