EXPLAIN ANALYZE: The PostgreSQL Command Every Django Developer Should Know

#django #database #postgres #backend

PROFESSIONAL DJANGO ENGINEERING SERIES #6

If you have never read a query execution plan, you are flying blind on database performance. Here is how to read one — and what the warning signs look like.

Django's ORM is excellent at writing SQL. It is not excellent at telling you whether that SQL is fast. For that, you need to go one level deeper — to the database itself, and to the execution plan it generates for every query you run.

EXPLAIN ANALYZE is PostgreSQL's most powerful diagnostic tool. It shows you not just what SQL your query produces, but how the database engine plans to execute it, how long it actually takes, and — most usefully — where it is spending that time.

Most Django performance problems are database problems. Most database problems are index problems. And most index problems are invisible until you look at the query plan.

Running EXPLAIN ANALYZE from Django

Django 3.2 added QuerySet.explain(), which gives you direct access to the database execution plan:

# In the Django shell (python manage.py shell_plus):
qs = Order.objects.filter(status='paid', user_id=42).select_related('user')
print(qs.explain(verbose=True, analyze=True))

# Or with raw SQL for more control:
from django.db import connection
with connection.cursor() as cursor:
    cursor.execute(
        'EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) '
        'SELECT * FROM orders_order WHERE status = %s AND user_id = %s',
        ['paid', 42]
    )
    for row in cursor.fetchall():
        print(row[0])

Reading the Output

A typical EXPLAIN ANALYZE output looks like this:

Seq Scan on orders_order  (cost=0.00..4821.00 rows=3912 width=52)
                          (actual time=0.05..28.4 rows=3912 loops=1)
  Filter: ((status = 'paid') AND (user_id = 42))
  Rows Removed by Filter: 48231
Planning Time: 0.8 ms
Execution Time: 28.9 ms

Here is what each part means:

Seq Scan — the most important warning sign
A Seq Scan means the database read every row in the table to find the ones matching your filter. It is sometimes unavoidable on very small tables, but on a table with tens of thousands of rows, a Seq Scan on a filtered column almost always means a missing index.

Rows Removed by Filter — measuring inefficiency
In the example above, 48,231 rows were read and discarded to find 3,912 matching rows. The database is reading 12 rows for every 1 it keeps. An index on status and user_id would let the database jump directly to the matching rows without reading the discarded ones.

cost=X..Y — the planner’s estimate
The cost values are the planner's estimates in arbitrary units. The first number is startup cost (getting the first row), the second is total cost. Lower is better, but absolute numbers are less useful than the ratio between different parts of your query plan.

actual time — ground truth
The actual time values are real milliseconds from the execution. If startup time is high, your query is doing expensive initialization. If total time is high but startup time is low, the bottleneck is in processing the rows.

The Warning Signs to Look For

Seq Scan on a large table — almost always means a missing index on the filter column.
High Rows Removed by Filter — large ratio of rows read to rows kept. An index on the filter column would prevent this.
Nested Loop with a large outer side — the SQL-level manifestation of N+1. The inner side is queried once per outer row.
Sort on an un-indexed column — if you order by this column frequently, add an index.

Adding the Right Index

Once you have identified a missing index from the query plan, adding it in Django is straightforward:

# models.py
class Order(models.Model):
    user   = models.ForeignKey(settings.AUTH_USER_MODEL, ...)
    status = models.CharField(max_length=20, ...)

    class Meta:
        indexes = [
            # Composite index: supports queries filtering on both columns
            # Also supports queries on 'status' alone (leftmost prefix rule)
            models.Index(fields=['status', 'user'], name='order_status_user_idx'),

            # Partial index: only index active orders
            # Smaller, faster, more selective than a full-table index
            models.Index(
                fields=['user', 'created_at'],
                condition=Q(status__in=['pending', 'paid']),
                name='order_active_user_created_idx',
            ),
        ]

Building the Habit

The habit worth building: after you write any query that will be called frequently in production, run it through EXPLAIN ANALYZE in the shell. Look for Seq Scans on large tables. Look at the Rows Removed by Filter ratio. Look at the actual time.
This takes two minutes. It catches the performance problems that will take two hours to diagnose in production after users start complaining. The two-minute habit is always worth it.

✓ Use django-debug-toolbar for everyday development: For development workflow, django-debug-toolbar's SQL panel gives you query counts and execution times on every page load without going to the shell. Reserve EXPLAIN ANALYZE for queries that the toolbar has flagged as slow or that appear suspiciously frequently.

Database performance is not a black art. It is a skill built on a small set of fundamentals: understanding what EXPLAIN ANALYZE tells you, knowing when a Seq Scan is a problem, understanding index selection, and measuring before optimizing. EXPLAIN ANALYZE is where all of that starts.

Found this useful? There’s a full book. This article is drawn from Professional Django Engineering (Chapter 7 — Database Performance). The book covers 16 chapters across every major aspect of production Django development: architecture, ORM performance, REST APIs, security, testing, caching, async, and deployment. Every chapter follows the same pattern: real anti-patterns, professional solutions, and a pitfall reference table you can bookmark. ▶ Available on Amazon KDP. Link in comments.

Top comments (1)

Some comments may only be visible to logged-in visitors. Sign in to view all comments.