Introduction
Building a REST API with Django REST Framework (DRF) is straightforward — but making it fast and scalable with PostgreSQL requires understanding advanced query techniques. In this guide, you'll learn how to write optimized queries, avoid common pitfalls like the N+1 problem, use PostgreSQL-specific features, and measure performance.
Prerequisites
- A Django + DRF project set up
- PostgreSQL connected via psycopg2
- A basic understanding of the Django ORM
Project Setup

We'll use a simple blogging API with these models:

```python
# models.py
from django.db import models


class Author(models.Model):
    name = models.CharField(max_length=100)
    email = models.EmailField(unique=True)
    bio = models.TextField(blank=True)

    def __str__(self):
        return self.name


class Tag(models.Model):
    name = models.CharField(max_length=50, unique=True)

    def __str__(self):
        return self.name


class Post(models.Model):
    title = models.CharField(max_length=200)
    content = models.TextField()
    author = models.ForeignKey(Author, on_delete=models.CASCADE, related_name='posts')
    tags = models.ManyToManyField(Tag, blank=True)
    created_at = models.DateTimeField(auto_now_add=True)
    published = models.BooleanField(default=False)
    view_count = models.PositiveIntegerField(default=0)

    class Meta:
        indexes = [
            models.Index(fields=['created_at']),
            models.Index(fields=['author', 'published']),
        ]

    def __str__(self):
        return self.title
```
Notice the Meta.indexes — we're already thinking about query performance at the model level.
The N+1 Problem (And How to Kill It)

The N+1 problem is the #1 performance killer in Django APIs. It happens when you fetch a list of objects and then make a separate database query for each one.

The bad way:

```python
# views.py — DON'T DO THIS
class PostListView(generics.ListAPIView):
    queryset = Post.objects.all()
    serializer_class = PostSerializer
```

```python
# serializers.py — This triggers N+1
class PostSerializer(serializers.ModelSerializer):
    author_name = serializers.SerializerMethodField()

    def get_author_name(self, obj):
        return obj.author.name  # 1 query per post!

    class Meta:
        model = Post
        fields = ['id', 'title', 'author_name']
```
With 100 posts, this runs 101 queries — 1 to fetch posts, then 1 per post to get the author.
The fix — select_related for ForeignKey:

```python
# views.py — DO THIS
class PostListView(generics.ListAPIView):
    serializer_class = PostSerializer

    def get_queryset(self):
        return Post.objects.select_related('author').all()
```

Now Django fetches everything in a single SQL query with a JOIN.
Use prefetch_related for ManyToMany:

```python
def get_queryset(self):
    return Post.objects.select_related('author').prefetch_related('tags').all()
```
This runs 2 queries total (posts + tags), regardless of how many rows are returned.
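To see the query shapes behind these two patterns outside of Django, here's a minimal sketch using the stdlib sqlite3 module (SQLite stands in for PostgreSQL purely for illustration; the tables and data are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Sri'), (2, 'Ana');
    INSERT INTO post VALUES (1, 'A', 1), (2, 'B', 2), (3, 'C', 1);
""")

# N+1 pattern: one query for the posts, then one more per post for its author
queries = 0
posts = conn.execute("SELECT id, title, author_id FROM post").fetchall()
queries += 1
for _post_id, _title, author_id in posts:
    conn.execute("SELECT name FROM author WHERE id = ?", (author_id,))
    queries += 1
print(queries)  # 4 queries for just 3 posts

# select_related pattern: a single JOIN returns posts and authors together
joined = conn.execute("""
    SELECT post.title, author.name
    FROM post JOIN author ON author.id = post.author_id
""").fetchall()
print(len(joined))  # 3 rows from 1 query
```

The same arithmetic scales: the loop costs N+1 queries, while the JOIN stays at one no matter how many posts there are.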
Filtering with django-filter

Install the package:

```bash
pip install django-filter
```

Add to settings:

```python
INSTALLED_APPS = [..., 'django_filters']

REST_FRAMEWORK = {
    'DEFAULT_FILTER_BACKENDS': ['django_filters.rest_framework.DjangoFilterBackend'],
}
```
Create a filter class:

```python
# filters.py
import django_filters

from .models import Post


class PostFilter(django_filters.FilterSet):
    # Note the double underscores for traversing relations
    author = django_filters.CharFilter(field_name='author__name', lookup_expr='icontains')
    tag = django_filters.CharFilter(field_name='tags__name', lookup_expr='iexact')
    published = django_filters.BooleanFilter()
    created_after = django_filters.DateFilter(field_name='created_at', lookup_expr='gte')
    created_before = django_filters.DateFilter(field_name='created_at', lookup_expr='lte')

    class Meta:
        model = Post
        fields = ['author', 'tag', 'published', 'created_after', 'created_before']
```
Wire it up in the view:
```python
from .filters import PostFilter


class PostListView(generics.ListAPIView):
    serializer_class = PostSerializer
    filterset_class = PostFilter

    def get_queryset(self):
        return Post.objects.select_related('author').prefetch_related('tags')
```
Now your API supports queries like:
GET /api/posts/?author=sri&published=true&created_after=2026-01-01
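Under the hood, the icontains lookup compiles to a case-insensitive pattern match (ILIKE on PostgreSQL). A tiny sqlite3 sketch of the same idea, with invented names for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (name TEXT)")
conn.executemany("INSERT INTO author VALUES (?)", [("Srinivas",), ("Ana",)])

# icontains ~ case-insensitive LIKE '%term%' (SQLite's LIKE ignores ASCII case)
rows = conn.execute(
    "SELECT name FROM author WHERE name LIKE '%' || ? || '%'", ("sri",)
).fetchall()
print(rows)  # [('Srinivas',)]
```

That's why `?author=sri` in the request above matches an author named "Srinivas".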
PostgreSQL-Specific Features

This is where PostgreSQL really shines over SQLite or MySQL.

Full-Text Search

```python
from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank


class PostSearchView(generics.ListAPIView):
    serializer_class = PostSerializer

    def get_queryset(self):
        query = self.request.query_params.get('q', '')
        if not query:
            return Post.objects.none()
        search_query = SearchQuery(query)
        search_vector = SearchVector('title', weight='A') + SearchVector('content', weight='B')
        return (
            Post.objects
            .annotate(rank=SearchRank(search_vector, search_query))
            .filter(rank__gte=0.1)
            .order_by('-rank')
            .select_related('author')
        )
```
Usage: GET /api/posts/search/?q=django+authentication
PostgreSQL ranks results by relevance — title matches score higher than body matches.
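If you want to experiment with ranked full-text search without a PostgreSQL instance, SQLite's FTS5 extension offers a rough analogue (this sketch assumes your Python's sqlite3 build ships FTS5, which most do; the sample rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table: loosely analogous to a search vector over title + content
conn.execute("CREATE VIRTUAL TABLE post_fts USING fts5(title, content)")
conn.executemany(
    "INSERT INTO post_fts VALUES (?, ?)",
    [
        ("Django authentication", "Securing your API endpoints"),
        ("Cooking pasta", "Boil water first"),
    ],
)

# MATCH tokenizes and lowercases, so 'django' finds 'Django';
# ORDER BY rank sorts best matches first, like SearchRank
rows = conn.execute(
    "SELECT title FROM post_fts WHERE post_fts MATCH ? ORDER BY rank", ("django",)
).fetchall()
print(rows)  # [('Django authentication',)]
```

FTS5 has no weighted-field ranking as convenient as SearchVector's weight='A'/'B', which is one reason the PostgreSQL version above is worth the setup.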
ArrayField for Tags (Alternative Approach)

```python
from django.contrib.postgres.fields import ArrayField


class Article(models.Model):
    title = models.CharField(max_length=200)
    keywords = ArrayField(models.CharField(max_length=50), blank=True, default=list)
```

Query it:

```python
# Articles where keywords contain 'django'
Article.objects.filter(keywords__contains=['django'])

# Articles where keywords overlap with a list
Article.objects.filter(keywords__overlap=['django', 'python'])
```
JSONField for Flexible Metadata

```python
class Post(models.Model):
    # ... other fields
    metadata = models.JSONField(default=dict, blank=True)
```

```python
# Store anything
post.metadata = {'reading_time': 5, 'difficulty': 'intermediate', 'version': '4.2'}
post.save()

# Query into the JSON
Post.objects.filter(metadata__difficulty='intermediate')
Post.objects.filter(metadata__reading_time__gte=3)
```
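The same key-path queries exist in raw SQL. Here's a sqlite3 sketch of what metadata__difficulty conceptually compiles to (assuming your sqlite3 build includes the JSON1 functions; PostgreSQL uses the -> and ->> operators instead, and the rows here are made up):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, metadata TEXT)")
conn.executemany(
    "INSERT INTO post VALUES (?, ?)",
    [
        (1, json.dumps({"reading_time": 5, "difficulty": "intermediate"})),
        (2, json.dumps({"reading_time": 2, "difficulty": "beginner"})),
    ],
)

# metadata__difficulty='intermediate' ~ extract the key, compare the value
rows = conn.execute(
    "SELECT id FROM post WHERE json_extract(metadata, '$.difficulty') = ?",
    ("intermediate",),
).fetchall()
print(rows)  # [(1,)]
```

On PostgreSQL, a GIN index on the JSONField is what keeps these lookups fast at scale.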
Aggregations & Annotations

Annotations let you add computed fields directly in the queryset — pushing work to the database instead of Python.

```python
from django.db.models import Avg, Count, Q, Sum


class AuthorStatsView(generics.ListAPIView):
    serializer_class = AuthorStatsSerializer

    def get_queryset(self):
        return Author.objects.annotate(
            post_count=Count('posts'),
            published_count=Count('posts', filter=Q(posts__published=True)),
            total_views=Sum('posts__view_count'),
            avg_views=Avg('posts__view_count'),
        ).order_by('-total_views')
```
The serializer:
```python
class AuthorStatsSerializer(serializers.ModelSerializer):
    post_count = serializers.IntegerField()
    published_count = serializers.IntegerField()
    total_views = serializers.IntegerField()
    avg_views = serializers.FloatField()

    class Meta:
        model = Author
        fields = ['id', 'name', 'post_count', 'published_count', 'total_views', 'avg_views']
```
This is one SQL query that returns all stats — no Python loops needed.
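For intuition, here's roughly the single GROUP BY query that the annotate() call pushes to the database, sketched with stdlib sqlite3 on made-up data (the real SQL Django generates differs in detail):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER,
                       published INTEGER, view_count INTEGER);
    INSERT INTO author VALUES (1, 'Sri');
    INSERT INTO post VALUES (1, 1, 1, 100), (2, 1, 0, 40);
""")

# One query computes every stat; no Python loops over rows
row = conn.execute("""
    SELECT author.name,
           COUNT(post.id)                                        AS post_count,
           COUNT(CASE WHEN post.published = 1 THEN post.id END)  AS published_count,
           SUM(post.view_count)                                  AS total_views,
           AVG(post.view_count)                                  AS avg_views
    FROM author
    LEFT JOIN post ON post.author_id = author.id
    GROUP BY author.id, author.name
""").fetchone()
print(row)  # ('Sri', 2, 1, 140, 70.0)
```

The CASE inside COUNT is how the `filter=Q(...)` argument translates: only published posts produce a non-null value to count.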
Pagination for Large Datasets

Never return unbounded querysets in production. DRF has three built-in strategies:

```python
# settings.py
REST_FRAMEWORK = {
    'DEFAULT_PAGINATION_CLASS': 'rest_framework.pagination.CursorPagination',
    'PAGE_SIZE': 20,
}
```

Cursor pagination is the most efficient for large tables. Despite the name, it doesn't open a database cursor: it uses keyset pagination, filtering on the ordering field with a WHERE clause instead of OFFSET, which gets slow on millions of rows.

Custom cursor pagination:

```python
from rest_framework.pagination import CursorPagination


class PostCursorPagination(CursorPagination):
    page_size = 20
    ordering = '-created_at'
    cursor_query_param = 'cursor'
```
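To make the OFFSET-vs-keyset difference concrete, here's a sqlite3 sketch of the two SQL shapes (the keyset form is essentially what CursorPagination emits, minus the cursor encoding; the data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO post VALUES (?, ?)",
    [(i, f"2026-01-{i:02d}") for i in range(1, 11)],
)
page_size = 3

# OFFSET pagination: the database still walks past every skipped row
page2_offset = conn.execute(
    "SELECT id FROM post ORDER BY created_at DESC LIMIT ? OFFSET ?",
    (page_size, page_size),
).fetchall()

# Keyset pagination: jump straight past page 1 via the ordering column
last_seen = "2026-01-08"  # created_at of the last row on page 1
page2_keyset = conn.execute(
    "SELECT id FROM post WHERE created_at < ? ORDER BY created_at DESC LIMIT ?",
    (last_seen, page_size),
).fetchall()

print(page2_offset == page2_keyset)  # True: same page, cheaper plan
```

With an index on created_at, the keyset query stays fast no matter how deep you paginate, which is exactly why CursorPagination needs a stable `ordering` field.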
Measuring Query Performance

Never guess — always measure.

Django Debug Toolbar (Development)

```bash
pip install django-debug-toolbar
```

```python
# settings.py
INSTALLED_APPS = [..., 'debug_toolbar']
MIDDLEWARE = ['debug_toolbar.middleware.DebugToolbarMiddleware', ...]
INTERNAL_IPS = ['127.0.0.1']
```

This shows every SQL query, its duration, and a stack trace in the browser.

Query Counting in Tests

```python
# tests.py
from django.test import TestCase

from .models import Author, Post


class PostListQueryTest(TestCase):
    def setUp(self):
        # Create test data
        author = Author.objects.create(name='Sri', email='sri@example.com')
        for i in range(10):
            Post.objects.create(title=f'Post {i}', author=author, published=True)

    def test_post_list_query_count(self):
        with self.assertNumQueries(2):  # 1 for posts, 1 for tags
            response = self.client.get('/api/posts/')
            self.assertEqual(response.status_code, 200)
```
If this test fails with 12 queries instead of 2, you know you have an N+1 somewhere.
Explain Analyze (PostgreSQL)

```python
queryset = Post.objects.select_related('author').filter(published=True)
print(queryset.query)  # Raw SQL
```

```sql
-- Run EXPLAIN ANALYZE in psql
EXPLAIN ANALYZE SELECT * FROM blog_post WHERE published = true;
```
Look for Seq Scan on large tables — that's a signal you need an index.
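The same scan-vs-index behavior can be observed with SQLite's EXPLAIN QUERY PLAN from a Python shell (the syntax and plan names differ from PostgreSQL's EXPLAIN ANALYZE, but the concept carries over; the table here is a stand-in):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_post (id INTEGER PRIMARY KEY, published INTEGER)")

# Without an index: a full table scan (SQLite prints SCAN, PostgreSQL prints Seq Scan)
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM blog_post WHERE published = 1"
).fetchall()
print(plan_before[0][3])  # the human-readable plan detail

# After adding an index, the plan switches to an index search
conn.execute("CREATE INDEX idx_published ON blog_post (published)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM blog_post WHERE published = 1"
).fetchall()
print(plan_after[0][3])
```

In PostgreSQL you'd watch for the same transition: Seq Scan before the index, Index Scan (or Bitmap Index Scan) after.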
Putting It All Together — Optimized ViewSet

```python
# views.py
from rest_framework import viewsets, filters
from django_filters.rest_framework import DjangoFilterBackend

from .models import Post
from .serializers import PostSerializer
from .filters import PostFilter
from .pagination import PostCursorPagination


class PostViewSet(viewsets.ReadOnlyModelViewSet):
    serializer_class = PostSerializer
    filterset_class = PostFilter
    pagination_class = PostCursorPagination
    filter_backends = [DjangoFilterBackend, filters.SearchFilter, filters.OrderingFilter]
    search_fields = ['title', 'content']
    ordering_fields = ['created_at', 'view_count']
    ordering = ['-created_at']

    def get_queryset(self):
        return (
            Post.objects
            .select_related('author')
            .prefetch_related('tags')
            .filter(published=True)
            # .only() fetches just the columns you need — saves bandwidth
            .only('id', 'title', 'created_at', 'view_count', 'author__name')
        )
```
The .only() call is the final optimization — it restricts the SELECT clause to the columns your serializer actually uses, reducing data transfer. One caveat: if the serializer later touches a field you excluded, Django silently runs an extra query per object to fetch it, reintroducing the N+1 problem, so keep the .only() list in sync with the serializer's fields.
Key Takeaways
| Technique | When to Use |
| --- | --- |
| select_related | ForeignKey / OneToOne relationships |
| prefetch_related | ManyToMany / reverse FK relationships |
| annotate() | Computed fields — push math to the DB |
| only() / defer() | Large tables where you don't need all columns |
| Full-text search | Search bars with relevance ranking |
| Cursor pagination | Large datasets (10k+ rows) |
| assertNumQueries | Catch N+1 regressions in CI |
What's Next
In Part 4, we'll set up a full CI/CD pipeline for Django using GitHub Actions — automated testing, linting, and deployment on every push.
Part of the Django Production Series. Part 1 — Authentication | Part 2 — Docker Deployment