DEV Community

Empiric Infotech LLP

7 Django + DRF Patterns That Separate a Prototype From a Production Backend

Django gets you to a working CRUD API in an afternoon. The gap nobody warns you about is the distance between that and a backend that survives multi-tenancy, background jobs, payment webhooks, and an auditor asking "who changed this record on March 3rd?"

I have shipped Django backends for SaaS, EdTech, and fintech products since the Django 2.x days. Below are seven patterns that consistently come up when a prototype becomes a real system. Each one comes with code or a concrete design, the failure it prevents, and, where it applies, a note on when it is overkill.

If you only take one thing from this post: most Django scaling pain is not about Django being slow. It is about defaults that were fine for a tutorial quietly becoming wrong at scale.

1. Thin views, fat services (not fat models)

The classic Django advice is "fat models, thin views." It falls apart the moment a single business action touches three models, an external API, and an email. Stuffing that into a model method makes it untestable and couples persistence to business logic.

Put orchestration in a service layer instead.

# services/subscriptions.py
from django.db import transaction
from .models import Subscription, Invoice
from .gateways import stripe_gateway

class SubscriptionError(Exception):
    pass

@transaction.atomic
def activate_subscription(*, user, plan, payment_method_id):
    if Subscription.objects.filter(user=user, status="active").exists():
        raise SubscriptionError("User already has an active subscription")

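    # A database rollback cannot undo this external charge. The idempotency
    # key is what makes a retry after a failed transaction safe.
    charge = stripe_gateway.charge(
        amount=plan.price_cents,
        payment_method=payment_method_id,
        idempotency_key=f"sub-{user.id}-{plan.id}",
    )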

    subscription = Subscription.objects.create(
        user=user, plan=plan, status="active",
        stripe_charge_id=charge.id,
    )
    Invoice.objects.create(subscription=subscription, amount_cents=plan.price_cents)
    return subscription

Your view becomes boring, which is the goal:

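from django.shortcuts import get_object_or_404
from rest_framework.response import Response
from rest_framework.views import APIView

class ActivateSubscriptionView(APIView):
    def post(self, request):
        # In real code, validate request.data with a serializer first so a
        # missing key returns a 400 instead of raising KeyError.
        try:
            sub = activate_subscription(
                user=request.user,
                plan=get_object_or_404(Plan, pk=request.data["plan_id"]),
                payment_method_id=request.data["payment_method_id"],
            )
        except SubscriptionError as e:
            return Response({"detail": str(e)}, status=409)
        return Response(SubscriptionSerializer(sub).data, status=201)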

When it is overkill: plain CRUD with no cross-model logic. Do not add a service layer to wrap a one-line .save().

2. Make the N+1 query impossible to forget

DRF serializers trigger N+1 queries silently. A list endpoint with a nested author field issues one extra query per row, and you will not notice until the table has 50,000 rows.

Fix it at the queryset, and prove it in a test so it cannot regress.

from rest_framework.viewsets import ModelViewSet

class ArticleViewSet(ModelViewSet):
    serializer_class = ArticleSerializer

    def get_queryset(self):
        return (
            Article.objects
            .select_related("author")          # FK / one-to-one
            .prefetch_related("tags")          # M2M / reverse FK
        )

# tests/test_queries.py
import pytest
from django.db import connection
from django.test.utils import CaptureQueriesContext

@pytest.mark.django_db  # pytest-django: the test needs database access
def test_article_list_is_constant_queries(client):
    ArticleFactory.create_batch(25)
    with CaptureQueriesContext(connection) as ctx:
        response = client.get("/api/articles/")
    # Guard against a 403/404 passing the query-count check by accident
    assert response.status_code == 200
    assert len(ctx.captured_queries) <= 4  # does not grow with row count

That assertion is the actual value. Anyone can add select_related. The test stops a teammate from removing it six months later.
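
To make the cost concrete outside of Django, here is a minimal sketch that uses only the standard library's sqlite3 module. The table layout and the helper names (build_db, count_selects, list_naive, list_joined) are assumptions made for illustration. It counts SELECT statements for a per-row lookup versus a single JOIN, which is roughly what select_related produces.

```python
import sqlite3

def build_db():
    """Create an in-memory database with 25 authors and 25 articles."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE article (
            id INTEGER PRIMARY KEY,
            title TEXT,
            author_id INTEGER REFERENCES author(id)
        );
    """)
    conn.executemany(
        "INSERT INTO author (id, name) VALUES (?, ?)",
        [(i, f"author-{i}") for i in range(1, 26)],
    )
    conn.executemany(
        "INSERT INTO article (id, title, author_id) VALUES (?, ?, ?)",
        [(i, f"article-{i}", i) for i in range(1, 26)],
    )
    conn.commit()
    return conn

def count_selects(conn, fn):
    """Run fn(conn) and return (result, number of SELECT statements executed)."""
    statements = []
    conn.set_trace_callback(statements.append)
    try:
        result = fn(conn)
    finally:
        conn.set_trace_callback(None)
    selects = [s for s in statements if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)

def list_naive(conn):
    """One query for the list, then one more query per row (the N+1 shape)."""
    rows = conn.execute("SELECT id, title, author_id FROM article").fetchall()
    out = []
    for _id, title, author_id in rows:
        name = conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)
        ).fetchone()[0]
        out.append((title, name))
    return out

def list_joined(conn):
    """A single JOIN, independent of the number of rows."""
    return conn.execute(
        "SELECT article.title, author.name FROM article "
        "JOIN author ON author.id = article.author_id"
    ).fetchall()
```

With 25 rows the naive version runs 26 SELECTs and the joined version runs 1. Both return the same data, which is exactly why the problem stays invisible until the table grows.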

3. Multi-tenancy: scope every query at the boundary

Multi-tenant SaaS has one catastrophic bug class: tenant A reading tenant B's data because one queryset forgot to filter. Do not rely on developers remembering. Enforce it.

A pragmatic shared-database approach uses a custom queryset plus a base DRF viewset:

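from django.db import models
from rest_framework.viewsets import ModelViewSet

class TenantQuerySet(models.QuerySet):
    def for_tenant(self, tenant):
        return self.filter(tenant=tenant)

# Each tenant-owned model declares: objects = TenantQuerySet.as_manager()

class TenantScopedViewSet(ModelViewSet):
    """Every subclass is tenant-scoped by default."""
    def get_queryset(self):
        # Relies on the model's manager exposing for_tenant()
        return super().get_queryset().for_tenant(self.request.tenant)

    def perform_create(self, serializer):
        serializer.save(tenant=self.request.tenant)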

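request.tenant is resolved once in middleware (from the subdomain, a header, or a JWT claim). For any view that inherits the base class and does not override get_queryset without calling super(), a forgotten filter stops being a code-review judgment call and becomes structurally hard to write. The model's manager must be built from TenantQuerySet (for example objects = TenantQuerySet.as_manager()) so that for_tenant is available.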
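
The subdomain case is the one that is easiest to get subtly wrong, so here is a minimal sketch of the parsing step on its own. The function name tenant_slug_from_host and the single-level subdomain rule are assumptions for illustration, not part of Django or DRF. It does not handle IPv6 literals.

```python
from typing import Optional

def tenant_slug_from_host(host: str, base_domain: str) -> Optional[str]:
    """Return the tenant slug for a host like 'acme.example.com', else None.

    Hypothetical helper: assumes tenants live exactly one level below
    base_domain and that the Host header may carry a port.
    """
    hostname = host.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + base_domain.lower()
    if not hostname.endswith(suffix):
        return None  # bare domain, or a host that is not ours at all
    slug = hostname[: -len(suffix)]
    if not slug or "." in slug:
        return None  # reject nested subdomains such as a.b.example.com
    return slug
```

A middleware would call this with request.get_host(), look up the tenant row by slug, and return a 404 when the result is None, so that no view ever runs without a tenant attached.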

For hard isolation requirements (HIPAA, PCI), schema-per-tenant with django-tenants is the heavier alternative. Pick row-level scoping until compliance forces schemas, because schema-per-tenant complicates migrations significantly.

4. Treat Celery tasks as if they will run twice

Network blips, worker restarts, and broker redeliveries mean any task can run more than once. Non-idempotent tasks corrupt data quietly. Design every task to be safe on a second run.

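from celery import shared_task
from django.db import transaction
from django.utils import timezone

@shared_task(bind=True, max_retries=3, default_retry_delay=30)
def send_invoice_email(self, invoice_id):
    # select_for_update() must run inside a transaction; the row lock keeps
    # two workers from sending the same invoice at the same time.
    with transaction.atomic():
        invoice = Invoice.objects.select_for_update().get(pk=invoice_id)
        if invoice.email_sent_at:
            return  # already done, second run is a no-op

        try:
            email_gateway.send_invoice(invoice)
        except TransientEmailError as exc:
            raise self.retry(exc=exc)

        invoice.email_sent_at = timezone.now()
        invoice.save(update_fields=["email_sent_at"])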

Two cheap habits that save real incidents:

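  • Pass IDs, never model instances. Celery's default JSON serializer cannot encode a model instance at all, and with pickle a stale copy of the object can overwrite fresh data when the task saves it.
  • Set max_retries and default_retry_delay deliberately. Celery's defaults (3 retries, 180 seconds apart) are rarely what a given downstream service needs, and a short fixed delay will hammer a service that is already struggling.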
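
A fixed delay is the simplest option. If the downstream service is fragile, a growing delay is gentler. Below is a minimal, framework-free sketch of capped exponential backoff with optional jitter. The function name retry_delay and its default values are assumptions for illustration. Celery has its own retry_backoff options that serve the same purpose, so treat this only as a picture of the arithmetic.

```python
import random

def retry_delay(attempt: int, base: float = 30.0, cap: float = 600.0,
                jitter: bool = True) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper: doubles the delay on each attempt, never exceeds
    `cap`, and optionally picks a random point in [0, delay] so that many
    failed tasks do not all retry at the same instant.
    """
    raw = min(cap, base * (2 ** attempt))
    if jitter:
        return random.uniform(0, raw)
    return raw
```

Inside a bound task, the result could be passed as self.retry(exc=exc, countdown=retry_delay(self.request.retries)).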

5. Payment webhooks: verify, dedupe, respond fast

Stripe (and every other gateway) retries webhooks. They arrive out of order. They can be forged if you skip signature verification. A webhook handler that does real work inline will also time out and trigger more retries, making things worse.

The reliable shape: verify, record, enqueue, return 200 immediately.

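import stripe
from django.conf import settings
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST

@csrf_exempt
@require_POST
def stripe_webhook(request):
    try:
        event = stripe.Webhook.construct_event(
            request.body,
            # .get() so a missing header yields a 400, not a KeyError and a 500
            request.META.get("HTTP_STRIPE_SIGNATURE", ""),
            settings.STRIPE_WEBHOOK_SECRET,
        )
    except (ValueError, stripe.error.SignatureVerificationError):
        return HttpResponse(status=400)

    # Dedupe on Stripe's event id; get_or_create is the idempotency gate.
    # This relies on unique=True for WebhookEvent.stripe_event_id.
    _, created = WebhookEvent.objects.get_or_create(
        stripe_event_id=event["id"],
        defaults={"type": event["type"], "payload": event},
    )
    if created:
        process_webhook_event.delay(event["id"])

    return HttpResponse(status=200)  # ack fast, work happens async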

The get_or_create on the gateway's event id is the whole trick. Duplicates hit the unique constraint and become no-ops. Out-of-order delivery is handled in the async processor where you have time to reconcile state.
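
To show what the two gates do without any Stripe or Django dependency, here is a simplified standard-library sketch. The helper names (sign, verify, make_store, record_event) and the table layout are assumptions for illustration. The signature format follows the scheme Stripe documents (a timestamp and an HMAC-SHA256 over "timestamp.payload"), but production code should keep using stripe.Webhook.construct_event.

```python
import hashlib
import hmac
import sqlite3
import time

def sign(payload: bytes, secret: str, timestamp: int) -> str:
    """Build a 't=...,v1=...' header for the payload (used here for testing)."""
    signed = f"{timestamp}.".encode() + payload
    digest = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={digest}"

def verify(payload: bytes, header: str, secret: str,
           tolerance: int = 300, now=None) -> bool:
    """Return True if the header matches the payload and is recent enough."""
    parts = dict(p.split("=", 1) for p in header.split(",") if "=" in p)
    try:
        timestamp = int(parts["t"])
        received = parts["v1"]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance:
        return False  # too old or too far in the future: possible replay
    expected = sign(payload, secret, timestamp).split("v1=", 1)[1]
    # Constant-time comparison avoids leaking how many characters matched
    return hmac.compare_digest(expected.encode(), received.encode())

def make_store() -> sqlite3.Connection:
    """In-memory table whose primary key plays the role of the unique constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE webhook_event (event_id TEXT PRIMARY KEY)")
    return conn

def record_event(conn: sqlite3.Connection, event_id: str) -> bool:
    """Return True only the first time an event id is seen."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO webhook_event (event_id) VALUES (?)", (event_id,)
    )
    conn.commit()
    return cur.rowcount == 1
```

Only a request that passes verify and gets True from record_event should enqueue work. Everything else is either rejected with a 400 or acknowledged with a 200 and dropped.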

6. Audit logging that an auditor will actually accept

"Who changed this, when, and what was the previous value" is a hard requirement in fintech and healthtech, and a nice-to-have everywhere else. Bolting it on after the fact means backfilling history you do not have. Build it in from the first model.

A lightweight version using Django signals:

from django.conf import settings
from django.db import models

class AuditLog(models.Model):
    # settings.AUTH_USER_MODEL keeps this working with a custom user model
    actor = models.ForeignKey(
        settings.AUTH_USER_MODEL, null=True, on_delete=models.SET_NULL
    )
    action = models.CharField(max_length=16)   # created / updated / deleted
    model = models.CharField(max_length=100)
    object_id = models.CharField(max_length=64)
    changes = models.JSONField(default=dict)    # {field: [old, new]}
    created_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        indexes = [models.Index(fields=["model", "object_id"])]

Capture the diff in a pre_save handler by comparing the instance against its current database row, and stamp the actor from a thread-local or context variable that request middleware sets. Signals do not fire for QuerySet.update() or bulk_create(), so bulk writes need their own logging. For anything regulated, append-only matters: revoke UPDATE and DELETE on the audit table at the database role level so even a buggy migration cannot rewrite history. An audit log you can edit is not an audit log.
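
The diff itself is plain dictionary work. Here is a minimal sketch that produces the {field: [old, new]} shape stored in the changes column. The function name compute_changes and the ignore list are assumptions for illustration.

```python
from typing import Any, Dict, Iterable, List, Optional

def compute_changes(old: Optional[Dict[str, Any]], new: Dict[str, Any],
                    ignore: Iterable[str] = ("updated_at",)) -> Dict[str, List[Any]]:
    """Return {field: [old_value, new_value]} for every field that differs.

    Hypothetical helper: `old` is None when the row is being created, so
    every populated field is reported with None as its previous value.
    """
    skipped = set(ignore)
    before = old or {}
    changes: Dict[str, List[Any]] = {}
    for field in sorted(set(before) | set(new)):
        if field in skipped:
            continue
        old_value, new_value = before.get(field), new.get(field)
        if old_value != new_value:
            changes[field] = [old_value, new_value]
    return changes
```

In a pre_save handler, both dictionaries could come from django.forms.models.model_to_dict, applied once to the row loaded from the database and once to the instance about to be saved. An empty result means nothing changed and no audit row is needed.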

7. Settings split + health checks, before you deploy anything

Two unglamorous things that prevent outages.

Split settings by environment so a debug flag can never reach production:

settings/
  base.py         # shared
  development.py  # DEBUG=True, console email
  production.py   # DEBUG=False, real secrets from env, security headers

# settings/production.py
from .base import *  # noqa
# `env` is assumed to be defined in base.py (for example a django-environ Env)
DEBUG = False
SECURE_HSTS_SECONDS = 31536000  # one year
SECURE_SSL_REDIRECT = True
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
ALLOWED_HOSTS = env.list("ALLOWED_HOSTS")

A real health endpoint that checks dependencies, so your load balancer pulls a broken instance before users see 500s:

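from django.core.cache import cache
from django.db import connection
from django.http import JsonResponse

def healthz(request):
    checks = {}
    try:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
        checks["db"] = "ok"
    except Exception:
        checks["db"] = "fail"
    try:
        # cache.set() returns None, so write a value and read it back
        cache.set("hc", 1, 5)
        checks["cache"] = "ok" if cache.get("hc") == 1 else "fail"
    except Exception:
        checks["cache"] = "fail"
    status = 200 if all(v == "ok" for v in checks.values()) else 503
    return JsonResponse(checks, status=status)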
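
Once there are more than two dependencies, the try/except pairs start to repeat. A minimal, framework-free sketch of the same idea follows. The function name run_checks and the convention that a check signals failure by raising are assumptions for illustration.

```python
from typing import Callable, Dict, Tuple

def run_checks(checks: Dict[str, Callable[[], object]]) -> Tuple[int, Dict[str, str]]:
    """Run each named check and return (http_status, {name: 'ok' | 'fail'}).

    Hypothetical helper: a check passes if it returns without raising.
    Any exception marks that dependency as failed and the whole result as 503.
    """
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception:
            results[name] = "fail"
    status = 200 if all(v == "ok" for v in results.values()) else 503
    return status, results
```

A view would build the dictionary of checks (database, cache, broker, and so on), call run_checks, and pass the two return values straight to JsonResponse.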

When Django is the wrong tool

Honesty serves readers better than evangelism. Django is a strong default for content-heavy SaaS, EdTech, marketplaces, and admin-heavy back offices where the ORM, the admin, and the auth system do real work for you. It is a weaker fit for:

  • Real-time, high-concurrency WebSocket apps. Django Channels exists, but a Node or Go service is often a cleaner fit.
  • Lean microservices where Django's batteries-included weight is dead load. FastAPI or Flask travels lighter.
  • JavaScript-first teams who would rather not context-switch languages.

Picking the framework that matches the workload is most of the architecture decision.

Closing

None of these patterns are exotic. They are the boring, repeatable habits that separate a Django app that demos well from one that runs payroll, processes payments, and passes an audit. Adopt them early and most "Django doesn't scale" complaints never materialize.

If your team needs Django and DRF engineers who already build with these patterns (multi-tenancy, Celery, Stripe, audit-ready compliance work), my team at Empiric Infotech does this full time. You can hire Django developers on an hourly or dedicated monthly basis with US time-zone overlap.

What patterns have saved you in production Django? Drop them in the comments. I am always collecting more.
