Rahim Ranxx
Escaping the Sync Trap: How I Slashed Latency by 10x in a Django-Rust API Gateway

How I diagnosed and eliminated synchronous bottlenecks in a Django-Rust API gateway, migrating to ASGI and pre-warming caches for millisecond responses.

When building a high-performance backend, the standard playbook is well-known: offload heavy computational tasks to faster microservices (like Rust) and implement an aggressive caching strategy.

Recently, I did exactly that. My architecture is built around a Django REST Framework gateway sitting behind Caddy, heavily monitored with Prometheus and Grafana.

But despite the raw speed of Rust and my caching layers, my dashboards were flashing red. Latency was spiking to brutal 10-second flatlines for my most critical endpoints. Worse, my observability itself started failing, creating silent blind spots exactly when I needed data the most.

Here is the detective story of how I used telemetry to hunt down synchronous traps, migrate to a non-blocking async architecture, and implement proactive pre-warming to bring response times down to the millisecond range — all while reclaiming 30% of my idle CPU.


The Architecture: A Gateway and its Heavy Lifters

Before diving into the problem, here is a quick look at my setup.

          ┌──────────────────────────────┐
          │          Nextcloud           │
          │ (Authenticated Client Calls) │
          └──────────────┬───────────────┘
                         │  JWT / API Key
                         ▼
               ┌────────────────────┐
               │  Django Gateway    │
               │ (ASGI, DRF, Caddy) │
               └──────┬─────────────┘
                      │
     ┌────────────────┴────────────────┐
     │                                 │
     ▼                                 ▼
┌───────────────┐              ┌────────────────┐
│ NDVI Service  │              │ Weather Service│
│ (Rust, 8081)  │              │ (Rust, 8090)   │
│  → Postgres   │              │  → MySQL       │
└───────────────┘              └────────────────┘

       ▲
       │ Prometheus & Grafana
       │ (Observability Stack)
       ▼
   System Telemetry + Metrics


Django stays as the public API gateway, accepting requests authenticated via JWT or API keys.
It enforces a shared JSON response envelope containing status, message, data, and errors to keep all client interactions standardized.
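A minimal sketch of such an envelope helper (the function name and fields beyond the four listed are my own illustration, not the project's actual code):

```python
def make_envelope(data=None, message="", errors=None, status="success"):
    """Build the shared response envelope returned by every endpoint."""
    return {
        "status": status,
        "message": message,
        "data": data,
        "errors": errors,
    }

# A successful and a failed response share the same shape:
ok = make_envelope(data={"ndvi": 0.72}, message="OK")
err = make_envelope(status="error", errors=["upstream timeout"])
```

Clients can then branch on `status` alone, without sniffing the payload shape.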

Specific traffic routes — namely /api/v1/ndvi and /api/v1/weather/* — are forwarded directly to my Rust backends:

  • 🛰 NDVI microservice ingests satellite data into a dedicated Postgres database.
  • 🌦 Weather microservice relies on a MySQL database and communicates with external providers like Open-Meteo and NASA POWER.

My Nextcloud instance acts like any other client, presenting either an Authorization: Bearer token or an X-API-Key. Django manages this traffic using a specific nextcloud_hmac throttle configuration before passing the authorized call down to Rust with the original headers intact.
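In DRF terms, a named scope like `nextcloud_hmac` is typically wired up through `ScopedRateThrottle`; a hedged sketch of that settings fragment (the rate itself is illustrative, not the project's real limit):

```python
# settings.py (sketch; the rate is an assumption)
REST_FRAMEWORK = {
    "DEFAULT_THROTTLE_CLASSES": [
        "rest_framework.throttling.ScopedRateThrottle",
    ],
    "DEFAULT_THROTTLE_RATES": {
        # Applied to any view that declares throttle_scope = "nextcloud_hmac"
        "nextcloud_hmac": "120/min",
    },
}
```

Any proxy view that sets `throttle_scope = "nextcloud_hmac"` then picks up this rate automatically.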


The False Cure: The Worker Starvation Anomaly

To protect the system, I implemented aggressive TTL caching (e.g., 1 hour for schema data, 5 minutes for API tokens). However, once I added traffic, my Grafana dashboard revealed a chaotic reality.

I saw brutal, perfectly flat 8–10 second latency spikes on key endpoints. Crucially, in lockstep with those spikes, my internal /metrics request rate dropped to zero — the gateway was too busy to answer its own Prometheus scrapes.


The Diagnosis: The Gateway Caching Itself to Death

The telemetry told a story of hidden synchronous bottlenecks:

  1. The Trigger: When a high-traffic endpoint like farm-weather-current/GET experienced a cache miss, the Django gateway had to fetch fresh data.
  2. The Trap: My Django deployment was running standard synchronous workers. It called the Rust service, which then called the external weather API (taking 2.2+ seconds).
  3. The Impact (Worker Starvation): Because the Django worker was synchronous, it blocked entirely for those 2.2 seconds. All incoming traffic got stuck in a queue.

The Trap: Synchronous Gateway Routing

import requests
from django.http import JsonResponse

def sync_weather_proxy_view(request):
    # The worker blocks here for the full duration of the upstream call
    # (2.2+ seconds on a cache miss), starving every other request.
    resp = requests.get(
        "http://weather-service:8090/api/v1/weather-current", timeout=5.0
    )
    resp.raise_for_status()
    return JsonResponse({"status": "success", "data": resp.json(), "errors": None})

I had successfully offloaded work to Rust — but my synchronous Django workers completely nullified the speed gains.


The First Fix: Embracing the Non-Blocking Gateway

I needed to decouple the speed of the gateway from the speed of the external API calls it was routing.
I migrated the Django deployment from synchronous workers to an ASGI (Asynchronous Server Gateway Interface) setup, allowing my gateway to handle requests asynchronously.
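Concretely, that means serving the project through its asgi.py entrypoint with an ASGI server; a minimal sketch (the project name and choice of server are my assumptions, not confirmed details of this deployment):

```python
# myproject/asgi.py -- standard Django ASGI entrypoint
import os

from django.core.asgi import get_asgi_application

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")

application = get_asgi_application()

# Served with, e.g., uvicorn workers under gunicorn:
#   gunicorn myproject.asgi:application -k uvicorn.workers.UvicornWorker
```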

I rewrote my proxy views to use asynchronous HTTP clients like httpx:


The Fix: Asynchronous Non-Blocking Routing

import httpx
from django.http import JsonResponse

async def async_weather_proxy_view(request):
    # The event loop is freed: Django can serve other requests while waiting
    async with httpx.AsyncClient(timeout=5.0) as client:
        rust_response = await client.get(
            "http://weather-service:8090/api/v1/weather-current"
        )
        rust_response.raise_for_status()

    return JsonResponse({
        "status": "success",
        "data": rust_response.json(),
        "errors": None,
    })

The visual evidence on my dashboards was a massive, instant victory:

  • Observability Restored: The metrics scrape line remained unbroken. Django could finally pause a slow weather request, instantly answer the Prometheus scrape, and resume without blocking.
  • Instant Internal Routing: In my initial setup, a simple internal metrics scrape took ~84ms. After the ASGI migration, that duration dropped to 11ms.

The Second Fix: Proactive Caching

While the infrastructure was now bulletproof, the end-user experience was still occasionally sluggish.

With an “on-demand” caching strategy, the very first user to request the weather after a 1-hour cache expiration had to pay the Cache Miss Penalty (waiting ~2.2 seconds for the external API).

To eliminate this, I decoupled the data-fetching time from the user-request cycle entirely.
I implemented a Proactive Background Pre-warming pattern using a background task (like Celery) that runs every 55 minutes, independently fetching slow data and silently overwriting the cache before it expires.



The Cache Pre-Warmer (Celery Example)

from celery import shared_task
from django.core.cache import cache
import requests

@shared_task
def pre_warm_weather_cache():
    # Runs in the background every 55 minutes, shielding users from the 2.2 s wait
    response = requests.get(
        "http://weather-service:8090/api/v1/weather-current", timeout=10.0
    )
    if response.status_code == 200:
        cache.set("weather_current_data", response.json(), timeout=3600)
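The 55-minute cadence itself lives in the Celery beat schedule; a hedged sketch of that config (the task path and entry name are illustrative):

```python
# settings.py (sketch; task path is an assumption)
CELERY_BEAT_SCHEDULE = {
    "pre-warm-weather-cache": {
        "task": "myapp.tasks.pre_warm_weather_cache",
        # Re-fetch 5 minutes before the 1-hour cache TTL expires
        "schedule": 55 * 60,  # seconds
    },
}
```

Keeping the schedule shorter than the TTL is the whole trick: the cache entry is overwritten before any user can observe it expiring.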

The result? The average latency for critical weather endpoints plummeted from seconds to milliseconds as the hot cache permanently took over.


The Grand Slam: Optimizing the Rust Build Pipeline

The final victory of this new architecture came from pure server efficiency.
During deployments, compiling Rust crates (sqlx, syn) from scratch was pegging my 4-core server at 100% CPU, artificially causing timeouts.

To fix this, I implemented cargo-chef in a multi-stage Dockerfile to strictly cache Rust dependencies.


Multi-stage Dockerfile for the Rust Microservice

FROM rust:1.88-slim AS chef
USER root
RUN cargo install cargo-chef
WORKDIR /app

FROM chef AS planner
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
# Docker caches this heavy dependency build!
RUN cargo chef cook --release --recipe-path recipe.json
COPY . .
RUN cargo build --release --bin weather-service

FROM debian:bookworm-slim AS runtime
WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends libssl-dev ca-certificates \
    && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/weather-service /usr/local/bin/
EXPOSE 8090
ENTRYPOINT ["/usr/local/bin/weather-service"]

Between ASGI, background caching, and Docker layer caching, my total node CPU now rests comfortably between 11% and 13%.
I fundamentally reclaimed 30% of my total server compute capacity.


Conclusion: Finding the Next Bottleneck

Building high-performance API gateways is an ongoing journey of shifting bottlenecks.

By relying strictly on my telemetry, I proved that synchronous workers nullify microservice speed, validated the immense power of ASGI, and eliminated cache miss penalties.

With the gateway running unburdened, my dashboards have revealed one final bottleneck — a 6-to-8 second delay on my token generation endpoint.
Because my CPU is mostly idle, I know exactly what this is: a database connection pool limitation in the Rust service.

And thanks to my new observability baseline, I know exactly where to strike next.

