Zestminds Technologies

Posted on • Originally published at zestminds.com

FastAPI Under Load: 5 Production Issues Most Teams Discover Too Late

FastAPI is fast. Clean. Productive.

For MVPs, it’s excellent.

But once traffic increases, the bottlenecks start appearing, and most of them are architectural, not framework-related.

Here are 5 real production issues we’ve seen when FastAPI services start handling real concurrency.

1. Event Loop Blocking (Async Done Wrong)

Just because your endpoint is `async def` doesn’t mean your system is non-blocking.

Common mistakes:

  • CPU-heavy operations inside request handlers
  • Sync DB calls inside async endpoints
  • Large JSON serialization
  • Data processing (Pandas, ML inference)
  • Blocking third-party SDKs

Under light traffic → everything looks fine.
Under concurrency → latency increases across all endpoints.

Why?

Because each worker runs a single event loop. When one call blocks it, every coroutine scheduled on that loop stalls, not just the offending request.

What to do instead

  • Offload CPU-bound work to worker processes
  • Use async-native database drivers
  • Push heavy processing to a task queue
  • Test under realistic concurrency (Locust / k6)

Async is a tool, not magic.
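The offloading fixes above can be sketched with plain asyncio, no FastAPI app required. Here `cpu_heavy` is a hypothetical stand-in for any blocking work (hashing, Pandas, model inference):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n: int) -> int:
    # Hypothetical stand-in for CPU-bound work
    return sum(i * i for i in range(n))

async def handler_blocking(n: int) -> int:
    # BAD: runs on the event loop thread, so every other coroutine waits
    return cpu_heavy(n)

async def handler_threaded(n: int) -> int:
    # OK for blocking I/O or light CPU work: hand it to a thread
    return await asyncio.to_thread(cpu_heavy, n)

async def handler_process(n: int) -> int:
    # Better for heavy CPU work: a separate process sidesteps the GIL
    # (in practice you would keep one shared pool, not one per request)
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor(max_workers=1) as pool:
        return await loop.run_in_executor(pool, cpu_heavy, n)
```

The same pattern applies inside FastAPI endpoints: anything that would sit in `handler_blocking` belongs in a thread, a process pool, or a task queue.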


2. Database Connection Pool Exhaustion

Default pool configurations are rarely production-ready.

Symptoms under load:

  • Requests hang
  • Timeout errors
  • Increased p95 latency
  • DB CPU spikes

The application appears “up” but becomes progressively slower.

Fix

  • Explicitly configure pool size
  • Monitor active vs idle connections
  • Avoid long-running transactions
  • Consider read replicas for heavy reads

Connection pools are capacity limits. Treat them like infrastructure planning, not defaults.
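As a back-of-envelope capacity check (the numbers below are illustrative, not recommendations): each worker process holds its own pool, so total demand is workers × (pool_size + max_overflow), and that must fit under the database's connection ceiling.

```python
# Illustrative capacity check. SQLAlchemy's QueuePool defaults are
# pool_size=5, max_overflow=10 -- per worker process.
def max_app_connections(workers: int, pool_size: int, max_overflow: int) -> int:
    return workers * (pool_size + max_overflow)

db_max_connections = 100  # e.g. the Postgres default
headroom = 10             # reserved for migrations, admin sessions

demand = max_app_connections(workers=4, pool_size=10, max_overflow=5)
fits = demand <= db_max_connections - headroom  # 60 <= 90
```

If the check fails, either shrink per-worker pools or put a server-side pooler such as PgBouncer in front of the database.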


3. BackgroundTasks ≠ Distributed Queue

FastAPI’s `BackgroundTasks` works for small, quick tasks.

It does not scale well for:

  • Bulk email sending
  • File processing
  • Report generation
  • Long-running workflows

Under load, background tasks run inside the same worker process and compete with incoming requests for CPU and the event loop.

This reduces throughput.

Proper solution

Use a real queue:

  • Celery
  • RQ
  • Dramatiq
  • Redis / RabbitMQ backed workers

Separate request handling from asynchronous workload processing.
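The separation can be illustrated with nothing but the standard library. Here `queue.Queue` stands in for Redis/RabbitMQ and the thread stands in for a Celery/RQ worker; in production the worker is a separate process, often a separate machine:

```python
import queue
import threading

# Stand-in for a real broker (Redis / RabbitMQ)
jobs: queue.Queue = queue.Queue()
results = []

def worker() -> None:
    # In production this loop runs in a dedicated worker process,
    # so it never competes with request handling.
    while True:
        job = jobs.get()
        if job is None:          # shutdown sentinel
            jobs.task_done()
            break
        results.append(f"report generated for {job['email']}")
        jobs.task_done()

def handle_request(email: str) -> dict:
    # The endpoint only enqueues and returns immediately.
    jobs.put({"email": email})
    return {"status": "queued"}

threading.Thread(target=worker, daemon=True).start()
status = handle_request("user@example.com")
jobs.put(None)
jobs.join()
```

The request path stays fast because the expensive work happens elsewhere; swapping the stdlib queue for a broker adds durability and lets workers scale independently.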


4. Uvicorn Defaults in Production

Many deployments run something like:

```shell
uvicorn main:app
```

Single worker. Default config.

Under traffic:

  • CPU saturates
  • Requests queue
  • Latency spikes

Production approach

Use Gunicorn with Uvicorn workers:

```shell
gunicorn -k uvicorn.workers.UvicornWorker -w 4 main:app
```

Tune workers based on CPU cores and workload type.
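A minimal `gunicorn.conf.py` sketch (Gunicorn config files are plain Python). The worker formula is a common starting heuristic, not a rule; treat it as a seed value and tune from measurements:

```python
# gunicorn.conf.py -- an illustrative starting point, not a universal recipe
import multiprocessing

worker_class = "uvicorn.workers.UvicornWorker"

# Common starting heuristic; tune from p95/p99 under realistic load.
workers = multiprocessing.cpu_count() * 2 + 1

timeout = 30               # restart workers stuck on a single request
graceful_timeout = 30      # time allowed for in-flight requests on reload
max_requests = 1000        # recycle workers periodically to bound memory growth
max_requests_jitter = 100  # stagger recycling so workers don't restart at once
```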

Measure:

  • p95 latency
  • p99 latency
  • Request throughput
  • Worker restarts

Production tuning is not optional.


5. Memory Growth Under Concurrency

This one is subtle.

Under concurrency, several patterns quietly grow memory:

  • Large response objects accumulate before serialization
  • Inefficient dependency injection patterns hold state beyond the request
  • In-memory caches are misused as unbounded stores
  • Objects are not released quickly enough

Symptoms:

  • Gradual memory increase
  • Higher GC pressure
  • Container restarts

Mitigation

  • Profile memory usage
  • Stream large responses
  • Keep request-scoped dependencies clean
  • Monitor container memory continuously

Scaling amplifies small inefficiencies.
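Streaming is the most direct of these fixes. For example, a generator can yield a large CSV body chunk by chunk instead of materializing it, so peak memory stays flat regardless of row count. With FastAPI the generator would be wrapped in a `StreamingResponse`; this sketch keeps to the standard library:

```python
import csv
import io

def iter_csv_rows(rows):
    # Yield the body chunk by chunk instead of building it in memory.
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in rows:
        writer.writerow(row)
        yield buf.getvalue()
        buf.seek(0)
        buf.truncate(0)

# In a FastAPI endpoint this generator would be returned as:
#   return StreamingResponse(iter_csv_rows(rows), media_type="text/csv")
chunks = list(iter_csv_rows([("id", "name"), (1, "a")]))
```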


The Core Insight

FastAPI is not the scalability layer.

It’s a framework.

Scalability comes from:

  • Architecture decisions
  • Load testing
  • Capacity planning
  • Observability
  • Separation of concerns

Most “FastAPI performance issues” are system design issues.


Before You Scale a FastAPI SaaS

Validate:

  • Async correctness
  • DB pool configuration
  • Worker strategy
  • Background processing separation
  • Load testing under realistic traffic
  • p95 / p99 latency tracking

Production problems don’t show up in development.

They show up when marketing works.


If you're building SaaS systems with FastAPI, we documented deeper production lessons and architectural breakdowns on our engineering blog.

Curious to hear what others have seen under load: what surprised you most?
