PostgreSQL Performance Tuning for High-Traffic React Applications

#postgres #database #performance

PostgreSQL Performance Tuning for High-Traffic React Applications

I've watched CitizenApp's database go from snappy to sluggish in the span of a few weeks. No schema changes. Same queries. The culprit? Growth. And here's the uncomfortable truth: most developers don't touch database performance until the fire alarm goes off. I prefer to light small fires in staging first.

PostgreSQL is deceptively forgiving. It'll run bad queries at 100 users. It'll also run them at 10,000 users—just slower. The difference between optimization before you need it and optimization in production at 2 AM is usually the difference between a blog post and a PagerDuty incident.

Understanding Query Performance With EXPLAIN ANALYZE

The single most valuable skill I've developed is reading EXPLAIN ANALYZE output. Not EXPLAIN—ANALYZE. The difference: EXPLAIN shows the plan; ANALYZE actually runs it and shows you reality.

I burned three hours chasing a query I was sure was optimized. Turned out the planner was choosing a sequential scan instead of an index. Here's what that looked like:

-- BAD: This looked efficient. It was not.
EXPLAIN ANALYZE
SELECT u.id, u.email, COUNT(a.id) as activity_count
FROM users u
LEFT JOIN activities a ON u.id = a.user_id
WHERE u.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.id, u.email
ORDER BY activity_count DESC;

-- Output snippet:
-- Sort  (cost=1024.50..1024.75 rows=100 width=40)
--   Sort Key: (count(a.id)) DESC
--   ->  HashAggregate  (cost=850.00..900.00 rows=100 width=40)
--         Group Key: u.id, u.email
--         ->  Hash Left Join  (cost=500.00..800.00 rows=1000 width=40)
--               Hash Cond: (a.user_id = u.id)
--               ->  Seq Scan on activities a  (cost=0.00..400.00 rows=50000 width=8)
--               ->  Hash  (cost=200.00..200.00 rows=100 width=40)
--                     ->  Seq Scan on users u  (cost=0.00..200.00 rows=100 width=40)
--                           Filter: (created_at > (now() - '30 days'::interval))

Notice the sequential scan on activities. With 50k rows, that's expensive. The fix:

-- GOOD: Index on the join column
CREATE INDEX CONCURRENTLY idx_activities_user_id ON activities(user_id);

-- Even better: Index on the filter + join
CREATE INDEX CONCURRENTLY idx_activities_user_id_created 
ON activities(user_id, created_at DESC)
WHERE deleted_at IS NULL;

Run EXPLAIN ANALYZE again. Now you'll likely see an Index Scan instead of a Seq Scan. The actual query time dropped from 340ms to 12ms in my case.

Why this matters: Every 100ms saved on a query used 50 times per second is 5 seconds of processing power freed up. That's not nothing.

Index Strategy: Beyond The Obvious

Most developers create indexes on primary keys and foreign keys, then stop. I prefer to be intentional about every index because they have a cost—write performance degrades.

Here's my decision tree:

Columns in WHERE clauses on high-traffic queries → Index them
Columns in JOIN conditions → Index them
Columns in ORDER BY → Index them (if they're selective)
Multiple columns together → Consider composite indexes

For CitizenApp, our tenants table gets filtered heavily:

-- Schema
CREATE TABLE organizations (
  id UUID PRIMARY KEY,
  user_id UUID NOT NULL,
  status VARCHAR(20) NOT NULL DEFAULT 'active',
  created_at TIMESTAMP NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);

-- This query runs 100+ times per day
SELECT * FROM organizations 
WHERE user_id = $1 AND status = 'active'
ORDER BY created_at DESC;

-- Single best index for this query
CREATE INDEX CONCURRENTLY idx_org_user_status 
ON organizations(user_id, status, created_at DESC);

The index order matters. Put the most selective columns first (usually the exact-match columns), then the range/sort columns.

Connection Pooling: Don't Let This Sneak Up On You

This burned me badly. FastAPI connections to PostgreSQL were leaking. Not dramatically—just a handful lingering after each request. After a few hours under load, we hit connection limits and everything froze.

The fix requires discipline at the application level. Here's how I do it in FastAPI with SQLAlchemy:

from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool
import os

# Configure the engine with explicit pooling
DATABASE_URL = os.getenv("DATABASE_URL")

engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=20,  # Number of connections to keep in the pool
    max_overflow=10,  # Additional connections allowed when needed
    pool_recycle=3600,  # Recycle connections every hour (avoid stale connections)
    pool_pre_ping=True,  # Test connection before using it
    echo_pool=True,  # Log pool events (disable in production)
)

# Create a session factory
from sqlalchemy.orm import sessionmaker
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

# Use it in FastAPI
from fastapi import Depends

def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()  # Critical: always close sessions

@app.get("/api/users/{user_id}")
async def get_user(user_id: str, db: Session = Depends(get_db)):
    return db.query(User).filter(User.id == user_id).first()

The pool_pre_ping=True is crucial. It validates the connection is alive before handing it to your code. Without it, stale connections can cause mysterious failures.

Why this matters: PostgreSQL has a hard limit on connections (usually 100 by default, configurable up to ~8000 depending on your machine). Connection leaks silently degrade performance until—boom—everything stops.

Identifying Bottlenecks Before Production

I use this query constantly:

-- Find slowest queries
SELECT 
  query,
  calls,
  total_time,
  mean_time,
  max_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;

This requires pg_stat_statements extension:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

Run this in staging after load testing. The slowest queries are your optimization targets.

Gotcha: N+1 Queries

Here's what I missed early on: SQLAlchemy eager loading isn't automatic. This query runs 100 times:

# BAD: N+1 query problem
organizations = db.query(Organization).filter(
    Organization.user_id == user_id
).all()

for org in organizations:
    print(org.workspace.name)  # Runs a SELECT for EACH org

# GOOD: Eager load the relationship
from sqlalchemy.orm import joinedload

organizations = db.query(Organization).options(
    joinedload(Organization.workspace)
).filter(
    Organization.user_id == user_id
).all()

The second version runs 1 query instead of 101. Massive difference under load.

The Real Answer

Performance tuning isn't about cargo-culting best practices. It's about measuring where the time actually goes. Use EXPLAIN ANALYZE. Profile your queries. Monitor connection pools. Don't guess.

PostgreSQL scales beautifully when you respect it. CitizenApp handles 50k requests daily now with 99.5% query times under 50ms. Not because I'm a genius—because I learned to profile early and often.