When you hit connection timeouts under high load, the instinct is to bump max_connections to 100 or 200. Don't. A higher limit brings the server down faster: CPU pegs at 100%, IOPS max out, and everything freezes.
Here's why
Think of a grocery store with 4 checkout lanes (= 4 CPU cores). If customers queue up properly, throughput is great. But if 200 people rush all 4 lanes at once, cashiers spend more time context-switching between customers than actually checking anyone out. Nobody finishes faster.
Two things kill you at the OS level:
Context switching — each CPU core runs one thread at a time. With 200 open connections on a 4-core machine, the OS constantly saves and restores per-connection state (registers, cache) to keep everyone "running." At extreme connection counts, a large share of CPU time goes to switching rather than actual work.
Disk contention — databases love sequential reads. Hundreds of concurrent connections break that access pattern into random I/O, which hurts even on SSDs: the disk queue backs up and the buffer cache loses its effectiveness.
The right number is smaller than you think
max_connections = (cores × 2) + effective_spindle_count
For an 8-core server with SSD: 8 × 2 + 1 = 17
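The formula above is easy to wrap in a small helper. This is a minimal sketch (the function name and signature are mine, not from any library); an SSD is counted as one effective spindle:

```python
def max_connections(cores: int, spindles: int = 1) -> int:
    """Starting-point sizing: (cores * 2) + effective spindle count.
    Treat a single SSD as one effective spindle."""
    return cores * 2 + spindles

print(max_connections(8))      # 8-core server with one SSD -> 17
print(max_connections(4, 2))   # 4-core server, 2-disk array -> 10
```

Treat the result as a starting point to benchmark from, not a hard law — workload shape (CPU-bound vs I/O-bound queries) shifts the optimum.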
Modern stacks handle this differently
In microservices and serverless, you put a smart proxy (PgBouncer, RDS Proxy) on the database side. It maintains a small pool of real DB connections and multiplexes thousands of app connections onto them. Your app-side pool then only exists to reuse TCP connections — the heavy lifting moves to the proxy.
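As a concrete illustration, here is a hedged PgBouncer config sketch for that setup. The settings (`pool_mode`, `default_pool_size`, `max_client_conn`) are real PgBouncer options, but the host, database name, and numbers are illustrative assumptions, not a tuned recommendation:

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction      ; multiplex many clients onto few server conns
default_pool_size = 17       ; real Postgres connections per db/user pair
max_client_conn = 2000       ; app connections PgBouncer will accept
```

With transaction pooling, a server connection is borrowed only for the duration of a transaction, which is what lets ~2000 app connections share ~17 real ones.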
To sum up: with connection pools, bigger isn't better. It's about matching concurrency to what your hardware can actually parallelize.
Curious how others handle this in production:
Do you cap connections aggressively, or rely on proxies like PgBouncer / RDS Proxy?
