Connection Pool Sizing in 2026: The Formula and the Footguns

#postgres #database #backend #performance

Book: Database Playbook: Choosing the Right Store for Every System You Build
Also by me: Thinking in Go (2-book series) — Complete Guide to Go Programming + Hexagonal Architecture in Go
My project: Hermes IDE | GitHub — an IDE for developers who ship with Claude Code and other AI coding tools
Me: xgabriel.com | GitHub

The database is at 30% CPU. The app is timing out. Someone says "the pool is too small, bump it." You go from 50 connections to 200. Latency gets worse. Now the database is at 30% CPU and throughput dropped.

This is the most common database-tuning mistake in production, and it survives because the fix feels backwards. Slow under load looks like "not enough connections." It's usually the opposite. The HikariCP team wrote the canonical version of this finding years ago, and it still surprises people: the smaller pool was faster. The page cites an Oracle Real-World Performance demo where response time dropped from roughly 100ms to about 2ms as the connection count fell, and a PostgreSQL benchmark where throughput flattens out around 50 connections and climbing past that buys nothing.

The reason is physics, not config. Here's the formula, why bigger hurts, and what serverless does to all of it.

Why a bigger pool is slower

A Postgres backend is a process. When it's running a query, it wants a CPU core. You have a fixed number of cores. If you have 8 cores and 200 active connections all trying to run queries, the OS time-slices them. Each query gets a sliver of CPU, then gets parked while the scheduler runs the next one. Context switches pile up. Cache lines get evicted. Disk and lock contention climb because more transactions are open at once.

The work doesn't go faster because you asked for it harder. The hardware does the same amount of work either way. A large pool just adds queueing inside the database instead of in your app, where you can't see it and can't control it.

Think of it like a grocery store. Eight checkout lanes, eight cashiers. Opening 40 lanes doesn't help if you still have eight cashiers. You just get 40 half-staffed lanes and confused customers. The right number of lanes is close to the number of cashiers, with a small buffer.

The cores-based formula

The PostgreSQL community formula, the same one HikariCP cites, is:

connections = (core_count * 2) + effective_spindle_count

core_count is real cores, not counting hyperthreads. effective_spindle_count is roughly the number of disks that can serve concurrent I/O. On SSD or cloud block storage, treat it as a small constant rather than a literal spindle count, because a query waiting on I/O frees its core for another query. That's why the multiplier is 2 and not 1: while some connections wait on disk, others can use the CPU.

A worked example. You're on a db.r6g.2xlarge with 8 vCPUs and gp3 storage:

core_count            = 8   (vCPUs; see note below)
effective_spindle_count ~ 4 (SSD, allow some I/O overlap)
connections = (8 * 2) + 4 = 20

Twenty. Not two hundred. For most OLTP workloads, a pool in the low tens per database is correct, and the instinct to set it in the hundreds is the bug.

One honest caveat on cloud: a vCPU is usually a hyperthread, not a full physical core. Some managed instances expose half as many physical cores as vCPUs. If you want to be conservative, size against physical cores. The formula gives you a starting point, not a final answer. You confirm it with a load test.

This is a ceiling per database, not per app

The formula sizes the connections that actually reach Postgres. If you run 10 app instances and each opens its own pool of 20, you've authorized 200 backend connections, and you're back where you started.

So the real budget is:

total_backends = pool_size_per_instance * instance_count

This has to stay under max_connections, with headroom for migrations, admin sessions, and replication. If max_connections = 100 and you autoscale to 10 instances, a per-instance pool of 20 blows the ceiling and you get:

FATAL: sorry, too many clients already

Two ways out. Shrink the per-instance pool so the product fits. Or put a pooler in front so the app's pool count and Postgres's backend count stop being the same number. That second option is what the rest of this post is about.

The serverless connection storm

Lambda, Cloud Run, Fly Machines: each concurrent request can spin a fresh worker, and each worker wants a connection. There's no long-lived process holding a tidy pool. A traffic spike to 4,000 concurrent invocations becomes 4,000 connection attempts hitting Postgres at once.

Postgres handles this badly. Every connection costs memory and a backend process before it runs a single query. The handshake (TCP, TLS, SCRAM auth) is not free, and 4,000 of them arriving together is a thundering herd. You hit max_connections, requests start failing, the platform retries, and the retries make the storm worse.

A pooler is the standard fix. It holds a small set of real backends and multiplexes many short-lived clients across them. The serverless function connects to the pooler, which is cheap, and the pooler talks to Postgres with a stable, sized pool. AWS positions RDS Proxy as managing connection bursts that would otherwise overwhelm the database's connection limits, pooling and sharing backends so a spike doesn't translate into a spike of new Postgres connections.

How PgBouncer's modes change the math

PgBouncer is the common pooler, and its pool_mode decides how aggressively a backend gets reused. The mode changes what number you should pick.

Session mode. A client holds a backend for its whole session. There's no multiplexing during the session, so your effective backend count is close to your client count. The density win is small. Size it like a direct connection.

Transaction mode. A backend is borrowed for one transaction and returned on COMMIT or ROLLBACK. This is where density comes from: a pool of 20 backends can serve thousands of clients if their transactions are short. The catch is that anything outside a transaction (server-side prepared statements, LISTEN/NOTIFY, SET without SET LOCAL, session advisory locks) breaks when the backend rotates.

A transaction-mode config that respects the formula:

[databases]
orders = host=db.internal port=5432 dbname=orders

[pgbouncer]
listen_port = 6432
auth_type = scram-sha-256

pool_mode = transaction
default_pool_size = 20
reserve_pool_size = 5
reserve_pool_timeout = 3
max_client_conn = 5000

The trap is default_pool_size. It's per (database, user) pair, not a global cap. Four users on two databases at default_pool_size = 20 authorizes 160 backend connections, not 20. Do the multiplication every time.

max_client_conn is the front door: how many app or serverless clients PgBouncer accepts. It's fine for this to be large (5,000+) because clients are cheap on the PgBouncer side. default_pool_size is the back door to Postgres, and that's the one the cores formula governs.

The shape to internalize: in transaction mode you set max_client_conn high and default_pool_size low. The pooler absorbs the storm at the front and feeds Postgres a small, formula-sized stream at the back.

A sizing worksheet

Run this top to bottom. It takes about ten minutes and saves the bad bump.

1. Real cores
   - Look up the instance's physical cores.
   - If only vCPUs are listed, halve for a safe estimate.
   - core_count = ___

2. Baseline pool
   - connections = (core_count * 2) + 4
   - baseline = ___

3. Fan-out
   - How many app instances or workers connect?
   - instance_count = ___

4. Total backend demand
   - If NO pooler: baseline * instance_count = ___
     -> must stay under max_connections minus headroom.
   - If pooler in transaction mode:
     default_pool_size = baseline (per db,user pair)
     total = baseline * (db_count * user_count) = ___

5. Headroom
   - Reserve ~10-15 connections for migrations,
     admin, monitoring, replication.
   - usable max_connections = max_connections - headroom

6. Check
   - total backend demand <= usable max_connections?
   - If no: shrink the pool, or add a pooler, or both.

7. Load test
   - Run pgbench / k6 at real concurrency.
   - Watch p99 latency AND throughput as you raise
     the pool by 5 at a time.
   - Stop at the point where throughput stops
     rising. That point is usually near the formula,
     not far above it.

Step 7 is the one people skip, and it's the one that turns a guess into a number. The formula gives you a starting pool. The load test tells you where the throughput curve flattens. They're usually close, and when they aren't, trust the measurement.

The footguns, collected

Quick list of what bites teams after they think they're done.

Sizing per app, ignoring fan-out. The pool is fine on one box and lethal across twenty. Always multiply.
No headroom. You sized to exactly max_connections, then a migration or a monitoring agent needs a connection and can't get one. Leave a margin.
Idle-in-transaction connections. A connection that opens a transaction and sits there holds a backend hostage and blocks VACUUM. Set idle_in_transaction_session_timeout on Postgres so these get killed.
Long pool, short timeout. A big pool with an aggressive client-side acquire timeout means requests queue, time out, retry, and pile more load on. Smaller pool, patient queue, usually wins.
Counting vCPUs as cores. Doubles your formula output on hyperthreaded instances. Size against physical cores when you can.
Forgetting the pooler is also a process. It has its own limits and its own failure mode. Monitor SHOW POOLS on PgBouncer; don't treat the pooler as infinite.

The one-line version

Connections are not throughput. Past the point where every core is busy, each extra connection adds queueing, context switches, and lock contention, and the database does less work, not more. Start at (cores * 2) + spindles, multiply by your fan-out, keep it under max_connections with headroom, and put a pooler in front the moment you go serverless. Then load test, because the formula is a starting line, not a finish line.

If this was useful

This post pulls from the connection-management chapter of the Database Playbook: Choosing the Right Store for Every System You Build. The book covers pool sizing alongside the bigger questions: when to add a read replica, when partitioning beats sharding, and which managed Postgres provider fits which workload. The connection chapter goes deeper on pooler modes, pinning, and the serverless storm than this post had room for.