Sharafath Ali VK

Posted on Jun 3

PostgreSQL Connection Pooling Explained: How It Works and Why It Matters

#postgres #database #sql #webdev

Opening a new database connection for every query is one of the most expensive things your backend can do. Connection pooling is how you stop doing that.

Why connections are expensive
What exactly happens when you connect
The problem at scale
What connection pooling actually is
Pool modes — the most important decision
PgBouncer — the standard solution
PgBouncer setup and configuration
Application-level pooling vs PgBouncer
How to size your pool correctly
What happens when the pool is exhausted
Monitoring your pool
Common mistakes
The mental model

Why connections are expensive

Most developers think of a database connection as roughly equivalent to opening a file — a cheap OS-level handle. It is not.

A PostgreSQL client connection is typically served by a dedicated backend OS process. When you connect to Postgres, the server forks a new backend process specifically for your connection. That process:

Allocates its own memory segment (around 5–10 MB per connection by default)
Loads shared catalog caches into local memory
Establishes a TCP socket
Negotiates the protocol handshake
Authenticates the connecting user
Sets up session state (search path, timezone, encoding) This takes somewhere between 20 and 100 milliseconds depending on your hardware, network, and authentication method. On a fast local network with md5 auth, you might get 20ms. With SSL and scram-sha-256 over a real network, it is closer to 80–100ms.

Now consider a typical web request that needs 3 database queries. If you open a fresh connection for each request:

Request arrives
  → Connect to Postgres (80ms)
  → Query 1 (2ms)
  → Query 2 (3ms)
  → Query 3 (1ms)
  → Disconnect
Total: 86ms

Without connection overhead:
  → Query 1 (2ms)
  → Query 2 (3ms)
  → Query 3 (1ms)
Total: 6ms

The connection setup is 93% of the request time. The actual database work is 6ms. You are paying a massive tax for something that is entirely avoidable.

What exactly happens when you connect

Understanding the full connection lifecycle makes it clear why this is expensive and why pooling is not just an optimization but a necessity at any real scale.

1. TCP handshake

The client initiates a TCP connection to Postgres (default port 5432). Three-way handshake: SYN → SYN-ACK → ACK. On localhost this is sub-millisecond. Over a network, it depends on latency.

2. SSL negotiation (if enabled)

If SSL is configured — and it should be in production — there is a TLS handshake on top of the TCP connection. This involves certificate exchange, cipher negotiation, and key derivation. Typically adds 10–30ms on a real network.

3. Postgres startup message

The client sends a startup packet containing the protocol version, database name, and username. Postgres reads this and decides how to proceed.

4. Authentication

Postgres runs the configured authentication method:

trust — no authentication (development only, never production)
md5 — password hashed with MD5, fast
scram-sha-256 — modern challenge-response, slower but cryptographically sound
peer — Unix socket auth against OS user, fast
ldap / radius / cert — external auth, varies

5. Process fork

Postgres forks a new backend process (postgres: username dbname in ps output). This is a real fork() system call. The postmaster (Postgres's master process) maintains a listener and forks a child for every accepted connection. The fork itself is relatively fast due to copy-on-write semantics, but the child process then needs to set up its own memory structures.

6. Shared catalog load

The new backend process loads catalog caches — system table data that it needs to execute queries. Table schemas, type OIDs, function definitions. Some of this is shared memory, some is per-process. First queries in a new connection are slightly slower because the cache is cold.

7. Session parameter setup

search_path, TimeZone, client_encoding, DateStyle, and any other session-level parameters are applied. If your application sets SET parameters on connect (common in multi-tenant apps), those run here.

8. Ready for query

Only now does the connection enter the idle state, ready to accept your first query. Everything above happened before a single byte of your actual query was sent.

The problem at scale

The per-connection cost compounds quickly under load.

Max connections ceiling

Postgres has a hard limit on simultaneous connections configured by max_connections in postgresql.conf. The default is 100. You can increase it, but there is a real ceiling because each connection is a process consuming memory and file descriptors.

At max_connections = 100, with each connection carrying roughly 10 MB of process overhead:

100 connections × 10 MB = 1 GB just for connection memory

That is before any query actually runs. Increase max_connections to 500:

500 connections × 10 MB = 5 GB

On a server with 16 GB of RAM, 5 GB consumed by idle connection overhead before any real work is done is a significant problem.

The thundering herd

Without connection pooling, connection counts track request counts directly. At 500 concurrent requests, you need 500 database connections. At 1,000 concurrent requests, you need 1,000. Each spike in traffic causes a spike in connection count.

If a deploy causes a restart, or if your database becomes briefly unavailable, every application server tries to reconnect simultaneously. Thousands of fork() calls hit the Postgres postmaster at once. This is called the thundering herd — it can take a restarted Postgres longer to recover from the reconnection storm than the original restart took.

Context switching overhead

The OS scheduler has to manage all running processes. At 200 active connections, Postgres has 200 backend processes. Even if most are idle (waiting for the next query), they exist as schedulable entities. Context switching between 200 processes costs CPU. At high connection counts this becomes measurable — Postgres spends CPU cycles managing connections rather than executing queries.

What connection pooling actually is

A connection pool is a cache of pre-established database connections that are reused across multiple application requests.

Instead of:

App → [connect] → Postgres → [query] → [disconnect]
App → [connect] → Postgres → [query] → [disconnect]
App → [connect] → Postgres → [query] → [disconnect]

With a pool:

Pool → [connect × 20] → Postgres  (done once at startup)

App → Pool → [borrow connection] → [query] → [return connection]
App → Pool → [borrow connection] → [query] → [return connection]
App → Pool → [borrow connection] → [query] → [return connection]

The pool maintains a fixed set of open connections. Requests borrow a connection, use it, and return it. The connection itself is never closed between requests — it stays open and ready. The next request picks it up instantly with zero handshake overhead.

The critical insight: the number of real Postgres connections is now decoupled from the number of concurrent application requests. 1,000 application requests can be served by 20 actual Postgres connections if the queries are fast enough, because most connections are returned to the pool before the next request needs one.

Pool modes — the most important decision

This is the decision that most developers get wrong or do not think about at all. Pool mode determines when a server-side connection is considered "available" to be assigned to the next client.

Session mode

A server connection is held for the entire duration of a client session — from client connect to client disconnect.

Client connects → gets a server connection
Client sends queries → uses that connection
Client disconnects → server connection returned to pool

This is the safest mode. Everything that works with a direct Postgres connection works in session mode — prepared statements, advisory locks, SET session variables, LISTEN/NOTIFY, everything.

The downside: the pool size directly limits concurrent clients. If you have 20 server connections in session mode and 21 clients connect, the 21st waits until someone disconnects. You get multiplexing across time (reconnections are cheap) but not across concurrent sessions.

Use session mode when: your application uses session-level features (prepared statements, advisory locks, temp tables, LISTEN) and cannot be refactored to avoid them. Also the correct default if you are not sure.

Transaction mode

A server connection is held only for the duration of a transaction — from BEGIN to COMMIT or ROLLBACK. Between transactions, the server connection is returned to the pool and can be given to a different client.

Client 1: BEGIN → [gets server conn A] → COMMIT → [returns conn A]
Client 2 starts after Client 1: BEGIN → [gets server conn A] → COMMIT
Client 3: BEGIN → [gets server conn B] → COMMIT

Two clients sharing a single server connection across different transactions. This is where real multiplexing happens.

In transaction mode, 20 server connections can genuinely serve hundreds of clients simultaneously as long as each transaction is short. This is the mode that lets you run 1,000 application connections through 20 Postgres connections.

The trade-off: session-level features break. Prepared statements created in one transaction are not visible in the next (different server connection). SET variables apply per-connection and get reset. Advisory locks taken in one transaction may be held by the server connection even after your transaction ends. LISTEN does not work at all.

Use transaction mode when: your application does not use any session-level features, or you explicitly design around the limitations. This is the correct mode for most high-throughput web applications using an ORM.

Statement mode

A server connection is held only for a single statement. Even explicit transactions are broken into individual statements, each potentially on a different server connection.

This means multi-statement transactions are impossible — each statement might go to a different connection, so BEGIN/COMMIT wrapping multiple statements does not work.

Use statement mode when: every operation is a single atomic statement and you need absolute maximum multiplexing. Rare in practice. Most applications need transactions.

PgBouncer — the standard solution

PgBouncer is the industry-standard connection pooler for PostgreSQL. It is a lightweight proxy that sits between your application and Postgres.

Application servers
    ↓
  PgBouncer (port 6432)
    ↓
  PostgreSQL (port 5432)

Your application connects to PgBouncer instead of directly to Postgres. PgBouncer maintains a pool of real Postgres connections and assigns them to your application connections according to the pool mode.

PgBouncer is:

Written in C, extremely low overhead
Single-process, uses libevent for async I/O (handles thousands of client connections with minimal CPU)
Battle-tested — used at companies running millions of queries per second
Not a query router or load balancer. It only pools connections to a single Postgres instance (or replica). For multiple replicas you run multiple PgBouncers or put a load balancer in front.

PgBouncer setup and configuration

Installation

# Ubuntu / Debian
sudo apt install pgbouncer

# macOS
brew install pgbouncer

# From source
git clone https://github.com/pgbouncer/pgbouncer.git
cd pgbouncer && ./configure && make && make install

pgbouncer.ini — the main config file

[databases]
; Format: alias = host=... port=... dbname=...
; Your app connects to PgBouncer using the alias (myapp)
; PgBouncer connects to Postgres using the real connection details
myapp = host=127.0.0.1 port=5432 dbname=myapp_production

[pgbouncer]
; PgBouncer listens on this port
listen_port = 6432
listen_addr = 127.0.0.1

; Pool mode — this is the most important setting
; Options: session | transaction | statement
pool_mode = transaction

; Max connections FROM applications TO PgBouncer
; This is how many client connections PgBouncer accepts
max_client_conn = 1000

; Max connections FROM PgBouncer TO Postgres PER database+user pair
; This is the real Postgres connection count
default_pool_size = 25

; Minimum connections kept open even when idle
min_pool_size = 5

; Extra connections opened when pool is fully in use
reserve_pool_size = 5

; How long (seconds) a client can wait for a connection before error
reserve_pool_timeout = 3

; Auth method
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt

; Logging
logfile = /var/log/pgbouncer/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid

; Admin access
admin_users = pgbouncer_admin
stats_users = pgbouncer_stats

userlist.txt — credentials

; Format: "username" "password"
; Password must be scram-sha-256 or md5 hashed, not plaintext
"myapp_user" "SCRAM-SHA-256$4096:..."
"pgbouncer_admin" "SCRAM-SHA-256$4096:..."

To generate the hashed password:

-- Run this in psql, copy the output into userlist.txt
SELECT concat('"', usename, '" "', passwd, '"')
FROM pg_shadow
WHERE usename = 'myapp_user';

Start PgBouncer

pgbouncer -d /etc/pgbouncer/pgbouncer.ini

Update your application connection string

# Before (connecting directly to Postgres)
DATABASE_URL=postgresql://myapp_user:password@localhost:5432/myapp_production

# After (connecting through PgBouncer)
DATABASE_URL=postgresql://myapp_user:password@localhost:6432/myapp
#                                                        ^^^^  ^^^^^
#                                                    PgBouncer  alias from
#                                                       port    pgbouncer.ini

Your application code does not change. Only the connection string changes.

Application-level pooling vs PgBouncer

Most database libraries and ORMs have built-in connection pooling. Node.js pg has Pool, Python's psycopg2 has connection pools, Java has HikariCP. These are application-level pools.

The difference matters:

Application-level pool

App server 1: [pool: 10 connections] → Postgres
App server 2: [pool: 10 connections] → Postgres
App server 3: [pool: 10 connections] → Postgres
App server 4: [pool: 10 connections] → Postgres

Total Postgres connections: 40

Each application server maintains its own pool. With 4 servers each holding 10 connections, Postgres sees 40 connections. Scale to 20 servers and Postgres sees 200 connections. The Postgres connection count scales with the number of application servers.

PgBouncer

App server 1: [pool: 10 client conns] → PgBouncer
App server 2: [pool: 10 client conns] → PgBouncer
App server 3: [pool: 10 client conns] → PgBouncer
App server 4: [pool: 10 client conns] → PgBouncer
                                            ↓
                                  [25 real connections]
                                            ↓
                                        Postgres

Total Postgres connections: 25 (regardless of app server count)

PgBouncer is a shared pool across all application servers. Add more application servers and Postgres still sees 25 connections. The Postgres connection count is now fixed and independent of how many servers you run.

This is the critical advantage of a dedicated pooler over application-level pooling for multi-server deployments.

When application-level pooling is enough

Single application server (connection count is bounded anyway)
Small scale where Postgres connection count never approaches max_connections
You need session-level features and cannot use transaction mode
You want simpler infrastructure

When you need PgBouncer

Multiple application servers (connection count grows with server count)
Serverless or auto-scaling environments where connection count is unpredictable
High traffic where raw connection overhead is measurable
You are hitting or approaching max_connections

How to size your pool correctly

The most common mistake is setting default_pool_size too high. More connections does not mean more throughput beyond a certain point.

The formula

Postgres performance research (including the widely-cited PgCon paper) suggests:

optimal_connections = (number of CPU cores × 2) + number of effective spindles

For a Postgres server with 4 CPU cores and SSD (treat as 1 spindle):

optimal_connections = (4 × 2) + 1 = 9

That seems very low. But it reflects a real phenomenon: beyond a certain number of concurrent connections, Postgres spends more time on context switching, lock contention, and shared buffer management than it does executing queries. Adding more connections actually reduces throughput.

For most production systems, a reasonable starting range is:

Postgres server size	Starting pool size
2 CPU cores	5–10
4 CPU cores	10–20
8 CPU cores	20–40
16 CPU cores	40–80

These are starting points, not absolute rules. Profile under your actual load.

The `max_client_conn` vs `default_pool_size` relationship

max_client_conn = 1000   ; clients (your app) can open up to 1000 connections to PgBouncer
default_pool_size = 25   ; PgBouncer uses at most 25 real Postgres connections

The ratio here (1000:25 = 40:1) means 40 application connections share each Postgres connection. This works in transaction mode because each transaction is short — a connection is free within milliseconds and available to the next client.

If your transactions are long (seconds), the ratio needs to be smaller. A connection held for 2 seconds cannot serve 40 clients in that window.

Reserve pool

reserve_pool_size = 5
reserve_pool_timeout = 3

When the main pool (default_pool_size) is fully in use, PgBouncer can open up to reserve_pool_size additional connections. These are only used when a client has been waiting reserve_pool_timeout seconds. This is a safety valve for traffic spikes, not a replacement for correctly sizing the main pool.

What happens when the pool is exhausted

When all connections in the pool are in use and a new client request arrives, PgBouncer does not fail immediately. It queues the client and waits.

The wait behavior is controlled by:

; How long a client waits before receiving an error
server_connect_timeout = 15    ; timeout connecting to Postgres
query_wait_timeout = 120       ; how long client waits in queue before error

If query_wait_timeout elapses before a connection becomes available, the client gets:

ERROR: pgbouncer cannot connect to server

ERROR: no more connections allowed (max_client_conn)

From the application's perspective this looks like a database error. Monitor the PgBouncer queue length. If clients are regularly waiting for connections, your pool is undersized for your load — either increase default_pool_size (and make sure Postgres can handle it) or optimize query duration so connections are freed faster.

Monitoring your pool

PgBouncer exposes a virtual admin database called pgbouncer. Connect to it using psql:

psql -h 127.0.0.1 -p 6432 -U pgbouncer_admin pgbouncer

SHOW POOLS

The most important view:

SHOW POOLS;

 database  |   user    | cl_active | cl_waiting | sv_active | sv_idle | sv_used | maxwait
-----------+-----------+-----------+------------+-----------+---------+---------+---------
 myapp     | myapp_user|        18 |          2 |        20 |       5 |       0 |       1

Column	What it means
`cl_active`	Clients currently executing a query
`cl_waiting`	Clients waiting for a free server connection
`sv_active`	Server connections currently in use
`sv_idle`	Server connections open but idle, ready for next client
`sv_used`	Server connections that were used and returned but not yet tested
`maxwait`	Longest wait time (seconds) for a client in the queue

cl_waiting > 0 means clients are being queued — pool pressure. maxwait climbing above 1-2 seconds means clients are noticeably delayed waiting for a connection.

SHOW STATS

SHOW STATS;

 database  | total_xact_count | total_query_count | total_wait_time | avg_xact_time | avg_query_time
-----------+------------------+-------------------+-----------------+---------------+---------------
 myapp     |          1284910 |           3921043 |           14280 |            12 |              4

avg_query_time in microseconds. If this climbs, queries are getting slower — could be missing indexes, lock contention, or degraded Postgres performance, not necessarily a pooling issue.

total_wait_time accumulating — clients spending time waiting for connections. Compare against total_xact_count to get average wait per transaction.

SHOW CLIENTS

SHOW CLIENTS;

Shows each individual client connection — which database, which user, current state, how long it has been connected, and how long the current query has been running. Useful for finding stuck connections that are holding a server connection without doing anything.

SHOW SERVERS

SHOW SERVERS;

Shows each individual server (Postgres) connection PgBouncer is maintaining — state, age, bytes sent/received. A server connection in active state for an unusually long time usually means a long-running transaction or a stuck query on the Postgres side.

Common mistakes

1. Using transaction mode with session-level features

Transaction mode breaks prepared statements, advisory locks, temp tables, and SET variables. If your ORM uses prepared statements (most do by default), you must either:

Switch to session mode
Disable prepared statements in your ORM config
Use a library that is aware of PgBouncer transaction mode For Node.js pg, disable prepared statements:

// prepared statements (default) — breaks in transaction mode
const client = await pool.connect();
const res = await client.query('SELECT $1::text', ['value']); // uses prepared statement

// parameterized query without prepared statement — works in transaction mode
const res = await pool.query({
  text: 'SELECT $1::text',
  values: ['value'],
  // no 'name' field = not a prepared statement
});

For Prisma, set pgbouncer=true in the connection string:

DATABASE_URL="postgresql://user:pass@localhost:6432/db?pgbouncer=true"

For SQLAlchemy (Python):

engine = create_engine(
    DATABASE_URL,
    pool_pre_ping=True,
    # disable prepared statements for PgBouncer transaction mode
    connect_args={"options": "-c statement_timeout=30000"},
    execution_options={"no_parameters": True}
)

2. Setting pool size too high

A common first instinct: "queries are slow, add more connections." Beyond the optimal connection count, adding more connections increases contention on Postgres's internal lock manager, shared buffers, and WAL writer. Throughput plateaus then degrades.

Profile first. If EXPLAIN ANALYZE shows slow queries, the problem is probably missing indexes or inefficient queries — not pool size.

3. Not handling connection errors in the application

Connections in the pool can go stale — Postgres restarts, network blips, idle connection timeouts. Without pool_pre_ping or equivalent, your application gets a dead connection from the pool and the query fails with a confusing error.

Always configure your pool or ORM to validate connections before use:

// node-postgres: use pool.connect() which handles reconnection
// or set idleTimeoutMillis shorter than Postgres's idle timeout

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  idleTimeoutMillis: 30000,  // close idle connections after 30s
  connectionTimeoutMillis: 2000, // error if no connection in 2s
});

; PgBouncer side: test server connections before giving to client
server_check_query = SELECT 1
server_check_delay = 30

4. Not monitoring cl_waiting

cl_waiting > 0 is a real-time signal that your pool is undersized. Most teams only discover this when users start complaining about slow requests or timeouts. Set up an alert on cl_waiting in your monitoring stack before that happens.

5. Ignoring `server_lifetime` and `server_idle_timeout`

PgBouncer keeps server connections open for efficiency, but long-lived connections can accumulate bloat (Postgres's per-connection memory) and can get into bad state after schema changes or Postgres reloads.

; Close and reopen server connections after this many seconds
server_lifetime = 3600      ; 1 hour

; Close idle server connections that have been idle for this long
server_idle_timeout = 600   ; 10 minutes

These ensure connections are periodically recycled without closing them constantly.

The mental model

Think of PgBouncer as a front desk at a hotel.

Postgres has 25 rooms (server connections). The hotel can have 1,000 guests (application clients). Not all 1,000 guests need a room simultaneously — most are out during the day. The front desk (PgBouncer) manages who has a room key at any given moment.

In session mode: a guest keeps their room key the entire stay. 25 rooms = 25 simultaneous guests maximum.

In transaction mode: a guest gets a key, drops off their bags (executes a transaction), returns the key. The next guest gets that same key immediately. 25 rooms can serve hundreds of guests throughout the day.

The front desk does not create new rooms. It manages existing ones efficiently.

Without pooling:
  1,000 requests = 1,000 Postgres connections = Postgres overwhelmed

With PgBouncer (transaction mode):
  1,000 requests → 1,000 PgBouncer client connections
               → 25 real Postgres connections
               → Postgres handles it comfortably

The number of Postgres connections is now a knob you control, independent of traffic volume.

Quick reference

Key pgbouncer.ini settings

Setting	What it controls	Typical value
`pool_mode`	session / transaction / statement	`transaction`
`default_pool_size`	real Postgres connections per db+user	20–40
`max_client_conn`	max app connections to PgBouncer	500–1000
`min_pool_size`	connections kept open when idle	5
`reserve_pool_size`	extra connections for spikes	5
`server_lifetime`	recycle connections after N seconds	3600
`server_idle_timeout`	close idle connections after N seconds	600
`query_wait_timeout`	client wait limit before error	30–120

Key monitoring queries

-- pool health: look for cl_waiting > 0 and maxwait > 1
SHOW POOLS;

-- throughput and timing
SHOW STATS;

-- individual stuck connections
SHOW CLIENTS;

-- real Postgres connections
SHOW SERVERS;

-- reload config without restart
RELOAD;

-- gracefully close idle server connections
RECONNECT myapp;

Table of Contents