PostgreSQL Connection Pooling in FastAPI: PgBouncer vs. SQLAlchemy QueuePool vs. Render's Native Pools
Connection pooling sounds boring until your SaaS is down at 2 AM because someone deployed with pool_size=5 and 47 concurrent requests are waiting for a connection. I've been there. I've shipped CitizenApp with three different pooling strategies, and each taught me something painful.
Here's the truth: connection pooling isn't just about performance—it's about preventing silent failures. A misconfigured pool doesn't scream; it quietly drops requests, exhausts your database connection limit, or leaks connections like a rotting pipe. This post is the methodology and configurations I use now.
Why Connection Pooling Matters More Than You Think
FastAPI is async, which means one worker can juggle hundreds of concurrent tasks. But PostgreSQL connections are synchronous and expensive—opening a new connection costs 5-50ms depending on your setup. Without pooling, each database query opens and closes a connection. At scale, you're spending more time handshaking than executing SQL.
Here's the silent killer: idle connections. If you set pool_size=50 and your app idles, you're holding 50 open PostgreSQL connections, each consuming memory on the database server. On Render's shared tier, you get ~100 connections total. Waste 50 of them, and you're one code deploy away from "too many connections" errors.
I prefer SQLAlchemy QueuePool in development because I can see pool exhaustion immediately (it blocks loudly). In production, I use PgBouncer in transaction mode on managed infrastructure, or Render's native pools if available. Each has tradeoffs.
The Three Strategies: Configuration & Tradeoffs
1. SQLAlchemy QueuePool (Development + Small Production Setups)
QueuePool is what you get out of the box with SQLAlchemy. It's not magic—it's a thread-safe queue of connections.
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool
DATABASE_URL = "postgresql://user:pass@localhost/citizenapp"
engine = create_engine(
DATABASE_URL,
poolclass=QueuePool,
pool_size=10, # Connections actively maintained
max_overflow=20, # Extra connections if pool exhausted
pool_recycle=3600, # Recycle connections after 1 hour
pool_pre_ping=True, # Verify connection before use
echo=False,
)
Why this configuration:
-
pool_size=10: For a small team or single-tenant dev environment, 10 is safe. Each FastAPI worker with 2-4 workers = 20-40 total connections. -
max_overflow=20: If load spikes, allow 20 extra connections. But they'll be closed after use—this is your escape hatch, not your baseline. -
pool_recycle=3600: PostgreSQL closes idle connections after 30 minutes by default. Recycling at 1 hour prevents stale connection errors. -
pool_pre_ping=True: Before handing out a connection, run aSELECT 1. Costs ~1ms per query, but saves cascading failures if the connection died.
When to use this:
- Development environments (you want loud failures).
- Single-tenant apps on Vercel/Render with <50 concurrent users.
- Internal tools where you control load.
When NOT to use:
- Multi-tenant SaaS with variable load (you'll hit max_overflow and get connection timeouts).
- Shared database infrastructure with strict connection limits.
2. PgBouncer in Transaction Mode (Multi-Tenant SaaS)
This is my go-to for CitizenApp. PgBouncer sits between FastAPI and PostgreSQL, multiplexing connections. In transaction mode, PgBouncer returns a connection to the pool after each transaction completes—not when the client disconnects.
; pgbouncer.ini
[databases]
citizenapp = host=prod-postgres.render.ondigitalocean.com port=5432 dbname=citizenapp
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
min_pool_size = 10
reserve_pool_size = 5
reserve_pool_timeout = 3
max_db_connections = 100
max_user_connections = 100
idle_in_transaction_session_timeout = 60000 ; 60s
FastAPI config stays simple:
from sqlalchemy import create_engine
# Connect to PgBouncer, not PostgreSQL directly
DATABASE_URL = "postgresql://user:pass@localhost:6432/citizenapp"
engine = create_engine(
DATABASE_URL,
poolclass=QueuePool,
pool_size=5, # Small—PgBouncer does the real pooling
max_overflow=5,
pool_pre_ping=True,
echo=False,
)
Why this configuration:
-
pool_mode = transaction: After each query, the connection returns to PgBouncer's pool. This lets one PostgreSQL connection serve 10+ concurrent clients. -
max_client_conn = 1000: FastAPI can open 1000 connections to PgBouncer. -
default_pool_size = 25: PgBouncer maintains 25 connections to PostgreSQL per database. -
idle_in_transaction_session_timeout = 60000: If a client opens a transaction and goes silent, kill it after 60s. Prevents zombie connections. -
pool_size=5in FastAPI: Why 5? PgBouncer does connection pooling. FastAPI's pool is just a safety buffer for your application layer.
The gotcha with transaction mode: You cannot use PostgreSQL transactions that span multiple queries unless PgBouncer is in session mode (slower, uses more database connections). For CitizenApp, we structure queries to be transaction-less where possible, or use explicit savepoint blocks:
from sqlalchemy import text
async def create_tenant_with_settings(db, tenant_data, settings):
async with db.begin(): # Explicit transaction
result = await db.execute(
insert(Tenant).values(**tenant_data).returning(Tenant.id)
)
tenant_id = result.scalar()
# All queries here are in the transaction
await db.execute(
insert(TenantSettings).values(tenant_id=tenant_id, **settings)
)
# Transaction commits, connection returns to PgBouncer's pool
When to use PgBouncer:
- Multi-tenant SaaS with unpredictable load (like CitizenApp).
- Shared database infrastructure with strict connection limits.
- When you want to decouple application connection pools from database connection limits.
When NOT to use:
- If you need long-lived database transactions (PgBouncer transaction mode breaks this).
- Simple setups where overhead isn't justified.
3. Render's Native Connection Pooling
If you're on Render (I host CitizenApp here), Render offers managed connection pooling built into the PostgreSQL database.
# No special config needed—Render handles it
from sqlalchemy import create_engine
DATABASE_URL = os.getenv("DATABASE_URL") # Render injects this
engine = create_engine(
DATABASE_URL,
poolclass=QueuePool,
pool_size=5,
max_overflow=5,
pool_pre_ping=True,
)
When you create a PostgreSQL instance on Render, you get:
- Built-in PgBouncer in transaction mode.
- Connection limit automatically sized based on your plan.
- Monitoring and metrics in the Render dashboard.
The advantage: zero ops. Render manages PgBouncer for you. The disadvantage: you can't tune it per your needs, and you're trusting Render's defaults (which are usually fine).
Testing & Monitoring: Catch Pool Exhaustion Before It Happens
Here's my testing methodology:
import asyncio
import time
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
async def load_test_pool(url: str, concurrent_queries: int = 100):
"""Simulate concurrent load and measure pool behavior."""
engine = create_async_engine(
url,
pool_size=10,
max_overflow=20,
pool_pre_ping=True,
)
async def run_query(session_id: int):
async with AsyncSession(engine) as session:
start = time.time()
await session.execute(text("SELECT 1"))
await asyncio.sleep(0.01) # Simulate work
elapsed = time.time() - start
return elapsed
start = time.time()
tasks = [run_query(i) for i in range(concurrent_queries)]
results = await asyncio.gather(*tasks, return_exceptions=True)
total = time.time() - start
errors = [r for r in results if isinstance(r, Exception)]
latencies = [r for r in results if not isinstance(r, Exception)]
print(f"Total time: {total:.2f}s")
print(f"Successful queries: {len(latencies)}")
print(f"Errors: {len(errors)}")
print(f"P50 latency: {sorted(latencies)[len(latencies)//2]:.3f}s")
print(f"P99 latency: {sorted(latencies)[int(len(latencies)*0.99)]:.3f}s")
await engine.dispose()
# Run: asyncio.run(load_test_pool("postgresql://...", 100))
Monitoring query for pool exhaustion:
sql
-- On your PostgreSQL server, run this every 30s
SELECT
datname,
count(*) as total_connections,
sum(case when state = 'active' then 1 else 0 end) as active,
sum(case when state = 'idle' then 1 else 0 end) as idle,
sum(case when
Top comments (0)