The Developer's Guide to Taming PostgreSQL: How We Tripled Throughput with PgBouncer

Last month, our card transaction processing service began failing during peak hours. The system, responsible for authorizing millions of dollars in transactions, was throwing ConnectionTimeoutException errors. The culprit? Our PostgreSQL database was drowning under the sheer volume of connection requests from our fleet of Spring Boot microservices.

After a deep dive, we implemented PgBouncer as a centralized connection pooler. The results were immediate and dramatic: we slashed our p95 latency by 75% (from 1.2s to 300ms) and tripled our peak transaction authorization throughput, completely eliminating connection timeouts.

This isn't just another "how-to." This is the production-grade blueprint, complete with the configurations, JVM tuning, and architectural reasoning that actually matters when the system cannot fail.

The Problem: Connection Pool Anarchy

Our architecture consisted of 8 Spring Boot microservice instances, each configured with HikariCP's default maximum-pool-size of 10. In theory, that's 80 concurrent connections. In reality, during traffic spikes and deployments, connection churn was causing massive contention on PostgreSQL.

The symptoms were a classic sign of database saturation:

  • High ConnectionTimeoutException error rates during peak traffic.
  • Excessive memory usage on the PostgreSQL server from managing hundreds of idle processes.
  • Slow query performance, even for indexed SELECT statements.
  • Cascading failures as upstream services timed out waiting for our service.

The Solution: PgBouncer as the Master Connection Manager

PgBouncer is a lightweight, highly-optimized proxy that sits between your applications and PostgreSQL. It maintains a master pool of physical connections to the database and serves thousands of client connections by multiplexing them through this small, efficient pool. It's the traffic controller your database desperately needs.

Here is our complete setup, ready to run in Docker.

Docker Compose Configuration

version: '3.8'

services:
  postgresql:
    image: postgres:16-alpine
    container_name: postgres-db
    environment:
      POSTGRES_DB: card_transactions_db
      POSTGRES_USER: app_user
      POSTGRES_PASSWORD: secure_password
      # Enforce modern, secure authentication from the start
      POSTGRES_INITDB_ARGS: "--auth-host=scram-sha-256"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./postgresql.conf:/etc/postgresql/postgresql.conf
    command: postgres -c config_file=/etc/postgresql/postgresql.conf
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app_user -d card_transactions_db"]
      interval: 10s
      timeout: 5s
      retries: 5

  pgbouncer:
    image: pgbouncer/pgbouncer:1.21.0
    container_name: pgbouncer-proxy
    ports:
      - "6432:6432"
    depends_on:
      postgresql:
        condition: service_healthy
    volumes:
      # We mount our specific configs for production-like control
      - ./pgbouncer.ini:/etc/pgbouncer/pgbouncer.ini
      - ./userlist.txt:/etc/pgbouncer/userlist.txt

  card-transaction-service:
    build: .
    container_name: card-transaction-service
    environment:
      # Note: We connect to pgbouncer, not postgresql directly
      SPRING_DATASOURCE_URL: jdbc:postgresql://pgbouncer:6432/card_transactions_db
      SPRING_DATASOURCE_USERNAME: app_user
      SPRING_DATASOURCE_PASSWORD: secure_password
      SPRING_PROFILES_ACTIVE: docker
    ports:
      - "8080:8080"
    depends_on:
      - pgbouncer
    mem_limit: 1g
    mem_reservation: 512m

volumes:
  postgres_data:

Architect's Note on Startup: Docker's depends_on only ensures the container has started, not that the PgBouncer process within is ready. For true production resilience, use a wrapper script with a tool like wait-for-it.sh or build connection retry logic into your Spring application's startup sequence.
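
As a sketch of the retry-logic option (ResilientBootstrap is a hypothetical wrapper class, and CardTransactionServiceApplication stands in for your real @SpringBootApplication class), the idea is simply to block until PgBouncer accepts a connection before booting Spring:

import java.sql.DriverManager;
import java.sql.SQLException;

import org.springframework.boot.SpringApplication;

public class ResilientBootstrap {

    public static void main(String[] args) throws InterruptedException {
        String url = System.getenv("SPRING_DATASOURCE_URL");
        String user = System.getenv("SPRING_DATASOURCE_USERNAME");
        String password = System.getenv("SPRING_DATASOURCE_PASSWORD");

        // Poll PgBouncer (and, transitively, Postgres) before starting the app.
        for (int attempt = 1; attempt <= 30; attempt++) {
            try (var ignored = DriverManager.getConnection(url, user, password)) {
                break; // connection succeeded; the pooler is ready
            } catch (SQLException e) {
                System.out.printf("Database not ready (attempt %d/30): %s%n",
                        attempt, e.getMessage());
                Thread.sleep(2_000);
            }
        }

        SpringApplication.run(CardTransactionServiceApplication.class, args);
    }
}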

PgBouncer Configuration (pgbouncer.ini)

This is where the magic happens. The defaults are not sufficient for production.

[databases]
card_transactions_db = host=postgresql port=5432 dbname=card_transactions_db

[pgbouncer]
# Network and Auth
listen_addr = *
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt

# Pooling Mode - CRITICAL
pool_mode = transaction

# Connection Limits
max_client_conn = 1000      # Max inbound connections from all apps
default_pool_size = 20      # Connections per DB pool kept open to Postgres
min_pool_size = 5           # Keep at least this many connections ready
reserve_pool_size = 5       # Overflow connections for clients stuck waiting past reserve_pool_timeout
max_db_connections = 40     # A hard cap on physical connections to this PG instance

# Timeouts (seconds)
server_lifetime = 3600            # Max life of a server connection
server_idle_timeout = 600         # Reconnect if idle for 10 mins
server_connect_timeout = 15
server_login_retry = 15
client_login_timeout = 60

# Query and State Management
server_reset_query = DISCARD ALL
ignore_startup_parameters = extra_float_digits

# Logging
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1
stats_period = 60
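
The auth_file referenced above is a simple file of quoted username/secret pairs. A minimal sketch, with the SCRAM verifier truncated to a placeholder:

"app_user" "SCRAM-SHA-256$4096:<salt>$<stored_key>:<server_key>"

The second field can be either the plaintext password or, better, the actual SCRAM verifier, which a superuser can copy straight out of PostgreSQL with SELECT passwd FROM pg_shadow WHERE usename = 'app_user';.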

Let me explain the most critical settings:

  • pool_mode = transaction: This is the key to high throughput. A physical connection is assigned to a client for the duration of a single transaction and returned to the pool immediately upon COMMIT or ROLLBACK. This provides maximum reuse.
  • server_reset_query = DISCARD ALL: This wipes session-specific state (e.g., SET variables, temporary tables) from a connection before another client reuses it. One caveat straight from the PgBouncer docs: in transaction mode the reset query is not run by default, precisely because clients are expected to avoid session state altogether; set server_reset_query_always = 1 to force it, at the cost of an extra round-trip per transaction. The durable rule in transaction mode is architectural: never rely on session state in application code.
  • Connection Limits: We've set max_client_conn high to accept traffic from all service instances, but default_pool_size low to protect PostgreSQL. The database now only ever sees a maximum of ~40 connections from our entire application fleet, regardless of how many microservices we scale up to. You can verify this live, as shown below.
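
PgBouncer exposes its live state through an admin console (the virtual pgbouncer database). One assumption in this sketch: console access is restricted to the users named in admin_users/stats_users, so add app_user to stats_users in pgbouncer.ini first.

# Inspect live pool state: cl_active, cl_waiting, sv_active, sv_idle, etc.
psql -h localhost -p 6432 -U app_user pgbouncer -c "SHOW POOLS;"

# Aggregate traffic counters (also written to the log every stats_period seconds)
psql -h localhost -p 6432 -U app_user pgbouncer -c "SHOW STATS;"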

The "Double Pooling" Dilemma: Why a Pool in Front of a Pool?

You're right to ask: "If PgBouncer is my pooler, why do I still need HikariCP?" This isn't redundancy; it's a separation of concerns.

  1. The Application Pool (HikariCP): Its job is to manage a pool of connections to PgBouncer. These are cheap, fast, local network connections. Its purpose is to eliminate thread contention inside your Java application. When a thread needs to run a transaction, it gets a connection instantly from HikariCP without blocking. This pool should be sized for your application's peak internal concurrency.

  2. The Master Pool (PgBouncer): Its job is to manage the truly scarce resource: physical connections to PostgreSQL. It acts as the gatekeeper, taking the high volume of requests from all your service instances and efficiently funneling them through a small, controlled number of real database connections.

By separating these concerns, you get the best of both worlds: no thread starvation in your application and no connection exhaustion on your database.
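
Here's what that looks like from the application's side; a minimal sketch in which the repository and domain types (TransactionRepository, AuthorizationRequest, AuthorizationResult, CardTransaction) are hypothetical stand-ins for your own classes:

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class AuthorizationService {

    private final TransactionRepository repository;

    public AuthorizationService(TransactionRepository repository) {
        this.repository = repository;
    }

    // HikariCP lends this thread a cheap client connection to PgBouncer for the
    // transaction; PgBouncer binds a physical Postgres connection only from the
    // first statement until COMMIT/ROLLBACK, then recycles it for other clients.
    @Transactional
    public AuthorizationResult authorize(AuthorizationRequest request) {
        CardTransaction tx = repository.save(CardTransaction.pendingFrom(request));
        return AuthorizationResult.approved(tx.getId());
    }
}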

Spring Boot Configuration (application-docker.yml)

The key is to configure HikariCP to be a good citizen in this new world. It no longer needs to be stingy with connections.

spring:
  datasource:
    url: jdbc:postgresql://pgbouncer:6432/card_transactions_db
    username: app_user
    password: secure_password
    driver-class-name: org.postgresql.Driver
    hikari:
      pool-name: Hikari-Card-Service
      # Size this pool for your app's internal threads, not the DB.
      minimum-idle: 10
      maximum-pool-size: 20
      # Aggressive timeouts because connections are to the local PgBouncer proxy.
      idle-timeout: 30000
      max-lifetime: 600000 # 10 minutes
      connection-timeout: 3000
      validation-timeout: 2000
      connection-test-query: SELECT 1
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    hibernate:
      ddl-auto: validate
    properties:
      hibernate:
        jdbc.batch_size: 30
        order_inserts: true
        order_updates: true

Architect's Note: We've removed explicit autocommit properties. In a modern Spring Boot application using @Transactional, Spring's transaction manager (a JpaTransactionManager here, since we're on JPA) is the definitive authority over transaction boundaries, making those lower-level settings redundant and potentially confusing.

JVM Tuning for Containerized Performance (Dockerfile)

Your application code runs on the JVM, and the JVM needs to be told it's running in a resource-constrained container.

FROM eclipse-temurin:21-jre

# The deprecated openjdk images stopped publishing JRE tags; Temurin still does.
# Cap the heap below the container's 1g limit and tune G1 for low-pause service work.
ENV JAVA_OPTS="-Xms512m -Xmx768m \
    -XX:+UseG1GC \
    -XX:MaxGCPauseMillis=100 \
    -XX:+UseStringDeduplication"

COPY target/card-transaction-service-*.jar app.jar

EXPOSE 8080

ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar /app.jar"]
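
A design note on the heap flags: since JDK 10 the JVM detects container memory limits on its own, so a percentage-based cap is a reasonable alternative to hard-coded -Xms/-Xmx values; it tracks mem_limit automatically if you resize the container later.

# Alternative: size the heap relative to the detected container limit
ENV JAVA_OPTS="-XX:MaxRAMPercentage=75.0 \
    -XX:+UseG1GC \
    -XX:MaxGCPauseMillis=100 \
    -XX:+UseStringDeduplication"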

Performance Testing Results

The numbers speak for themselves.

Metric                  Before PgBouncer   After PgBouncer   Improvement
Avg. Response Time      450ms              150ms             67% reduction
95th Percentile         1.2s               300ms             75% reduction
Connection Errors       ~15-20%            0%                Eliminated
Active DB Connections   80-120             20-25             ~75% reduction

Load Testing with Artillery

Verifying performance requires a realistic load test.

# load-test.yml
config:
  target: 'http://localhost:8080'
  phases:
    - duration: 300
      arrivalRate: 100
      name: "Ramp-up"
    - duration: 600
      arrivalRate: 250
      name: "Sustained Peak Load"

scenarios:
  - name: "Authorize Card Transaction"
    flow:
      - post:
          url: "/api/transactions/authorize"
          json:
            cardNumber: "4{{ $randomDigits(15) }}"
            expiryMonth: "{{ $randomInt(1, 12) }}"
            expiryYear: 2026
            cvv: "{{ $randomDigits(3) }}"
            amount: "{{ $randomInt(10, 500) }}.00"
            currency: "USD"
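
Running it is a one-liner once Artillery is installed (one caveat: the availability of the $random* template helpers varies by Artillery version, so check them against the docs for the version you install):

npm install -g artillery
artillery run load-test.yml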

Common Pitfalls and How to Avoid Them

  1. Pool Sizing Mismatch: The most common mistake is sizing the application pool larger than PgBouncer's pool, creating a bottleneck.

    • Rule: Sum of all HikariCP pools should be comfortably less than pgbouncer.max_client_conn. A single HikariCP pool (maximum-pool-size) should be less than or equal to pgbouncer.default_pool_size.
  2. Using SESSION Pool Mode: Unless you are using features that require a persistent connection for the entire session (like advisory locks), avoid session mode. It defeats the purpose of high-throughput pooling. transaction mode is almost always what you want.

  3. Authentication Errors: Ensure PostgreSQL's pg_hba.conf and PgBouncer's userlist.txt use the same authentication method (scram-sha-256 is the modern standard).

  4. Prepared Statements (PREPARE): By default, PgBouncer in transaction mode doesn't track named (server-side) prepared statements, and the PostgreSQL JDBC driver switches to them after a statement's fifth execution (the prepareThreshold default), so Spring/Hibernate hits this under the hood. The driver does not detect PgBouncer automatically. Either append prepareThreshold=0 to the JDBC URL so it sticks to unnamed statements, or, since we're on PgBouncer 1.21+, set max_prepared_statements to a nonzero value in pgbouncer.ini and let PgBouncer track them itself. The driver-side fix is shown below.
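
The driver-side fix is a one-parameter change to the URL in application-docker.yml:

spring:
  datasource:
    # prepareThreshold=0 keeps pgjdbc on unnamed prepared statements,
    # which transaction-mode PgBouncer handles without extra configuration
    url: jdbc:postgresql://pgbouncer:6432/card_transactions_db?prepareThreshold=0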

Key Takeaways

  1. Centralize Connection Pooling: Don't let individual applications manage scarce database connections. Use a dedicated tool like PgBouncer to act as a central gatekeeper.
  2. Tune for a Layered World: Configure your application pool (HikariCP) for in-app thread performance and your master pool (PgBouncer) for database resource protection.
  3. transaction Mode is Your Default: It provides the best balance of performance and compatibility for microservice architectures.
  4. State Management is Critical: Pooled connections are shared, so treat session state as off-limits in transaction mode, and remember that server_reset_query = DISCARD ALL only runs there if server_reset_query_always is enabled.

The performance gains we achieved weren't magic. They were the result of a systematic approach to identifying a bottleneck and applying the right architectural pattern—with the right configuration—to solve it.

How have you tackled database connection scaling in your environment? Share your war stories in the comments.
