
Cover image for Find Your Optimal PostgreSQL Connection Pool Size with k6 Load Testing
Manash Jyoti Baruah

Posted on • Originally published at manash.dev

Find Your Optimal PostgreSQL Connection Pool Size with k6 Load Testing

A step-by-step guide to identifying and fixing connection pool bottlenecks in Node.js applications

How I reduced API response time from 1,221ms to 280ms using systematic load testing and pool optimization


What You'll Learn

In this guide, you'll discover how to:

  • Identify connection pool bottlenecks in your Node.js application
  • Calculate the optimal pool size using load testing with k6
  • Monitor pool health with simple metrics
  • Fix performance issues that cause 1000ms+ response times

Let's start with a real production issue I encountered and how I solved it systematically.


The Problem: Slow Response Times with Low CPU Usage

During a routine load test of my Node.js API with 150 concurrent users, I observed these concerning metrics:

Node.js CPU: 45%
RAM: 60%
PostgreSQL CPU: 15%
Average Response Time: 1,221ms
P95 Latency: 2,890ms

The diagnosis was clear: with database CPU at only 15% while response times exceeded 1 second, the bottleneck sat between the application and the database, not in the database itself.
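A back-of-the-envelope check with Little's Law supports this reading. A pool of N connections, each held for T seconds per query, can serve at most N/T queries per second. (The ~150ms query time here is an assumption, borrowed from the calculation later in this post.)

```javascript
// Little's Law sketch: the maximum query throughput a pool can sustain
// if every query holds a connection for the full queryTimeMs.
function maxPoolThroughput(poolSize, queryTimeMs) {
  return poolSize / (queryTimeMs / 1000); // queries per second
}

// With the default pool of 10 and ~150ms queries, the pool tops out
// around 66 queries/second:
console.log(maxPoolThroughput(10, 150).toFixed(1)); // "66.7"

// 150 concurrent users issuing roughly one request per second demand
// ~150 rps, so requests have nowhere to go but a growing waiting queue.
```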


Step 1: Add Connection Pool Monitoring

First, I added monitoring to understand what was happening inside the connection pool. This one small block of logging revealed everything:

// Monitor pool health every 5 seconds
setInterval(() => {
  console.log(
    `[POOL] Total: ${pool.totalCount} | Idle: ${pool.idleCount} | Waiting: ${pool.waitingCount}`
  );
}, 5000);

Key Metrics Explained:

  • Total: Number of connections in the pool
  • Idle: Available connections ready to use
  • Waiting: Requests queued because all connections are busy
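To turn those three numbers into something actionable, a tiny classifier helps. This is a sketch with illustrative thresholds, not anything built into pg:

```javascript
// Rough pool-health classifier based on the three metrics above.
// Thresholds are illustrative starting points, not pg defaults.
function poolHealth({ total, idle, waiting }) {
  if (waiting > 0) return "saturated";        // requests are queuing
  if (total > 0 && idle === 0) return "busy"; // every connection in use
  return "healthy";
}

console.log(poolHealth({ total: 10, idle: 3, waiting: 0 }));  // "healthy"
console.log(poolHealth({ total: 10, idle: 0, waiting: 22 })); // "saturated"
```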

Basic Server Setup with Monitoring

import dotenv from "dotenv";
import express from "express";
import { Pool } from "pg";

dotenv.config();

const app = express();
const PORT = process.env.PORT || 3000;

const poolConfig = {
  host: process.env.DB_HOST,
  port: parseInt(process.env.DB_PORT, 10),
  database: process.env.DB_NAME,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,

  // These are what we're testing
  min: parseInt(process.env.POOL_MIN) || 2,
  max: parseInt(process.env.POOL_MAX) || 10,

  // Important timeouts
  idleTimeoutMillis: parseInt(process.env.POOL_IDLE_TIMEOUT) || 30000,
  connectionTimeoutMillis: parseInt(process.env.POOL_CONNECTION_TIMEOUT) || 2000,
};

const pool = new Pool(poolConfig);
const WORKER_ID = process.pid;

console.log("=================================");
console.log(`[Worker-${WORKER_ID}]`);
console.log(`  Host: ${poolConfig.host}`);
console.log(`  Database: ${poolConfig.database}`);
console.log(`  Pool Min: ${poolConfig.min}`);
console.log(`  Pool Max: ${poolConfig.max}`);
console.log("=================================\n");

// Test database connection
pool.query("SELECT NOW()", (err, res) => {
  if (err) {
    console.error(`[Worker-${WORKER_ID}] Database connection failed:`, err.message);
    process.exit(1);
  }
  console.log(`[Worker-${WORKER_ID}] Database connected at:`, res.rows[0].now);
});

// API endpoint for testing
app.get("/users", async (req, res) => {
  const startTime = Date.now();

  try {
    const result = await pool.query("SELECT * FROM users LIMIT 100");
    const duration = Date.now() - startTime;

    res.json({
      success: true,
      count: result.rows.length,
      data: result.rows,
      query_time_ms: duration,
    });
  } catch (err) {
    const duration = Date.now() - startTime;
    res.status(500).json({
      success: false,
      error: err.message,
      query_time_ms: duration,
    });
  }
});

// The critical monitoring - logs every 5 seconds
setInterval(() => {
  console.log(
    `[Worker-${WORKER_ID}] [POOL] Total: ${pool.totalCount} | Idle: ${pool.idleCount} | Waiting: ${pool.waitingCount}`
  );
}, 5000);

// Graceful shutdown
process.on("SIGTERM", async () => {
  console.log("\nSIGTERM received, closing pool...");
  await pool.end();
  process.exit(0);
});

process.on("SIGINT", async () => {
  console.log("\nSIGINT received, closing pool...");
  await pool.end();
  process.exit(0);
});

app.listen(PORT, () => {
  console.log(`\n[Worker-${WORKER_ID}] Server running on http://localhost:${PORT}`);
  console.log(`[Worker-${WORKER_ID}] Users endpoint: http://localhost:${PORT}/users\n`);
});

Step 2: Create k6 Load Test

Create a realistic load test that simulates production traffic:

// load-test.js
import http from "k6/http";
import { check, sleep } from "k6";

export const options = {
  stages: [
    { duration: "10s", target: 10 },   // Warm up
    { duration: "20s", target: 30 },   // Ramp up
    { duration: "30s", target: 60 },   // Increase load
    { duration: "10s", target: 100 },  // Push harder
    { duration: "15s", target: 150 },  // Peak traffic
    { duration: "10s", target: 0 },    // Cool down
  ],
  thresholds: {
    http_req_duration: ["p(95)<2000"], // 95% of requests under 2s
    http_req_failed: ["rate<0.1"],     // Less than 10% errors
  },
};

export default function () {
  const res = http.get("http://localhost:3000/users");

  check(res, {
    "status is 200": (r) => r.status === 200,
    "response time < 500ms": (r) => r.timings.duration < 500,
  });

  sleep(0.1); // Each user waits 100ms between requests
}

export function handleSummary(data) {
  const metrics = data.metrics;
  console.log("\n========================================");
  console.log("LOAD TEST RESULTS");
  console.log("========================================");
  console.log(`Total Requests: ${metrics.http_reqs.values.count}`);
  console.log(`Average Response: ${metrics.http_req_duration.values.avg.toFixed(2)}ms`);
  console.log(`P95 Latency: ${metrics.http_req_duration.values["p(95)"].toFixed(2)}ms`);
  console.log(`Failed Requests: ${(metrics.http_req_failed.values.rate * 100).toFixed(2)}%`);
  console.log("========================================\n");
  return { stdout: "" };
}

Step 3: Test Different Pool Sizes Systematically

I tested each pool size by editing the .env file and restarting the server:

.env Configuration

# Database configuration
DB_HOST=localhost
DB_PORT=5432
DB_NAME=testdb
DB_USER=postgres
DB_PASSWORD=password

# Pool configuration - Edit POOL_MAX for each test
POOL_MIN=1
POOL_MAX=10  # Change to: 10, 15, 20, 40
POOL_IDLE_TIMEOUT=30000
POOL_CONNECTION_TIMEOUT=10000

Testing Workflow

For each pool size test:

  1. Edit .env file - Change POOL_MAX value
  2. Start the server - npm run dev
  3. Run load test - k6 run load-tests/test.js (in another terminal)
  4. Collect results - Watch the console logs and k6 output
  5. Stop server - Ctrl+C and repeat with next pool size

# Step 1: Set POOL_MAX=10 in .env file
# Step 2: Start server
npm run dev

# Step 3: In another terminal, run k6
k6 run load-tests/test.js

# Step 4: Note the results
# Step 5: Stop server (Ctrl+C)

# Repeat: Edit .env to POOL_MAX=15
npm run dev
k6 run load-tests/test.js

# Repeat: Edit .env to POOL_MAX=20
npm run dev
k6 run load-tests/test.js

# Repeat: Edit .env to POOL_MAX=40
npm run dev
k6 run load-tests/test.js
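The edit-restart-test loop above is easy to half-automate. Here is a hypothetical helper script; it assumes a `.env` file with a `POOL_MAX=...` line and the npm/k6 setup from this post, and only rewrites `.env` and prints the commands you still run yourself:

```shell
# Cycle .env through each pool size under test.
ENV_FILE="${ENV_FILE:-.env}"

# Create a minimal .env if one doesn't exist yet (so the script is safe to try)
[ -f "$ENV_FILE" ] || echo "POOL_MAX=10" > "$ENV_FILE"

for size in 10 15 20 40; do
  # Rewrite the POOL_MAX line in place (GNU sed; on macOS use: sed -i '')
  sed -i "s/^POOL_MAX=.*/POOL_MAX=$size/" "$ENV_FILE"
  echo "POOL_MAX=$size -> start server (npm run dev), then: k6 run load-tests/test.js"
done
```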

Step 4: Analyse Test Results

Test #1: Pool Size 10 (Baseline)

Server Logs During Load:

[Worker-24064] [POOL] Total: 1 | Idle: 0 | Waiting: 0
[Worker-24064] [POOL] Total: 6 | Idle: 1 | Waiting: 0
[Worker-24064] [POOL] Total: 10 | Idle: 3 | Waiting: 0
[Worker-24064] [POOL] Total: 10 | Idle: 1 | Waiting: 0
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 11
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 14
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 22
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 24
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 36
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 37
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 58
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 53
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 69
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 72
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 105
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 123
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 130   PEAK QUEUE
[Worker-24064] [POOL] Total: 10 | Idle: 0 | Waiting: 54
[Worker-24064] [POOL] Total: 10 | Idle: 6 | Waiting: 0
[Worker-24064] [POOL] Total: 10 | Idle: 10 | Waiting: 0
[Worker-24064] [POOL] Total: 0 | Idle: 0 | Waiting: 0

k6 Results:

[Image: Grafana k6 load-testing results with pg connection pool limit 10]

Test #2: Pool Size 15

Server Logs:

[Worker-29532] [POOL] Total: 1 | Idle: 1 | Waiting: 0
[Worker-29532] [POOL] Total: 5 | Idle: 1 | Waiting: 0
[Worker-29532] [POOL] Total: 8 | Idle: 3 | Waiting: 0
[Worker-29532] [POOL] Total: 11 | Idle: 1 | Waiting: 0
[Worker-29532] [POOL] Total: 15 | Idle: 4 | Waiting: 0
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 0
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 3
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 11
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 16
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 20
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 24
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 30
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 34
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 55
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 72
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 89
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 104
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 120   PEAK QUEUE
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 81
[Worker-29532] [POOL] Total: 15 | Idle: 0 | Waiting: 44
[Worker-29532] [POOL] Total: 15 | Idle: 15 | Waiting: 0
[Worker-29532] [POOL] Total: 14 | Idle: 14 | Waiting: 0
[Worker-29532] [POOL] Total: 0 | Idle: 0 | Waiting: 0

k6 Results:

[Image: Grafana k6 load-testing results with pg connection pool limit 15]

Test #3: Pool Size 20

Server Logs:

[Worker-31608] [POOL] Total: 1 | Idle: 1 | Waiting: 0
[Worker-31608] [POOL] Total: 4 | Idle: 2 | Waiting: 0
[Worker-31608] [POOL] Total: 8 | Idle: 4 | Waiting: 0
[Worker-31608] [POOL] Total: 14 | Idle: 2 | Waiting: 0
[Worker-31608] [POOL] Total: 18 | Idle: 1 | Waiting: 0
[Worker-31608] [POOL] Total: 20 | Idle: 8 | Waiting: 0
[Worker-31608] [POOL] Total: 20 | Idle: 3 | Waiting: 0
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 2
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 10
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 18
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 24
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 36
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 48
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 66
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 86
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 99
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 114   PEAK
[Worker-31608] [POOL] Total: 20 | Idle: 0 | Waiting: 60
[Worker-31608] [POOL] Total: 20 | Idle: 8 | Waiting: 0
[Worker-31608] [POOL] Total: 20 | Idle: 20 | Waiting: 0    RECOVERED
[Worker-31608] [POOL] Total: 8 | Idle: 8 | Waiting: 0
[Worker-31608] [POOL] Total: 0 | Idle: 0 | Waiting: 0

k6 Results:

[Image: Grafana k6 load-testing results with pg connection pool limit 20]

Test #4: Pool Size 40

Server Logs:

[Worker-31644] [POOL] Total: 1 | Idle: 1 | Waiting: 0
[Worker-31644] [POOL] Total: 4 | Idle: 0 | Waiting: 0
[Worker-31644] [POOL] Total: 9 | Idle: 0 | Waiting: 0
[Worker-31644] [POOL] Total: 13 | Idle: 2 | Waiting: 0
[Worker-31644] [POOL] Total: 21 | Idle: 4 | Waiting: 0
[Worker-31644] [POOL] Total: 27 | Idle: 8 | Waiting: 0
[Worker-31644] [POOL] Total: 32 | Idle: 12 | Waiting: 0
[Worker-31644] [POOL] Total: 38 | Idle: 1 | Waiting: 0
[Worker-31644] [POOL] Total: 40 | Idle: 4 | Waiting: 0
[Worker-31644] [POOL] Total: 40 | Idle: 6 | Waiting: 0
[Worker-31644] [POOL] Total: 40 | Idle: 13 | Waiting: 0
[Worker-31644] [POOL] Total: 40 | Idle: 0 | Waiting: 0
[Worker-31644] [POOL] Total: 40 | Idle: 0 | Waiting: 12
[Worker-31644] [POOL] Total: 40 | Idle: 0 | Waiting: 36
[Worker-31644] [POOL] Total: 40 | Idle: 0 | Waiting: 51
[Worker-31644] [POOL] Total: 40 | Idle: 0 | Waiting: 65
[Worker-31644] [POOL] Total: 40 | Idle: 0 | Waiting: 88   PEAK (still queuing!)
[Worker-31644] [POOL] Total: 40 | Idle: 0 | Waiting: 19
[Worker-31644] [POOL] Total: 40 | Idle: 36 | Waiting: 0
[Worker-31644] [POOL] Total: 40 | Idle: 40 | Waiting: 0   RECOVERED
[Worker-31644] [POOL] Total: 7 | Idle: 7 | Waiting: 0
[Worker-31644] [POOL] Total: 0 | Idle: 0 | Waiting: 0

k6 Results:

[Image: Grafana k6 load-testing results with pg connection pool limit 40]

Analysis:

  • Best response times: 280ms average (77% improvement from baseline)
  • Highest throughput: 13,680 requests (239% increase)
  • Still experienced queuing: Peak of 88 waiting despite 40 connections
  • Diminishing returns: Only 59% improvement over Pool 20

Step 5: Performance Analysis

Results Summary

| Pool Size | Total Requests | Avg Response | P95 Latency | Max Queue | Improvement |
| --- | --- | --- | --- | --- | --- |
| 10 | 4,040 | 1,221ms | 2,890ms | 130 | Baseline |
| 15 | 6,969 | 654ms | 1,545ms | 120 | +72.5% |
| 20 | 8,605 | 508ms | 1,171ms | 114 | +113% (Optimal) |
| 40 | 13,680 | 280ms | 529ms | 88 | +239% |

Key Insights

  1. 10 → 15: Significant improvement (72.5% throughput increase, 46% latency reduction)
  2. 15 → 20: Strong continued gains (23.5% more throughput, 22% latency reduction)
  3. 20 → 40: Diminishing returns (59% more throughput but 2x the connections)
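The percentages in those insights come straight from the throughput column. A quick check:

```javascript
// Throughput per pool size, taken from the results table above.
const throughput = { 10: 4040, 15: 6969, 20: 8605, 40: 13680 };

// Percent gain in total requests when moving between two pool sizes.
function improvement(from, to) {
  return ((throughput[to] - throughput[from]) / throughput[from]) * 100;
}

console.log(improvement(10, 15).toFixed(1)); // "72.5"
console.log(improvement(15, 20).toFixed(1)); // "23.5"
console.log(improvement(20, 40).toFixed(1)); // "59.0"
```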

VERDICT: Pool Size 20 is Optimal

While Pool Size 40 delivers the best raw performance, Pool Size 20 provides the best balance:

  • Pool 20 advantages:

    • 113% throughput improvement with just 20 connections
    • Clears queue completely (recovers to 0)
    • Uses 50% fewer database resources than Pool 40
    • Already achieves sub-600ms response times
  • Pool 40 considerations:

    • Only 59% better than Pool 20 despite 100% more connections
    • Still experiences queuing (88 peak) during high load
    • Higher database resource consumption
    • Better suited if you expect >200 concurrent users

Step 6: Calculate Your Optimal Pool Size

Based on the testing, here's a formula to calculate your starting point:

function calculateOptimalPoolSize(concurrentUsers, avgQueryTimeMs) {
  // Base calculation
  const baseSize = (concurrentUsers * avgQueryTimeMs) / 1000;

  // Add 20% safety buffer
  const withBuffer = baseSize * 1.2;

  return Math.ceil(withBuffer);
}

// Example for our case
const poolSize = calculateOptimalPoolSize(150, 150);
console.log(poolSize); // Returns 27

// Testing showed 20 was sufficient because actual query time was faster

Important Constraints

// Check database limits
const dbMaxConnections = 100;  // from SHOW max_connections;
const reserveForAdmin = 10;
const availableForApp = dbMaxConnections - reserveForAdmin;

// If using multiple workers
const workers = 4;  // PM2 cluster mode
const poolPerWorker = Math.floor(availableForApp / workers);
console.log(`Each worker gets ${poolPerWorker} connections`);

Step 7: Production Implementation

Enhanced Monitoring Setup

const ALERT_THRESHOLDS = {
  waitingQueue: 20,
  idlePercentage: 10,
};

setInterval(() => {
  const total = pool.totalCount;
  const idle = pool.idleCount;
  const waiting = pool.waitingCount;
  const idlePercentage = total > 0 ? (idle / total) * 100 : 0;

  // Log metrics
  console.log(`[POOL] Total: ${total} | Idle: ${idle} | Waiting: ${waiting}`);

  // Alert conditions
  if (waiting > ALERT_THRESHOLDS.waitingQueue) {
    console.error(`[ALERT] High queue: ${waiting} requests waiting`);
    // Send to monitoring service
  }

  if (idlePercentage < ALERT_THRESHOLDS.idlePercentage && total > 0) {
    console.warn(`[ALERT] Low idle connections: ${idlePercentage.toFixed(1)}%`);
  }
}, 5000);

Database Monitoring Queries

-- Check current connections
SELECT count(*) as total,
       count(*) FILTER (WHERE state = 'active') as active,
       count(*) FILTER (WHERE state = 'idle') as idle
FROM pg_stat_activity
WHERE datname = current_database();

-- Find connection hogs
SELECT pid, usename, application_name, state,
       NOW() - query_start as duration,
       substring(query, 1, 50) as query_preview
FROM pg_stat_activity
WHERE state != 'idle'
  AND query_start < NOW() - INTERVAL '1 minute'
ORDER BY query_start;

Common Pitfalls to Avoid

1. Not Releasing Connections

// BAD: Connection leak (client is never released)
const client = await pool.connect();
const result = await client.query("SELECT * FROM users");
res.json(result.rows);
// Forgot to release!

// GOOD: Always release in a finally block
const client = await pool.connect();
try {
  const result = await client.query("SELECT * FROM users");
  res.json(result.rows);
} finally {
  client.release();
}

// BETTER: Use pool.query() for simple one-off queries
const result = await pool.query("SELECT * FROM users");
res.json(result.rows);
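One way to make the GOOD pattern hard to forget is a small wrapper. This is a sketch; `withClient` is not part of pg's API, it just packages the try/finally pattern shown above:

```javascript
// Acquire a client, run the work, always release — even on error.
// `pool` is anything with pg's connect() shape.
async function withClient(pool, work) {
  const client = await pool.connect();
  try {
    return await work(client);
  } finally {
    client.release(); // runs even if `work` throws
  }
}

// Usage (with the pg Pool from earlier in the post):
// const result = await withClient(pool, (c) => c.query("SELECT * FROM users LIMIT 100"));
```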

2. Holding Connections During External Calls

// BAD: Holds a pooled connection for 2+ seconds
const client = await pool.connect();
const user = await client.query("SELECT * FROM users WHERE id = $1", [id]);
const apiData = await (await fetch("https://slow-api.com/data")).json(); // 2 seconds!
await client.query("UPDATE users SET data = $1 WHERE id = $2", [apiData, id]);
client.release();

// GOOD: Use pool.query() so no connection is held during the external call
const user = await pool.query("SELECT * FROM users WHERE id = $1", [id]);
const apiData = await (await fetch("https://slow-api.com/data")).json();
await pool.query("UPDATE users SET data = $1 WHERE id = $2", [apiData, id]);

Quick Reference

Healthy Pool Indicators

  • Waiting < 10 consistently
  • Idle > 0 (at least 10% of total)
  • Response time P95 < 1000ms
  • No connection timeouts

When to Increase Pool Size

  • Waiting > 30 consistently
  • Idle always 0
  • Response times > 1000ms
  • Database CPU < 50% under load
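The two checklists can be folded into one advisory function. The thresholds simply mirror the bullets above; treat them as starting points, not rules:

```javascript
// Encodes the checklists above: "increase" when the pool looks undersized,
// "healthy" when the indicators are green, "watch" otherwise.
function poolSizeAdvice({ waiting, idle, total, p95Ms }) {
  if (waiting > 30 || (idle === 0 && p95Ms > 1000)) return "increase";
  if (waiting < 10 && idle >= total * 0.1 && p95Ms < 1000) return "healthy";
  return "watch";
}

console.log(poolSizeAdvice({ waiting: 2, idle: 4, total: 20, p95Ms: 600 }));   // "healthy"
console.log(poolSizeAdvice({ waiting: 58, idle: 0, total: 10, p95Ms: 2890 })); // "increase"
```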

Useful Commands

# Start server (uses .env file)
npm run dev

# Run load test (in another terminal)
k6 run load-tests/test.js

# PostgreSQL connections check
psql -c "SELECT count(*) FROM pg_stat_activity"

Conclusion

Connection pool optimization reduced our response time from 1,221ms to 280ms (a 77% improvement) through systematic testing with k6 load tests.

The Complete Performance Picture:

| Metric | Pool 10 | Pool 20 (Recommended) | Pool 40 |
| --- | --- | --- | --- |
| Avg Response | 1,221ms | 508ms (-58%) | 280ms (-77%) |
| P95 Latency | 2,890ms | 1,171ms (-59%) | 529ms (-82%) |
| Throughput | 4,040 req | 8,605 req (+113%) | 13,680 req (+239%) |
| Max Queue | 130 | 114 | 88 |
| Efficiency | Poor | Excellent | Good |

Final Verdict: Choose Pool Size 20

Why Pool 20 over Pool 40?

  1. Resource efficiency: 113% performance gain with just 20 connections
  2. Diminishing returns: Pool 40 only adds 59% more throughput for 100% more connections
  3. Cost-benefit: Pool 20 achieves the "good enough" threshold (<600ms response)
  4. Database overhead: Fewer connections = less memory and CPU on PostgreSQL

Key Takeaways:

  1. Default pool (10) is severely undersized - caused 1.2s response times
  2. Pool 20 hits the sweet spot - doubles throughput, halves latency
  3. Pool 40 shows diminishing returns - consider only for >200 concurrent users
  4. Queue monitoring is critical - even Pool 40 had 88 requests waiting at peak

Remember: The default pool size of 10 resulted in 130 requests queuing. Simply changing to Pool 20 delivered a 2x throughput improvement with sub-second response times.



Questions? Drop them in the comments. What pool size works for your application?


💡 Found this helpful?

Drop a ❤️ and 🦄 if this helped you. Follow for more practical guides and real-world dev experiences.

🤝 Let's connect:

💼 LinkedIn: linkedin.com/in/manashjb

📺 YouTube: youtube.com/@CoderInSpace

🚀 Portfolio: manash.dev

