PostgreSQL Performance Tuning: A Practical Guide for Developers
PostgreSQL is a powerhouse — but out of the box, its default configuration is deliberately conservative, tuned for a machine with 256MB of RAM running multiple services. When you inherit a slow database or a query that crawls under production load, you need a systematic toolkit: read execution plans accurately, choose the right index for the job, configure memory correctly, and keep the vacuum process from becoming a crisis.
This guide works through each layer of the stack with real SQL. By the end you will be able to diagnose a slow query in under five minutes and have a concrete plan to fix it.
1. Reading EXPLAIN ANALYZE Output
EXPLAIN shows the planner's estimated cost; EXPLAIN ANALYZE actually executes the query and reports measured timing. In development, include the BUFFERS option as well so you can see cache behavior:
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.email, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.id, u.email;
A typical output section looks like this:
HashAggregate  (cost=4821.33..4923.44 rows=8170 width=40)
               (actual time=38.412..40.891 rows=7842 loops=1)
  Buffers: shared hit=2310 read=512
  ->  Hash Right Join  (cost=1840.00..4719.22 rows=20420 width=32)
                       (actual time=18.041..34.102 rows=20420 loops=1)
        Hash Cond: (o.user_id = u.id)
        ->  Seq Scan on orders  (cost=0.00..2180.20 rows=118020 width=8)
                                (actual time=0.009..9.811 rows=118020 loops=1)
        ->  Hash  (cost=1738.00..1738.00 rows=8200 width=28)
                  (actual time=17.921..17.922 rows=8170 loops=1)
              ->  Index Scan using idx_users_created_at on users
                    (cost=0.42..1738.00 rows=8200 width=28)
                    (actual time=0.041..14.102 rows=8170 loops=1)
Key numbers to internalize:
- cost=start..total — planner's estimate in arbitrary units. The first number is startup cost (before the first row), the second is total cost. These are estimates; actual times tell the real story.
- actual time=first..total rows=N loops=L — measured milliseconds. If loops is large, multiply actual time by loops to get the true cost of that node.
- Buffers: shared hit=X read=Y — hit means the page was served from PostgreSQL's shared buffer cache; read means it was fetched from disk. A high read count on a hot table means your shared_buffers is too small or your working set doesn't fit in memory.
- rows estimate vs actual rows — a large discrepancy (e.g., planner expects 100 rows, reality is 50,000) indicates stale statistics. Run ANALYZE table_name to update them, or raise the statistics target for the affected column.
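When the rows estimate is badly off, refreshing and enlarging the statistics sample usually fixes it. A minimal sketch, reusing the users table and created_at column from the example above (the 500 target is an illustrative value, not a recommendation):

```sql
-- Refresh planner statistics for the table
ANALYZE users;

-- If estimates are still off, raise the sample size for the skewed column
-- (the global default_statistics_target defaults to 100)
ALTER TABLE users ALTER COLUMN created_at SET STATISTICS 500;
ANALYZE users;  -- re-sample with the larger target
```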
Red flags: Seq Scan on large tables, Nested Loop with a large outer row count, Hash Batches > 1 (hash join spilled to disk due to insufficient work_mem).
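For the hash-spill case specifically, a quick session-level experiment can confirm that work_mem is the bottleneck. A sketch — the 256MB value is illustrative, and the SELECT is a placeholder for the offending query:

```sql
-- Hash Batches > 1 means the hash table spilled to disk.
-- Temporarily raise work_mem for this session only and re-check the plan:
SET work_mem = '256MB';
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- re-run the query that spilled
RESET work_mem;
```

If the batches drop to 1 and the node speeds up, raise work_mem permanently with care — it is allocated per sort/hash operation, not per connection.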
2. Index Types: Choosing the Right Tool
PostgreSQL ships with six index types. Picking the wrong one is one of the most common performance mistakes.
B-tree (default)
Best for equality (=) and range (<, >, BETWEEN) on sortable data. This covers 90% of use cases.
-- Standard case: filter on a timestamp range
CREATE INDEX idx_orders_created ON orders (created_at DESC);
-- Partial index: only index recent active records (smaller, faster)
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';
Use a partial index when your queries consistently filter on a fixed value of a low-cardinality column (status = 'pending' here) in addition to the indexed column. The index is smaller and fits more easily in cache.
Hash
Equality-only. Faster than B-tree for pure = lookups on very high cardinality columns, but it cannot support range scans and is not WAL-logged before PostgreSQL 10. In practice, B-tree is almost always the safer choice unless you have benchmarks showing a measurable win.
CREATE INDEX idx_sessions_token ON sessions USING HASH (token);
GIN (Generalized Inverted Index)
Best for multi-valued column types: jsonb, arrays, tsvector (full-text search). GIN stores a separate entry for each element in the value, making containment queries (@>, ?, @@) fast.
-- Full-text search
CREATE INDEX idx_articles_body_fts ON articles
USING GIN (to_tsvector('english', body));
SELECT id, title
FROM articles
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'postgresql & performance');
-- JSONB containment
CREATE INDEX idx_events_payload ON events USING GIN (payload jsonb_path_ops);
SELECT * FROM events WHERE payload @> '{"type": "purchase"}';
jsonb_path_ops is a GIN operator class optimized for @> — it produces a smaller index than the default operator class, at the cost of not supporting key-existence operators (?, ?|, ?&).
GiST (Generalized Search Tree)
Extensible framework used for geometric types, ranges, and nearest-neighbor searches. PostGIS uses GiST for spatial indexes. The pg_trgm extension uses GiST (and GIN) for fuzzy string matching.
-- Enable trigram extension for LIKE/ILIKE acceleration
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_products_name_trgm ON products
USING GIN (name gin_trgm_ops);
-- Now this query uses the index instead of Seq Scan
SELECT * FROM products WHERE name ILIKE '%wireless%';
-- Range type index
CREATE INDEX idx_reservations_during ON reservations USING GiST (during);
SELECT * FROM reservations WHERE during && '[2026-04-01, 2026-04-07)'::daterange;
BRIN (Block Range Index)
Tiny footprint index for naturally ordered data, such as append-only log tables where id or created_at correlates with physical storage order. BRIN stores only the min/max values per block range (128 pages by default). Queries touching a large range that spans many blocks benefit enormously; point lookups do not.
-- Metrics table appended chronologically — ideal for BRIN
CREATE INDEX idx_metrics_ts_brin ON metrics USING BRIN (recorded_at);
-- Check correlation before creating BRIN:
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'metrics'
AND attname = 'recorded_at';
-- correlation near 1.0 or -1.0 = BRIN is appropriate
A BRIN index on a 500 million row table can be as small as 48 KB. The trade-off is that matching blocks still require a page scan; if correlation is poor, PostgreSQL may end up reading more pages than a sequential scan.
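If the default granularity is too coarse for your queries, BRIN's pages_per_range storage parameter can be lowered — smaller ranges mean a larger index but tighter pruning. A sketch against the metrics table from above (the value 32 is illustrative):

```sql
-- Trade a slightly larger index for more precise block pruning
CREATE INDEX idx_metrics_ts_brin_32 ON metrics USING BRIN (recorded_at)
  WITH (pages_per_range = 32);  -- default is 128 pages per range
```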
3. Composite Indexes and Index Coverage
Column Order Matters
A composite index on (a, b) supports queries on a alone or on (a, b) together, but not on b alone. The usual advice is to put the most selective column first — but when a query combines an equality filter with a range filter, put the equality column first regardless of selectivity, so PostgreSQL can jump straight to the matching portion of the index and scan it contiguously.
-- Supports: WHERE status = 'active' AND created_at > ...
-- Also supports: WHERE status = 'active'
-- Does NOT support: WHERE created_at > ... (alone)
CREATE INDEX idx_users_status_created ON users (status, created_at DESC);
Covering Indexes (Index-Only Scans)
When all columns referenced in a query are stored in the index, PostgreSQL can answer the query without touching the table heap at all — an index-only scan. Use the INCLUDE clause (PostgreSQL 11+) to add non-key columns:
CREATE INDEX idx_orders_user_covering ON orders (user_id, created_at DESC)
INCLUDE (status, total_amount);
-- This query now performs an index-only scan
SELECT user_id, created_at, status, total_amount
FROM orders
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 20;
Verify with EXPLAIN: look for Index Only Scan and Heap Fetches: 0. A non-zero heap fetch count means some of the table's pages are not yet marked all-visible in the visibility map — running VACUUM on the table resolves this.
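To see how much of a table the visibility map already covers, the pg_visibility extension is handy. A sketch using the orders table from the example above:

```sql
CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- Page counts marked all-visible / all-frozen for the table
SELECT * FROM pg_visibility_map_summary('orders');

-- If all_visible is low relative to the table's total pages, vacuum it:
VACUUM orders;
```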
4. Avoiding Sequential Scans
A sequential scan is not always wrong — on a 200-row lookup table, a Seq Scan is faster than an index lookup. The planner uses statistics to decide. When the planner makes the wrong call, investigate before forcing an index.
Why the planner chooses Seq Scan:
- Statistics are stale. Run ANALYZE table_name or increase the statistics target for the column.
- Table is too small. Correct behavior — do nothing.
- Cost constants are wrong. On SSDs, random_page_cost should be 1.1–1.5, not the default 4.0.
- No suitable index exists. Create one.
# postgresql.conf — tune cost constants for SSD storage
random_page_cost = 1.1
effective_cache_size = 12GB   # total OS + PostgreSQL cache estimate
-- Force the planner to reconsider (session level, for testing only)
SET enable_seqscan = OFF;
EXPLAIN ANALYZE SELECT ...;
SET enable_seqscan = ON;
Common patterns that kill index usage:
-- BAD: function wrapping the indexed column
WHERE LOWER(email) = 'user@example.com'
-- GOOD: functional index or store data normalized
CREATE INDEX idx_users_email_lower ON users (LOWER(email));
-- BAD: implicit cast
WHERE user_id = '42' -- user_id is integer, '42' is text
-- GOOD: explicit cast or correct type
WHERE user_id = 42
5. Connection Pooling with PgBouncer
PostgreSQL spawns a process per connection. At 500 concurrent connections, you are spending more time on process management than actual queries. PgBouncer sits between your app and PostgreSQL and multiplexes clients onto a smaller pool of real database connections.
Install and configure PgBouncer (/etc/pgbouncer/pgbouncer.ini):
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp
[pgbouncer]
listen_port = 6432
listen_addr = 127.0.0.1
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction ; session | transaction | statement
max_client_conn = 1000
default_pool_size = 25 ; actual connections to PostgreSQL
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
server_idle_timeout = 600
log_connections = 0
log_disconnections = 0
Pool mode guidance:
- session — one server connection held for the entire client session. Safest, least efficient.
- transaction — connection returned to the pool after each transaction. Works for 95% of apps; breaks session-level SET statements and advisory locks held across transactions.
- statement — connection returned after each statement. Breaks multi-statement transactions entirely. Avoid unless specifically needed.
Monitor pool health:
-- Connect to PgBouncer's admin console
psql -p 6432 -U pgbouncer pgbouncer
SHOW POOLS;
SHOW STATS;
SHOW CLIENTS;
Watch cl_waiting: if consistently above 0, increase default_pool_size or reduce query duration.
6. Key postgresql.conf Settings
The defaults ship tuned for 256MB RAM. Adjust these for a dedicated database server:
# Memory
shared_buffers = 4GB # ~25% of total RAM
effective_cache_size = 12GB # ~75% of RAM (planner hint, not allocated)
work_mem = 64MB # per sort/hash operation; multiply by
# max_connections * 2 for peak RSS estimate
maintenance_work_mem = 512MB # for VACUUM, CREATE INDEX, ALTER TABLE
# Write performance
wal_buffers = 64MB # default auto (-1) is usually fine
checkpoint_completion_target = 0.9
max_wal_size = 4GB # allow longer checkpoints
# Planner
random_page_cost = 1.1 # SSD: 1.1; spinning disk: 4.0
effective_io_concurrency = 200 # SSD: 100-200; spinning disk: 2
# Parallelism
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
max_worker_processes = 16
# Logging
log_min_duration_statement = 200 # log queries slower than 200ms
log_checkpoints = on
log_lock_waits = on
Reload without restart:
-- Most parameters only require a reload:
SELECT pg_reload_conf();
-- or, from the shell:
pg_ctl reload -D /var/lib/postgresql/data
Parameters requiring a restart are marked postmaster in pg_settings. Check with:
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'max_connections');
7. Vacuuming and Autovacuum Tuning
PostgreSQL uses MVCC: deleted and updated rows are not immediately removed. They become dead tuples. VACUUM reclaims that space and advances the visibility map, enabling index-only scans. Without it, tables bloat and queries slow down.
Manual vacuum for immediate relief:
-- Standard vacuum: reclaim dead tuples, update visibility map
VACUUM ANALYZE orders;
-- Full vacuum: rewrites table, reclaims OS-level disk space
-- Warning: takes exclusive lock, blocks all reads and writes
VACUUM FULL orders;
-- Check bloat before deciding
SELECT
schemaname,
tablename,
n_dead_tup,
n_live_tup,
round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 2) AS dead_pct,
last_vacuum,
last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
Autovacuum tuning for high-churn tables:
The default autovacuum threshold is 50 + 0.2 * n_live_tup. On a 10 million row orders table that means waiting until 2,000,050 dead tuples accumulate. Override per table:
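You can compute each table's current trigger point directly from pg_stat_user_tables — a sketch that assumes the global defaults (threshold 50, scale factor 0.2) and ignores any per-table overrides:

```sql
SELECT relname,
       n_live_tup,
       n_dead_tup,
       -- default formula: autovacuum_vacuum_threshold + scale_factor * live tuples
       round(50 + 0.2 * n_live_tup) AS dead_tuples_needed_to_trigger
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

Tables where n_dead_tup sits just below the trigger for long stretches are the ones that benefit most from a lower per-table scale factor.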
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.01, -- trigger at 1% dead tuples
autovacuum_vacuum_cost_delay = 2, -- ms to sleep each time the cost limit is hit
autovacuum_vacuum_cost_limit = 400 -- higher = faster vacuum, more I/O
);
Monitor autovacuum workers:
SELECT pid, now() - xact_start AS duration, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%'
ORDER BY duration DESC;
If autovacuum can never keep up, increase autovacuum_max_workers (default 3) in postgresql.conf.
8. Partitioning Large Tables
Declarative partitioning (PostgreSQL 10+) splits a logical table into physical child tables, enabling partition pruning: the planner only scans partitions that could contain matching rows.
Range partitioning by month:
CREATE TABLE events (
id BIGSERIAL,
occurred_at TIMESTAMPTZ NOT NULL,
type TEXT NOT NULL,
payload JSONB
) PARTITION BY RANGE (occurred_at);
CREATE TABLE events_2026_01
PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02
PARTITION OF events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
-- Automate monthly partition creation with a function:
CREATE OR REPLACE FUNCTION create_monthly_partition(tbl TEXT, yr INT, mo INT)
RETURNS VOID LANGUAGE plpgsql AS $$
DECLARE
partition_name TEXT;
start_date DATE;
end_date DATE;
BEGIN
start_date := make_date(yr, mo, 1);
end_date := start_date + INTERVAL '1 month';
partition_name := tbl || '_' || to_char(start_date, 'YYYY_MM');
EXECUTE format(
'CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES FROM (%L) TO (%L)',
partition_name, tbl, start_date, end_date
);
END;
$$;
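The function above can then be invoked from a scheduled job to create partitions ahead of the data — for example, one month in advance:

```sql
-- Create the March 2026 partition of events before any rows arrive
SELECT create_monthly_partition('events', 2026, 3);
```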
Verify pruning is working:
EXPLAIN SELECT * FROM events WHERE occurred_at >= '2026-02-01' AND occurred_at < '2026-03-01';
-- Should show: Append -> Seq Scan on events_2026_02 (only one child)
Hash partitioning is better for evenly distributing write load without a natural range key:
CREATE TABLE user_sessions (
session_id UUID NOT NULL,
user_id BIGINT NOT NULL,
data JSONB
) PARTITION BY HASH (user_id);
CREATE TABLE user_sessions_0 PARTITION OF user_sessions FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE user_sessions_1 PARTITION OF user_sessions FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE user_sessions_2 PARTITION OF user_sessions FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE user_sessions_3 PARTITION OF user_sessions FOR VALUES WITH (MODULUS 4, REMAINDER 3);
Add indexes on partition keys and frequently queried columns on the parent table — PostgreSQL 11+ propagates indexes to all partitions automatically.
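A sketch of that behavior using the events table from above — one statement on the parent, and every existing and future partition gets a matching index:

```sql
-- Created once on the parent; PostgreSQL 11+ builds it on each partition
CREATE INDEX idx_events_type ON events (type);

-- Verify: each partition now has a corresponding index
SELECT tablename, indexname
FROM pg_indexes
WHERE tablename LIKE 'events%';
```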
9. Caching Strategies
pg_prewarm
After a restart, shared_buffers is empty and the first queries are slow as PostgreSQL reads from disk. pg_prewarm loads table or index pages into the buffer cache on demand.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
-- Warm the entire table
SELECT pg_prewarm('orders');
-- Warm just the index
SELECT pg_prewarm('idx_orders_user_covering');
-- Check buffer cache occupancy
CREATE EXTENSION IF NOT EXISTS pg_buffercache;
SELECT
c.relname,
count(*) AS buffers,
round(count(*) * 8192 / 1024.0 / 1024.0, 2) AS mb_cached
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase = (SELECT oid FROM pg_database WHERE datname = current_database())
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 20;
Enable pg_prewarm.autoprewarm = on in postgresql.conf to automatically restore the buffer state across restarts (PostgreSQL 11+).
Materialized Views
For expensive aggregation queries that run frequently but can tolerate slightly stale data, materialized views pre-compute and store the result set.
CREATE MATERIALIZED VIEW daily_revenue AS
SELECT
date_trunc('day', created_at) AS day,
SUM(total_amount) AS revenue,
COUNT(*) AS order_count
FROM orders
WHERE status = 'completed'
GROUP BY 1
ORDER BY 1;
CREATE UNIQUE INDEX ON daily_revenue (day);
-- Refresh without blocking reads (CONCURRENTLY requires a unique index)
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_revenue;
CONCURRENTLY allows reads during the refresh but requires a unique index and takes longer than a plain refresh. Schedule it with pg_cron or your application scheduler. Without CONCURRENTLY, the refresh holds an exclusive lock on the view for its entire duration.
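If pg_cron is installed (an assumption — it ships separately and must be listed in shared_preload_libraries), the refresh can be scheduled entirely in SQL. The 15-minute interval is illustrative:

```sql
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Run the concurrent refresh every 15 minutes
SELECT cron.schedule(
  'refresh-daily-revenue',                                 -- job name
  '*/15 * * * *',                                          -- cron schedule
  $$REFRESH MATERIALIZED VIEW CONCURRENTLY daily_revenue$$
);
```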
For near-real-time use cases, consider a REFRESH triggered by a logical replication slot or a background worker rather than a cron job.
10. Monitoring: pg_stat_statements and Slow Query Log
pg_stat_statements
This extension aggregates statistics for every distinct query shape across all executions — the single most useful tool in a PostgreSQL performance toolkit.
-- Enable in postgresql.conf, then restart
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 10000
-- After restart:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Top 10 queries by total time
SELECT
round(total_exec_time::numeric, 2) AS total_ms,
calls,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms,
round((100 * total_exec_time /
sum(total_exec_time) OVER ())::numeric, 2) AS pct_total,
left(query, 80) AS query_snippet
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
-- Find queries with high variance (inconsistent performance)
SELECT
round(mean_exec_time::numeric, 2) AS mean_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms,
calls,
left(query, 100) AS query
FROM pg_stat_statements
WHERE calls > 100
AND stddev_exec_time > mean_exec_time
ORDER BY stddev_exec_time DESC
LIMIT 10;
-- Reset statistics after optimization work
SELECT pg_stat_statements_reset();
Slow Query Log
The slow query log catches queries in real time and is essential on systems where pg_stat_statements was not enabled from the start.
# postgresql.conf
log_min_duration_statement = 200 # ms; set to 0 to log everything (verbose!)
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
log_statement = 'none' # don't log all statements, only slow ones
Parse the log with pgBadger for a human-readable HTML report:
pgbadger /var/log/postgresql/postgresql-*.log -o report.html --outdir /var/www/html/pgbadger
Or query live from within PostgreSQL using file_fdw if your log format is CSV:
log_destination = 'csvlog'
logging_collector = on
CREATE EXTENSION file_fdw;
CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE postgres_log (
log_time TIMESTAMPTZ, user_name TEXT, database_name TEXT,
process_id INT, connection_from TEXT, session_id TEXT,
session_line_num BIGINT, command_tag TEXT, session_start_time TIMESTAMPTZ,
virtual_transaction_id TEXT, transaction_id BIGINT, error_severity TEXT,
sql_state_code TEXT, message TEXT, detail TEXT, hint TEXT,
internal_query TEXT, internal_query_pos INT, context TEXT,
query TEXT, query_pos INT, location TEXT, application_name TEXT
) SERVER pglog
OPTIONS (filename '/var/log/postgresql/postgresql.csv', format 'csv');
SELECT log_time, user_name, message
FROM postgres_log
WHERE message LIKE 'duration:%'
ORDER BY log_time DESC
LIMIT 50;
Combining Both Tools
Use pg_stat_statements to identify the query shapes with the highest cumulative cost, then correlate with the slow query log to find the specific parameter values that produce the worst execution plans. That combination — aggregate profile plus concrete examples — gives you everything you need to write a targeted fix.
Putting It Together: A Performance Audit Checklist
When you inherit a slow PostgreSQL instance, work through this list in order:
1. Enable pg_stat_statements if not already active. Restart required.
2. Run ANALYZE on high-traffic tables to refresh planner statistics.
3. Pull the top-10 slow queries from pg_stat_statements. Run EXPLAIN (ANALYZE, BUFFERS) on each.
4. Check pg_stat_user_tables for tables with high n_dead_tup or a long time since the last vacuum.
5. Review postgresql.conf: shared_buffers, work_mem, random_page_cost, checkpoint_completion_target.
6. Check the connection count: SELECT count(*) FROM pg_stat_activity. If consistently above 100, deploy PgBouncer.
7. Look for missing indexes: queries with Seq Scan on tables larger than 10,000 rows that are not covered by an existing index.
8. Check for index bloat with pgstattuple or the check_postgres script.
9. Consider partitioning any append-heavy table exceeding 50 million rows or 10GB.
10. Schedule REFRESH MATERIALIZED VIEW CONCURRENTLY for any expensive dashboard queries running more than once per minute.
PostgreSQL rewards systematic investigation. The tools are all built in — the discipline is reading what they tell you before reaching for a quick fix.
Wilson Xu is a software engineer specializing in backend systems and database performance. He writes about PostgreSQL, Node.js, and developer tooling.