PostgreSQL Parallel Query: Configuration & Performance Tuning
Your analytical query scans a 50 GB table, aggregates 200 million rows, and takes 25 seconds. Your server has 16 CPU cores. PostgreSQL uses 3 of them: the leader plus the default 2 parallel workers. The other 13 sit idle. The max_parallel_workers_per_gather default of 2 is leaving most of the machine -- and most of your potential speedup -- on the table. Let's fix that, and understand when you should not.
How Parallel Query Works
PostgreSQL divides large operations across multiple CPU cores. Worker processes each scan a portion of the data, feed results through a Gather node to the leader process, which combines them. Sequential scans, hash joins, aggregates, and B-tree index scans all support parallel execution.
The key defaults:
- `max_parallel_workers_per_gather = 2` -- max workers per parallel operation
- `max_parallel_workers = 8` -- total parallel workers across all sessions
- `max_worker_processes = 8` -- total background workers (shared with other subsystems)
- `min_parallel_table_scan_size = 8MB` -- minimum table size for a parallel scan
- `parallel_setup_cost = 1000` -- planner's estimate for starting a worker
- `parallel_tuple_cost = 0.1` -- per-tuple transfer cost estimate
These are tuned for general-purpose workloads. For analytical queries on large tables, they're far too conservative.
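The planner also scales the worker count with table size: a table at `min_parallel_table_scan_size` gets one worker, and each tripling of size adds another (8 MB → 1, 24 MB → 2, 72 MB → 3, ...), capped by `max_parallel_workers_per_gather`. To see the size the planner is working from (the `orders` table here is illustrative):

```sql
-- Heap size of the table as the planner sees it
SELECT pg_size_pretty(pg_relation_size('orders')) AS heap_size;
```

With the default cap of 2, even a multi-gigabyte table never gets more than two workers.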
Detecting the Problem
Check your current settings:
```sql
SELECT name, setting, unit, short_desc
FROM pg_settings
WHERE name LIKE '%parallel%'
   OR name = 'max_worker_processes'
ORDER BY name;
```
Check whether queries actually use parallel workers:
```sql
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT customer_region, count(*), sum(order_total)
FROM orders
WHERE created_at >= '2026-01-01'
GROUP BY customer_region;
```
Look for:
- `Gather` or `Gather Merge` -- parallel execution is happening
- `Workers Planned: 2` and `Workers Launched: 2` -- how many workers
- `Workers Launched` < `Workers Planned` -- the system ran out of workers
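The shape of a healthy parallel plan for the aggregate above looks roughly like this (abridged and illustrative, not output from a real system):

```
Finalize HashAggregate
  Group Key: customer_region
  ->  Gather
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial HashAggregate
              Group Key: customer_region
              ->  Parallel Seq Scan on orders
                    Filter: (created_at >= '2026-01-01')
```

Each worker scans its share of the table and pre-aggregates; the leader combines the partial results above the Gather.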
Check if workers are being exhausted:
```sql
-- Currently active parallel workers
SELECT count(*) AS active_parallel_workers
FROM pg_stat_activity
WHERE backend_type = 'parallel worker';

-- The limit
SELECT setting AS max_parallel_workers
FROM pg_settings
WHERE name = 'max_parallel_workers';
```
If active workers frequently approach the limit, queries are competing for workers and some run with fewer than planned.
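To see which queries are holding those workers, group them by their leader backend (`leader_pid` is available in `pg_stat_activity` since PostgreSQL 13):

```sql
-- Parallel workers currently attached to each leader backend
SELECT leader_pid, count(*) AS workers
FROM pg_stat_activity
WHERE backend_type = 'parallel worker'
GROUP BY leader_pid
ORDER BY workers DESC;
```

A single reporting query hogging most of the pool is a common reason other queries launch fewer workers than planned.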
Tuning for Analytical Workloads
If your database runs analytical queries on large tables:
```sql
-- More workers per query
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;

-- More total workers
ALTER SYSTEM SET max_parallel_workers = 16;

-- Enough background worker slots
ALTER SYSTEM SET max_worker_processes = 20;

-- Lower table size threshold
ALTER SYSTEM SET min_parallel_table_scan_size = '1MB';

-- Apply (max_worker_processes requires a restart)
SELECT pg_reload_conf();
```
Rule of thumb: `max_parallel_workers_per_gather` = half your CPU cores, `max_parallel_workers` = total cores. On a 16-core server: 8 and 16 respectively.
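After reloading (or restarting, for `max_worker_processes`), it's worth confirming the values actually took effect in your session:

```sql
SHOW max_parallel_workers_per_gather;
SHOW max_parallel_workers;
SHOW max_worker_processes;
```

If a value hasn't changed, check for a session- or role-level override shadowing the system setting.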
Lower Cost Thresholds
If medium-sized tables aren't getting parallelized despite adequate configuration:
```sql
ALTER SYSTEM SET parallel_setup_cost = 100;   -- default: 1000
ALTER SYSTEM SET parallel_tuple_cost = 0.01;  -- default: 0.1
SELECT pg_reload_conf();
```
Lower `parallel_setup_cost` makes the planner consider parallelism for smaller operations. Lower `parallel_tuple_cost` makes parallel plans look cheaper.
Per-Session Overrides
For mixed workloads, set parallelism based on the connection type:
```sql
-- Reporting query: maximum parallelism
-- (SET LOCAL only takes effect inside a transaction block)
BEGIN;
SET LOCAL max_parallel_workers_per_gather = 8;
SET LOCAL parallel_setup_cost = 0;
SET LOCAL parallel_tuple_cost = 0;
SELECT region, count(*), avg(order_total)
FROM orders
GROUP BY region;
COMMIT;

-- OLTP session: disable parallelism for the whole session
SET max_parallel_workers_per_gather = 0;
```
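If the reporting and application workloads connect as different database roles, the same split can be made permanent with per-role defaults instead of per-session `SET`s (the role names here are hypothetical):

```sql
-- Reporting role gets aggressive parallelism on every connection
ALTER ROLE reporting_user SET max_parallel_workers_per_gather = 8;

-- Application role gets none
ALTER ROLE app_user SET max_parallel_workers_per_gather = 0;
```

New sessions pick up the role's default automatically, so application code doesn't need to issue any `SET` statements.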
Per-Table Settings
For critical large tables:
```sql
-- Guarantee up to 8 workers for scans on this table
ALTER TABLE orders SET (parallel_workers = 8);
```
This overrides the planner's automatic worker count calculation for the table, though the actual number launched is still capped by max_parallel_workers_per_gather and available worker slots.
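The override is stored in the table's reloptions, so you can confirm it's in place:

```sql
-- Check the stored per-table parallel_workers setting
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'orders';
```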
What Parallelizes (and What Doesn't)
| Operation | Parallel? |
|---|---|
| Sequential Scan | Yes |
| B-tree Index Scan | Yes |
| Bitmap Heap Scan | Yes |
| Hash Join | Yes |
| Merge Join | Yes |
| Nested Loop | Yes (outer side) |
| Aggregate (count, sum, avg) | Yes |
| CREATE INDEX (B-tree) | Yes |
| Append (UNION ALL, partitions) | Yes |
| UPDATE, DELETE | No |
| Materialized CTEs (WITH queries) | No (inlined CTEs can be) |
| Cursors | No |
| FOR UPDATE/SHARE | No |
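One caveat on the CTE row: since PostgreSQL 12, a CTE referenced only once is inlined into the outer query by default and can participate in a parallel plan; a materialized CTE cannot. You can control this explicitly:

```sql
-- Inlined: the CTE body can be part of a parallel plan
WITH recent AS NOT MATERIALIZED (
    SELECT * FROM orders WHERE created_at >= '2026-01-01'
)
SELECT count(*) FROM recent;

-- Materialized: computed once up front, the CTE scan runs serially
WITH recent AS MATERIALIZED (
    SELECT * FROM orders WHERE created_at >= '2026-01-01'
)
SELECT count(*) FROM recent;
```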
The Memory Multiplication Trap
work_mem applies per process and per sort or hash operation, not per query. A query with work_mem = 256MB and 4 parallel workers can consume 5 × 256 MB = 1.28 GB for a single sort or hash (the leader plus four workers each get their own allocation). Budget for the worst case:

```
max_connections * (max_parallel_workers_per_gather + 1) * work_mem < available RAM
```

This catches people who increase parallelism without accounting for memory.
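A sketch of that budget check against the live settings (`work_mem` in pg_settings is reported in kB; this is a theoretical per-operation worst case, not a prediction of actual usage):

```sql
-- Worst-case memory if every connection ran one maximally parallel sort
SELECT pg_size_pretty(
    (SELECT setting::bigint FROM pg_settings WHERE name = 'max_connections')
  * ((SELECT setting::bigint FROM pg_settings
      WHERE name = 'max_parallel_workers_per_gather') + 1)
  * (SELECT setting::bigint FROM pg_settings WHERE name = 'work_mem') * 1024
) AS worst_case_sort_memory;
```

If the result dwarfs your RAM, that's the signal to lower work_mem before raising parallelism.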
Verify the Impact
Compare sequential vs parallel execution:
```sql
-- Disable parallelism for this session
SET max_parallel_workers_per_gather = 0;

EXPLAIN (ANALYZE, BUFFERS, TIMING)
SELECT customer_region, count(*), sum(order_total)
FROM orders
GROUP BY customer_region;

-- Re-enable parallelism
SET max_parallel_workers_per_gather = 4;

EXPLAIN (ANALYZE, BUFFERS, TIMING)
SELECT customer_region, count(*), sum(order_total)
FROM orders
GROUP BY customer_region;
```
The parallel plan should run in roughly sequential_time / (1 + num_workers), since the leader scans alongside its workers; 60-80% of that theoretical speedup is typical due to Gather overhead. For the 25-second query from the intro, 4 workers put the ideal at ~5 seconds and a realistic result at 6-8 seconds.
Prevention Strategy
OLAP databases: aggressive parallelism. `max_parallel_workers_per_gather` = CPU cores / 2, `max_parallel_workers` = CPU cores, lower cost thresholds.
OLTP databases: keep defaults or disable. Many short concurrent queries don't benefit -- worker overhead exceeds speedup on small queries.
Mixed workloads: per-connection settings. Reporting connections get high parallelism. App connections get zero.
Monitor `Workers Launched` vs `Workers Planned`. A consistent shortfall means you need more `max_parallel_workers`. If CPU hits 100% during parallel queries and other sessions slow down, reduce `max_parallel_workers_per_gather`.
Originally published at mydba.dev/blog/postgres-parallel-query