PostgreSQL Parallel Query: Configuration & Performance Tuning

Philip McClarence

Your analytical query scans a 50 GB table, aggregates 200 million rows, and takes 25 seconds. Your server has 16 CPU cores. PostgreSQL uses... 3 of them: a leader process plus two parallel workers. The other 13 sit idle. The max_parallel_workers_per_gather default of 2 is leaving a potential ~5x speedup on the table. Let's fix that -- and understand when you should not.

How Parallel Query Works

PostgreSQL divides large operations across multiple CPU cores. Worker processes each scan a portion of the data, feed results through a Gather node to the leader process, which combines them. Sequential scans, hash joins, aggregates, and B-tree index scans all support parallel execution.
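The partial-aggregate / gather / finalize shape is easy to see in a toy model (plain Python for illustration -- real PostgreSQL workers are separate OS processes sharing the scan, not lists):

```python
def partial_aggregate(chunk):
    # What each worker computes over its share of the scan:
    # a partial aggregate state, not final results.
    return {"count": len(chunk), "sum": sum(chunk)}

def gather_and_finalize(partials):
    # What the leader does at the Gather node: combine the
    # workers' partial states into the final aggregate.
    return {"count": sum(p["count"] for p in partials),
            "sum": sum(p["sum"] for p in partials)}

# Two "workers" each scan half of a toy order_total column.
order_totals = [10, 20, 30, 40, 50, 60]
chunks = [order_totals[0::2], order_totals[1::2]]
result = gather_and_finalize([partial_aggregate(c) for c in chunks])
# result == {"count": 6, "sum": 210}
```

This is also why count and sum parallelize so well: partial states combine trivially, and only a handful of rows cross the Gather node instead of millions.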

The key defaults:

  • max_parallel_workers_per_gather = 2 -- max workers per parallel operation
  • max_parallel_workers = 8 -- total parallel workers across all sessions
  • max_worker_processes = 8 -- total background workers (shared with other subsystems)
  • min_parallel_table_scan_size = 8MB -- minimum table size for parallel scan
  • parallel_setup_cost = 1000 -- planner's estimate for starting a worker
  • parallel_tuple_cost = 0.1 -- per-tuple transfer cost estimate

These are tuned for general-purpose workloads. For analytical queries on large tables, they're far too conservative.
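To see how conservative, here is a simplified Python model of the planner's worker-count heuristic: one worker once a table reaches min_parallel_table_scan_size, one more each time the size triples, capped by max_parallel_workers_per_gather. (A sketch of the planner's logic, not the exact source.)

```python
def planned_workers(table_mb, min_scan_mb=8.0, cap=2):
    """Simplified model of how the planner sizes a parallel scan."""
    if table_mb < min_scan_mb:
        return 0                      # too small for a parallel scan
    workers, threshold = 1, min_scan_mb
    while table_mb >= threshold * 3:  # one more worker per tripling
        workers += 1
        threshold *= 3
    return min(workers, cap)

# The 50 GB table from the intro: its size justifies 8 workers,
# but the default cap of 2 throws most of them away.
print(planned_workers(50 * 1024, cap=2))  # -> 2
print(planned_workers(50 * 1024, cap=8))  # -> 8
```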

Detecting the Problem

Check your current settings:

SELECT name, setting, unit, short_desc
FROM pg_settings
WHERE name LIKE '%parallel%'
   OR name = 'max_worker_processes'
ORDER BY name;

Check whether queries actually use parallel workers:

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT customer_region, count(*), sum(order_total)
FROM orders
WHERE created_at >= '2026-01-01'
GROUP BY customer_region;

Look for:

  • Gather or Gather Merge -- parallel execution is happening
  • Workers Planned: 2 and Workers Launched: 2 -- how many workers
  • Workers Launched < Workers Planned -- system ran out of workers

Check if workers are being exhausted:

-- Currently active parallel workers
SELECT count(*) AS active_parallel_workers
FROM pg_stat_activity
WHERE backend_type = 'parallel worker';

-- The limit
SELECT setting AS max_parallel_workers
FROM pg_settings
WHERE name = 'max_parallel_workers';

If active workers frequently approach the limit, queries are competing for workers and some run with fewer than planned.

Tuning for Analytical Workloads

If your database runs analytical queries on large tables:

-- More workers per query
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;

-- More total workers
ALTER SYSTEM SET max_parallel_workers = 16;

-- Enough background worker slots
ALTER SYSTEM SET max_worker_processes = 20;

-- Lower table size threshold
ALTER SYSTEM SET min_parallel_table_scan_size = '1MB';

-- Apply (max_worker_processes requires a server restart)
SELECT pg_reload_conf();

Rule of thumb: max_parallel_workers_per_gather = half your CPU cores, max_parallel_workers = total cores. On a 16-core server: 8 and 16 respectively.
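The rule of thumb is easy to express as a helper (a hypothetical function for illustration, mirroring the numbers above):

```python
def recommended_settings(cpu_cores):
    # Rule of thumb from the text: half the cores per gather,
    # all cores in the global worker pool, plus some slack in
    # max_worker_processes for other background-worker users
    # (logical replication, extensions, etc.).
    return {
        "max_parallel_workers_per_gather": cpu_cores // 2,
        "max_parallel_workers": cpu_cores,
        "max_worker_processes": cpu_cores + 4,
    }

print(recommended_settings(16))
```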

Lower Cost Thresholds

If medium-sized tables aren't getting parallelized despite adequate configuration:

ALTER SYSTEM SET parallel_setup_cost = 100;    -- default: 1000
ALTER SYSTEM SET parallel_tuple_cost = 0.01;   -- default: 0.1
SELECT pg_reload_conf();

Lower parallel_setup_cost makes the planner consider parallelism for smaller operations. Lower parallel_tuple_cost makes parallel plans look cheaper.
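A deliberately simplified cost model shows how these knobs tip the planner's decision (made-up numbers and a reduced formula -- the real planner is far more detailed): the Gather adds a fixed setup cost plus a per-tuple transfer cost, while the scan work is divided across the leader and its workers.

```python
def serial_cost(rows, pages, cpu_tuple_cost=0.01, seq_page_cost=1.0):
    # Simplified sequential-scan cost: page I/O plus per-row CPU.
    return pages * seq_page_cost + rows * cpu_tuple_cost

def parallel_cost(rows, pages, workers,
                  setup=1000.0, per_tuple=0.1,
                  cpu_tuple_cost=0.01, seq_page_cost=1.0):
    # Scan work is split across leader + workers; the Gather adds
    # setup cost plus a transfer cost per row sent to the leader.
    scan = (pages * seq_page_cost + rows * cpu_tuple_cost) / (workers + 1)
    return setup + rows * per_tuple + scan

rows, pages = 100_000, 1_000               # a medium-sized table
serial = serial_cost(rows, pages)                       # 2000.0
default = parallel_cost(rows, pages, workers=2)          # ~11667: parallel loses
tuned = parallel_cost(rows, pages, workers=2,
                      setup=100.0, per_tuple=0.01)       # ~1767: parallel wins
```

With the default costs the per-tuple transfer charge dominates and the serial plan wins; lowering both thresholds flips the comparison without touching the table or the query.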

Per-Session Overrides

For mixed workloads, set parallelism based on the connection type:

-- Reporting query: maximum parallelism
-- (SET LOCAL only takes effect inside a transaction block)
BEGIN;
SET LOCAL max_parallel_workers_per_gather = 8;
SET LOCAL parallel_setup_cost = 0;
SET LOCAL parallel_tuple_cost = 0;

SELECT region, count(*), avg(order_total)
FROM orders
GROUP BY region;
COMMIT;

-- OLTP session: disable parallelism for the whole session
SET max_parallel_workers_per_gather = 0;

Per-Table Settings

For critical large tables:

-- Guarantee up to 8 workers for scans on this table
ALTER TABLE orders SET (parallel_workers = 8);

This overrides the planner's automatic worker count calculation, though the number actually launched is still capped by max_parallel_workers_per_gather and by available slots under max_parallel_workers.

What Parallelizes (and What Doesn't)

Operation                        Parallel?
Sequential Scan                  Yes
B-tree Index Scan                Yes
Bitmap Heap Scan                 Yes
Hash Join                        Yes
Merge Join                       Yes
Nested Loop                      Yes (outer side)
Aggregate (count, sum, avg)      Yes
CREATE INDEX (B-tree)            Yes
Append (UNION ALL, partitions)   Yes
UPDATE, DELETE                   No
CTEs (WITH queries)              No
Cursors                          No
FOR UPDATE/SHARE                 No

The Memory Multiplication Trap

work_mem applies per process, and each parallel worker gets its own allowance on top of the leader's. A query with work_mem = 256MB and 4 parallel workers can consume 5 x 256 MB = 1.28 GB for sorting and hashing. Budget accordingly:

max_connections * (max_parallel_workers_per_gather + 1) * work_mem < available RAM

This catches people who increase parallelism without accounting for memory.
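A quick sanity check of that budget (a hypothetical helper, MB units; real usage is usually lower, since work_mem is a per-sort/per-hash ceiling, not a guaranteed allocation):

```python
def worst_case_mb(work_mem_mb, workers_per_gather, sessions):
    # Each worker gets its own work_mem allowance on top of the
    # leader's, so one gather can use (workers + 1) * work_mem.
    return sessions * work_mem_mb * (workers_per_gather + 1)

# The example from the text: one query, work_mem = 256MB, 4 workers.
print(worst_case_mb(256, 4, sessions=1))    # -> 1280 (MB, ~1.28 GB)

# 100 concurrent sessions at these settings could, in the worst
# case, ask for 128,000 MB -- far more RAM than most servers have.
print(worst_case_mb(256, 4, sessions=100))  # -> 128000
```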

Verify the Impact

Compare sequential vs parallel execution:

-- Disable parallelism
SET LOCAL max_parallel_workers_per_gather = 0;
EXPLAIN (ANALYZE, BUFFERS, TIMING)
SELECT customer_region, count(*), sum(order_total)
FROM orders
GROUP BY customer_region;

-- Enable parallelism
SET LOCAL max_parallel_workers_per_gather = 4;
EXPLAIN (ANALYZE, BUFFERS, TIMING)
SELECT customer_region, count(*), sum(order_total)
FROM orders
GROUP BY customer_region;

The parallel plan should show roughly sequential_time / (1 + num_workers) execution time, with 60-80% of theoretical speedup typical due to Gather overhead.
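Applied to the intro's 25-second query, that rule gives a concrete target (a back-of-envelope helper using the 60-80% efficiency range as an assumption):

```python
def expected_runtime_s(serial_s, workers, efficiency=0.7):
    # Ideal speedup counts the leader as a participant; the ideal
    # time is then discounted for Gather overhead (60-80% of the
    # theoretical speedup is typical, per the text).
    ideal = serial_s / (1 + workers)
    return ideal / efficiency

t = expected_runtime_s(25.0, workers=4)
# Ideal would be 5.0 s; at 70% efficiency, roughly 7.1 s.
```

If the measured parallel runtime is far above this range, look for a non-parallel node in the plan (a CTE, a cursor, or a serial join side) swallowing the benefit.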

Prevention Strategy

OLAP databases: aggressive parallelism. max_parallel_workers_per_gather = CPU_cores / 2, max_parallel_workers = CPU_cores, lower cost thresholds.

OLTP databases: keep defaults or disable. Many short concurrent queries don't benefit -- worker overhead exceeds speedup on small queries.

Mixed workloads: per-connection settings. Reporting connections get high parallelism. App connections get zero.

Monitor Workers Launched vs Workers Planned. Consistent shortfall means you need more max_parallel_workers. If CPU hits 100% during parallel queries and other sessions slow down, reduce max_parallel_workers_per_gather.


Originally published at mydba.dev/blog/postgres-parallel-query
