Piter Adyson

10 PostgreSQL performance tuning tips that actually work in production

Performance tuning isn't about following a checklist. It's about understanding what's actually slowing down your database and fixing those specific problems. These are techniques that consistently deliver real improvements in production environments. Some of them are obvious but frequently misconfigured. Others are less known but surprisingly effective.

The tips in this article cover:

  • Memory configuration (shared_buffers, work_mem)
  • Index strategy and maintenance
  • Connection management
  • Vacuum and maintenance tuning
  • Query optimization techniques

1. Configure shared_buffers properly

PostgreSQL uses shared_buffers to cache frequently accessed data in memory. The default setting is usually way too low for production workloads. Setting this value correctly can dramatically reduce disk I/O and improve query performance.

The general recommendation is to set shared_buffers to about 25% of your total system RAM. If you have 16 GB of RAM, start with 4 GB. If you're on a dedicated database server with lots of memory, you can go higher, but there are diminishing returns above 8-10 GB because PostgreSQL also relies on the operating system's file cache.

-- In postgresql.conf
shared_buffers = 4GB

After changing this setting, you need to restart PostgreSQL. Monitor your cache hit ratio to see if the change helped. A cache hit ratio above 99% is good. You can check it with:

SELECT
  sum(heap_blks_read) as heap_read,
  sum(heap_blks_hit) as heap_hit,
  sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM pg_statio_user_tables;

2. Tune work_mem for complex queries

The work_mem setting controls how much memory PostgreSQL can use for internal sort operations and hash tables before it has to write to disk. If you're running complex queries with sorts, joins or aggregations, increasing work_mem can prevent expensive disk operations.

Be careful though. work_mem is allocated per operation, not per query. A complex query with multiple sorts can use work_mem several times over. If you set it too high and have many concurrent queries, you can run out of memory.

Start conservative. The default is usually 4 MB. Try 16-64 MB for analytical workloads. For specific heavy queries, you can increase it temporarily in the session:

SET work_mem = '256MB';
SELECT * FROM large_table ORDER BY some_column;
RESET work_mem;

Monitor with pg_stat_statements (look for high temp_blks_written) to see which queries spill sorts to disk, or run EXPLAIN ANALYZE and look for "Sort Method: external merge" in the output. Those queries are candidates for work_mem tuning.
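
A quick sketch of what that looks like, reusing the hypothetical large_table query from above; the sizes in the comments are illustrative:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM large_table ORDER BY some_column;

-- With work_mem too low, the plan typically reports something like:
--   Sort Method: external merge  Disk: 102400kB
-- After SET work_mem = '256MB', the same sort may fit in memory:
--   Sort Method: quicksort  Memory: 65536kB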

3. Add the right indexes

Indexes speed up reads but slow down writes. The trick is finding the right balance. Start by identifying slow queries using pg_stat_statements or your query logs. Look at queries with high execution time or high call counts.

For most cases, B-tree indexes work well. Create indexes on columns used in WHERE clauses, JOIN conditions and ORDER BY statements. But don't go overboard. Every index adds overhead during INSERTs and UPDATEs.

| Index Type | Best For | When to Use |
|---|---|---|
| B-tree | General purpose, equality and range queries | Most common scenarios, default choice |
| GIN | Full-text search, JSONB, arrays | Searching within complex data types |
| GiST | Geometric data, full-text search | Spatial queries, complex searches |
| BRIN | Very large tables with natural ordering | Time-series data, append-only tables |

Use EXPLAIN ANALYZE to verify your indexes are actually being used:

EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';

If you see a Seq Scan when you expected an Index Scan, something's wrong. Maybe the index doesn't exist, or PostgreSQL thinks it's not worth using (which happens on small tables or when selecting most of the table).
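
If the index is simply missing, adding one is straightforward. A minimal sketch, assuming the users table from the query above plus a hypothetical orders table (customer_id and status are illustrative columns); CONCURRENTLY avoids blocking writes while the index builds:

-- Single-column index for the email lookup above
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);

-- Composite index for queries filtering and sorting on two columns together
CREATE INDEX CONCURRENTLY idx_orders_customer_created
    ON orders (customer_id, created_at);

-- Partial index when only a small subset of rows is ever queried
CREATE INDEX CONCURRENTLY idx_orders_pending
    ON orders (created_at)
    WHERE status = 'pending';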

4. Use connection pooling

Every PostgreSQL connection has overhead. Opening and closing connections repeatedly wastes resources. If your application creates a new database connection for each request, you're probably experiencing unnecessary latency and resource consumption.

Connection poolers like PgBouncer sit between your application and PostgreSQL. They maintain a pool of connections and reuse them across multiple client requests. This reduces connection overhead significantly.

PgBouncer supports three pooling modes:

  • Session pooling keeps a connection for the entire client session
  • Transaction pooling releases connections after each transaction (more efficient for web apps)
  • Statement pooling releases after each statement (use with caution, limited functionality)

For most web applications, transaction pooling works well. Install PgBouncer, point your application to it instead of directly to PostgreSQL and configure the pool size based on your workload. A good starting point is 2-3 connections per CPU core on your database server.
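
A minimal pgbouncer.ini sketch for transaction pooling; the database name, addresses and pool sizes below are placeholders to adapt to your environment:

; pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20

Then point your application's connection string at port 6432 instead of 5432.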

5. Analyze and vacuum regularly

PostgreSQL uses MVCC (Multi-Version Concurrency Control) which creates row versions when you update or delete data. Over time, dead rows accumulate. VACUUM removes these dead rows and frees up space. ANALYZE updates statistics that the query planner uses to make decisions.

Modern PostgreSQL versions have autovacuum enabled by default, but it might not be aggressive enough for high-write workloads. If you're seeing table bloat or degraded query performance over time, you probably need to tune autovacuum settings.

-- In postgresql.conf
autovacuum_vacuum_scale_factor = 0.1  -- Vacuum when 10% of table is dead rows
autovacuum_analyze_scale_factor = 0.05  -- Analyze when 5% has changed
autovacuum_naptime = 30s  -- Check for work every 30 seconds

For very active tables, you can also set table-specific settings:

ALTER TABLE your_busy_table SET (autovacuum_vacuum_scale_factor = 0.05);

Check for bloat using queries from pg_stat_user_tables. If you see tables with high n_dead_tup, autovacuum isn't keeping up.
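
A simple check against pg_stat_user_tables, listing the tables with the most dead rows and when autovacuum last touched them:

SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;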

6. Optimize your queries

Sometimes the database configuration is fine, but the queries themselves are inefficient. Use EXPLAIN ANALYZE to understand query execution plans. Look for sequential scans on large tables, nested loops with high costs or sorts that spill to disk.

Common query optimizations include:

  • Adding WHERE clauses to filter data early
  • Using JOIN instead of subqueries when appropriate
  • Avoiding SELECT * and only fetching columns you need
  • Using LIMIT when you don't need all results
  • Avoiding functions on indexed columns in WHERE clauses

Here's an example of a problematic query pattern:

-- Bad: Function on indexed column prevents index usage
SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2026;

-- Good: Range comparison allows index usage
SELECT * FROM orders WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01';

Also consider using prepared statements. They're parsed and planned once, then executed multiple times with different parameters. This reduces overhead for frequently executed queries.
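
A minimal sketch at the SQL level, assuming a hypothetical orders table; most drivers and ORMs expose the same mechanism through parameterized queries:

-- Parsed once; the plan can be reused across executions
PREPARE orders_by_customer (bigint) AS
    SELECT id, status, total
    FROM orders
    WHERE customer_id = $1;

-- Executed many times with different parameters
EXECUTE orders_by_customer(42);
EXECUTE orders_by_customer(43);

DEALLOCATE orders_by_customer;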

7. Partition large tables

If you have tables with millions or billions of rows, partitioning can improve performance and manageability. PostgreSQL's declarative partitioning splits a large table into smaller physical pieces based on ranges, lists or hash values.

Time-based partitioning is common for logs or event data. You create partitions by month or year, and older partitions can be archived or dropped easily. Queries that filter by the partition key only scan relevant partitions, not the entire table.

CREATE TABLE events (
    id BIGSERIAL,
    event_type TEXT,
    created_at TIMESTAMPTZ,
    data JSONB
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2026_01 PARTITION OF events
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE events_2026_02 PARTITION OF events
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
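
Retention then becomes a cheap metadata operation instead of a huge DELETE. A sketch for removing a partition that has aged out:

-- Detach first if you want to archive the data before removing it
ALTER TABLE events DETACH PARTITION events_2026_01;
DROP TABLE events_2026_01;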

Partitioning also makes backups more manageable. Instead of backing up one massive table, you can back up or restore individual partitions. Most PostgreSQL backup tools handle partitioned tables automatically, treating each partition appropriately during backup and restore operations.

8. Enable query logging for slow queries

You can't optimize what you can't measure. PostgreSQL's slow query log captures queries that exceed a specified duration. This helps you identify problematic queries in production without impacting performance significantly.

-- In postgresql.conf
log_min_duration_statement = 1000  -- Log queries taking more than 1 second
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '

The log will show you the full query text, execution time and context. Combine this with pg_stat_statements for aggregated statistics across all queries. You'll quickly see which queries are consuming the most resources.
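
A typical starting query against pg_stat_statements (the extension must be loaded via shared_preload_libraries and created in the database; the column names shown are for PostgreSQL 13 and later):

-- Top queries by total time spent executing
SELECT query,
       calls,
       round(total_exec_time::numeric, 2) AS total_ms,
       round(mean_exec_time::numeric, 2) AS mean_ms,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;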

For production systems, start with a higher threshold (1-5 seconds) to avoid excessive logging. Once you've optimized the obvious slow queries, you can lower it to catch smaller issues.

9. Use read replicas for reporting workloads

If you're running heavy analytical queries or reports on your primary database, they can interfere with transactional workloads. Read replicas solve this by offloading read-only queries to separate servers.

PostgreSQL's streaming replication creates one or more standby servers that continuously apply changes from the primary. Your application can send SELECT queries to these replicas, distributing the load.

Setting up replication requires some configuration but it's straightforward:

| Configuration | Primary Server | Replica Server |
|---|---|---|
| wal_level | replica or logical | N/A |
| max_wal_senders | Number of replicas + 1 | N/A |
| hot_standby | N/A | on |

The replica will lag slightly behind the primary (typically milliseconds to seconds). If your application can tolerate this, replicas are a cheap way to scale read capacity.
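
You can measure that lag directly; a sketch:

-- On the replica: time since the last replayed transaction
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;

-- On the primary: per-replica replication state and WAL positions
SELECT client_addr, state, sent_lsn, replay_lsn
FROM pg_stat_replication;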

You can also use replicas for backup purposes. Taking backups from a replica instead of the primary reduces load on your production database.

10. Monitor and adjust autovacuum costs

Autovacuum runs in the background to clean up dead rows, but it can consume I/O and CPU resources. If autovacuum runs too aggressively, it can slow down your application queries. If it doesn't run enough, tables bloat and performance degrades.

The cost-based vacuum delay system controls how aggressively autovacuum uses resources. By default, it's fairly conservative. On modern hardware with SSDs, you can usually make it more aggressive:

-- In postgresql.conf
autovacuum_vacuum_cost_delay = 2ms  -- Lower = faster vacuum
autovacuum_vacuum_cost_limit = 2000  -- Higher = more work per cycle

For specific high-write tables, you might need custom settings:

ALTER TABLE busy_table SET (autovacuum_vacuum_cost_delay = 0);

Setting cost_delay to 0 removes throttling entirely for that table. Use this carefully and monitor I/O.

Watch the pg_stat_all_tables view for tables where autovacuum is falling behind (last_autovacuum is old and n_dead_tup is high). Those tables need tuning.

Putting it all together

Performance tuning is iterative. Start by measuring your current state with pg_stat_statements and query logs. Identify the biggest bottlenecks first. A few slow queries might account for 80% of your database load.

Apply one change at a time and measure the results. What works for one workload might not work for another. OLTP systems (lots of small transactions) need different tuning than OLAP systems (complex analytical queries).

Before making any changes, establish a baseline:

  • Current query response times (p50, p95, p99)
  • Cache hit ratio and buffer usage
  • Connection counts and wait times (see the pg_stat_activity query after this list)
  • Disk I/O and CPU utilization
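
For the connection baseline, pg_stat_activity gives a quick snapshot; a sketch:

-- Connections grouped by state (active, idle, idle in transaction, ...)
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;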

Keep your PostgreSQL version updated. Each release includes performance improvements and better defaults. PostgreSQL 17 and 18 have significantly better query planning and execution than older versions.

| Tuning Area | Impact | Difficulty | When to Do It |
|---|---|---|---|
| Indexes | High | Low | Early, based on query patterns |
| shared_buffers | High | Low | During initial setup |
| Connection pooling | High | Medium | When connections become bottleneck |
| Partitioning | Medium | High | When tables exceed 50-100 million rows |
| Autovacuum tuning | Medium | Medium | When seeing table bloat |
| Read replicas | High | High | When reads exceed write capacity |

And remember: backups don't fix performance problems, but they let you experiment safely. Before making major changes, ensure you have reliable backups. Tools like Databasus offer automated PostgreSQL backups with flexible scheduling and multiple storage options, for both small projects and large enterprises.

These tuning techniques work because they address real bottlenecks: memory usage, disk I/O, connection overhead and query efficiency. Apply them based on your specific bottlenecks, not just because they're on a list.
