DEV Community

Cover image for Running Postgres at Scale: Lessons Learned
Samson Tanimawo
Samson Tanimawo

Posted on

Running Postgres at Scale: Lessons Learned

We run Postgres for a product with millions of users. Along the way I've broken it in every possible way. Here are the lessons I wish I'd known on day one.

Autovacuum is not optional

You can ignore autovacuum for a while. You cannot ignore it forever. Dead tuples accumulate. Query plans go bad. Eventually a query that used to take 10ms takes 3 seconds and nobody knows why.

Tune autovacuum earlier than you think. autovacuum_vacuum_scale_factor = 0.05 on big tables is a good default.

Connection pooling is not optional

Postgres connections are expensive. Every connection holds memory and a worker process. You will run out.

Use PgBouncer or equivalent. Set pool size conservatively. Your app might want 500 connections; Postgres can happily handle 50 if you pool properly.

Long-running transactions are silent killers

A transaction that's been open for 2 hours prevents vacuum from cleaning tuples newer than its start time. Your table bloats. Your queries slow down. You blame the database.

Alert on pg_stat_activity.xact_start < now() - interval '10 minutes'. Hunt and kill long transactions before they bite you.

The query planner is not magic

It's a cost estimator. It can be wrong. When you see a query doing a sequential scan that should use an index, the planner chose sequential because its estimate said it was cheaper. Sometimes the estimate is wrong.

Fix: ANALYZE regularly, increase default_statistics_target for large tables, and don't be afraid to use SET enable_seqscan = off as a debug tool.

Backups you haven't restored are not backups

Practice the restore. Monthly. On real data volume. The first time I tried to restore our 800GB production backup, it took 11 hours. That's a useful thing to know before the outage.

Postgres is incredibly forgiving. But only to people who respect it.


Written by Dr. Samson Tanimawo
BSc · MSc · MBA · PhD
Founder & CEO, Nova AI Ops. https://novaaiops.com

Top comments (0)