PostgreSQL handles concurrent reads and writes through a system called MVCC (Multi-Version Concurrency Control). It's what allows multiple transactions to see consistent snapshots of data without blocking each other. But this comes with a cost: deleted and updated rows don't just disappear. They leave behind dead tuples that accumulate over time, wasting space and slowing down queries.
VACUUM is PostgreSQL's way of cleaning that up. It's not optional, and it's not something you can ignore in production. If you misunderstand how it works, you'll eventually run into table bloat, query degradation, or in extreme cases a full database shutdown.
Here are five things that matter most about VACUUM.
## 1. What VACUUM actually does
When you delete or update a row in PostgreSQL, the old version isn't removed immediately. Instead, it's marked as dead but left in place on the disk. This is because other transactions might still need that version, depending on when they started. MVCC needs those old versions to exist until no active transaction could possibly reference them.
VACUUM's job is to scan tables, identify tuples that are no longer visible to any transaction and mark that space as reusable. It doesn't return space to the operating system. It just makes those pages available for future inserts and updates within the same table.
This distinction matters. After running VACUUM on a heavily-updated table, the file size on disk often stays the same. The space is reclaimed logically, not physically.
| Operation | What happens to dead rows | Space returned to OS? |
|---|---|---|
| DELETE | Marked as dead, stays on disk | No |
| VACUUM | Marks dead space as reusable | No |
| VACUUM FULL | Rewrites the table entirely | Yes |
A table with heavy churn — lots of updates and deletes — will accumulate dead tuples fast. VACUUM needs to run frequently enough to keep that under control.
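You can see this bloat directly with the `pgstattuple` contrib extension, which reports the live, dead, and free percentages of a table's on-disk file. A sketch, assuming the extension is available and using a hypothetical `orders` table:

```sql
-- pgstattuple is a contrib module; it may need to be installed first
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Show what fraction of the table file holds live tuples, dead tuples,
-- and reusable free space ("orders" is a placeholder table name)
SELECT tuple_percent, dead_tuple_percent, free_percent
FROM pgstattuple('orders');
```

After a regular VACUUM you'd expect `dead_tuple_percent` to drop and `free_percent` to rise, while the file size itself stays put.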
## 2. VACUUM vs VACUUM FULL
Regular VACUUM is lightweight. It runs concurrently with reads and writes, marks dead space as reusable, and updates internal structures like the visibility map and free space map. It holds minimal locks, so it usually won't block your application.
VACUUM FULL is different. It rewrites the entire table into a new file, compacting everything and returning unused space to the OS. The catch is that it acquires an exclusive lock on the table for the entire duration. Your application cannot read or write to that table while VACUUM FULL is running.
For most tables, you should never need VACUUM FULL. If your table stabilizes at some size after bloat and regular VACUUM keeps up with dead tuple removal, there's no reason to run it. The space that VACUUM marks as reusable will be used again by future writes.
VACUUM FULL makes sense in specific situations: after a bulk delete that removes a large percentage of rows, when you need to actually shrink the file on disk, or when a table has become so bloated that query performance is noticeably degraded and regular VACUUM can't fix it fast enough.
If you need VACUUM FULL's compaction effect without the full table lock, look at pg_repack. It's an extension that rebuilds tables and indexes online, keeping the table available the entire time.
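A minimal invocation looks like this, assuming the pg_repack extension is installed in the target database and using placeholder database and table names:

```shell
# Rebuild the "orders" table online, compacting it without an
# exclusive lock ("mydb" and "orders" are placeholders)
pg_repack --dbname=mydb --table=orders
```

pg_repack briefly takes locks at the start and end of the rebuild, so it's not entirely lock-free, but the table stays readable and writable for the bulk of the operation.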
## 3. How autovacuum works and why you should tune it
PostgreSQL ships with autovacuum enabled by default. It runs in the background and automatically vacuums tables when they accumulate enough dead tuples. For most workloads it works well out of the box, but it has limits worth knowing.
Autovacuum decides when to vacuum a table based on two thresholds:
- `autovacuum_vacuum_threshold` — minimum number of dead tuples before triggering (default 50)
- `autovacuum_vacuum_scale_factor` — fraction of the table's rows added on top (default 0.2, meaning 20%)
So for a table with 1 million rows, autovacuum triggers after 50 + (1,000,000 × 0.2) = 200,050 dead tuples. That's a lot of bloat to let build up before cleaning.
For large, high-churn tables you'll want to lower these values at the table level:
```sql
ALTER TABLE orders SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_vacuum_threshold = 100
);
```
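Per-table overrides like these are stored in `pg_class.reloptions`, so you can check which tables have them:

```sql
-- List tables carrying per-table storage/autovacuum overrides
SELECT relname, reloptions
FROM pg_class
WHERE reloptions IS NOT NULL;
```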
The number of autovacuum worker processes is controlled by autovacuum_max_workers (default 3). If you have many tables with heavy write load, those 3 workers can fall behind. Increasing this helps, but each worker uses memory and CPU.
One more thing: autovacuum won't run while a long-running transaction holds an old snapshot. If you have queries or idle transactions open for hours, they block VACUUM from reclaiming space. That dead tuple pile just keeps growing.
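Finding those offenders is straightforward with `pg_stat_activity`. A sketch, using one hour as an arbitrary threshold:

```sql
-- Transactions open longer than an hour; these pin old snapshots
-- and prevent VACUUM from reclaiming dead tuples
SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE xact_start < now() - interval '1 hour'
ORDER BY xact_start;
```

Watch in particular for sessions in the `idle in transaction` state: they do no work but still hold their snapshot open.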
## 4. VACUUM ANALYZE and the visibility map
Running VACUUM ANALYZE is more useful than running each separately. VACUUM cleans dead tuples; ANALYZE collects statistics about column distributions that the query planner uses to choose execution plans. Tables with stale statistics get bad plans, which usually shows up as sequential scans where index scans should be used.
The visibility map is a structure PostgreSQL maintains alongside each table. It tracks which pages contain only tuples that are visible to all active transactions. Once a page is marked all-visible, index-only scans can skip fetching the actual heap page to check visibility. That's a significant performance win for read-heavy workloads.
VACUUM updates the visibility map as it scans pages. Tables that haven't been vacuumed recently won't have an up-to-date visibility map, so index-only scans fall back to heap fetches.
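You can gauge how current the visibility map is from `pg_class`, which tracks the page counts VACUUM maintains (`orders` is a placeholder table name):

```sql
-- Fraction of a table's pages marked all-visible; low values mean
-- index-only scans will fall back to extra heap fetches
SELECT relname,
       relallvisible,
       relpages,
       round(100.0 * relallvisible / greatest(relpages, 1), 1) AS pct_all_visible
FROM pg_class
WHERE relname = 'orders';
```

In `EXPLAIN (ANALYZE)` output, a high "Heap Fetches" count on an Index Only Scan node is the symptom of a stale visibility map.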
A useful query for checking vacuum status:
```sql
-- Check last vacuum and autovacuum times per table
SELECT
  schemaname,
  relname,
  last_vacuum,
  last_autovacuum,
  n_dead_tup,
  n_live_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```
Tables with high n_dead_tup relative to n_live_tup are candidates for manual VACUUM or threshold tuning.
## 5. Transaction ID wraparound
This one can actually take down your database if you ignore it long enough.
PostgreSQL assigns a transaction ID (XID) to every transaction. These IDs are used to determine which tuples are visible to which transactions. The problem is that XIDs are 32-bit integers — they wrap around after about 2 billion transactions.
PostgreSQL handles this by "freezing" old tuples. Frozen tuples are considered visible to all future transactions regardless of XID. VACUUM is responsible for freezing tuples that are old enough.
The parameters involved:
| Parameter | Default | Meaning |
|---|---|---|
| vacuum_freeze_min_age | 50,000,000 | Minimum XID age before a tuple can be frozen |
| vacuum_freeze_table_age | 150,000,000 | Table age at which VACUUM does a full freeze scan |
| autovacuum_freeze_max_age | 200,000,000 | Age at which autovacuum forces a wraparound-prevention vacuum |
When a database gets within 3 million transactions of wraparound, PostgreSQL stops assigning new transaction IDs, effectively making the database read-only to prevent data corruption. It will refuse writes until you run VACUUM on the affected database. That's not a situation you want to deal with at 2 AM.
You can check where your databases stand:
```sql
SELECT
  datname,
  age(datfrozenxid) AS xid_age,
  2147483648 - age(datfrozenxid) AS xids_remaining
FROM pg_database
ORDER BY xid_age DESC;
```
If xid_age is climbing toward 200 million, your autovacuum settings may be too conservative, or autovacuum is being blocked by long-running transactions. Both need investigation.
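When one database's age looks worrying, you can drill down to the tables driving it, since the database-level `datfrozenxid` is only as old as its oldest table:

```sql
-- The ten tables with the oldest relfrozenxid in the current database
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 10;
```

A targeted `VACUUM FREEZE` on those tables advances the database's frozen horizon without vacuuming everything.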
## Keeping your PostgreSQL healthy
VACUUM isn't a fire-and-forget thing. It's an ongoing maintenance task that requires attention as your data volume and write patterns change. Regular VACUUM handles day-to-day dead tuple cleanup. Autovacuum tuning ensures the right tables get vacuumed often enough. And monitoring XID age keeps you away from the one failure mode that's truly painful to deal with.
Database health includes more than just VACUUM. Automated PostgreSQL backup is another layer of that maintenance — Databasus, for example, handles scheduled backups to S3, Google Drive, FTP and other storage backends, with encryption, retention policies and failure notifications.
For VACUUM specifically, the main takeaways are: don't disable autovacuum, tune thresholds for tables with high write volume, watch for long-running transactions that block vacuum, and keep an eye on XID age in production.