Franck Pachot

Posted on May 26

The Autovacuum Scale Factor Problem at Scale - Know Your Defaults

#postgres #sql #database #performane

In PostgreSQL, autovacuum and autoanalyze exist to clean up dead tuples (old versions of updated/deleted rows) and update query planner statistics, respectively. The challenge is running them frequently enough so that query plans and execution do not degrade after data modifications, but not so frequently as to cause excessive I/O overhead.

Databases often maintain a counter of the number of modifications to trigger these background jobs. Oracle Database and MySQL use a stale percentage (the ratio of modifications to total rows) for statistics gathering. SQL Server uses a dynamically decreasing percentage to ensure statistics do not remain stale for too long on massive tables. PostgreSQL uses a hybrid approach: a fixed base threshold combined with a scale factor (a percentage) that grows proportionally with the table size.

This hybrid approach hits the sweet spot for most workloads, but it often requires tuning based on your specific data. The key factor to watch is the amount of static, "cold" data in your tables. Because the scale factor is calculated against the total table size, a large volume of cold data will significantly inflate the threshold. This can delay maintenance on the active working set—the "hot" data actually used by your queries—leaving it vulnerable to stale statistics or bloat.

Here are the default base thresholds:

postgres=# \dconfig *autovacuum*threshold
        List of configuration parameters
             Parameter              |   Value
------------------------------------+-----------
 autovacuum_analyze_threshold       | 50
 autovacuum_vacuum_insert_threshold | 1000
 autovacuum_vacuum_max_threshold    | 100000000
 autovacuum_vacuum_threshold        | 50
(4 rows)

At first glance, this suggests tables are analyzed when 50 rows are modified, and vacuumed when 50 dead tuples accumulate (from deletes or updates) or 1,000 rows are inserted. But this is only true without the scale factor—10% for statistics, 20% for vacuum:

postgres=# \dconfig *autovacuum*scale_factor
       List of configuration parameters
               Parameter               | Value
---------------------------------------+-------
 autovacuum_analyze_scale_factor       | 0.1
 autovacuum_vacuum_insert_scale_factor | 0.2
 autovacuum_vacuum_scale_factor        | 0.2
(3 rows)

Because of the scale factor, the actual trigger thresholds increase with the size of the table. For the default settings, the formulas are:

Analyze when inserts or modifications > 10% table row count plus 50 rows
Vacuum when dead tuples (from DELETE or UPDATE) > 20% table row count plus 50 rows
Vacuum when inserts > 20% table row count plus 1000 rows

As these formulas show, a larger table requires a much larger accumulation of changes before maintenance fires. This is perfectly acceptable if data churn is uniformly distributed, as small changes across a massive dataset will not drastically impact query cost estimations.

However, data distribution is rarely uniform and evolves over time (e.g., seasonal sales spikes, market expanding to new countries). Because static data inflates the table row count in the formulas above, your database waits too long to trigger maintenance on the active working set.

This is the core problem with default autovacuum settings at scale: a table with 5 million rows can accumulate half a million stale modifications before the planner statistics are refreshed, and over a million dead tuples before bloat is cleaned up. The larger the table grows, the longer it waits, and the worse the situation becomes:

Query planner statistics become increasingly stale between analyzes.
The visibility map is stale for longer and index-only scans become less efficient.
Dead tuple bloat accumulates more between cleanups, wasting storage and slowing scans.

To demonstrate this, I have run the following script to simulate this kind of activity, constantly inserting 100 rows and then updating them. We delete nothing because we want to keep the history, but queries operate on those recent rows. Think of it like orders being entered, then processed, and remaining stored:

\c
\o tmp.log

-- run autovacuum frequently for the demo
alter system set autovacuum_naptime = '1s';
select pg_reload_conf();

-- create a table
drop table demo;
create table demo (
 id bigserial primary key, n int default 0
);
vacuum analyze demo
;

-- show the vacuum and analyze statistics,
-- insert 100 rows and update them
-- run that in a loop every 5 seconds
select relname, n_tup_ins, n_tup_upd , n_mod_since_analyze, n_ins_since_vacuum
 , autovacuum_count,  last_autovacuum  --, vacuum_count,  last_vacuum
 , autoanalyze_count, last_autoanalyze --, analyze_count, last_analyze
from pg_stat_user_tables where relid='demo'::regclass
\;
insert into demo select from generate_series(1,100)
\;
update demo set n=n+1 where id in (
select id from demo order by id desc limit 100
)
\watch i=5 c=100000

For each iteration, the total number of rows inserted (n_tup_ins) and updated (n_tup_upd), as visible in pg_stat_user_tables, increases by 100. It is the X-axis on this diagram (n_tup):

The Y-axis shows the staleness of statistics (n_mod_since_analyze) and the accumulation of dead tuples (n_mod_since_vacuum) until the auto vacuum/analyze kicks in.

With 5 million rows, the last million inserted rows accumulated dead tuples. That is 20% of the total table, as defined by the default vacuum scale factor, but it likely represents 100% of the data actively read by your queries (for example, if the application processes the last year or less of a 5-year history). Furthermore, the last 500,000 rows have completely stale statistics, the 10% default analyze scale factor, and the past months may not have the same data distribution as the previous years.

Think about the impact this has on the maximum value for an ID sequence or a created_at timestamp. It also completely skews the query planner's understanding of your data distribution (such as querying by country or day of the week). I have seen this cause severe performance issues in the real world: a retail company where shops only open on Sundays during the summer, or a trading platform suddenly processing entirely new market trends. Because the statistics are stale, the planner assumes your new, active data looks exactly like your old, historical data.

As the table grows, the impact of this bloat and staleness compounds, and performance will no longer scale. Eventually, your execution plans will flip—not because the queries changed, but simply because the estimations of the query planner are completely wrong.

For very large tables where the total size increases but the active working set is a small, predictable number of rows, you can effectively disable the scale factor and rely almost entirely on the fixed threshold:

ALTER TABLE demo SET (  
    autovacuum_analyze_scale_factor = 0.001,  
    autovacuum_analyze_threshold    = 10000,  
    autovacuum_vacuum_scale_factor  = 0.001,  
    autovacuum_vacuum_threshold     = 10000  
);

This sets a nearly flat threshold that does not grow with the table size. The right threshold value depends on how many rows your active working set changes per hour and how much staleness you can tolerate. However, you must monitor the consequences of running autovacuum frequently on a growing table to ensure it does not cause localized I/O spikes.

Here is how the same run starts with the new table settings:

Auto analyze never left more than ~10,000 modified rows without refreshing statistics. This threshold grows slightly with the table (at 10 million rows it doubles to 20,000), but remains vastly better than the default. Auto vacuum follows the same pattern for dead tuples, but runs more frequently here because the insert-specific vacuum trigger was left at its default (1,000 rows + 20% scale factor), which only triggers the analyze threshold beyond 45,000 rows.

To address this unbounded growth natively, recent PostgreSQL versions introduced autovacuum_vacuum_max_threshold (with a default of 100 million—which is too high for my example). Rather than letting the scale factor dictate an endlessly growing target, this parameter imposes a hard ceiling on the vacuum threshold calculation. This means that even on a massive 1-billion-row table, autovacuum will forcibly trigger once dead tuples reach the configured maximum, serving as a built-in safety net. You can even adjust this globally via a simple config reload, or set it as a per-table storage parameter to enforce a strict upper limit on staleness without micromanaging scale factors across your entire schema.

Naturally, enforcing stricter thresholds—whether through these new maximum caps or manual table-level tuning—means autovacuum will run more frequently, which demands more background worker capacity. Historically, increasing autovacuum_max_workers to handle this extra load required a full database restart. However, PostgreSQL now thoughtfully splits this architecture: autovacuum_worker_slots reserves the hard upper bound of backend slots at postmaster startup, while autovacuum_max_workers dictates how many of those slots are actively used. This allows you to dynamically scale up your active workers on the fly (ALTER SYSTEM SET autovacuum_max_workers = 8; SELECT pg_reload_conf();) to absorb heavy maintenance workloads without incurring any downtime.

Alternatively, if your table has a clear time-based or categorical boundary between hot and cold data, partitioning is worth considering. Autovacuum operates per partition, so a current_year partition with 100,000 rows will trigger maintenance far sooner than a monolithic 5-million-row table, meaning the default scale factor will naturally behave exactly as intended.

Top comments (2)

Yunetzi • May 27

Autovacuum scales with defaults—are we optimizing or just hoping?

Some comments may only be visible to logged-in visitors. Sign in to view all comments.