Materialized views are one of PostgreSQL’s most useful features for analytics and reporting workloads.
They solve a very common problem: some queries are simply too expensive to run repeatedly in real time.
Imagine a dashboard query that joins multiple large tables, aggregates millions of rows, and calculates metrics per customer. Running that query on every request can quickly become slow, expensive, and unpredictable under load.
A materialized view solves this by storing the query result physically on disk.
Instead of recalculating the query every time, PostgreSQL precomputes the result once and serves future reads directly from the stored data.
```sql
CREATE MATERIALIZED VIEW mv_ordersummary AS
SELECT
  company_id,
  COUNT(*) AS total_orders,
  SUM(amount) AS revenue
FROM orders
GROUP BY company_id;
```
Querying the materialized view is fast because the expensive computation already happened earlier.
But materialized views introduce a new problem:
the data becomes stale.
Unlike normal views, materialized views do not automatically update when source tables change. You must refresh them manually.
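A quick way to see the staleness in action (assuming the orders table has just the columns used in the view definition):

```sql
INSERT INTO orders (company_id, amount) VALUES (123, 50.00);

-- The base table reflects the write immediately...
SELECT count(*) FROM orders WHERE company_id = 123;

-- ...but the materialized view still serves the previously
-- computed totals until the next refresh
SELECT total_orders, revenue FROM mv_ordersummary WHERE company_id = 123;
```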
That is where things become complicated.
Refreshing a materialized view sounds simple at first:
```sql
REFRESH MATERIALIZED VIEW mv_ordersummary;
```
But in production systems, especially multi-tenant systems with continuous traffic, refresh behavior becomes an operational challenge.
- Large refreshes can take minutes.
- Refresh operations acquire locks.
- Users continue reading during refreshes.
- One failed refresh should not impact other tenants.
And at scale, dozens or hundreds of materialized views may refresh continuously.
In our case, every tenant has its own set of materialized views:
```
mv_ordersummary_123
mv_ordersummary_456
mv_ordersummary_789
```
Why Refreshing Materialized Views Is Harder Than It Looks
The problem is not creating the materialized view.
The problem is replacing old data with new data while traffic is actively reading from it.
A few realities make this difficult:
- Refreshes can take minutes on large datasets
- PostgreSQL refresh operations acquire locks
- Readers expect consistent results
- Multi-tenant systems amplify operational problems
- One bad tenant refresh should not break the entire refresh cycle
At small scale, REFRESH MATERIALIZED VIEW works fine.
At larger scale, you start caring about:
- lock duration
- transaction scope
- bloat
- validation
- observability
- failure isolation
- rollback safety
The strategy you choose depends on your workload characteristics.
Let’s walk through the main approaches PostgreSQL engineers typically use.
Approach 1: Plain REFRESH MATERIALIZED VIEW
The simplest option is the default PostgreSQL refresh behavior.
```sql
REFRESH MATERIALIZED VIEW mv_ordersummary_123;
```
How It Works
PostgreSQL rebuilds the materialized view contents in place.
During the refresh, PostgreSQL acquires an ACCESS EXCLUSIVE lock on the materialized view.
That lock blocks:
- reads
- writes
- concurrent refreshes
Readers wait until the refresh completes.
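You can watch this happen live. While a refresh runs, blocked readers show up in pg_stat_activity (pg_blocking_pids is available from PostgreSQL 9.6 onward):

```sql
-- Sessions currently waiting on a lock, and which PIDs block them
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```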
Pros
- Very simple.
- No special schema design required.
- No additional storage overhead.
Usually fast for smaller views.
Cons
The lock behavior is a dealbreaker for high-concurrency systems.
If your refresh takes 3 minutes, readers block for 3 minutes.
In practice, that means:
- API latency spikes
- request timeouts
- connection pool exhaustion
- cascading failures
This becomes especially painful in multi-tenant systems where refreshes happen continuously.
Even if each refresh is “only” 30 seconds, enough concurrent tenants create constant lock pressure.
Verdict
Good for:
- internal tools
- small datasets
- low read concurrency
- maintenance windows
Not acceptable for continuously queried production workloads.
Approach 2: REFRESH MATERIALIZED VIEW CONCURRENTLY
PostgreSQL provides a safer option:
```sql
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_ordersummary_123;
```
This is the first thing most engineers try after discovering lock contention.
How It Works
Instead of replacing the materialized view contents in place, PostgreSQL builds the new result in a temporary table, compares it against the current contents using the unique index, and applies only the differences as row-level inserts and deletes.
Readers can continue querying during the refresh.
Pros
- No blocking reads.
This is the biggest advantage.
For many workloads, this alone is sufficient.
Cons
There are important caveats.
- Requires a UNIQUE index
PostgreSQL requires at least one unique index on the materialized view that covers all rows and uses only column names (partial and expression indexes do not qualify).

```sql
-- The grouping key is the natural unique key of this view
CREATE UNIQUE INDEX idx_mv_ordersummary_123
ON mv_ordersummary_123 (company_id);
```
Not every materialized view naturally has a stable unique key.
Sometimes you end up manufacturing synthetic uniqueness just to satisfy refresh requirements.
- Still one massive transaction
The entire concurrent refresh runs in one transaction.
For large views, that means:
- long transaction lifetimes
- large WAL generation
- vacuum delays
- replication lag risk
- Slower than regular refresh
Concurrent refreshes are usually slower than standard refreshes because PostgreSQL must maintain visibility guarantees while comparing old and new rows.
- Can create bloat
Frequent concurrent refreshes can generate table and index bloat over time.
You often need aggressive autovacuum tuning or periodic maintenance.
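Autovacuum can be tuned per relation through storage parameters; the values below are illustrative starting points, not recommendations:

```sql
-- Vacuum and analyze this view more aggressively than the global defaults
ALTER MATERIALIZED VIEW mv_ordersummary_123
  SET (autovacuum_vacuum_scale_factor = 0.05,
       autovacuum_analyze_scale_factor = 0.05);
```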
Verdict
A strong default choice when:
- you can define a stable unique index
- refresh duration is acceptable
- storage churn is manageable
But it still couples refresh execution to the live object itself.
That limitation matters at scale.
Approach 3: Incremental Refresh with Triggers or pg_ivm
The next level is incremental maintenance.
Instead of rebuilding the entire materialized view, you update only the changed rows.
This is the idea behind extensions like pg_ivm.
How It Works
Base table writes trigger incremental updates to the materialized view.
Conceptually:
```
INSERT INTO orders
  -> trigger fires
  -> materialized aggregate updated
```
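With pg_ivm, for example, an incrementally maintained view is created through a function call rather than plain CREATE MATERIALIZED VIEW. The create_immv call below follows the extension's documented usage; verify the details against your installed version:

```sql
CREATE EXTENSION IF NOT EXISTS pg_ivm;

-- Creates an incrementally maintained view; triggers on orders
-- keep the aggregates up to date on every write
SELECT create_immv(
  'mv_ordersummary_ivm',
  'SELECT company_id, COUNT(*) AS total_orders, SUM(amount) AS revenue
     FROM orders GROUP BY company_id'
);
```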
Pros
- Near real-time freshness.
- No expensive full rebuilds.
- No giant refresh windows.
- Excellent for low-latency analytical workloads.
Cons
The complexity increases dramatically.
- Write amplification
Every write to source tables now performs additional maintenance work.
Heavy OLTP systems can suffer significant write overhead.
- More operational complexity
Triggers become part of the critical write path.
That introduces:
- contention
- debugging complexity
- migration risk
- transactional edge cases
- SQL limitations
Incremental maintenance works best for simpler aggregation patterns.
Complex joins, window functions, or non-deterministic logic can become difficult or unsupported.
Verdict
Excellent for:
- real-time analytics
- event-heavy systems
- carefully controlled schemas
Overkill for many batch-oriented refresh pipelines.
Approach 4 (Recommended): Blue/Green Swap with Shadow Views
Instead of refreshing the live materialized view directly, we maintain a separate shadow refresh view.
For every logical view, we maintain three physical objects:
```
mv_ordersummary_123          -- live
mv_ordersummary_123_refresh  -- next candidate
mv_ordersummary_123_blue     -- previous generation
```
The key idea:
- refresh happens entirely outside the live object
- validation occurs before promotion
- promotion is an atomic rename swap
The expensive work is isolated from readers.
The Refresh Flow
Our refresh functionality executes roughly in this sequence:
1. Refresh the _refresh shadow view
2. Validate row count
3. Perform the atomic rename swap
4. Reset the stale refresh view
5. Update metadata
The critical insight is that the live object remains untouched until promotion.
Step 1: Refresh the Shadow View
```sql
REFRESH MATERIALIZED VIEW mv_ordersummary_123_refresh;
```
This may take minutes.
That is fine.
Nobody reads from _refresh.
The live materialized view continues serving traffic normally.
Step 2: Validate Before Promotion
We refuse to swap in obviously broken refreshes.
At minimum:
```sql
SELECT COUNT(*)
FROM mv_ordersummary_123_refresh;
```
If the row count is zero, we abort promotion.
This catches failures like:
- upstream ETL issues
- accidental filters
- empty joins
- bad migrations
Validation can be more sophisticated:
- row-count deltas
- checksum comparisons
- timestamp sanity checks
- aggregate thresholds
The important thing is validating before touching production readers.
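Beyond the zero-row check, a row-count delta guard might look like this (the 20% threshold and aliases are illustrative):

```sql
-- Compare the candidate against the live view;
-- any returned row means: skip the swap and alert
SELECT cand.cnt AS candidate_rows,
       live.cnt AS live_rows
FROM (SELECT count(*) AS cnt FROM mv_ordersummary_123_refresh) AS cand,
     (SELECT count(*) AS cnt FROM mv_ordersummary_123) AS live
WHERE cand.cnt NOT BETWEEN live.cnt * 0.8 AND live.cnt * 1.2;
```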
Step 3: Atomic 3-Way Rename Swap
This is the core pattern.
Inside one transaction:
```sql
BEGIN;

ALTER MATERIALIZED VIEW mv_ordersummary_123
  RENAME TO mv_ordersummary_123_blue;

ALTER MATERIALIZED VIEW mv_ordersummary_123_refresh
  RENAME TO mv_ordersummary_123;

ALTER MATERIALIZED VIEW mv_ordersummary_123_blue
  RENAME TO mv_ordersummary_123_refresh;

COMMIT;
```
The rename sequence is effectively atomic from the application's perspective.
Readers either see:
- the old live view
- or the new live view
Never a half-built state.
Why This Works Well
The expensive operation is the refresh.
The rename itself is extremely fast.
The brief ACCESS EXCLUSIVE lock during rename typically lasts milliseconds.
That is fundamentally different from holding the lock during a multi-minute refresh.
Visualizing the Swap
Before:

```
LIVE    -> mv_ordersummary_123
REFRESH -> mv_ordersummary_123_refresh
OLD     -> mv_ordersummary_123_blue
```

After refresh completes:

```
LIVE    -> old data
REFRESH -> new data
OLD     -> previous generation
```

After atomic rename transaction:

```
LIVE    -> new data
REFRESH -> old data
OLD     -> previous live generation name
```

Effectively:

```
live     -> _blue
_refresh -> live
_blue    -> _refresh
```
Readers continue querying the stable live name the entire time.
Step 4: Reset the Stale Refresh View
This is an important optimization.
After promotion, the old live view becomes the next cycle’s _refresh candidate.
We immediately clear it:
```sql
REFRESH MATERIALIZED VIEW mv_ordersummary_123_refresh
WITH NO DATA;
```
This does two things:
- frees stale contents
- guarantees the next refresh starts clean
Without this step, old data lingers indefinitely and can confuse debugging or validation logic.
The WITH NO DATA trick is underused and extremely useful in rotation-based refresh systems.
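One caveat worth knowing: a view reset WITH NO DATA errors on SELECT ("materialized view has not been populated") until its next refresh. The pg_matviews catalog view shows which state each object is in:

```sql
-- ispopulated = false means the view cannot be queried yet
SELECT matviewname, ispopulated
FROM pg_matviews
WHERE matviewname LIKE 'mv_ordersummary_123%';
```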
Step 5: Update Metadata
Finally:
```sql
UPDATE company_materialized_view_config
SET view_last_refreshed = NOW()
WHERE company_id = 123;
```
This powers:
- dashboards
- alerting
- freshness monitoring
- retry logic
Operational metadata matters as much as the refresh itself.
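The five steps above can be combined into a single orchestration routine. The sketch below is illustrative rather than our exact implementation: the function name and view-name pattern are assumptions, while the config table and column names follow the earlier example. Note that plain REFRESH (unlike REFRESH ... CONCURRENTLY) is allowed inside a function's transaction, and since nobody reads the shadow view, its refresh lock is harmless.

```sql
CREATE OR REPLACE FUNCTION refresh_and_promote(p_company_id bigint)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
  v_live    text := format('mv_ordersummary_%s', p_company_id);
  v_refresh text := v_live || '_refresh';
  v_blue    text := v_live || '_blue';
  v_rows    bigint;
BEGIN
  -- Step 1: refresh the shadow view (slow, but invisible to readers)
  EXECUTE format('REFRESH MATERIALIZED VIEW %I', v_refresh);

  -- Step 2: validate before promotion
  EXECUTE format('SELECT count(*) FROM %I', v_refresh) INTO v_rows;
  IF v_rows = 0 THEN
    RAISE EXCEPTION 'refusing to promote empty refresh for %', v_live;
  END IF;

  -- Step 3: atomic 3-way rename swap
  EXECUTE format('ALTER MATERIALIZED VIEW %I RENAME TO %I', v_live, v_blue);
  EXECUTE format('ALTER MATERIALIZED VIEW %I RENAME TO %I', v_refresh, v_live);
  EXECUTE format('ALTER MATERIALIZED VIEW %I RENAME TO %I', v_blue, v_refresh);

  -- Step 4: reset the demoted view so the next cycle starts clean
  EXECUTE format('REFRESH MATERIALIZED VIEW %I WITH NO DATA', v_refresh);

  -- Step 5: record freshness metadata
  UPDATE company_materialized_view_config
  SET view_last_refreshed = NOW()
  WHERE company_id = p_company_id;
END;
$$;
```

Because everything runs in one transaction, the rename locks are held until commit; keeping steps 4 and 5 cheap keeps that window short. An orchestrator can then loop over tenants, call `SELECT refresh_and_promote(123);` per tenant, and catch exceptions individually, so one failed refresh never blocks the others.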
Why Blue/Green Works So Well
This pattern solves several operational problems simultaneously.
1. **Readers Never Touch the Refresh Process**
The live object remains stable during expensive refresh operations.
That isolation is the biggest win.
2. **Validation Happens Before Exposure**
You can reject bad refreshes safely.
That alone justifies the extra complexity.
3. **Failures Are Easy to Contain**
If refresh fails:
- log it
- skip promotion
- continue processing other tenants
The existing live view remains untouched.
4. **Promotion Is Fast**
The swap itself is metadata-only.
That minimizes lock duration dramatically.
Tradeoffs
No solution is free.
- **Approximately 2x Storage**
You effectively maintain duplicate materialized views.
For very large datasets, this matters.
- **Naming Conventions Become Important**
Your orchestration layer must understand:
  - _refresh
  - _blue
  - live names
You need deterministic naming and discovery logic.
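Discovery can be as simple as a catalog query over pg_matviews; the regex below assumes the naming pattern used throughout this article:

```sql
-- Find all live tenant views (excludes _refresh and _blue generations)
SELECT schemaname, matviewname
FROM pg_matviews
WHERE matviewname ~ '^mv_ordersummary_[0-9]+$';
```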
- **Brief Rename Locks Still Exist**
The rename transaction acquires ACCESS EXCLUSIVE.
But the duration is tiny compared to full refresh locking.
Milliseconds instead of minutes is usually an acceptable trade.
Final Thoughts
Refreshing materialized views is easy in development environments.
Refreshing them safely in production is not.
As datasets grow and concurrency increases, refresh strategy becomes an operational architecture decision rather than just a SQL command.
There is no universally correct approach:
- plain refresh is simplest
- concurrent refresh improves availability
- incremental maintenance provides real-time freshness
- blue/green swaps maximize isolation and validation safety
For our workload, the blue/green pattern provided the best balance between simplicity, reliability, and operational safety.
The core idea is straightforward:
Never rebuild the object your users are actively reading.
Refresh elsewhere.
Validate.
Promote atomically.