Materialized views are one of PostgreSQL’s most useful features for analytics and reporting workloads.
They solve a very common problem: some queries are simply too expensive to run repeatedly in real time.
Imagine a dashboard query that joins multiple large tables, aggregates millions of rows, and calculates metrics per customer. Running that query on every request can quickly become slow, expensive, and unpredictable under load.
A materialized view solves this by storing the query result physically on disk.
Instead of recalculating the query every time, PostgreSQL precomputes the result once and serves future reads directly from the stored data.
```sql
CREATE MATERIALIZED VIEW mv_ordersummary AS
SELECT
  company_id,
  COUNT(*) AS total_orders,
  SUM(amount) AS revenue
FROM orders
GROUP BY company_id;
```
Querying the materialized view is fast because the expensive computation already happened earlier.
But materialized views introduce a new problem:
the data becomes stale.
Unlike normal views, materialized views do not automatically update when source tables change. You must refresh them manually.
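A quick way to see the staleness in action (assuming the orders table has just the columns used in the view definition):

```sql
INSERT INTO orders (company_id, amount) VALUES (123, 50.00);

-- The base table reflects the write immediately...
SELECT count(*) FROM orders WHERE company_id = 123;

-- ...but the materialized view still serves the previously
-- computed totals until the next refresh
SELECT total_orders, revenue FROM mv_ordersummary WHERE company_id = 123;
```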
That is where things become complicated.
Refreshing a materialized view sounds simple at first:
```sql
REFRESH MATERIALIZED VIEW mv_ordersummary;
```
But in production systems, especially multi-tenant systems with continuous traffic, refresh behavior becomes an operational challenge.
- Large refreshes can take minutes.
- Refresh operations acquire locks.
- Users continue reading during refreshes.
- One failed refresh should not impact other tenants.
And at scale, dozens or hundreds of materialized views may refresh continuously.
In our case, every tenant has its own set of materialized views:
```
mv_ordersummary_123
mv_ordersummary_456
mv_ordersummary_789
```
Why Refreshing Materialized Views Is Harder Than It Looks
The problem is not creating the materialized view.
The problem is replacing old data with new data while traffic is actively reading from it.
A few realities make this difficult:
- Refreshes can take minutes on large datasets
- PostgreSQL refresh operations acquire locks
- Readers expect consistent results
- Multi-tenant systems amplify operational problems
- One bad tenant refresh should not break the entire refresh cycle
At small scale, REFRESH MATERIALIZED VIEW works fine.
At larger scale, you start caring about:
- lock duration
- transaction scope
- bloat
- validation
- observability
- failure isolation
- rollback safety
The strategy you choose depends on your workload characteristics.
Let’s walk through the main approaches PostgreSQL engineers typically use.
Approach 1: Plain REFRESH MATERIALIZED VIEW
The simplest option is the default PostgreSQL refresh behavior.
```sql
REFRESH MATERIALIZED VIEW mv_ordersummary_123;
```
How It Works
PostgreSQL rebuilds the materialized view contents in place.
During the refresh, PostgreSQL acquires an ACCESS EXCLUSIVE lock on the materialized view.
That lock blocks:
- reads
- writes
- concurrent refreshes
Readers wait until the refresh completes.
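You can watch this happen live. While a refresh runs, blocked readers show up in pg_stat_activity (pg_blocking_pids is available from PostgreSQL 9.6 onward):

```sql
-- Sessions currently waiting on a lock, and which PIDs block them
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```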
Pros
- Very simple.
- No special schema design required.
- No additional storage overhead.
Usually fast for smaller views.
Cons
The lock behavior is a dealbreaker for high-concurrency systems.
If your refresh takes 3 minutes, readers block for 3 minutes.
In practice, that means:
- API latency spikes
- request timeouts
- connection pool exhaustion
- cascading failures
This becomes especially painful in multi-tenant systems where refreshes happen continuously.
Even if each refresh is “only” 30 seconds, enough concurrent tenants create constant lock pressure.
Verdict
Good for:
- internal tools
- small datasets
- low read concurrency
- maintenance windows
Not acceptable for continuously queried production workloads.
Approach 2: REFRESH MATERIALIZED VIEW CONCURRENTLY
PostgreSQL provides a safer option:
```sql
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_ordersummary_123;
```
This is the first thing most engineers try after discovering lock contention.
How It Works
Instead of replacing the materialized view contents in place, PostgreSQL builds the new result in a temporary table, compares it against the current contents using the unique index, and applies only the differences as row-level inserts and deletes.
Readers can continue querying during the refresh.
Pros
- No blocking reads.
This is the biggest advantage.
For many workloads, this alone is sufficient.
Cons
There are important caveats.
- Requires a UNIQUE index
PostgreSQL requires at least one unique index on the materialized view that covers all rows and uses only column names (partial and expression indexes do not qualify).

```sql
-- The grouping key is the natural unique key of this view
CREATE UNIQUE INDEX idx_mv_ordersummary_123
ON mv_ordersummary_123 (company_id);
```
Not every materialized view naturally has a stable unique key.
Sometimes you end up manufacturing synthetic uniqueness just to satisfy refresh requirements.
- Still one massive transaction
The entire concurrent refresh runs in one transaction.
For large views, that means:
- long transaction lifetimes
- large WAL generation
- vacuum delays
- replication lag risk
- Slower than regular refresh
Concurrent refreshes are usually slower than standard refreshes because PostgreSQL must maintain visibility guarantees while comparing old and new rows.
- Can create bloat
Frequent concurrent refreshes can generate table and index bloat over time.
You often need aggressive autovacuum tuning or periodic maintenance.
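Autovacuum can be tuned per relation through storage parameters; the values below are illustrative starting points, not recommendations:

```sql
-- Vacuum and analyze this view more aggressively than the global defaults
ALTER MATERIALIZED VIEW mv_ordersummary_123
  SET (autovacuum_vacuum_scale_factor = 0.05,
       autovacuum_analyze_scale_factor = 0.05);
```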
Verdict
A strong default choice when:
- you can define a stable unique index
- refresh duration is acceptable
- storage churn is manageable
But it still couples refresh execution to the live object itself.
That limitation matters at scale.
Approach 3: Incremental Refresh with Triggers or pg_ivm
The next level is incremental maintenance.
Instead of rebuilding the entire materialized view, you update only the changed rows.
This is the idea behind extensions like pg_ivm.
How It Works
Base table writes trigger incremental updates to the materialized view.
Conceptually:
```
INSERT INTO orders
  -> trigger fires
  -> materialized aggregate updated
```
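With pg_ivm, for example, an incrementally maintained view is created through a function call rather than plain CREATE MATERIALIZED VIEW. The create_immv call below follows the extension's documented usage; verify the details against your installed version:

```sql
CREATE EXTENSION IF NOT EXISTS pg_ivm;

-- Creates an incrementally maintained view; triggers on orders
-- keep the aggregates up to date on every write
SELECT create_immv(
  'mv_ordersummary_ivm',
  'SELECT company_id, COUNT(*) AS total_orders, SUM(amount) AS revenue
     FROM orders GROUP BY company_id'
);
```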
Pros
- Near real-time freshness.
- No expensive full rebuilds.
- No giant refresh windows.
- Excellent for low-latency analytical workloads.
Cons
The complexity increases dramatically.
- Write amplification
Every write to source tables now performs additional maintenance work.
Heavy OLTP systems can suffer significant write overhead.
- More operational complexity
Triggers become part of the critical write path.
That introduces:
- contention
- debugging complexity
- migration risk
- transactional edge cases
- SQL limitations
Incremental maintenance works best for simpler aggregation patterns.
Complex joins, window functions, or non-deterministic logic can become difficult or unsupported.
Verdict
Excellent for:
- real-time analytics
- event-heavy systems
- carefully controlled schemas
Overkill for many batch-oriented refresh pipelines.
Approach 4 (Recommended): Blue/Green Swap with Shadow Views
Instead of refreshing the live materialized view directly, we maintain a separate shadow refresh view.
For every logical view, we maintain three physical objects:
```
mv_ordersummary_123          -- live
mv_ordersummary_123_refresh  -- next candidate
mv_ordersummary_123_blue     -- previous generation
```
The key idea:
- refresh happens entirely outside the live object
- validation occurs before promotion
- promotion is an atomic rename swap
The expensive work is isolated from readers.
The Refresh Flow
Our refresh functionality executes roughly in this sequence:
1. Refresh the _refresh shadow view
2. Validate row count
3. Perform the atomic rename swap
4. Reset the stale refresh view
5. Update metadata
The critical insight is that the live object remains untouched until promotion.
Step 1: Refresh the Shadow View
```sql
REFRESH MATERIALIZED VIEW mv_ordersummary_123_refresh;
```
This may take minutes.
That is fine.
Nobody reads from _refresh.
The live materialized view continues serving traffic normally.
Step 2: Validate Before Promotion
We refuse to swap in obviously broken refreshes.
At minimum:
```sql
SELECT COUNT(*)
FROM mv_ordersummary_123_refresh;
```
If the row count is zero, we abort promotion.
This catches failures like:
- upstream ETL issues
- accidental filters
- empty joins
- bad migrations
Validation can be more sophisticated:
- row-count deltas
- checksum comparisons
- timestamp sanity checks
- aggregate thresholds
The important thing is validating before touching production readers.
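Beyond the zero-row check, a row-count delta guard might look like this (the 20% threshold and aliases are illustrative):

```sql
-- Compare the candidate against the live view;
-- any returned row means: skip the swap and alert
SELECT cand.cnt AS candidate_rows,
       live.cnt AS live_rows
FROM (SELECT count(*) AS cnt FROM mv_ordersummary_123_refresh) AS cand,
     (SELECT count(*) AS cnt FROM mv_ordersummary_123) AS live
WHERE cand.cnt NOT BETWEEN live.cnt * 0.8 AND live.cnt * 1.2;
```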
Step 3: Atomic 3-Way Rename Swap
This is the core pattern.
Inside one transaction:
```sql
BEGIN;

ALTER MATERIALIZED VIEW mv_ordersummary_123
  RENAME TO mv_ordersummary_123_blue;

ALTER MATERIALIZED VIEW mv_ordersummary_123_refresh
  RENAME TO mv_ordersummary_123;

ALTER MATERIALIZED VIEW mv_ordersummary_123_blue
  RENAME TO mv_ordersummary_123_refresh;

COMMIT;
```
The rename sequence is effectively atomic from the application's perspective.
Readers either see:
- the old live view
- or the new live view
Never a half-built state.
Why This Works Well
The expensive operation is the refresh.
The rename itself is extremely fast.
The brief ACCESS EXCLUSIVE lock during rename typically lasts milliseconds.
That is fundamentally different from holding the lock during a multi-minute refresh.
Visualizing the Swap
Before:

```
LIVE    -> mv_ordersummary_123
REFRESH -> mv_ordersummary_123_refresh
OLD     -> mv_ordersummary_123_blue
```

After refresh completes:

```
LIVE    -> old data
REFRESH -> new data
OLD     -> previous generation
```

After atomic rename transaction:

```
LIVE    -> new data
REFRESH -> old data
OLD     -> previous live generation name
```

Effectively:

```
live     -> _blue
_refresh -> live
_blue    -> _refresh
```
Readers continue querying the stable live name the entire time.
Step 4: Reset the Stale Refresh View
This is an important optimization.
After promotion, the old live view becomes the next cycle’s _refresh candidate.
We immediately clear it:
```sql
REFRESH MATERIALIZED VIEW mv_ordersummary_123_refresh
WITH NO DATA;
```
This does two things:
- frees stale contents
- guarantees the next refresh starts clean
Without this step, old data lingers indefinitely and can confuse debugging or validation logic.
The WITH NO DATA trick is underused and extremely useful in rotation-based refresh systems.
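One caveat worth knowing: a view reset WITH NO DATA errors on SELECT ("materialized view has not been populated") until its next refresh. The pg_matviews catalog view shows which state each object is in:

```sql
-- ispopulated = false means the view cannot be queried yet
SELECT matviewname, ispopulated
FROM pg_matviews
WHERE matviewname LIKE 'mv_ordersummary_123%';
```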
Step 5: Update Metadata
Finally:
```sql
UPDATE company_materialized_view_config
SET view_last_refreshed = NOW()
WHERE company_id = 123;
```
This powers:
- dashboards
- alerting
- freshness monitoring
- retry logic
Operational metadata matters as much as the refresh itself.
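The five steps above can be combined into a single orchestration routine. The sketch below is illustrative rather than our exact implementation: the function name and view-name pattern are assumptions, while the config table and column names follow the earlier example. Note that plain REFRESH (unlike REFRESH ... CONCURRENTLY) is allowed inside a function's transaction, and since nobody reads the shadow view, its refresh lock is harmless.

```sql
CREATE OR REPLACE FUNCTION refresh_and_promote(p_company_id bigint)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
  v_live    text := format('mv_ordersummary_%s', p_company_id);
  v_refresh text := v_live || '_refresh';
  v_blue    text := v_live || '_blue';
  v_rows    bigint;
BEGIN
  -- Step 1: refresh the shadow view (slow, but invisible to readers)
  EXECUTE format('REFRESH MATERIALIZED VIEW %I', v_refresh);

  -- Step 2: validate before promotion
  EXECUTE format('SELECT count(*) FROM %I', v_refresh) INTO v_rows;
  IF v_rows = 0 THEN
    RAISE EXCEPTION 'refusing to promote empty refresh for %', v_live;
  END IF;

  -- Step 3: atomic 3-way rename swap
  EXECUTE format('ALTER MATERIALIZED VIEW %I RENAME TO %I', v_live, v_blue);
  EXECUTE format('ALTER MATERIALIZED VIEW %I RENAME TO %I', v_refresh, v_live);
  EXECUTE format('ALTER MATERIALIZED VIEW %I RENAME TO %I', v_blue, v_refresh);

  -- Step 4: reset the demoted view so the next cycle starts clean
  EXECUTE format('REFRESH MATERIALIZED VIEW %I WITH NO DATA', v_refresh);

  -- Step 5: record freshness metadata
  UPDATE company_materialized_view_config
  SET view_last_refreshed = NOW()
  WHERE company_id = p_company_id;
END;
$$;
```

Because everything runs in one transaction, the rename locks are held until commit; keeping steps 4 and 5 cheap keeps that window short. An orchestrator can then loop over tenants, call `SELECT refresh_and_promote(123);` per tenant, and catch exceptions individually, so one failed refresh never blocks the others.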
Why Blue/Green Works So Well
This pattern solves several operational problems simultaneously.
1. **Readers Never Touch the Refresh Process**
The live object remains stable during expensive refresh operations.
That isolation is the biggest win.
2. **Validation Happens Before Exposure**
You can reject bad refreshes safely.
That alone justifies the extra complexity.
3. **Failures Are Easy to Contain**
If refresh fails:
- log it
- skip promotion
- continue processing other tenants
The existing live view remains untouched.
4. **Promotion Is Fast**
The swap itself is metadata-only.
That minimizes lock duration dramatically.
Tradeoffs
No solution is free.
- **Approximately 2x Storage**
You effectively maintain duplicate materialized views.
For very large datasets, this matters.
- **Naming Conventions Become Important**
Your orchestration layer must understand:
  - _refresh
  - _blue
  - live names
You need deterministic naming and discovery logic.
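Discovery can be as simple as a catalog query over pg_matviews; the regex below assumes the naming pattern used throughout this article:

```sql
-- Find all live tenant views (excludes _refresh and _blue generations)
SELECT schemaname, matviewname
FROM pg_matviews
WHERE matviewname ~ '^mv_ordersummary_[0-9]+$';
```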
- **Brief Rename Locks Still Exist**
The rename transaction acquires ACCESS EXCLUSIVE.
But the duration is tiny compared to full refresh locking.
Milliseconds instead of minutes is usually an acceptable trade.
Final Thoughts
Refreshing materialized views is easy in development environments.
Refreshing them safely in production is not.
As datasets grow and concurrency increases, refresh strategy becomes an operational architecture decision rather than just a SQL command.
There is no universally correct approach:
- plain refresh is simplest
- concurrent refresh improves availability
- incremental maintenance provides real-time freshness
- blue/green swaps maximize isolation and validation safety
For our workload, the blue/green pattern provided the best balance between simplicity, reliability, and operational safety.
The core idea is straightforward:
Never rebuild the object your users are actively reading.
Refresh elsewhere.
Validate.
Promote atomically.