<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Charity Jelimo</title>
    <description>The latest articles on DEV Community by Charity Jelimo (@data_with_jelimo).</description>
    <link>https://dev.to/data_with_jelimo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3396373%2F192bdc45-0fb9-40d0-b855-52526df3afbe.jpg</url>
      <title>DEV Community: Charity Jelimo</title>
      <link>https://dev.to/data_with_jelimo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/data_with_jelimo"/>
    <language>en</language>
    <item>
      <title>Refreshing PostgreSQL Materialized Views Without Downtime</title>
      <dc:creator>Charity Jelimo</dc:creator>
      <pubDate>Thu, 14 May 2026 13:09:00 +0000</pubDate>
      <link>https://dev.to/data_with_jelimo/refreshing-postgresql-materialized-views-without-downtime-28n6</link>
      <guid>https://dev.to/data_with_jelimo/refreshing-postgresql-materialized-views-without-downtime-28n6</guid>
      <description>&lt;p&gt;Materialized views are one of PostgreSQL’s most useful features for analytics and reporting workloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They solve a very common problem:&lt;/strong&gt; some queries are simply too expensive to run repeatedly in real time.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Imagine a dashboard query that joins multiple large tables, aggregates millions of rows, and calculates metrics per customer. Running that query on every request can quickly become slow, expensive, and unpredictable under load.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A materialized view solves this by storing the query result physically on disk.&lt;/p&gt;

&lt;p&gt;Instead of recalculating the query every time, PostgreSQL precomputes the result once and serves future reads directly from the stored data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;company_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_orders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;company_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Querying the materialized view is fast because the expensive computation already happened earlier.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;But materialized views introduce a new problem:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;the data becomes stale.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Unlike normal views, materialized views do not automatically update when source tables change. You must refresh them manually.&lt;/p&gt;

&lt;p&gt;That is where things become complicated.&lt;/p&gt;

&lt;p&gt;Refreshing a materialized view sounds simple at first:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;REFRESH MATERIALIZED VIEW mv_ordersummary;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;But in production systems, especially multi-tenant systems with continuous traffic, refresh behavior becomes an operational challenge.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Large refreshes can take minutes.&lt;/li&gt;
&lt;li&gt;Refresh operations acquire locks.&lt;/li&gt;
&lt;li&gt;Users continue reading during refreshes.&lt;/li&gt;
&lt;li&gt;One failed refresh should not impact other tenants.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And at scale, dozens or hundreds of materialized views may refresh continuously.&lt;/p&gt;

&lt;p&gt;In our case, every tenant has its own set of materialized views:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mv_ordersummary_123
mv_ordersummary_456
mv_ordersummary_789
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why Refreshing Materialized Views Is Harder Than It Looks
&lt;/h2&gt;

&lt;p&gt;The problem is not creating the materialized view.&lt;/p&gt;

&lt;p&gt;The problem is replacing old data with new data while traffic is actively reading from it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A few realities make this difficult:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Refreshes can take minutes on large datasets&lt;/li&gt;
&lt;li&gt;PostgreSQL refresh operations acquire locks&lt;/li&gt;
&lt;li&gt;Readers expect consistent results&lt;/li&gt;
&lt;li&gt;Multi-tenant systems amplify operational problems&lt;/li&gt;
&lt;li&gt;One bad tenant refresh should not break the entire refresh cycle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At small scale, REFRESH MATERIALIZED VIEW works fine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;At larger scale, you start caring about:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;lock duration&lt;/li&gt;
&lt;li&gt;transaction scope&lt;/li&gt;
&lt;li&gt;bloat&lt;/li&gt;
&lt;li&gt;validation&lt;/li&gt;
&lt;li&gt;observability&lt;/li&gt;
&lt;li&gt;failure isolation&lt;/li&gt;
&lt;li&gt;rollback safety&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The strategy you choose depends on your workload characteristics.&lt;br&gt;
Let’s walk through the main approaches PostgreSQL engineers typically use.&lt;/p&gt;
&lt;h2&gt;
  
  
  Approach 1: Plain REFRESH MATERIALIZED VIEW
&lt;/h2&gt;

&lt;p&gt;The simplest option is the default PostgreSQL refresh behavior.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;REFRESH&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;How It Works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PostgreSQL rebuilds the materialized view contents in place.&lt;/p&gt;

&lt;p&gt;During the refresh, PostgreSQL acquires an ACCESS EXCLUSIVE lock on the materialized view.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That lock blocks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;reads&lt;/li&gt;
&lt;li&gt;writes&lt;/li&gt;
&lt;li&gt;concurrent refreshes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Readers wait until the refresh completes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Very simple.&lt;/li&gt;
&lt;li&gt;No special schema design required.&lt;/li&gt;
&lt;li&gt;No additional storage overhead.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Usually fast for smaller views.&lt;br&gt;
&lt;strong&gt;Cons&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The lock behavior is a dealbreaker for high-concurrency systems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If your refresh takes 3 minutes, readers block for 3 minutes.&lt;br&gt;
In practice, that means:&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;API latency spikes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;request timeouts&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;connection pool exhaustion&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;cascading failures&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This becomes especially painful in multi-tenant systems where refreshes happen continuously.&lt;/p&gt;

&lt;p&gt;Even if each refresh is “only” 30 seconds, enough concurrent tenants create constant lock pressure.&lt;/p&gt;

&lt;p&gt;Verdict&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Good for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;internal tools&lt;/li&gt;
&lt;li&gt;small datasets&lt;/li&gt;
&lt;li&gt;low read concurrency&lt;/li&gt;
&lt;li&gt;maintenance windows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not acceptable for continuously queried production workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 2: REFRESH MATERIALIZED VIEW CONCURRENTLY
&lt;/h2&gt;

&lt;p&gt;PostgreSQL provides a safer option:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;REFRESH MATERIALIZED VIEW CONCURRENTLY mv_ordersummary_123;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This is the first thing most engineers try after discovering lock contention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How It Works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of replacing the materialized view contents directly, PostgreSQL builds updated contents alongside the existing data and swaps them internally.&lt;/p&gt;

&lt;p&gt;Readers can continue querying during the refresh.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No blocking reads.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the biggest advantage.&lt;/p&gt;

&lt;p&gt;For many workloads, this alone is sufficient.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There are important caveats.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Requires a UNIQUE index&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;PostgreSQL requires at least one unique index covering all rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_mv_ordersummary_123&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not every materialized view naturally has a stable unique key.&lt;/p&gt;

&lt;p&gt;Sometimes you end up manufacturing synthetic uniqueness just to satisfy refresh requirements.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Still one massive transaction&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The entire concurrent refresh runs in one transaction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For large views, that means:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;long transaction lifetimes&lt;/li&gt;
&lt;li&gt;large WAL generation&lt;/li&gt;
&lt;li&gt;vacuum delays&lt;/li&gt;
&lt;li&gt;replication lag risk&lt;/li&gt;
&lt;li&gt;Slower than regular refresh&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Concurrent refreshes are usually slower than standard refreshes because PostgreSQL must maintain visibility guarantees while comparing old and new rows.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Can create bloat&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Frequent concurrent refreshes can generate table and index bloat over time.&lt;/p&gt;

&lt;p&gt;You often need aggressive autovacuum tuning or periodic maintenance.&lt;/p&gt;

&lt;p&gt;Verdict&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A strong default choice when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you can define a stable unique index&lt;/li&gt;
&lt;li&gt;refresh duration is acceptable&lt;/li&gt;
&lt;li&gt;storage churn is manageable
But it still couples refresh execution to the live object itself.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That limitation matters at scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 3: Incremental Refresh with Triggers or pg_ivm
&lt;/h2&gt;

&lt;p&gt;The next level is incremental maintenance.&lt;/p&gt;

&lt;p&gt;Instead of rebuilding the entire materialized view, &lt;em&gt;you update only the changed rows.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This is the idea behind extensions like pg_ivm.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How It Works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Base table writes trigger incremental updates to the materialized view.&lt;/p&gt;

&lt;p&gt;Conceptually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT into orders
    -&amp;gt; trigger fires
    -&amp;gt; materialized aggregate updated
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Near real-time freshness.&lt;/li&gt;
&lt;li&gt;No expensive full rebuilds.&lt;/li&gt;
&lt;li&gt;No giant refresh windows.&lt;/li&gt;
&lt;li&gt;Excellent for low-latency analytical workloads.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The complexity increases dramatically.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Write amplification&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every write to source tables now performs additional maintenance work.&lt;/p&gt;

&lt;p&gt;Heavy OLTP systems can suffer significant write overhead.&lt;/p&gt;

&lt;p&gt;2.** More operational complexity**&lt;/p&gt;

&lt;p&gt;Triggers become part of the critical write path.&lt;/p&gt;

&lt;p&gt;That introduces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;contention&lt;/li&gt;
&lt;li&gt;debugging complexity&lt;/li&gt;
&lt;li&gt;migration risk&lt;/li&gt;
&lt;li&gt;transactional edge cases&lt;/li&gt;
&lt;li&gt;SQL limitations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Incremental maintenance works best for simpler aggregation patterns.&lt;/p&gt;

&lt;p&gt;Complex joins, window functions, or non-deterministic logic can become difficult or unsupported.&lt;/p&gt;

&lt;p&gt;Verdict&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Excellent for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;real-time analytics&lt;/li&gt;
&lt;li&gt;event-heavy systems&lt;/li&gt;
&lt;li&gt;carefully controlled schemas
Overkill for many batch-oriented refresh pipelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Approach 4 (Recommended): Blue/Green Swap with Shadow Views
&lt;/h2&gt;

&lt;p&gt;Instead of refreshing the live materialized view directly, we maintain a separate shadow refresh view.&lt;/p&gt;

&lt;p&gt;For every logical view, we maintain three physical objects:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mv_ordersummary_123           -- live
mv_ordersummary_123_refresh   -- next candidate
mv_ordersummary_123_blue      -- previous generation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The key idea:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;refresh happens entirely outside the live object&lt;/li&gt;
&lt;li&gt;validation occurs before promotion&lt;/li&gt;
&lt;li&gt;promotion is an atomic rename swap&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The expensive work is isolated from readers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Refresh Flow&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our refresh functionality executes roughly in this sequence:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Refresh _refresh&lt;/li&gt;
&lt;li&gt;Validate row count&lt;/li&gt;
&lt;li&gt;Perform atomic rename swap&lt;/li&gt;
&lt;li&gt;Reset stale refresh view&lt;/li&gt;
&lt;li&gt;Update metadata&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The critical insight is that the live object remains untouched until promotion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Refresh the Shadow View&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;REFRESH MATERIALIZED VIEW mv_ordersummary_123_refresh;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This may take minutes.&lt;/p&gt;

&lt;p&gt;That is fine.&lt;/p&gt;

&lt;p&gt;Nobody reads from _refresh.&lt;/p&gt;

&lt;p&gt;The live materialized view continues serving traffic normally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Validate Before Promotion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We refuse to swap in obviously broken refreshes.&lt;/p&gt;

&lt;p&gt;At minimum:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123_refresh&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the row count is zero, we abort promotion.&lt;/p&gt;

&lt;p&gt;This catches failures like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;upstream ETL issues&lt;/li&gt;
&lt;li&gt;accidental filters&lt;/li&gt;
&lt;li&gt;empty joins&lt;/li&gt;
&lt;li&gt;bad migrations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Validation can be more sophisticated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;row-count deltas&lt;/li&gt;
&lt;li&gt;checksum comparisons&lt;/li&gt;
&lt;li&gt;timestamp sanity checks&lt;/li&gt;
&lt;li&gt;aggregate thresholds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The important thing is validating before touching production readers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Atomic 3-Way Rename Swap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the core pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inside one transaction:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123&lt;/span&gt;
    &lt;span class="k"&gt;RENAME&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123_blue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123_refresh&lt;/span&gt;
    &lt;span class="k"&gt;RENAME&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123_blue&lt;/span&gt;
    &lt;span class="k"&gt;RENAME&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123_refresh&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rename sequence is effectively atomic from the application's perspective.&lt;/p&gt;

&lt;p&gt;Readers either see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the old live view&lt;/li&gt;
&lt;li&gt;or the new live view
Never a half-built state.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why This Works Well&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The expensive operation is the refresh.&lt;/p&gt;

&lt;p&gt;The rename itself is extremely fast.&lt;/p&gt;

&lt;p&gt;The brief ACCESS EXCLUSIVE lock during rename typically lasts milliseconds.&lt;/p&gt;

&lt;p&gt;That is fundamentally different from holding the lock during a multi-minute refresh.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Visualizing the Swap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LIVE      -&amp;gt; mv_ordersummary_123
REFRESH   -&amp;gt; mv_ordersummary_123_refresh
OLD       -&amp;gt; mv_ordersummary_123_blue
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After refresh completes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LIVE      -&amp;gt; old data
REFRESH   -&amp;gt; new data
OLD       -&amp;gt; previous generation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After atomic rename transaction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LIVE      -&amp;gt; new data
REFRESH   -&amp;gt; old data
OLD       -&amp;gt; previous live generation name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Effectively:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;live      -&amp;gt; _blue&lt;br&gt;
_refresh  -&amp;gt; live&lt;br&gt;
_blue     -&amp;gt; _refresh&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Readers continue querying the stable live name the entire time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Reset the Stale Refresh View&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is an important optimization.&lt;/p&gt;

&lt;p&gt;After promotion, the old live view becomes the next cycle’s _refresh candidate.&lt;/p&gt;

&lt;p&gt;We immediately clear it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;REFRESH&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;mv_ordersummary_123_refresh&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="k"&gt;NO&lt;/span&gt; &lt;span class="k"&gt;DATA&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;This does two things:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;frees stale contents&lt;/li&gt;
&lt;li&gt;guarantees the next refresh starts clean&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this step, old data lingers indefinitely and can confuse debugging or validation logic.&lt;/p&gt;

&lt;p&gt;The WITH NO DATA trick is underused and extremely useful in rotation-based refresh systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5: Update Metadata&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Finally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;company_materialized_view_config&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;view_last_refreshed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;company_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This powers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;dashboards&lt;/li&gt;
&lt;li&gt;alerting&lt;/li&gt;
&lt;li&gt;freshness monitoring&lt;/li&gt;
&lt;li&gt;retry logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Operational metadata matters as much as the refresh itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Blue/Green Works So Well&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This pattern solves several operational problems simultaneously.&lt;/p&gt;

&lt;p&gt;1.** Readers Never Touch the Refresh Process**&lt;/p&gt;

&lt;p&gt;The live object remains stable during expensive refresh operations.&lt;/p&gt;

&lt;p&gt;That isolation is the biggest win.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Validation Happens Before Exposure&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You can reject bad refreshes safely.&lt;/p&gt;

&lt;p&gt;That alone justifies the extra complexity.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Failures Are Easy to Contain&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If refresh fails:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;log it&lt;/li&gt;
&lt;li&gt;skip promotion&lt;/li&gt;
&lt;li&gt;continue processing other tenants&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The existing live view remains untouched.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Promotion Is Fast&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The swap itself is metadata-only.&lt;/p&gt;

&lt;p&gt;That minimizes lock duration dramatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tradeoffs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;No solution is free.&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;*&lt;em&gt;Approximately 2x Storage
*&lt;/em&gt;
You effectively maintain duplicate materialized views.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For very large datasets, this matters.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;*&lt;em&gt;Naming Conventions Become Important
*&lt;/em&gt;
Your orchestration layer must understand:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;_refresh
_blue
live names

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You need deterministic naming and discovery logic.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Brief Rename Locks Still Exist&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The rename transaction acquires ACCESS EXCLUSIVE.&lt;/p&gt;

&lt;p&gt;But the duration is tiny compared to full refresh locking.&lt;/p&gt;

&lt;p&gt;Milliseconds instead of minutes is usually an acceptable trade.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Final thoughts:&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Refreshing materialized views is easy in development environments.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Refreshing them safely in production is not.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;As datasets grow and concurrency increases, refresh strategy becomes an operational architecture decision rather than just a SQL command.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;There is no universally correct approach:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;plain refresh is simplest&lt;/li&gt;
&lt;li&gt;concurrent refresh improves availability&lt;/li&gt;
&lt;li&gt;incremental maintenance provides real-time freshness&lt;/li&gt;
&lt;li&gt;blue/green swaps maximize isolation and validation safety&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For our workload, the blue/green pattern provided the best balance between simplicity, reliability, and operational safety.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The core idea is straightforward:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Never rebuild the object your users are actively reading.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Refresh elsewhere.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Validate.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Promote atomically.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>database</category>
      <category>performance</category>
      <category>postgres</category>
      <category>sql</category>
    </item>
    <item>
      <title>What's a data warehouse?</title>
      <dc:creator>Charity Jelimo</dc:creator>
      <pubDate>Sun, 29 Mar 2026 17:44:51 +0000</pubDate>
      <link>https://dev.to/data_with_jelimo/whats-a-data-warehouse-7f0</link>
      <guid>https://dev.to/data_with_jelimo/whats-a-data-warehouse-7f0</guid>
      <description>&lt;h2&gt;
  
  
  The data challenge in modern businesses
&lt;/h2&gt;

&lt;p&gt;Imagine you’re running an online store. Every day, customers place orders, browse products, and make payments. All this data is being captured across different systems; your website, payment service, and customer database.&lt;/p&gt;

&lt;p&gt;Now, at the end of the month, you’re asked a simple question: &lt;em&gt;“&lt;/em&gt;&lt;em&gt;What are our top-selling products, and how has revenue changed over time?&lt;/em&gt;&lt;em&gt;”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Surprisingly, answering this isn’t easy.&lt;/p&gt;

&lt;p&gt;The data is scattered, inconsistent, and stored in systems designed for daily operations, not analysis. This is the challenge most businesses face: they have data, but they can’t easily turn it into insights.&lt;/p&gt;

&lt;p&gt;This is where a &lt;strong&gt;data warehouse&lt;/strong&gt; comes in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A data warehouse is a centralized system that collects data from multiple sources, cleans and organizes it, and stores it in a structured format optimized for analysis.&lt;/strong&gt; Unlike operational databases that focus on fast transactions, data warehouses are designed for querying large volumes of historical data, making them ideal for reporting, dashboards, and business intelligence.&lt;/p&gt;

&lt;p&gt;In simple terms, a data warehouse doesn’t just store data, it transforms it into something the business can actually use to make better decisions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data warehouse vs other data repositories
&lt;/h2&gt;

&lt;p&gt;A data warehouse is just one way to store data, and it’s important to understand how it differs from other common systems:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Operational databases (OLTP):&lt;/strong&gt;  &lt;em&gt;OLTP systems record business interactions as they occur in the day-to-day operation of the organization, and support querying of this data to make inferences.&lt;/em&gt; Fast and efficient, they keep the business running but they aren’t built to analyze historical trends or answer complex questions.ge-scale analysis.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data lakes:&lt;/strong&gt; &lt;em&gt;Data Lakes store everything, raw and unprocessed. This includes structured data like tables, semi-structured data like JSON files, or even unstructured data like images and logs.&lt;/em&gt; They’re perfect for data scientists and advanced analytics, but without organization, it can be difficult to get clear answers quickly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data lakehouses:&lt;/strong&gt; Data lakehouses are a hybrid. &lt;em&gt;They combine the flexibility of a data lake with the structured, query-ready features of a warehouse.&lt;/em&gt; You can store raw data while also running analytics, giving businesses the best of both worlds.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
    </item>
    <item>
      <title>Common Table Expressions</title>
      <dc:creator>Charity Jelimo</dc:creator>
      <pubDate>Wed, 25 Feb 2026 13:46:35 +0000</pubDate>
      <link>https://dev.to/data_with_jelimo/common-table-expressions-59bd</link>
      <guid>https://dev.to/data_with_jelimo/common-table-expressions-59bd</guid>
      <description>&lt;p&gt;A Common Table Expression (CTE) is a temporary result set that simplifies and structures SQL queries. It is defined using the WITH keyword and can improve query readability and reusability. In some cases, CTEs can also enhance performance by avoiding redundant calculations.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are CTEs?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Temporary Result Set:&lt;/strong&gt;&lt;br&gt;
CTEs exist only during the execution of the query and are not stored in the database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Readability and Maintainability:&lt;/strong&gt;&lt;br&gt;
By breaking complex logic into reusable components, CTEs make queries easier to understand.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reusable Within the Query:&lt;/strong&gt;&lt;br&gt;
A CTE can be referenced multiple times within the query, avoiding repeated logic or calculations.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Why use CTEs?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Simplify Complex Queries:&lt;/strong&gt;
Break large queries into smaller, named parts for clarity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eliminate Redundant Calculations:&lt;/strong&gt;
Replace repeated subqueries with a single calculation in a CTE.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improve Maintainability:&lt;/strong&gt;
Centralize repeated logic in one place, making updates easier.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable Recursive Queries:&lt;/strong&gt;
Handle hierarchical or iterative data using recursive CTEs.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  How Do CTEs Optimize Performance?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reduce Redundancy:&lt;/strong&gt;
CTEs calculate a result once, reducing unnecessary repetition.
Example:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT 
    o.OrderID
FROM Orders o
WHERE o.TotalAmount &amp;gt; (SELECT AVG(TotalAmount) FROM Orders)
  AND o.TotalAmount &amp;lt; (SELECT AVG(TotalAmount) FROM Orders);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; The subquery (SELECT AVG(TotalAmount) FROM Orders) is calculated twice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution Using a CTE:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WITH AvgTotalAmount AS (
    SELECT AVG(TotalAmount) AS AvgAmount
    FROM Orders
)
SELECT o.OrderID
FROM Orders o
JOIN AvgTotalAmount a
WHERE o.TotalAmount &amp;gt; a.AvgAmount
  AND o.TotalAmount &amp;lt; a.AvgAmount;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Simplify Execution Plans:&lt;/strong&gt;&lt;br&gt;
With a CTE, the database evaluates the logic once and reuses the result.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Readable Performance Gains:&lt;/strong&gt;&lt;br&gt;
Even if there’s no computational gain, a CTE often makes execution plans easier to debug.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>beginners</category>
      <category>database</category>
      <category>sql</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
