<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Hassan Munir</title>
    <description>The latest articles on DEV Community by Hassan Munir (@hsnmnr).</description>
    <link>https://dev.to/hsnmnr</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F247119%2F53e8e432-f7e3-48af-8742-8b0bc7c98c61.jpg</url>
      <title>DEV Community: Hassan Munir</title>
      <link>https://dev.to/hsnmnr</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hsnmnr"/>
    <language>en</language>
    <item>
      <title>Three Ways to Set Up CDC from Postgres to ClickHouse</title>
      <dc:creator>Hassan Munir</dc:creator>
      <pubDate>Sat, 30 May 2026 17:36:08 +0000</pubDate>
      <link>https://dev.to/hsnmnr/three-ways-to-set-up-cdc-from-postgres-to-clickhouse-2fob</link>
      <guid>https://dev.to/hsnmnr/three-ways-to-set-up-cdc-from-postgres-to-clickhouse-2fob</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3avk9no3960v8c4dj3dm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3avk9no3960v8c4dj3dm.png" alt="Cover illustration for Three Ways to Set Up CDC from Postgres to ClickHouse, showing the data flow from Postgres into ClickHouse." width="800" height="336"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You cannot run analytical queries on the same Postgres primary that serves your application without paying for it in CPU and connections. A read replica does not help: Postgres is row-oriented and built for OLTP, not for scanning tens of millions of rows for a &lt;code&gt;GROUP BY&lt;/code&gt;. If you want sub-second dashboards over a real dataset you need a column store on the side.&lt;/p&gt;

&lt;p&gt;We picked ClickHouse. The interesting question is not how to query ClickHouse. The interesting question is how to keep it in sync with Postgres. That is what Change Data Capture (CDC) solves.&lt;/p&gt;

&lt;p&gt;CDC is mechanical: every insert, update, and delete in Postgres gets captured and forwarded somewhere else. Postgres already writes everything to the Write-Ahead Log (WAL) for crash recovery. CDC reads that log via a logical replication slot and turns it into an ordered stream you can replay into another system. Polling with &lt;code&gt;WHERE updated_at &amp;gt; $last&lt;/code&gt; is the wrong tool: it cannot see deletes, it adds load, and your &lt;code&gt;updated_at&lt;/code&gt; column will lie eventually. Read the log.&lt;/p&gt;

&lt;p&gt;There are three common ways to plug Postgres CDC into ClickHouse. We evaluated all three. We ran two of them in production. The trade-offs below come from running them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Postgres prerequisites (all three options)
&lt;/h2&gt;

&lt;p&gt;Logical replication needs &lt;code&gt;postgresql.conf&lt;/code&gt; set up correctly. This applies to all three approaches.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="c"&gt;# postgresql.conf
&lt;/span&gt;&lt;span class="py"&gt;wal_level&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;logical&lt;/span&gt;
&lt;span class="py"&gt;max_wal_senders&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10&lt;/span&gt;
&lt;span class="py"&gt;max_replication_slots&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10&lt;/span&gt;
&lt;span class="py"&gt;max_slot_wal_keep_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;102400  # MB; cap WAL retained by stalled slots&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then a replication role and a publication on the source side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;USER&lt;/span&gt; &lt;span class="n"&gt;cdc_user&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;REPLICATION&lt;/span&gt; &lt;span class="n"&gt;PASSWORD&lt;/span&gt; &lt;span class="s1"&gt;'redacted'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;GRANT&lt;/span&gt; &lt;span class="k"&gt;USAGE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;cdc_user&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;GRANT&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt; &lt;span class="n"&gt;TABLES&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;cdc_user&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;PRIVILEGES&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt;
  &lt;span class="k"&gt;GRANT&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;TABLES&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;cdc_user&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;PUBLICATION&lt;/span&gt; &lt;span class="n"&gt;cdc_pub&lt;/span&gt;
  &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;publish&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'insert,update,delete'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Tables in the publication need a primary key, or:&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="n"&gt;REPLICA&lt;/span&gt; &lt;span class="k"&gt;IDENTITY&lt;/span&gt; &lt;span class="k"&gt;FULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;REPLICA IDENTITY FULL&lt;/code&gt; writes the full old row to WAL on update/delete, which is required if a table does not have a primary key. It is more expensive on writes, so prefer a PK where you can.&lt;/p&gt;

&lt;p&gt;A note on connection requirements: ClickPipes (option 3) needs a direct Postgres connection. Connection poolers like PgBouncer transaction mode, RDS Proxy, and Supabase Pooler are not supported.&lt;/p&gt;

&lt;h2&gt;
  
  
  Option 1: Kafka and Debezium
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felzusxt91lav73abp2il.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felzusxt91lav73abp2il.png" alt="Postgres WAL flows through a replication slot into Debezium on Kafka Connect, which writes to per-table Kafka topics. A ClickHouse sink consumes those topics into ClickHouse; the same topics fan out to analytics, search indexing, and audit log consumers." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Debezium tails the Postgres WAL via a logical replication slot and publishes change events into Kafka. A separate sink consumes from Kafka and writes into ClickHouse. The same Kafka topic can fan out to your search index, your audit log, and your warehouse, which is the main reason teams accept the cost.&lt;/p&gt;

&lt;p&gt;The cost is real: Kafka brokers (or KRaft setup), Kafka Connect, schema registry, monitoring, capacity planning, and an on-call rotation that knows what a partition reassignment looks like.&lt;/p&gt;

&lt;p&gt;Pick this if you already operate Kafka, or if you need event streaming for reasons beyond CDC. Otherwise it is overkill, and we treated it as such.&lt;/p&gt;

&lt;h2&gt;
  
  
  Option 2: MaterializedPostgreSQL
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftl0w7znyfuqxwnsh0mjv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftl0w7znyfuqxwnsh0mjv.png" alt="The application writes to Postgres; WAL changes flow through a replication slot and publication directly into the MaterializedPostgreSQL engine inside ClickHouse, which writes to local replicated tables that serve dashboards. No Kafka or external sink in the path." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ClickHouse has a built-in database engine called &lt;a href="https://clickhouse.com/docs/engines/database-engines/materialized-postgresql" rel="noopener noreferrer"&gt;MaterializedPostgreSQL&lt;/a&gt; that does CDC inside ClickHouse itself. No Kafka, no Debezium, no third process. The engine opens a Postgres replication slot directly and streams WAL changes into mirror tables on the ClickHouse side.&lt;/p&gt;

&lt;p&gt;The setup is one container plus an init SQL.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;clickhouse&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;clickhouse/clickhouse-server:25.7-alpine&lt;/span&gt;
  &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;8123:8123'&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;9000:9000'&lt;/span&gt;
  &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;clickhouse-data:/var/lib/clickhouse&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./sql/clickhouse-init.sql:/docker-entrypoint-initdb.d/clickhouse-init.sql&lt;/span&gt;
  &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;CLICKHOUSE_USER=analytics&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;CLICKHOUSE_PASSWORD=changeme&lt;/span&gt;
  &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;api-db&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;allow_experimental_database_materialized_postgresql&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;DATABASE&lt;/span&gt; &lt;span class="n"&gt;app_analytics&lt;/span&gt;
&lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MaterializedPostgreSQL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'db:5432'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'app'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'cdc_user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'redacted'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;SETTINGS&lt;/span&gt; &lt;span class="n"&gt;materialized_postgresql_tables_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'User,Project,Task,Comment,Tag'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ClickHouse takes an initial snapshot of the listed tables, opens a replication slot, and streams WAL deltas continuously. Replication lag for our workload was 1 to 2 seconds.&lt;/p&gt;

&lt;p&gt;The auto-created destination tables are always &lt;code&gt;ReplacingMergeTree&lt;/code&gt;, ordered by primary key. You do not get a choice. Schema looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Auto-created by MaterializedPostgreSQL&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;app_analytics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;          &lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt;     &lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt;       &lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt;  &lt;span class="n"&gt;DateTime64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt;     &lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_sign&lt;/span&gt;       &lt;span class="n"&gt;Int8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_version&lt;/span&gt;    &lt;span class="n"&gt;UInt64&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ReplacingMergeTree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are four sharp edges that show up at scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  DDL silently breaks replication for the affected table
&lt;/h3&gt;

&lt;p&gt;Add or drop a column on Postgres and the corresponding ClickHouse table stops replicating. No log line, no error, no metric exposed by the engine. Other tables keep replicating, the slot keeps advancing, the only signal is that your numbers are wrong.&lt;/p&gt;

&lt;p&gt;Recovery is manual. The exact sequence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- 1. The DDL has already happened on Postgres:&lt;/span&gt;
&lt;span class="c1"&gt;--    ALTER TABLE public.events ADD COLUMN country text;&lt;/span&gt;

&lt;span class="c1"&gt;-- 2. In ClickHouse, detach and re-attach the affected table:&lt;/span&gt;
&lt;span class="n"&gt;DETACH&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;app_analytics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="n"&gt;PERMANENTLY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;DATABASE&lt;/span&gt; &lt;span class="n"&gt;app_analytics&lt;/span&gt;
  &lt;span class="k"&gt;MODIFY&lt;/span&gt; &lt;span class="n"&gt;SETTING&lt;/span&gt; &lt;span class="n"&gt;materialized_postgresql_tables_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="s1"&gt;'User,Project,Task,Comment,Tag,events'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;ATTACH&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;app_analytics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ATTACH&lt;/code&gt; triggers a fresh snapshot of that table from Postgres. There is no good way to automate this in a team that ships migrations regularly.&lt;/p&gt;

&lt;h3&gt;
  
  
  You cannot pick the destination engine
&lt;/h3&gt;

&lt;p&gt;Replicated tables are always &lt;code&gt;ReplacingMergeTree&lt;/code&gt; ordered by the source PK. You cannot use &lt;code&gt;AggregatingMergeTree&lt;/code&gt; for pre-rolled aggregates. You cannot partition by &lt;code&gt;toYYYYMM(created_at)&lt;/code&gt; for cheap retention. You cannot reorder for your real query pattern. For early dashboards this is fine. For a real analytics surface eventually it is not.&lt;/p&gt;

&lt;h3&gt;
  
  
  You cannot select columns
&lt;/h3&gt;

&lt;p&gt;The replication is whole-table. Wide rows with binary blobs, large JSON, or PII you would rather not propagate replicate anyway. The only workaround is restructuring the source table.&lt;/p&gt;

&lt;p&gt;A related gotcha: TOAST values are not replicated by MaterializedPostgreSQL. The default value is written instead. ClickPipes handles TOAST correctly. If you store anything large enough to TOAST and care about its analytical content, this is a hard limit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Observability is what you build yourself
&lt;/h3&gt;

&lt;p&gt;There is no built-in lag metric, no per-table status, no error events. The minimum monitoring you need is a slot-lag query on Postgres and a heartbeat row.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Postgres: per-slot lag in bytes, plus WAL retained by stalled slots&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;slot_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;active&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;wal_status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;confirmed_flush_lsn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;pg_size_pretty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
         &lt;span class="n"&gt;pg_wal_lsn_diff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pg_current_wal_lsn&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;restart_lsn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="p"&gt;)&lt;/span&gt;                                             &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;retained_wal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;pg_wal_lsn_diff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pg_current_wal_lsn&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;confirmed_flush_lsn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;slot_lag_bytes&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;   &lt;span class="n"&gt;pg_replication_slots&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;  &lt;span class="n"&gt;slot_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'logical'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt;  &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;slot_lag_bytes&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Heartbeat: a row updated every minute on Postgres, read on ClickHouse&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;cdc_heartbeat&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;cdc_heartbeat&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;CONFLICT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DO&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;PUBLICATION&lt;/span&gt; &lt;span class="n"&gt;cdc_pub&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;cdc_heartbeat&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then on ClickHouse, lag is &lt;code&gt;now() - max(ts)&lt;/code&gt; from the replicated &lt;code&gt;cdc_heartbeat&lt;/code&gt; table. Export the result to your metrics backend, alert when it goes stale.&lt;/p&gt;

&lt;h3&gt;
  
  
  The hard ceiling
&lt;/h3&gt;

&lt;p&gt;The constraint that ends MaterializedPostgreSQL is not technical: it is not supported on ClickHouse Cloud. The moment you decide to stop operating ClickHouse yourself, this option is off the table.&lt;/p&gt;

&lt;h2&gt;
  
  
  Option 3: ClickHouse Cloud and ClickPipes
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclkfs7fgdqurv3w1mq5n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclkfs7fgdqurv3w1mq5n.png" alt="The application writes to Postgres with a publication configured; WAL changes flow through a replication slot into managed ClickPipes, which writes into ClickHouse Cloud serving dashboards. ClickPipes also emits metrics, lag, and errors to its built-in UI." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://clickhouse.com/docs/integrations/clickpipes/postgres" rel="noopener noreferrer"&gt;ClickPipes&lt;/a&gt; does the same job as MaterializedPostgreSQL (read Postgres logical replication, write into ClickHouse) but as a managed pipeline on ClickHouse Cloud. Under the hood it is PeerDB. The setup is a connection string, a publication name, a slot name, and a destination database. There is a UI, an OpenAPI, and a Terraform provider.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"clickhouse_clickpipe"&lt;/span&gt; &lt;span class="s2"&gt;"pg_to_ch"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"pg-prod-events"&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"postgres"&lt;/span&gt;
  &lt;span class="nx"&gt;connection_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pg_dsn&lt;/span&gt;
  &lt;span class="nx"&gt;publication&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cdc_pub"&lt;/span&gt;
  &lt;span class="nx"&gt;slot_name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cdc_slot"&lt;/span&gt;
  &lt;span class="nx"&gt;tables&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"public.events"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"public.users"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"public.orders"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;destination_db&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app_analytics"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Auto-created tables are still &lt;code&gt;ReplacingMergeTree&lt;/code&gt; but with PeerDB's metadata columns instead of MaterializedPostgreSQL's:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;                  &lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt;             &lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt;               &lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt;          &lt;span class="n"&gt;DateTime64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt;             &lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_peerdb_synced_at&lt;/span&gt;   &lt;span class="n"&gt;DateTime64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now64&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;_peerdb_is_deleted&lt;/span&gt;  &lt;span class="n"&gt;Int8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_peerdb_version&lt;/span&gt;     &lt;span class="n"&gt;Int64&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ReplacingMergeTree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_peerdb_version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;_peerdb_synced_at&lt;/code&gt; gives you per-row freshness without building a heartbeat. &lt;code&gt;_peerdb_is_deleted = 1&lt;/code&gt; rows are deletes you should filter out in queries (or collapse with &lt;code&gt;FINAL&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The differences that matter at this stage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lag, throughput, error rate, and per-table replication status are first-class metrics in the UI.&lt;/li&gt;
&lt;li&gt;DDL on the source side does not break replication. Adding columns just propagates. Some type changes still require a resync.&lt;/li&gt;
&lt;li&gt;You can pick the destination engine, select columns, and apply basic transformations in the pipe.&lt;/li&gt;
&lt;li&gt;Replication slot lifecycle, ClickHouse upgrades, disk pressure, and backups are not your problem.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The line item is bigger than self-hosted. The honest comparison is what one engineer's day is worth, multiplied by the days that used to go into the pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  A real query: Postgres vs ClickHouse
&lt;/h2&gt;

&lt;p&gt;The whole reason you set up CDC is so this query stops killing your primary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Postgres: 30-day daily active count by country with a join,&lt;/span&gt;
&lt;span class="c1"&gt;-- on a few tens of millions of events. Slow, eats CPU, blocks transactions.&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;date_trunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'day'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;                       &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                        &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;       &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dau&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;   &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt;   &lt;span class="n"&gt;users&lt;/span&gt;  &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;  &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt;  &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt;  &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- ClickHouse equivalent against the CDC tables. FINAL deduplicates&lt;/span&gt;
&lt;span class="c1"&gt;-- ReplacingMergeTree rows; _peerdb_is_deleted filters tombstones.&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;toDate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;             &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;               &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;uniqExact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dau&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;   &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;FINAL&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="k"&gt;FINAL&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;  &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt; &lt;span class="k"&gt;DAY&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt;  &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_peerdb_is_deleted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt;  &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt;  &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ClickHouse's published Postgres-to-ClickHouse modeling guide reports the same kind of query going from minutes on Postgres to milliseconds on ClickHouse on similar shapes. Specific numbers depend on cardinality, partition layout, and whether you have a projection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tuning the destination beyond the auto-created defaults
&lt;/h2&gt;

&lt;p&gt;Once you outgrow &lt;code&gt;ReplacingMergeTree&lt;/code&gt; ordered by PK, the move is to define your own destination tables and have ClickPipes write into them with selected columns and a custom engine. A typical pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Tuned for date-range scans; partitioned for cheap retention&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;events_tuned&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt;     &lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt;       &lt;span class="n"&gt;LowCardinality&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt;  &lt;span class="nb"&gt;DateTime&lt;/span&gt; &lt;span class="n"&gt;CODEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Delta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ZSTD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt;     &lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_version&lt;/span&gt;    &lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_deleted&lt;/span&gt;    &lt;span class="n"&gt;Int8&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ReplacingMergeTree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;toYYYYMM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Pre-aggregated rollup fed off events_tuned via a materialized view&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;daily_event_rollup&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;day&lt;/span&gt;      &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt;    &lt;span class="n"&gt;LowCardinality&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;events&lt;/span&gt;   &lt;span class="n"&gt;AggregateFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;dau&lt;/span&gt;      &lt;span class="n"&gt;AggregateFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uniq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ENGINE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AggregatingMergeTree&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;AggregatingMergeTree&lt;/code&gt; keeps the rollup tiny and dashboards fast. Materialized views consume from the CDC table and write the partial aggregates. None of this is reachable on the MaterializedPostgreSQL engine, where you do not get to choose.&lt;/p&gt;

&lt;h2&gt;
  
  
  Capability comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;MaterializedPostgreSQL&lt;/th&gt;
&lt;th&gt;Kafka + Debezium&lt;/th&gt;
&lt;th&gt;ClickPipes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Storage in ClickHouse&lt;/td&gt;
&lt;td&gt;Local replicated tables&lt;/td&gt;
&lt;td&gt;Local, written by your sink&lt;/td&gt;
&lt;td&gt;Local replicated tables&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Initial snapshot&lt;/td&gt;
&lt;td&gt;Yes, then WAL tail&lt;/td&gt;
&lt;td&gt;Yes, then WAL tail&lt;/td&gt;
&lt;td&gt;Parallel snapshot, then WAL tail&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOAST values&lt;/td&gt;
&lt;td&gt;Not replicated; default written&lt;/td&gt;
&lt;td&gt;Replicated&lt;/td&gt;
&lt;td&gt;Replicated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DDL on source&lt;/td&gt;
&lt;td&gt;Breaks affected table; manual recovery&lt;/td&gt;
&lt;td&gt;Schema registry handles compatibly&lt;/td&gt;
&lt;td&gt;Most column changes propagate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Destination engine choice&lt;/td&gt;
&lt;td&gt;No (ReplacingMergeTree only)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Column selection / transforms&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (in sink)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Available on ClickHouse Cloud&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (any sink)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Connection pooler support&lt;/td&gt;
&lt;td&gt;Direct Postgres&lt;/td&gt;
&lt;td&gt;Direct Postgres&lt;/td&gt;
&lt;td&gt;Direct Postgres only (no PgBouncer txn)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Status&lt;/td&gt;
&lt;td&gt;Experimental&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;td&gt;GA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operational cost&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Lowest of the three&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  When to pick which
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Small team, low budget, simple analytics needs.&lt;/strong&gt; MaterializedPostgreSQL. One container, sub-second replication, ship the same day. Build a runbook for the DDL trap before you hit it. Note that this option is gone the moment you move to ClickHouse Cloud.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You already operate Kafka or need event streaming for reasons beyond CDC.&lt;/strong&gt; Debezium and Kafka. Maximum flexibility, real cost, only worth it if the rest of your platform earns the Kafka tax.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Past the early stage, real analytics traffic, do not want to operate the pipeline.&lt;/strong&gt; ClickHouse Cloud and ClickPipes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The shape most teams converge on: MaterializedPostgreSQL first because you can ship it in an afternoon, then ClickPipes (or Kafka if the rest of your platform is going that way) when the trade-offs of the simple setup cost more than the operational tax of the managed one. Over-engineering at the start is the most expensive mistake, because you do not yet know which trade-offs will matter to you. Staying on the simple setup after you do know is the second most expensive.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>clickhouse</category>
      <category>database</category>
      <category>dataengineering</category>
    </item>
  </channel>
</rss>
