DEV Community

SoftwareDevs mvpfactory.io
SoftwareDevs mvpfactory.io

Posted on • Originally published at mvpfactory.io

PostgreSQL Partial Replication with Logical Decoding

---
title: "PostgreSQL Partial Replication: Skip the CDC Pipeline"
published: true
description: "Learn how PostgreSQL 15+ logical replication with row filters and column lists can replace Debezium/Kafka CDC for selective microservice data streaming."
tags: postgresql, architecture, devops, api
canonical_url: https://blog.mvpfactory.co/postgresql-partial-replication-skip-the-cdc-pipeline
---

## What We Will Build

In this workshop, I will walk you through setting up PostgreSQL 15+ logical replication with row filters and column lists so each downstream microservice receives only the data it actually needs. No Debezium. No Kafka Connect. No schema registry. Just PostgreSQL infrastructure you already run.

By the end, you will have per-service publications with filtered replication, proper WAL bloat prevention, and a monitoring query you can drop straight into your observability stack.

## Prerequisites

- PostgreSQL 15 or later (row filters and column lists require PG15+)
- At least two PostgreSQL instances (one primary, one subscriber)
- `wal_level = logical` set on the primary
- Basic familiarity with SQL DDL and replication concepts

## Step 1: Understand Why This Exists

Most teams reach for Debezium + Kafka the moment two services need shared data. That stack costs a lot to operate: ZooKeeper or KRaft clusters, Connect workers, schema registries, offset management, connector configs that silently break on DDL changes.

If your architecture is 3–8 services that each need a materialized read model from a shared PostgreSQL primary, built-in logical replication may be everything you need.

## Step 2: Create Filtered Publications

Here is the minimal setup to get this working. PostgreSQL's `CREATE PUBLICATION` now supports `WHERE` clauses and column lists directly:

Enter fullscreen mode Exit fullscreen mode


sql
-- Service: order-fulfillment
-- Only replicate orders that are paid and awaiting shipment
CREATE PUBLICATION pub_fulfillment
FOR TABLE orders (id, customer_id, status, shipping_address, created_at)
WHERE (status IN ('paid', 'processing'));

-- Service: analytics-ingest
-- Replicate all orders but only the columns needed for reporting
CREATE PUBLICATION pub_analytics
FOR TABLE orders (id, total_cents, currency, region, created_at),
TABLE line_items (id, order_id, sku, quantity, unit_price_cents);


Each subscriber connects to its own publication and receives only the filtered subset. No application-level filtering, no wasted bandwidth. The important detail: row filtering happens on the publisher side. The primary does the work, but avoids serializing and transmitting data the subscriber would discard.

## Step 3: Know the Trade-offs

| Dimension | PG Logical Replication | Debezium + Kafka |
|---|---|---|
| Additional infrastructure | None | Kafka cluster, Connect workers, schema registry |
| Row-level filtering | Native (`WHERE`) | SMT or consumer-side |
| Column filtering | Native (column lists) | SMT `ReplaceField` or downstream |
| Throughput ceiling | ~5K–15K TPS per slot | Horizontally scalable via partitions |
| Schema evolution | Manual `ALTER SUBSCRIPTION REFRESH` | Schema registry handles most cases |
| Fan-out to non-PG consumers | Not supported | Any Kafka consumer |

If your write throughput is under 10K TPS and all consumers are PostgreSQL databases, native replication wins on operational simplicity. It is not close.

## Step 4: Prevent WAL Bloat on Day One

Let me show you a pattern I use in every project. Every replication slot tells PostgreSQL: "Do not recycle WAL segments past this point." If a subscriber goes down, WAL accumulates on the primary. This is the single most common way logical replication causes outages.

Enter fullscreen mode Exit fullscreen mode


sql
ALTER SYSTEM SET max_slot_wal_keep_size = '10GB';
SELECT pg_reload_conf();


Then monitor continuously:

Enter fullscreen mode Exit fullscreen mode


sql
SELECT
slot_name,
active,
pg_size_pretty(
pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
) AS retained_wal,
pg_size_pretty(
pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
) AS total_retained
FROM pg_replication_slots
WHERE slot_type = 'logical';


Set alerts on `retained_wal` exceeding 20% of your available disk. When `max_slot_wal_keep_size` is breached, PostgreSQL invalidates the slot. The subscriber must be re-initialized, but your primary survives.

## Gotchas

- **Unmonitored slots will take down your primary.** An inactive replication slot is a disk-full outage waiting to happen. Drop inactive slots immediately when decommissioning a service: `SELECT pg_drop_replication_slot('slot_name');`
- **Name slots descriptively.** Use `sub_fulfillment_v1`, not `sub1`. Future you will thank present you.
- **DDL changes require manual intervention.** Schema evolution means running `ALTER SUBSCRIPTION REFRESH PUBLICATION` on the subscriber. The docs do not mention this prominently, but skipping it silently breaks replication.
- **Logical slots do not replicate to standbys.** If you use physical replication for HA, test failover carefully. PG17 introduces `sync_replication_slots`, but on earlier versions you must recreate slots after promotion.
- **Know when to graduate.** Move to Debezium/Kafka when you need non-PostgreSQL consumers (Elasticsearch, Redis, data lakes), write throughput exceeds what a single slot handles, or you need exactly-once delivery semantics beyond `pg_replication_origin`.

## Wrapping Up

Design your publications around domain boundaries. When you eventually need Debezium, the table-to-service mapping already exists — your publication definitions become your CDC connector config blueprint. Start with what PostgreSQL gives you. You probably already have the infrastructure.
Enter fullscreen mode Exit fullscreen mode

Top comments (0)