Let’s cut to the chase.
I’m working on a banking application made up of 10+ microservices, each owning its own schema and data domain. Some real-world challenges kept repeating:
- ❓ How do I keep queries blazing fast without hammering my transactional database?
- ❓ How do I fetch nested data (like user → accounts → cards) without orchestrating 3+ services?
- ❓ How do I scale reads and writes independently?
That’s when I decided to lean fully into CQRS — with PostgreSQL for writes, MongoDB for reads, and the real magic: Debezium + Kafka to wire it all together. ⚡
🏦 The Setup: A Modular Banking System
Here’s a simplified breakdown:
✅ User Service
- Tables: `user`, `user_metadata`, `user_address`, `user_kyc`
✅ Account Service
- Tables: `account`, `account_metadata`
- Types: savings, loan, FD, credit
✅ Card Service
- Tables: `cards`, `card_controls`
- Types: debit, credit, forex
Each service uses PostgreSQL as its transactional store and exposes APIs for CRUD operations.
🔍 The Query Headache
Here are the actual use cases I wanted to support:
- Get all accounts for a user
- Get all cards for an account
- Get all accounts and cards for a user in one shot
Sounds simple — but across microservices, this turns into a mess of joins, API calls, and orchestration.
💡 The CQRS Solution
Here’s the architectural shift I made:
- ✅ Writes go to PostgreSQL (normalized, ACID-compliant)
- ✅ Reads happen via MongoDB (denormalized, query-optimized)
- 🔄 Changes from Postgres are synced in real-time using Debezium → Kafka → Mongo
This lets me build denormalized Mongo documents like:
```json
{
  "userId": "u123",
  "name": "Alice",
  "accounts": [
    {
      "accountId": "a456",
      "type": "Savings",
      "cards": [
        {
          "cardId": "c789",
          "type": "Debit",
          "limit": 50000
        }
      ]
    }
  ]
}
```
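On the read side, each of the query headaches above becomes a single document lookup. Here's a minimal sketch, assuming Spring Data MongoDB; the `user_profiles` collection and all class names are hypothetical, simply mirroring the JSON above:

```java
import java.util.List;
import java.util.Optional;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.repository.MongoRepository;

// Denormalized read model mirroring the JSON document above (names are hypothetical).
@Document("user_profiles")
class UserProfileDocument {
    @Id String userId;
    String name;
    List<AccountView> accounts;

    static class AccountView {
        String accountId;
        String type;
        List<CardView> cards;
    }

    static class CardView {
        String cardId;
        String type;
        long limit;
    }
}

// One lookup replaces the user -> accounts -> cards orchestration across services.
interface UserProfileRepository extends MongoRepository<UserProfileDocument, String> {
    Optional<UserProfileDocument> findByUserId(String userId);
}
```

A single `findByUserId` call now answers "get all accounts and cards for a user in one shot" without touching the other services.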
🧪 Implementation: Two Approaches I Tried
❌ Option 1: JPA Entity Listeners
Use Spring JPA lifecycle hooks like `@PostPersist` and `@PostUpdate` to listen for changes:
- Capture the `before` and `after` states of the entity
- Push updates to Kafka or any message queue
- A downstream service listens and updates the Mongo read model (a sketch follows below)
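For illustration, here's a minimal sketch of the listener approach, assuming Spring Kafka; the `entity-changes` topic and the static wiring are hypothetical. Note that JPA instantiates entity listeners itself, so even getting a `KafkaTemplate` into one requires extra plumbing:

```java
import jakarta.persistence.PostPersist;
import jakarta.persistence.PostUpdate;
import org.springframework.kafka.core.KafkaTemplate;

// Attached to each entity with @EntityListeners(ChangePublishingListener.class).
public class ChangePublishingListener {

    // JPA creates listener instances itself, so the template has to be
    // handed in statically from Spring configuration at startup.
    private static KafkaTemplate<String, Object> kafkaTemplate;

    public static void setKafkaTemplate(KafkaTemplate<String, Object> template) {
        kafkaTemplate = template;
    }

    @PostPersist
    @PostUpdate
    public void onChange(Object entity) {
        // Publish the changed entity; a downstream service updates the Mongo read model.
        kafkaTemplate.send("entity-changes", entity);
    }
}
```

Multiply this wiring by every entity in every service, and the problems below follow quickly.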
📉 Why I Moved Away:
- Too tightly coupled with business logic
- Doesn’t capture raw DB writes (e.g., SQL scripts or tools)
- Hard to maintain consistency across services
- Requires manual plumbing for every entity
✅ Option 2: Debezium + Kafka (Deep Dive)
Debezium is a CDC (Change Data Capture) tool that listens to changes in your database (like INSERTs, UPDATEs, DELETEs) and publishes those changes to Kafka topics in real time. Your services can then consume these topics and build query-optimized read models (e.g., in MongoDB).
🔍 How It Works — Step by Step
1. **PostgreSQL emits changes to the WAL (Write-Ahead Log)**
   - WAL logs are internal logs that track every change made to the DB.
   - They are designed for crash recovery, but Debezium uses them for streaming.
2. **A replication slot is created in PostgreSQL**
   - Debezium uses this slot to track the WAL changes.
   - Think of it as a pointer to the latest event — so Debezium knows what to stream next.
3. **Debezium connects to PostgreSQL via logical decoding**
   - Logical decoding turns WAL logs into human-readable change events.
   - Debezium uses this to get structured data like `{before, after}`.
4. **Debezium pushes changes to Kafka**
   - Each table you're monitoring emits to its own Kafka topic.
   - Format: `serverName.schema.table`, e.g., `banking_app.public.user`.
5. **Kafka consumers (like your Mongo Writer Service) consume the events and store them in the read DB** (see the sketch after this list).
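Here's a minimal sketch of step 5, assuming Spring Kafka with String-deserialized records and Spring Data MongoDB; the group id, collection name, and column names (`id`, `name`) are assumptions based on the schema above:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

// Hypothetical Mongo Writer Service: consumes Debezium change events
// and upserts them into the denormalized read model.
@Service
public class UserChangeConsumer {

    private final MongoTemplate mongo;
    private final ObjectMapper mapper = new ObjectMapper();

    public UserChangeConsumer(MongoTemplate mongo) {
        this.mongo = mongo;
    }

    // Topic name follows Debezium's serverName.schema.table convention.
    @KafkaListener(topics = "banking_app.public.user", groupId = "mongo-writer")
    public void onUserChange(String message) throws Exception {
        JsonNode row = mapper.readTree(message);
        // With JsonConverter's schema envelope enabled, the row sits under "payload";
        // the ExtractNewRecordState transform has already flattened it to the "after" state.
        if (row.has("payload")) {
            row = row.get("payload");
        }
        // Upsert so inserts and updates take the same path.
        Query byId = new Query(Criteria.where("userId").is(row.get("id").asText()));
        Update update = new Update().set("name", row.get("name").asText());
        mongo.upsert(byId, update, "user_profiles");
    }
}
```

Consumers for the account and card topics would update the nested `accounts` and `cards` arrays of the same document in the same way.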
🧠 Key Concepts
🔌 PostgreSQL Replication Slot
A replication slot keeps track of how much of the WAL has been read by Debezium. It ensures:
- No WAL segments are deleted until they've been consumed
- Debezium doesn’t miss any change
- Only one logical consumer per slot
📌 Important: Don’t forget to drop the slot if you stop using Debezium, or your disk will fill up with old WAL logs!
```sql
-- View existing slots
SELECT * FROM pg_replication_slots;

-- Create one manually (Debezium normally creates this for you)
SELECT * FROM pg_create_logical_replication_slot('my_slot', 'pgoutput');

-- Drop a slot you no longer need, so old WAL segments can be reclaimed
SELECT pg_drop_replication_slot('my_slot');
```
🔁 Streaming Changes from PostgreSQL to Kafka
With the replication slot in place, Debezium streams real-time changes from PostgreSQL into Apache Kafka, making it easy to sync data into systems like MongoDB, Elasticsearch, or downstream microservices.
📁 Debezium Kafka Connector Config (PostgreSQL)
Here's a sample Kafka Connect configuration for Debezium to monitor changes in a PostgreSQL database:
```json
{
  "name": "banking-postgres-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "plugin.name": "pgoutput",
    "database.hostname": "localhost",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.dbname": "banking",
    "database.server.name": "banking_app",
    "table.include.list": "public.user,public.account,public.cards",
    "include.schema.changes": "false",
    "slot.name": "banking_slot",
    "tombstones.on.delete": "false",
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter"
  }
}
```
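To deploy it, POST this JSON to the Kafka Connect REST API (the `/connectors` endpoint, port 8083 by default). Debezium then creates the `banking_slot` replication slot and starts streaming changes from the three tables.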