adriens

🪄 Debezium: the magic behind data capture & async replication (for free)

🪝 Teaser

Have you ever found yourself in a situation where:

  • Team 🇦 pushes data into a given database (let's say MySQL,...) with its very own custom software
  • Team 🇧 needs to get these data changes (INSERT, UPDATE, DELETE) as events so they can put them in, let's say, another database (MariaDB, PostgreSQL,...)
  • The base software cannot be changed: you have to "deal with it"

E.g., team B's motivation may be to do data science, real-time analytics, store the data in a data lake,...

👉 This blog post is dedicated to this case... and, surprisingly, open source solutions do exist to achieve this magic!

🤔 About the "why"

The Debezium project's "why" is pretty straightforward:

"Turn your databases into change event streams"

... even for "legacy"-like systems.

👂 How it does NOT work (why it's awesome)

The key thing to remember here is that Debezium does NOT act as a proxy in front of the database, and that's the most elegant part.

Instead, Debezium literally listens to database changes, whatever you call them, then sends these events in a common standard format as Kafka messages... waiting to be used later by one or many consumers.
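To make the "common standard format" a bit more concrete, here is a minimal Python sketch that reads one change event. The event itself is hypothetical and heavily trimmed: it only assumes the usual Debezium envelope fields (`before`, `after`, `source`, `op`, `ts_ms`), and the database, table and values are made up for illustration.

```python
import json

# Hypothetical, trimmed-down change event (only the "payload" part of the
# envelope; real events carry a schema and much more "source" metadata).
RAW_EVENT = json.dumps({
    "payload": {
        "before": {"id": 1004, "email": "old@example.com"},
        "after": {"id": 1004, "email": "new@example.com"},
        "source": {"connector": "mysql", "db": "inventory", "table": "customers"},
        "op": "u",
        "ts_ms": 1700000000000,
    }
})

# Operation codes used in the "op" field of the envelope.
OPERATIONS = {"c": "INSERT", "u": "UPDATE", "d": "DELETE", "r": "SNAPSHOT READ"}


def describe_change(raw: str) -> str:
    """Summarize a single change event as '<operation> on <db>.<table>'."""
    payload = json.loads(raw)["payload"]
    source = payload["source"]
    operation = OPERATIONS.get(payload["op"], "UNKNOWN")
    return f'{operation} on {source["db"]}.{source["table"]}'


print(describe_change(RAW_EVENT))
```

Whatever the source database is, a consumer only has to understand this envelope, which is what makes it possible to plug several consumers onto the same topic.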

🪄 How it works

The magic resides in the following workflow:

  1. Capture data changes at the database level (WAL for PostgreSQL, archive logs, whatever you call them...)
  2. Send/stream events to Kafka
  3. Consume Kafka events so that they can be pushed to any third-party data service
     3'. JDBC: for example "consume events from multiple source topics, and then write those events to a relational database by using a JDBC driver."
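Step 3 of the workflow above can be sketched in a few lines of Python. This is not the Debezium JDBC sink connector, just an illustration of the idea: the events are a hard-coded list instead of a Kafka topic, the target is an in-memory SQLite database, and `apply_change`, `replay` and the `customers` table are hypothetical names chosen for the example.

```python
import sqlite3


def apply_change(conn, table, key, event):
    """Apply one change event (envelope fields: op, before, after) to a table.

    Assumes `table` and `key` are trusted identifiers and that the primary
    key itself never changes on update.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        # Create, snapshot read and update are all handled as an upsert.
        after = event["after"]
        columns = ", ".join(after)
        placeholders = ", ".join("?" for _ in after)
        conn.execute(
            f"INSERT OR REPLACE INTO {table} ({columns}) VALUES ({placeholders})",
            tuple(after.values()),
        )
    elif op == "d":
        # On delete, only the "before" image identifies the row.
        conn.execute(f"DELETE FROM {table} WHERE {key} = ?", (event["before"][key],))
    conn.commit()


def replay(events):
    """Replay events into a fresh in-memory database and return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
    for event in events:
        apply_change(conn, "customers", "id", event)
    return conn.execute("SELECT id, email FROM customers ORDER BY id").fetchall()


EVENTS = [
    {"op": "c", "before": None, "after": {"id": 1, "email": "a@example.com"}},
    {"op": "c", "before": None, "after": {"id": 2, "email": "b@example.com"}},
    {"op": "u", "before": {"id": 1, "email": "a@example.com"},
     "after": {"id": 1, "email": "new@example.com"}},
    {"op": "d", "before": {"id": 2, "email": "b@example.com"}, "after": None},
]

print(replay(EVENTS))
```

Treating inserts and updates as the same upsert keeps the consumer idempotent, which helps if the same event happens to be delivered more than once.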


🍿 Demo from scratch

Below is the live demo I was able to run from scratch, simply by following the default instructions for a MySQL instance:
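For readers who prefer text, the central step of such a setup is registering a MySQL connector through the Kafka Connect REST API. The Python sketch below only builds that request. It assumes a Kafka Connect instance at `http://localhost:8083` and property names as found in recent Debezium 2.x documentation (older versions use different names, so check the docs for your version). Hostnames, credentials, topic names and the `inventory` database are placeholders.

```python
import json
import urllib.request


def build_mysql_connector(name, hostname, topic_prefix, database):
    """Build a connector definition (all values are illustrative placeholders)."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "tasks.max": "1",
            "database.hostname": hostname,
            "database.port": "3306",
            "database.user": "debezium",    # placeholder credentials
            "database.password": "dbz",     # placeholder credentials
            "database.server.id": "184054",
            "topic.prefix": topic_prefix,   # prefix of the Kafka topics
            "database.include.list": database,
            "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
            "schema.history.internal.kafka.topic": f"schema-changes.{database}",
        },
    }


def build_registration_request(connect_url, connector):
    """Prepare (but do not send) the POST request that registers the connector."""
    return urllib.request.Request(
        url=f"{connect_url}/connectors",
        data=json.dumps(connector).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


connector = build_mysql_connector("inventory-connector", "mysql", "dbserver1", "inventory")
request = build_registration_request("http://localhost:8083", connector)
print(request.get_method(), request.full_url)
# To actually register the connector against a running Kafka Connect:
# urllib.request.urlopen(request)
```

Nothing is sent over the network here: the `urlopen` call is left commented out so the sketch can be run without any running service.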

🔭 Going further
