Dmitry Narizhnyhkh

MySQL CDC: Real-Time Replication with Binlog (Complete Guide 2026)

Most MySQL CDC guides stop at "enable binlog and stream changes".

In practice, that’s not the hard part.

What actually matters shows up once you try to run it in a real system.


What is MySQL CDC?

MySQL Change Data Capture (CDC) is a way to track and stream changes from a database in real time.

Instead of scanning full tables, CDC reads only what changed: inserts, updates, and deletes.

In MySQL, this is typically done using the binary log (binlog), which records every data modification as a sequence of events.

These events can then be applied to another system, keeping it in sync with the source database.


How MySQL CDC Works

At a high level, MySQL CDC is simple:

  1. MySQL writes every change to the binlog
  2. A reader parses those events
  3. Changes are applied to a target system

With binlog_format set to ROW, the binlog is an ordered sequence of events, most of which describe row-level changes.

Everything else — ordering, retries, consistency — is where things get tricky.
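
The three steps above can be sketched in a few lines. This is a minimal in-memory illustration, not a real binlog reader: `RowEvent`, `apply_event`, and `replicate` are hypothetical names, and the target is a plain dictionary keyed by primary key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowEvent:
    """Hypothetical, simplified stand-in for a decoded binlog row event."""
    op: str                       # "insert", "update", or "delete"
    key: int                      # primary key of the affected row
    after: Optional[dict] = None  # row image after the change (None for deletes)


def apply_event(target: dict, event: RowEvent) -> None:
    """Apply one change to an in-memory target keyed by primary key."""
    if event.op in ("insert", "update"):
        # Upserting makes a replay after a retry harmless.
        target[event.key] = event.after
    elif event.op == "delete":
        # pop with a default tolerates a delete that was already applied.
        target.pop(event.key, None)
    else:
        raise ValueError(f"unknown operation: {event.op}")


def replicate(events, target: dict) -> dict:
    # Events are applied in the order they were read from the log.
    for event in events:
        apply_event(target, event)
    return target


events = [
    RowEvent("insert", 1, {"id": 1, "name": "alice"}),
    RowEvent("update", 1, {"id": 1, "name": "alicia"}),
    RowEvent("insert", 2, {"id": 2, "name": "bob"}),
    RowEvent("delete", 2),
]
target = replicate(events, {})
print(target)  # {1: {'id': 1, 'name': 'alicia'}}
```

Writing the apply step as an upsert plus a tolerant delete is one common way to make retries safe: replaying the same events a second time leaves the target unchanged.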


Why Use CDC?

Common use cases:

  • keeping a warehouse in sync
  • zero-downtime migrations
  • feeding analytics or search systems

CDC Implementation Methods

In practice, almost all production setups use binlog-based CDC.

Trigger-based and query-based (timestamp polling) approaches still exist, but they don’t scale well and are rarely used in real systems.

Comparison

Method        | Latency      | Performance impact | Complexity | Use case
Trigger-based | Real-time    | High               | Low        | Small-scale setups
Query-based   | Minutes      | Medium             | Low        | Simple polling-based sync
Binlog-based  | Milliseconds | Minimal            | Medium     | Production systems
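
To make the query-based row concrete, here is a rough sketch of timestamp polling. It uses SQLite from the Python standard library as a stand-in for MySQL, and the `users` table, its `updated_at` column, and `poll_changes` are assumptions made for the example. It also shows one reason polling falls short: a hard delete leaves no row behind for the next poll to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)", [(1, "alice", 100), (2, "bob", 100)]
)


def poll_changes(conn, last_seen: int):
    """Return rows modified after last_seen plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM users "
        "WHERE updated_at > ? ORDER BY updated_at, id",
        (last_seen,),
    ).fetchall()
    new_mark = max((row[2] for row in rows), default=last_seen)
    return rows, new_mark


first, mark = poll_changes(conn, 0)  # the first poll sees both rows

conn.execute("UPDATE users SET name = 'alicia', updated_at = 200 WHERE id = 1")
conn.execute("DELETE FROM users WHERE id = 2")

second, mark = poll_changes(conn, mark)  # sees the update, not the delete
print(second)  # [(1, 'alicia', 200)]
```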

Configuring MySQL for CDC

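Minimum required settings (binary logging itself must also be on: log_bin is enabled by default in MySQL 8.0 and can only be changed at server startup):

-- SET GLOBAL is lost on restart and affects only new sessions; use SET PERSIST (8.0+) or my.cnf to make it permanent.
SET GLOBAL binlog_format = 'ROW';
SET GLOBAL binlog_row_image = 'FULL';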

Create a user with replication privileges:

CREATE USER 'cdc_user'@'%' IDENTIFIED BY 'password';
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'cdc_user'@'%';
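
Before starting a stream it can help to confirm that the settings above actually took effect. The sketch below is one possible pre-flight check, not part of any tool: `check_binlog_settings` is a hypothetical helper that takes a dictionary of variable names and values, which in practice you would fill from `SHOW GLOBAL VARIABLES` on the source. The `log_bin` entry is an addition to the two settings listed above, since nothing is written to the binlog when binary logging is off.

```python
# Expected values; the keys are MySQL system variable names.
REQUIRED = {
    "log_bin": "ON",
    "binlog_format": "ROW",
    "binlog_row_image": "FULL",
}


def check_binlog_settings(variables: dict) -> list:
    """Return a list of readable problems; an empty list means the check passed."""
    problems = []
    for name, expected in REQUIRED.items():
        actual = str(variables.get(name, "<unset>")).upper()
        if actual != expected:
            problems.append(f"{name} is {actual}, expected {expected}")
    return problems


# Sample input; real values would come from SHOW GLOBAL VARIABLES.
sample = {"log_bin": "ON", "binlog_format": "MIXED", "binlog_row_image": "FULL"}
print(check_binlog_settings(sample))  # ['binlog_format is MIXED, expected ROW']
```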

Other MySQL CDC Tools

Most CDC setups fall into a few categories:

  • Debezium: log-based CDC, usually deployed on Kafka Connect (Debezium Server and the embedded engine can run without Kafka)
  • Airbyte: connector-heavy, mostly batch
  • Fivetran: managed SaaS, usage-based pricing
  • AWS DMS: migration-focused, AWS-centric

Each solves part of the problem, but often requires combining multiple tools.

Full comparison:


How It Looks in Practice

In many setups, CDC is not configured by hand-writing JSON.

The typical flow is:

  • create a source connection
  • create a target
  • start a CDC stream

The system handles binlog parsing, ordering, and delivery.

Step-by-step guide:


Start and Monitor the Stream

After setup, starting CDC is just one action.

From there, the system continuously reads binlog events and applies them to the target.

What matters in practice:

  • replication lag
  • throughput
  • failure handling

Most problems don’t come from setup — they show up while the stream is running.
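
As a rough illustration of the first two metrics, the sketch below computes lag and throughput from plain numbers. The function names and the 30-second threshold are assumptions for the example. It relies on the fact that each binlog event carries the time it was written on the source, so lag can be estimated as apply time minus event time.

```python
def replication_lag_seconds(event_timestamp: float, applied_at: float) -> float:
    """Lag = when the target applied the event minus when the source wrote it."""
    # Clamp at zero so small clock skew between hosts never reports negative lag.
    return max(0.0, applied_at - event_timestamp)


def throughput_per_second(event_count: int, window_seconds: float) -> float:
    """Average number of events applied per second over a measurement window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return event_count / window_seconds


def is_healthy(lag_seconds: float, max_lag_seconds: float = 30.0) -> bool:
    # The 30-second default is an arbitrary example, not a recommendation.
    return lag_seconds <= max_lag_seconds


lag = replication_lag_seconds(event_timestamp=1000.0, applied_at=1012.5)
print(lag, is_healthy(lag))            # 12.5 True
print(throughput_per_second(500, 10))  # 50.0
```

Lag that keeps growing while throughput stays flat usually means the target cannot keep up with the source; lag that jumps after a pause usually points to a reader that stopped and restarted.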


Common Challenges

Initial data load

CDC only captures changes going forward.

That means existing data has to be copied before CDC starts.

For large tables, the typical approach is:

  • run a one-time bulk load first
  • then switch to CDC for ongoing changes

Skipping this step leaves the target without any row that has not changed since the stream started, so source and target stay inconsistent.
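
The handoff between the two phases is where gaps and duplicates tend to creep in. A common approach is to record the binlog position at the moment the bulk copy is taken and start CDC from exactly that point. The sketch below simulates this in memory; `initial_load_then_cdc` is a hypothetical function, and positions are simplified to integers, whereas a real position is a binlog file name plus offset, or a GTID set.

```python
def initial_load_then_cdc(snapshot_rows: dict, snapshot_position: int, events: list) -> dict:
    """
    snapshot_rows: table contents as of snapshot_position, keyed by primary key.
    events: (position, op, key, row) tuples in binlog order.
    """
    target = dict(snapshot_rows)            # step 1: one-time bulk load
    for position, op, key, row in events:   # step 2: ongoing changes
        if position <= snapshot_position:
            continue                        # already reflected in the snapshot
        if op == "delete":
            target.pop(key, None)
        else:                               # insert or update
            target[key] = row
    return target


snapshot = {1: "alice", 2: "bob"}           # taken at position 100
events = [
    (90, "insert", 2, "bob"),               # before the snapshot: skipped
    (110, "update", 1, "alicia"),
    (120, "insert", 3, "carol"),
    (130, "delete", 2, None),
]
print(initial_load_then_cdc(snapshot, 100, events))  # {1: 'alicia', 3: 'carol'}
```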


Testing and rollout

Before running CDC on the full dataset, it’s common to:

  • start with a few tables
  • run the stream for a limited time
  • verify consistency

This helps catch issues early without affecting production systems.
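
One simple way to verify consistency on a small set of tables is to compare a checksum of the rows on each side, then list the keys that differ. The sketch below works on rows already fetched into Python; `table_checksum` and `find_mismatched_keys` are hypothetical helpers, and the approach assumes both systems return values in comparable types, which is not always true across different databases.

```python
import hashlib


def table_checksum(rows) -> str:
    """Order-independent digest of an iterable of row tuples."""
    digest = hashlib.sha256()
    # Sort by repr so the two sides may return rows in different orders
    # and rows containing None still sort without errors.
    for row in sorted(rows, key=repr):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def find_mismatched_keys(source: dict, target: dict) -> list:
    """Keys that are missing on either side or whose rows differ."""
    keys = set(source) | set(target)
    return sorted(k for k in keys if source.get(k) != target.get(k))


source = {1: (1, "alice"), 2: (2, "bob"), 3: (3, "carol")}
target = {1: (1, "alice"), 2: (2, "bobby")}

if table_checksum(source.values()) != table_checksum(target.values()):
    print(find_mismatched_keys(source, target))  # [2, 3]
```

Checksums are cheap to compare but only say that something differs; the key-level diff is what tells you where to look. On a live stream the source keeps changing, so a comparison like this is most reliable when writes are paused or the check is limited to rows older than the current lag.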


Throughput and latency

Throughput depends heavily on network conditions.

In high-latency environments, batching becomes important to avoid excessive round trips.

Most systems expose this as a configurable parameter, but defaults are usually enough to get started.
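
A quick back-of-the-envelope sketch shows why batching matters when each write to the target costs a network round trip. The names `batches` and `total_round_trip_ms`, the 50 ms round-trip time, and the batch size of 500 are illustrative assumptions, not values taken from any particular tool.

```python
from typing import Iterable, Iterator, List


def batches(events: Iterable, batch_size: int) -> Iterator[List]:
    """Group a stream of events into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch


def total_round_trip_ms(event_count: int, batch_size: int, rtt_ms: int) -> int:
    """Network wait if every batch costs exactly one round trip."""
    round_trips = -(-event_count // batch_size)  # ceiling division
    return round_trips * rtt_ms


print(list(batches(range(5), 2)))            # [[0, 1], [2, 3], [4]]
print(total_round_trip_ms(10_000, 1, 50))    # 500000
print(total_round_trip_ms(10_000, 500, 50))  # 1000
```

With a 50 ms round trip, sending 10,000 events one at a time spends about 500 seconds waiting on the network, while batches of 500 bring that down to about one second. A real stream never ends, so a production batcher would also flush on a timer to keep lag bounded when traffic is light.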


FAQ

Can MySQL CDC capture schema changes?
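Yes. DDL statements are written to the binlog as statement events even in ROW format, so a binlog-based reader can see them. Whether they are applied to the target depends on the tool and its configuration.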

What MySQL version is required?
MySQL 5.7+ works, but 8.0+ is recommended for production; 5.7 reached end of life in October 2023.

Does CDC impact performance?
Binlog-based CDC has minimal impact. Trigger-based approaches can slow down writes.

Does CDC work with cloud databases?
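Yes. AWS RDS, Google Cloud SQL, and Azure Database for MySQL all support it. On managed services, binlog settings are usually changed through the provider's parameter groups or flags rather than SET GLOBAL.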

How do you handle schema changes?
Schema changes are one of the trickier parts. Most setups require coordination and sometimes stream reconfiguration.


Summary

MySQL CDC itself is straightforward:

read binlog → apply changes

The complexity comes from everything around it:

  • initial data load
  • consistency during replication
  • monitoring and recovery

Different tools mostly differ in how much of that they handle for you.

That’s where most CDC implementations either stay simple or become a mess.


Try it yourself

The fastest way to understand CDC is to run it.

Create your first stream:

Runs as a desktop app (Windows, macOS, Linux) or via Docker.
MySQL → PostgreSQL, S3, or files, real-time sync, no Kafka.


Originally published at:
https://streams.dbconvert.com/blog/mysql-change-data-capture/
