DEV Community

Aniketh Deshpande
Aniketh Deshpande

Posted on

Change Data Capture - PostgreSQL

Change Data Capture is the concept of recording the changes in the database table fields.

  • It is very helpful in use cases where we want to track creation, updation, deletion of records in the table.

  • We might want to use this information to make changes in other databases or notify customers or notify other services.

  • Example:

    1. Save a copy of this data in a warehouse post transform.
    2. Trigger notification service to notify users about this change.
    3. Cache the data.

CDC In Postgres

  • Using Notify/Listen
    • NOTIFY provides a mechanism for interprocess communication between the database and the service that is LISTENing to this notification channel.
    • One or more services could be listening to this notification channel.
    • Name of this channel is usually the name of the database. However the user is free to set suitable names for these channels.
    • Any change in the table is captured by the DB and a trigger is initiated, which calls a function that formats the message to notify.
    • This usually contains the table name and the payload string.
    • The listening server registers to the channel and gets the message from the DB.
    • The service can then use this message and perform operations on it.

Pros:

  • Simple to implement. Use a trigger and a function to notify. Implement a listen service.

Cons:

  • Weak reliability. There is always a risk of loss of message especially when the listening service is down. Messages in the queue do not persist.

  • Using Debezium

    • Debezium is an open source tool used for capturing changes in the database tables based on the WAL (write ahead log).
    • The tool provides connectors to connect to a variety of databases.
    • The source connector is used to capture changes in the source database.
    • The sync connector is used to sync data directly in the destination database.

Pros:

  • The changes are persistent as they can be streamed to kafka. Hence highly reliable.

Cons:

  • Debezium does not take into account changes in the schema, users need to update the schema changes. Otherwise there would be data loss.

Thank you
Aniketh Deshpande

Top comments (1)

Collapse
 
rudu1 profile image
Rudram

This article is really helpful.