DEV Community

Vladyslav Len


DataBrew - a new way of integrating CDC into your project

Back when I was working at one of my previous companies, I needed to set up data replication. We were a fast-growing startup at that point, and like most startups we made quite a few mistakes during the active growth phase. We had adopted a microservice architecture but didn't have enough time to design the services well. As a result, our microservice architecture looked more like a monolith: 95% of communication happened through direct HTTP calls. That was exactly what let us down.

During peak load we had more internal calls than external ones. I know what you're thinking: "They must be dumb". I'd say not entirely :)
We shipped most features on short deadlines, sacrificing stability in the hope of fixing it later.

Most of the problems were caused by services that held important data other services relied on. On each client call, we had to make two or more underlying calls to return this data. (Caching was not an option, since the requirements did not allow stale data.)

These services quickly became SPOFs (Single Points of Failure), and we had to do something. We decided to adopt CDC (a.k.a. data replication). We spent countless hours trying to implement it and finding the proper services, tooling, etc. But in the end it helped. We managed to build a really solid architecture that could withstand peak hours with no problems.
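To make the idea concrete, here is a minimal sketch of what CDC changes for a consuming service: instead of calling the owning service over HTTP on every request, it applies a stream of change events to a local copy and reads from that. The names `ChangeEvent` and `LocalReplica` are hypothetical and exist only for illustration. This is not the setup we actually ran, and a real pipeline would also have to deal with ordering, retries, and persistence.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change captured from the source database (hypothetical shape)."""
    op: str                                # "insert", "update" or "delete"
    key: int                               # primary key of the changed row
    row: Optional[Dict[str, Any]] = None   # new row contents; None for deletes


class LocalReplica:
    """In-memory copy of a remote table, kept current by applying change events."""

    def __init__(self) -> None:
        self._rows: Dict[int, Dict[str, Any]] = {}

    def apply(self, event: ChangeEvent) -> None:
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key} has no row")
            # Store a copy so later mutation of the event payload cannot leak in.
            self._rows[event.key] = dict(event.row)
        elif event.op == "delete":
            self._rows.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")

    def get(self, key: int) -> Optional[Dict[str, Any]]:
        # A local lookup replaces the synchronous HTTP call to the owning service.
        return self._rows.get(key)


if __name__ == "__main__":
    replica = LocalReplica()
    replica.apply(ChangeEvent("insert", 1, {"name": "Ada", "plan": "free"}))
    replica.apply(ChangeEvent("update", 1, {"name": "Ada", "plan": "pro"}))
    print(replica.get(1))
```

The trade-off is that reads become local and no longer depend on the owning service being up, while the copy lags the source only by the time it takes a change event to arrive.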

Now that the background is set, let's talk about the journey we made to solve our problems with CDC.

I can say for sure: CDC is not a magic pill. It may not solve all your problems, but it may buy you precious time to grow, keep your customers engaged, and raise more money to rewrite your architecture down the road.

You see, the problem is that most replication/ETL services focus on just a couple of use cases:

  • Database to warehouse to perform analytics
  • Database to Database full replication with no transformations

In particular, when you start googling CDC implementations, you will find a ton of links to projects like Confluent.io and Debezium.
Don't get me wrong, these are the projects that push the CDC industry forward, but they are extremely complex when you see them for the first time. And as the CTO or tech lead of a startup, you usually can't afford to invest that much time in these tools without knowing whether they will work out.

Meet DataBrew

DataBrew is a SaaS project that provides an easy way to integrate CDC (Change Data Capture) into your architecture. Basically, you create a data mesh in which you define the data your services expose, and any service can consume it.

DataBrew Dashboard

You can see DataBrew's service dashboard with data streams going to the service

We tried to gather all the knowledge we gained during our CDC experiments and maintenance and turn it into a product that will help developers.

DataBrew was created with a few things in mind:

  • We want to give developers more time to work on business logic and write code, not spend countless hours debugging Kafka.
  • We want to give developers the most important thing: a representation of the actual data flows, so they can see all the data flowing into a service and out of it.
  • We want to have strict data contracts. Even if your service has 45 tables, you can still declare that it exports only 2 of them. This prevents people from blindly creating DataFlows without thinking about system stability.
  • We want to make it robust. Adopting CDC may seem like a risky decision, but with proper alerting and monitoring there is nothing to worry about.
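The data contract idea from the list above can be sketched in a few lines. This is only an illustration of the concept, not DataBrew's actual API: `DataContract`, `filter_exported`, and the event shape (a dict with a `"table"` key) are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, Iterable, Iterator


@dataclass(frozen=True)
class DataContract:
    """Declares which tables a service is willing to expose (hypothetical model)."""
    service: str
    exported_tables: FrozenSet[str]

    def allows(self, table: str) -> bool:
        return table in self.exported_tables


def filter_exported(
    contract: DataContract, events: Iterable[Dict[str, Any]]
) -> Iterator[Dict[str, Any]]:
    """Yield only the change events whose table is part of the contract."""
    for event in events:
        if contract.allows(event["table"]):
            yield event


if __name__ == "__main__":
    # The service may own dozens of tables but exports only two of them.
    contract = DataContract("users-service", frozenset({"users", "profiles"}))
    captured = [
        {"table": "users", "op": "insert", "key": 1},
        {"table": "password_resets", "op": "insert", "key": 7},
        {"table": "profiles", "op": "update", "key": 1},
    ]
    for event in filter_exported(contract, captured):
        print(event["table"], event["op"])
```

Keeping the export list explicit means a consumer cannot start depending on an internal table by accident; the owning team has to add it to the contract first.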

Currently, we are running in closed beta, but we are going to open DataBrew for public access this September.

Feel free to apply for early access - we will reach out to you as soon as possible to discuss all the details.

Please keep in mind that during the closed beta we only support PostgreSQL databases.

Thanks for reading the article! We hope we have sparked your interest in giving DataBrew a shot.

