Background
Kafka was introduced when Wehkamp started building its microservices platform in 2014 or 2015.
We wanted a smart platform to exchange information between microservices.
That's where Kafka came into Wehkamp.
Legacy Kafka setup in Wehkamp
In our initial implementation we used a self-maintained Apache Kafka cluster on EC2 instances.
However, as Wehkamp grew, it took increasing engineering effort to provision, configure, scale, upgrade, and orchestrate the Kafka brokers and ZooKeeper nodes in production.
The cluster was also unstable at times.
On top of that, we didn't have many colleagues within Wehkamp with Kafka expertise.
To reduce infrastructure management and focus more on our growing business, we decided to migrate from our self-managed Kafka cluster to Amazon Managed Streaming for Apache Kafka (Amazon MSK).
That means we spend less time managing infrastructure and more time building applications.
It also helped that we were already heavily invested in AWS.
Amazon Managed Streaming for Apache Kafka (MSK)
AWS announced Amazon Managed Streaming for Apache Kafka (MSK) in 2018.
MSK is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data. Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications. With Amazon MSK, you can use native Apache Kafka APIs to populate data lakes, stream changes to and from databases, and power machine learning and analytics applications.
Goal: Migrate Apache Kafka Queue Cluster to Amazon MSK
To accomplish our goal, we had to do the following:
- Create Amazon MSK clusters with enough resources to handle the workload and data from the legacy clusters. (We used a Terraform MSK module to provision the clusters.)
- Migrate all the topics and their data.
- Migrate all the Kafka clients (consumers and producers).
- Back up the disks and destroy the legacy clusters.
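
Before destroying a legacy cluster, it helps to confirm that every topic made it across. The sketch below is not the tooling we used; it only illustrates the kind of check that fits between the topic migration and the teardown. The `TopicSpec` class, the `migration_gaps` function, and the topic names are all hypothetical, and the topic metadata is hard-coded here, whereas in practice it would be read from each cluster. Treating a lower partition count on the target as a problem is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicSpec:
    """Minimal topic description (hypothetical, for illustration only)."""
    partitions: int


def migration_gaps(legacy, target):
    """Return readable problems that should block the legacy teardown.

    Topics that exist only on the target cluster are ignored, since they
    do not affect the migration of legacy data.
    """
    problems = []
    for name in sorted(legacy):
        spec = legacy[name]
        if name not in target:
            problems.append(f"{name}: missing on target")
        elif target[name].partitions < spec.partitions:
            problems.append(
                f"{name}: {target[name].partitions} partitions on target, "
                f"{spec.partitions} on legacy"
            )
    return problems


# Hypothetical metadata; real values would come from the two clusters.
legacy_topics = {
    "orders": TopicSpec(partitions=12),
    "payments": TopicSpec(partitions=6),
    "stock": TopicSpec(partitions=3),
}
msk_topics = {
    "orders": TopicSpec(partitions=12),
    "payments": TopicSpec(partitions=3),
}

gaps = migration_gaps(legacy_topics, msk_topics)
for problem in gaps:
    print(problem)
```

An empty result from `migration_gaps` would mean that every legacy topic exists on the MSK side with at least as many partitions. It says nothing about whether the data itself was copied, so that still needs a separate check.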
