DEV Community

Robiul

Did Kafka Just Get Easier?

Credit: Mayank Ahuja [in/curiouslearner/]

'Apache Kafka Without ZooKeeper - Using KRaft' (Give it a read.)👇

𝐒𝐨𝐦𝐞 𝐁𝐚𝐜𝐤𝐠𝐫𝐨𝐮𝐧𝐝 -

◾ Apache Kafka is a distributed streaming platform designed for high-throughput, fault-tolerant data pipelines.
◾ It enables real-time data processing, event-driven architectures and reliable messaging.
◾ Kafka's architecture originally relied on ZooKeeper, an external coordination service.

📌 𝐖𝐡𝐚𝐭 𝐰𝐚𝐬 𝐙𝐨𝐨𝐊𝐞𝐞𝐩𝐞𝐫'𝐬 𝐫𝐨𝐥𝐞?

◾ Cluster Metadata Management ✔

  • Stored information about brokers, topics, partitions and their configurations.
  • Maintained cluster membership and facilitated broker discovery.

◾ Controller Functionality ✔

  • One broker was elected as the Controller (through ZooKeeper) and managed cluster operations (e.g., partition reassignment, partition leader election).
  • The Controller relied heavily on ZooKeeper for metadata updates and coordination.

📌 𝐋𝐞𝐭'𝐬 𝐚𝐥𝐬𝐨 𝐭𝐚𝐥𝐤 𝐚𝐛𝐨𝐮𝐭 𝐬𝐨𝐦𝐞 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐞𝐬.

◾ External Dependency

  • Required separate deployment and management of ZooKeeper.
  • Increased operational complexity and potential points of failure.

◾ Scalability Limitations

  • ZooKeeper could become a bottleneck for large-scale clusters due to metadata management overhead.

◾ Operational Overhead

  • Maintaining a ZooKeeper ensemble added administrative burdens.

𝐒𝐨 𝐟𝐢𝐧𝐚𝐥𝐥𝐲,

Enter Apache Kafka Raft (KRaft), a consensus protocol introduced with KIP-500 to eliminate Kafka's reliance on ZooKeeper.

** KIP-500 => Kafka Improvement Proposal 500

📌 𝐇𝐨𝐰 𝐝𝐨𝐞𝐬 𝐢𝐭 𝐰𝐨𝐫𝐤? (Kafka with KRaft)

◾ With KRaft, Kafka now manages its own metadata through a 'metadata quorum' of controller nodes (a node can act as a broker, a controller, or both).

◾ These controllers use a Raft-based consensus protocol to keep metadata consistent and available, removing the need for ZooKeeper.

◾ Cluster metadata is stored in a dedicated, internal Kafka topic called '__cluster_metadata'.

◾ This topic is replicated across the metadata quorum, so metadata changes stay durable and available as long as a majority of the quorum members remain available.

◾ The active Controller, responsible for cluster management tasks such as partition reassignment and partition leader election, is the elected leader of the metadata quorum.

◾ Only the leader Controller can modify the metadata. This ensures that metadata changes are serialized and prevents conflicts.

◾ Whenever metadata changes, the leader Controller appends the changes to the internal '__cluster_metadata' topic.

◾ The other quorum members follow the leader and replicate its metadata changes; regular brokers fetch the same log to stay up to date.

◾ If the current leader fails, a new leader is elected automatically.
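
To make the flow above concrete, here is a minimal toy model in Python of a single-writer metadata log with majority commit and automatic failover. It is only a sketch: the class names (ControllerNode, MetadataQuorum), the record format, and the "longest log wins" election rule are assumptions made for illustration. Real KRaft elections use epochs and votes, and none of this is Kafka's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class ControllerNode:
    """One member of the toy metadata quorum (hypothetical class)."""
    node_id: int
    alive: bool = True
    log: list = field(default_factory=list)  # local copy of the metadata log


class MetadataQuorum:
    """Toy model: one leader appends, followers replicate, majority commits."""

    def __init__(self, node_ids):
        self.nodes = {i: ControllerNode(i) for i in node_ids}
        self.leader_id = None
        self.elect_leader()

    def majority(self):
        return len(self.nodes) // 2 + 1

    def elect_leader(self):
        # Simplification: choose the live node with the longest log,
        # lowest id on ties. No leader without a live majority.
        live = [n for n in self.nodes.values() if n.alive]
        if len(live) < self.majority():
            self.leader_id = None
            return None
        best = max(live, key=lambda n: (len(n.log), -n.node_id))
        self.leader_id = best.node_id
        return self.leader_id

    def append(self, record):
        """Only the leader writes. Returns True once a majority holds the record."""
        if self.leader_id is None:
            raise RuntimeError("no active controller")
        leader = self.nodes[self.leader_id]
        leader.log.append(record)
        acks = 1  # the leader's own copy
        for node in self.nodes.values():
            if node.alive and node.node_id != leader.node_id:
                node.log.append(record)  # follower replicates the change
                acks += 1
        return acks >= self.majority()

    def fail(self, node_id):
        self.nodes[node_id].alive = False
        if node_id == self.leader_id:
            self.elect_leader()  # failover happens automatically


# Demo: three controllers, one metadata change, then the leader fails.
quorum = MetadataQuorum([1, 2, 3])
first_leader = quorum.leader_id
committed = quorum.append({"op": "create_topic", "name": "orders"})  # illustrative record
quorum.fail(first_leader)
new_leader = quorum.leader_id
print(first_leader, committed, new_leader)
```

Because the change reached a majority before the leader failed, the newly elected leader already holds it in its log, which is the property the quorum is there to provide.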

𝐖𝐡𝐢𝐜𝐡 𝐦𝐞𝐚𝐧𝐬,

✔ Simplified architecture.
✔ Improved scalability.
✔ Reduced operational overhead.
✔ Enhanced stability and performance.
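
As a rough illustration of the simpler setup, the Python sketch below renders a minimal configuration for a single node running as both broker and controller. The property keys are standard KRaft settings; the helper name kraft_properties, the ports, and the log directory are assumptions for the example, and a real deployment will need more than this (for instance, the storage directory has to be formatted before first start).

```python
def kraft_properties(node_id, voters, log_dir="/tmp/kraft-combined-logs"):
    """Render a minimal combined-mode KRaft config (hypothetical helper).

    voters maps node id -> "host:port" of each controller listener.
    """
    quorum_voters = ",".join(
        f"{nid}@{addr}" for nid, addr in sorted(voters.items())
    )
    props = {
        "process.roles": "broker,controller",   # this node plays both roles
        "node.id": str(node_id),
        "controller.quorum.voters": quorum_voters,
        "listeners": "PLAINTEXT://:9092,CONTROLLER://:9093",  # assumed ports
        "controller.listener.names": "CONTROLLER",
        "log.dirs": log_dir,                    # assumed path
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


config_text = kraft_properties(node_id=1, voters={1: "localhost:9093"})
print(config_text)
```

Note what is missing: there is no zookeeper.connect entry and no separate ensemble to deploy, since the node itself is part of the metadata quorum.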

📌 Support for ZooKeeper was deprecated in Kafka 3.5, encouraging users to migrate to KRaft.

📌 ZooKeeper support was removed entirely in Kafka 4.0, which runs in KRaft mode only.

⭐ Follow https://www.linkedin.com/in/curiouslearner/
