Credit: Mayank Ahuja [in/curiouslearner/]
'Apache Kafka Without ZooKeeper - Using KRaft' (Give it a read.)👇
𝐒𝐨𝐦𝐞 𝐁𝐚𝐜𝐤𝐠𝐫𝐨𝐮𝐧𝐝 -
◾ Apache Kafka is a distributed streaming platform designed for high-throughput, fault-tolerant data pipelines.
◾ It enables real-time data processing, event-driven architectures and reliable messaging.
◾ Kafka's architecture originally relied on ZooKeeper, an external coordination service.
📌 𝐖𝐡𝐚𝐭 𝐰𝐚𝐬 𝐙𝐨𝐨𝐊𝐞𝐞𝐩𝐞𝐫'𝐬 𝐫𝐨𝐥𝐞?
◾ Cluster Metadata Management ✔
- Stored information about brokers, topics, partitions and their configurations.
- Maintained cluster membership and facilitated broker discovery.
◾ Controller Functionality ✔
- Elected a leader broker (Controller) responsible for managing cluster operations (e.g., partition reassignment, leader election).
- The Controller relied heavily on ZooKeeper for metadata updates and coordination.
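To make the controller election concrete: in ZooKeeper mode, brokers race to create an ephemeral '/controller' znode, and whichever broker creates it first becomes the Controller. Here is a minimal in-memory sketch of that idea. FakeZooKeeper and elect_controller are hypothetical names made up for illustration, not a real ZooKeeper or Kafka API.

```python
# Toy in-memory model of ZooKeeper-based controller election.
# Assumption: FakeZooKeeper is a hypothetical stand-in, not a real client API.

class FakeZooKeeper:
    def __init__(self):
        self.ephemeral = {}  # path -> id of the session that owns the znode

    def create_ephemeral(self, path, owner):
        """Create the znode if it is absent; return True on success."""
        if path in self.ephemeral:
            return False
        self.ephemeral[path] = owner
        return True

    def expire_session(self, owner):
        """Ephemeral znodes disappear when their owner's session ends."""
        self.ephemeral = {p: o for p, o in self.ephemeral.items() if o != owner}


def elect_controller(zk, broker_ids):
    """Each live broker races to create /controller; the first one wins."""
    for broker_id in broker_ids:
        zk.create_ephemeral("/controller", broker_id)
    return zk.ephemeral["/controller"]


zk = FakeZooKeeper()
first = elect_controller(zk, [1, 2, 3])
zk.expire_session(first)               # the controller broker fails
second = elect_controller(zk, [2, 3])  # the surviving brokers race again
print(first, second)
```

Because the znode is ephemeral, it vanishes when the Controller's session ends, which is what lets the surviving brokers elect a replacement.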
📌 𝐋𝐞𝐭'𝐬 𝐚𝐥𝐬𝐨 𝐭𝐚𝐥𝐤 𝐚𝐛𝐨𝐮𝐭 𝐬𝐨𝐦𝐞 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐞𝐬.
◾ External Dependency
- Required separate deployment and management of ZooKeeper.
- Increased operational complexity and potential points of failure.
◾ Scalability Limitations
- ZooKeeper could become a bottleneck for large-scale clusters due to metadata management overhead.
◾ Operational Overhead
- Maintaining a ZooKeeper ensemble added administrative burdens.
𝐒𝐨 𝐟𝐢𝐧𝐚𝐥𝐥𝐲,
Apache Kafka Raft (KRaft) is a consensus protocol, introduced in KIP-500, that eliminates Kafka's reliance on ZooKeeper.
** KIP-500 => Kafka Improvement Proposal 500
📌 𝐇𝐨𝐰 𝐝𝐨𝐞𝐬 𝐢𝐭 𝐰𝐨𝐫𝐤? (Kafka with KRaft)
◾ With KRaft, Kafka manages its own metadata through a 'metadata quorum' of controller nodes (servers running the controller role, either dedicated or combined with a broker).
◾ These controllers use the Raft consensus protocol to keep metadata consistent and available, removing the need for ZooKeeper.
◾ Cluster metadata is stored in a dedicated, internal Kafka topic called '__cluster_metadata'.
◾ This topic is replicated across the metadata quorum, so committed metadata changes stay durable and available as long as a majority of the quorum is up.
◾ The active Controller, responsible for cluster management tasks like partition reassignment and partition leader election, is the node elected leader of the metadata quorum.
◾ Only the leader Controller can modify the metadata. This ensures that metadata changes are serialized and prevents conflicts.
◾ Whenever metadata changes, the leader Controller appends the changes to the internal '__cluster_metadata' topic.
◾ The other controllers in the quorum replicate the leader's log, and regular brokers fetch the same metadata changes to stay up to date.
◾ If the current leader fails, a new leader is elected automatically.
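The flow above (single writer, replicated log, automatic failover) can be sketched as a toy simulation. This is only an illustration under loud assumptions: MetadataQuorum and ControllerNode are hypothetical names, the "election" simply picks the live node with the longest log, and real KRaft additionally handles epochs, voting, log truncation and snapshots.

```python
# Toy model of a single-writer metadata log replicated to a quorum.
# Assumption: all class and method names are hypothetical; real KRaft also
# handles epochs, voting, log truncation and snapshots, which are omitted here.

class ControllerNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.alive = True
        self.log = []  # local replica of the metadata log


class MetadataQuorum:
    def __init__(self, node_ids):
        self.nodes = {i: ControllerNode(i) for i in node_ids}
        self.leader_id = min(node_ids)  # simplification: no real vote

    def majority(self):
        return len(self.nodes) // 2 + 1

    def append(self, writer_id, record):
        """Only the leader may append; commit needs a majority of replicas."""
        if writer_id != self.leader_id:
            raise PermissionError("only the active controller can write metadata")
        live = [n for n in self.nodes.values() if n.alive]
        if len(live) < self.majority():
            return False  # quorum lost: the change cannot be committed
        for node in live:
            node.log.append(record)
        return True

    def fail(self, node_id):
        self.nodes[node_id].alive = False
        if node_id == self.leader_id:
            self.elect_leader()

    def elect_leader(self):
        """Pick the live node with the longest log (lowest id breaks ties)."""
        live = [n for n in self.nodes.values() if n.alive]
        if not live:
            self.leader_id = None
            return
        best = max(live, key=lambda n: (len(n.log), -n.node_id))
        self.leader_id = best.node_id


quorum = MetadataQuorum([1, 2, 3])
quorum.append(1, "create topic orders")
quorum.fail(1)  # the active controller fails, a new leader takes over
committed = quorum.append(quorum.leader_id, "add partition orders-1")
print(quorum.leader_id, committed)
```

With three quorum members, one failure still leaves a majority of two, so writes keep committing. A second failure would drop the quorum below majority and append would return False.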
𝐖𝐡𝐢𝐜𝐡 𝐦𝐞𝐚𝐧𝐬,
✔ Simplified architecture.
✔ Improved scalability.
✔ Reduced operational overhead.
✔ Enhanced stability and performance.
📌 Support for ZooKeeper was deprecated in Kafka 3.5, encouraging users to migrate to KRaft.
📌 ZooKeeper support was removed entirely in Kafka 4.0, which runs in KRaft mode only.