
Vishalendu Pandey


Apache Kafka KRaft Protocol

This document started mostly as a self-reference: I wanted to try out the shiny new Kafka KRaft protocol, where you don't need to deploy ZooKeeper. The Kafka containers themselves are self-sufficient, which simplifies deployment and improves performance as well.

Before going through this article, I would highly recommend watching the following video to get familiar with what KRaft is all about; it's a brief and to-the-point explanation of the topic:


The Setup using docker-compose

Pre-requisites:

  • A machine with sufficient resources to run 3 Docker containers.
  • Docker and docker-compose installed.
  • Kafka binaries on a separate machine, for running performance tests against the cluster.

The docker-compose.yaml and some basic configuration needed for the setup are provided in my GitHub repo:
https://github.com/vishalendu/kafka-kraft-cluster-setup

Please go through the README file to set up the Kafka KRaft cluster.
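For orientation, the core of one combined-mode (broker + controller) KRaft node in docker-compose looks roughly like the sketch below. This is illustrative only, based on the bitnami/kafka image; the service names, version tag, and cluster ID are placeholders, and the repo's docker-compose.yaml is the source of truth.

```yaml
# Illustrative single node of a 3-node KRaft cluster. Each node acts as both
# broker and controller; no ZooKeeper service is needed.
services:
  kafka1:
    image: bitnami/kafka:3.6        # placeholder tag
    environment:
      - KAFKA_CFG_NODE_ID=1
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      # All three controllers must be listed identically on every node
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      # Placeholder: the same cluster ID must be shared by all nodes
      - KAFKA_KRAFT_CLUSTER_ID=REPLACE_WITH_CLUSTER_ID
```

The key difference from a ZooKeeper deployment is `process.roles` plus `controller.quorum.voters`: the metadata quorum is formed by the Kafka nodes themselves.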

Disclosure:

I used a mini-PC with a Ryzen 7 5800H (8 cores/16 threads), 32 GB of RAM, and a 2 TB M.2 NVMe SSD (sequential read/write up to 3,500/2,900 MB/s) as my test bench. The Docker containers share the same SSD for storage, which can be I/O limiting. The maximum network transfer from this machine reached 110 MB/s, so the tests could also be network limited.


Some performance tests on producer/consumer

To check the performance of producers and consumers, I have provided some basic commands in the repository:

https://github.com/vishalendu/kafka-kraft-cluster-setup/blob/main/benchmark.md
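The commands are variations on the standard `kafka-producer-perf-test.sh` and `kafka-consumer-perf-test.sh` tools that ship with the Kafka binaries. A rough sketch (broker names `kafka1`..`kafka3` and the topic name are assumptions; adjust to your cluster):

```shell
# Illustrative perf-test invocations; the guard lets the snippet run even
# where the Kafka binaries are not installed.
BOOTSTRAP="kafka1:9092,kafka2:9092,kafka3:9092"

if command -v kafka-producer-perf-test.sh >/dev/null 2>&1; then
  # Produce 1M records of 1 KiB each, unthrottled, with zstd compression
  kafka-producer-perf-test.sh \
    --topic perf-test \
    --num-records 1000000 \
    --record-size 1024 \
    --throughput -1 \
    --producer-props bootstrap.servers="$BOOTSTRAP" compression.type=zstd

  # Consume the same records back and report throughput
  kafka-consumer-perf-test.sh \
    --topic perf-test \
    --messages 1000000 \
    --bootstrap-server "$BOOTSTRAP"
else
  echo "Kafka binaries not on PATH; commands shown for reference only"
fi
```

Both tools print throughput (records/sec and MB/sec) and latency percentiles, which is what the numbers in benchmark.md are based on.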


Just for fun -- compared producer performance with different compression algorithms

Kafka Producer Compression algorithm comparison
You can find the full comparison in compression-comparison.xlsx in the repo.
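The ratio-versus-CPU trade-off behind that comparison is easy to reproduce in miniature. Here is a stand-alone sketch using only Python's standard library (gzip, bz2, lzma); zstd itself needs the external `zstandard` package, so it is left out, but the measurement idea is the same:

```python
# Rough comparison of compression ratio vs. CPU time on a repetitive,
# log-like payload (the kind of data Kafka producers often batch).
import bz2
import gzip
import lzma
import time

# Synthetic payload resembling repeated JSON records
payload = (b'{"ts":1700000000,"level":"INFO","msg":"order processed","id":42}\n'
           * 20000)

results = {}
for name, compress in [("gzip", gzip.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)]:
    start = time.perf_counter()
    compressed = compress(payload)
    elapsed = time.perf_counter() - start
    results[name] = (len(payload) / len(compressed), elapsed)

for name, (ratio, elapsed) in results.items():
    print(f"{name:5s} ratio={ratio:6.1f}x time={elapsed * 1000:7.1f} ms")
```

In a Kafka producer the same knob is a single property, e.g. `compression.type=zstd`, so switching algorithms for a re-run of the benchmark is cheap.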


Summary:

The KRaft protocol will be the only option from Kafka 4.0 onwards, so it's definitely worth getting some hands-on practice to see what has changed and how Kafka's performance characteristics have evolved.

KRaft has improved Kafka's performance to the point where even a small cluster can support millions of partitions, whereas the older ZooKeeper-based implementation had major limitations at higher partition counts.

It was also nice to run some producer compression tests, which confirmed that zstd gives the most bang for the buck in terms of CPU utilization versus compression ratio.


Things to do:

  • Look at migrating to Kafka 3.6 from older versions.
  • Read the documentation on consumer/producer properties to see whether any have changed between versions or for the KRaft protocol.
  • Run some high-availability tests to see how brokers and controllers handle failures. I expect better recovery performance with the KRaft protocol.
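A simple failover drill for that last item could look like the sketch below (the container name `kafka2` and topic `perf-test` are assumptions; match them to your docker-compose service names):

```shell
# Hypothetical HA drill: kill one node, watch partition leadership move,
# then bring the node back. Guarded so it is safe to paste anywhere.
BOOTSTRAP="kafka1:9092,kafka3:9092"   # deliberately excludes the node we kill

if command -v docker >/dev/null 2>&1 && command -v kafka-topics.sh >/dev/null 2>&1; then
  # Note the current partition leaders
  kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --describe --topic perf-test

  # Stop one broker/controller and give the quorum time to re-elect
  docker stop kafka2
  sleep 10
  kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --describe --topic perf-test

  # Restart it and confirm it rejoins the ISR
  docker start kafka2
else
  echo "docker or Kafka binaries not available; commands shown for reference"
fi
```

Comparing how long leadership takes to move here versus on a ZooKeeper cluster would be the interesting measurement.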
