
Vishalendu Pandey


Apache Kafka KRaft Protocol

This document started mostly as a self-reference: I wanted to try out the shiny new Kafka KRaft protocol, where you don't need to deploy ZooKeeper. The Kafka containers themselves are self-sufficient, which simplifies deployment and improves performance as well.

Before going through this article, I would highly recommend watching the following video to get familiar with what KRaft is all about; it's a brief and to-the-point explanation of the topic:


The Setup using docker-compose

Pre-requisites:

  • A machine with sufficient resources to run 3 Docker containers.
  • Docker and docker-compose installed.
  • Kafka binaries on a separate machine, for running performance tests against the cluster.

The docker-compose.yaml and some basic configuration needed for the setup are provided in my GitHub repo:
https://github.com/vishalendu/kafka-kraft-cluster-setup

Please go through the README file to set up the Kafka KRaft cluster.
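For orientation, the core of one combined-mode (broker + controller) KRaft node in docker-compose looks roughly like the sketch below. This is illustrative only, based on the bitnami/kafka image; the service names, version tag, and cluster ID are placeholders, and the repo's docker-compose.yaml is the source of truth.

```yaml
# Illustrative single node of a 3-node KRaft cluster. Each node acts as both
# broker and controller; no ZooKeeper service is needed.
services:
  kafka1:
    image: bitnami/kafka:3.6        # placeholder tag
    environment:
      - KAFKA_CFG_NODE_ID=1
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      # All three controllers must be listed identically on every node
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      # Placeholder: the same cluster ID must be shared by all nodes
      - KAFKA_KRAFT_CLUSTER_ID=REPLACE_WITH_CLUSTER_ID
```

The key difference from a ZooKeeper deployment is `process.roles` plus `controller.quorum.voters`: the metadata quorum is formed by the Kafka nodes themselves.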

Disclosure:

I used a mini-PC with a Ryzen 7 5800H (8 cores/16 threads), 32 GB of RAM, and a 2 TB M.2 NVMe SSD (sequential read/write up to 3,500/2,900 MB/s) as my test bench. The Docker containers share the same SSD for storage, which can be I/O limiting. The maximum network transfer from this machine reached 110 MB/s, so the tests could also be network limited.


Some performance tests on producer/consumer

To check the performance of producers and consumers, I have provided some basic commands in the repository:

https://github.com/vishalendu/kafka-kraft-cluster-setup/blob/main/benchmark.md
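The commands are variations on the standard `kafka-producer-perf-test.sh` and `kafka-consumer-perf-test.sh` tools that ship with the Kafka binaries. A rough sketch (broker names `kafka1`..`kafka3` and the topic name are assumptions; adjust to your cluster):

```shell
# Illustrative perf-test invocations; the guard lets the snippet run even
# where the Kafka binaries are not installed.
BOOTSTRAP="kafka1:9092,kafka2:9092,kafka3:9092"

if command -v kafka-producer-perf-test.sh >/dev/null 2>&1; then
  # Produce 1M records of 1 KiB each, unthrottled, with zstd compression
  kafka-producer-perf-test.sh \
    --topic perf-test \
    --num-records 1000000 \
    --record-size 1024 \
    --throughput -1 \
    --producer-props bootstrap.servers="$BOOTSTRAP" compression.type=zstd

  # Consume the same records back and report throughput
  kafka-consumer-perf-test.sh \
    --topic perf-test \
    --messages 1000000 \
    --bootstrap-server "$BOOTSTRAP"
else
  echo "Kafka binaries not on PATH; commands shown for reference only"
fi
```

Both tools print throughput (records/sec and MB/sec) and latency percentiles, which is what the numbers in benchmark.md are based on.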


Just for fun -- compared producer performance with different compression algorithms

Kafka Producer Compression algorithm comparison
You can find the full comparison in compression-comparison.xlsx in the repo.
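The ratio-versus-CPU trade-off behind that comparison is easy to reproduce in miniature. Here is a stand-alone sketch using only Python's standard library (gzip, bz2, lzma); zstd itself needs the external `zstandard` package, so it is left out, but the measurement idea is the same:

```python
# Rough comparison of compression ratio vs. CPU time on a repetitive,
# log-like payload (the kind of data Kafka producers often batch).
import bz2
import gzip
import lzma
import time

# Synthetic payload resembling repeated JSON records
payload = (b'{"ts":1700000000,"level":"INFO","msg":"order processed","id":42}\n'
           * 20000)

results = {}
for name, compress in [("gzip", gzip.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)]:
    start = time.perf_counter()
    compressed = compress(payload)
    elapsed = time.perf_counter() - start
    results[name] = (len(payload) / len(compressed), elapsed)

for name, (ratio, elapsed) in results.items():
    print(f"{name:5s} ratio={ratio:6.1f}x time={elapsed * 1000:7.1f} ms")
```

In a Kafka producer the same knob is a single property, e.g. `compression.type=zstd`, so switching algorithms for a re-run of the benchmark is cheap.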


Summary:

The KRaft protocol will be the only option from Kafka 4.0 onwards, so it's definitely worth getting some hands-on practice to see what has changed and how Kafka's performance characteristics have evolved.

KRaft has improved Kafka's performance to the point where even a small cluster can support millions of partitions, whereas the older ZooKeeper-based implementation had major limitations at higher partition counts.

It was also nice to run some producer compression tests, which confirmed that zstd gives the most bang for the buck in terms of CPU utilization versus compression ratio.


Things to do:

  • Look at migrating to Kafka 3.6 from older versions.
  • Read the documentation on consumer/producer properties to see whether any have changed between versions or for the KRaft protocol.
  • Run some high-availability tests to see how brokers and controllers handle failures. I expect better recovery performance with the KRaft protocol.
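A simple failover drill for that last item could look like the sketch below (the container name `kafka2` and topic `perf-test` are assumptions; match them to your docker-compose service names):

```shell
# Hypothetical HA drill: kill one node, watch partition leadership move,
# then bring the node back. Guarded so it is safe to paste anywhere.
BOOTSTRAP="kafka1:9092,kafka3:9092"   # deliberately excludes the node we kill

if command -v docker >/dev/null 2>&1 && command -v kafka-topics.sh >/dev/null 2>&1; then
  # Note the current partition leaders
  kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --describe --topic perf-test

  # Stop one broker/controller and give the quorum time to re-elect
  docker stop kafka2
  sleep 10
  kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --describe --topic perf-test

  # Restart it and confirm it rejoins the ISR
  docker start kafka2
else
  echo "docker or Kafka binaries not available; commands shown for reference"
fi
```

Comparing how long leadership takes to move here versus on a ZooKeeper cluster would be the interesting measurement.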
