Adil Ansari

Posted on Jun 9

Setup Multi Node Kafka Cluster (KRaft)

In this guide I will walk you through running a 3‑node Apache Kafka cluster on AWS EC2 using KRaft (Kafka’s built-in metadata quorum). By the end of this tutorial you’ll have:

3 EC2 instances: kafka-01, kafka-02, kafka-03
Each node acts as broker + controller
Ports:
- 9092 for broker/client traffic
- 9093 for controller quorum traffic

This setup uses PLAINTEXT for simplicity. For production, add TLS + SASL, and proper monitoring.

Prerequisites

1) 3 EC2 instances in the same VPC (different AZs for HA)

Recommended: t3a.small or bigger to start (Kafka loves RAM and disk IOPS)
Disk: at least 20–50 GB per node (more if you plan to retain data longer)

2) Each node must be able to reach the other nodes on 9092 and 9093
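To check the second prerequisite before installing anything, a rough sketch like the one below probes both ports on every node. It assumes bash and the coreutils `timeout` command are available and that the node names already resolve (substitute private IPs otherwise); the function names are just illustrative. Nothing listens on these ports until Kafka is running, so at this stage "closed" (connection refused) usually means the security group lets traffic through, while "filtered" (timed out) hints that it is blocked.

```shell
# Print every host:port pair the cluster needs (3 nodes x 2 ports).
cluster_targets() {
  for host in kafka-01 kafka-02 kafka-03; do
    for port in 9092 9093; do
      printf '%s:%s\n' "$host" "$port"
    done
  done
}

# Probe one TCP port through bash's /dev/tcp, giving up after 3 seconds.
# Prints "open", "filtered" (timed out), or "closed" (refused or unresolved).
probe_port() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
  case $? in
    0)   echo open ;;
    124) echo filtered ;;
    *)   echo closed ;;
  esac
}

# Example (run on each node):
#   cluster_targets | while IFS=: read -r host port; do
#     printf '%s:%s %s\n' "$host" "$port" "$(probe_port "$host" "$port")"
#   done
```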

Now let's set up our very own Kafka cluster from scratch.

Step 1: Set hostnames (run on all 3 nodes)

On each node, set the hostname to match:

On kafka-01:

sudo hostnamectl set-hostname kafka-01

On kafka-02:

sudo hostnamectl set-hostname kafka-02

On kafka-03:

sudo hostnamectl set-hostname kafka-03

Then add the same /etc/hosts block on all nodes:

<KAFKA_01_PRIVATE_IP> kafka-01
<KAFKA_02_PRIVATE_IP> kafka-02
<KAFKA_03_PRIVATE_IP> kafka-03

Quick sanity check (from each node):

getent hosts kafka-01 kafka-02 kafka-03
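If you would rather script the /etc/hosts step than paste it by hand, here is a minimal sketch. The helper names and the example IPs are hypothetical; the only idea is to append the block once and skip it when kafka-01 is already listed.

```shell
# Print the three /etc/hosts lines for the given private IPs.
hosts_block() {
  printf '%s kafka-01\n%s kafka-02\n%s kafka-03\n' "$1" "$2" "$3"
}

# Append the block to a hosts file only if kafka-01 is not already present.
# Arguments: hosts file, then the private IPs of kafka-01, kafka-02, kafka-03.
add_hosts_block() {
  hosts_file="$1"; shift
  if grep -qw 'kafka-01' "$hosts_file"; then
    echo "entries already present in $hosts_file"
  else
    hosts_block "$@" >> "$hosts_file"
  fi
}

# Example (as root; the IPs are placeholders):
#   add_hosts_block /etc/hosts 10.0.1.10 10.0.2.10 10.0.3.10
```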

Step 2: Install Java (run on all 3 nodes)

Kafka runs on the JVM, so Java is non‑negotiable.

On Ubuntu:

sudo apt-get update
sudo apt-get install -y openjdk-17-jre-headless
java -version

I used Ubuntu for this tutorial, but other Linux distributions work too; most steps are similar.

On Amazon Linux 2023:

sudo dnf install -y java-17-amazon-corretto-headless
java -version
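Kafka 4.x brokers need Java 17 or newer, so it can be worth checking the version in a script rather than by eye. The sketch below parses the first line that `java -version` prints; `java_major` is a made-up helper name, and the parsing assumes the usual `version "17.0.x"` layout.

```shell
# Read `java -version 2>&1` output on stdin and print the major version.
# Handles both the modern scheme ("17.0.11") and the legacy one ("1.8.0_392").
java_major() {
  awk -F'"' '/version/ { split($2, v, "."); print (v[1] == "1" ? v[2] : v[1]); exit }'
}

# Example:
#   [ "$(java -version 2>&1 | java_major)" -ge 17 ] && echo "Java is new enough"
```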

Step 3: Download and install Kafka (run on all 3 nodes)

We’ll install Kafka under /opt and run it as a dedicated kafka user.

sudo useradd --system --create-home --home-dir /opt/kafka --shell /usr/sbin/nologin kafka || true
sudo mkdir -p /opt/kafka
sudo chown -R kafka:kafka /opt/kafka

If curl isn’t installed, install it first:

# Ubuntu
sudo apt-get install -y curl

# Amazon Linux 2023
sudo dnf install -y curl

Download the Kafka 4.3.0 binary (run in /opt):

cd /opt
sudo curl -fL -o kafka_2.13-4.3.0.tgz "https://www.apache.org/dyn/closer.lua/kafka/4.3.0/kafka_2.13-4.3.0.tgz?action=download"
sudo tar -xzf kafka_2.13-4.3.0.tgz
sudo chown -R kafka:kafka /opt/kafka_2.13-4.3.0
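Since the tarball comes from a mirror, you may want to verify it against the SHA-512 checksum Apache publishes next to each release. The sketch below rests on two assumptions that are not part of the steps above: that the checksum file lives at the downloads.apache.org URL shown in the comment, and that it uses the `file: AB12 CD34 ...` layout, which `sha512sum -c` cannot read directly. The helper names are hypothetical.

```shell
# Turn a "file: AB12 CD34 ..." style checksum (possibly wrapped over several
# lines; this layout is an assumption) into one lowercase hex string.
normalize_sha512() {
  sed 's/^[^:]*://' | tr -d ' \t\r\n' | tr 'A-F' 'a-f'
}

# Succeed only if the tarball's SHA-512 matches the published checksum file.
# Arguments: tarball path, checksum file path.
verify_tarball() {
  expected="$(normalize_sha512 < "$2")"
  actual="$(sha512sum "$1" | cut -d' ' -f1)"
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example (the URL is an assumption):
#   curl -fL -o kafka_2.13-4.3.0.tgz.sha512 \
#     "https://downloads.apache.org/kafka/4.3.0/kafka_2.13-4.3.0.tgz.sha512"
#   verify_tarball kafka_2.13-4.3.0.tgz kafka_2.13-4.3.0.tgz.sha512 && echo OK
```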

Create a log directory (this is where Kafka stores data):

sudo mkdir -p /opt/kafka/logDir
sudo chown -R kafka:kafka /opt/kafka/logDir

Step 4: Create the Kafka config (per node)

We’ll use a custom config file: /opt/kafka_2.13-4.3.0/config/custom-server.properties.

Common config (same on all nodes)

Create the file and paste this base (we’ll change the node-specific lines next):

# ==================== Node Roles & Identity ====================
process.roles=broker,controller
node.id=1

# ==================== Network & Listeners ====================
listeners=PLAINTEXT://kafka-01:9092,CONTROLLER://kafka-01:9093
advertised.listeners=PLAINTEXT://kafka-01:9092

listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
inter.broker.listener.name=PLAINTEXT
controller.listener.names=CONTROLLER

# ==================== KRaft Quorum Configuration ====================
controller.quorum.voters=1@kafka-01:9093,2@kafka-02:9093,3@kafka-03:9093

# ==================== Storage Layout ====================
log.dirs=/opt/kafka/logDir

# ==================== Topic Defaults & System Replication ====================
num.partitions=6
offsets.topic.replication.factor=2
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1

# ==================== Log Retention Policies ====================
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

Now edit the node-specific values on each node:

On kafka-01
- node.id=1
- listeners=...kafka-01...
- advertised.listeners=...kafka-01...
On kafka-02
- node.id=2
- listeners=...kafka-02...
- advertised.listeners=...kafka-02...
On kafka-03
- node.id=3
- listeners=...kafka-03...
- advertised.listeners=...kafka-03...

Use your editor of choice:

sudo vi /opt/kafka_2.13-4.3.0/config/custom-server.properties
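Only three lines differ between nodes, so instead of editing by hand you could generate each node's file from the base config. This is a small sketch under the assumption that the base config is saved in a separate file (called base.properties here, a hypothetical name) and that node IDs stay in the range 1 to 9; `render_node_config` is an illustrative helper, not a Kafka tool.

```shell
# Read the base config on stdin and print it with the node-specific lines
# rewritten for the given node ID (1-9). All other lines pass through unchanged.
render_node_config() {
  id="$1"
  host="kafka-0${id}"
  sed \
    -e "s|^node\.id=.*|node.id=${id}|" \
    -e "s|^listeners=.*|listeners=PLAINTEXT://${host}:9092,CONTROLLER://${host}:9093|" \
    -e "s|^advertised\.listeners=.*|advertised.listeners=PLAINTEXT://${host}:9092|"
}

# Example (on kafka-02; base.properties is a hypothetical copy of the base config):
#   render_node_config 2 < base.properties \
#     | sudo tee /opt/kafka_2.13-4.3.0/config/custom-server.properties >/dev/null
```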

A note on `advertised.listeners`

Whatever you set here is what clients will use to connect. If you plan to connect from your laptop:

advertised.listeners must be reachable from your laptop (security group + routing + correct hostname/IP)

If you only connect from inside the VPC:

keep it on private hostnames/IPs (recommended)

Step 5: Generate a cluster ID (run once, then reuse on all nodes)

On any one node (e.g. kafka-01):

cd /opt/kafka_2.13-4.3.0
sudo -u kafka bin/kafka-storage.sh random-uuid

Example output:

CPRRAdoxRDaL5L1S2nwuJw

Save it somewhere safe. I like keeping it in a file so it’s not lost:

echo "<PASTE_YOUR_CLUSTER_ID_HERE>" | sudo tee /opt/kafka/cluster.id >/dev/null

Now copy that value to all three nodes (same exact cluster ID).
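A copy-paste slip here (a stray space, or the placeholder left in the file) only shows up later as a confusing format or startup error. As a quick guard, the sketch below checks that a value looks like a Kafka cluster ID, which is a UUID printed as 22 URL-safe base64 characters, as in the example output above. `is_valid_cluster_id` is a hypothetical helper name.

```shell
# Succeed if the argument is exactly 22 URL-safe base64 characters.
is_valid_cluster_id() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9_-]{22}$'
}

# Example (run on each node after copying the file):
#   is_valid_cluster_id "$(cat /opt/kafka/cluster.id)" || echo "cluster.id looks wrong"
```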

Step 6: Format the log directory (run on all 3 nodes)

This step initializes the KRaft metadata in your configured storage.

On each node:

cd /opt/kafka_2.13-4.3.0
KAFKA_CLUSTER_ID="$(cat /opt/kafka/cluster.id)"
sudo -u kafka bin/kafka-storage.sh format \
  --cluster-id "$KAFKA_CLUSTER_ID" \
  --config config/custom-server.properties

If you re-run the command later and it complains the directory is already formatted, that’s normal (and a good sign).
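If you script this step, you can avoid the complaint entirely: a formatted log directory contains a meta.properties file, so a wrapper can check for it first. The sketch below does that; the function names are hypothetical. Alternatively, `kafka-storage.sh format` accepts `--ignore-formatted`, which skips directories that are already formatted.

```shell
# A formatted Kafka log directory contains a meta.properties file.
is_formatted() {
  [ -f "$1/meta.properties" ]
}

# Format only when needed.
# Arguments: Kafka home, log directory, cluster ID, config file path.
format_if_needed() {
  if is_formatted "$2"; then
    echo "already formatted: $2"
  else
    "$1/bin/kafka-storage.sh" format --cluster-id "$3" --config "$4"
  fi
}

# Example (run as the kafka user):
#   format_if_needed /opt/kafka_2.13-4.3.0 /opt/kafka/logDir \
#     "$(cat /opt/kafka/cluster.id)" /opt/kafka_2.13-4.3.0/config/custom-server.properties
```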

Step 7: Create `systemd` service on all 3 nodes

Create /etc/systemd/system/kafka.service:

[Unit]
Description=Apache Kafka (KRaft)
After=network.target

[Service]
Type=simple
User=kafka
Group=kafka
WorkingDirectory=/opt/kafka_2.13-4.3.0

# Quote each assignment: systemd splits unquoted Environment= values on whitespace
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
Environment="KAFKA_JVM_PERFORMANCE_OPTS=-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent"

ExecStart=/opt/kafka_2.13-4.3.0/bin/kafka-server-start.sh /opt/kafka_2.13-4.3.0/config/custom-server.properties
Restart=on-failure
RestartSec=3
LimitNOFILE=100000

[Install]
WantedBy=multi-user.target

Start and enable it:

sudo systemctl daemon-reload
sudo systemctl enable --now kafka.service
sudo systemctl status kafka.service
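The status output right after `enable --now` can be misleading, because the JVM takes a few seconds to come up (or to crash). When scripting the rollout, a small retry loop is handy. The sketch below is a generic helper with a hypothetical name; it simply re-runs whatever command you give it.

```shell
# Run a command up to <attempts> times, sleeping <delay> seconds between tries.
# Usage: retry_until <attempts> <delay> <command> [args...]
retry_until() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example: wait up to about a minute for the unit to become active.
#   retry_until 30 2 systemctl is-active --quiet kafka.service && echo "kafka is up"
```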

Step 8: Verify the KRaft quorum is healthy

From any node:

cd /opt/kafka_2.13-4.3.0
sudo -u kafka bin/kafka-metadata-quorum.sh \
  --bootstrap-controller kafka-01:9093 \
  describe --status

You should see all voters listed and one node acting as leader.
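For scripted checks, you can pull single fields out of that status output. The sketch below assumes the output is a list of `Name:   value` lines with fields such as LeaderId and MaxFollowerLag; the exact layout may differ between Kafka versions, so treat the field names as assumptions. `quorum_field` is a hypothetical helper.

```shell
# Read `describe --status` output on stdin and print the value of one field.
# Prints nothing if the field is absent.
quorum_field() {
  awk -v key="$1" 'index($0, key ":") == 1 { sub(/^[^:]*:[ \t]*/, ""); print; exit }'
}

# Example (from /opt/kafka_2.13-4.3.0, as the kafka user):
#   status="$(bin/kafka-metadata-quorum.sh --bootstrap-controller kafka-01:9093 describe --status)"
#   echo "leader node: $(printf '%s\n' "$status" | quorum_field LeaderId)"
```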

If this fails, 90% of the time it’s one of these:

9093 blocked between nodes (security group issue)
wrong hostnames (bad /etc/hosts)
node.id mismatch vs controller.quorum.voters
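The last of these is easy to check mechanically. Here is a minimal sketch that reads node.id and controller.quorum.voters from a config file and confirms the ID appears in the voter list; the helper names are hypothetical and the parsing assumes simple `key=value` lines like the config above.

```shell
# Print the value of one key from a .properties file (first match wins).
# The key is used as a regex, so escape dots: 'node\.id'.
prop() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}

# Succeed if the file's node.id is one of the IDs in controller.quorum.voters.
node_in_voters() {
  id="$(prop "$1" 'node\.id')"
  voters="$(prop "$1" 'controller\.quorum\.voters')"
  printf '%s\n' "$voters" | tr ',' '\n' | grep -q "^${id}@"
}

# Example:
#   node_in_voters /opt/kafka_2.13-4.3.0/config/custom-server.properties \
#     || echo "node.id is not listed in controller.quorum.voters"
```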

Step 9: Create and test your first topic

Create a topic (replication factor 2 for 3 brokers is OK for a lab):

cd /opt/kafka_2.13-4.3.0
sudo -u kafka ./bin/kafka-topics.sh --create \
  --topic first-topic \
  --bootstrap-server kafka-01:9092 \
  --replication-factor 2 \
  --partitions 3

Describe the topic:

sudo -u kafka ./bin/kafka-topics.sh --describe \
  --bootstrap-server kafka-01:9092 \
  --topic first-topic

Start a producer:

sudo -u kafka ./bin/kafka-console-producer.sh \
  --topic first-topic \
  --bootstrap-server kafka-01:9092

In another terminal, start a consumer:

sudo -u kafka ./bin/kafka-console-consumer.sh \
  --topic first-topic \
  --from-beginning \
  --bootstrap-server kafka-01:9092

Type a few messages in the producer and confirm they show up in the consumer.
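The same round trip can be done without two terminals, which is useful in a deployment script. The sketch below produces one uniquely tagged message and then looks for it with the console consumer, using `--timeout-ms` so the consumer exits on its own. `smoke_test` is a hypothetical helper; run it as a user allowed to execute the Kafka scripts.

```shell
# Produce one unique message, then confirm a consumer can read it back.
# Arguments: Kafka home, bootstrap server, topic.
smoke_test() {
  kafka_home="$1"; bootstrap="$2"; topic="$3"
  msg="smoke-$(date +%s)-$$"

  # Send a single line through the console producer.
  printf '%s\n' "$msg" \
    | "$kafka_home/bin/kafka-console-producer.sh" \
        --topic "$topic" --bootstrap-server "$bootstrap" || return 1

  # Read from the beginning and stop after 10 seconds without new messages;
  # the function succeeds only if our exact message shows up.
  "$kafka_home/bin/kafka-console-consumer.sh" \
      --topic "$topic" --from-beginning \
      --bootstrap-server "$bootstrap" --timeout-ms 10000 2>/dev/null \
    | grep -Fxq "$msg"
}

# Example:
#   smoke_test /opt/kafka_2.13-4.3.0 kafka-01:9092 first-topic && echo "round trip OK"
```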

Troubleshooting (quick hits)

Clients can’t connect
- Check advertised.listeners (clients connect to that, not listeners)
- Confirm 9092 inbound rules for the client source IP/SG
Nodes don’t form a quorum
- Confirm 9093 is open between the Kafka nodes (it only needs to be reachable from the nodes themselves)
- Verify controller.quorum.voters matches the correct hostnames and IDs
- Ensure each node’s node.id is unique
Permission errors under /opt/kafka/logDir
- Fix ownership: sudo chown -R kafka:kafka /opt/kafka/logDir
Service “starts” but immediately stops
- Use sudo journalctl -u kafka.service -n 200 --no-pager to see the real error

Automation Script

You can automate the Linux-specific steps for Ubuntu or Amazon Linux 2023 using the shell script in the GitHub repository adilansari488/kafka-multi-node-cluster-setup. Steps outside the VM, such as creating the EC2 instances and configuring security groups, must still be completed manually.

Summary

If you have followed the above steps correctly, you should now have a Kafka cluster running with three nodes.

Feel free to give your feedback and suggestions.

HAPPY KAFKA 😊
