Jan Schulte for Outshift By Cisco


Apache Kafka: A Quickstart Guide for Developers

If you plan to use Kafka in your application, here's everything you
need to know to start with a local developer setup.

Apache Kafka is a powerful tool for stream processing and decoupling
service-to-service communication. Once up and running, leveraging Kafka
in your architecture can significantly improve overall application
performance and reliability.

If you thought about using Kafka but needed help figuring out where to
start, this guide is for you.

Here's everything you'll learn in this guide:

  • Kafka Terminology
  • How to get started with Kafka on your local developer machine
  • How to send test messages
  • Useful command-line tools

Prerequisites

Here's what you'll need to follow along with this tutorial:

  • Docker
  • Docker Compose

Kafka Terminology You Should Know

Kafka introduces a few new concepts and keywords. We'll cover the
essential concepts here, so you know enough to get started.

*Kafka Cluster*

By default, Kafka runs as a cluster of several so-called brokers
(each broker is a Kafka instance, typically running on a dedicated machine).

Relying on a cluster instead of a single instance has several
advantages:

  • Increased fault tolerance: If a broker goes offline, producers and consumers can communicate with a different broker. Brokers replicate data among instances to prevent data loss.
  • Performance: With more than one broker available, not all consumers read from the same instance, so a single broker is less likely to become a bottleneck.

To coordinate brokers, Kafka relies on a tool called Zookeeper. Among
other duties, Zookeeper tracks cluster metadata, such as which brokers
exist and which broker leads each partition; older Kafka versions also
stored consumer offsets there.

*Consumer Offset*

When a consumer reads data from a partition, it keeps track of the
latest event it has read with an offset value. This integer value
indicates the last read position. The consumer syncs the offset back to
Kafka (or, in older versions, Zookeeper) so that if the consumer
crashes, it can quickly recover and resume from its last known read
position.

We don't have to worry too much about Zookeeper for our local
development environment, but it has to be there for Kafka to work.

If you'd like to learn more about Kafka's core concepts, check out this
post.

How to Install Kafka Locally (Using Docker)

Apache Kafka is written in Java. The advantage: it runs anywhere Java
runs. The downside: you need a Java Runtime installed and configured.
Therefore, we'll use a Docker Compose setup instead of downloading and
configuring Kafka directly on your system. Your advantage: you're up
and running in no time, without having to install and configure a Java
Runtime.

In a new directory, create a new file named docker-compose.yml and
paste in the following content:

version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.3.0
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  broker:
    image: confluentinc/cp-kafka:7.3.0
    container_name: broker
    ports:
      # Expose the broker to the host machine
      - "9092:9092"
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      # Two listeners: one advertised to host clients (localhost:9092),
      # one for other containers on the Compose network (broker:29092)
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_INTERNAL:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,PLAINTEXT_INTERNAL://broker:29092
      # Internal topics need a replication factor of 1 on a single broker
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1

This YAML snippet configures a single Kafka broker with a Zookeeper
instance.

Start the Kafka Server

Once you have saved docker-compose.yml, start the containers with:

$ docker-compose up

We'll leave this in the foreground to monitor the log for potential
errors.
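
If you'd rather get your terminal back, you can also start the stack
detached and follow the broker log separately (standard Docker Compose
flags):

$ docker-compose up -d
$ docker-compose logs -f broker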

So, with Kafka up and running, how do we continue?

The next step is a housekeeping item. Before producing and consuming
events, we need to create a topic.

A topic is a logical space that contains events of the same kind or for
a specific use case, such as:

  • payment_processed
  • user_signed_up
  • order_placed

Create a Kafka Topic

Kafka provides a set of command line utilities to create a new topic.
We'll focus on the essentials for now, so we can start producing
messages quickly.

$ docker exec broker \
  kafka-topics --bootstrap-server broker:9092 \
               --create \
               --topic quickstart

We run the necessary script directly inside the broker's Docker
container. When creating a topic, we can configure and customize
several options, such as the number of partitions. For the sake of
simplicity, we create the topic with default settings.
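
For instance, if you need more parallelism later, you can set the
partition count and replication factor explicitly. The topic name
quickstart-custom below is just an example, and with a single broker,
the replication factor can be at most 1:

$ docker exec broker \
  kafka-topics --bootstrap-server broker:9092 \
               --create \
               --topic quickstart-custom \
               --partitions 3 \
               --replication-factor 1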

Publish and Consume Messages

With the topic created, let's publish our first message. Run the
following command:

$ docker exec --interactive --tty broker \
  kafka-console-producer --bootstrap-server broker:9092 \
                         --topic quickstart

Once the prompt has loaded, let's type in some text. Each line becomes
a new message:

Hello, World!
Test Message

The producer publishes the messages to the topic once you hit return.
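
As an aside, the console producer can also send keyed messages if you
enable key parsing (both properties are standard kafka-console-producer
options); you then type each message as key:value:

$ docker exec --interactive --tty broker \
  kafka-console-producer --bootstrap-server broker:9092 \
                         --topic quickstart \
                         --property parse.key=true \
                         --property key.separator=: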

With the producer in place, let's consume our messages. In a new tab,
run the following command:

$ docker exec --interactive --tty broker \
  kafka-console-consumer --bootstrap-server broker:9092 \
                         --topic quickstart \
                         --from-beginning

Notice the --from-beginning parameter. It instructs the consumer to
read from the beginning of the topic. By default, Kafka retains all
messages in a topic for a configurable retention period (seven days,
unless configured otherwise). When a consumer reads a message, it
commits its latest offset back to Kafka; modern versions store it in
the internal __consumer_offsets topic rather than in Zookeeper. If this
particular consumer dies, it can retrieve its last committed offset and
recover from where it left off reading the topic.

With the consumer running, we now see the following output:

Hello, World!
Test Message
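
If you're curious about those committed offsets, the bundled
kafka-consumer-groups tool lets you inspect consumer groups. List the
existing groups first; the group name in the second command is just a
placeholder for whatever --list returns:

$ docker exec broker \
  kafka-consumer-groups --bootstrap-server broker:9092 --list

$ docker exec broker \
  kafka-consumer-groups --bootstrap-server broker:9092 \
                        --describe \
                        --group console-consumer-12345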

This approach is helpful to get started and get a first feel for Kafka
and its mechanics. In your application, however, you'll use a Kafka
client library to produce and consume messages.
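
As a rough sketch of what that can look like, here's a minimal producer
and consumer using the confluent-kafka Python client (pip install
confluent-kafka). The topic matches our examples; the group id
quickstart-app is simply a name we pick:

from confluent_kafka import Producer, Consumer

# Publish a single message to the quickstart topic
producer = Producer({"bootstrap.servers": "localhost:9092"})
producer.produce("quickstart", value=b"Hello from Python!")
producer.flush()  # block until delivery completes

# Read messages from the beginning of the topic
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "quickstart-app",     # example group name
    "auto.offset.reset": "earliest",  # same idea as --from-beginning
})
consumer.subscribe(["quickstart"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        print(msg.value().decode("utf-8"))
finally:
    consumer.close()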

Connecting to Kafka from the Command Line: kcat

Before we come to an end here, let's explore one additional helpful
tool: kcat (formerly known as kafkacat).

While the command line tools shipped with Kafka are helpful, they
require a JVM. In scenarios where no Java Runtime is installed, or
installing one isn't feasible, kcat comes in handy: a lightweight,
native tool that connects to a Kafka cluster or broker and lets you
produce and consume messages.
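
If you don't have kcat installed yet, it's available through most
package managers; on some platforms, the package still goes by its old
name, kafkacat:

$ brew install kcat               # macOS (Homebrew)
$ sudo apt-get install kafkacat   # Debian/Ubuntu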

Start kcat in producer mode:

$ echo "TEST" | kcat -P -b localhost:9092 -t quickstart

Note: kcat publishes whatever input we pipe into it. For
demonstration purposes, we use a simple echo output.

Start kcat in consumer mode:

$ kcat -C -b localhost:9092 -t quickstart
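
kcat also accepts offset flags, which are handy in scripts: -o
beginning reads from the start of the topic, and -e exits once the
consumer reaches the end:

$ kcat -C -b localhost:9092 -t quickstart -o beginning -e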

kcat comes in handy whenever you need to test your application's
producer or consumer side.

Final Thoughts

Even if Kafka requires some upfront learning, with tools like Docker
and kcat at our disposal, spinning up a dev instance of a broker
becomes much more manageable. Now that you can run Kafka locally, it's
time to start preparing your application code to publish and subscribe
to topics.

If you're thinking about deploying your (first) Kafka-based application
to production, check out Calisti. Calisti enables you to run Kafka on a
Kubernetes cluster without extensive manual configuration.

Also, check out this blog post to learn how to use Kafka client libraries in your code.
