Kafkacat is an awesome tool, and today I want to show you how easy it is to use and some of the cool things you can do with it.
All the features explained below are available in version 1.5.0.
Kafkacat is available from Homebrew (latest version) and some Linux repositories, though Linux repos may not contain the latest version. If that’s the case, you can always run the latest kafkacat from Docker.
Kafkacat is a command-line tool for producing and consuming Kafka messages. In addition, you can view metadata about the cluster or topics.
Kafkacat has quite a few parameters, and learning them all might look scary, yet most of the parameters make sense and are easy to remember. Let’s start with the most important: modes. When making a call to Kafkacat, you’ll always use one of its four modes. All the modes use a capital letter:
- -P = Produce data
- -C = Consume data
- -L = List metadata
- -Q = Query
The next most important option is the broker list (-b) and after that, it’s usually topic (-t).
So you can almost write your command like a story. The following command:
kafkacat -C -b localhost:9092 -t topic1 -o beginning
could be read as: I want to Consume from broker localhost:9092 and topic topic1 with offset set to the beginning.
Ok, now that I have hopefully convinced you that all those cryptic parameters make sense, let’s look at how to use Kafkacat to achieve some common tasks.
What do we need to produce data? At a minimum, a broker and a topic you want to write to.
kafkacat -P -b localhost:9092 -t topic1
The default message separator is a newline. Type your messages and separate them with Enter.
Producing keys and values
If you want to produce messages with keys, you need to specify the key delimiter (-K). Let’s use a colon to separate the key and the message in the input:
kafkacat -P -b localhost:9092 -t topic1 -K :
key3:message3
key4:message4
Note that this parameter uses a capital K.
Produce messages with headers
If you want to add headers to the messages, add them using the -H parameter, in a key=value format:
kafkacat -P -b localhost:9092 \
  -t topic1 \
  -H appName=kafkacat -H appId=1
As you can see, additional headers are added by repeating the -H flag. Note that all the messages produced will carry the two headers specified with the -H flags.
Produce data from a file
If you want to produce data from a file, use the -l option (as in: fi*l*e)… I did say that most of the parameters are easy to remember :). Let’s say we have a file called data.txt containing key-value pairs, separated by a colon:
key1:message1
key2:message2
key3:message3
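If you’re following along, a quick way to create this sample file is with a heredoc (the filename data.txt and the key-value pairs are just the examples from above):

```shell
# Create the sample input file, one key:value pair per line
cat > data.txt <<'EOF'
key1:message1
key2:message2
key3:message3
EOF
```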
So the command would be:
kafkacat -P -b localhost:9092 -t topic1 -K : -l data.txt
Produce messages with compression
Using the -z parameter, you can specify message compression:
kafkacat -P -b localhost:9092 -t topic1 -z snappy
Supported values are: snappy, gzip and lz4.
Consume all the messages from a topic
kafkacat -C -b localhost:9092 -t topic1
Note that, unlike kafka-console-consumer, kafkacat will consume the messages from the beginning of the topic by default. This approach makes more sense to me, but YMMV.
Consume X messages
You can control how many messages will be consumed using the count parameter (-c, lowercase).
kafkacat -C -b localhost:9092 -t topic1 -c 5
If you want to read data from a particular offset, you can use the -o parameter. The offset parameter is very versatile. You can:
Consume messages from the beginning or end
kafkacat -C -b localhost:9092 -t topic1 -o beginning
Use the constants beginning or end to tell kafkacat where to begin the consumption.
Consume from a given offset
kafkacat -C -b localhost:9092 -t topic1 -o 123
Use an absolute value for the offset and Kafkacat will start consuming from the given offset. If you don’t specify the partition to consume (using the -p parameter), Kafkacat will consume from the given offset in all the partitions.
Consume the last X messages from a partition
kafkacat -C -b localhost:9092 -t topic1 -o -10
We do this by using a negative offset value.
It is possible to start consuming from a given timestamp (in milliseconds) using the format -o s@start_timestamp. Technically, this is still consuming based on an offset; the difference is that kafkacat figures out the offset for you based on the provided timestamp.
kafkacat -C -b localhost:9092 -t topic1 -o s@start_timestamp
You can also stop consuming when a given timestamp is reached using:
kafkacat -C -b localhost:9092 -t topic1 -o e@end_timestamp
This is very useful when you are debugging an error: you have the timestamp of the error, and you want to check what the message looked like. Combining the start and end offsets, you can narrow down your search:
kafkacat -C -b localhost:9092 -t topic1 -o s@start_timestamp -o e@end_timestamp
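The timestamps are Unix epoch milliseconds. As a sketch, here’s one way to compute a window covering the last hour with standard shell tools and print the resulting command (the broker address and topic name are just the examples used above):

```shell
# Timestamps for -o s@/e@ are Unix epoch milliseconds
end_ts=$(( $(date +%s) * 1000 ))       # now, in milliseconds
start_ts=$(( end_ts - 3600 * 1000 ))   # one hour ago

# Print the resulting command so you can inspect it before running it
echo "kafkacat -C -b localhost:9092 -t topic1 -o s@${start_ts} -o e@${end_ts}"
```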
By default, Kafkacat will print out only the message payload (the value of the Kafka record), but you can print anything you’re interested in. To define a custom output, specify the -f flag (as in: format), followed by a format string. Here’s an example that prints a string with the key and value of the message:
kafkacat -C -b localhost:9092 -t topic1 \
  -f 'Key is %k, and message payload is: %s \n'
%k and %s are format string tokens. The output might be something like this:
Key is key3, and message payload is: message3
Key is key4, and message payload is: message4
So what can you print out using format string?
- topic (%t)
- partition (%p)
- offset (%o)
- timestamp (%T)
- message key (%k)
- message value (%s)
- message headers (%h)
- key length (%K)
- value length (%S)
As you’ve seen above, you can use newline (\n), carriage return (\r) or tab (\t) characters in the format string as well.
If messages are not written as strings, you need to configure a proper serde for keys and values using the -s parameter.
For example, if both the key and the value are 32-bit integers, you would read them using:
kafkacat -C -b localhost:9092 -t topic1 -s i
You can specify the serde separately for the key and value using:
kafkacat -C -b localhost:9092 -t topic1 -s key=i -s value=s
You will find the list of all the serdes in the kafkacat help (kafkacat -h).
Avro messages are a bit special since they require a schema registry. But Kafkacat has you covered there as well. Use the -r parameter to specify the schema registry URL:
kafkacat -C -b localhost:9092 \
  -t avro-topic \
  -s key=s -s value=avro \
  -r http://localhost:8081
In the example above, we’re reading messages from a topic where the keys are strings, but the values are Avro.
Listing metadata gives you info about topics: how many partitions each has, which broker is the leader for each partition, as well as the list of in-sync replicas (ISR).
Metadata for all topics
kafkacat -L -b localhost:9092
Simply calling -L with no other parameters will display the metadata for all the topics in the cluster.
Metadata for a given topic
If you want to see metadata for just one topic, specify it using the -t parameter:
kafkacat -L -b localhost:9092 -t topic1
If you want to find an offset of a Kafka record based on a timestamp, Query mode can help with that. Just specify the topic, partition and a timestamp:
kafkacat -b localhost:9092 -Q -t topic1:1:1588534509794
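The timestamp in Query mode is also Unix epoch milliseconds. If you don’t have one handy, you can compute it in the shell; this sketch builds the topic:partition:timestamp triple for “5 minutes ago” (topic1 and partition 1 are just the examples from above) and prints the resulting command:

```shell
# Epoch-millisecond timestamp for 5 minutes ago
ts=$(( ( $(date +%s) - 300 ) * 1000 ))

# Print the resulting Query-mode command so you can inspect it before running it
echo "kafkacat -b localhost:9092 -Q -t topic1:1:${ts}"
```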
I’ve tried to cover the most common use cases, but feel free to explore. If there’s something else that you find useful when using Kafkacat but is not mentioned above, make sure to leave your comment below, so that we all learn something new 😉
I have created a Kafka mini-course that you can get absolutely free. Sign up for it over at Coding Harbour.