DEV Community

Lucia Cerchie

What does 'batching' mean when we're talking about Apache KafkaⓇ?

Today I learned that when you hear the word 'batch' in the context of Apache Kafka, it can mean one of two things:

  1. A reference to batch-only data processing systems. Batch-only systems process data in a bounded way: each job has a start time and an end time. Whether the batches are large or micro-sized, everything in a batch is processed together. That's in contrast to the continuous data streaming that Apache Kafka enables, in which data is processed in event-sized pieces as it arrives.

  2. Within the data streaming context, there's something called producer batching. The name is a bit misleading, because it's not really related to batch-only data processing systems. A Kafka producer, the client that publishes records to the Kafka cluster, groups records bound for the same partition into a batch and sends them in a single request to increase throughput. (Compression is a separate, optional producer setting that is applied to each batch.) This batching is just an optimization in how individual events are sent; the data is still handled continuously and in event-sized pieces, so it doesn't mean the same thing as batch-only data processing.
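To make producer batching more concrete, here is a minimal sketch in Python. It is a toy model, not the real Kafka client: the `ToyBatchingProducer` class, its method names, and the simple partitioner are all hypothetical. The `batch_size_bytes` parameter is loosely modeled on the real producer setting `batch.size`; the real producer also sends batches based on a time limit (`linger.ms`), which this sketch leaves out.

```python
from collections import defaultdict


class ToyBatchingProducer:
    """Toy model of producer-side batching (hypothetical, not the Kafka client)."""

    def __init__(self, batch_size_bytes, num_partitions):
        self.batch_size_bytes = batch_size_bytes
        self.num_partitions = num_partitions
        self.pending = defaultdict(list)       # partition -> records waiting to be sent
        self.pending_bytes = defaultdict(int)  # partition -> bytes waiting to be sent
        self.requests = []                     # each entry stands in for one network request

    def send(self, key, value):
        # Simplified deterministic partitioner (an assumption for this sketch).
        partition = sum(key.encode()) % self.num_partitions
        size = len(value.encode())
        # If this record would overflow the current batch, ship that batch first.
        if self.pending[partition] and self.pending_bytes[partition] + size > self.batch_size_bytes:
            self._flush_partition(partition)
        self.pending[partition].append(value)
        self.pending_bytes[partition] += size

    def _flush_partition(self, partition):
        if self.pending[partition]:
            self.requests.append((partition, list(self.pending[partition])))
            self.pending[partition].clear()
            self.pending_bytes[partition] = 0

    def flush(self):
        # Send whatever is still waiting, one request per partition.
        for partition in sorted(self.pending):
            self._flush_partition(partition)


producer = ToyBatchingProducer(batch_size_bytes=16, num_partitions=1)
for i in range(6):
    producer.send("sensor", f"evt-{i}")  # each value is 5 bytes
producer.flush()

for partition, records in producer.requests:
    print(partition, records)
```

Six records end up in two requests instead of six, which is where the throughput gain comes from. Each record is still an individual event: the consumer reads them one by one, and nothing here waits for a bounded dataset to be complete.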

In conclusion, 'batching' means, in a very general way, 'grouping stuff together'. But 'producer batching' and 'batch-only data processing systems' share the term only superficially, because they refer to the completely different functions I described above.
