Laxman Patel

The Offset Reset Dilemma: Avoiding Surprise Replays in Kafka

A consumer needs an offset (a bookmark) to know where to start reading from a partition. Normally:

Kafka stores the last committed offset for each group in the internal `__consumer_offsets` topic. When you restart a consumer, it resumes from that committed offset.

But… what if there is no valid offset for a partition?
That’s where auto.offset.reset kicks in.
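
Here’s a minimal sketch of where the setting lives in a plain Java consumer. The broker address, group id, and topic name are placeholders, not anything prescribed by Kafka:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OffsetResetDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Only consulted when the group has no valid committed offset for a partition.
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo-topic")); // placeholder topic
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            records.forEach(r ->
                System.out.printf("partition=%d offset=%d value=%s%n",
                    r.partition(), r.offset(), r.value()));
        }
    }
}
```

With earliest here, a brand-new group replays the topic from the start; switch it to latest and the same group would only see messages produced after it joined.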

🚨 When Does “no valid offset” Happen?

  • New consumer group (first time this group subscribes to a topic → no committed offsets exist yet).
  • Offsets got deleted (Kafka has a retention policy for committed offsets — e.g., offsets.retention.minutes).
  • Offset is invalid (maybe pointing to data that was deleted due to log retention).
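
If you’re not sure which of these cases you’re in, you can ask the cluster what a group has committed before starting the consumer. A sketch using the Java AdminClient, assuming a placeholder group id and broker address:

```java
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CommittedOffsetCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        try (Admin admin = Admin.create(props)) {
            // An empty map means no committed offsets, so auto.offset.reset will decide.
            Map<TopicPartition, OffsetAndMetadata> offsets =
                admin.listConsumerGroupOffsets("demo-group") // placeholder group id
                     .partitionsToOffsetAndMetadata()
                     .get();

            if (offsets.isEmpty()) {
                System.out.println("No committed offsets; auto.offset.reset will kick in.");
            } else {
                offsets.forEach((tp, om) ->
                    System.out.printf("%s -> committed offset %d%n", tp, om.offset()));
            }
        }
    }
}
```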

⚙️ auto.offset.reset Options

1. earliest

  • Start reading from the beginning of the log (smallest available offset).
  • Consumer will replay all historical data.
  • Good for batch jobs, data pipelines, or when you really want everything (e.g., reindexing a search database).

2. latest

  • Start reading from the end of the log (largest offset).
  • Consumer ignores past data → only gets new messages arriving after it joined.
  • Good for real-time dashboards or monitoring, where you don’t care about history.
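
Under the hood, earliest resets a partition to its log start offset and latest to its log end offset. You can inspect both boundaries without even joining a group; a sketch, with topic, partition, and broker as assumptions:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class LogBoundaries {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // No group.id needed: we only read log boundaries, we never subscribe or commit.
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> tps = List.of(new TopicPartition("demo-topic", 0)); // placeholder
            Map<TopicPartition, Long> begin = consumer.beginningOffsets(tps); // where "earliest" lands
            Map<TopicPartition, Long> end   = consumer.endOffsets(tps);      // where "latest" lands
            tps.forEach(tp -> System.out.printf("%s: earliest=%d latest=%d%n",
                tp, begin.get(tp), end.get(tp)));
        }
    }
}
```

Note that beginningOffsets can be well above zero if log retention has already deleted old segments, which is exactly the “offset is invalid” case above.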

📌 Why is this Important?

  • If you forget this setting, a new (or expired) consumer group can accidentally replay millions of messages.
  • Conversely, you can silently miss data if you start from latest in a system that needs history.
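
If neither default feels safe, the setting also accepts a third value, none, which makes the consumer throw instead of silently resetting, so your code has to make the choice explicitly. A sketch, again with placeholder broker, group id, and topic:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.NoOffsetForPartitionException;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ExplicitResetPolicy {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // "none": fail loudly when there is no valid offset instead of guessing.
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "none");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo-topic")); // placeholder topic
            try {
                consumer.poll(Duration.ofSeconds(1));
            } catch (NoOffsetForPartitionException e) {
                // Decide deliberately: skip history here, or seekToBeginning to replay it.
                consumer.seekToEnd(e.partitions());
            }
            // Continue polling as usual; every starting position is now explicit.
        }
    }
}
```

This turns a surprise replay into an explicit, reviewable decision in your code.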
