DEV Community: Yogesh Kale

The Kafka config that silently breaks message ordering — and how to catch it

Yogesh Kale — Wed, 22 Apr 2026 15:38:08 +0000

A deep dive into the guarantees, tradeoffs, and production gotchas of implementing strictly ordered Kafka consumers — and the honest conversation about when not to.

The Why

In my previous post, I built a zero-loss Kafka consumer using @RetryableTopic. But I deliberately called out one tradeoff that deserves its own conversation:

"Moving a message to a retry topic breaks ordering guarantees for that partition."

That sentence sent me down a rabbit hole. If retryable topics — one of Spring Kafka's most powerful resilience tools — inherently break ordering, what does it actually take to guarantee strict message order in a production Kafka system?

This post is everything I found: the real guarantees Kafka provides, the configuration decisions that break ordering silently, and the honest tradeoffs you need to make before choosing this path.

What Kafka Actually Guarantees (And What It Doesn't)

Let's start with what Kafka promises, because most ordering bugs come from misreading this.

Kafka guarantees ordering within a partition. It makes no guarantees across partitions.

That's it. Full stop. Not per topic. Not per consumer group. Per partition.

Before going deeper, here is where each ordering guarantee actually lives in the pipeline — because each layer can independently break the guarantee above it:

PRODUCER                    BROKER                      CONSUMER
─────────────────────────   ─────────────────────────   ─────────────────────────
Stable partition key        Single partition            concurrency=1 (or 1 per partition)
  → same key always           → messages stored         enable.auto.commit=false
    routes to same partition    in append order          AckMode.RECORD
enable.idempotence=true     Replication (acks=all)      No @Async hand-offs
max.in.flight=1               → no gaps from            CooperativeStickyAssignor
  → no broker reordering        leader failover          → rebalance-safe handoff
  from retried batches
                            ← Ordering guarantee lives across ALL three layers →
                            Breaking any one layer silently breaks end-to-end order

This is also the map for debugging ordering incidents in production: work left to right. A gap in your offset sequence? Check the producer's max.in.flight setting. Out-of-order messages after a deployment? Check the rebalance assignor. Duplicates? Check idempotence and ack mode.

This has a direct consequence: if messages that must be processed in sequence land on different partitions, you have no ordering guarantee regardless of how well you configure your consumer. The ordering problem starts at the producer, not the consumer.

Consider an e-commerce order lifecycle:

Event 1: ORDER_PLACED     (orderId: 1001)
Event 2: PAYMENT_CAPTURED (orderId: 1001)
Event 3: INVENTORY_RESERVED (orderId: 1001)
Event 4: ORDER_SHIPPED    (orderId: 1001)

All four events must be processed in sequence. If EVENT_3 is processed before EVENT_2, inventory is reserved against an unpaid order. If EVENT_4 arrives before EVENT_3, the shipment goes out against unreserved stock.

This is exactly the class of problem where Kafka ordering matters — and where getting it wrong causes real business damage.

Or consider a financial context (as a reference, not a recommendation to use Kafka for core ledger operations): a multi-leg settlement where debit must precede credit. Kafka can work here, but only with explicit ordering discipline. We will come back to where Kafka ordering fits and where it does not.

The Foundation: Partition Key Strategy

Since ordering is partition-scoped, the partition key is the most important architectural decision you will make for ordered consumers.

The rule: all messages that must be processed in sequence must share the same partition key. Kafka routes messages with the same key to the same partition using a consistent hash.

// Producer: always key by the entity whose events must be ordered
ProducerRecord<String, String> record = new ProducerRecord<>(
    "order-events",
    order.getOrderId(),   // ← This is the partition key
    eventPayload
);

What goes wrong here in practice:

Null keys — Kafka round-robins null-key messages across all partitions. Order is immediately lost.
Random UUIDs as keys — same problem. Each message lands on a different partition.
Using producer instance ID or timestamp as the key — technically unique per message, which defeats the purpose entirely.
Changing partition count after go-live — Kafka's hash function maps keys to partitions based on partition count. Increasing partitions remaps keys to different partitions, breaking the ordering guarantee for in-flight and future messages.

Production rule: treat your partition count as immutable once consumers are live. Over-provision partitions at topic creation time rather than resizing later.

Consumer Concurrency: The Silent Ordering Killer

This is where the majority of ordering bugs I have seen are introduced — not at the architecture level, but in a configuration property that is easy to overlook.

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
        new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(3); // ← This is the problem
    return factory;
}

Setting concurrency > 1 means Spring Kafka spins up multiple consumer threads, and under normal operation each thread is assigned its own partition — a single partition is never served by two threads simultaneously. The real ordering risk is subtler: during a rebalance, a partition can be moved from one thread to another. The new thread can start consuming from that partition before the previous thread has finished processing its current batch. That window — old thread still running, new thread already polling — is where out-of-order processing occurs. It is easy to overlook precisely because it only surfaces under rebalance conditions, not in steady-state load testing.

For strict ordering, the rule is:

factory.setConcurrency(1); // One thread per listener container

Or, if you want to scale with partition count:

// Set concurrency = number of partitions
// Each thread owns exactly one partition
factory.setConcurrency(partitionCount);

The second option is correct but requires discipline: if you set concurrency=6 against a topic with 6 partitions, each thread owns one partition and ordering within each partition is preserved. The moment concurrency exceeds partition count, some partitions get no dedicated thread and ordering breaks.

Equally important: never use @Async inside a @KafkaListener method. The moment you hand processing off to another thread pool, the calling thread moves to the next message while the previous one is still being processed. Ordering is gone.

// This silently breaks ordering
@KafkaListener(topics = "order-events")
public void consume(ConsumerRecord<String, String> record) {
    asyncService.processAsync(record.value()); // returns immediately
}

// Processing must complete before the next message is touched
@KafkaListener(topics = "order-events")
public void consume(ConsumerRecord<String, String> record) {
    orderProcessingService.process(record.value()); // blocking, synchronous
}

Offset Commit Strategy: Where Zero-Loss and Ordering Intersect

If you read my previous post on zero-loss consumers, you already know that enable.auto.commit=true is dangerous for reliability. For ordered consumers, it is doubly dangerous.

Auto-commit uses a time-based interval. It has no awareness of whether your processing logic has completed. In a high-throughput system, you can auto-commit offsets for messages your application is still processing — and if the application crashes, those messages are never reprocessed, silently breaking both your delivery guarantee and your ordering assumptions.

The correct configuration:

// application.yml
spring:
  kafka:
    consumer:
      enable-auto-commit: false
      auto-offset-reset: earliest

# In your ContainerFactory
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.RECORD);

AckMode.RECORD commits the offset after each record is fully processed. For strict ordering this is the safest mode — you never advance past a message until it has been handled.

If you need manual control (for example, batching acks across a transaction boundary):

factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);

With MANUAL_IMMEDIATE, you must call acknowledgment.acknowledge() explicitly. The critical rule: acknowledge in order. Out-of-order acknowledgment creates offset gaps. When the consumer restarts, it will reprocess from the earliest unacknowledged offset, which may cause you to re-process messages you already handled.

Error Handling: The Hardest Tradeoff in Ordered Systems

This is where the real tension lives, and I want to be direct about it because most blog posts gloss over it.

The fundamental conflict: in an ordered consumer, a failed message blocks the partition. Every message behind it waits. But the standard tool for Kafka resilience — @RetryableTopic — resolves this by moving the failed message to a separate retry topic, which immediately breaks ordering for that partition.

You have to choose. There is no configuration that gives you both non-blocking retries and guaranteed ordering.

Option 1: Blocking Retry (Preserves Order, Risks Throughput)

@Bean
public DefaultErrorHandler errorHandler() {
    // Retry up to 3 times with exponential backoff
    ExponentialBackOffWithMaxRetries backOff = new ExponentialBackOffWithMaxRetries(3);
    backOff.setInitialInterval(1_000L);
    backOff.setMultiplier(2.0);
    backOff.setMaxInterval(10_000L);

    DefaultErrorHandler handler = new DefaultErrorHandler(
        new DeadLetterPublishingRecoverer(kafkaTemplate), backOff
    );

    // Never retry these — they will always fail
    handler.addNotRetryableExceptions(
        JsonParseException.class,
        ValidationException.class
    );

    return handler;
}

With DefaultErrorHandler and no retry topic, retries happen in-place. The partition is blocked while retries are attempted. This is the correct choice when:

Your messages have business-level dependencies on each other (true ordering requirement)
A transient failure retrying for a few seconds is acceptable
Your downstream can recover quickly

When retries are exhausted, the message goes to the DLT via DeadLetterPublishingRecoverer. At that point, you have a decision: stop the consumer (strict ordering, no skipping) or skip to the next message (lose strict ordering, maintain throughput).

Option 2: DLT With Consumer Pause (Best of Both Under Prolonged Failure)

For scenarios where the downstream is fully down (not just a transient blip), blocking retries will pin your partition for a long time. A better pattern is to pause the consumer and let messages accumulate safely in the broker — they stay in their original partition order, untouched, ready to be processed in sequence once the dependency recovers.

The trigger for a pause is your error handler detecting consecutive exhausted retries within a time window — not a single failure. A single transient failure should exhaust the backoff and either recover or go to DLT. It is only when failures are sustained (indicating downstream unavailability rather than a bad message) that pausing the entire consumer is warranted.

@Component
public class OrderedConsumerController {

    private final KafkaListenerEndpointRegistry registry;

    // Called by your error handler after N consecutive DLT publishes within a window,
    // indicating the downstream service is unavailable — not just a bad message
    public void pauseConsumer(String listenerId) {
        MessageListenerContainer container = registry.getListenerContainer(listenerId);
        if (container != null && container.isRunning()) {
            container.pause();
            log.warn("Consumer [{}] paused due to sustained downstream failure", listenerId);
        }
    }

    public void resumeConsumer(String listenerId) {
        MessageListenerContainer container = registry.getListenerContainer(listenerId);
        if (container != null) {
            container.resume();
            log.info("Consumer [{}] resumed after downstream recovery", listenerId);
        }
    }
}

Recovery monitoring is what completes this pattern. In my previous post on zero-loss consumers, I used a HeartbeatScheduler that periodically probed the downstream on a fixed interval and published a recovery event once the health check passed. The same approach applies here: a scheduled task that calls your downstream's health endpoint every 10–30 seconds, and calls resumeConsumer() on success. The key difference in an ordered consumer is that you must not resume until you are confident the dependency is stable — a premature resume that fails immediately will cause another pause cycle and risk committing a partial offset.

Producer-Side Guarantees: The Part Most Ordering Guides Skip

You can configure your consumer perfectly and still lose ordering at the broker level if your producer is misconfigured. Three settings matter:

spring:
  kafka:
    producer:
      acks: all                    # Wait for all in-sync replicas to acknowledge
      retries: 3
      properties:
        enable.idempotence: true
        max.in.flight.requests.per.connection: 1

acks=all — without this, a message can be acknowledged by the leader but not yet replicated. If the leader fails before replication, the message is lost. Your consumer never sees it, and the sequence has a permanent gap.

enable.idempotence=true — producer retries can cause duplicate messages at the broker if the original request succeeded but the acknowledgment was lost in transit. With idempotence enabled, Kafka deduplicates using a producer ID and sequence number. This is a prerequisite for correct ordered delivery.

max.in.flight.requests.per.connection=1 — this one is subtle. When a producer has multiple in-flight requests, a retry of request N can overtake request N+1 if N+1 was acknowledged first. The result: messages arrive at the broker out of order. Setting this to 1 ensures only one batch is in-flight at a time. Note: with enable.idempotence=true, Kafka allows up to 5 in-flight requests without reordering risk, but setting it to 1 is the conservative and safest choice for strictly ordered workloads.

Rebalance Handling: The Ordering Disruption You Cannot Ignore

A consumer group rebalance — triggered by a consumer joining, leaving, or timing out — temporarily pauses all consumers in the group while partition ownership is reassigned. Any in-flight processing during a rebalance can complete out of order relative to the messages that follow.

Use the Cooperative Sticky Assignor:

spring:
  kafka:
    consumer:
      properties:
        partition.assignment.strategy: org.apache.kafka.clients.consumer.CooperativeStickyAssignor

The default RangeAssignor is a stop-the-world rebalance: all consumers stop, all partitions are unassigned, then reassigned. CooperativeStickyAssignor only reassigns partitions that actually need to move. Consumers that keep their partition assignments continue processing uninterrupted. For ordered consumers on high-partition topics, this is a significant improvement.

Use Static Group Membership to reduce rebalance frequency:

spring:
  kafka:
    consumer:
      properties:
        group.instance.id: order-consumer-instance-1  # Unique per instance
        session.timeout.ms: 60000

Without static membership, every application restart triggers a rebalance. With group.instance.id, Kafka recognises the consumer as the same instance rejoining and avoids a full rebalance if it comes back within the session timeout. For Kubernetes deployments where pods restart frequently, this is a material difference.

Implement ConsumerRebalanceListener for clean handoffs:

@Component
public class OrderedRebalanceListener implements ConsumerRebalanceListener {

    private final Set<TopicPartition> currentAssignment = new HashSet<>();

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // Complete or checkpoint in-flight work before partitions are handed off
        log.info("Partitions revoked: {}. Completing in-flight processing.", partitions);
        // Flush any buffered state, commit offsets manually if using MANUAL ack mode
        currentAssignment.removeAll(partitions);
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        log.info("Partitions assigned: {}", partitions);
        currentAssignment.addAll(partitions);
    }
}

Idempotent Processing: Non-Negotiable

Rebalances cause redelivery. Network failures cause redelivery. Consumer restarts cause redelivery. Your consumer will process the same message more than once — the question is whether your business logic handles it correctly.

Every handler in an ordered consumer must be idempotent. Processing the same message twice must produce the same result as processing it once.

A simple deduplication pattern using the original offset and partition as a correlation key:

@Service
public class IdempotentOrderProcessor {

    private final OrderRepository orderRepository;
    private final ProcessedMessageRepository processedMessages;

    public void process(ConsumerRecord<String, String> record) {
        String deduplicationKey = record.topic() + "-" + record.partition() + "-" + record.offset();

        if (processedMessages.existsByKey(deduplicationKey)) {
            log.info("Duplicate message detected, skipping. Key: {}", deduplicationKey);
            return;
        }

        // Process the message
        OrderEvent event = deserialize(record.value());
        orderRepository.applyEvent(event);

        // Mark as processed (ideally in the same transaction as the business operation)
        processedMessages.save(new ProcessedMessage(deduplicationKey));
    }
}

Important note on this pattern: if you are using a relational database, saving the processed message key and applying the business operation should be in the same database transaction. If the application crashes between the business operation and saving the key, you get a reprocessed message on restart. Transactional outbox or a unique constraint on the key column enforced by the database are both valid approaches.

Sequence Numbers: Defense in Depth

Kafka offsets tell you where a message sits within a partition. But they do not tell you the intended business sequence — which can differ if messages are produced out of order by the upstream system, or if a producer retry creates a gap.

For critical ordered consumers, embedding a sequence number in the message payload gives you an additional layer of defense:

public class OrderEvent {
    private String orderId;
    private String eventType;
    private long sequenceNumber;      // Monotonically increasing per orderId
    private Instant occurredAt;
    // ...
}

In your consumer, validate before processing:

public void process(OrderEvent event) {
    long expectedSeq = orderSequenceTracker.getNext(event.getOrderId());

    if (event.getSequenceNumber() < expectedSeq) {
        log.warn("Duplicate or old event detected. orderId={}, seq={}, expected={}",
            event.getOrderId(), event.getSequenceNumber(), expectedSeq);
        return; // Idempotency: already processed
    }

    if (event.getSequenceNumber() > expectedSeq) {
        log.error("Gap in sequence detected. orderId={}, seq={}, expected={}",
            event.getOrderId(), event.getSequenceNumber(), expectedSeq);
        throw new SequenceGapException("Missing event in sequence for order: " + event.getOrderId());
    }

    applyEvent(event);
    orderSequenceTracker.advance(event.getOrderId());
}

A SequenceGapException thrown here will trigger your error handler. Whether you block-retry, pause the consumer, or route to a DLT depends on your business tolerance — but at least you have detected the gap rather than processing out-of-order silently.

Observability: You Cannot Debug What You Cannot See

Ordered consumers fail in subtle ways. A rebalance redelivery processed twice. A gap in the offset sequence. A message processed 200ms before its predecessor completed. None of these show up as errors in your application logs without deliberate instrumentation.

Log partition and offset on every message:

@KafkaListener(topics = "order-events", groupId = "order-processor")
public void consume(
    ConsumerRecord<String, String> record,
    @Header(KafkaHeaders.RECEIVED_PARTITION) int partition,
    @Header(KafkaHeaders.OFFSET) long offset) {

    log.info("Processing message. topic={}, partition={}, offset={}, key={}",
        record.topic(), partition, offset, record.key());

    orderProcessingService.process(record);

    log.info("Completed message. topic={}, partition={}, offset={}", 
        record.topic(), partition, offset);
}

Expose consumer lag as a metric. Consumer lag — the gap between the latest offset produced and the latest offset committed by your consumer — is the single most important health signal for an ordered consumer. A growing lag means messages are accumulating unprocessed, which eventually translates to ordering delays even if no failures have occurred.

With Micrometer and Spring Boot Actuator, consumer lag is exposed automatically if you include spring-kafka metrics. Ensure your Prometheus scrape config picks it up:

management:
  metrics:
    tags:
      application: order-consumer
  endpoints:
    web:
      exposure:
        include: health,metrics,prometheus

Alert on: lag growing beyond your SLA threshold, DLT publish rate above zero, offset gap between consecutive commits, and rebalance frequency.

Where Kafka Ordering Fits — And Where It Does Not

After all of the above, the honest answer: Kafka ordering is appropriate for event-driven workflows where sequence matters but the domain can tolerate the complexity overhead.

It fits well for:

Order lifecycle events (placed → paid → shipped → delivered)
User action streams that must be replayed in order
Audit trails where chronological sequence is required
CDC (Change Data Capture) streams that must apply database changes in order

It is the wrong tool for:

Core financial ledger operations requiring ACID guarantees across multiple legs
Workflows where any ordering failure must immediately halt the entire pipeline — a relational database with proper transaction management is simpler and safer
Low-volume, low-latency scenarios where the operational complexity of Kafka outweighs its throughput benefits

If your requirement is "these four operations must all succeed or all fail, in order, atomically" — that is a distributed transaction problem. Look at the Saga pattern with orchestration, or a transactional outbox, before reaching for Kafka ordering.

The Full Picture: Configuration Reference

Every property below has a reason. The two most commonly misconfigured ones in ordered consumers are max.poll.interval.ms and max.poll.records — and getting either wrong causes silent ordering failures rather than hard errors.

spring:
  kafka:
    consumer:
      enable-auto-commit: false
      auto-offset-reset: earliest
      properties:
        partition.assignment.strategy: org.apache.kafka.clients.consumer.CooperativeStickyAssignor
        group.instance.id: ${HOSTNAME}-consumer   # Static membership — prevents rebalance on pod restart
        session.timeout.ms: 60000

        # Conservative ceiling: limits how many records a single poll() fetches.
        # If your processing logic takes 200ms per record, 50 records = 10s of work per poll.
        # This directly informs max.poll.interval.ms below — they must be sized together.
        max.poll.records: 50

        # Must exceed your worst-case processing time for a full poll batch.
        # Formula: max.poll.interval.ms > max.poll.records × max_processing_time_per_record_ms
        # With 50 records at 200ms each = 10,000ms. 300,000ms gives a 30× safety margin.
        # If this deadline is breached, Kafka treats the consumer as dead, triggers a rebalance,
        # and the in-flight batch may be reprocessed by another thread — breaking ordering.
        max.poll.interval.ms: 300000
    producer:
      acks: all
      retries: 3
      properties:
        enable.idempotence: true
        max.in.flight.requests.per.connection: 1

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
        new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(1);  // Or = partition count with sticky assignment
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.RECORD);

    ExponentialBackOffWithMaxRetries backOff = new ExponentialBackOffWithMaxRetries(3);
    backOff.setInitialInterval(1_000L);
    backOff.setMultiplier(2.0);
    backOff.setMaxInterval(10_000L);

    factory.setCommonErrorHandler(new DefaultErrorHandler(
        new DeadLetterPublishingRecoverer(kafkaTemplate), backOff
    ));

    return factory;
}

Final Thoughts

Guaranteed message ordering in Kafka is achievable, but it is not a feature you turn on. It is a property that emerges from a set of consistent decisions made across the producer, the broker, and the consumer — and broken by any one decision made without understanding its implications.

The combination that holds in production:

Stable, business-meaningful partition keys set at the producer
enable.idempotence=true and max.in.flight.requests.per.connection=1 on the producer
enable.auto.commit=false with AckMode.RECORD on the consumer
concurrency=1 (or one thread per partition) with no async hand-offs
Blocking retry with DefaultErrorHandler — not @RetryableTopic, which trades ordering for throughput
Cooperative sticky assignor and static group membership for rebalance stability
Idempotent processing with deduplication keyed on topic + partition + offset
Sequence numbers in the payload for gap detection
Consumer lag and offset gap alerting as the primary health signals

I have not deployed this exact pattern to production — but I have run the zero-loss consumer from my previous post in production, and the foundation is the same: respect Kafka's actual guarantees, design for redelivery, and make your tradeoffs explicit rather than discovering them in an incident.

If you have implemented ordered consumers in production and found gaps in this — or made different tradeoff decisions — I would genuinely like to hear about it in the comments.

This is part of a series on production-grade Kafka with Spring Boot. The previous post covers zero-loss delivery with retryable topics.

Tags: #kafka #springboot #java #distributedsystems #eventdrivenarchitecture #microservices

Building a Zero-Loss Kafka Consumer with Spring Kafka Retryable Topics

Yogesh Kale — Tue, 14 Apr 2026 15:36:57 +0000

A production-grade approach to integrating Kafka with legacy systems using Spring Kafka Retryable Topics.

The Why

When we first discussed connecting a Kafka stream to our legacy downstream system, the obvious question was: why not just consume directly in the downstream app?

The honest answer — our legacy system simply isn't built for it. Kafka's polling model doesn't fit its threading architecture. Connections are tightly controlled, and sneaking a Kafka client into it would increase system risk more than it would solve. But the deeper issue is a fundamental mismatch: our downstream system is synchronous and deterministic. Kafka is asynchronous. That tension can't just be wished away.

So we built something in between — a stateless Kafka adaptor that absorbs event streams, converts them to synchronous HTTP calls, and guarantees zero-loss delivery. This post is everything we learned building it, including the parts that bit us in production.

Tech Stack

Java 21
Spring Boot 3.x
Spring Web (Spring MVC)
Spring Kafka (with @RetryableTopic)
OkHttp 5.x

1. The Core Challenge: The Stateless Synchronous Hand-off

Most Kafka consumer implementations follow a "consume-and-save" pattern — pull the message, persist it locally, acknowledge it. Simple and safe.

Our adaptor doesn't have that luxury. It has no local database. There's nowhere to buffer. So the only way to guarantee delivery is to make the Kafka acknowledgement contingent on the downstream system actually accepting the message.

We call this a Synchronous Hand-off: the consumer only ACKs the Kafka message once it receives a positive functional response from the downstream HTTP endpoint. The message is only "gone" from Kafka's perspective once it has safely landed downstream. No silent drops, no optimistic acks.

This one decision shapes the entire resilience strategy that follows.

2. A Multi-Layered Resilience Strategy

Running in production quickly teaches you to distinguish between a blip (a few seconds of network jitter) and a blackout (the downstream service is down for 30 minutes). A flat retry policy treats both the same way — and that's a problem.

So we built a tiered recovery model.

Layer 1 — Synchronous HTTP Retries

The first line of defence is a quick, synchronous retry loop inside our HttpClientService. For a transient 500 or a connection timeout, we don't want to immediately escalate to an async retry topic — that's expensive and slow. A few fast retries handle the vast majority of brief network hiccups before anything more serious kicks in.

Layer 2 — Asynchronous Non-Blocking Retries (`@RetryableTopic`)

If the synchronous retries are exhausted, we hand the problem to Spring Kafka's @RetryableTopic. This is really the heart of the whole design.

The key insight here is Head-of-Line blocking avoidance. Without retry topics, a single failed message can pin a partition — nothing behind it gets processed until it either succeeds or is manually skipped. With @RetryableTopic, the failed message is published to a separate retry topic and the main consumer moves on immediately. System throughput stays stable even while the downstream is recovering.

There's a tradeoff worth being upfront about: moving a message to a retry topic breaks ordering guarantees for that partition. Subsequent messages from the same partition will be processed before the retried one. For a lot of systems that's a deal breaker — but for ours it isn't, because each event is an independent financial instruction (an account update, a customer update, etc.) identified by its own transaction ID. The downstream system processes them idempotently by transaction ID, not by arrival sequence. If your domain requires strict ordering — say, a series of balance mutations on the same account that must apply in order — @RetryableTopic in this form is not the right tool without additional sequencing logic on top.

Here's what our actual consumer looks like:

@RetryableTopic(
  include = {DownStreamDownException.class},
  attempts = "${kafka.retry.topic.maxAttempts:1}",
  backoff = @Backoff(delayExpression = "${kafka.retry.backoff.delay:3600000}"),
  autoCreateTopics = "${kafka.autoCreateTopics:false}",
  dltTopicSuffix = DLQ_SUFFIX,
  retryTopicSuffix = RETRY_SUFFIX
)
@KafkaListener(
  topics = {"${spring.kafka.topic.updates.account}"},
  groupId = "${spring.kafka.consumer.group-id}",
  id = ACCOUNT_UPDATES_CONSUMER_ID
)
public void consumeMessage(
  ConsumerRecord<String, String> consumerRecord,
  @Header(value = KafkaHeaders.RECEIVED_PARTITION, required = false) Integer partition,
  @Header(value = KafkaHeaders.OFFSET, required = false) Long offset,
  @Header(value = KafkaHeaders.RECEIVED_TOPIC) String receivedTopic) {
    consume(consumerRecord, partition, offset, receivedTopic);
}

A few things worth calling out here:

include = {DownStreamDownException.class} — only retry on this specific exception. Validation errors and bad payloads should never retry; they should fail fast to DLT.
autoCreateTopics = false — this matters a lot in production (more on this below).
The backoff delay is a flat 1 hour by default (3600000ms, configurable from property file). This was a deliberate choice, not a default we left in place. In a financial context, the downstream outage pattern we've seen is almost never "back in 30 seconds" — it's either a brief HTTP blip (caught by Layer 1 sync retries) or a proper incident taking 20–60 minutes to resolve. A short exponential backoff would just fill the retry topic with noise during that window. One hour means the retry fires once, cleanly, after the service is likely recovered.

And when the HTTP call fails inside processMessage:

@Override
protected void processMessage(String messageWithMetadata, String endpoint,
  Integer originalPartition, Long originalOffset, String originalTopic) {

  boolean success = httpClientService
    .sendWithRetry(messageWithMetadata, endpoint, originalPartition, originalOffset, originalTopic)
    .orElse(false);

  if (!success) {
    log.debug("Request failed for Payload: {}", messageWithMetadata);
    throw new DownStreamDownException("DownStream service unavailable for topic: " + originalTopic);
  }

  log.debug("Request Payload: {}", messageWithMetadata);
  trackDeliverySuccess(originalTopic);
}

Throwing DownStreamDownException is the trigger. Spring Kafka catches it, sees it matches the include list, and routes the message to the retry topic. Clean and intentional.

Layer 3 — Dead Letter Topic (DLT)

If all retry attempts are exhausted, the message lands in the DLT (Spring Kafka calls it a Dead Letter Topic, not Queue — though the concept is the same).

Two scenarios send messages straight here without retrying:

All retry attempts exhausted.
Exceptions not in the include list — e.g., a malformed payload that would fail on every attempt anyway.

@DltHandler
public void handleDlt(
  ConsumerRecord<String, String> consumerRecord,
  @Header(value = KafkaHeaders.RECEIVED_PARTITION, required = false) Integer partition,
  @Header(value = KafkaHeaders.OFFSET, required = false) Long offset,
  Exception ex) {
    trackDlq();
    handleDlq(consumerRecord, partition, offset);
}

DLT messages are manually monitored. In practice, a message in the DLT is either a bad payload (fix the producer) or evidence of a prolonged downstream outage (manual replay needed after recovery).

3. Topic Naming & Consumer Group Gotcha (We Learned This the Hard Way)

Spring Kafka expects specific naming conventions for retry and DLT topics:

Retry topic → <main-topic>-retry
DLT topic → <main-topic>-dlt

It also automatically creates consumer groups for these by appending suffixes to your main group ID.

In UAT, Kafka had auto.create.topics.enable=true and group auto-creation on. Everything just worked. We moved to production — where auto-create is disabled for good reason — and nothing worked. The retry flow silently broke because the consumer groups didn't exist and we didn't have permission to create them.

The fix: explicitly create all three consumer groups before deploying and make sure your service account has the right ACLs on each. This should be part of your deployment checklist, not a production incident.

4. The Silent 20x Header Bloat

This one surprised us. By default, Spring Kafka attaches the full exception stack trace to the headers of every message it routes to a retry topic. In a high-throughput system, a 1KB message can balloon to 20KB just from headers.

Multiply that across millions of retries and you have a serious storage and network overhead problem that never showed up in unit tests.

We solved it with a KafkaProducerInterceptor:

@Override
public ProducerRecord<String, String> onSend(ProducerRecord<String, String> producerRecord) {
  Headers headers = producerRecord.headers();
  String topic = producerRecord.topic();

  if (topic.endsWith("-retry") || topic.endsWith("-dlt")) {
    // Strip the full stack trace - keep essential error metadata, drop the wall of text
    for (String headerKey : HEADERS_TO_REMOVE_RETRY) {
      headers.remove(headerKey);
    }
  }

  return new ProducerRecord<>(
    producerRecord.topic(),
    producerRecord.partition(),
    producerRecord.key(),
    producerRecord.value(),
    headers
  );
}

HEADERS_TO_REMOVE_RETRY includes kafka_exception-stacktrace and a few others that add size without adding operational value. The result: roughly 90% reduction in retry topic storage overhead, while still keeping the exception message and class for debugging.

props.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, HeaderFilteringProducerInterceptor.class.getName());

5. Preserving Original Metadata Across Retries

In financial systems, idempotency is non-negotiable. The downstream service needs to detect duplicates — and it does that using the original Kafka offset and partition as a correlation key.

The problem: when a message moves to a retry topic, Kafka assigns it a new offset and partition. If you blindly forward that metadata, your duplicate detection breaks.

Spring Kafka saves the day here by writing the original coordinates into headers (kafka_original-offset, kafka_original-topic, kafka_original-partition) before routing to the retry topic.

Our KafkaMessageUtil reads these:

public static Long resolveOffset(ConsumerRecord<?, ?> record, String topic) {
  if (topic.contains("-retry")) {
    // Use the original offset from headers, not the retry topic's offset
    return extractLongHeader(record, "kafka_original-offset");
  }
  return record.offset();
}

The downstream service always gets the coordinates of the first delivery attempt, regardless of how many retries have happened. This makes duplicate detection work cleanly across the entire message life cycle.

6. The Heartbeat & Operational Safety

There's a well-known failure mode called the Thundering Herd: your downstream comes back online after an outage, and every backed-up consumer immediately floods it with requests, crashing it again.

We prevent this with a HeartbeatScheduler that pings the downstream health endpoint every 300 seconds. If the health check fails, we call pause() on all active consumer containers. Messages accumulate safely in the broker. When the health check recovers, we call resume().

Worth being honest about what this approach does not cover. A heartbeat tells you if the service is alive — not if it's healthy. A downstream that is up but returning 500 on 40% of real requests will pass the health check and still get flooded the moment consumers resume. We accepted this limitation because our downstream health endpoint is a genuine deep check — it validates DB connectivity and core dependencies, not just a surface-level HTTP 200. Combined with the 1-hour backoff window, the retry storm risk is manageable. But it is a known gap, and it is the main reason we are moving to circuit breakers.

A ConsumerMetricsScheduler logs per-minute delivery success and DLT rates. This gives the SRE team a real-time view of system health without digging through logs.

7. Where We're Heading — Circuit Breakers

We're evaluating Resilience4j as a replacement for the Heartbeat approach. The key advantages:

Monitors actual traffic — rather than a synthetic health check.
Half-Open state — instead of resuming all consumers at once after recovery, it lets a few "probe" requests through first. Only if those succeed does it open the floodgates.
Removes the background polling thread — moves toward a fully event-driven resilience model.

Final Thoughts

Building a zero-loss Kafka adaptor is genuinely harder than it looks. The individual pieces — retries, dead letters, health checks — are all standard. The challenge is in the details: what happens to headers when you retry? What offset do you report to the downstream? What's your consumer group strategy in a locked-down production cluster?

The combination that worked for us:

A stateless consumer with synchronous HTTP hand-offs
Spring Kafka @RetryableTopic for async, non-blocking retry flows
A producer interceptor to keep retry-topic storage sane
Original metadata preservation for idempotency
Heartbeat-driven pause/resume to prevent thundering herd

It's production-proven now. Hopefully this saves someone else the three days of debugging we did.

Originally published on Medium.

DEV Community: Yogesh Kale

The Kafka config that silently breaks message ordering — and how to catch it

The Why

What Kafka Actually Guarantees (And What It Doesn't)

The Foundation: Partition Key Strategy

Consumer Concurrency: The Silent Ordering Killer

Offset Commit Strategy: Where Zero-Loss and Ordering Intersect

Error Handling: The Hardest Tradeoff in Ordered Systems

Option 1: Blocking Retry (Preserves Order, Risks Throughput)

Option 2: DLT With Consumer Pause (Best of Both Under Prolonged Failure)

Producer-Side Guarantees: The Part Most Ordering Guides Skip

Rebalance Handling: The Ordering Disruption You Cannot Ignore

Idempotent Processing: Non-Negotiable

Sequence Numbers: Defense in Depth

Observability: You Cannot Debug What You Cannot See

Where Kafka Ordering Fits — And Where It Does Not

The Full Picture: Configuration Reference

Final Thoughts

Building a Zero-Loss Kafka Consumer with Spring Kafka Retryable Topics

The Why

Tech Stack

1. The Core Challenge: The Stateless Synchronous Hand-off

2. A Multi-Layered Resilience Strategy

Layer 1 — Synchronous HTTP Retries

Layer 2 — Asynchronous Non-Blocking Retries (@RetryableTopic)

Layer 3 — Dead Letter Topic (DLT)

3. Topic Naming & Consumer Group Gotcha (We Learned This the Hard Way)

4. The Silent 20x Header Bloat

5. Preserving Original Metadata Across Retries

6. The Heartbeat & Operational Safety

7. Where We're Heading — Circuit Breakers

Final Thoughts

Layer 2 — Asynchronous Non-Blocking Retries (`@RetryableTopic`)