Markus

Originally published at the-main-thread.com

Your Second Reactive Messaging App: The One That Survives Production

Your first reactive messaging app answered one question:

“How do messages flow?”

Cute. Useful. Also: not the hard part.

Your second app has to answer the questions production will ask at 2:17 a.m.:

  • What happens when messages are duplicated?
  • What happens when downstream services fail?
  • How do retries stop before your system melts down?
  • How do you prove what was processed… and what wasn’t?

This post is a teaser for a full hands-on tutorial where we build a claims intake pipeline in Quarkus that assumes things go wrong, and keeps working anyway. Think “realistic system,” not “hello Kafka.”

If you want the full build (with all code, all configs, DLQ wiring, idempotency table, and failure injection), I linked it at the end.

The mental model shift

Most “intro” messaging tutorials quietly assume this fairytale:

  • Messages arrive once
  • In order
  • And every downstream dependency behaves like a polite microservice in a demo

Production does not do polite.

Kafka is at-least-once. That means duplicates are normal.
External systems fail. That means timeouts are normal.
Ops needs answers. That means “it probably worked” is not a strategy.

So the second app is not about Kafka features.

It’s about discipline:
designing a pipeline that can take hits and still make forward progress.

What we build (high level)

A claims pipeline with the boring stuff that makes it real:

  • Invalid data gets rejected (without taking the service down)
  • Transient failures retry with backoff (and stop)
  • Permanent failures go to a dead-letter topic
  • Duplicates get ignored safely via idempotency
  • Operators can tell what happened

In the full tutorial, the pipeline is split into stages (submitted → validated → enriched → accepted), and we deliberately inject failures so you can watch the system behave under stress.

Because that’s the only honest way to learn.
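To make that stage split concrete: in Quarkus, each stage is just a channel mapped to a Kafka topic in application.properties. A minimal sketch of the wiring for the first hop — channel and topic names here are illustrative, not the tutorial's actual config:

```properties
# Hypothetical channel-to-topic wiring for the first two stages.
# Serializers, groups, and the remaining stages omitted for brevity.
mp.messaging.incoming.submitted.connector=smallrye-kafka
mp.messaging.incoming.submitted.topic=claims-submitted

mp.messaging.outgoing.claims-validated.connector=smallrye-kafka
mp.messaging.outgoing.claims-validated.topic=claims-validated
```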

A tiny taste: “throw exceptions on purpose”

Here’s the vibe. In the validation stage, we do the unthinkable:

We fail.

On purpose.

@Incoming("submitted")
@Outgoing("claims-validated")
public ClaimValidated validate(ClaimSubmitted claim) {
    // Fail fast: an invalid claim never reaches the next stage.
    if (claim.amount() <= 0) {
        throw new IllegalArgumentException("Claim amount must be positive");
    }
    return new ClaimValidated(claim.eventId(), claim.claimId(), claim.customerId(), claim.amount());
}

The point is not the exception.

The point is what happens next.

Does the pipeline:

  • crash?
  • block a partition forever?
  • retry the poison message endlessly?
  • quietly drop it and pretend nothing happened?

The answer depends on how you design failure behavior per stage.

That’s where most teams get surprised.
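In SmallRye Reactive Messaging, that per-stage decision is literally one line on the incoming channel. A hedged sketch (the channel name matches the snippet above; pick one strategy per stage):

```properties
# fail (the default): nack stops the application on the first bad message
# ignore: ack and drop the failed message, keep consuming
# dead-letter-queue: ship it to a DLQ topic, keep consuming
mp.messaging.incoming.submitted.failure-strategy=dead-letter-queue
```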

The real boss fight: duplicates

If you build event-driven systems long enough, you stop asking “how do I avoid duplicates?”

You start asking:

“How do I make duplicates harmless?”

The pattern is simple and brutal:

  • Every event has a stable eventId
  • You persist a “processed events” marker
  • You treat that table like truth

Exactly-once is rare.

Idempotency is mandatory.

In the full tutorial, this becomes the core method of the whole pipeline.
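In the tutorial the "processed events" marker is a Postgres table written via Panache; the shape of the idea fits in a few lines of plain Java. A minimal in-memory sketch (all names here are illustrative, and in production the set would be a database table updated in the same transaction as the side effects):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Minimal idempotency guard: remember which eventIds were already handled.
public class IdempotencyGuard {

    private final Set<String> processedEventIds = ConcurrentHashMap.newKeySet();

    /**
     * Returns true if this eventId is new (and marks it processed),
     * false if it is a duplicate that should be acked and skipped.
     */
    public boolean markIfFirst(String eventId) {
        return processedEventIds.add(eventId);
    }

    public static void main(String[] args) {
        IdempotencyGuard guard = new IdempotencyGuard();
        System.out.println(guard.markIfFirst("evt-42")); // true: first delivery
        System.out.println(guard.markIfFirst("evt-42")); // false: duplicate, skip
    }
}
```

The detail that matters: the duplicate is still acknowledged. You don't nack it, you don't retry it — you record that it's harmless and move on.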

DLQ > blind retries

Retries feel comforting. Like a warm blanket.

In production, blind retries are often a gasoline blanket.

Why?

Because connector-level retries can:

  • block a partition
  • amplify failure storms
  • hide poison messages until everything is on fire

A dead-letter queue (DLQ) is not “extra complexity.”

It’s how you keep the system alive while still preserving the evidence.
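With the Kafka connector, wiring the DLQ is configuration, not code. A sketch, where the DLQ topic name is an assumption (the connector derives a default if you omit it):

```properties
mp.messaging.incoming.submitted.failure-strategy=dead-letter-queue
# Where failed messages (with failure metadata in headers) end up:
mp.messaging.incoming.submitted.dead-letter-queue.topic=claims-dead-letter
```

That preserved message is your evidence: what failed, why, and what to replay once the bug is fixed.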

If you only take one idea from this teaser

Your first messaging app teaches flow.

Your second one teaches survival.

And survival is not configured. It’s designed.

The full hands-on tutorial

If you want the complete end-to-end build in Quarkus (project bootstrap, Kafka wiring, failure strategies, DLQ topic, idempotent persistence with Panache + Postgres via Dev Services, plus test runs and log traces), that’s here:

👉 https://www.the-main-thread.com/p/event-driven-claims-pipeline-java-quarkus-kafka
