While covering the Outbox Pattern, I realized there's another side of event reliability to discuss — and that led me to write this article.
In event-driven systems, a lot of engineering discussions focus on publishing events reliably. That’s usually where the Transactional Outbox Pattern enters the conversation. And rightly so.
Reliable event publishing is hard.
But over time, I’ve noticed something in backend systems:
publishing events reliably is only half the problem.
The other half is much harder.
Processing them reliably.
Because even if:
- Kafka delivers the event,
- RabbitMQ retries correctly,
- the Outbox Pattern guarantees publication,
distributed systems still face another uncomfortable reality:
duplicate processing is inevitable.
Consumers crash.
Retries happen.
Brokers redeliver events.
Deployments interrupt processing.
Offsets commit at the wrong time.
Network failures create uncertain states.
And suddenly you're staring at production wondering why:
- a payment was processed twice,
- inventory was deducted twice,
- customers received three confirmation emails,
- some workflow executed multiple times.
That's where the Inbox Pattern enters the conversation.
The Outbox Pattern solves reliable event publishing.
The Inbox Pattern solves reliable event processing.
And if you're building serious event-driven systems, you usually need both.
1. The Problem Starts With At-Least-Once Delivery
Most messaging systems don't promise exactly-once delivery; they promise at-least-once delivery. This includes Apache Kafka, RabbitMQ, and many cloud messaging platforms.
Note:
Some readers might think I've overlooked Kafka's Exactly-Once Semantics (EOS). By default, Kafka operates on an at-least-once model, but it does offer EOS. It achieves this using idempotent producers (the broker assigns each producer an ID, the producer attaches a sequence number to every batch it sends, and the broker uses the two to detect and discard duplicates) and a transactional API (which allows atomic writes across multiple partitions).
The Catch: It requires explicit configuration and only applies within the Kafka ecosystem (from Kafka topic to Kafka topic). Once you move data out of Kafka to an external database, you are back to managing delivery guarantees yourself.
At-least-once delivery is usually the correct trade-off.
Because systems prefer duplicate delivery over silent message loss.
That sounds reasonable until duplicate processing starts creating business problems.
A Failure Scenario
Let's say we have a payment consumer.
It receives a PaymentCompleted event.
The consumer does 3 things:
- updates the database
- sends confirmation email
- acknowledges the message
Now imagine this sequence:
- DB transaction succeeds
- Service crashes before acknowledgment
- Broker re-delivers event
- Consumer processes again
Now:
- duplicate emails get sent,
- workflows execute twice,
- business state becomes inconsistent.
This is a common distributed systems problem in production.
And retries make it unavoidable eventually.
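The sequence above can be reproduced with a small in-memory simulation. This is only a sketch, not a real broker client: `NaivePaymentConsumer`, `RedeliveryDemo`, and the `crashBeforeAck` flag are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a consumer with no de-duplication.
class NaivePaymentConsumer {
    final List<String> updatedPayments = new ArrayList<>(); // stands in for the database
    final List<String> sentEmails = new ArrayList<>();      // stands in for the email service

    // Returns true if the message was acknowledged, false if the consumer
    // "crashed" after doing the work but before the acknowledgment.
    boolean handle(String eventId, boolean crashBeforeAck) {
        updatedPayments.add(eventId);   // 1. update the database
        sentEmails.add(eventId);        // 2. send confirmation email
        return !crashBeforeAck;         // 3. acknowledge (or crash first)
    }
}

class RedeliveryDemo {
    // Simulates a broker that redelivers until it receives an acknowledgment.
    static int deliverUntilAcked(NaivePaymentConsumer consumer, String eventId) {
        int deliveries = 0;
        boolean acked = false;
        while (!acked) {
            // Only the first delivery crashes before the acknowledgment.
            acked = consumer.handle(eventId, deliveries == 0);
            deliveries++;
        }
        return deliveries;
    }

    public static void main(String[] args) {
        NaivePaymentConsumer consumer = new NaivePaymentConsumer();
        int deliveries = deliverUntilAcked(consumer, "payment-42");
        // Prints: deliveries=2, emails sent=2
        System.out.println("deliveries=" + deliveries
                + ", emails sent=" + consumer.sentEmails.size());
    }
}
```

One lost acknowledgment is enough: the event is delivered twice, and the customer gets two emails for one payment.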
2. Why Idempotency Alone Is Often Not Enough
Whenever duplicate processing comes up, the usual advice is:
“Make consumers idempotent.”
It is good advice, but incomplete. In real systems, idempotency is often harder than it sounds.
Simple Idempotency Works for Simple Cases
Some operations are naturally safe.
Example:
user.setStatus(ACTIVE);
Running it twice or ten times causes no harm. But not many workflows are that simple.
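To make the contrast concrete, here is a minimal sketch that puts a naturally idempotent operation next to one that is not. `UserAccount` and `StockItem` are hypothetical classes used only for this illustration.

```java
// Hypothetical domain objects, for illustration only.
class UserAccount {
    String status = "PENDING";

    // Idempotent: the result is the same no matter how often it runs.
    void activate() { status = "ACTIVE"; }
}

class StockItem {
    int quantity;

    StockItem(int quantity) { this.quantity = quantity; }

    // Not idempotent: every call changes the state again.
    void deduct(int amount) { quantity -= amount; }
}

class IdempotencyContrast {
    public static void main(String[] args) {
        UserAccount user = new UserAccount();
        user.activate();
        user.activate();                   // a duplicate delivery is harmless here
        System.out.println(user.status);   // ACTIVE

        StockItem item = new StockItem(10);
        item.deduct(1);
        item.deduct(1);                    // a duplicate delivery removes a second unit
        System.out.println(item.quantity); // 8, although only one unit was sold
    }
}
```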
Real Systems Have Side Effects
Now let's talk about flows that hurt.
Let's consider a flow:
- payment processing,
- inventory deduction,
- shipment creation,
- sending emails,
- calling external APIs.
Suddenly duplicate execution becomes dangerous.
For example:
PaymentCompleted Event -> Inventory Reduced -> Email Sent
If the event processes twice:
- inventory may be reduced twice,
- duplicate emails may be sent,
- downstream workflows may trigger repeatedly.
Now business correctness becomes difficult.
This is the problem the Inbox Pattern solves.
3. What the Inbox Pattern Actually Does
The Inbox Pattern is surprisingly simple. The basic idea is:
Before processing an event, record that you've seen it.
It is a small step, but it changes reliability significantly.
Core Flow
The flow usually looks like this:
- Receive event
- Check inbox table
- Already processed? Ignore it
- Not processed?
  - process event
  - store event ID in inbox table
- Commit transaction
It creates de-duplication at the consumer side. Now retries become much more manageable.
Typical Inbox Flow
The detail to note here is that the business update and inbox record usually commit in the same database transaction.
Without that consistency boundary, things get weird again.
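A minimal in-memory sketch of that consistency boundary follows. It assumes a single store where the business state (a stock level) and the inbox record become visible together or not at all. `InboxStore`, `reserveOnce`, and the `failMidway` flag are hypothetical names, and the `synchronized` method only stands in for a real database transaction.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory store: stock level and inbox IDs commit together or not at all.
class InboxStore {
    private int stock;
    private final Set<String> processedIds = new HashSet<>();

    InboxStore(int stock) { this.stock = stock; }

    // One "transaction": compute the new state, publish it only if nothing failed.
    // Returns true if the event was processed, false if it was a duplicate.
    synchronized boolean reserveOnce(String eventId, int amount, boolean failMidway) {
        if (processedIds.contains(eventId)) {
            return false;                      // duplicate: already processed
        }
        int newStock = stock - amount;         // business update, not yet committed
        if (failMidway) {
            // Simulated crash before commit: neither change becomes visible.
            throw new IllegalStateException("crash before commit");
        }
        stock = newStock;                      // commit both changes together
        processedIds.add(eventId);
        return true;
    }

    synchronized int stock() { return stock; }
}
```

If the first attempt fails before commit, the stock is untouched and the event ID is not recorded, so the redelivered event is processed normally. Any delivery after a successful commit is ignored.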
4. Why the Inbox Pattern Works
It works because it shifts duplicate handling into transactional state. Instead of relying on broker guarantees, perfect retries, or exactly-once infrastructure semantics, the application explicitly tracks processed events.
It makes processing behavior deterministic.
Example Consumer Flow
A simplified example:
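@Transactional
public void process(OrderCreatedEvent event) {
    // Duplicate delivery: this event ID is already in the inbox, so skip it.
    if (inboxRepository.exists(event.getEventId())) {
        return;
    }
    // The business update and the inbox record commit in the same transaction.
    inventoryService.reserve(event);
    inboxRepository.save(
        new InboxRecord(event.getEventId())
    );
}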
Now even if Kafka re-delivers, retries happen, or consumers restart, the duplicate event gets ignored safely.
This pattern becomes extremely useful in financial systems, inventory systems, Saga (choreography) workflows, CQRS projections, and external integrations.
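One caveat with the check-then-save shape of the consumer above: two consumers handling the same event at the same moment can both pass the `exists` check before either one saves. A common remedy, assuming the inbox table has a unique constraint on the event ID, is to let the insert itself act as the check. The sketch below mimics that idea in memory, with an atomic `Set.add` playing the role of the unique constraint. `AtomicInbox` and `deliverConcurrently` are hypothetical names, and unlike a real transaction this sketch does not roll back the claimed ID if the business step fails.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory analogue of "insert first and rely on a unique key".
class AtomicInbox {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();
    final AtomicInteger reservations = new AtomicInteger();

    // Set.add is atomic here: exactly one caller per event ID gets true.
    void process(String eventId) {
        if (!seen.add(eventId)) {
            return;                       // another delivery already claimed this ID
        }
        reservations.incrementAndGet();   // stands in for inventoryService.reserve(event)
    }

    // Delivers the same event from several threads at once and
    // returns how many reservations were made in total.
    static int deliverConcurrently(AtomicInbox inbox, String eventId, int threads) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();        // line all threads up before releasing them
                    inbox.process(eventId);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return inbox.reservations.get();
    }
}
```

However many threads race on the same event ID, only one of them performs the reservation.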
5. Inbox Pattern and Exactly-Once Myths
One misunderstood phrase in event-driven systems is "exactly-once".
You might even have come across the claim:
“Kafka provides exactly-once processing.”
It is tempting to assume that duplicates are then gone forever. Not really. Kafka can help reduce duplicate delivery scenarios.
But once business workflows involve:
- databases,
- external APIs,
- side effects, or
- distributed services,
the problem becomes much larger.
Exactly-once delivery does not automatically become exactly-once business execution.
The Inbox Pattern acknowledges this reality.
Instead of trying to eliminate duplicates globally, it focuses on:
making duplicates harmless locally.
That's usually a much more practical engineering approach.
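For side effects that live outside the consumer's database, such as the external APIs listed above, a local transaction cannot undo a call that has already gone out. One common way to keep such a call harmless under redelivery is to send a deterministic idempotency key derived from the event ID, assuming the external service de-duplicates on that key. In the sketch below, `EmailGateway` is a hypothetical stand-in for such a service and `ConfirmationSender` is a hypothetical consumer-side helper.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical external service that de-duplicates on an idempotency key.
class EmailGateway {
    private final Map<String, String> deliveredByKey = new HashMap<>();

    // Sends the email unless the same key was already used; returns true if sent.
    boolean send(String idempotencyKey, String body) {
        return deliveredByKey.putIfAbsent(idempotencyKey, body) == null;
    }

    int deliveredCount() { return deliveredByKey.size(); }
}

class ConfirmationSender {
    private final EmailGateway gateway;

    ConfirmationSender(EmailGateway gateway) { this.gateway = gateway; }

    // The key depends only on the event ID, so a redelivery produces the same key.
    boolean onPaymentCompleted(String eventId) {
        String key = "payment-confirmation:" + eventId;
        return gateway.send(key, "Your payment was received.");
    }
}
```

The consumer may still call the gateway twice for a redelivered event, but the second call carries the same key and results in no second email.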
6. Inbox + Outbox Together
Outbox and Inbox are really two halves of the same reliability story.
Outbox Solves Producer Reliability
The Outbox Pattern answers:
Did we publish the event?
If the business transaction commits, the event eventually gets published. Producer-side consistency solved.
Inbox Solves Consumer Reliability
The Inbox Pattern answers:
Did we already process this event?
If yes, ignore it. Consumer-side consistency solved.
Together They Create End-to-End Reliability
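A typical flow looks like this: the producer commits its business change and an outbox record in one transaction, a relay publishes the outbox record to the broker, the broker delivers it at least once, and the consumer commits its business change and an inbox record in one transaction.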
This combination shows up in:
- CQRS systems,
- Saga workflows,
- payment systems,
- inventory pipelines, and
- event-driven microservices.
Because reliable publishing alone is not enough. Reliable processing matters equally.
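The two halves can be sketched together in a few lines. Everything here is in memory and hypothetical: `OutboxInboxDemo` plays producer, relay, and consumer at once, and the `copiesPerEvent` parameter models a broker that delivers each event more than once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OutboxInboxDemo {
    // Producer side: the business row and the outbox row are written together.
    final List<String> orders = new ArrayList<>();
    final List<String> outbox = new ArrayList<>();

    // Consumer side: processed event IDs and the resulting business state.
    final Set<String> inbox = new HashSet<>();
    int reservedItems = 0;

    void placeOrder(String orderId) {
        orders.add(orderId);
        outbox.add("OrderCreated:" + orderId);  // the event ID doubles as the payload here
    }

    // Relay: publishes every outbox row. Publishing several copies
    // of each event models at-least-once delivery.
    void relay(int copiesPerEvent) {
        for (String eventId : outbox) {
            for (int i = 0; i < copiesPerEvent; i++) {
                consume(eventId);
            }
        }
        outbox.clear();
    }

    // Consumer: the inbox check makes the business effect happen once per event ID.
    void consume(String eventId) {
        if (!inbox.add(eventId)) {
            return;            // duplicate delivery: ignore
        }
        reservedItems++;
    }
}
```

Placing two orders and relaying every event twice still yields two reservations, because the second copy of each event is dropped at the inbox.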
7. Inbox Pattern in Saga Workflows
The Inbox Pattern becomes important in Saga choreography systems.
In choreography-based Sagas:
- services communicate entirely through events,
- retries are common,
- duplicate delivery eventually happens.
Example:
OrderCreated -> PaymentCompleted -> InventoryReserved -> ShippingStarted
Now imagine:
- PaymentCompleted processes twice.
Without Inbox protection:
- inventory may reserve twice,
- shipping may trigger twice,
- workflows become inconsistent.
This is why Inbox patterns are extremely valuable in distributed workflows. They reduce the risk of duplicate state transitions.
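The way a duplicate travels down a choreography chain can be shown with a short sketch. `SagaStep` and `SagaDemo` are hypothetical, and for brevity the order ID is used as the de-duplication key, which assumes each step sees only one event per order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical choreography step: consumes one event type and emits the next.
class SagaStep {
    private final String emits;
    private final boolean useInbox;
    private final Set<String> inbox = new HashSet<>();
    final List<String> emitted = new ArrayList<>();

    SagaStep(String emits, boolean useInbox) {
        this.emits = emits;
        this.useInbox = useInbox;
    }

    // Handles an incoming event for one order and emits the follow-up event.
    void handle(String orderId) {
        if (useInbox && !inbox.add(orderId)) {
            return;                       // duplicate: no second state transition
        }
        emitted.add(emits + ":" + orderId);
    }
}

class SagaDemo {
    // Delivers PaymentCompleted twice and returns how many ShippingStarted events result.
    static int shippingEventsAfterDuplicate(boolean useInbox) {
        SagaStep inventory = new SagaStep("InventoryReserved", useInbox);
        SagaStep shipping = new SagaStep("ShippingStarted", useInbox);

        inventory.handle("order-1");      // PaymentCompleted, first delivery
        inventory.handle("order-1");      // PaymentCompleted, redelivered

        // Every InventoryReserved event is forwarded to the shipping step.
        for (String event : inventory.emitted) {
            shipping.handle(event.substring(event.indexOf(':') + 1));
        }
        return shipping.emitted.size();
    }
}
```

Without the inbox check, the duplicate `PaymentCompleted` produces two `ShippingStarted` events. With it, the duplicate stops at the first step and only one is emitted.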
8. CQRS Projection Safety
CQRS systems also benefit heavily from Inbox-style processing.
Projection consumers often consume domain events, update read models, and rebuild denormalized views.
Without de-duplication:
- counters may inflate,
- projections drift,
- analytics become inaccurate.
Inbox tracking helps projections remain consistent even during replays, retries, consumer restarts, and broker re-delivery scenarios.
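As a small illustration of the counter case, here is a sketch of a projection that counts orders and tolerates replays. `OrderCountProjection` is a hypothetical read model, and it assumes every event carries a stable, unique event ID.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical read-model projection that counts orders.
class OrderCountProjection {
    private final Set<String> appliedEventIds = new HashSet<>();
    private int orderCount = 0;

    // Applies an OrderCreated event at most once per event ID.
    void apply(String eventId) {
        if (!appliedEventIds.add(eventId)) {
            return;        // replayed or redelivered: the counter stays unchanged
        }
        orderCount++;
    }

    int orderCount() { return orderCount; }
}
```

Applying three events and then replaying all three leaves the count at three instead of six.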
9. Operational Complexity
Like most distributed systems patterns, the Inbox Pattern is not free.
It adds:
- inbox tables,
- de-duplication logic,
- cleanup policies,
- replay considerations, and
- operational overhead.
Large systems eventually need:
- inbox archival,
- retention strategies,
- indexing optimizations, and
- replay-safe workflows.
This taught me another important distributed systems lesson:
reliability patterns usually exchange simplicity for controlled consistency.
That trade-off is often worth it, but teams should adopt it intentionally.
10. Common Mistakes Teams Make
I've come across a few mistakes repeatedly.
- Assuming Brokers Eliminate Duplicates
Brokers don't eliminate duplicates. Retries and re-delivery still happen. Applications must still protect business correctness.
- Forgetting Side Effects
Database updates are usually easier to de-duplicate. External side effects like emails, payments, webhooks, and notifications are harder.
These require careful, replay-aware design.
- Treating Exactly-Once as a Business Guarantee
Infrastructure guarantees don't mean guaranteed business correctness, side-effect safety, or distributed consistency.
- Ignoring Inbox Cleanup
Inbox tables grow continuously. Without cleanup, indexes become slower, queries degrade, and replay becomes expensive.
Operational maintenance is crucial.
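A retention-based cleanup can be sketched as follows. `ExpiringInbox` is a hypothetical in-memory model, and the retention window is an assumed parameter that each system has to choose for itself.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical inbox that remembers when each event ID was recorded.
class ExpiringInbox {
    private final Map<String, Instant> processedAt = new HashMap<>();
    private final Duration retention;

    ExpiringInbox(Duration retention) { this.retention = retention; }

    // Returns true the first time an ID is recorded, false for duplicates.
    boolean record(String eventId, Instant now) {
        return processedAt.putIfAbsent(eventId, now) == null;
    }

    // Deletes records older than the retention window; returns how many were removed.
    int purge(Instant now) {
        Instant cutoff = now.minus(retention);
        int before = processedAt.size();
        processedAt.values().removeIf(recordedAt -> recordedAt.isBefore(cutoff));
        return before - processedAt.size();
    }
}
```

The design constraint is that a purged ID is forgotten: a duplicate arriving after the purge is treated as new. The retention window therefore needs to be longer than the longest period over which redelivery or replay can still happen.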
11. When Inbox Pattern Helps
The Inbox Pattern becomes valuable when:
- duplicate processing is dangerous,
- retries are common,
- workflows contain side effects,
- systems use at-least-once delivery, or
- distributed workflows span multiple services.
Especially in:
- payments,
- inventory systems,
- CQRS projections,
- Saga choreography, and
- event-driven microservices.
12. When It Might Be Overkill
Not every system needs Inbox tracking.
For simpler systems like:
- internal tooling,
- low-scale applications,
- naturally idempotent workflows,
- tightly coupled monoliths,
the added complexity may not be justified.
Like most architecture patterns, the goal is not maximum sophistication. The goal is controlled operational reliability.
13. Conclusion
One thing event-driven distributed systems teach is that:
Reliable event publishing is difficult.
Reliable event processing is even harder.
The Outbox Pattern solves:
“Did the event get published reliably?”
The Inbox Pattern solves:
“Did the event process safely despite retries and duplicates?”
Together, they form one of the most practical reliability foundations for event-driven systems. Not because they eliminate distributed systems complexity.
But because they acknowledge it honestly.
ChatGPT assisted in generating the diagrams.
