Duplicate processing is not a bug.
It is the default behavior of reliable distributed systems.
Every distributed system eventually faces the same uncomfortable truth:
The moment you introduce:
- Retries
- Failovers
- Message brokers
- Network partitions
- Service crashes
duplicates become inevitable.
Yet many engineers still believe that "exactly-once processing" is something a broker can magically provide.
It isn't.
In this article we'll explore:
- Why exactly-once delivery is mostly a myth
- Why at-least-once is the real-world standard
- How idempotency actually works
- How to build Inbox and Outbox patterns in Go
- How Kafka, PostgreSQL, and Redis fit together in production
๐งจ Exactly-Once Is Not a Transport Property
A message broker cannot guarantee exactly-once processing across your entire system.
Exactly-once delivery requires a Single Point of Truth (SPoT) capable of enforcing uniqueness.
Why?
Because delivery, processing, and persistence are separate failure domains.
Even if a broker behaves perfectly:
- Your service can crash after processing
- Your database can commit while ACK fails
- Your consumer can retry unknowingly
This creates unavoidable duplication scenarios.
๐ Delivery Semantics in Real Systems
| Model | Meaning | Reality |
|---|---|---|
| At-most-once | Message may be lost | Common in fire-and-forget systems |
| At-least-once | Message may be duplicated | Kafka, SQS, RabbitMQ |
| Exactly-once | Message processed once | Only possible inside bounded systems |
๐ In practice, everything is at-least-once.
๐ฅ Why Duplicates Happen
Consumer Crash After Processing
func handle(msg Message) error {
if err := process(msg); err != nil {
return err
}
return ack(msg)
}
`
Failure window:
text
Process Message โ
Commit Database โ
Crash Service โ
ACK Never Sent โ
Result:
text
Broker thinks processing failed
โ
Message redelivered
โ
Duplicate processing
Network Timeout After Success
`go
err := process(msg)
if err != nil {
return err
}
return broker.Ack(msg)
`
What happens if:
text
ACK sent
โ
Network timeout
โ
Broker never receives ACK
The broker retries.
The operation runs again.
Retry Storms
A temporary latency spike can trigger:
text
Client Retry
โ
Gateway Retry
โ
Service Retry
โ
Consumer Retry
Result:
text
1 failure
โ
50 duplicate executions
๐ง The Real Solution: Idempotency
Instead of preventing duplicates:
text
Wrong Question:
How do I prevent duplicates?
Ask:
text
Correct Question:
How do I make duplicates harmless?
That's where idempotency comes in.
An idempotent operation produces the same final state no matter how many times it executes.
๐จ Naive Payment Service
Let's start with a broken implementation.
`go
func Charge(
ctx context.Context,
req PaymentRequest,
) error {
_, err := db.Exec(ctx, `
INSERT INTO payments (
id,
amount
)
VALUES ($1, $2)
`,
req.ID,
req.Amount,
)
return err
}
`
Now imagine:
text
Request arrives
โ
Payment inserted
โ
Response lost
โ
Client retries
โ
Payment inserted again
Customer charged twice.
โ Production Solution #1: Database Unique Constraints
Create an Inbox table.
sql
CREATE TABLE idempotency_keys (
event_id TEXT PRIMARY KEY,
created_at TIMESTAMP DEFAULT now()
);
Attempt to claim the key first.
go
_, err := tx.Exec(ctx,
INSERT INTO idempotency_keys(event_id)
VALUES($1)
`, eventID)
if err != nil {
var pgErr *pgconn.PgError
if errors.As(err, &pgErr) &&
pgErr.Code == "23505" {
return nil
}
return err
}
`
The database becomes the source of truth.
โ ๏ธ Common pgx Pitfall
This subtle bug catches many Go teams.
Wrong:
go
import "github.com/jackc/pgconn"
Correct:
go
import "github.com/jackc/pgx/v5/pgconn"
Otherwise:
go
errors.As(err, &pgErr)
silently fails.
Your duplicate protection stops working.
๐งช Testing Duplicate Safety
Let's simulate 100 concurrent requests.
`go
func TestIdempotency(
t *testing.T,
) {
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
wg.Add(1)
go func() {
defer wg.Done()
processOrder(
context.Background(),
"same-event-id",
)
}()
}
wg.Wait()
}
`
Expected outcome:
text
100 requests
โ
1 insert succeeds
โ
99 rejected safely
โก Redis Optimization
Postgres guarantees correctness.
Redis improves performance.
Use Redis only as:
text
Fast Lock Layer
Not as:
text
Source of Truth
Atomic Lua Guard:
`lua
local key = KEYS[1]
if redis.call("GET", key) then
return 1
end
redis.call(
"SET",
key,
"PENDING",
"EX",
60
)
return 0
`
This protects the database from thundering herds.
๐ฆ Kafka Exactly-Once: What It Actually Means
Many engineers believe:
`text
Kafka Exactly Once
Business Logic Executes Once
`
Wrong.
Kafka guarantees:
text
Producer
โ
Kafka
โ
Consumer
Kafka does NOT guarantee:
text
Producer
โ
Kafka
โ
Consumer
โ
Postgres
The moment you touch an external database:
exactly-once disappears.
โ ๏ธ Disable Auto Commit
Never rely on Kafka auto commits.
Bad:
go
enable.auto.commit=true
Good:
go
consumer, _ := kafka.NewConsumer(
&kafka.ConfigMap{
"enable.auto.commit": false,
},
)
Process first.
Commit later.
`go
err := handleOrder(
ctx,
db,
msg,
)
if err == nil {
consumer.CommitMessage(msg)
}
`
๐ค Solving the Dual-Write Problem
This architecture is broken:
text
Insert Order
โ
Publish Event
What if Kafka is down?
text
Order exists
โ
Event lost forever
That's the Dual Write Problem.
๐ Transactional Outbox Pattern
Store event publishing intent in the same transaction.
sql
CREATE TABLE outbox (
id UUID PRIMARY KEY,
payload JSONB,
status TEXT DEFAULT 'PENDING'
);
go
_, err = tx.Exec(ctx,
INSERT INTO orders(...)
VALUES(...)
`)
_, err = tx.Exec(ctx, )
INSERT INTO outbox(...)
VALUES(...)
`
Commit both together.
๐ Outbox Worker
`go
type OutboxWorker struct {
db *pgxpool.Pool
broker Broker
}
func (w *OutboxWorker) Run(
ctx context.Context,
) {
ticker := time.NewTicker(
time.Second,
)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
w.processBatch(ctx)
}
}
}
`
โก Scaling Workers Safely
Use PostgreSQL native locking.
sql
SELECT *
FROM outbox
WHERE status = 'PENDING'
FOR UPDATE SKIP LOCKED
LIMIT 10;
Benefits:
- No deadlocks
- No duplicate workers
- Infinite horizontal scaling
๐ณ The Hardest Part: External Side Effects
Database writes are easy.
External systems are not.
Examples:
- Stripe
- Twilio
- SendGrid
- Payment Gateways
Without idempotency:
text
Duplicate Message
โ
Duplicate Payment
โ
Real Money Lost
Always verify external APIs support idempotency keys.
๐ก๏ธ Graceful Shutdown
Kubernetes gives you roughly:
text
30 seconds
before SIGKILL.
Handle shutdown properly.
`go
ctx, stop := signal.NotifyContext(
context.Background(),
syscall.SIGTERM,
)
defer stop()
`
Allow in-flight transactions to finish.
๐ Structured Observability
Track duplicates explicitly.
go
slog.Info(
"duplicate detected",
"event_id",
eventID,
)
If you can't measure duplicates:
you can't prove idempotency works.
๐๏ธ Final Production Architecture
text
Kafka
โ
โผ
Inbox Pattern
(Idempotency)
โ
โผ
PostgreSQL
โ
โผ
Outbox Pattern
โ
โผ
Kafka / RabbitMQ
This architecture embraces reality:
`text
At-Least-Once Delivery
+
Idempotency
+
Recovery
Production Reliability
`
๐ Key Takeaways
- Exactly-once delivery does not exist across distributed systems.
- At-least-once delivery is the industry standard.
- Idempotency transforms duplicates into harmless events.
- PostgreSQL UNIQUE constraints should be the source of truth.
- Redis is an optimization layer, not a consistency layer.
- Kafka auto commits can cause data loss.
- Inbox + Outbox patterns form the backbone of reliable event-driven systems.
- Reliability is not about preventing failures.
- Reliability is about remaining correct despite failures.
Top comments (0)