DEV Community

Cover image for ๐Ÿง  Exactly-Once Processing in Go: The Myth, Reality, and Production Patterns
Serif COLAKEL
Serif COLAKEL

Posted on

๐Ÿง  Exactly-Once Processing in Go: The Myth, Reality, and Production Patterns

Duplicate processing is not a bug.

It is the default behavior of reliable distributed systems.

Every distributed system eventually faces the same uncomfortable truth:

The moment you introduce:

  • Retries
  • Failovers
  • Message brokers
  • Network partitions
  • Service crashes

duplicates become inevitable.

Yet many engineers still believe that "exactly-once processing" is something a broker can magically provide.

It isn't.

In this article we'll explore:

  • Why exactly-once delivery is mostly a myth
  • Why at-least-once is the real-world standard
  • How idempotency actually works
  • How to build Inbox and Outbox patterns in Go
  • How Kafka, PostgreSQL, and Redis fit together in production

๐Ÿงจ Exactly-Once Is Not a Transport Property

A message broker cannot guarantee exactly-once processing across your entire system.

Exactly-once delivery requires a Single Point of Truth (SPoT) capable of enforcing uniqueness.

Why?

Because delivery, processing, and persistence are separate failure domains.

Even if a broker behaves perfectly:

  • Your service can crash after processing
  • Your database can commit while ACK fails
  • Your consumer can retry unknowingly

This creates unavoidable duplication scenarios.


๐Ÿ“Š Delivery Semantics in Real Systems

Model Meaning Reality
At-most-once Message may be lost Common in fire-and-forget systems
At-least-once Message may be duplicated Kafka, SQS, RabbitMQ
Exactly-once Message processed once Only possible inside bounded systems

๐Ÿ‘‰ In practice, everything is at-least-once.


๐Ÿ’ฅ Why Duplicates Happen

Consumer Crash After Processing

func handle(msg Message) error {
    if err := process(msg); err != nil {
        return err
    }

    return ack(msg)
}
Enter fullscreen mode Exit fullscreen mode


`

Failure window:

text
Process Message โœ…
Commit Database โœ…
Crash Service โŒ
ACK Never Sent โŒ

Result:

text
Broker thinks processing failed
โ†“
Message redelivered
โ†“
Duplicate processing


Network Timeout After Success

`go
err := process(msg)

if err != nil {
return err
}

return broker.Ack(msg)
`

What happens if:

text
ACK sent
โ†“
Network timeout
โ†“
Broker never receives ACK

The broker retries.

The operation runs again.


Retry Storms

A temporary latency spike can trigger:

text
Client Retry
โ†“
Gateway Retry
โ†“
Service Retry
โ†“
Consumer Retry

Result:

text
1 failure
โ†“
50 duplicate executions


๐Ÿง  The Real Solution: Idempotency

Instead of preventing duplicates:

text
Wrong Question:
How do I prevent duplicates?

Ask:

text
Correct Question:
How do I make duplicates harmless?

That's where idempotency comes in.

An idempotent operation produces the same final state no matter how many times it executes.


๐Ÿšจ Naive Payment Service

Let's start with a broken implementation.

`go
func Charge(
ctx context.Context,
req PaymentRequest,
) error {

_, err := db.Exec(ctx, `
    INSERT INTO payments (
        id,
        amount
    )
    VALUES ($1, $2)
`,
    req.ID,
    req.Amount,
)

return err
Enter fullscreen mode Exit fullscreen mode

}
`

Now imagine:

text
Request arrives
โ†“
Payment inserted
โ†“
Response lost
โ†“
Client retries
โ†“
Payment inserted again

Customer charged twice.


โœ… Production Solution #1: Database Unique Constraints

Create an Inbox table.

sql
CREATE TABLE idempotency_keys (
event_id TEXT PRIMARY KEY,
created_at TIMESTAMP DEFAULT now()
);

Attempt to claim the key first.

go
_, err := tx.Exec(ctx,

INSERT INTO idempotency_keys(event_id)
VALUES($1)
`, eventID)

if err != nil {
var pgErr *pgconn.PgError

if errors.As(err, &pgErr) &&
   pgErr.Code == "23505" {

    return nil
}

return err
Enter fullscreen mode Exit fullscreen mode

}
`

The database becomes the source of truth.


โš ๏ธ Common pgx Pitfall

This subtle bug catches many Go teams.

Wrong:

go
import "github.com/jackc/pgconn"

Correct:

go
import "github.com/jackc/pgx/v5/pgconn"

Otherwise:

go
errors.As(err, &pgErr)

silently fails.

Your duplicate protection stops working.


๐Ÿงช Testing Duplicate Safety

Let's simulate 100 concurrent requests.

`go
func TestIdempotency(
t *testing.T,
) {
var wg sync.WaitGroup

for i := 0; i < 100; i++ {
    wg.Add(1)

    go func() {
        defer wg.Done()

        processOrder(
            context.Background(),
            "same-event-id",
        )
    }()
}

wg.Wait()
Enter fullscreen mode Exit fullscreen mode

}
`

Expected outcome:

text
100 requests
โ†“
1 insert succeeds
โ†“
99 rejected safely


โšก Redis Optimization

Postgres guarantees correctness.

Redis improves performance.

Use Redis only as:

text
Fast Lock Layer

Not as:

text
Source of Truth

Atomic Lua Guard:

`lua
local key = KEYS[1]

if redis.call("GET", key) then
return 1
end

redis.call(
"SET",
key,
"PENDING",
"EX",
60
)

return 0
`

This protects the database from thundering herds.


๐Ÿ“ฆ Kafka Exactly-Once: What It Actually Means

Many engineers believe:

`text

Kafka Exactly Once

Business Logic Executes Once
`

Wrong.

Kafka guarantees:

text
Producer
โ†“
Kafka
โ†“
Consumer

Kafka does NOT guarantee:

text
Producer
โ†“
Kafka
โ†“
Consumer
โ†“
Postgres

The moment you touch an external database:

exactly-once disappears.


โš ๏ธ Disable Auto Commit

Never rely on Kafka auto commits.

Bad:

go
enable.auto.commit=true

Good:

go
consumer, _ := kafka.NewConsumer(
&kafka.ConfigMap{
"enable.auto.commit": false,
},
)

Process first.

Commit later.

`go
err := handleOrder(
ctx,
db,
msg,
)

if err == nil {
consumer.CommitMessage(msg)
}
`


๐Ÿ“ค Solving the Dual-Write Problem

This architecture is broken:

text
Insert Order
โ†“
Publish Event

What if Kafka is down?

text
Order exists
โ†“
Event lost forever

That's the Dual Write Problem.


๐Ÿš€ Transactional Outbox Pattern

Store event publishing intent in the same transaction.

sql
CREATE TABLE outbox (
id UUID PRIMARY KEY,
payload JSONB,
status TEXT DEFAULT 'PENDING'
);

go
_, err = tx.Exec(ctx,

INSERT INTO orders(...)
VALUES(...)
`)

_, err = tx.Exec(ctx,
INSERT INTO outbox(...)
VALUES(...)
)
`

Commit both together.


๐Ÿ”„ Outbox Worker

`go
type OutboxWorker struct {
db *pgxpool.Pool
broker Broker
}

func (w *OutboxWorker) Run(
ctx context.Context,
) {
ticker := time.NewTicker(
time.Second,
)

defer ticker.Stop()

for {
    select {
    case <-ctx.Done():
        return

    case <-ticker.C:
        w.processBatch(ctx)
    }
}
Enter fullscreen mode Exit fullscreen mode

}
`


โšก Scaling Workers Safely

Use PostgreSQL native locking.

sql
SELECT *
FROM outbox
WHERE status = 'PENDING'
FOR UPDATE SKIP LOCKED
LIMIT 10;

Benefits:

  • No deadlocks
  • No duplicate workers
  • Infinite horizontal scaling

๐Ÿ’ณ The Hardest Part: External Side Effects

Database writes are easy.

External systems are not.

Examples:

  • Stripe
  • Twilio
  • SendGrid
  • Payment Gateways

Without idempotency:

text
Duplicate Message
โ†“
Duplicate Payment
โ†“
Real Money Lost

Always verify external APIs support idempotency keys.


๐Ÿ›ก๏ธ Graceful Shutdown

Kubernetes gives you roughly:

text
30 seconds

before SIGKILL.

Handle shutdown properly.

`go
ctx, stop := signal.NotifyContext(
context.Background(),
syscall.SIGTERM,
)

defer stop()
`

Allow in-flight transactions to finish.


๐Ÿ“ˆ Structured Observability

Track duplicates explicitly.

go
slog.Info(
"duplicate detected",
"event_id",
eventID,
)

If you can't measure duplicates:

you can't prove idempotency works.


๐Ÿ—๏ธ Final Production Architecture

text
Kafka
โ”‚
โ–ผ
Inbox Pattern
(Idempotency)
โ”‚
โ–ผ
PostgreSQL
โ”‚
โ–ผ
Outbox Pattern
โ”‚
โ–ผ
Kafka / RabbitMQ

This architecture embraces reality:

`text
At-Least-Once Delivery
+
Idempotency
+

Recovery

Production Reliability
`


๐Ÿš€ Key Takeaways

  • Exactly-once delivery does not exist across distributed systems.
  • At-least-once delivery is the industry standard.
  • Idempotency transforms duplicates into harmless events.
  • PostgreSQL UNIQUE constraints should be the source of truth.
  • Redis is an optimization layer, not a consistency layer.
  • Kafka auto commits can cause data loss.
  • Inbox + Outbox patterns form the backbone of reliable event-driven systems.
  • Reliability is not about preventing failures.
  • Reliability is about remaining correct despite failures.

Top comments (0)