DEV Community

Serif COLAKEL

Outbox Pattern in Go Microservices

Solving the Dual Write Problem Without Losing Data

Distributed systems fail in uncomfortable ways.

Sometimes the database commit succeeds — but the Kafka publish fails.

Sometimes the event is published — but the transaction rolls back.

And sometimes everything looks successful… until downstream systems realize data is missing.

This is the dual write problem.

If your Go microservice:

  • writes to a database
  • publishes events
  • triggers async workflows
  • integrates with Kafka/RabbitMQ/NATS

then you are already dealing with it — whether you realize it or not.

This article explores how the Outbox Pattern solves this problem safely in production Go systems.


1. The Dual Write Problem

Consider a typical order flow:

HTTP Request
    ↓
Save order to DB
    ↓
Publish "OrderCreated" event

Naive implementation:

func CreateOrder(ctx context.Context, order Order) error {
    err := db.Insert(order)
    if err != nil {
        return err
    }

    err = kafka.Publish("order.created", order)
    if err != nil {
        return err
    }

    return nil
}

Looks harmless.

But what happens if:

  • DB insert succeeds
  • Kafka publish fails

Now:

  • order exists
  • no event emitted
  • downstream services never know

Your system is inconsistent.
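
The failure is easy to reproduce without any real infrastructure. The sketch below is only an illustration: `fakeDB` and `fakeBroker` are hypothetical in-memory stand-ins (not real database or Kafka clients), and `createOrderNaive` mirrors the naive flow above.

```go
package main

import (
	"errors"
	"fmt"
)

// Order is a minimal stand-in for the article's order type.
type Order struct {
	ID     string
	Amount int
}

// fakeDB is a hypothetical in-memory store used only for this demo.
type fakeDB struct{ orders map[string]Order }

func (d *fakeDB) Insert(o Order) error {
	d.orders[o.ID] = o
	return nil
}

// fakeBroker is a hypothetical broker that can be switched to "down".
type fakeBroker struct {
	down      bool
	published []string
}

func (b *fakeBroker) Publish(topic string, o Order) error {
	if b.down {
		return errors.New("broker unavailable")
	}
	b.published = append(b.published, o.ID)
	return nil
}

// createOrderNaive writes first and publishes second, like the naive version.
func createOrderNaive(db *fakeDB, broker *fakeBroker, o Order) error {
	if err := db.Insert(o); err != nil {
		return err
	}
	// The insert is already durable; a failure here cannot undo it.
	return broker.Publish("order.created", o)
}

func main() {
	db := &fakeDB{orders: map[string]Order{}}
	broker := &fakeBroker{down: true}

	err := createOrderNaive(db, broker, Order{ID: "o-1", Amount: 100})
	fmt.Println("error:", err)
	fmt.Println("orders stored:", len(db.orders))
	fmt.Println("events published:", len(broker.published))
}
```

The caller sees an error, yet the order is stored and no event exists. Returning the error does not help, because nothing rolls the insert back.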


2. Why Distributed Transactions Are Rarely the Answer

Some engineers try:

  • two-phase commit
  • distributed transactions
  • XA protocols

In practice:

  • operationally complex
  • poor performance
  • difficult to scale
  • unsupported by many systems

Modern systems usually prefer:

  • eventual consistency
  • reliable event delivery

This is where the Outbox Pattern shines.


3. Core Idea of the Outbox Pattern

Instead of:

DB write
+
Kafka publish

Do:

DB write
+
Insert event into outbox table

inside the SAME transaction.

Then:

  • background worker publishes events later

Now:

  • either both persist
  • or neither persists

Atomicity restored.


4. Outbox Table Design

Typical schema:

CREATE TABLE outbox_events (
    id UUID PRIMARY KEY,
    event_type TEXT NOT NULL,
    payload JSONB NOT NULL,
    created_at TIMESTAMP NOT NULL,
    processed_at TIMESTAMP,
    retries INT DEFAULT 0
);

Key fields:

  • payload
  • processed status
  • retry count
  • timestamps

This table becomes a durable event queue.
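
On the Go side, a row of this table could be modeled roughly as follows. This is a sketch under assumptions: `OutboxEvent`, `NewOutboxEvent`, and `newUUIDv4` are illustrative names that do not appear in the article, and the ID is generated with the standard library only so the example has no dependencies.

```go
package main

import (
	"crypto/rand"
	"encoding/json"
	"fmt"
	"time"
)

// OutboxEvent mirrors one row of the outbox_events table.
type OutboxEvent struct {
	ID          string          // UUID primary key
	EventType   string          // e.g. "order.created"
	Payload     json.RawMessage // stored in the JSONB column
	CreatedAt   time.Time
	ProcessedAt *time.Time // nil until the publisher marks the row
	Retries     int
}

// newUUIDv4 builds a random (version 4) UUID with the standard library.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// NewOutboxEvent is a hypothetical constructor: it serializes v as the
// payload and fills in the ID and creation time.
func NewOutboxEvent(eventType string, v any) (OutboxEvent, error) {
	payload, err := json.Marshal(v)
	if err != nil {
		return OutboxEvent{}, err
	}
	id, err := newUUIDv4()
	if err != nil {
		return OutboxEvent{}, err
	}
	return OutboxEvent{
		ID:        id,
		EventType: eventType,
		Payload:   payload,
		CreatedAt: time.Now().UTC(),
	}, nil
}

// Pending reports whether the event still has to be published.
func (e OutboxEvent) Pending() bool { return e.ProcessedAt == nil }

func main() {
	ev, err := NewOutboxEvent("order.created", map[string]int{"amount": 100})
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.ID, ev.EventType, string(ev.Payload), ev.Pending())
}
```

Using a pointer for `ProcessedAt` keeps "not yet processed" distinct from any real timestamp, which matches the nullable column.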


5. Writing to the Outbox (Go Example)

Inside transaction:

func CreateOrder(ctx context.Context, db *sql.DB, order Order) error {
    tx, err := db.BeginTx(ctx, nil)
    if err != nil {
        return err
    }

    defer tx.Rollback()

    _, err = tx.ExecContext(ctx,
        `INSERT INTO orders(id, amount) VALUES($1, $2)`,
        order.ID,
        order.Amount,
    )
    if err != nil {
        return err
    }

    // Abort (and roll back) if the event payload cannot be serialized.
    payload, err := json.Marshal(order)
    if err != nil {
        return err
    }

    _, err = tx.ExecContext(ctx,
        `INSERT INTO outbox_events(id, event_type, payload, created_at)
         VALUES($1, $2, $3, NOW())`,
        uuid.New(),
        "order.created",
        payload,
    )
    if err != nil {
        return err
    }

    return tx.Commit()
}

Now:

  • order + event persist atomically

No dual write inconsistency.


6. Background Publisher Worker

Separate worker:

func StartOutboxPublisher(ctx context.Context, db *sql.DB) {
    ticker := time.NewTicker(2 * time.Second)
    defer ticker.Stop() // release the ticker when the worker exits

    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            publishPendingEvents(ctx, db)
        }
    }
}
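
The worker loop above stops when its context is cancelled, so shutdown comes down to cancelling that context. Below is a minimal, self-contained sketch of the same loop shape. `runEvery` and `countTicks` are hypothetical helpers, and the poll function only counts calls instead of touching a database.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runEvery calls fn on every tick until ctx is cancelled.
func runEvery(ctx context.Context, interval time.Duration, fn func(context.Context)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// A tick and a cancellation can be ready at the same time;
			// re-check so no poll starts after shutdown was requested.
			if ctx.Err() != nil {
				return
			}
			fn(ctx)
		}
	}
}

// countTicks runs the loop, cancels it from inside the n-th poll, and
// returns how many polls actually ran.
func countTicks(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	calls := 0
	runEvery(ctx, time.Millisecond, func(context.Context) {
		calls++
		if calls == n {
			cancel() // simulate a shutdown request
		}
	})
	return calls
}

func main() {
	fmt.Println("polls before shutdown:", countTicks(3)) // prints 3
}
```

In a real service the context would more likely come from `signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)`, so that a pod termination lets the current poll finish and then ends the loop.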

7. Publishing Pending Events

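func publishPendingEvents(ctx context.Context, db *sql.DB) {
    // Fetch the oldest unprocessed events first so the backlog drains in order.
    rows, err := db.QueryContext(ctx,
        `SELECT id, event_type, payload
         FROM outbox_events
         WHERE processed_at IS NULL
         ORDER BY created_at
         LIMIT 100`)
    if err != nil {
        log.Printf("outbox: query failed: %v", err)
        return
    }

    defer rows.Close()

    for rows.Next() {
        var (
            id        string
            eventType string
            payload   []byte
        )

        if err := rows.Scan(&id, &eventType, &payload); err != nil {
            log.Printf("outbox: scan failed: %v", err)
            continue
        }

        if err := kafka.Publish(eventType, payload); err != nil {
            // processed_at stays NULL, so the next tick retries this event.
            log.Printf("outbox: publish %s failed: %v", id, err)
            continue
        }

        // If this update fails, the event is published again on the next
        // tick, which is why consumers must tolerate duplicates.
        if _, err := db.ExecContext(ctx,
            `UPDATE outbox_events
             SET processed_at = NOW()
             WHERE id = $1`,
            id,
        ); err != nil {
            log.Printf("outbox: marking %s processed failed: %v", id, err)
        }
    }

    if err := rows.Err(); err != nil {
        log.Printf("outbox: iteration failed: %v", err)
    }
}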

Now failures become recoverable:

  • if Kafka fails → retry later
  • event never lost

8. The Hidden Problem: Duplicate Delivery

Outbox guarantees:

  • at-least-once delivery

Not:

  • exactly once

This means:

  • consumer may receive duplicates

Consumers MUST be idempotent.

This connects directly to:

  • retries
  • idempotency keys
  • distributed consistency
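
One common way to make a consumer idempotent is to remember which event IDs it has already handled. The sketch below assumes the outbox row's `id` travels with the message. `Event` and `Deduper` are hypothetical names, and the seen-set lives in memory purely for illustration; a real consumer would persist it, for example in a table with a unique key on the event ID.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical consumed message; ID is the outbox row's id.
type Event struct {
	ID      string
	Type    string
	Payload []byte
}

// Deduper remembers the IDs of events that were handled successfully.
type Deduper struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]struct{})}
}

// Handle runs fn unless an earlier call with the same event ID already
// succeeded. It reports whether fn ran successfully this time.
// The mutex is held while fn runs, which serializes handlers; that keeps
// the sketch simple but would limit throughput in a real consumer.
func (d *Deduper) Handle(ev Event, fn func(Event) error) (bool, error) {
	d.mu.Lock()
	defer d.mu.Unlock()

	if _, dup := d.seen[ev.ID]; dup {
		return false, nil // duplicate delivery: skip silently
	}
	if err := fn(ev); err != nil {
		// Not recorded as seen, so a redelivery retries the handler.
		return false, err
	}
	d.seen[ev.ID] = struct{}{}
	return true, nil
}

func main() {
	d := NewDeduper()
	applied := 0
	handler := func(Event) error { applied++; return nil }

	ev := Event{ID: "e-1", Type: "order.created"}
	d.Handle(ev, handler)
	d.Handle(ev, handler) // the same event delivered a second time

	fmt.Println("handler ran", applied, "time(s)")
}
```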

9. Handling Retries Properly

Never retry infinitely without control.

Track retries:

retries INT DEFAULT 0

Update:

UPDATE outbox_events
SET retries = retries + 1
WHERE id = $1

Eventually:

  • dead-letter queue
  • manual inspection
  • alerting
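
A retry policy can be as small as two functions: one that decides how long to wait before the next attempt and one that decides when to give up. The sketch below uses capped exponential backoff. The constants are illustrative assumptions, not values from the article, and acting on the delay would need something the schema above does not have, such as a hypothetical `next_attempt_at` column.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative policy values; tune them for the actual system.
const (
	maxRetries = 5
	baseDelay  = 2 * time.Second
	maxDelay   = time.Minute
)

// nextDelay returns baseDelay * 2^retries, capped at maxDelay.
func nextDelay(retries int) time.Duration {
	d := baseDelay
	for i := 0; i < retries; i++ {
		d *= 2
		if d >= maxDelay {
			return maxDelay
		}
	}
	return d
}

// shouldDeadLetter reports whether an event has used up its retries and
// should be moved aside for manual inspection and alerting.
func shouldDeadLetter(retries int) bool {
	return retries >= maxRetries
}

func main() {
	for retries := 0; retries <= 6; retries++ {
		fmt.Printf("retries=%d delay=%v dead-letter=%v\n",
			retries, nextDelay(retries), shouldDeadLetter(retries))
	}
}
```

Capping the delay keeps a long outage from pushing retries hours into the future, while the retry limit ensures a poison event eventually leaves the hot path.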

10. Polling vs CDC (Change Data Capture)

Simple approach:

  • polling outbox table

Advanced approach:

  • Debezium / WAL streaming
  • CDC-based event publishing

Tradeoff:

Polling          CDC
---------------  -----------------
Simple           Complex
Easier ops       Higher throughput
Slight latency   Near real-time

Most systems should start with polling.


11. Concurrency Pitfall: Multiple Workers

If multiple publisher instances run, two workers may select and publish the same event.

Solution:

  • row locking

Example:

SELECT id, event_type, payload
FROM outbox_events
WHERE processed_at IS NULL
ORDER BY created_at
LIMIT 100
FOR UPDATE SKIP LOCKED

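This is critical in Kubernetes deployments, where several replicas of the publisher usually run at once. The row locks last only until the enclosing transaction ends, so the worker has to select, publish, and mark events as processed inside a single transaction.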
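
A worker built around row locking might look like the sketch below. It assumes PostgreSQL (for `FOR UPDATE SKIP LOCKED` and `$1` placeholders) and a registered `database/sql` driver. `publishBatch`, `PublishFunc`, and `pendingEvent` are hypothetical names introduced for illustration, and `main` only prints the query because running the function requires a live database.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// claimQuery locks up to 100 pending rows and skips rows that another
// worker's transaction has already locked.
const claimQuery = `
SELECT id, event_type, payload
FROM outbox_events
WHERE processed_at IS NULL
ORDER BY created_at
LIMIT 100
FOR UPDATE SKIP LOCKED`

// PublishFunc is a hypothetical stand-in for the broker client.
type PublishFunc func(eventType string, payload []byte) error

type pendingEvent struct {
	id        string
	eventType string
	payload   []byte
}

// publishBatch claims a batch, publishes it, and marks the published rows,
// all inside one transaction so the row locks are held throughout.
// It returns how many events were published and marked.
func publishBatch(ctx context.Context, db *sql.DB, publish PublishFunc) (int, error) {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return 0, err
	}
	defer tx.Rollback() // no effect once Commit has succeeded

	rows, err := tx.QueryContext(ctx, claimQuery)
	if err != nil {
		return 0, err
	}
	// Read the whole batch first: a transaction owns a single connection,
	// so updates should not run while the result set is still open.
	var batch []pendingEvent
	for rows.Next() {
		var ev pendingEvent
		if err := rows.Scan(&ev.id, &ev.eventType, &ev.payload); err != nil {
			rows.Close()
			return 0, err
		}
		batch = append(batch, ev)
	}
	if err := rows.Err(); err != nil {
		rows.Close()
		return 0, err
	}
	rows.Close()

	sent := 0
	for _, ev := range batch {
		if err := publish(ev.eventType, ev.payload); err != nil {
			continue // row stays unprocessed; a later run retries it
		}
		if _, err := tx.ExecContext(ctx,
			`UPDATE outbox_events SET processed_at = NOW() WHERE id = $1`,
			ev.id,
		); err != nil {
			// Rollback leaves already-published rows unmarked, so they
			// will be delivered again: at-least-once, not exactly-once.
			return sent, err
		}
		sent++
	}
	return sent, tx.Commit()
}

func main() {
	fmt.Println(claimQuery)
}
```

Holding a transaction open while talking to the broker is a tradeoff: it keeps workers from colliding, but a slow broker also means long-lived locks, so the batch size should stay modest.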


12. Observability Matters

Track:

  • pending outbox size
  • retry count
  • oldest unprocessed event
  • publish latency
  • dead-letter count

Danger signal:

growing outbox table

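This means events are being written faster than they are published: the publisher is stalled or the systems downstream of it, starting with the broker, are unhealthy.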
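
Two of these signals, pending size and the age of the oldest unprocessed event, can be read with a single query. The following is a rough sketch: `BacklogStats`, `loadBacklogStats`, and the thresholds are assumptions made for illustration, and the query needs a real database, so `main` only exercises the threshold check.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"time"
)

// Alert thresholds; the values are illustrative assumptions.
const (
	maxPending   = 1000
	maxOldestAge = 5 * time.Minute
)

// BacklogStats summarizes the unprocessed part of the outbox.
type BacklogStats struct {
	Pending   int
	OldestAge time.Duration // zero when nothing is pending
}

// Unhealthy reports whether the backlog is too large or too old.
func (s BacklogStats) Unhealthy() bool {
	return s.Pending > maxPending || s.OldestAge > maxOldestAge
}

// loadBacklogStats reads the pending count and the oldest pending
// timestamp from the outbox table.
func loadBacklogStats(ctx context.Context, db *sql.DB, now time.Time) (BacklogStats, error) {
	var (
		pending int
		oldest  sql.NullTime // NULL when no rows are pending
	)
	err := db.QueryRowContext(ctx,
		`SELECT COUNT(*), MIN(created_at)
		 FROM outbox_events
		 WHERE processed_at IS NULL`,
	).Scan(&pending, &oldest)
	if err != nil {
		return BacklogStats{}, err
	}

	stats := BacklogStats{Pending: pending}
	if oldest.Valid {
		stats.OldestAge = now.Sub(oldest.Time)
	}
	return stats, nil
}

func main() {
	s := BacklogStats{Pending: 12, OldestAge: 10 * time.Minute}
	fmt.Println("unhealthy:", s.Unhealthy()) // prints "unhealthy: true"
}
```

The age of the oldest event is often the more useful alert: a small backlog that never drains is still a sign that publishing has stopped.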


13. Real Production Failure Story

Classic outage pattern:

  • Kafka degraded
  • API kept accepting writes
  • events silently failed
  • downstream inventory never updated

Without outbox:

  • permanent inconsistency

After outbox:

  • events queued safely
  • Kafka recovered later
  • system healed automatically

This is resilience.


14. Production Lessons

The outbox pattern teaches an important engineering truth:

Reliability is not preventing failure.

It’s surviving failure without losing correctness.

Distributed systems WILL:

  • retry
  • duplicate
  • reorder
  • partially fail

Your architecture must expect this.


Final Thoughts

The Outbox Pattern is one of the most important patterns in modern backend engineering.

It solves:

  • dual write inconsistency
  • event loss
  • partial failures

But it also forces you to think carefully about:

  • idempotency
  • retries
  • observability
  • operational recovery

Reliable distributed systems are not built by hoping failures won’t happen.

They are built by assuming they absolutely will.
