Coordinating Distributed Transactions Without Distributed Transactions
Modern distributed systems are built from independently deployable services.
That flexibility comes with a cost.
The moment a business operation spans multiple services, a simple database transaction is no longer enough.
Imagine an e-commerce checkout flow:
Order Service
↓
Inventory Service
↓
Payment Service
↓
Shipping Service
What happens if:
- the order is created successfully
- inventory is reserved
- payment fails
You now have partially completed work spread across multiple services.
In a monolith, you would simply roll back the transaction.
In a distributed system, there is no single transaction to roll back.
This is where the Saga Pattern comes in.
Instead of relying on distributed transactions, a Saga coordinates a series of local transactions and compensating actions to maintain business consistency.
In this article, we'll explore how production Go systems implement Saga workflows, the trade-offs involved, and practical patterns you can use in real microservices.
The Problem With Distributed Transactions
In a monolithic application, business operations are often protected by a single database transaction.
BEGIN;
INSERT INTO orders (...);
UPDATE inventory
SET quantity = quantity - 1
WHERE product_id = ...;
INSERT INTO payments (...);
COMMIT;
Either everything succeeds or everything rolls back.
Life is good.
Microservices change the rules.
Each service owns its own database:
Order Service -> orders_db
Payment Service -> payments_db
Inventory Service -> inventory_db
Shipping Service -> shipping_db
No single transaction spans all of them.
Some teams attempt:
- Two-Phase Commit (2PC)
- XA Transactions
- Distributed Locks
In theory they provide consistency.
In practice they introduce:
- operational complexity
- reduced availability
- tight coupling
- performance bottlenecks
Most modern systems choose a different path:
Accept eventual consistency and design for recovery.
What Is a Saga?
A Saga is a sequence of local transactions.
Each step:
- Performs some business action
- Commits locally
- Triggers the next step
If a later step fails:
- previously completed steps execute compensating actions
Think of it as the distributed counterpart of a rollback, implemented in application code.
Traditional Transaction
BEGIN
Step A
Step B
Step C
COMMIT
Failure:
ROLLBACK
Saga Transaction
Step A ✓
Step B ✓
Step C ✗
Compensate B
Compensate A
Instead of undoing database state through a transaction log, we undo business actions through explicit compensation.
A Real Production Example
Consider an online marketplace.
Checkout workflow:
Create Order
Reserve Inventory
Charge Payment
Create Shipment
Everything looks simple until a dependency fails.
Scenario:
Order Created ✓
Inventory Reserved ✓
Payment Failed ✗
Inventory is now locked.
Customers cannot buy those products.
Warehouse reports incorrect stock.
This is a real production issue many teams encounter.
The solution is compensation.
Defining Saga Steps in Go
Let's start with a generic Saga implementation.
type Step struct {
	Name       string
	Execute    func(context.Context) error
	Compensate func(context.Context) error
}
Each step knows:
- how to execute
- how to undo itself
Now define the Saga.
type Saga struct {
	steps []Step
}
Executing a Saga
func (s *Saga) Execute(ctx context.Context) error {
	var completed []Step
	for _, step := range s.steps {
		if err := step.Execute(ctx); err != nil {
			s.rollback(ctx, completed)
			return fmt.Errorf(
				"saga failed at step %s: %w",
				step.Name,
				err,
			)
		}
		completed = append(completed, step)
	}
	return nil
}
If a step fails:
- rollback starts immediately
- previously completed steps are compensated
Implementing Compensation
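func (s *Saga) rollback(
	ctx context.Context,
	completed []Step,
) {
	// Detach from the caller's cancellation (Go 1.21+). If the saga failed
	// because ctx was cancelled, compensations must still get a chance to run.
	ctx = context.WithoutCancel(ctx)
	for i := len(completed) - 1; i >= 0; i-- {
		step := completed[i]
		if step.Compensate == nil {
			continue // nothing to undo for this step
		}
		if err := step.Compensate(ctx); err != nil {
			log.Printf(
				"compensation failed for %s: %v",
				step.Name,
				err,
			)
		}
	}
}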
Compensation happens in reverse order.
Just like a stack unwind.
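The reverse-order unwind can be checked with a small self-contained program. This is a minimal sketch, not the article's exact implementation: `runSaga` and `demo` are hypothetical names, and returning compensation failures through `errors.Join` (Go 1.20+) instead of only logging them is an assumption about what a caller might want.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Step mirrors the article's Step type.
type Step struct {
	Name       string
	Execute    func(context.Context) error
	Compensate func(context.Context) error
}

// runSaga is a hypothetical variant of Saga.Execute that returns
// compensation failures to the caller instead of only logging them.
func runSaga(ctx context.Context, steps []Step) error {
	for i, step := range steps {
		err := step.Execute(ctx)
		if err == nil {
			continue
		}
		// Detach from cancellation so a cancelled request context does
		// not also abort the compensations (requires Go 1.21+).
		cctx := context.WithoutCancel(ctx)
		errs := []error{fmt.Errorf("saga failed at step %s: %w", step.Name, err)}
		// Walk the completed steps in reverse order, like a stack unwind.
		for j := i - 1; j >= 0; j-- {
			if cerr := steps[j].Compensate(cctx); cerr != nil {
				errs = append(errs, fmt.Errorf("compensation failed for %s: %w", steps[j].Name, cerr))
			}
		}
		return errors.Join(errs...)
	}
	return nil
}

// demo runs three steps where the last one fails and returns the call trace.
func demo() ([]string, error) {
	var trace []string
	mk := func(name string, fail bool) Step {
		return Step{
			Name: name,
			Execute: func(context.Context) error {
				trace = append(trace, "execute "+name)
				if fail {
					return errors.New(name + " unavailable")
				}
				return nil
			},
			Compensate: func(context.Context) error {
				trace = append(trace, "compensate "+name)
				return nil
			},
		}
	}
	err := runSaga(context.Background(), []Step{mk("A", false), mk("B", false), mk("C", true)})
	return trace, err
}

func main() {
	trace, err := demo()
	for _, line := range trace {
		fmt.Println(line)
	}
	fmt.Println("error:", err)
}
```

The failed step C is never compensated, because it never committed anything. Only B and A are undone, in that order.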
Production Checkout Workflow
Let's model an order process.
Step 1: Create Order
func createOrder(
	ctx context.Context,
	orderID string,
) error {
	log.Printf("order created: %s", orderID)
	return nil
}
Compensation:
func cancelOrder(
	ctx context.Context,
	orderID string,
) error {
	log.Printf("order cancelled: %s", orderID)
	return nil
}
Step 2: Reserve Inventory
func reserveInventory(
	ctx context.Context,
	productID string,
) error {
	log.Printf(
		"inventory reserved: %s",
		productID,
	)
	return nil
}
Compensation:
func releaseInventory(
	ctx context.Context,
	productID string,
) error {
	log.Printf(
		"inventory released: %s",
		productID,
	)
	return nil
}
Step 3: Charge Payment
// chargePayment always fails here, to demonstrate compensation.
func chargePayment(
	ctx context.Context,
	orderID string,
) error {
	return errors.New(
		"payment provider unavailable",
	)
}
Compensation:
func refundPayment(
	ctx context.Context,
	orderID string,
) error {
	log.Printf(
		"payment refunded: %s",
		orderID,
	)
	return nil
}
Running the Saga
saga := Saga{
	steps: []Step{
		{
			Name: "Create Order",
			Execute: func(ctx context.Context) error {
				return createOrder(ctx, "order-123")
			},
			Compensate: func(ctx context.Context) error {
				return cancelOrder(ctx, "order-123")
			},
		},
		{
			Name: "Reserve Inventory",
			Execute: func(ctx context.Context) error {
				return reserveInventory(
					ctx,
					"product-1",
				)
			},
			Compensate: func(ctx context.Context) error {
				return releaseInventory(
					ctx,
					"product-1",
				)
			},
		},
		{
			Name: "Charge Payment",
			Execute: func(ctx context.Context) error {
				return chargePayment(
					ctx,
					"order-123",
				)
			},
			Compensate: func(ctx context.Context) error {
				return refundPayment(
					ctx,
					"order-123",
				)
			},
		},
	},
}
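if err := saga.Execute(context.Background()); err != nil {
	log.Printf("checkout failed: %v", err)
}
Output (log timestamps omitted):
order created: order-123
inventory reserved: product-1
inventory released: product-1
order cancelled: order-123
checkout failed: saga failed at step Charge Payment: payment provider unavailable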
Business consistency restored.
Compensation Is Not Rollback
This is one of the biggest misconceptions.
Many engineers assume:
Compensation == Rollback
It isn't.
Consider payment processing.
You cannot magically undo:
Bank Transfer
Credit Card Charge
Email Sent
SMS Delivered
You can only perform another business action.
Examples:
Charge Card
↓
Refund Card
Create Shipment
↓
Cancel Shipment
These are not the same thing.
Compensation is business logic.
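One way to picture the difference is an append-only ledger, where a refund is a new entry rather than the deletion of the charge. The sketch below is illustrative only: `Ledger`, `Entry`, and the amounts are hypothetical, and a real payment provider exposes its own refund API.

```go
package main

import "fmt"

// Entry is one immutable line in a hypothetical payment ledger.
type Entry struct {
	OrderID string
	Kind    string // "charge" or "refund"
	Cents   int64  // positive for charges, negative for refunds
}

// Ledger is append-only: entries are never deleted or rewritten.
type Ledger struct {
	entries []Entry
}

// Charge records a new charge for the order.
func (l *Ledger) Charge(orderID string, cents int64) {
	l.entries = append(l.entries, Entry{OrderID: orderID, Kind: "charge", Cents: cents})
}

// Balance sums every entry recorded for the order.
func (l *Ledger) Balance(orderID string) int64 {
	var total int64
	for _, e := range l.entries {
		if e.OrderID == orderID {
			total += e.Cents
		}
	}
	return total
}

// Refund compensates by appending an offsetting entry. The original
// charge stays in the history, and a repeated call is a no-op.
func (l *Ledger) Refund(orderID string) {
	net := l.Balance(orderID)
	if net <= 0 {
		return // nothing left to refund
	}
	l.entries = append(l.entries, Entry{OrderID: orderID, Kind: "refund", Cents: -net})
}

func main() {
	l := &Ledger{}
	l.Charge("order-123", 4999)
	l.Refund("order-123")
	for _, e := range l.entries {
		fmt.Printf("%s %s %d\n", e.OrderID, e.Kind, e.Cents)
	}
	fmt.Println("balance:", l.Balance("order-123"))
}
```

After the refund the balance is zero, but the history still shows that a charge happened. A database rollback would have left no trace at all.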
Choreography vs Orchestration
Two common Saga styles exist.
Choreography
Services communicate through events.
OrderCreated
↓
InventoryReserved
↓
PaymentProcessed
↓
ShipmentCreated
Each service reacts independently.
Advantages:
- loosely coupled
- scalable
- no central coordinator
Disadvantages:
- difficult debugging
- event explosion
- hidden dependencies
Large systems often struggle with visibility.
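A choreographed flow can be sketched with a synchronous in-memory bus standing in for a real message broker. Everything here is an assumption made for illustration: the `Bus` type, the `wire` function, and the failure events `PaymentFailed`, `InventoryReleased`, and `OrderCancelled`.

```go
package main

import "fmt"

// Event is a hypothetical domain event carrying a name and an order ID.
type Event struct {
	Name    string
	OrderID string
}

// Bus is a synchronous in-memory stand-in for a message broker.
type Bus struct {
	handlers map[string][]func(Event)
	Log      []string // every published event name, in order
}

func NewBus() *Bus {
	return &Bus{handlers: make(map[string][]func(Event))}
}

// Subscribe registers a handler for one event name.
func (b *Bus) Subscribe(name string, h func(Event)) {
	b.handlers[name] = append(b.handlers[name], h)
}

// Publish records the event and calls every subscribed handler.
func (b *Bus) Publish(e Event) {
	b.Log = append(b.Log, e.Name)
	for _, h := range b.handlers[e.Name] {
		h(e)
	}
}

// wire connects the services. No service calls another directly;
// each one only reacts to an event and publishes the next one.
func wire(b *Bus, paymentOK bool) {
	next := func(from, to string) {
		b.Subscribe(from, func(e Event) {
			b.Publish(Event{Name: to, OrderID: e.OrderID})
		})
	}
	next("OrderCreated", "InventoryReserved") // inventory service
	b.Subscribe("InventoryReserved", func(e Event) { // payment service
		name := "PaymentProcessed"
		if !paymentOK {
			name = "PaymentFailed"
		}
		b.Publish(Event{Name: name, OrderID: e.OrderID})
	})
	next("PaymentProcessed", "ShipmentCreated") // shipping service
	next("PaymentFailed", "InventoryReleased")  // inventory compensates
	next("InventoryReleased", "OrderCancelled") // order service compensates
}

// run publishes OrderCreated and returns the resulting event log.
func run(paymentOK bool) []string {
	b := NewBus()
	wire(b, paymentOK)
	b.Publish(Event{Name: "OrderCreated", OrderID: "order-123"})
	return b.Log
}

func main() {
	fmt.Println(run(true))
	fmt.Println(run(false))
}
```

Note that the overall workflow exists only implicitly, spread across the subscriptions in `wire`. That is the visibility problem in miniature.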
Orchestration
A dedicated coordinator controls the flow.
Saga Orchestrator
↓
Inventory
↓
Payment
↓
Shipping
Advantages:
- easier monitoring
- centralized workflow
- simpler debugging
Disadvantages:
- additional component
- orchestration logic grows over time
Many enterprise systems prefer orchestration because operational visibility matters.
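The visibility argument becomes concrete once the coordinator records what it did. The following is a minimal sketch under stated assumptions: `Orchestrator`, the `Status` values, and the in-memory maps are hypothetical, and a production orchestrator would persist this state durably.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Status is the coarse state an orchestrator might persist per saga.
type Status string

const (
	Running     Status = "RUNNING"
	Completed   Status = "COMPLETED"
	Compensated Status = "COMPENSATED"
	Stuck       Status = "NEEDS_MANUAL_RECOVERY"
)

// Step mirrors the article's Step type.
type Step struct {
	Name       string
	Execute    func(context.Context) error
	Compensate func(context.Context) error
}

// Orchestrator keeps one record per saga so operators can inspect
// progress. The maps are a stand-in for a durable store.
type Orchestrator struct {
	State   map[string]Status
	History map[string][]string
}

func NewOrchestrator() *Orchestrator {
	return &Orchestrator{State: map[string]Status{}, History: map[string][]string{}}
}

func (o *Orchestrator) note(id, msg string) {
	o.History[id] = append(o.History[id], msg)
}

// Run executes the steps in order and compensates in reverse on failure.
func (o *Orchestrator) Run(ctx context.Context, id string, steps []Step) error {
	o.State[id] = Running
	for i, s := range steps {
		err := s.Execute(ctx)
		if err == nil {
			o.note(id, "done: "+s.Name)
			continue
		}
		o.note(id, "failed: "+s.Name)
		o.State[id] = Compensated
		for j := i - 1; j >= 0; j-- {
			if cerr := steps[j].Compensate(ctx); cerr != nil {
				o.note(id, "compensation failed: "+steps[j].Name)
				o.State[id] = Stuck // a human has to look at this saga
				continue
			}
			o.note(id, "compensated: "+steps[j].Name)
		}
		return fmt.Errorf("saga %s failed at step %s: %w", id, s.Name, err)
	}
	o.State[id] = Completed
	return nil
}

// demo runs a two-step saga whose payment step fails.
func demo() (Status, []string) {
	ok := func(context.Context) error { return nil }
	fail := func(context.Context) error { return errors.New("payment declined") }
	o := NewOrchestrator()
	_ = o.Run(context.Background(), "saga-1", []Step{
		{Name: "Inventory", Execute: ok, Compensate: ok},
		{Name: "Payment", Execute: fail, Compensate: ok},
	})
	return o.State["saga-1"], o.History["saga-1"]
}

func main() {
	status, history := demo()
	fmt.Println(status)
	for _, h := range history {
		fmt.Println(h)
	}
}
```

The whole workflow and its current state live in one place, which is what makes monitoring and debugging simpler than in the choreographed style.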
Handling Retries Properly
Distributed systems fail.
Compensation can fail too.
Consider:
Payment Failed
↓
Release Inventory
↓
Inventory Service Down
Now the compensation itself has failed.
Production systems usually implement:
- retries
- dead-letter queues
- manual recovery workflows
Example:
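func retry(
	ctx context.Context,
	attempts int,
	fn func() error,
) error {
	err := errors.New("no attempts made")
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // do not sleep after the final attempt
		}
		// Linear backoff that stops early if the context is cancelled.
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(
			time.Duration(i+1) * time.Second,
		):
		}
	}
	// Keep the last error so callers can see why the retries failed.
	return fmt.Errorf(
		"retry attempts exhausted: %w",
		err,
	)
}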
Never assume compensation always succeeds.
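Retries and a dead-letter queue fit together naturally: retry each compensation a few times, then park whatever still fails for manual recovery. The sketch below assumes hypothetical `Compensation` and `DeadLetter` types and uses a slice where a real system would use a durable queue. Its `retry` takes the delay as a parameter so the demo runs quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// DeadLetter describes a compensation that needs manual recovery.
type DeadLetter struct {
	Step string
	Err  error
}

// Compensation pairs a step name with its undo action.
type Compensation struct {
	Name string
	Run  func(context.Context) error
}

// retry calls fn up to attempts times with linear backoff.
func retry(ctx context.Context, attempts int, delay time.Duration, fn func() error) error {
	err := errors.New("no attempts made")
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // do not sleep after the final attempt
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(time.Duration(i+1) * delay):
		}
	}
	return fmt.Errorf("retry attempts exhausted: %w", err)
}

// compensateAll retries every compensation and returns the ones that
// still failed, instead of silently dropping them.
func compensateAll(ctx context.Context, comps []Compensation, attempts int, delay time.Duration) []DeadLetter {
	var dead []DeadLetter
	for _, c := range comps {
		err := retry(ctx, attempts, delay, func() error { return c.Run(ctx) })
		if err != nil {
			dead = append(dead, DeadLetter{Step: c.Name, Err: err})
		}
	}
	return dead
}

// demo has one flaky compensation that recovers and one that never does.
func demo() []DeadLetter {
	calls := 0
	flaky := func(context.Context) error {
		calls++
		if calls < 3 {
			return errors.New("inventory service down")
		}
		return nil
	}
	broken := func(context.Context) error { return errors.New("order service down") }
	return compensateAll(context.Background(), []Compensation{
		{Name: "Release Inventory", Run: flaky},
		{Name: "Cancel Order", Run: broken},
	}, 3, time.Millisecond)
}

func main() {
	for _, d := range demo() {
		fmt.Printf("dead letter: %s: %v\n", d.Step, d.Err)
	}
}
```

The flaky release succeeds on its third attempt, so only the order cancellation ends up in the dead-letter list.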
Saga + Outbox Pattern
This is where things become interesting.
Many production systems combine:
Saga
+
Outbox Pattern
Why?
Because Saga introduces events:
OrderCreated
InventoryReserved
PaymentCompleted
Those events must be delivered reliably.
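The Outbox Pattern provides:
- atomic persistence: the state change and its event are written in the same local transaction
- at-least-once delivery: a committed event is retried until it is published
- safe retries, as long as consumers tolerate duplicates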
This combination is extremely common in modern microservices.
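The core idea fits in a few lines: write the business row and the event together, and let a separate relay publish pending events until it succeeds. This is an in-memory sketch only. `Store`, `OutboxEvent`, and `Relay` are hypothetical names, and the mutex merely stands in for a real database transaction, which is what actually provides atomicity.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// OutboxEvent is one row of a hypothetical outbox table.
type OutboxEvent struct {
	ID      int
	Name    string
	OrderID string
	Sent    bool
}

// Store stands in for one service's database. The mutex plays the
// role of a local transaction around both writes.
type Store struct {
	mu     sync.Mutex
	orders map[string]string
	outbox []OutboxEvent
}

func NewStore() *Store {
	return &Store{orders: make(map[string]string)}
}

// CreateOrder writes the order and its event in one critical section.
func (s *Store) CreateOrder(orderID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.orders[orderID] = "CREATED"
	s.outbox = append(s.outbox, OutboxEvent{
		ID:      len(s.outbox) + 1,
		Name:    "OrderCreated",
		OrderID: orderID,
	})
}

// Relay publishes pending events in order and returns how many were
// sent. It stops at the first failure, so that event stays pending
// and is retried on the next run (at-least-once delivery).
func (s *Store) Relay(publish func(OutboxEvent) error) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	sent := 0
	for i := range s.outbox {
		if s.outbox[i].Sent {
			continue
		}
		if err := publish(s.outbox[i]); err != nil {
			break
		}
		s.outbox[i].Sent = true
		sent++
	}
	return sent
}

// errBrokerDown simulates an unavailable message broker.
var errBrokerDown = errors.New("broker unavailable")

func main() {
	s := NewStore()
	s.CreateOrder("order-123")
	// First relay run fails, so nothing is marked as sent.
	fmt.Println("sent:", s.Relay(func(OutboxEvent) error { return errBrokerDown }))
	// Second run succeeds and delivers the pending event.
	fmt.Println("sent:", s.Relay(func(e OutboxEvent) error {
		fmt.Println("published", e.Name, e.OrderID)
		return nil
	}))
}
```

If the relay crashes after publishing but before marking the row as sent, the event is published again on the next run, which is why consumers need to be idempotent.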
Idempotency Is Mandatory
Compensation may execute twice.
Retries may happen.
Network failures may duplicate requests.
Your operations must tolerate duplication.
Bad:
inventory -= 10 // applied again on every retry
Good:
if reservationAlreadyReleased {
	return nil // duplicate request: nothing left to do
}
Idempotency is not optional.
It is foundational to Saga reliability.
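A common way to get there is to key every operation by an ID and remember which IDs have already been handled. The sketch below assumes a hypothetical `Inventory` type that tracks reservations by reservation ID; a real service would keep this state in its database.

```go
package main

import (
	"fmt"
	"sync"
)

// reservation remembers what was reserved and whether it was released.
type reservation struct {
	productID string
	qty       int
	released  bool
}

// Inventory is a hypothetical in-memory inventory service.
type Inventory struct {
	mu           sync.Mutex
	stock        map[string]int
	reservations map[string]*reservation
}

func NewInventory(stock map[string]int) *Inventory {
	return &Inventory{stock: stock, reservations: make(map[string]*reservation)}
}

// Reserve is idempotent: repeating a reservation ID changes nothing.
func (inv *Inventory) Reserve(reservationID, productID string, qty int) error {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	if _, seen := inv.reservations[reservationID]; seen {
		return nil // duplicate request
	}
	if inv.stock[productID] < qty {
		return fmt.Errorf("insufficient stock for %s", productID)
	}
	inv.stock[productID] -= qty
	inv.reservations[reservationID] = &reservation{productID: productID, qty: qty}
	return nil
}

// Release is idempotent: stock is returned at most once per reservation.
func (inv *Inventory) Release(reservationID string) {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	r, ok := inv.reservations[reservationID]
	if !ok || r.released {
		return // unknown or already released: nothing to do
	}
	inv.stock[r.productID] += r.qty
	r.released = true
}

// Stock returns the available quantity for a product.
func (inv *Inventory) Stock(productID string) int {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	return inv.stock[productID]
}

func main() {
	inv := NewInventory(map[string]int{"product-1": 10})
	_ = inv.Reserve("res-1", "product-1", 10)
	_ = inv.Reserve("res-1", "product-1", 10) // duplicate delivery
	fmt.Println("after reserve:", inv.Stock("product-1"))
	inv.Release("res-1")
	inv.Release("res-1") // compensation executed twice
	fmt.Println("after release:", inv.Stock("product-1"))
}
```

Both the duplicate reservation and the duplicate release leave the stock unchanged, so retries are safe at every step.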
Observability Matters
Track:
- saga started
- saga completed
- saga compensated
- compensation failures
- execution duration
- retry count
Useful metrics:
saga_execution_total
saga_compensation_total
saga_failure_total
saga_duration_seconds
If you cannot observe Saga behavior, you will eventually debug failures through database queries at 3 AM.
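Even without a metrics library, the counters above can be sketched with the standard library. This is an assumption-laden example: the `Metrics` type and `Observe` method are hypothetical, the mapping of fields to the metric names is illustrative, and a real service would export these through its monitoring stack. It needs Go 1.19+ for `atomic.Int64`.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// Metrics holds counters named after the metrics listed in the article.
type Metrics struct {
	Executions    atomic.Int64 // saga_execution_total
	Compensations atomic.Int64 // saga_compensation_total
	Failures      atomic.Int64 // saga_failure_total
	DurationNanos atomic.Int64 // running sum behind saga_duration_seconds
}

// Observe runs one saga and updates the counters. The callback reports
// whether compensation was triggered and the saga's final error.
func (m *Metrics) Observe(run func() (compensated bool, err error)) error {
	start := time.Now()
	compensated, err := run()
	m.DurationNanos.Add(int64(time.Since(start)))
	m.Executions.Add(1)
	if compensated {
		m.Compensations.Add(1)
	}
	if err != nil {
		m.Failures.Add(1)
	}
	return err
}

// errPayment simulates the failure used throughout the article.
var errPayment = errors.New("payment provider unavailable")

func main() {
	m := &Metrics{}
	// One successful saga and one that failed and was compensated.
	_ = m.Observe(func() (bool, error) { return false, nil })
	_ = m.Observe(func() (bool, error) { return true, errPayment })
	fmt.Println("executions:", m.Executions.Load())
	fmt.Println("compensations:", m.Compensations.Load())
	fmt.Println("failures:", m.Failures.Load())
}
```

Wrapping every saga run in one function like `Observe` keeps the instrumentation in a single place instead of scattering it across steps.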
A Production Incident
A payment provider began timing out during a Black Friday campaign.
Order creation succeeded.
Inventory reservations succeeded.
Payment confirmations never arrived.
Without compensation:
50,000 products locked
Customers could not purchase inventory that physically existed.
The warehouse team believed stock was depleted.
After implementing Saga compensation:
Payment Timeout
↓
Inventory Released
↓
Order Cancelled
The system recovered automatically.
No manual intervention required.
This is exactly the type of failure Saga patterns are designed to handle.
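For a timeout to trigger compensation, the payment step has to give up waiting in the first place. The sketch below shows one way to do that with `context.WithTimeout`. The helper `withTimeout`, the `checkout` function, and the simulated provider delay are all hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// withTimeout wraps a step so that it fails once d has elapsed.
func withTimeout(d time.Duration, fn func(context.Context) error) func(context.Context) error {
	return func(ctx context.Context) error {
		ctx, cancel := context.WithTimeout(ctx, d)
		defer cancel()
		return fn(ctx)
	}
}

// checkout simulates the incident: the payment provider answers after
// providerDelayMillis, and the saga waits at most timeoutMillis.
func checkout(timeoutMillis, providerDelayMillis int) []string {
	trace := []string{"order created", "inventory reserved"}
	charge := withTimeout(
		time.Duration(timeoutMillis)*time.Millisecond,
		func(ctx context.Context) error {
			select {
			case <-time.After(time.Duration(providerDelayMillis) * time.Millisecond):
				return nil // provider answered in time
			case <-ctx.Done():
				return ctx.Err() // deadline hit before the provider answered
			}
		},
	)
	if err := charge(context.Background()); err != nil {
		if errors.Is(err, context.DeadlineExceeded) {
			trace = append(trace, "payment timeout")
		} else {
			trace = append(trace, "payment failed")
		}
		// Compensate in reverse order of the completed steps.
		return append(trace, "inventory released", "order cancelled")
	}
	return append(trace, "payment charged")
}

func main() {
	fmt.Println(checkout(20, 500)) // provider too slow: compensation runs
	fmt.Println(checkout(500, 1))  // provider fast enough: checkout proceeds
}
```

A timeout only says the caller stopped waiting. The charge may still have gone through on the provider's side, so a real system would also need to reconcile or void it.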
Key Takeaways
- Distributed transactions rarely scale well in microservices.
- Saga patterns embrace eventual consistency rather than fighting it.
- Compensation is business logic, not database rollback.
- Retries and idempotency are mandatory.
- Many production systems combine Saga and Outbox patterns.
- Observability is critical for debugging distributed workflows.
Microservices make distributed failures inevitable.
Saga patterns don't eliminate those failures.
They make them survivable.
And in production systems, survivability is often more important than perfection.
Happy Coding 🚀