DEV Community

Semyon

How I handle multi-step rollbacks in Go without external infrastructure

The problem

Every non-trivial backend service has operations that span multiple steps. A classic example - placing an order:

  1. Charge the customer's card
  2. Reserve items in the warehouse
  3. Create a shipment

Simple enough. But what happens when step 3 fails? You need to release the reservation and refund the charge. In the right order. With the right data.

Most teams handle this by hand. It looks something like this:

func PlaceOrder(ctx context.Context, req *Request) error {
    chargeID, err := payments.Charge(ctx, req.CardToken, req.Amount)
    if err != nil {
        return err
    }

    reservationID, err := warehouse.Reserve(ctx, req.ItemID)
    if err != nil {
        _ = payments.Refund(ctx, chargeID)
        return err
    }

    if err := shipping.Create(ctx, reservationID); err != nil {
        _ = warehouse.Release(ctx, reservationID)
        _ = payments.Refund(ctx, chargeID)
        return err
    }

    return nil
}

This works. Until it doesn't.

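Add a fourth step and its error branch has to repeat every earlier compensation, in the right order. Forget one of those calls and you have a silent bug. The _ = assignments also throw away any error from the refund or release, so a failed rollback goes unnoticed. And the compensation logic is scattered across the function instead of being colocated with the step it compensates.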

Why not Temporal?

Temporal is a great tool - for the right problem. But it requires running a dedicated server cluster and introduces significant operational complexity for what is essentially a local coordination problem.

If you're already running Temporal, great. But if you just want clean rollback logic in a Go service without spinning up new infrastructure - it's overkill.

A cleaner approach

I wanted something that works like this:

runner := kata.New(
    kata.Step("charge-card", chargeCard).
        Compensate(refundCard).
        Retry(3, kata.Exponential(100*time.Millisecond)),
    kata.Step("reserve-stock", reserveStock).
        Compensate(releaseStock),
    kata.Step("create-shipment", createShipment),
)

if err := runner.Run(ctx, &OrderState{
    CardToken: req.CardToken,
    Amount:    req.Amount,
    ItemID:    req.ItemID,
}); err != nil {
    // refundCard and releaseStock already ran automatically
}

If create-shipment fails, the library automatically calls releaseStock then refundCard - in reverse order, with the full shared state available to each compensation.
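
To make the reverse-order behaviour concrete, here is a minimal sketch of how such a runner could work. This is not kata's actual implementation: the step type, the runAll function and the record helper are hypothetical names made up for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// step pairs a forward action with an optional compensation.
// Hypothetical type for illustration; not kata's actual definition.
type step[S any] struct {
	name       string
	run        func(context.Context, *S) error
	compensate func(context.Context, *S) error // nil when there is nothing to undo
}

// runAll executes steps in order. On the first failure it compensates the
// steps that already succeeded, last to first, and returns the step error
// joined with any compensation errors.
func runAll[S any](ctx context.Context, state *S, steps ...step[S]) error {
	for i, st := range steps {
		err := st.run(ctx, state)
		if err == nil {
			continue
		}
		errs := []error{fmt.Errorf("step %q: %w", st.name, err)}
		for j := i - 1; j >= 0; j-- { // walk back over the completed steps
			done := steps[j]
			if done.compensate == nil {
				continue
			}
			if cerr := done.compensate(ctx, state); cerr != nil {
				errs = append(errs, fmt.Errorf("compensate %q: %w", done.name, cerr))
			}
		}
		return errors.Join(errs...)
	}
	return nil
}

// orderState only records which functions ran, so the order is visible.
type orderState struct {
	calls []string
}

// record builds a step function that logs its name and returns fail.
func record(name string, fail error) func(context.Context, *orderState) error {
	return func(_ context.Context, s *orderState) error {
		s.calls = append(s.calls, name)
		return fail
	}
}

func demo() ([]string, error) {
	s := &orderState{}
	err := runAll(context.Background(), s,
		step[orderState]{name: "charge-card", run: record("charge", nil), compensate: record("refund", nil)},
		step[orderState]{name: "reserve-stock", run: record("reserve", nil), compensate: record("release", nil)},
		step[orderState]{name: "create-shipment", run: record("ship", errors.New("carrier unavailable"))},
	)
	return s.calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls) // [charge reserve ship release refund]
	fmt.Println(err)
}
```

The failed step itself is not compensated, only the ones that completed before it. A real runner would probably also want compensations to survive a cancelled request context, for example via context.WithoutCancel.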

No infrastructure. No DSL. Just Go.
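
The example above also attaches Retry(3, kata.Exponential(100*time.Millisecond)) to the charge step. How kata interprets those arguments is not spelled out here, so the following is only a sketch of one common reading: up to three attempts, with the delay doubling after each failure. The retry helper is a hypothetical name, not part of the library.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retry calls fn up to attempts times, sleeping base, 2*base, 4*base, ...
// between failed attempts. It stops early if ctx is cancelled.
// Hypothetical helper; kata's Retry/Exponential may behave differently.
func retry(ctx context.Context, attempts int, base time.Duration, fn func(context.Context) error) error {
	if attempts < 1 {
		attempts = 1 // always try at least once
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = fn(ctx); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-time.After(delay):
			delay *= 2
		case <-ctx.Done():
			return errors.Join(err, ctx.Err())
		}
	}
	return fmt.Errorf("after %d attempts: %w", attempts, err)
}

// demoRetry fails twice and succeeds on the third call.
func demoRetry() (int, error) {
	calls := 0
	err := retry(context.Background(), 3, 10*time.Millisecond, func(context.Context) error {
		calls++
		if calls < 3 {
			return errors.New("gateway timeout")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := demoRetry()
	fmt.Println(calls, err) // 3 <nil>
}
```

Retrying only makes sense for steps that are safe to repeat, so a charge like this would normally carry an idempotency key.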

Key design decisions

Shared typed state instead of a chain

Each step reads from and writes to a shared struct. This means compensations always have access to IDs and data created by earlier steps - which is exactly what you need for a real refund or release.

type OrderState struct {
    CardToken     string
    Amount        int64
    ItemID        string // set by the caller, read by reserve-stock
    ChargeID      string // filled by charge-card step
    ReservationID string // filled by reserve-stock step
}

func chargeCard(ctx context.Context, s *OrderState) error {
    id, err := payments.Charge(ctx, s.CardToken, s.Amount)
    if err != nil {
        return err
    }
    s.ChargeID = id // only record the ID of a charge that actually went through
    return nil
}

func refundCard(ctx context.Context, s *OrderState) error {
    return payments.Refund(ctx, s.ChargeID)
}

Generics for type safety

The runner is generic over your state type - no interface{}, no casting:

var orderRunner = kata.New(
    kata.Step("charge-card", chargeCard).Compensate(refundCard),
    // ...
)
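
Note that no type argument is written anywhere in that snippet, which suggests the state type is inferred from the step functions. As a rough illustration of how that kind of inference works in Go (StepDef and NewStep are hypothetical names, not kata's real types):

```go
package main

import (
	"context"
	"fmt"
)

// StepDef is a hypothetical builder, loosely modelled on the calls shown above.
type StepDef[S any] struct {
	Name string
	Run  func(context.Context, *S) error
	Undo func(context.Context, *S) error
}

// NewStep infers S from fn's signature, so callers never spell out the type.
func NewStep[S any](name string, fn func(context.Context, *S) error) *StepDef[S] {
	return &StepDef[S]{Name: name, Run: fn}
}

// Compensate only accepts a function over the same state type S.
func (d *StepDef[S]) Compensate(fn func(context.Context, *S) error) *StepDef[S] {
	d.Undo = fn
	return d
}

type OrderState struct{ ChargeID string }

func chargeCard(_ context.Context, s *OrderState) error {
	s.ChargeID = "ch_123" // placeholder value for the demo
	return nil
}

func refundCard(_ context.Context, s *OrderState) error {
	s.ChargeID = ""
	return nil
}

func demo() string {
	// st has type *StepDef[OrderState]; the compiler worked that out itself.
	st := NewStep("charge-card", chargeCard).Compensate(refundCard)
	s := &OrderState{}
	_ = st.Run(context.Background(), s)
	return s.ChargeID
}

func main() {
	fmt.Println(demo()) // ch_123
}
```

Passing a compensation written for a different state struct to Compensate would be rejected at compile time, which is the whole point of avoiding interface{}.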

Two distinct error types

Not all failures are equal. If a step fails and all compensations run successfully - that's a clean rollback. If a compensation also fails - that's a potential data inconsistency requiring manual intervention.

var stepErr *kata.StepError
var compErr *kata.CompensationError

switch {
case errors.As(err, &stepErr):
    log.Printf("step %q failed: %v", stepErr.StepName, stepErr.Cause)

case errors.As(err, &compErr):
    pagerduty.Fire(compErr)
}
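
For readers who want to see how two such error types can be told apart, here is a self-contained sketch. Only StepName and Cause appear in the snippet above; the CompensationError fields and the classify helper are assumptions made for illustration. The sketch also assumes the compensation error wraps the original step error, in which case the more severe case has to be checked first.

```go
package main

import (
	"errors"
	"fmt"
)

// StepError mirrors the two fields used in the post.
type StepError struct {
	StepName string
	Cause    error
}

func (e *StepError) Error() string { return fmt.Sprintf("step %q failed: %v", e.StepName, e.Cause) }
func (e *StepError) Unwrap() error { return e.Cause }

// CompensationError is a hypothetical shape; kata's real fields may differ.
type CompensationError struct {
	Step     *StepError // the failure that triggered the rollback
	Failures []error    // compensations that did not succeed
}

func (e *CompensationError) Error() string {
	return fmt.Sprintf("%v; %d compensation(s) also failed", e.Step, len(e.Failures))
}

// Unwrap exposes the step error, guarding against a typed nil.
func (e *CompensationError) Unwrap() error {
	if e.Step == nil {
		return nil
	}
	return e.Step
}

// classify checks the compensation case first: because it wraps a
// *StepError in this sketch, errors.As would match both types.
func classify(err error) string {
	var compErr *CompensationError
	var stepErr *StepError
	switch {
	case err == nil:
		return "ok"
	case errors.As(err, &compErr):
		return "inconsistent"
	case errors.As(err, &stepErr):
		return "rolled-back"
	default:
		return "unknown"
	}
}

func demo() (string, string) {
	stepFail := &StepError{StepName: "create-shipment", Cause: errors.New("carrier unavailable")}
	compFail := fmt.Errorf("place order: %w", &CompensationError{
		Step:     stepFail,
		Failures: []error{errors.New("refund declined")},
	})
	return classify(stepFail), classify(compFail)
}

func main() {
	clean, dirty := demo()
	fmt.Println(clean) // rolled-back
	fmt.Println(dirty) // inconsistent
}
```

If the library's two types turn out to be unrelated, the order of the cases does not matter.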

Parallel steps

Sometimes you want to run steps concurrently - like sending email, SMS, and push notifications at the same time:

kata.Parallel("notify",
    kata.Step("email", sendEmail),
    kata.Step("sms", sendSMS).Compensate(cancelSMS),
    kata.Step("push", sendPush),
)

If any step in the group fails, the others are cancelled and successful ones are compensated.
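
A rough sketch of that behaviour using only the standard library, to show the moving parts. The task type and runParallel function are hypothetical and say nothing about how kata implements Parallel internally. The demo gives the email task a compensation purely so there is something to observe.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// task is a hypothetical stand-in for a step inside a parallel group.
type task struct {
	name       string
	run        func(context.Context) error
	compensate func(context.Context) error // nil when there is nothing to undo
}

// runParallel starts every task at once. The first failure cancels the shared
// context; once all goroutines have returned, tasks that succeeded are compensated.
func runParallel(ctx context.Context, tasks ...task) error {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	errs := make([]error, len(tasks)) // one slot per goroutine, so no mutex is needed
	var wg sync.WaitGroup
	for i, t := range tasks {
		wg.Add(1)
		go func(i int, t task) {
			defer wg.Done()
			if errs[i] = t.run(ctx); errs[i] != nil {
				cancel() // tell the siblings to stop
			}
		}(i, t)
	}
	wg.Wait()

	failed := errors.Join(errs...) // nil when every task succeeded
	if failed == nil {
		return nil
	}
	undoCtx := context.WithoutCancel(ctx) // compensations must not inherit the cancellation
	for i, t := range tasks {
		if errs[i] != nil || t.compensate == nil {
			continue
		}
		if cerr := t.compensate(undoCtx); cerr != nil {
			failed = errors.Join(failed, fmt.Errorf("compensate %q: %w", t.name, cerr))
		}
	}
	return failed
}

func demo() ([]string, error) {
	var undone []string // written only after wg.Wait, so no data race
	err := runParallel(context.Background(),
		task{
			name: "email",
			run:  func(context.Context) error { return nil },
			compensate: func(context.Context) error {
				undone = append(undone, "email")
				return nil
			},
		},
		task{name: "sms", run: func(context.Context) error { return errors.New("provider down") }},
		task{name: "push", run: func(ctx context.Context) error {
			<-ctx.Done() // simulates slow work that honours cancellation
			return ctx.Err()
		}},
	)
	return undone, err
}

func main() {
	undone, err := demo()
	fmt.Println(undone) // [email]
	fmt.Println(err != nil)
}
```

Cancellation is cooperative: a task that ignores ctx keeps running until it finishes on its own, and if it finishes successfully it gets compensated like any other completed task.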

What it doesn't do

This library is intentionally scoped. It does not:

  • Persist state to a database (no crash recovery)
  • Coordinate across services over a network
  • Replace Temporal for long-running workflows

If you need those things - use Temporal. kata is for in-process coordination where you want clean rollback logic without the operational overhead.

Try it

go get github.com/kerlenton/kata

GitHub: https://github.com/kerlenton/kata

Zero dependencies, requires Go 1.22+. Still early - feedback on API ergonomics especially welcome.
