DEV Community

Cover image for Designing a payout ledger as a real-time read model (event sourcing in payments)
Payneteasy
Payneteasy

Posted on

Designing a payout ledger as a real-time read model (event sourcing in payments)

TL;DR

  • A payouts table that gets UPDATEd on every webhook is the bug factory. Replace it with an append-only event log + a read model projected from it.
  • The read model is your "current balance / current state" view — rebuildable any time, idempotent, and cheap to query.
  • Reconciliation against bank/PSP statements becomes a diff between two read models, not a Friday-night batch.
  • Costs are real (storage, projection latency, schema discipline) but smaller than the cost of every payouts bug you'd otherwise pay.

If your payouts table looks like (id, status, amount, updated_at) and your code is full of WHERE status IN ('queued', 'in_flight', 'settled', 'failed'), this post is a refactor sketch worth keeping.

The CRUD-payouts trap

Most payout systems start the same way:

CREATE TABLE payouts (
  id           uuid PRIMARY KEY,
  user_id      uuid NOT NULL,
  amount_cents bigint NOT NULL,
  currency     text NOT NULL,
  status       text NOT NULL,         -- queued / in_flight / settled / failed / reversed
  acquirer     text,
  external_id  text,
  updated_at   timestamptz NOT NULL
);
Enter fullscreen mode Exit fullscreen mode

The bugs follow within a quarter:

  • Webhooks arrive out of order — a settled overwrites a later reversed, and the user-facing balance is wrong.
  • Race between worker and webhook — both try to update status at once; last write wins, and last write is often the stale one.
  • No history — "why does this payout say failed?" has no answer beyond a Slack thread.
  • Reconciliation is a batch job — finance can't trust the table mid-day, so they wait for an overnight reconciliation that ends up patching the table from the bank statement.

Every one of these is structural. Updating a single row with the latest status throws away the truth.

Events as the source of truth

The truth is the sequence of things that happened. Model that, project everything else from it.

CREATE TABLE payout_events (
  event_id     uuid PRIMARY KEY,         -- idempotency
  payout_id    uuid NOT NULL,
  type         text NOT NULL,            -- created / submitted / settled / failed / reversed / fee_charged
  payload      jsonb NOT NULL,
  ts_event     timestamptz NOT NULL,     -- when it happened (PSP/bank time)
  ts_recv      timestamptz NOT NULL,     -- when we received it
  source       text NOT NULL             -- internal / psp_a / bank_statement / ops
);

CREATE INDEX ON payout_events (payout_id, ts_event);
Enter fullscreen mode Exit fullscreen mode

A few conscious choices:

  • event_id is the idempotency key. Same event from the same source arrives twice → second insert is a primary-key conflict → no-op. This is how out-of-order delivery becomes safe.
  • Two timestamps. ts_event is the real clock (what the PSP says happened); ts_recv is your clock (when you saw it). Projections must order by ts_event and break ties deterministically. You'll thank yourself the first time a webhook arrives 6 hours late.
  • source tagged on every row. When the PSP and the bank statement disagree, you need to know which one said what.

Examples of events you'd actually emit:

{ "type": "payout.created",   "payout_id": "p1", "payload": { "amount_cents": 12000, "currency": "EUR", "user_id": "u1" } }
{ "type": "payout.submitted", "payout_id": "p1", "payload": { "acquirer": "acq_a", "external_id": "tx_998" } }
{ "type": "payout.settled",   "payout_id": "p1", "payload": { "external_id": "tx_998", "fee_cents": 35 } }
{ "type": "payout.reversed",  "payout_id": "p1", "payload": { "reason": "wrong_account" } }
Enter fullscreen mode Exit fullscreen mode

The read model — what apps actually query

Apps don't want to fold over event logs. Project to a flat table that's fast to read and trivially rebuildable.

CREATE TABLE payout_state (
  payout_id    uuid PRIMARY KEY,
  user_id      uuid NOT NULL,
  amount_cents bigint NOT NULL,
  currency     text NOT NULL,
  status       text NOT NULL,        -- derived
  fee_cents    bigint NOT NULL DEFAULT 0,
  acquirer     text,
  external_id  text,
  last_event   uuid NOT NULL,        -- pointer back into payout_events
  last_ts      timestamptz NOT NULL
);
Enter fullscreen mode Exit fullscreen mode

The projector is a small, deterministic function:

def project(state, event):
    t = event["type"]
    if t == "payout.created":
        return {
            "payout_id": event["payout_id"],
            "user_id":   event["payload"]["user_id"],
            "amount_cents": event["payload"]["amount_cents"],
            "currency": event["payload"]["currency"],
            "status": "queued",
            "fee_cents": 0,
            "last_event": event["event_id"],
            "last_ts": event["ts_event"],
        }
    if t == "payout.submitted":
        state.update(status="in_flight",
                     acquirer=event["payload"]["acquirer"],
                     external_id=event["payload"]["external_id"])
    elif t == "payout.settled":
        state.update(status="settled",
                     fee_cents=state["fee_cents"] + event["payload"].get("fee_cents", 0))
    elif t == "payout.failed":
        state.update(status="failed")
    elif t == "payout.reversed":
        state.update(status="reversed")
    state["last_event"] = event["event_id"]
    state["last_ts"]    = event["ts_event"]
    return state
Enter fullscreen mode Exit fullscreen mode

Three properties of a good projector:

  1. Pure. No side effects beyond writing the state row. No external calls, no clocks.
  2. Idempotent. Replaying the same event must produce the same result. The event_id short-circuit lives at the caller layer.
  3. Order-tolerant. A late submitted arriving after settled shouldn't downgrade the status. Encode the rule explicitly:
RANK = {"queued": 0, "in_flight": 1, "failed": 2, "settled": 3, "reversed": 4}

def upgrade(state, candidate_status):
    if RANK[candidate_status] >= RANK[state["status"]]:
        state["status"] = candidate_status
Enter fullscreen mode Exit fullscreen mode

That's the entire "out-of-order webhooks" problem solved in five lines.

Rebuilding the read model

The thing event sourcing buys you that the CRUD table can't: you can always rebuild.

def rebuild(payout_id):
    state = {}
    for ev in events_for(payout_id, order_by="ts_event"):
        state = project(state, ev)
    upsert("payout_state", state)
Enter fullscreen mode Exit fullscreen mode

In production you run this on:

  • Schema migrations. Add a new field to payout_state? Backfill by replay, not by UPDATE ... SET.
  • Bug fixes in the projector. A wrong status mapping fixes itself on replay.
  • Reconciliation discrepancies. When a balance is off, dump the events, replay, and the bug is either in the events (missing/extra) or in the projector (logic). One of those two — never both.

Reconciliation as a diff

This is where the model earns the rest of its weight. Take the bank statement, ingest it as events (source = bank_statement), project to a separate read model, and diff:

SELECT s.payout_id, s.amount_cents AS internal, b.amount_cents AS bank
FROM   payout_state s
FULL OUTER JOIN payout_state_bank b USING (payout_id)
WHERE  s.amount_cents IS DISTINCT FROM b.amount_cents
   OR  s.status       IS DISTINCT FROM b.status;
Enter fullscreen mode Exit fullscreen mode

The output is a finite list of disputes-with-the-truth. Each row is either:

  • A missing event on one side (most common) — backfill from the other source.
  • A real money discrepancy (rare, important) — escalate.

Note the asymmetry: you don't correct the read model directly. You add the missing event, replay the projector, and the diff resolves itself. That's the discipline that keeps the books honest.

What it costs you

Honest tradeoffs:

Cost Mitigation
More storage (events live forever) Cold storage / archive tier after N months; events are tiny
Projection latency (a few ms → seconds for high volume) Synchronous projector for the common path; async for bulk events
Schema discipline (event types are forever) Treat event types like API contracts — versioned, deprecated, never silently changed
Operational learning curve One page in the runbook on "how to replay" pays for itself the first incident

These are real but not large. The alternative is debugging a payouts table at midnight every settlement window.

What about your accounting ledger?

Two ledgers, two purposes:

  • The payout read model is for operational state — what's in flight, what settled, what failed. Cached, fast to query, regenerable.
  • The accounting ledger (debits/credits, double-entry) is for financial state — what your books look like. Driven from the same events but with stricter rules and an immutable history that finance can audit.

Don't confuse the two. Both should be projections from the same event log.

Closing

Event sourcing isn't free, but for payouts specifically the alignment is unusually clean: payments are events in the real world, and webhooks are the world telling you about them. The read model is the part your apps query; the events are the truth you can always fall back to.

If you want to see how this fits the wider orchestration picture — global payouts, multi-acquirer settlement, FX, reconciliation — the global payouts solution page has the architecture diagram.

Next in this series: cross-border payment reconciliation — matching multi-currency, multi-acquirer settlement files against the read model above.


Author: payments engineer at PaynetEasy — we build payment orchestration and global payouts infrastructure → payneteasy.com

Top comments (0)