DEV Community

Cover image for Why Building a Ledger Is Harder Than It Looks
Tomas Almeida
Tomas Almeida

Posted on

Why Building a Ledger Is Harder Than It Looks

On August 1, 2012, Knight Capital pushed a faulty deployment to production. Within 45 minutes, its router fired more than 4 million unintended orders into the market, and the firm lost over $460 million (U.S. SEC, 2013). One build, one morning, one company nearly gone.

Most money systems don’t fail like that. They fail quietly: a balance drifting by a cent every few thousand operations, a webhook replay that doubles a payout, a reconciliation job that “almost matches” forever.

We’ve spent years building a double-entry ledger for a crypto payments system. What follows isn’t theory. It’s the set of constraints we ended up needing once the system started seeing real money, retries, and adversarial failure modes.


Why Floating Point Breaks Finance Systems

Floating point isn’t “slightly imprecise.” It’s structurally incompatible with exact accounting.

$1.23 USD    ->  123
1.23 USDC     ->  1230000
0.5 ETH       ->  500000000000000000
Enter fullscreen mode Exit fullscreen mode

The rule we landed on is simple: money is always an integer in the smallest unit of the asset.

Not decimals. Not floats. Not “precision-aware” types that still round under the hood.

We use numeric(38, 0) so fractional units cannot exist at rest. If a half-unit appears in a calculation, it fails before it becomes debt.

The real complexity shows up elsewhere: fees.

A 1% fee on 99,999,999 units produces a remainder that must be deterministically assigned. If you don’t decide where that remainder goes, your ledger will drift. Slowly. Invisibly. Permanently.


Enforcing Double-Entry at the Database Layer

Application-level invariants don’t survive production reality.

Backfills, scripts, incident patches, and “temporary” admin queries bypass them instantly.

So we push the invariant into Postgres.

The rule is strict:

Within a transaction, debits must equal credits per currency.

And it is enforced at commit time, not row time.

CREATE CONSTRAINT TRIGGER postings_must_balance
AFTER INSERT ON postings
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE FUNCTION enforce_balanced_postings();
Enter fullscreen mode Exit fullscreen mode

The important part is not the trigger itself. It’s the deferred validation.

Without deferral, you can’t validate a group of rows that doesn’t exist yet.

With it, Postgres becomes a transactional accounting engine.

flowchart TD
  A[Insert postings rows] --> B[Transaction open]
  B --> C[Deferred trigger at commit]
  C --> D{Debits == Credits per currency?}
  D -->|Yes| E[Commit]
  D -->|No| F[Reject transaction]
Enter fullscreen mode Exit fullscreen mode

This is one of those rare cases where “database constraint” is not defensive programming. It is the system boundary.


Idempotency Is Not a Feature, It’s the System Model

Every payment system is at-least-once.

Not “sometimes.” Always.

Stripe explicitly retries webhooks for days and may deliver duplicates by design.

That means your system is not correct if retries are handled at the edge. It must be correct under duplication everywhere.

We treat idempotency as a hard uniqueness rule:

One intent → one transaction, forever.

flowchart TD
  A[Incoming request] --> B{Idempotency key seen?}

  B -->|Yes| C[Return existing transaction]
  B -->|No| D[Insert transaction]

  D --> E{Unique constraint hit?}
  E -->|Yes| C
  E -->|No| F[Create postings]
Enter fullscreen mode Exit fullscreen mode

The real guarantee is the database index:

  • (ledger_id, idempotency_key) is the source of truth
  • cache is only a performance optimization
  • concurrency collapses naturally into a single winner

If two requests race, one wins the insert, the other resolves to it. No locks. No coordination layer. Just relational constraints doing what they were designed for.


Append-Only Is an Audit Model, Not a Storage Choice

The hardest habit to break is mutability.

CRUD thinking says: fix incorrect rows.

Ledger thinking says: never change history, only extend it.

flowchart LR
  A[Original entry] --> B[Posted]
  B --> C{Error discovered}
  C --> D[Reversal entry]
  D --> E[Corrective entry]
Enter fullscreen mode Exit fullscreen mode

This is not aesthetic purity. It’s operational safety.

If you allow updates:

  • you destroy audit trails
  • you break reproducibility
  • you make reconciliation non-deterministic

If you forbid updates:

  • every correction becomes explicit
  • every mistake is traceable
  • every balance is recomputable from first principles

We enforce this at two levels:

  • application layer rejects updates/deletes
  • database triggers enforce it as a final gate

The only mutable fields are metadata (tags, references). Money fields are immutable by design.


Exchange Rates Must Be Frozen, Not Recomputed

Any call to “current price” inside a ledger pipeline is a correctness bug.

Not a race condition. A correctness bug.

Rates are captured once:

exchange_rate   decimal(38, 18)
base_currency   text
quote_currency  text
Enter fullscreen mode Exit fullscreen mode

And stored on every affected entry.

flowchart TD
  A[Payment confirmed] --> B[Fetch FX rate]
  B --> C[Write rate into ledger entries]
  C --> D[Downstream systems use snapshot only]
Enter fullscreen mode Exit fullscreen mode

This ensures that:

  • retries don’t change accounting
  • reports are deterministic across time
  • historical balances remain reproducible

A refund six months later uses the original rate, not today’s market rate. That’s not “approximate fairness.” It’s consistency.


Reconciliation Is Where Systems Stop Agreeing With Each Other

Everything before this point ensures internal consistency.

Reconciliation is where reality enters.

And reality is messy.

External systems differ in:

  • timing
  • batching
  • rounding rules
  • partial settlements
  • eventual consistency windows

So reconciliation cannot be a simple equality check.

It becomes a matching engine with strategies.

flowchart TD
  A[External records] --> B[Reconciliation engine]

  B --> C[Exact match]
  B --> D[Windowed sum match]
  B --> E[Tolerance-based match]

  C --> F[Mark reconciled]
  D --> F
  E --> F

  B --> G[Unmatched queue for review]
Enter fullscreen mode Exit fullscreen mode

Key design choice: reconciliation writes back into the ledger.

A posting is either:

  • reconciled
  • tolerably auto-matched
  • or explicitly flagged for review

No spreadsheets. No external truth tables.

The ledger is the system of record, not the reconciliation output.


The Minimal Model Behind a Payment Gateway

At runtime, the entire system collapses into four accounts:

graph TD
  A[Crypto Received] --> B[Merchant Payable]
  A --> C[Fee Revenue]
  B --> D[Payout Disbursed]
Enter fullscreen mode Exit fullscreen mode

That simplicity is misleading.

The correctness doesn’t come from the graph.

It comes from everything around it:

  • integer-only math
  • deferred constraints
  • idempotency at the database level
  • append-only history
  • frozen exchange rates
  • reconciliation as a write-back process

Without those, the diagram is just wishful thinking.

With them, it becomes a financial system.


Closing Thought

None of this is exotic. That’s the uncomfortable part.

Each rule is simple in isolation. Every failure mode only appears under production pressure: retries, concurrency, partial failure, and time.

The systems that survive are the ones that assume:

  • the network will duplicate requests
  • the database is the only trustworthy boundary
  • time will invalidate assumptions
  • humans will eventually need to audit everything

A ledger isn’t a data model.

It’s a collection of enforced invariants that happen to store money.

If you don’t want to build all of this from scratch, there’s also SouthPay Ledger: SouthPay Ledger

Top comments (0)