The bug I kept designing around was not "wrong penalty amount."
Wrong amounts are visible. A supplier sees a credit note for EUR 8400, checks the contract, and pushes back. The operator can investigate. The trail exists.
The quieter failure is a penalty row with no mirror. One side of the system says the buyer is owed money. The other side never records the supplier-facing debit. The total looks correct in the dashboard because the credit row exists. The settlement builder can even pick it up. But the accounting story is incomplete, and it only becomes obvious when someone asks why the counterparty view does not match the buyer view.
That is the sort of error append-only systems can preserve forever.
So I made the ledger refuse single rows.
The system calculates supplier SLA penalties. A missed response target, a delivery window breach, a compounding daily penalty, a tier crossing, or a per-ticket miss becomes money owed. That money then needs to move through three phases: accrual, possible reversal, and settlement. Each phase has a tempting shortcut.
For accrual, the shortcut is one row with amount_cents and direction.
For reversal, the shortcut is updating the original row.
For settlement, the shortcut is marking the original row as settled.
All three shortcuts make the data easier to write and harder to defend.
The Shape That Forced the Design
The PRD had a hard constraint: penalty_ledger is append-only. Disputes, withdrawals, and corrections use compensating entries. Settlement membership lives outside the ledger. That sounds clean until the first application flow has to implement it.
The accrual worker receives a breach, loads the contract and clause, calls RulesEngine.calculatePenalty, and gets one answer back: no penalty, a domain error, an accrued amount, or a capped amount. The amount is not the ledger. It is only a financial fact waiting to become a record.
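The four possible answers map naturally onto a discriminated union. The sketch below is an assumption about shape only: the type name, case names, and fields are hypothetical, since the post states only that these four outcomes exist.

```fsharp
// Hypothetical model of the four outcomes; not the real RulesEngine types.
type PenaltyOutcome =
    | NoPenalty
    | DomainError of reason: string
    | Accrued of amountCents: int64
    | Capped of amountCents: int64 * capCents: int64

// Only the two money-bearing outcomes should ever lead to ledger candidates.
let amountToRecord (outcome: PenaltyOutcome) : int64 option =
    match outcome with
    | Accrued cents -> Some cents
    | Capped (cents, _) -> Some cents
    | NoPenalty
    | DomainError _ -> None
```

Because the match is exhaustive, the compiler warns when a new outcome is added without deciding whether it produces ledger rows.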
The ledger record has more obligations:
- a credit side and mirror side
- same amount and currency
- same tenant, contract, counterparty, clause, and breach
- same accrual period
- same entry kind
- same compensation reference when reversing
If any of those differ, the pair is not a pair. It is two rows that happen to be written near each other.
I originally thought the database trigger was the center of the solution. Block update and delete. Done. That part exists:
create trigger penalty_ledger_block_update
    before update on penalty_ledger
    for each row execute function block_penalty_ledger_mutation();

create trigger penalty_ledger_block_delete
    before delete on penalty_ledger
    for each row execute function block_penalty_ledger_mutation();
But that only stops mutation after insertion. It does not stop a bad insert. A database that blocks updates can still store a malformed truth forever.
That is where the F# domain layer earns its place.
The Load-Bearing Function
The most important code in the ledger path is not the insert statement. It is LedgerPair.create.
let private validate credit mirror =
    if Money.cents credit.Amount <= 0L || Money.cents mirror.Amount <= 0L then
        Error LedgerAmountMustBePositive
    elif
        credit.Direction <> LedgerDirection.CreditOwedToUs
        || mirror.Direction <> LedgerDirection.Mirror
    then
        Error LedgerPairDirectionInvalid
    elif credit.EntryKind <> mirror.EntryKind then
        Error LedgerPairKindInvalid
    elif credit.Amount <> mirror.Amount then
        Error(LedgerPairMismatch "amount must match")
    elif
        credit.AccrualPeriodStart <> mirror.AccrualPeriodStart
        || credit.AccrualPeriodEnd <> mirror.AccrualPeriodEnd
    then
        Error(LedgerPairMismatch "period must match")
    elif not (sameContext credit mirror) then
        Error(LedgerPairMismatch "tenant contract counterparty clause breach context must match")
    elif credit.CompensatesLedgerId <> mirror.CompensatesLedgerId then
        Error(LedgerPairMismatch "compensating ledger reference must match")
    elif credit.AccrualPeriodEnd < credit.AccrualPeriodStart then
        Error PeriodInvalid
    else
        Ok()
That function is deliberately narrow. It does not know about HTTP, Hangfire, PDF rendering, Invoice Recon, or even settlement grouping. It only knows what makes two candidate rows a valid accounting pair.
The private constructor on LedgerPair matters. Callers cannot construct a pair directly. They can propose two LedgerEntryCandidate records, but only the domain module can turn them into a pair. That means the application layer does not get to "remember" to validate. It has no valid object unless validation has already happened.
This is the part I got wrong at first: I thought append-only was mainly a persistence rule. It is not. It is a construction rule. Once bad ledger rows are inserted, append-only protects the bad rows just as strongly as the good ones.
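A reduced sketch of that pattern, assuming a single-case union with a private case and a companion module. The record fields and error cases here are simplified stand-ins, not the real domain types, and only three of the checks are reproduced.

```fsharp
type LedgerDirection =
    | CreditOwedToUs
    | Mirror

// Simplified candidate: the real record carries period, context, and kind too.
type LedgerEntryCandidate =
    { Direction: LedgerDirection
      AmountCents: int64 }

type LedgerPairError =
    | AmountMustBePositive
    | DirectionInvalid
    | AmountMismatch

// The union case is private, so code outside the enclosing module or
// namespace cannot build a LedgerPair without going through create.
type LedgerPair = private LedgerPair of LedgerEntryCandidate * LedgerEntryCandidate

module LedgerPair =
    // The only way to obtain a LedgerPair: validation runs first.
    let create (credit: LedgerEntryCandidate) (mirror: LedgerEntryCandidate) =
        if credit.AmountCents <= 0L || mirror.AmountCents <= 0L then
            Error AmountMustBePositive
        elif credit.Direction <> CreditOwedToUs || mirror.Direction <> Mirror then
            Error DirectionInvalid
        elif credit.AmountCents <> mirror.AmountCents then
            Error AmountMismatch
        else
            Ok (LedgerPair (credit, mirror))

    // Read access for the persistence layer, which inserts both rows together.
    let entries (LedgerPair (credit, mirror)) = [ credit; mirror ]
```

With this shape, a repository function that accepts only LedgerPair cannot be handed a single row or an unvalidated pair by code outside the domain module.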
Why the Rules Engine Stays Pure
The penalty math also stays outside the database. RulesEngine.calculatePenalty takes a PenaltyCalculationInput and returns a result. No connection string. No repository. No clock except the explicit AsOf value passed in.
That was not aesthetic. The engine needs deterministic recompute. Given the same contract, clause, breach, prior accruals, and timestamp, it should return the same penalty. The snapshot test covers twelve cases, including flat penalties, capped penalties, monthly fee proration, tier crossing, overflow tier behavior, currency mismatch, daily compounding cap, missing units, inactive clauses, and pre-contract breaches.
The awkward case is previous accruals. Caps depend on what was already credited. Tiered penalties may need to accrue only the difference between the previous tier and the new one. Compounding daily penalties need to avoid adding the same days twice. So the rules engine receives PreviousAccruals, filters to credit-side rows for the same clause, and calculates incremental money from that prior state.
That looks like a database concern until it breaks. If the rules engine quietly queried the database itself, replay tests would become setup-heavy and timing-sensitive. By making prior state an input, the application layer owns retrieval and the domain layer owns the calculation.
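To illustrate prior state as an input, here is a minimal sketch of a capped incremental calculation. The record shape, function name, and parameters are hypothetical, and the sketch ignores reversals and tiers, which the real engine has to account for.

```fsharp
// Hypothetical projection of a prior ledger row, loaded by the application layer.
type PriorAccrual =
    { ClauseId: string
      IsCreditSide: bool
      AmountCents: int64 }

// Pure: everything the calculation needs arrives as an argument.
// Returns only the amount that can still be accrued under the cap.
let incrementalUnderCap
    (clauseId: string)
    (capCents: int64)
    (rawPenaltyCents: int64)
    (previous: PriorAccrual list)
    : int64 =
    let alreadyCredited =
        previous
        |> List.filter (fun a -> a.IsCreditSide && a.ClauseId = clauseId)
        |> List.sumBy (fun a -> a.AmountCents)
    // Headroom never goes negative, even if prior rows already exceed the cap.
    let headroom = max 0L (capCents - alreadyCredited)
    min rawPenaltyCents headroom
```

A replay test then needs only a list literal for the prior accruals; no database fixture or clock is involved.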
Reversal Without Mutation
The reversal path is where the design either holds or collapses.
When a supplier disputes a breach and wins, the system cannot update the original accrual to zero. It cannot delete the row. It cannot mark the row as false and pretend the old financial position never existed.
It writes a reversal pair.
ReversalEngine.uncompensatedCreditAccruals first looks for accrual rows that do not already have a reversal pointing at them. Then reversalCandidate copies the amount, period, tenant, contract, counterparty, clause, and breach from the original accrual, changes the entry kind to Reversal, and sets CompensatesLedgerId.
That little filter is important. Without it, the same breach could be reversed twice. Append-only would preserve both reversals, and the system would show the supplier owing less than zero. The code does not rely on a user avoiding the button. It filters uncompensated credit rows and writes every reversal inside one transaction with the status change.
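The filter and the copy step can be sketched as below. The two function names come from the paragraphs above, but the row type and both bodies are assumptions written for illustration, not the repository's code.

```fsharp
type EntryKind =
    | Accrual
    | Reversal

// Hypothetical, reduced ledger row: context and period fields are omitted.
type LedgerRow =
    { Id: int
      Kind: EntryKind
      IsCreditSide: bool
      AmountCents: int64
      CompensatesLedgerId: int option }

// Credit-side accruals that no existing reversal already points at.
let uncompensatedCreditAccruals (rows: LedgerRow list) : LedgerRow list =
    let compensatedIds =
        rows
        |> List.choose (fun r -> if r.Kind = Reversal then r.CompensatesLedgerId else None)
        |> Set.ofList
    let isOpen (r: LedgerRow) =
        r.Kind = Accrual && r.IsCreditSide && not (Set.contains r.Id compensatedIds)
    List.filter isOpen rows

// Copies the original and changes only the kind and the compensation reference.
let reversalCandidate (newId: int) (original: LedgerRow) : LedgerRow =
    { original with
        Id = newId
        Kind = Reversal
        CompensatesLedgerId = Some original.Id }
```

Running the filter again after a reversal has been appended returns an empty list for that accrual, which is what makes a second click on the button harmless.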
The design tradeoff is verbosity. A simple breach flow creates two rows on accrual. Accrual plus reversal creates four rows. A ledger explorer has to explain direction, entry kind, and compensation references. The UI has more work because the database refuses to simplify history for display.
I accept that. Accounting systems should make the audit easy and the write path strict.
Settlement Is Not a Ledger Mutation
Settlement introduced a second temptation: add settlement_id to penalty_ledger.
That would make queries simple. Find uncommitted rows where settlement_id is null. Mark them when the PDF is built. Done.
It would also puncture the ledger invariant. If building a settlement mutates a ledger row, the ledger is no longer an append-only record of penalty facts. It becomes a workflow table.
The repo uses settlement_ledger_entries instead. SettlementsRepository.listUncommittedAccruals selects credit-side accruals for the period, excludes rows that already have active settlement membership, and excludes rows with reversals. Then SettlementsRepository.insert writes the settlement row and membership rows in the same transaction.
That means settlement membership is append-like metadata around the ledger, not a change to the ledger event itself.
The cost is an extra table and a more careful query. The gain is that a settlement can be cancelled or released without rewriting what the penalty ledger said at the time of accrual.
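The eligibility rule behind that more careful query can be stated as a pure filter. This is a sketch under assumptions: the record and parameter names are hypothetical, and the real SettlementsRepository.listUncommittedAccruals runs as a database query rather than over in-memory sets.

```fsharp
// Hypothetical projection of a credit-side accrual for the settlement period.
type CreditAccrual =
    { LedgerId: int
      AmountCents: int64 }

// An accrual is eligible when it has no active settlement membership
// and no reversal pointing at it. The ledger rows themselves are never touched.
let uncommittedAccruals
    (activeMemberLedgerIds: Set<int>)
    (reversedLedgerIds: Set<int>)
    (accruals: CreditAccrual list)
    : CreditAccrual list =
    let isEligible (a: CreditAccrual) =
        not (Set.contains a.LedgerId activeMemberLedgerIds)
        && not (Set.contains a.LedgerId reversedLedgerIds)
    List.filter isEligible accruals
```

Cancelling a settlement then only changes which membership rows count as active; the same accrual becomes eligible again without any update to penalty_ledger.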
The Part That Still Needs Pressure
The full suite passes: 12 domain tests, 6 data tests, 30 application tests, 10 API tests, and 3 UI tests. The dashboard load audit has tooling for 10,000 breaches and 5,000 settlements. The fake staging flow proves a Contract Lifecycle event can become an accrual, settlement, Invoice Recon outbox message, and settlement.posted hub event in the local harness.
But the PRD still has open success criteria around staged NATS-to-ledger time, Invoice Recon posting p95, and million-row ledger query behavior. That is the honest limit of the current proof. The invariants are strong. The operational envelope still needs live pressure.
What surprised me is how much of the system exists to protect against its own future convenience. Every obvious shortcut would make one screen or one query easier. Single-row ledger entries. Updating rows for reversals. Marking ledger rows as settled. Reading tenant identity live during PDF posting.
Each shortcut is fine until the first external audit, supplier dispute, or tenant rename.
The lesson I took from this build is narrow: append-only is not a database trigger. It is a system-wide refusal to let later workflow needs rewrite earlier financial facts.