TL;DR
- Reconciliation is the part of a payments stack nobody architects for on day one and everyone pays for on day 200.
- The job: prove that every internal transaction matches the acquirer's settlement file, in the right currency, with the right fees, on the right value date — or surface the diff fast.
- The mechanics: normalize files → land into an events table → project to a read model → diff against the internal read model → buckets for ops to resolve.
- The boring details (file formats, fee parsing, FX rounding, value dates) are where 90% of the work lives.
If you've ever opened a CSV from an acquirer at the end of the month, sorted by amount, and tried to "just match it in Excel" — yes, this post is for you.
What "reconciled" actually means
A transaction is reconciled when, for the same logical payment, three views agree:
- What you sent — your internal record of the charge/payout (your read model).
- What the acquirer says happened — their settlement file or API report.
- What the bank actually credited / debited — the bank statement.
Disagreements are normal. Persistent disagreements are how you lose money slowly and never know.
The shape of a settlement file
Across the major acquirers, settlement files look broadly similar — and broadly different in the places that matter:
| Field | Variants you'll see |
|---|---|
| Transaction reference | acquirer's transaction_id, sometimes plus a merchant_reference round-tripped from you |
| Gross amount | minor units / decimal; transaction currency vs settlement currency |
| Fees | inline per-row, or aggregated at the file footer, or in a separate fees file |
| FX | inline rate vs separate FX file; sometimes only the converted amount |
| Value date | when the bank actually moves money — often T+1/T+2 from event date |
| Adjustments | refunds, chargebacks, fee corrections, reserves — usually mixed in |
| Encoding | UTF-8 if you're lucky; CP1252 / fixed-width / SWIFT MT940 if you're not |
| Granularity | one row per transaction or daily aggregates per merchant or both |
There's no industry-clean schema for this. Plan to write one normalizer per acquirer.
Normalizing into events
The trick that pays off: don't try to reconcile files. Reconcile events, all in the same shape, in one table. Each row in the settlement file becomes an event:
CREATE TABLE settlement_events (
event_id uuid PRIMARY KEY,
source text NOT NULL, -- acq_a / acq_b / bank_x
source_file text NOT NULL, -- filename for traceability
type text NOT NULL, -- charge / refund / chargeback / fee / fx_adjustment / reserve
external_id text NOT NULL, -- acquirer's transaction id
merchant_ref text, -- your charge_id if they round-tripped it
gross_minor bigint NOT NULL,
fee_minor bigint NOT NULL DEFAULT 0,
net_minor bigint NOT NULL,
currency text NOT NULL,
settlement_ccy text NOT NULL,
fx_rate numeric(18,8),
value_date date NOT NULL,
ts_event timestamptz NOT NULL,
raw jsonb NOT NULL
);
CREATE UNIQUE INDEX ON settlement_events (source, external_id, type, value_date);
The raw column is a lifeline: keep the original row as JSON for every event. The first time the normalizer is wrong, you'll need to re-parse without re-downloading.
Two read models
Project to two flat tables — one from your internal events, one from the settlement events:
CREATE TABLE recon_internal (
charge_id uuid PRIMARY KEY,
external_id text,
gross_minor bigint NOT NULL,
currency text NOT NULL,
fee_minor bigint NOT NULL DEFAULT 0,
net_minor bigint NOT NULL,
acquirer text,
status text NOT NULL
);
CREATE TABLE recon_settled (
charge_key text PRIMARY KEY, -- (acquirer, external_id) or fallback (acquirer, merchant_ref)
acquirer text NOT NULL,
external_id text NOT NULL,
gross_minor bigint NOT NULL,
fee_minor bigint NOT NULL,
net_minor bigint NOT NULL,
currency text NOT NULL,
settlement_ccy text NOT NULL,
fx_rate numeric(18,8),
value_date date NOT NULL
);
Now reconciliation is a SQL query against two tables, not a script against a CSV.
The diff
WITH paired AS (
SELECT i.charge_id,
i.external_id,
i.acquirer,
i.gross_minor AS internal_gross,
s.gross_minor AS settled_gross,
i.fee_minor AS internal_fee,
s.fee_minor AS settled_fee,
i.currency AS internal_ccy,
s.currency AS settled_ccy,
s.value_date
FROM recon_internal i
FULL OUTER JOIN recon_settled s
ON s.acquirer = i.acquirer AND s.external_id = i.external_id
)
SELECT *,
CASE
WHEN charge_id IS NULL THEN 'unknown_in_settlement'
WHEN external_id IS NULL THEN 'missing_settlement'
WHEN internal_ccy <> settled_ccy THEN 'currency_mismatch'
WHEN internal_gross <> settled_gross THEN 'gross_mismatch'
WHEN internal_fee <> settled_fee THEN 'fee_mismatch'
ELSE 'ok'
END AS bucket
FROM paired
WHERE charge_id IS NULL OR external_id IS NULL
OR internal_gross <> settled_gross
OR internal_fee <> settled_fee
OR internal_ccy <> settled_ccy;
Six buckets, six runbooks. Ops can drain each bucket independently. The numbers in each bucket are the only health metric reconciliation actually has.
Where the fiddly bits live
The query above is the easy part. The work is everywhere else.
Multi-currency
Two currencies in every settlement row: transaction currency (what the customer paid) and settlement currency (what the acquirer pays you in). Fees can be in either. FX may be inline or in a separate file with a value-date join. Two production rules:
- Store both currencies and the rate. Never store only the converted amount.
- Use a rounding rule chosen once and never argued with again — banker's rounding (half-to-even) is the safe default for finance.
Partial settlements and split files
A high-volume acquirer often splits a day's traffic across multiple files, sometimes across multiple days. Match on external_id, not file boundaries.
Fees that arrive separately
A few acquirers send a fees file independently of transactions. Project both into settlement_events with different types, then aggregate per external_id in the read model. Don't try to keep fees in a parallel structure.
Refunds and chargebacks
Each one is a new event, not an update to the original. They have their own external_id from the acquirer; they reference the original via a parent field that — surprise — is named differently per acquirer (original_transaction_id / refund_for / linked_id). Normalize on ingest.
Value-date drift
The transaction settles in your read model on T+0; the bank credits you on T+2. The reconciliation report can't blow up just because today's file doesn't include yesterday's late-evening charges. Compare by value date window, not "today".
A reusable matching heuristic
When external_id is missing or wrong (it happens), fall back through a small ladder:
def match(charge, candidates):
# 1. exact match on external_id (best)
by_ext = next((c for c in candidates if c.external_id == charge.external_id), None)
if by_ext: return by_ext
# 2. amount + currency + day + acquirer + last4
for c in candidates:
if (c.acquirer == charge.acquirer
and c.gross_minor == charge.gross_minor
and c.currency == charge.currency
and abs((c.value_date - charge.event_date).days) <= 2
and c.last4 == charge.last4):
return c
return None
The last4 is rarely on the settlement row but often on the transaction-detail report; if you can join the two, you've covered ~95% of "missing external_id" rows.
Metrics for a reconciliation function
If you're running this, three numbers belong on the dashboard:
| Metric | Definition | Target |
|---|---|---|
| Match rate (T+1) | % of internal events with a matching settlement event by T+1 close | > 99% steady-state |
| Unmatched aging | oldest unmatched row's age, per bucket | days, not weeks |
| Net delta (per currency, per acquirer) | sum(internal_net) − sum(settled_net) over a day | within rounding |
A reconciliation team's job isn't to do matching — it's to keep these three numbers honest while the engineers fix whichever normalizer is currently lying.
Closing
The settlement file is the truth your books are built on; treat it as a stream of events and the work scales. Treat it as a CSV to wrestle with and the work explodes.
If you're at the point where reconciliation is taking real engineering hours per month, the global payouts solution page lays out the model we use end-to-end (events → read models → diffs → ops queues).
Next in this series: what Verification of Payee and DORA change in your payment integration before the Sept/Jan 2026 deadlines.
*Author: payments engineer at PaynetEasy — we build payment orchestration and global payouts infrastructure → [payneteasy.com](https://payneteasy.com
Top comments (0)