How we moved from a fragile loop-based payout system to a reliable, idempotent, and traceable architecture.
Paying Money Is Easy. Paying It Correctly Is Not.
On paper, payouts sound simple:
- Customer places an order
- Platform collects payment
- Platform pays the seller
That's it.
Until you try to do it at scale.
The Real Problem Behind "Simple" Payouts
In any marketplace or fintech system, money flows across multiple parties:
- Sellers / vendors
- Delivery partners
- Platform fees
- Discounts, vouchers, wallet adjustments
Now you're not just "sending money" — you're managing a financial system.
Which means:
- ✅ No duplicate payouts
- ✅ No missing payouts
- ✅ Full auditability
- ✅ Strong failure recovery
Scale Changes Everything
What starts small:
₹5L/month
Quickly becomes:
₹70L+/month · Hundreds of payouts per cycle · Multiple failure points
At this stage, mistakes are expensive.
I Started with a Simple For-Loop
Like most systems, I started simple:
async function initiatePayouts(entityIds) {
  for (const id of entityIds) {
    await insertPaymentRecord(id); // DB write
    await callBankAPI(id); // Transfer call
    await updatePaymentStatus(id); // DB write
  }
}
It worked — until it didn't.
What Broke
❌ No Visibility
We couldn't tell what succeeded vs failed.
❌ Duplicate Payouts
Retries caused double payments.
❌ Data Inconsistencies
No transactional guarantees → manual fixes.
❌ No Crash Recovery
Server restart = lost progress.
The Core Problem
A for-loop has no memory.
It doesn't know:
- What it already processed
- What failed
- Where to resume
For financial systems, that's a deal-breaker.
Rethinking Payouts as a System
I redesigned payouts as a lifecycle with checkpoints:
[1] Pre-Payout Reconciliation
[2] Request Creation (Maker)
[3] Approval (Checker)
[4] Execution Pipeline
[5] Status Tracking
[6] Post-Payout Reconciliation
[7] Exception Handling
1. Pre-Payout Reconciliation — Trust Nothing
Before moving money, everything is validated:
- Ledger entries exist
- Discounts & adjustments match
- Wallet usage is correct
- Final payout amount is accurate
Mismatch → block payout.
Fixing a wrong payout is harder than delaying a correct one.
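The checks above could be collected into a single gate function. This is a minimal sketch, not the production code: the function name `reconcileBeforePayout`, the record shapes, the entry types (`sale`, `discount`, `fee`, `wallet`), and the netting formula (sales minus discounts minus fees) are all assumptions made for illustration. Amounts are integers in paise so that sums compare exactly.

```javascript
// Hypothetical pre-payout gate. Field names and the netting rule are
// illustrative assumptions, not the real schema or business logic.
function reconcileBeforePayout(payout, ledgerEntries) {
  const problems = [];

  // 1. Ledger entries must exist for this seller and cycle.
  const entries = ledgerEntries.filter(
    (e) => e.sellerId === payout.sellerId && e.cycleId === payout.cycleId
  );
  if (entries.length === 0) problems.push("no ledger entries");

  // Total of all entries of one type, in paise.
  const sum = (type) =>
    entries
      .filter((e) => e.type === type)
      .reduce((total, e) => total + e.amountPaise, 0);

  // 2. Discounts and wallet usage must match what the ledger recorded.
  if (sum("discount") !== payout.discountPaise) problems.push("discount mismatch");
  if (sum("wallet") !== payout.walletPaise) problems.push("wallet mismatch");

  // 3. The final amount must equal what the ledger says is owed
  //    (assumed rule: sales minus discounts minus platform fees).
  const expected = sum("sale") - sum("discount") - sum("fee");
  if (expected !== payout.amountPaise) problems.push("amount mismatch");

  // Any mismatch blocks the payout.
  return { ok: problems.length === 0, problems };
}
```

Returning the full list of problems, rather than stopping at the first one, gives whoever investigates a blocked payout the whole picture in one pass.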
The Ledger — Foundation of Everything
I implemented a double-entry ledger system.
Each financial event creates two entries:
| Entry | Meaning |
|---|---|
| Debit | Liability created |
| Credit | Liability settled |
Outstanding Payable = Total Debits - Total Credits
Guarantees:
- No debit → no payout
- No overpayment
- No duplicate payouts
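A small in-memory sketch shows how those three guarantees can fall out of the payable formula. It follows the convention in the table above (debit creates a liability, credit settles it); the `Ledger` class, its method names, and the `ref` field are hypothetical, and a real ledger would live in a database with transactional writes.

```javascript
// Minimal in-memory ledger sketch (hypothetical API, amounts in integer paise).
class Ledger {
  constructor() {
    this.entries = []; // { sellerId, kind: "debit" | "credit", amountPaise, ref }
  }

  static assertAmount(amountPaise) {
    if (!Number.isInteger(amountPaise) || amountPaise <= 0) {
      throw new Error("amount must be a positive integer (paise)");
    }
  }

  // A debit records money the platform now owes the seller.
  debit(sellerId, amountPaise, ref) {
    Ledger.assertAmount(amountPaise);
    this.entries.push({ sellerId, kind: "debit", amountPaise, ref });
  }

  // Outstanding Payable = Total Debits - Total Credits
  outstanding(sellerId) {
    return this.entries
      .filter((e) => e.sellerId === sellerId)
      .reduce(
        (total, e) => total + (e.kind === "debit" ? e.amountPaise : -e.amountPaise),
        0
      );
  }

  // A credit records a settlement. It is rejected if the payout reference
  // was already credited or if it exceeds what is outstanding.
  credit(sellerId, amountPaise, payoutRef) {
    Ledger.assertAmount(amountPaise);
    if (this.entries.some((e) => e.kind === "credit" && e.ref === payoutRef)) {
      throw new Error(`duplicate payout: ${payoutRef}`);
    }
    if (amountPaise > this.outstanding(sellerId)) {
      throw new Error("payout exceeds outstanding payable");
    }
    this.entries.push({ sellerId, kind: "credit", amountPaise, ref: payoutRef });
  }
}
```

A seller with no debits has an outstanding balance of zero, so any credit for them is rejected by the same overpayment check.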
2–3. Maker–Checker Flow
One person → Creates payout batch
Another → Approves it
Prevents costly human errors before execution.
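The rule is easy to enforce in code as long as the batch remembers who created it. A minimal sketch, assuming hypothetical `createBatch` and `approveBatch` helpers and made-up status names:

```javascript
// Hypothetical maker-checker gate: a batch can only be approved
// by someone other than the person who created it.
function createBatch(makerId, payoutIds) {
  return { makerId, payoutIds, status: "pending_approval", checkerId: null };
}

function approveBatch(batch, checkerId) {
  if (batch.status !== "pending_approval") {
    throw new Error(`cannot approve batch in status ${batch.status}`);
  }
  if (checkerId === batch.makerId) {
    throw new Error("maker cannot approve their own batch");
  }
  // Return a new object so the approval is an explicit state transition.
  return { ...batch, status: "approved", checkerId };
}
```

Only batches in the `approved` state would then be handed to the execution pipeline.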
4. Execution Pipeline — The Real Shift
Before execution, we validate:
- No pending transactions
- No ongoing transfers
- Valid bank details
- Ledger consistency
- No previous payouts for this cycle
Moving to a Queue (Game Changer)
I replaced the for-loop with a queue-based system (Redis + BullMQ).
New Flow:
Insert payment record → pending
↓
Push job to queue
↓
Worker picks job
↓
Re-check status (idempotency)
↓
Execute transfer
↓
Update status
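The worker step in that flow can be sketched as a job handler that re-checks the payment record before touching the bank. To keep the example runnable without Redis, it uses in-memory stand-ins: `makeProcessor`, `store`, `bank.transfer`, and the status names are all hypothetical, and the sketch assumes the bank API accepts an idempotency key.

```javascript
// Builds a job handler. `store` (a Map of payment records) and `bank`
// are hypothetical stand-ins for the payments table and the bank client.
function makeProcessor(store, bank) {
  return async function processPayout(job) {
    const payment = store.get(job.data.paymentId);
    if (!payment) throw new Error(`unknown payment ${job.data.paymentId}`);

    // Re-check status: a retried or duplicated job must not pay twice.
    if (payment.status !== "pending") {
      return { skipped: true, status: payment.status };
    }

    // Mark the record before calling the bank so a crash leaves a trace.
    // In a database this would be a conditional update on status = pending.
    payment.status = "processing";

    // The payment id doubles as the idempotency key (assumed bank feature).
    const result = await bank.transfer(payment.id, payment.amountPaise);

    payment.status = result.ok ? "success" : "failed";
    payment.bankRef = result.ref;
    return { skipped: false, status: payment.status };
  };
}
```

With BullMQ, a function of this shape would be passed as the processor to `new Worker(queueName, processPayout, { connection })`, and setting each job's `jobId` to the payment id is one way to keep the same payment from being enqueued twice. Note the design choice: if the process dies mid-transfer, the record stays in `processing` and the retry skips it, leaving status tracking to find out what the bank actually did instead of sending the money again.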
What This Solved
| Problem | For-Loop | Queue System |
|---|---|---|
| Visibility | ❌ None | ✅ Full tracking |
| Duplicates | ❌ High risk | ✅ Idempotent |
| DB Integrity | ❌ Weak | ✅ Transaction-safe |
| Crash Recovery | ❌ Lost jobs | ✅ Persistent |
Retries & Failure Handling
- Automatic retries with exponential backoff
- Permanent failures → alerts + support tickets
- No silent failures
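The retry policy itself is small enough to sketch by hand. The names `backoffDelayMs` and `runWithRetries` are hypothetical, and `sleep` and `onPermanentFailure` are injected so the example runs without real timers or a ticketing system.

```javascript
// Delay before the retry that follows attempt number `attempt` (1-based):
// baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelayMs(attempt, baseMs = 1000) {
  return baseMs * 2 ** (attempt - 1);
}

// Runs `task` up to `maxAttempts` times with exponential backoff between tries.
async function runWithRetries(task, { maxAttempts, baseMs, sleep, onPermanentFailure }) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      // Wait only if another attempt is coming.
      if (attempt < maxAttempts) await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  // Retries exhausted: raise an alert or ticket, then rethrow.
  // Nothing fails silently.
  await onPermanentFailure(lastError);
  throw lastError;
}
```

In BullMQ the equivalent policy is configured per job rather than hand-written, through the `attempts` and `backoff: { type: "exponential", delay }` job options.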
5. Status Tracking — No Unknown States
We track every payout using:
- Webhooks — real-time bank updates
- Polling fallback — for missed webhook events
No payout is ever left in an unknown state.
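Because the same update can arrive by webhook, by poll, or by both, the handler has to tolerate repeats and late events. A minimal sketch, assuming hypothetical status names (`processing`, `success`, `failed`) and an `updatedAt` timestamp on each payment:

```javascript
// Statuses that end a payout's lifecycle (assumed names).
const TERMINAL = new Set(["success", "failed"]);

// Applies a status reported by the bank, from either a webhook or a poll.
// Terminal states are never overwritten, so repeated or late events are harmless.
// Returns true if the payment record changed.
function applyBankStatus(payment, reportedStatus, source, now = Date.now()) {
  if (TERMINAL.has(payment.status)) return false;
  if (payment.status === reportedStatus) return false;
  payment.status = reportedStatus;
  payment.statusSource = source; // "webhook" or "poll"
  payment.updatedAt = now;
  return true;
}

// Polling fallback: payments that have sat in a non-terminal state for
// longer than `maxAgeMs` are the ones to ask the bank about.
function findStalePayments(payments, now, maxAgeMs) {
  return payments.filter(
    (p) => !TERMINAL.has(p.status) && now - p.updatedAt > maxAgeMs
  );
}
```

A scheduled job could call `findStalePayments` every few minutes, query the bank for each result, and feed the answer back through `applyBankStatus` with `source` set to `"poll"`.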
6. Post-Payout Reconciliation
We verify across three sources:
Internal system ←→ Bank records ←→ Ledger entries
Rule:
System Initiated = Bank Debited = Ledger Credited
Mismatch → alert + investigation.
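The three-way rule can be checked per payout rather than only on totals, which also points at exactly which payout is off. A minimal sketch, assuming each source has already been loaded into a `Map` of payout id to amount in paise; the function name and input shapes are hypothetical:

```javascript
// Compares three views of the same payouts, keyed by payout id.
// Each argument is a Map of payoutId -> amount in paise (assumed shape).
function reconcilePayouts(initiated, bankDebited, ledgerCredited) {
  const ids = new Set([
    ...initiated.keys(),
    ...bankDebited.keys(),
    ...ledgerCredited.keys(),
  ]);

  const mismatches = [];
  for (const id of ids) {
    const a = initiated.get(id);
    const b = bankDebited.get(id);
    const c = ledgerCredited.get(id);
    // A payout missing from any source shows up as undefined and is flagged too.
    if (a !== b || b !== c) {
      mismatches.push({ id, initiated: a, bankDebited: b, ledgerCredited: c });
    }
  }
  return mismatches;
}
```

An empty result means all three sources agree; anything else feeds the alert and investigation step.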
Daily Audit
We generate daily reports covering:
- Total initiated
- Total debited
- Total credited
Issues are caught before the next payout cycle.
7. Exception Handling
Some cases require manual intervention:
- Invalid bank details
- External gateway failures
- Ledger mismatches
Handled via:
- Ticketing system
- Auto-retry after resolution
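One way to wire the ticket and the retry together is to let ticket resolution be the only path back onto the queue. This is a sketch under assumptions: `openTicket`, `resolveTicket`, the `on_hold` status, and the injected `enqueue` callback are all hypothetical names.

```javascript
// Hypothetical exception flow: a failed payout opens a ticket and is put
// on hold; resolving the ticket makes it eligible for the worker again.
function openTicket(tickets, payment, reason) {
  const ticket = {
    id: `T-${tickets.length + 1}`,
    paymentId: payment.id,
    reason,
    status: "open",
  };
  payment.status = "on_hold";
  tickets.push(ticket);
  return ticket;
}

function resolveTicket(ticket, payment, enqueue) {
  if (ticket.status !== "open") return false; // already handled, do not re-enqueue
  ticket.status = "resolved";
  payment.status = "pending";
  enqueue({ paymentId: payment.id });
  return true;
}
```

Guarding on the ticket status means resolving the same ticket twice enqueues the payout only once.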
Design Trade-offs
| Decision | Choice | Trade-off |
|---|---|---|
| Safety vs Speed | Manual approval checks | Slower to execute |
| Simplicity vs Reliability | Queue-based system | More infrastructure |
| Automation vs Control | Human approvals | Less fully automated |
What We Learned
- Build the ledger first — everything else depends on it
- Reconciliation is not optional — it's the safety net
- For-loops fail at scale — queues are the answer
- Observability is critical — if you can't see it, you can't fix it
- Human oversight still matters — especially at approval gates
Closing Thoughts
We didn't start with a complex system.
We started with:
A loop → Failures → Iteration
As we scaled, the system evolved:
Execution → Validation
Functions → Systems
Assumptions → Guarantees
Today, it's not just about moving money.
It's about proving that every transaction is correct.
— Mitesh Vasoya
Backend Engineer · Fintech Systems