Mitesh Vasoya

From a For-Loop to a Fault-Tolerant Payout System (₹70L/month, 0 Duplicate Payments)

How we moved from a fragile loop-based payout system to a reliable, idempotent, and traceable architecture.


Paying Money Is Easy. Paying It Correctly Is Not.

On paper, payouts sound simple:

  • Customer places an order
  • Platform collects payment
  • Platform pays the seller

That's it.

Until you try to do it at scale.


The Real Problem Behind "Simple" Payouts

In any marketplace or fintech system, money flows across multiple parties:

  • Sellers / vendors
  • Delivery partners
  • Platform fees
  • Discounts, vouchers, wallet adjustments

Now you're not just "sending money" — you're managing a financial system.

Which means:

  • ✅ No duplicate payouts
  • ✅ No missing payouts
  • ✅ Full auditability
  • ✅ Strong failure recovery

Scale Changes Everything

What starts small:

₹5L/month

Quickly becomes:

₹70L+/month · Hundreds of payouts per cycle · Multiple failure points

At this stage, mistakes are expensive.


I Started with a Simple For-Loop

Like most payout systems, mine started simple:

async function initiatePayouts(entityIds) {
  for (const id of entityIds) {
    await insertPaymentRecord(id);   // DB write
    await callBankAPI(id);           // Transfer call
    await updatePaymentStatus(id);   // DB write
  }
}

It worked — until it didn't.


What Broke

❌ No Visibility
We couldn't tell what succeeded vs failed.

❌ Duplicate Payouts
Retries caused double payments.

❌ Data Inconsistencies
No transactional guarantees → manual fixes.

❌ No Crash Recovery
Server restart = lost progress.


The Core Problem

A for-loop has no memory.

It doesn't know:

  • What it already processed
  • What failed
  • Where to resume

For financial systems, that's a deal-breaker.


Rethinking Payouts as a System

I redesigned payouts as a lifecycle with checkpoints:

[1] Pre-Payout Reconciliation
[2] Request Creation  (Maker)
[3] Approval          (Checker)
[4] Execution Pipeline
[5] Status Tracking
[6] Post-Payout Reconciliation
[7] Exception Handling

1. Pre-Payout Reconciliation — Trust Nothing

Before moving money, everything is validated:

  • Ledger entries exist
  • Discounts & adjustments match
  • Wallet usage is correct
  • Final payout amount is accurate

Mismatch → block payout.

Fixing a wrong payout is harder than delaying a correct one.
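
The gate described above can be sketched as a pure function that recomputes the payable amount and refuses to proceed on any mismatch. This is a minimal illustration, not the production check: the field names (`ledgerEntries`, `deductions`, `requestedAmount`) are hypothetical, and amounts are assumed to be integers in paise so comparisons are exact.

```javascript
// Hypothetical pre-payout check. Field names are illustrative assumptions;
// amounts are integers (paise) so equality comparisons are exact.
function reconcileBeforePayout(payout) {
  const problems = [];
  const entries = payout.ledgerEntries ?? [];

  // Ledger entries must exist before any money moves.
  if (entries.length === 0) problems.push('no ledger entries');

  // Recompute the payable amount instead of trusting the stored total.
  const gross = entries.reduce((sum, e) => sum + e.amount, 0);
  const deductions = (payout.deductions ?? []).reduce((sum, d) => sum + d.amount, 0);
  const expected = gross - deductions;

  if (expected !== payout.requestedAmount) {
    problems.push(`amount mismatch: expected ${expected}, requested ${payout.requestedAmount}`);
  }
  if (expected <= 0) problems.push('nothing payable');

  // Any problem blocks the payout.
  return { ok: problems.length === 0, problems };
}
```

A caller would skip (and flag) every payout where `ok` is false rather than attempting a partial or corrected transfer.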


The Ledger — Foundation of Everything

I implemented a double-entry ledger system.

Each financial event creates two entries:

Entry     Meaning
Debit     Liability created
Credit    Liability settled
Outstanding Payable = Total Debits - Total Credits

Guarantees:

  • No debit → no payout
  • No overpayment
  • No duplicate payouts
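
To make the formula concrete, here is a small sketch that derives the outstanding payable from ledger rows and uses it as a payout guard. It follows the convention stated above (debit = liability created, credit = liability settled); the row shape `{ sellerId, type, amount }` and the function names are assumptions for illustration.

```javascript
// Outstanding Payable = Total Debits - Total Credits, per seller.
// Rows are plain objects here; a real ledger would live in a database table.
function outstandingPayable(entries, sellerId) {
  let debits = 0;
  let credits = 0;
  for (const entry of entries) {
    if (entry.sellerId !== sellerId) continue;
    if (entry.type === 'debit') debits += entry.amount;
    else if (entry.type === 'credit') credits += entry.amount;
  }
  return debits - credits;
}

// A payout is allowed only if it is positive and covered by the outstanding amount.
function canPay(entries, sellerId, amount) {
  return amount > 0 && amount <= outstandingPayable(entries, sellerId);
}
```

Because every completed payout is recorded as a credit, the outstanding amount drops after it, so a second attempt for the same liability fails the `canPay` check.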

2–3. Maker–Checker Flow

One person  →  Creates payout batch
Another     →  Approves it

Prevents costly human errors before execution.
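
The core rule of maker-checker is that the approver must not be the creator. A minimal sketch of that rule, with hypothetical batch fields and status names:

```javascript
// Maker creates a batch; it cannot execute until someone else approves it.
function createBatch(makerId, payoutIds) {
  return { makerId, payoutIds, status: 'pending_approval', approvedBy: null };
}

// Checker approves. Returns a new batch object; throws on rule violations.
function approveBatch(batch, checkerId) {
  if (batch.status !== 'pending_approval') {
    throw new Error(`batch is ${batch.status}, cannot approve`);
  }
  if (checkerId === batch.makerId) {
    throw new Error('maker cannot approve their own batch');
  }
  return { ...batch, status: 'approved', approvedBy: checkerId };
}
```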


4. Execution Pipeline — The Real Shift

Before execution, we validate:

  • No pending transactions
  • No ongoing transfers
  • Valid bank details
  • Ledger consistency
  • No previous payouts for this cycle
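
The last check, no previous payouts for this cycle, is commonly enforced as a uniqueness rule. The sketch below uses an in-memory set keyed by a hypothetical `(sellerId, cycleId)` pair; in a relational database the same rule would typically be a unique constraint on those two columns.

```javascript
// Reserves one payout slot per seller per cycle.
// The key format and names are illustrative assumptions.
function createCycleGuard() {
  const seen = new Set();
  return function reserve(sellerId, cycleId) {
    const key = `${sellerId}:${cycleId}`;
    if (seen.has(key)) return false; // already paid or in progress for this cycle
    seen.add(key);
    return true;
  };
}
```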

Moving to a Queue (Game Changer)

I replaced the for-loop with a queue-based system (Redis + BullMQ).

New Flow:

Insert payment record → pending
          ↓
    Push job to queue
          ↓
    Worker picks job
          ↓
  Re-check status (idempotency)
          ↓
    Execute transfer
          ↓
      Update status
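
The worker side of this flow can be sketched as a job processor that claims the payment record before touching the bank. Everything here is a stand-in: `createInMemoryDb` replaces a real database, `bank.transfer` is a hypothetical client method, and the status names are assumptions. Error handling is deliberately left out to keep the idempotency check in focus.

```javascript
// In-memory stand-in for the payments table (hypothetical shape).
function createInMemoryDb(records) {
  const rows = new Map(records.map((r) => [r.id, { ...r }]));
  return {
    // Move pending -> processing only if still pending. With SQL this would be
    // one conditional UPDATE ... WHERE id = ? AND status = 'pending'.
    async claimIfPending(id) {
      const row = rows.get(id);
      if (!row || row.status !== 'pending') return false;
      row.status = 'processing';
      return true;
    },
    async get(id) {
      return rows.get(id);
    },
    async setStatus(id, status, bankRef) {
      Object.assign(rows.get(id), { status, bankRef });
    },
  };
}

// Builds the job processor. `job.data.paymentId` is an assumed payload shape.
function makePayoutProcessor(db, bank) {
  return async function processPayout(job) {
    const { paymentId } = job.data;

    // Re-check status: a retried or duplicated job finds the record already claimed.
    const claimed = await db.claimIfPending(paymentId);
    if (!claimed) return { skipped: true };

    const payment = await db.get(paymentId);
    // Sending paymentId as the reference lets the bank deduplicate too, if it supports that.
    const bankRef = await bank.transfer({ reference: paymentId, amount: payment.amount });
    await db.setStatus(paymentId, 'initiated', bankRef);
    return { skipped: false, bankRef };
  };
}
```

With BullMQ, a function of this shape is what gets passed as the processor to `new Worker(queueName, processor, { connection })`. Running the same job twice results in a single bank call, because the second run fails to claim the record.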

What This Solved

Problem          For-Loop        Queue System
Visibility       ❌ None         ✅ Full tracking
Duplicates       ❌ High risk    ✅ Idempotent
DB Integrity     ❌ Weak         ✅ Transaction-safe
Crash Recovery   ❌ Lost jobs    ✅ Persistent

Retries & Failure Handling

  • Automatic retries with exponential backoff
  • Permanent failures → alerts + support tickets
  • No silent failures
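
BullMQ exposes retries through the `attempts` and `backoff` job options. The sketch below shows the underlying idea in plain JavaScript rather than the library's internals: delays double on each attempt, and a permanent failure triggers a callback instead of disappearing. The `permanent` flag on errors, the option names, and the default values are assumptions for illustration.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay before the next try after a failed attempt (attempt starts at 1):
// baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Runs `task` until it succeeds, hits a permanent error, or runs out of attempts.
async function withRetries(task, options = {}) {
  const {
    maxAttempts = 5,
    baseMs = 1000,
    sleep = defaultSleep,
    onPermanentFailure = () => {}, // e.g. raise an alert or open a support ticket
  } = options;

  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      // Errors marked permanent (an assumed convention) are never retried.
      if (err.permanent || attempt === maxAttempts) break;
      await sleep(backoffDelay(attempt, baseMs));
    }
  }
  await onPermanentFailure(lastError); // no silent failures
  throw lastError;
}
```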

5. Status Tracking — No Unknown States

We track every payout using:

  • Webhooks — real-time bank updates
  • Polling fallback — for missed webhook events

No payout is ever left in an unknown state.
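
One way to combine the two channels is to route both through a single update function, so a late webhook and a poll result cannot overwrite each other. The following is a sketch under assumptions: the status names, the `updatedAt` field, and `fetchBankStatus` are hypothetical.

```javascript
const TERMINAL_STATUSES = new Set(['success', 'failed']);

// Applies a bank-reported status. Returns true if the record changed.
// Terminal states are never overwritten, so duplicate or late events are harmless.
function applyBankStatus(payment, bankStatus) {
  if (TERMINAL_STATUSES.has(payment.status)) return false;
  if (payment.status === bankStatus) return false;
  payment.status = bankStatus;
  payment.updatedAt = Date.now();
  return true;
}

// Polling fallback: query the bank for payouts stuck in a non-terminal state.
// Returns how many stale payouts were checked.
async function pollStalePayouts(payments, fetchBankStatus, staleAfterMs, now = Date.now()) {
  const stale = payments.filter(
    (p) => !TERMINAL_STATUSES.has(p.status) && now - p.updatedAt > staleAfterMs
  );
  for (const payment of stale) {
    applyBankStatus(payment, await fetchBankStatus(payment.id));
  }
  return stale.length;
}
```

A webhook handler would call `applyBankStatus` directly with the status from the event payload, while a scheduled job calls `pollStalePayouts` for anything the webhooks missed.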


6. Post-Payout Reconciliation

We verify across three sources:

Internal system  ←→  Bank records  ←→  Ledger entries

Rule:

System Initiated = Bank Debited = Ledger Credited

Mismatch → alert + investigation.
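
The three-way rule can be sketched as a comparison keyed by payout ID. The input shape (`{ payoutId, amount }` rows from each source) is an assumption; real bank statements would need to be mapped into it first.

```javascript
// Compares system-initiated payouts, bank debits, and ledger credits.
// Returns one entry per payout that is missing or differs in any source.
function reconcilePayouts(initiated, bankDebits, ledgerCredits) {
  const index = (rows) => new Map(rows.map((r) => [r.payoutId, r.amount]));
  const system = index(initiated);
  const bank = index(bankDebits);
  const ledger = index(ledgerCredits);

  // Include IDs seen in any source, so unexpected bank debits are caught too.
  const ids = new Set([...system.keys(), ...bank.keys(), ...ledger.keys()]);
  const mismatches = [];
  for (const id of ids) {
    const s = system.get(id);
    const b = bank.get(id);
    const l = ledger.get(id);
    if (s === undefined || s !== b || s !== l) {
      mismatches.push({ payoutId: id, system: s ?? null, bank: b ?? null, ledger: l ?? null });
    }
  }
  return mismatches;
}
```

An empty result means all three sources agree; anything else feeds the alerting step.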

Daily Audit

We generate daily reports covering:

  • Total initiated
  • Total debited
  • Total credited

Issues are caught before the next payout cycle.


7. Exception Handling

Some cases require manual intervention:

  • Invalid bank details
  • External gateway failures
  • Ledger mismatches

Handled via:

  • Ticketing system
  • Auto-retry after resolution
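
The ticket-then-retry loop can be sketched as follows. This is an in-memory illustration only: `createExceptionDesk`, the ticket fields, and the `requeue` callback (which would push the payout back onto the queue) are all hypothetical names.

```javascript
// Tracks payouts that need manual intervention and re-queues them once resolved.
function createExceptionDesk(requeue) {
  const tickets = new Map();
  let nextId = 1;
  return {
    // Open a ticket for a payout that cannot proceed automatically.
    open(paymentId, reason) {
      const id = nextId++;
      tickets.set(id, { id, paymentId, reason, status: 'open' });
      return id;
    },
    // Mark the ticket resolved and retry the payout. Resolving twice is a no-op,
    // so the payout is re-queued at most once per ticket.
    async resolve(ticketId) {
      const ticket = tickets.get(ticketId);
      if (!ticket || ticket.status !== 'open') return false;
      ticket.status = 'resolved';
      await requeue(ticket.paymentId);
      return true;
    },
  };
}
```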

Design Trade-offs

Decision                     Choice                    Trade-off
Safety vs Speed              Manual approval checks    Slower to execute
Simplicity vs Reliability    Queue-based system        More infrastructure
Automation vs Control        Human approvals           Less fully automated

What We Learned

  1. Build the ledger first — everything else depends on it
  2. Reconciliation is not optional — it's the safety net
  3. For-loops fail at scale — queues are the answer
  4. Observability is critical — if you can't see it, you can't fix it
  5. Human oversight still matters — especially at approval gates

Closing Thoughts

We didn't start with a complex system.

We started with:

A loop → Failures → Iteration

As we scaled, the system evolved:

Execution   →  Validation
Functions   →  Systems
Assumptions →  Guarantees

Today, it's not just about moving money.

It's about proving that every transaction is correct.


— Mitesh Vasoya
Backend Engineer · Fintech Systems
