Payment failures are rarely just “failed payments” ⚙️
A customer may see money debited.
The app may show a timeout.
The backend may keep the transaction pending.
The payment processor may confirm success a few minutes later.
Each system can be correct within its own boundary, yet the product experience still breaks.
The real architecture problem
The mistake is treating a payment as a single success/failure flag.
In real systems, these are separate flows:
- Retry
- Reversal
- Refund
- Settlement
- Reconciliation
Each flow has different owners, timelines, risks, audit requirements, and customer impact.
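As a minimal sketch of what "separate flows" can look like in a data model, the record below tracks attempt, refund, and settlement status independently instead of collapsing them into one flag. The enum names, states, and the `PaymentRecord` class are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AttemptStatus(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


class RefundStatus(Enum):
    INITIATED = "initiated"
    COMPLETED = "completed"
    FAILED = "failed"


class SettlementStatus(Enum):
    UNSETTLED = "unsettled"
    SETTLED = "settled"
    MISMATCHED = "mismatched"


@dataclass
class PaymentRecord:
    """Hypothetical record: each flow keeps its own status field."""
    payment_id: str
    attempt: AttemptStatus = AttemptStatus.PENDING
    refund: Optional[RefundStatus] = None  # None until a refund is requested
    settlement: SettlementStatus = SettlementStatus.UNSETTLED

    def summary(self) -> str:
        refund = self.refund.value if self.refund else "none"
        return (
            f"attempt={self.attempt.value} "
            f"refund={refund} "
            f"settlement={self.settlement.value}"
        )


record = PaymentRecord(
    "pay_1",
    attempt=AttemptStatus.SUCCEEDED,
    refund=RefundStatus.INITIATED,
)
print(record.summary())
```

A single boolean could not express this record: the payment succeeded, a refund is in flight, and nothing has settled yet.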
What strong payment systems usually need
A production-grade payment architecture should include:
- Durable payment attempt IDs
- Idempotent retry handling
- Controlled state transitions
- Event deduplication
- Pending verification states
- Append-only ledger entries
- Settlement mapping
- Reconciliation jobs
- Support-visible operational status
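Three of the items above (idempotent retry handling, controlled state transitions, and event deduplication) can be sketched together. This is an illustration under stated assumptions, not a reference design: the state names, the `PaymentStore` class, and the in-memory dictionaries standing in for a durable database are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical states; real processors expose their own vocabulary.
ALLOWED_TRANSITIONS = {
    "created": {"pending_verification", "failed"},
    "pending_verification": {"succeeded", "failed"},
    "succeeded": set(),  # terminal here; refunds live in their own flow
    "failed": set(),     # terminal; a retry creates a new attempt
}


@dataclass
class Attempt:
    attempt_id: str
    state: str = "created"


class PaymentStore:
    """In-memory stand-in for a durable store (assumption for this sketch)."""

    def __init__(self) -> None:
        self.attempts: dict[str, Attempt] = {}
        self.by_idempotency_key: dict[str, str] = {}
        self.seen_event_ids: set[str] = set()

    def create_attempt(self, idempotency_key: str) -> Attempt:
        # A retried request with the same key returns the original attempt.
        if idempotency_key in self.by_idempotency_key:
            return self.attempts[self.by_idempotency_key[idempotency_key]]
        attempt = Attempt(attempt_id=f"att_{len(self.attempts) + 1}")
        self.attempts[attempt.attempt_id] = attempt
        self.by_idempotency_key[idempotency_key] = attempt.attempt_id
        return attempt

    def apply_event(self, event_id: str, attempt_id: str, new_state: str) -> bool:
        """Apply a processor event at most once; True only if state changed."""
        if event_id in self.seen_event_ids:
            return False  # duplicate delivery of the same event
        self.seen_event_ids.add(event_id)
        attempt = self.attempts[attempt_id]
        if new_state not in ALLOWED_TRANSITIONS[attempt.state]:
            # A real system would also record the rejected event for review.
            return False
        attempt.state = new_state
        return True
```

With this shape, a client that retries `create_attempt` after a timeout gets the same attempt back, and a late "failed" event cannot overwrite an attempt that has already succeeded.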
The goal is not only to process payments.
The goal is to explain what happened when money moved, a confirmation was delayed, a refund failed, or a settlement did not match.
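One way to make "what happened" answerable is an append-only ledger compared against the processor's settlement report. The sketch below assumes a report that maps payment IDs to net settled amounts in minor units; the `LedgerEntry`, `Ledger`, and `reconcile` names and the report format are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LedgerEntry:
    payment_id: str
    kind: str          # e.g. "capture" or "refund" (illustrative labels)
    amount_minor: int  # signed amount in minor units, e.g. cents


class Ledger:
    def __init__(self) -> None:
        self._entries: list[LedgerEntry] = []

    def append(self, entry: LedgerEntry) -> None:
        # Corrections are new entries; existing entries are never edited.
        self._entries.append(entry)

    def net_by_payment(self) -> dict[str, int]:
        totals: dict[str, int] = {}
        for entry in self._entries:
            totals[entry.payment_id] = (
                totals.get(entry.payment_id, 0) + entry.amount_minor
            )
        return totals


def reconcile(
    ledger: Ledger, settlement_report: dict[str, int]
) -> dict[str, tuple[Optional[int], Optional[int]]]:
    """Return (ledger_net, settled) for every payment where the two differ."""
    ours = ledger.net_by_payment()
    mismatches: dict[str, tuple[Optional[int], Optional[int]]] = {}
    for payment_id in sorted(set(ours) | set(settlement_report)):
        expected = ours.get(payment_id)
        settled = settlement_report.get(payment_id)
        if expected != settled:
            mismatches[payment_id] = (expected, settled)
    return mismatches


ledger = Ledger()
ledger.append(LedgerEntry("pay_1", "capture", 1000))
ledger.append(LedgerEntry("pay_1", "refund", -400))
ledger.append(LedgerEntry("pay_2", "capture", 500))

# pay_1 nets to 600 and matches; pay_2 is missing from the report;
# pay_3 was settled but has no ledger entries.
print(reconcile(ledger, {"pay_1": 600, "pay_3": 250}))
```

Each mismatch is a question for a person or a follow-up job to answer, which is why the ledger keeps every entry rather than a single running balance.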
Mobile should not decide payment truth
The mobile app should not be the final source of truth for payment completion.
It should recover safely from:
- Timeout
- App switch
- SDK return
- Browser return
- Delayed callbacks
- Network loss
After returning from a payment flow, the app should ask the backend for the authoritative payment status instead of assuming success or failure locally.
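The recovery logic might look roughly like the sketch below. It is written in Python to stay consistent with the other examples, but the same shape would apply in Kotlin or Swift. The `fetch_status` callable, the status strings, and the backoff values are assumptions, not part of any specific SDK.

```python
import time
from typing import Callable

TERMINAL = {"succeeded", "failed"}  # hypothetical terminal statuses


def resolve_payment_status(
    fetch_status: Callable[[], str],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the backend until a terminal status or the attempt budget runs out."""
    for attempt in range(max_attempts):
        try:
            status = fetch_status()
        except ConnectionError:
            status = "unknown"  # network loss is not a payment failure
        if status in TERMINAL:
            return status
        if attempt < max_attempts - 1:
            sleep(base_delay_s * (2 ** attempt))  # exponential backoff
    # Out of attempts: show a "we are verifying" state, never a guess.
    return "pending_verification"
```

The important design choice is the fallback: when the backend cannot give a terminal answer in time, the app reports that verification is still in progress rather than inventing a success or a failure.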
Why this matters
A timeout is not always a failed payment.
A successful payment is not always settled.
A refund-initiated event does not always mean the money has reached the customer.
A callback should not directly overwrite business state without validation.
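A hedged sketch of callback validation follows. It assumes the processor signs the raw request body with HMAC-SHA256 and sends the hex digest alongside it. Real processors document their own signing schemes, so treat the signature format, the payload fields, and the `handle_callback` name as hypothetical.

```python
import hashlib
import hmac
import json

# Illustrative transition table: only pending payments may be finalized.
ALLOWED = {"pending_verification": {"succeeded", "failed"}}


def handle_callback(
    raw_body: bytes,
    signature_hex: str,
    secret: bytes,
    payments: dict[str, str],
) -> str:
    """Validate a callback, then apply it only if the transition is legal."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature_hex):
        return "rejected: bad signature"

    event = json.loads(raw_body)
    payment_id, new_state = event["payment_id"], event["status"]

    current = payments.get(payment_id)
    if current is None:
        return "rejected: unknown payment"
    if new_state not in ALLOWED.get(current, set()):
        return "ignored: illegal transition"

    payments[payment_id] = new_state
    return "applied"
```

The callback is treated as a claim to be checked, first for authenticity and then against the current state, before any business state changes.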
That is the difference between integrating a payment gateway and designing a payment system.
Read the full article
I wrote a detailed Medium article on designing payment failure architecture across mobile, backend, platform, finance, and support boundaries.
👉 Read the full article here:
https://medium.com/@vaibhav.shakya786/payment-failure-architecture-designing-retry-reversal-refund-settlement-and-reconciliation-f5556de08038
Top comments (1)
This nails the part most gateway integrations miss — the mobile return is an event, not a verdict. We run open banking payments where the bank confirms async on its own clock, so the app coming back "success" means "go ask the backend," never "show the receipt." The addition I'd make to your list: treat the reconciliation job, not your own DB, as the final arbiter. Your DB being internally consistent is the easy half; staying consistent with money that already moved at the rail is the half that actually pages you. Splitting retry/reversal/refund/settlement into separate state machines instead of one status flag is the best call in here.