Every payment gateway I've ever worked on had the same hidden bug.
A provider API times out. The code says "failure". So you retry. But the original request actually succeeded – the provider just took too long to respond. Now you've double‑charged the customer.
I built Azums, an open‑source payment gateway in Rust, specifically to stop this pattern.
__The fix: make ambiguity explicit.
Instead of pending → success/fail, I designed a state machine with five states:
-
pending(request sent, waiting) -
succeeded(confirmed success) -
failed(confirmed failure) -
retryable(temporary error, safe to retry) -
unknown(timeout or ambiguous response – needs investigation)
When a timeout happens, the system doesn't guess. It marks the transaction as unknown and stops. No blind retries. No double charges.
Why this matters beyond payments
This same pattern applies anywhere you talk to unreliable external systems:
- Blockchain RPCs that timeout after the transaction was submitted
- AI agent API calls that hang but may have executed
- Messaging queues that lose acknowledgements
Treating ambiguity as a real state is the difference between a system that guesses and a system that you can trust.
My full implementation is on GitHub: BlockForge-Dev/Azums
What's your worst "timeout caused a disaster" story? Let me know in the comments.
Top comments (0)