Idempotency, Explained Through the Retry That Doesn't Double-Charge

You click "Pay," the spinner hangs, and nothing happens. So you click again. Behind the scenes, the first request actually succeeded — your bank approved the charge — but the response never made it back to your phone because the connection dropped. Your second click fires an identical request. Without protection, you just paid twice.

This is the problem idempotency solves. The word sounds academic, but the failure it prevents is concrete: the same operation running more than once and changing your data more than once. We'll walk through it using the payment example because it's the one where the cost of getting it wrong is measured in real dollars and refund tickets.

What "idempotent" actually means

An operation is idempotent if running it once leaves the system in the same state as running it ten times. The classic distinction lives in HTTP. GET /account/123 is naturally idempotent: reading your balance a hundred times doesn't change it. DELETE /session/abc is idempotent too: the session is gone after the first call, and the next nine calls find nothing to delete and leave the world unchanged. The response may differ (a later call might return 404 where the first returned 200), but the state on the server does not.

The trouble is POST. POST /charges creates a new charge every time it runs. That's the correct default for creating things — you usually want a second POST to make a second resource. But for a payment, a second charge is exactly the bug. The operation is not idempotent by nature, so you have to make it idempotent on purpose.
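To make the contrast concrete, here is a minimal sketch using in-memory state. The names `sessions`, `charges`, `delete_session`, and `create_charge` are made up for illustration and stand in for real endpoints.

```python
# Hypothetical in-memory state, for illustration only.
sessions = {"abc"}
charges = []

def delete_session(session_id):
    """Idempotent: discard() is a no-op when the session is already gone."""
    sessions.discard(session_id)

def create_charge(amount_cents):
    """Not idempotent: every call appends another charge."""
    charges.append(amount_cents)

for _ in range(10):
    delete_session("abc")
for _ in range(3):
    create_charge(500)

print(len(sessions))  # 0, the same as after a single call
print(len(charges))   # 3, one per call
```

Ten deletes leave the same state as one; three creates leave three charges.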

Idempotency is not the same as "only run once." A retried request might still hit your server twice — the goal is that the second hit produces no additional effect. The card gets charged once even if the request arrives three times.

How an idempotency key works

The pattern that became the industry standard is the idempotency key, popularized by Stripe's API. The client generates a unique value — typically a UUID — and attaches it to the request, usually in an Idempotency-Key header. Critically, the client generates it before sending and reuses the same key on every retry of that one logical operation.
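On the client side, the important detail is where the key is created. The sketch below assumes a hypothetical `send(headers, body)` transport that raises `ConnectionError` when the response is lost; the function name, the body shape, and the retry count are illustrative, not part of any real SDK.

```python
import uuid

def pay_with_retries(send, amount_cents, max_attempts=3):
    """Generate one key for this logical payment and reuse it on every retry.

    `send` is a hypothetical transport callable: send(headers, body) -> response.
    """
    key = str(uuid.uuid4())  # created once, before the first attempt
    headers = {"Idempotency-Key": key}
    body = {"amount_cents": amount_cents}
    for _ in range(max_attempts):
        try:
            return send(headers, body)
        except ConnectionError:
            continue  # retry with the same headers, and therefore the same key
    raise ConnectionError("payment status unknown after retries")

# A fake transport that loses the first response, to show the key is reused.
seen_keys = []

def flaky_send(headers, body):
    seen_keys.append(headers["Idempotency-Key"])
    if len(seen_keys) == 1:
        raise ConnectionError("response lost")
    return {"status": 200}

result = pay_with_retries(flaky_send, 500)
```

Both attempts carry the same key because it is generated outside the retry loop.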

The server's job is then mechanical:

1. Read the key from the incoming request.
2. Look it up in a store of keys it has already processed.
3. If the key is new, process the charge, then save the key alongside the response it produced.
4. If the key already exists, skip the work entirely and return the saved response from the first time.

That fourth step is the whole trick. The retried request gets back the original "charge succeeded" response — same status code, same charge ID — so the client sees success and stops retrying. The card is never touched a second time.
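The four steps can be sketched in a few lines. This is a deliberately naive, single-process version: the `processed` dict and `card_charges` list are hypothetical stand-ins for a real key store and card network, and it ignores the races and crash windows that production code has to handle.

```python
processed = {}     # idempotency key -> saved response (hypothetical in-memory store)
card_charges = []  # stands in for the real card network

def handle_charge(key, amount_cents):
    # Steps 1-2: read the key and look it up.
    if key in processed:
        # Step 4: known key, so skip the work and replay the saved response.
        return processed[key]
    # Step 3: new key, so do the work, then save the response under the key.
    card_charges.append(amount_cents)
    response = {"status": 201, "charge_id": f"ch_{len(card_charges)}"}
    processed[key] = response
    return response

first = handle_charge("key-1", 500)
retry = handle_charge("key-1", 500)
```

The retry returns the same charge ID as the first call, and `card_charges` still holds a single entry.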

The key has to be generated by the client, not the server, and it has to be tied to the user's intent — one checkout, one key. If you generate a fresh key on each retry, you've defeated the entire mechanism: every retry looks new, and you're back to double-charging.

Where idempotency quietly breaks

The concept is simple. The production failures are where it gets interesting, because they hide in the gaps between "check the key" and "do the work."

The race between check and write. Two retries can arrive at nearly the same instant. Both look up the key, both find nothing, both proceed to charge. The fix is to make the key reservation atomic — a unique constraint on the key column, or an atomic "insert if not exists." The first request wins the insert; the second hits a conflict and waits for or reads the first one's result instead of charging.
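One way to get an atomic reservation is to let the database enforce uniqueness. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name `idempotency_keys` and the helper `try_reserve` are assumptions for illustration, and a production system would apply the same idea in its own database.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, response TEXT)")

def try_reserve(key):
    """Return True only for the caller whose INSERT creates the row."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute("INSERT INTO idempotency_keys (key) VALUES (?)", (key,))
        return True
    except sqlite3.IntegrityError:
        return False  # another request already holds this key

won = try_reserve("key-1")
lost = try_reserve("key-1")
print(won, lost)  # True False
```

There is no separate lookup step to race against: the insert is both the check and the write, so only one request can win it.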

Storing the key after the side effect instead of before. If you charge the card and then save the key, a crash in between leaves you with a completed charge and no record that the key was used. The next retry sees a fresh key and charges again. Reserve the key first, do the work, then record the result against the reserved key.
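The ordering can be sketched as a small state machine. Everything here is hypothetical: `store` is an in-memory dict (a real system would reserve with an atomic insert), the `crash_after_charge` flag only simulates a crash, and returning 409 for an unfinished key is one possible policy, not the only one.

```python
store = {}         # key -> {"status": "in_progress" | "done", "response": ...}
card_charges = []  # stands in for the real card network

def handle_charge(key, amount_cents, crash_after_charge=False):
    record = store.get(key)
    if record is not None:
        if record["status"] == "done":
            return record["response"]
        # Reserved but unfinished: the outcome is unknown, so do not charge again.
        return {"status": 409, "error": "in progress or needs reconciliation"}
    store[key] = {"status": "in_progress", "response": None}  # 1. reserve first
    card_charges.append(amount_cents)                         # 2. side effect
    if crash_after_charge:
        raise RuntimeError("simulated crash before the result was recorded")
    response = {"status": 201, "charge_id": f"ch_{len(card_charges)}"}
    store[key] = {"status": "done", "response": response}     # 3. record result
    return response

# Simulate a crash between the charge and the recorded result, then retry.
try:
    handle_charge("key-1", 500, crash_after_charge=True)
except RuntimeError:
    pass
retry = handle_charge("key-1", 500)
```

Because the reservation survives the crash, the retry finds an in-progress record and refuses to charge again. Someone or something still has to reconcile that stuck key, but the card is charged once.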

Caching errors as if they were successes. If the first request fails with a transient 500, you generally do not want to replay that 500 forever. Many implementations persist the response only for successful or definitively failed operations (a declined card, a validation error) and let genuinely transient failures be retried. Decide this deliberately: it's the difference between a retry that recovers and one that's permanently stuck.
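A sketch of that decision, assuming the handler calls a hypothetical `finish` helper once it has a response. Which status codes count as transient is an assumption here and is a policy choice for each system.

```python
TRANSIENT_STATUSES = {500, 502, 503, 504}  # assumption: treated as retryable here
saved = {}  # idempotency key -> response kept for replay

def finish(key, response):
    """Save the response for replay unless the failure looks transient."""
    if response["status"] in TRANSIENT_STATUSES:
        saved.pop(key, None)  # release the key so a retry can run the work again
    else:
        saved[key] = response
    return response

finish("key-1", {"status": 503})
after_transient = "key-1" in saved
finish("key-1", {"status": 201, "charge_id": "ch_1"})
finish("key-2", {"status": 402, "error": "card_declined"})
```

Releasing the key is only safe when you know the side effect did not happen. If the charge may have gone through before the error, the key should stay reserved until the outcome is known.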

Reusing a key for a different request body. A robust server fingerprints the request payload alongside the key. If the same key arrives with a different body, that's a client bug, and returning the old response would be wrong. Stripe, for instance, rejects a key reused with mismatched parameters rather than silently replaying.
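A minimal way to fingerprint a payload is to hash a canonical serialization of it. In this sketch the `records` store, the `check_key` helper, and its three return labels are all hypothetical; the hashing uses only `hashlib` and `json` from the standard library.

```python
import hashlib
import json

records = {}  # key -> (fingerprint, response); hypothetical in-memory store

def fingerprint(body):
    # Canonical JSON so that key order does not change the hash.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def check_key(key, body):
    """Return ('new', None), ('replay', response), or ('mismatch', None)."""
    if key not in records:
        return ("new", None)
    saved_fp, response = records[key]
    if saved_fp != fingerprint(body):
        return ("mismatch", None)  # same key, different request: reject it
    return ("replay", response)

records["key-1"] = (fingerprint({"amount": 500, "currency": "usd"}), {"status": 201})
same = check_key("key-1", {"currency": "usd", "amount": 500})
different = check_key("key-1", {"amount": 900, "currency": "usd"})
```

The reordered body still replays, while the changed amount is flagged as a mismatch that the server can turn into an error response.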

Idempotency keys need a lifetime. Storing every key forever is a slow storage leak; expiring them too fast means a legitimate slow retry arrives after the record is gone and double-charges. A bounded window covers realistic retry timelines without unbounded growth. Stripe, for example, may prune keys once they are at least 24 hours old.
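Expiry can be sketched as a timestamp stored next to each response. The 24-hour `TTL_SECONDS`, the `records` dict, and the `save` and `lookup` helpers are assumptions for illustration; the `now` parameter exists only so the example can simulate the passage of time.

```python
import time

TTL_SECONDS = 24 * 60 * 60  # assumption: a 24-hour window
records = {}  # key -> (saved_at, response)

def save(key, response, now=None):
    records[key] = (time.time() if now is None else now, response)

def lookup(key, now=None):
    """Return the saved response, or None if the key is unknown or expired."""
    now = time.time() if now is None else now
    entry = records.get(key)
    if entry is None:
        return None
    saved_at, response = entry
    if now - saved_at >= TTL_SECONDS:
        del records[key]  # expired: drop it so the store does not grow forever
        return None
    return response

save("key-1", {"status": 201}, now=0)
fresh = lookup("key-1", now=3600)             # one hour later: still replayable
stale = lookup("key-1", now=TTL_SECONDS + 1)  # past the window: treated as new
```

A retry inside the window gets the stored response back; one that arrives after it is indistinguishable from a new request, which is why the window has to be longer than any realistic retry delay.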

If you're building this into a payment or order flow and want a second pair of eyes on the check-then-write race or the key lifetime, an AI pair-programmer that reads your whole repo can catch the non-atomic lookup before it ships.

The mental model worth keeping: a retry is not a new intention, it's the same intention asking again. Idempotency is how your server tells the difference. Get the key generated on the client, reserve it atomically before the side effect, and replay the stored result — and the retry that used to double-charge becomes a non-event.

Originally published at pickuma.com.