The cart timer expired while they were paying. Now what?

Here is a bug report I have received in some form at three different companies. The customer opens the payment widget, enters their card, and hits pay. While the bank is doing its 3-D Secure dance, the cart's countdown timer hits zero. The frontend, doing exactly what it was told, tears down the session. The payment succeeds a second later against a cart that no longer exists. Now there is money in the account and no order attached to it.

Every engineer who built that timer built it correctly. The countdown protects a limited resource: a concert seat, a hotel night, the last unit in stock. Holding it forever would let one abandoned tab starve every other customer. So you set a timer, and when it fires you release the hold. Clean, defensible, and completely blind to the fact that a real person had already committed their money.

That blindness is the actual subject here. The countdown-expiry bug is small. The habit that produces it is not.

The timer answers a question nobody asked at that moment

Step back and ask what the countdown is for. It exists to resolve contention: two people want the same seat, and the hold decides who gets first refusal. That is the only job. It is a fairness mechanism between customers who have not yet paid.

The moment a customer enters the payment flow, the question the timer answers stops being relevant. Contention is over. This person is not browsing, not hesitating, not sitting on a tab they forgot about. They are actively handing you money for the exact thing the hold was protecting. Firing the timer now does not serve any other customer, because no other customer can be served: the seat is about to be sold. It only serves the abstraction.

This is the tell. An engineer thinking in systems sees a timer that reached zero and a rule that says "release on zero". An engineer thinking about the product sees a person who did everything right and is about to be punished for the bank's latency. Same event, two completely different readings, and only one of them keeps the customer.

So, do you let them finish?

Yes. Almost always, yes. If the payment authorizes, honor it.

The reasoning is not sentimental. A successful authorization is the strongest possible signal of intent, far stronger than the "still holding" state the timer was guarding. Rejecting a payment you already accepted creates the worst outcome for everyone: you now have to refund, the customer sees a charge-then-refund cycle that reads as "this company is sketchy", and you have burned trust to enforce a rule that protected nobody. Forcing that refund by letting a timer kill a live payment is a self-inflicted wound.

The one case where you genuinely cannot honor it is true oversell: while this customer was in 3-D Secure, the last unit actually went to someone who completed faster. That is a real conflict, not a timer artifact. Handle it as an inventory failure with an immediate, automatic reversal and an honest message, not as a "your time ran out" error. The customer's experience should be "we were one second too slow and here is your money back instantly", never "you were too slow".
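As a minimal sketch of that decision, assuming the server runs an inventory check at the moment the authorization lands: the type names, field names, and message text below are hypothetical and do not come from any payment SDK. The point is only that the expired countdown is not an input to the outcome.

```typescript
// Hypothetical shapes for illustration; not taken from any real payment SDK.
interface AuthorizedPayment {
  cartTimerExpired: boolean;   // the countdown hit zero during authorization
  unitStillAvailable: boolean; // inventory check made when the authorization lands
}

type Outcome =
  | { kind: "confirm_order" }
  | { kind: "reverse_payment"; reason: "sold_out"; message: string };

function resolveAuthorizedPayment(payment: AuthorizedPayment): Outcome {
  // cartTimerExpired is deliberately ignored: an expired countdown alone
  // is never a reason to reject money the customer has already committed.
  if (payment.unitStillAvailable) {
    return { kind: "confirm_order" };
  }
  // True oversell: someone else completed first. Treat it as an inventory
  // failure and reverse immediately, without blaming the customer's speed.
  return {
    kind: "reverse_payment",
    reason: "sold_out",
    message: "We were one second too slow. Your money is on its way back.",
  };
}
```

A payment that authorizes after the timer fired still resolves to confirm_order as long as the unit is there. Only a genuine inventory conflict produces a reversal.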

The two follow-up questions

Once you accept that a live payment should win, two design questions follow, and they are where most implementations quietly go wrong.

Do you re-lock the offer right after they paid? No. This one trips people up because the instinct is to keep the state machine consistent: the hold expired, so on the way out we should re-acquire it. But re-locking something the customer already bought is nonsense. They own it now. Payment is the terminal state, not another step that needs its slot reserved. If your code path re-locks after capture, you have a state machine that does not know the transaction is over, and that will produce its own class of bugs (double-holds, phantom availability) down the line.
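One way to make "payment is the terminal state" explicit is a transition table in which the paid state has no outgoing edges. This is a sketch under my own naming: the states and events below are hypothetical, and a real system would likely carry more of them.

```typescript
// Hypothetical states and events for a single hold.
type HoldState = "held" | "paying" | "paid" | "released";
type HoldEvent =
  | "start_payment"
  | "payment_authorized"
  | "payment_failed"
  | "expired"
  | "relock";

// Each state lists only the events it reacts to. "paid" and "released"
// list none, so nothing can move a hold out of them.
const transitions: Record<HoldState, Partial<Record<HoldEvent, HoldState>>> = {
  held: { start_payment: "paying", expired: "released" },
  paying: { payment_authorized: "paid", payment_failed: "held", expired: "released" },
  paid: {},
  released: {},
};

// Unknown or disallowed events leave the state unchanged.
function next(state: HoldState, event: HoldEvent): HoldState {
  return transitions[state][event] ?? state;
}
```

With this shape, a stray relock or a late expired event arriving after payment is a no-op rather than a new hold on something the customer already owns.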

Do you run a separate timer for the final step? Yes, and this is the actual fix. The browsing countdown and the payment window are two different clocks measuring two different things. The cart timer manages contention while the customer decides. The moment they commit to paying, you switch to a payment grace window: a separate, more generous timer whose only job is to give the authorization time to resolve. Bank redirects, 3-D Secure, wallet confirmations, slow networks: a 30-second cart hold that felt urgent while browsing is absurd once someone is staring at their banking app's confirmation screen. Give the final step its own clock, sized for how long a real payment actually takes.
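The two clocks can be sketched as one hold record with two deadlines. Everything here is an assumption made for illustration: the field names, the helper functions, and especially the 10-minute grace value, which you should size from your own payment data.

```typescript
// Assumed value for illustration only; size it from real payment durations.
const PAYMENT_GRACE_MS = 10 * 60 * 1000;

// Hypothetical hold record. Timestamps are epoch milliseconds on the server.
interface Hold {
  cartExpiresAt: number;           // browsing clock: manages contention
  paymentStartedAt: number | null; // set once the customer commits to paying
}

// Switch clocks when the customer enters payment. Returns null if the
// cart hold had already lapsed before they committed.
function startPayment(hold: Hold, now: number): Hold | null {
  if (hold.paymentStartedAt !== null) return hold; // never restart the grace window
  if (now >= hold.cartExpiresAt) return null;
  return { ...hold, paymentStartedAt: now };
}

// While browsing, the cart deadline applies. Once paying, the grace
// window applies, and it never shortens what the customer already had.
function effectiveDeadline(hold: Hold): number {
  if (hold.paymentStartedAt === null) return hold.cartExpiresAt;
  return Math.max(hold.cartExpiresAt, hold.paymentStartedAt + PAYMENT_GRACE_MS);
}

function isReleasable(hold: Hold, now: number): boolean {
  return now >= effectiveDeadline(hold);
}
```

A customer who starts paying five seconds before the cart deadline is then measured against the grace window, not against the five seconds they had left.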

Make expiry server-authoritative

The bug at the top of this article only reaches production when the timer lives in the browser and the browser is allowed to decide the session is dead. A frontend countdown is a display. It should never be the thing that releases inventory.

Put the authority on the server. The hold has an expiry timestamp that the server owns. The frontend counts down for the human, but the decision to release is a server-side check that runs when it actually matters: when someone else tries to grab the seat, or when the payment webhook arrives. When that webhook lands, the server checks the payment against the hold and the inventory, then decides whether to confirm the order or reverse the charge. A visual timer hitting zero is not an event that should mutate anything. It is a hint to the user, nothing more.

That single architectural move (holds expire on the server, the frontend only displays) eliminates the entire category of bug. The countdown can hit zero and reset to "processing your payment" without ever touching the underlying reservation, because the reservation was never the frontend's to release.
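Here is a rough sketch of that split, using a hypothetical in-memory store in place of whatever database or cache actually holds your reservations. The class, method, and function names are mine. Expiry is checked lazily on the server when someone asks for the seat, and the frontend helper only turns the deadline into a number to display.

```typescript
// Hypothetical in-memory hold store; a real system would use a database or cache.
interface SeatHold {
  customerId: string;
  expiresAt: number; // epoch milliseconds, set and owned by the server
}

class HoldStore {
  private holds = new Map<string, SeatHold>();

  // Expiry is evaluated lazily, at the moment someone actually wants the seat.
  tryAcquire(seatId: string, customerId: string, now: number, ttlMs: number): boolean {
    const current = this.holds.get(seatId);
    if (current !== undefined && now < current.expiresAt) {
      // A live hold exists: only its owner "has" the seat, and asking
      // again does not extend the deadline.
      return current.customerId === customerId;
    }
    this.holds.set(seatId, { customerId, expiresAt: now + ttlMs });
    return true;
  }

  getExpiresAt(seatId: string): number | undefined {
    return this.holds.get(seatId)?.expiresAt;
  }
}

// Frontend side: a pure display helper. It reads the deadline and never
// mutates anything, so reaching zero has no effect on the reservation.
function secondsLeftForDisplay(expiresAt: number, now: number): number {
  return Math.max(0, Math.ceil((expiresAt - now) / 1000));
}
```

Nothing in the display path can call into the store, which is the whole design choice: the browser can be wrong about the time, closed, or throttled in a background tab, and the reservation is unaffected.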

The habit behind the bug

None of the above is hard. The reason it ships broken so often is not technical difficulty; it is that the timer was built as a pure systems problem. Resource, contention, TTL, release. Every line of that is correct, and the sum of it fails a customer, because the model never included the customer's experience of time. To the system, zero is zero. To the person, they were three-quarters of the way through paying.

This is the engineer-versus-product gap, and it does not show up in the happy path. It shows up in exactly these seams: what happens when two clocks disagree, when the network is slow, when the user does the right thing at the wrong instant. Those edge cases are where the product actually lives, and they are the first thing a systems-only mindset drops, because from inside the system they look like correctly handled states, not like a human being told "too late" for something they already did.

The fix is not "care more". It is to make one question part of the definition of done for anything user-facing: what does the person on the other end experience when this fires? Ask it of the timer and you invent the grace window on your own. Skip it and you ship a state machine that is right about everything except the one thing it was for.

The rule I follow now

Any timer, lock, or expiry that a user can be actively working against needs two things: a server that owns the real deadline, and an answer to "what happens to someone mid-action when this fires". A countdown that can cancel a payment already in flight has neither. Give the final step its own clock, keep the authority on the server, and let a committed customer finish. The abstraction does not need protecting. The person does.

Originally published on jguillaumesio.com.