DEV Community

Dario Mannu


Why exponential backoff is silly

So you learnt exponential backoff is the clever way to handle retries when calling an API.

Are you sure?

You have a large number of clients that are going to hit a server. So, in case a call fails, we let them retry, but not immediately, because that's a self-inflicted DDoS. Not every 2 seconds, because that's still a DoS every 2 seconds. Not just at exponentially growing intervals, because there are always new clients coming, so it's only the old ones that become less pushy over time. Not just at randomly paced intervals to spread the evil more evenly, etc...
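For reference, the schedule this line of reasoning usually ends up at (exponential growth plus random jitter) can be sketched as below. This is only a minimal illustration, not code from any particular library: `backoffDelay`, `baseMs`, `capMs` and their default values are hypothetical names and numbers picked for the example.

```typescript
// Hypothetical helper: delay in ms before retry number `attempt` (0-based).
// `baseMs` and `capMs` are illustrative defaults, not values from this post.
function backoffDelay(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 30000,
  random: () => number = Math.random,
): number {
  // Exponential growth: base, 2*base, 4*base, ... capped at capMs.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // "Full jitter": pick a random point below the ceiling, so clients that
  // failed together do not all come back together.
  return Math.floor(random() * ceiling);
}
```

Note that nothing in this function knows why the call failed. It only decides when to be pushy again.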


What if...

What if we had a 500 error? Does it make sense to retry at all?
If it does, we're admitting the endpoint we're calling is bad and we're not even taking steps to fix it. 🤡

What if we have a network issue? Is retrying at randomly spaced intervals a good idea? Why should the network come back after 2 or 5 seconds?
We're assuming the network is going to be back in a few seconds, but can we reasonably assume that? Do we really have no better means of solving the problem than to just keep trying? Imagine the user's browser: the network is down and 200 open tabs are retrying to send some data every couple of seconds. Have you checked your battery degradation level yet? 🪫

What if we're getting a 40x from the server? That's a server's polite way of saying "This is wrong, you idiot, don't ever try this way again!". If we retry, we're clearly demonstrating we're those idiots. If we retry with exponential backoff, we're just distinguishing ourselves as exponential idiots. 🪿

What if it fails with NXDOMAIN? Are we going to retry hoping the owner re-registers the domain for us, maybe within the next 2s? 🤌
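The cases above can be read as a small decision table. The sketch below is one possible encoding, under loud assumptions: the `Failure` and `NextStep` types and their labels are made up for illustration, and mapping a 5xx to "report" is a guess at the intent rather than something this post prescribes.

```typescript
// Hypothetical failure descriptions; the names are assumptions for this sketch.
type Failure =
  | { kind: "http"; status: number } // the server answered with an error status
  | { kind: "network" }              // the request never reached the server
  | { kind: "dns" };                 // e.g. NXDOMAIN

type NextStep =
  | "fix-request"      // 4xx: resending the same request is pointless
  | "wait-for-network" // offline: wait for a connectivity event
  | "report"           // 5xx (assumed): surface it, let the user decide
  | "give-up";         // nothing the client can do about it

// Only meant to be called for failed requests.
function nextStep(failure: Failure): NextStep {
  switch (failure.kind) {
    case "dns":
      return "give-up";
    case "network":
      return "wait-for-network";
    case "http":
      if (failure.status >= 400 && failure.status < 500) return "fix-request";
      return "report";
  }
}
```

None of the branches leads to "sleep and try again", which is the whole point.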

Have you ever noticed how many libraries have that incredibly convenient .retry(n) method, so universally baked-in that it now feels like a solved problem?
What does that make you think, actually? 🧠

Well, the problem can't be less serious than it looks, can it?

Alternatives?

The most sensible approach, intuitively, seems to be the event-driven approach: only retry when there's an event suggesting it makes sense, such as:

  • the user has resubmitted a form with different values
  • the browser says network is available
  • the user clicked "yes" on a popup saying "no network: retry?"
  • the app is closing, we give it a last chance...

If you think about it, there are lots of events you can hook into and leverage, which are better than retrying.
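As a rough sketch of the idea in plain promises (not the reactive-streams version), the helper below retries only when a named event fires. Everything here is an assumption made for illustration: `sendWhenItMakesSense` and `SignalSource` are hypothetical names, and "online" is just one of the events listed above.

```typescript
// Anything exposing addEventListener fits; in a browser, `window` does.
interface SignalSource {
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean },
  ): void;
}

// Hypothetical helper: call `send`, and if it rejects, stay idle until the
// next `retryEvent` before calling it again. No timers are involved.
function sendWhenItMakesSense<T>(
  send: () => Promise<T>,
  source: SignalSource,
  retryEvent: string = "online",
): Promise<T> {
  return new Promise<T>((resolve) => {
    const attempt = (): void => {
      send().then(resolve, () => {
        // Failure: do nothing until the environment signals a change.
        source.addEventListener(retryEvent, attempt, { once: true });
      });
    };
    attempt();
  });
}
```

In a browser this could be called as `sendWhenItMakesSense(() => fetch("/api/save", { method: "POST", body }), window)`, where `/api/save` and `body` are placeholders. Since `fetch` only rejects on network-level failures and resolves on HTTP error statuses, this particular wiring retries on connectivity problems only, and a 4xx or 5xx would still need its own handling.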

Here is an example or two that wraps everything in reactive streams, so that all the logic (conditional retries, raising popups to the user) is abstracted away, leaving the main code with just a single input-output stream to reference.

Check out the examples and leave a comment below if you find them useful.

Finally, if you like what you've seen here, please don't forget to leave a Star on GitHub ⭐ so we can continue evolving it for you!



Top comments (5)

Valery Zinchenko

Makes sense if you think about it. If something is wrong, why retry when the user has no intent to do so? Give them a retry button and hook into meaningful events for a suitable retry.

Dario Mannu

Definitely. We should encourage more and more people to move away from the magic retry(3) approach. End users will be grateful :)

Oscar

There's no reason not to try again though. The worst thing we can do is tell the user "nope, try again in a few hours!". Especially in today's extremely fast paced tech environment, where users expect things to load in less than a second.

Dario Mannu

So if you hit an HTTP 400, 401, 404, 410, etc, you're suggesting we still retry anyway and just tell the user to come back in a few hours?

Oscar

No, I'm saying that exponential backoff is reasonable in some places. Just from what you listed, exponential backoff on 404s, and maybe 400s and 401s, would be reasonable IMO.