Roman Voloboev

Biased: Fixed Window rate limiting algorithm explained

The Fixed Window rate limiting algorithm enforces a limit on the number of events allowed within a time window. "Maximum 5 password attempts in 10 minutes" is a classic rate limiting example.

The common misconception about Fixed Window is that people describe it as "allows N requests per fixed calendar window" and then warn about the "burst at boundary" problem.

In the rate-limiter-flexible Node.js package, a Flexible Fixed Window Algorithm is implemented. To be honest, I shouldn't call it "flexible," but I don't have much choice. Somehow the developer community has managed to portray the fixed window algorithm as if it were anchored to specific calendar time, as if it starts at 12:00 PM and ends at 12:10 PM. Not necessarily. For every unique client, IP address, or fingerprint, the window start time varies. It begins when the first request arrives, and the counter expires after a specified duration, e.g. 10 minutes. The next request after expiry opens a new window, so the window start is always variable.
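To make the per-client window start concrete, here is a minimal in-memory sketch. It is not the rate-limiter-flexible implementation or API: the `FixedWindowLimiter` class, its `consume` method, and the injectable `now` clock are hypothetical names chosen for illustration, and expired entries are never cleaned up.

```javascript
// Sketch only: a fixed window that opens on each key's first request.
// FixedWindowLimiter and its fields are hypothetical, not a library API.
class FixedWindowLimiter {
  constructor({ points, durationMs, now = () => Date.now() }) {
    this.points = points;         // max events allowed per window
    this.durationMs = durationMs; // window length in milliseconds
    this.now = now;               // injectable clock, useful for tests
    this.windows = new Map();     // key -> { count, expiresAt }
  }

  consume(key) {
    const t = this.now();
    let w = this.windows.get(key);
    // No window yet, or the previous one expired: this request opens a new one.
    if (!w || t >= w.expiresAt) {
      w = { count: 0, expiresAt: t + this.durationMs };
      this.windows.set(key, w);
    }
    w.count += 1; // rejected requests are counted too in this sketch
    return {
      allowed: w.count <= this.points,
      remaining: Math.max(this.points - w.count, 0),
      msBeforeNext: w.expiresAt - t, // time until this key's window expires
    };
  }
}

// Usage: two clients, two independent windows.
let clock = 0; // fake time in milliseconds
const limiter = new FixedWindowLimiter({
  points: 5,
  durationMs: 10 * 60 * 1000,
  now: () => clock,
});

clock = 0;
const a = limiter.consume("client-A"); // A's window runs from 0:00 to 10:00
clock = 3 * 60 * 1000;
const b = limiter.consume("client-B"); // B's window runs from 3:00 to 13:00
```

Neither window is tied to a calendar boundary: each one is anchored to the moment its own client first showed up.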

Many articles describe the Fixed Window Algorithm incorrectly. That's why I deliberately call it "flexible."

The false belief that gets copied over and over, from one article to another, is that Fixed Window leads to bursts at the boundary of windows. Let's take a closer look.

If every unique client has its own window start time, then boundaries don't matter. In many services, traffic spikes are expected to be allowed. Here's the deeper explanation:

  1. When you use rate-limited services, like AI agents, do you mind spending your daily allowed tokens at the end of the day and then continuing to work after midnight, consuming tokens from the next day? Of course not. It's natural to expect that once a limiter resets, tokens become available. Nobody calls this a boundary problem. On the contrary, you'd be frustrated if that spike on the window boundary weren't allowed.
  2. Creating a boundary spike is statistically less probable than it seems. To pull it off with a limit of 5 attempts per 10 minutes, a client would have to try 1 password, wait 9 minutes and 59 seconds, try another 4, and then try 5 more the moment the window resets. The probability of that event is quite low. That's the magic of a flexible window start.
  3. Statistically, different clients send requests at different times. They are not in perfect sync. Even if one client manages to create a spike on a window boundary, it isn't an issue: other clients follow different patterns, causing overall requests to scatter across time. The fixed time frame starts at different times for users A, B, and C.
  4. Sure, a malicious user could control 100 accounts and coordinate attacks on boundaries. But does Token Bucket protect against that? No. The same attacker can simply wait for the bucket to refill and then unleash a burst. Speaking of bursts, before Token Bucket was introduced in the 1980s, Leaky Bucket was the primary pattern for limiting traffic. Its problem was that it didn't allow traffic bursts at all. Token Bucket does. And that's never mentioned as an issue. It's a feature.
  5. There is one case you should be careful of. If you're limiting requests because of infrastructure constraints and traffic spikes could degrade performance, create two limiters: one for unique clients and one for total traffic per second. This approach keeps your application running under pressure, with some users' experience degraded rather than everyone's. You don't have many options here. You either make the user experience worse by disallowing traffic spikes entirely, or you mitigate the consequences of allowing them. No rate limiting algorithm can win the fight between your infrastructure limitations — limited budget, in fact — and an overwhelming volume of malicious requests. To manage spikes even more effectively, take a look at BurstyRateLimiter. It allows spontaneous traffic bursts with finer control.
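The timing from point 2 can be traced with a small simulation. This is a sketch under assumptions: `replay` is a hypothetical helper that models one client's window opening on its first attempt, using the "5 attempts in 10 minutes" limit from the example.

```javascript
// Sketch only: replay one client's attempt timestamps (in ms) against a
// fixed window that opens on the first attempt. `replay` is hypothetical.
function replay(attemptsMs, points, durationMs) {
  let count = 0;
  let expiresAt = -Infinity; // no window open yet
  return attemptsMs.map((t) => {
    if (t >= expiresAt) {
      // First attempt ever, or the previous window expired: open a new one.
      count = 0;
      expiresAt = t + durationMs;
    }
    count += 1;
    return { t, allowed: count <= points };
  });
}

const countAllowed = (results) => results.filter((r) => r.allowed).length;

const SEC = 1000;
const MIN = 60 * SEC;
const WINDOW = 10 * MIN;

// Perfect timing: 1 attempt, 4 more at 9:59, 5 more exactly at the reset.
const precise = [0, ...Array(4).fill(9 * MIN + 59 * SEC), ...Array(5).fill(10 * MIN)];
const preciseAllowed = countAllowed(replay(precise, 5, WINDOW)); // 10

// One second too early: everything lands in the same window.
const early = [0, ...Array(4).fill(9 * MIN + 58 * SEC), ...Array(5).fill(9 * MIN + 59 * SEC)];
const earlyAllowed = countAllowed(replay(early, 5, WINDOW)); // 5

// No timing at all: 10 attempts back to back.
const naiveAllowed = countAllowed(replay(Array(10).fill(0), 5, WINDOW)); // 5
```

Only the precisely timed sequence squeezes 9 attempts into about one second. Any client that does not know, or misses, its own window start gets the plain limit of 5.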

After years of studying and applying different rate limiting algorithms, I've found that any algorithm can be adapted for specific needs. What I value about the Flexible Fixed Window Algorithm is that it provides clear control over application behavior and the ability to build custom solutions across multiple dimensions of traffic using two or more combined limiters. And it is always predictable in terms of performance.
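As an illustration of combining limiters, here is a sketch of the pairing from point 5: one limiter per unique client and one for total traffic per second. The `createLimiter` factory, `allowRequest`, and the limit of 100 requests per second are assumptions made up for this example, not part of any library.

```javascript
// Sketch only: a keyed fixed window counter. `createLimiter` is hypothetical.
function createLimiter(points, durationMs) {
  const windows = new Map(); // key -> { count, expiresAt }
  return {
    // Counts the event and returns true if the key is still under its limit.
    tryConsume(key, nowMs) {
      let w = windows.get(key);
      if (!w || nowMs >= w.expiresAt) {
        w = { count: 0, expiresAt: nowMs + durationMs };
        windows.set(key, w);
      }
      if (w.count >= points) return false;
      w.count += 1;
      return true;
    },
  };
}

const perClient = createLimiter(5, 10 * 60 * 1000); // 5 per client per 10 minutes
const total = createLimiter(100, 1000);             // assumed cap: 100 per second overall

function allowRequest(clientId, nowMs = Date.now()) {
  // Per-client check first, so an already blocked client
  // does not consume the shared budget.
  if (!perClient.tryConsume(clientId, nowMs)) return "client-limit";
  // Shared check second: protects the infrastructure from the sum of all clients.
  if (!total.tryConsume("all", nowMs)) return "total-limit";
  return "ok";
}
```

One trade-off in this ordering: a request rejected by the total limiter has already spent one of the client's points. Whether to refund it depends on how strict the per-client limit needs to be.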

Never forget to question the basics. Take control over the information you consume.
Happy coding!
