Making Rate Limiting Correct Under Concurrency
Most rate limiting tutorials stop at the single-instance case.
That’s fine for learning, but it breaks quickly in production.
Once you have multiple instances and real traffic patterns, the problem changes.
It’s no longer just about picking an algorithm — it’s about correctness under concurrency.
This article walks through what actually goes wrong and how to fix it.
The In-Memory Trap
The first implementation most people write looks like this:
- keep a counter in memory
- increment on each request
- reject when the limit is reached
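The steps above can be sketched in a few lines of TypeScript. The names and limit values here are illustrative, not from any particular library:

```typescript
// A naive fixed-window limiter: one counter per client, held in process memory.
const LIMIT = 100;        // max requests per window (illustrative value)
const WINDOW_MS = 60_000; // window length in milliseconds

type Window = { count: number; resetAt: number };
const counters = new Map<string, Window>();

function allow(clientId: string, now = Date.now()): boolean {
  const w = counters.get(clientId);
  if (!w || now >= w.resetAt) {
    // start a fresh window for this client
    counters.set(clientId, { count: 1, resetAt: now + WINDOW_MS });
    return true;
  }
  if (w.count >= LIMIT) return false; // limit reached: reject
  w.count++;
  return true;
}
```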
This works perfectly in a single instance.
Now deploy two instances.
Each instance has its own counter. A client can exceed your intended limit just by hitting different instances.
At that point, you don’t have a rate limiter anymore.
You have a suggestion.
Redis Fixes Distribution, Not Concurrency
The next step is moving state to Redis.
Now all instances share the same counters. Good.
A typical implementation looks like this:
- Read current count from Redis
- Check against limit
- Increment and write back
This seems correct, but it isn’t.
These are separate operations. Under concurrent load:
- two requests read the same value
- both pass the check
- both increment
Now your limit is no longer strict. It’s approximate.
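This lost update is easy to reproduce without a real Redis. The sketch below fakes the network round trips with an async in-memory store (the class and key names are illustrative) and fires two concurrent requests against a limit of 1:

```typescript
// Fake remote store: each call is a separate async round trip,
// like an individual Redis command over the network.
class FakeStore {
  private data = new Map<string, number>();
  async get(key: string): Promise<number> {
    await new Promise((r) => setTimeout(r, 1)); // simulated latency
    return this.data.get(key) ?? 0;
  }
  async incr(key: string): Promise<number> {
    await new Promise((r) => setTimeout(r, 1));
    const next = (this.data.get(key) ?? 0) + 1;
    this.data.set(key, next);
    return next;
  }
}

// Read, check, then write: the decision spans several round trips.
async function allowRacy(store: FakeStore, key: string, limit: number): Promise<boolean> {
  const current = await store.get(key); // step 1: read
  if (current >= limit) return false;   // step 2: check
  await store.incr(key);                // step 3: write
  return true;
}

async function demoRace(): Promise<boolean[]> {
  const store = new FakeStore();
  // Two concurrent requests, limit of 1: both read 0 before either writes.
  return Promise.all([
    allowRacy(store, "client:42", 1),
    allowRacy(store, "client:42", 1),
  ]);
}
```

Both requests read the counter before either increment lands, so both pass the check and the "limit of 1" admits two requests.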
The Real Problem: Atomicity
The issue isn’t Redis.
It’s that the decision is split across multiple steps.
What you need is:
a single, atomic operation that reads state, applies logic, and updates state
The Fix: Lua Scripts in Redis
Redis supports Lua scripts that execute atomically.
No other command runs between the start and end of the script.
Instead of multiple round trips:
- read state
- apply limiter logic
- update state
- return decision
You do everything inside one script.
Example (simplified):
```lua
-- KEYS[1] = counter key, ARGV[1] = limit, ARGV[2] = window in seconds
local current = tonumber(redis.call("GET", KEYS[1])) or 0
if current >= tonumber(ARGV[1]) then
  return {0, current}
end
current = redis.call("INCR", KEYS[1])
if current == 1 then
  -- set the TTL only on the first request in the window;
  -- calling EXPIRE on every hit would push the window forward forever
  redis.call("EXPIRE", KEYS[1], ARGV[2])
end
return {1, current}
```
This ensures:
- no race conditions
- consistent decisions across instances
- predictable behavior under load
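To see the effect in miniature without a Redis server, here is an in-process analogue: the entire read-check-write decision runs as one uninterruptible synchronous method, standing in for one Lua script. The names are illustrative:

```typescript
// In-process analogue of the atomic script: the whole decision is one
// uninterruptible step, so concurrent callers cannot interleave inside it.
class AtomicCounter {
  private counts = new Map<string, number>();

  // Read, check, and increment in a single synchronous call -- the moral
  // equivalent of running the limiter logic inside one Redis Lua script.
  checkAndIncr(key: string, limit: number): boolean {
    const current = this.counts.get(key) ?? 0;
    if (current >= limit) return false;
    this.counts.set(key, current + 1);
    return true;
  }
}

async function demoAtomic(): Promise<boolean[]> {
  const counter = new AtomicCounter();
  // Two "concurrent" requests against a limit of 1.
  return Promise.all([
    Promise.resolve().then(() => counter.checkAndIncr("client:42", 1)),
    Promise.resolve().then(() => counter.checkAndIncr("client:42", 1)),
  ]);
}
```

With the decision made atomically, exactly one of the two requests is admitted, no matter how the callers interleave.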
Where Algorithms Fit In
At this point, you can plug in different strategies:
- Token Bucket → allows bursts, smooths over time
- Sliding Window → more accurate but heavier
- Leaky Bucket → enforces steady flow
But here’s the key point:
The algorithm matters less than where the decision happens.
If your logic isn’t atomic, the algorithm won’t save you.
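As one concrete example, a token bucket fits the same atomic shape. This TypeScript sketch runs in-process with an injected clock for clarity; the capacity and rate are illustrative:

```typescript
// Token bucket: the bucket refills at `ratePerSec` tokens per second up to
// `capacity`; each request spends one token. Bursts up to `capacity` pass,
// then traffic is smoothed to the refill rate.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private capacity: number,
    private ratePerSec: number,
    now = 0, // timestamp in ms; injected so the logic is easy to test
  ) {
    this.tokens = capacity; // start full
    this.last = now;
  }

  take(now: number): boolean {
    // Refill based on elapsed time, capped at capacity.
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
    this.last = now;
    if (this.tokens < 1) return false; // bucket empty: reject
    this.tokens -= 1;
    return true;
  }
}
```

In Redis, the same state (token count plus last-refill timestamp) would live in a hash, with the refill-and-spend step inside one Lua script so it stays atomic.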
Static Limits Miss Real Traffic Behavior
Even with correct enforcement, static limits are too rigid.
Real traffic looks like:
- legitimate bursts
- scrapers probing endpoints
- repeated identical requests
- denial loops
A fixed limit treats all of these the same.
Adding a Behavior Layer
A simple improvement is to track short-term behavior:
- request volume over a short window (burst detection)
- repeated request fingerprints
- number of unique routes hit (scan detection)
- repeated denials
This produces a basic risk score.
That score maps to tiers:
- normal
- elevated
- suspicious
- blocked
The important part is separation:
- Limiter → enforces limits
- Policy → decides how strict to be
This keeps the system easier to reason about and tune.
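A minimal sketch of that separation, assuming hypothetical signal names, weights, and thresholds (none of these values are from a real deployment):

```typescript
type Tier = "normal" | "elevated" | "suspicious" | "blocked";

// Short-window behavior signals for one client.
// The fields and weights below are illustrative assumptions.
interface Signals {
  burstRequests: number;        // requests in the last few seconds
  repeatedFingerprints: number; // identical request fingerprints seen
  uniqueRoutes: number;         // distinct endpoints hit (scan detection)
  recentDenials: number;        // how often the limiter already said no
}

function riskScore(s: Signals): number {
  let score = 0;
  if (s.burstRequests > 50) score += 2;
  if (s.repeatedFingerprints > 10) score += 2;
  if (s.uniqueRoutes > 20) score += 3;
  score += Math.min(3, s.recentDenials); // repeated denials escalate
  return score;
}

function tierFor(score: number): Tier {
  if (score >= 8) return "blocked";
  if (score >= 5) return "suspicious";
  if (score >= 3) return "elevated";
  return "normal";
}

// The policy only chooses how strict to be; the limiter still
// enforces the resulting limit atomically.
function limitFor(baseLimit: number, tier: Tier): number {
  const factor = { normal: 1, elevated: 0.5, suspicious: 0.2, blocked: 0 }[tier];
  return Math.floor(baseLimit * factor);
}
```

The policy output is just a number fed into the limiter, so the two layers can be tuned and tested independently.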
Tradeoffs
This approach is not free.
- Lua scripts add complexity
- debugging moves closer to Redis
- Redis becomes a critical dependency
But for systems that need consistency under concurrency, the tradeoff is worth it.
Key Takeaway
The biggest lesson is not about token buckets or sliding windows.
It’s this:
Correctness in rate limiting comes from atomic decision-making.
Once you ensure:
- a single source of truth
- atomic execution
- consistent state across instances
the rest becomes much easier.
Closing
I built this approach into a small system to explore the problem end-to-end.
If you’re interested in seeing a full implementation (TypeScript + Redis + Lua), you can check it out here:
👉 https://github.com/debjit450/arce
If you’ve dealt with this problem in production, I’d be interested to hear how you approached it.