Building a Distributed Rate Limiter for FastAPI with Redis
Every API eventually runs into the same problem.
A bot, scraper, or even a buggy client suddenly starts sending thousands of requests per second. When that happens, your server slows down, your database struggles, and real users start seeing errors.
This is exactly the kind of situation rate limiting is meant to prevent.
Recently I built RateGuard, a small Python library that adds distributed rate limiting to FastAPI using Redis. In this post I want to walk through how it works and the design decisions behind it.
Why I Built RateGuard
While working with FastAPI, I looked at a few rate limiting libraries. Most of them had at least one issue.
Some only support in-memory limits, which means they break once your API runs on multiple servers.
Others work in distributed setups but require more infrastructure than I wanted.
So I decided to build something simple with a few goals in mind:
- easy to plug into FastAPI
- works across multiple servers
- accurate under heavy traffic
- simple enough to understand and maintain
That became RateGuard.
What Is Rate Limiting?
Rate limiting controls how many requests a user can send to an API within a certain time period.
For example:
- a user can send 10 requests per minute
- after the limit is reached they receive a `429 Too Many Requests` response
- after the time window passes the limit resets
A simple analogy is a coffee shop rule:
One free coffee per customer per hour.
The barista keeps track of who got a coffee and when. If you come back too soon, you have to wait.
Why Not Just Use a Counter?
A very common approach is to use a simple counter that resets every minute.
The problem is that this method can be abused.
Imagine your limit is 10 requests per minute and the counter resets at exactly 12:00:00.
A user could do this:
- send 10 requests at 11:59:55
- the counter resets at 12:00:00
- send another 10 requests at 12:00:05
That ends up being 20 requests in about 10 seconds, even though the limit is supposed to be 10 per minute.
This issue is called the fixed window problem.
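To make the problem concrete, here is a toy in-memory fixed-window counter (an illustration only, not RateGuard code) replaying exactly the burst described above with simulated timestamps:

```python
class FixedWindowLimiter:
    """Naive fixed-window counter: resets at clock-aligned boundaries."""

    def __init__(self, limit: int, window: int):
        self.limit = limit
        self.window = window
        self.current_window = None
        self.count = 0

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window)
        if window_id != self.current_window:
            self.current_window = window_id
            self.count = 0  # counter resets at the window boundary
        if self.count < self.limit:
            self.count += 1
            return True
        return False

limiter = FixedWindowLimiter(limit=10, window=60)

# 10 requests at 11:59:55 and 10 more at 12:00:05 (times in seconds)
burst = [719995] * 10 + [720005] * 10
allowed = sum(limiter.allow(t) for t in burst)
print(allowed)  # 20 — every request gets through, despite "10 per minute"
```

Because the two bursts land in different clock-aligned windows, the counter resets between them and all 20 requests succeed.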
The Sliding Window Approach
To avoid this problem, RateGuard uses the sliding window algorithm.
Instead of resetting at fixed times, it always looks back a certain number of seconds from the current request.
The logic looks like this:
- a request arrives at time `T`
- look at all requests between `T - window` and `T`
- remove anything older than the window
- count the remaining requests
- if the count is below the limit, allow the request
- otherwise return a `429` response
Going back to the coffee shop example, instead of resetting every hour on the clock, the barista asks:
Did this person get a coffee in the last 60 minutes?
The time window moves forward with every request.
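The same steps can be sketched as a single-process, in-memory sliding window log (a simplified model, not RateGuard's Redis-backed implementation). Replaying the burst from earlier shows the difference:

```python
from collections import deque

class SlidingWindowLimiter:
    """Sliding window log: always looks back `window` seconds from now."""

    def __init__(self, limit: int, window: int):
        self.limit = limit
        self.window = window
        self.timestamps = deque()  # timestamps of allowed requests

    def allow(self, now: float) -> bool:
        # drop anything older than the window ending at `now`
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(limit=10, window=60)
first = [limiter.allow(719995) for _ in range(10)]   # 10 at 11:59:55
second = [limiter.allow(720005) for _ in range(10)]  # 10 at 12:00:05
print(sum(first), sum(second))  # 10 0 — the second burst is blocked
```

At 12:00:05 the window still covers the ten requests from 11:59:55, so the second burst is rejected. This is the behavior the fixed window failed to provide.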
Basic Architecture
RateGuard sits between incoming requests and your FastAPI application.
Client Request
|
v
FastAPI Server
|
v
RateGuard Middleware
|
v
Redis Sorted Set
|
v
Allow or Block Request
Redis stores request timestamps so every server in the system can see them.
Why Redis?
Redis is a good fit for rate limiting for two main reasons.
Speed
Rate limiting runs on every request, so it has to be fast. Redis is an in-memory data store and can handle a huge number of operations per second.
Shared state
If your API runs on several servers, each one needs to know how many requests have already been made. Redis works as a shared store that all servers can read from and write to.
Using Redis Sorted Sets
RateGuard stores request data inside a Redis Sorted Set.
A sorted set stores values with a score. The score determines the order.
In this case:
- the score is the request timestamp
- the value is a unique request ID
Example data:
Key: ratelimit:192.168.1.1
1709856060000 -> req_abc123
1709856080000 -> req_def456
1709856100000 -> req_xyz789
For each request, RateGuard:
- removes entries older than the time window
- counts the remaining entries
- decides whether to allow or block the request
- records the new request
This approach works well even when multiple servers are handling traffic.
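The four per-request steps above map onto four Redis commands, sketched here in redis-cli style (the key name and placeholder values are illustrative):

```
ZREMRANGEBYSCORE ratelimit:192.168.1.1 0 <now_ms - window_ms>   # drop old entries
ZCARD            ratelimit:192.168.1.1                          # count what's left
ZADD             ratelimit:192.168.1.1 <now_ms> req_<uuid>      # record this request
EXPIRE           ratelimit:192.168.1.1 60                       # clean up idle keys
```

The `EXPIRE` step is not part of the algorithm itself; it just keeps keys for inactive clients from accumulating forever.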
Installing RateGuard
pip install rate-guardian
You will also need a Redis instance. I used Upstash Redis, which has a generous free tier.
Quick Example
import os
from fastapi import FastAPI, Request
from rateguard import RateGuard, RateLimitMiddleware, rate_limit

app = FastAPI()

limiter = RateGuard(
    redis_url=os.environ["UPSTASH_REDIS_REST_URL"],
    redis_token=os.environ["UPSTASH_REDIS_REST_TOKEN"],
)

app.add_middleware(
    RateLimitMiddleware,
    limiter=limiter,
    limit=10,
    window=60,
)

@app.get("/")
async def home():
    return {"message": "API is protected by RateGuard"}
After adding the middleware, every endpoint is automatically protected.
Per Route Limits
Sometimes you want stricter limits for certain endpoints.
For example, a search endpoint that queries a database.
@app.get("/search")
@rate_limit(limiter, limit=5, window=60)
async def search(request: Request, q: str = ""):
    return {"query": q, "results": []}
Now `/search` allows only 5 requests per minute.
Response Headers
RateGuard includes useful headers in responses.
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum allowed requests |
| X-RateLimit-Remaining | Requests left |
| X-RateLimit-Reset | Seconds until reset |
| Retry-After | Only present on 429 responses |
These help clients know when to slow down.
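For example, a well-behaved client could use the headers to decide how long to pause before the next request. The helper below is a hypothetical sketch of mine, not part of RateGuard; it assumes the headers arrive as a plain dict:

```python
def backoff_seconds(status_code: int, headers: dict) -> int:
    """Decide how long a client should wait before the next request."""
    if status_code == 429:
        # Retry-After is only present on 429 responses
        return int(headers.get("Retry-After", 1))
    if int(headers.get("X-RateLimit-Remaining", 1)) == 0:
        # budget exhausted: wait until the window resets
        return int(headers.get("X-RateLimit-Reset", 1))
    return 0  # budget left, no need to wait

print(backoff_seconds(429, {"Retry-After": "30"}))               # 30
print(backoff_seconds(200, {"X-RateLimit-Remaining": "0",
                            "X-RateLimit-Reset": "12"}))         # 12
print(backoff_seconds(200, {"X-RateLimit-Remaining": "5"}))      # 0
```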
What Happens If Redis Fails?
One design decision I made was to fail open.
If Redis is temporarily unavailable, requests are allowed instead of blocked.
For most APIs, blocking all traffic because Redis is down would be worse than briefly running without rate limiting.
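The fail-open pattern itself is simple: wrap the Redis check and treat any error as "allow". Here is a minimal sketch of the idea (not RateGuard's actual internals; `limiter_call` stands in for whatever function performs the Redis check):

```python
import logging

def check_limit(limiter_call, *args) -> bool:
    """Fail open: if the rate-limit check raises, allow the request."""
    try:
        return limiter_call(*args)
    except Exception:
        logging.warning("rate limiter unavailable, failing open")
        return True  # allow traffic rather than block everyone

def broken_backend(key):
    # stand-in for a Redis check while Redis is down
    raise ConnectionError("Redis unreachable")

print(check_limit(broken_backend, "ratelimit:1.2.3.4"))  # True
```

The trade-off is that an attacker who can take down Redis also disables rate limiting, which is why fail open should be a deliberate choice rather than a default for every API.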
Core Logic Example
Here is a simplified version of the main logic:
import time
import uuid

def is_allowed(self, key: str, limit: int, window: int) -> bool:
    now = int(time.time() * 1000)             # current time in milliseconds
    oldest = now - (window * 1000)            # start of the sliding window

    pipe = self.redis.pipeline()
    pipe.zremrangebyscore(key, 0, oldest)     # drop entries outside the window
    pipe.zcard(key)                           # count what's left
    pipe.zadd(key, {str(uuid.uuid4()): now})  # record this request
    pipe.expire(key, window)                  # let idle keys expire on their own
    results = pipe.exec()

    count = results[1]                        # result of the ZCARD call
    return count < limit
Pipelining sends all four commands to Redis in a single round trip, which keeps per-request overhead low. Strictly speaking, a pipeline alone is not atomic: under very heavy concurrency, two servers could interleave their check-and-record steps. A MULTI/EXEC transaction or a Lua script would be the next step if exact guarantees are needed.
What's Next
A few improvements I plan to add:
- support for standard Redis deployments
- rate limiting by user ID
- an optional token bucket algorithm
- better metrics and monitoring
Try It Out
pip install rate-guardian
GitHub: https://github.com/Jpeg-create/rate-guard
PyPI: https://pypi.org/project/rate-guardian/
If you find it useful, a ⭐ on GitHub means a lot.