What happens if one user sends 10,000 requests per second to your API?
Your system crashes.
Unless you have a rate limiter.
What is Rate Limiting?
Rate limiting controls how many requests a user can make in a given time.
Example:
100 requests per minute per user
If the limit is exceeded:
- excess requests are rejected (typically with an HTTP 429 response)
- or delayed until capacity frees up
Why It Matters
Without rate limiting:
- APIs get overloaded
- systems crash under traffic spikes
- abuse (spam, brute-force attacks) increases
Common Approaches
1. Fixed Window
Limit requests in a fixed time window.
Example:
100 requests per minute
Problem:
Burst traffic at window edges: a client can send the full limit at the end of one window and the full limit again at the start of the next, doubling the effective rate for a brief period.
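A minimal in-memory sketch of a fixed window counter (class and parameter names are illustrative, not from the post):

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Counts requests per user in fixed, non-overlapping time windows."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = defaultdict(int)  # (user, window_id) -> count

    def allow(self, user, now=None):
        now = time.time() if now is None else now
        window_id = int(now // self.window)  # which window this request falls in
        key = (user, window_id)
        if self.counters[key] >= self.limit:
            return False  # window quota exhausted
        self.counters[key] += 1
        return True

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
print([limiter.allow("alice", now=t) for t in (0, 1, 2, 3)])
# [True, True, True, False] -- 4th request in the same window is rejected
print(limiter.allow("alice", now=61))
# True -- a new window starts, which is also why edge bursts slip through
```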
2. Sliding Window
Tracks requests over a rolling time window.
More accurate than the fixed window
Slightly more complex to implement
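One common variant is the sliding window log, which stores the timestamp of each request. A small sketch (names are illustrative):

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Keeps a per-user log of request timestamps over a rolling window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.log = defaultdict(deque)  # user -> timestamps of recent requests

    def allow(self, user, now=None):
        now = time.time() if now is None else now
        q = self.log[user]
        # evict timestamps that have fallen out of the rolling window
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True

limiter = SlidingWindowLimiter(limit=2, window_seconds=10)
print(limiter.allow("bob", now=0))   # True
print(limiter.allow("bob", now=1))   # True
print(limiter.allow("bob", now=5))   # False: 2 requests in the last 10s
print(limiter.allow("bob", now=11))  # True: both earlier requests expired
```

The extra complexity is the memory cost: one timestamp per recent request, instead of one counter per window.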
3. Token Bucket
Tokens are added at a fixed rate.
Each request consumes a token.
If no tokens:
request is rejected
Good for allowing short bursts while capping the average rate
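The refill-and-spend logic above can be sketched in a few lines (a lazy refill computed on each request; names and parameters are illustrative):

```python
class TokenBucket:
    """Tokens refill at `rate` per second up to `capacity`; each request costs one."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full, so an initial burst is allowed
        self.last = now

    def allow(self, now):
        # refill based on elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=2, now=0.0)
print(bucket.allow(now=0.0))  # True: burst of up to `capacity` is allowed
print(bucket.allow(now=0.0))  # True
print(bucket.allow(now=0.0))  # False: bucket is empty
print(bucket.allow(now=1.5))  # True: 1.5 tokens refilled in the meantime
```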
4. Leaky Bucket
Requests are processed at a constant rate.
Extra requests are queued or dropped.
Good for smoothing traffic
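A sketch of the leaky bucket as a counter: instead of holding a real queue, it tracks the current queue length, which drains at a constant rate (an assumption of this sketch; a real implementation might queue the requests themselves):

```python
class LeakyBucket:
    """Tracks queued requests as a level that drains at a constant rate."""

    def __init__(self, drain_rate, capacity):
        self.drain_rate = drain_rate  # requests processed per second
        self.capacity = capacity      # max requests waiting in the bucket
        self.level = 0.0
        self.last = 0.0

    def allow(self, now):
        # leak: the bucket drains steadily between arrivals
        self.level = max(0.0, self.level - (now - self.last) * self.drain_rate)
        self.last = now
        if self.level + 1 > self.capacity:
            return False  # bucket full: drop the request
        self.level += 1
        return True

bucket = LeakyBucket(drain_rate=1.0, capacity=2)
print(bucket.allow(0))  # True
print(bucket.allow(0))  # True
print(bucket.allow(0))  # False: bucket full, request dropped
print(bucket.allow(1))  # True: one request drained in the last second
```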
How It’s Implemented
Typical setup:
- API Gateway or middleware
- Redis for storing counters
- Key = user/IP
- Value = request count
Redis is used because:
- fast
- supports atomic operations (INCR, EXPIRE)
- works well at scale
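The Redis pattern above is usually INCR on a per-user key plus EXPIRE to start the window. A sketch, using a minimal in-memory stand-in for the two Redis commands so it runs without a server (the `FakeRedis` class and `rate:{user}` key format are illustrative; with redis-py, `redis.Redis()` exposes the same `incr`/`expire` calls):

```python
class FakeRedis:
    """In-memory stand-in for the two Redis commands this sketch needs."""

    def __init__(self):
        self.data = {}     # key -> count
        self.expiry = {}   # key -> expiry timestamp
        self.clock = 0.0   # manually advanced, in place of real time

    def _purge(self, key):
        if key in self.expiry and self.clock >= self.expiry[key]:
            self.data.pop(key, None)
            self.expiry.pop(key, None)

    def incr(self, key):
        self._purge(key)
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

    def expire(self, key, seconds):
        self.expiry[key] = self.clock + seconds

def allow(r, user, limit, window):
    key = f"rate:{user}"
    count = r.incr(key)       # atomic in real Redis
    if count == 1:
        r.expire(key, window) # first request starts the window
    return count <= limit

r = FakeRedis()
print(allow(r, "carol", limit=2, window=60))  # True
print(allow(r, "carol", limit=2, window=60))  # True
print(allow(r, "carol", limit=2, window=60))  # False: over the limit
r.clock = 61  # a minute later the key has expired
print(allow(r, "carol", limit=2, window=60))  # True again
```

In production the INCR/EXPIRE pair is typically wrapped in a pipeline or a Lua script, so a crash between the two calls cannot leave a counter with no TTL.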
Trade-offs
- accuracy vs performance
- memory usage
- handling bursts
No single approach is perfect.
Where It’s Used
- login attempts
- public APIs
- payment systems
- search endpoints
Rate limiting is not just about blocking users.
It’s about protecting your system from overload.
A small feature that prevents big failures.
