DEV Community

Omja sharma

Rate Limiter Explained: Everything You Need in 5 Minutes

What happens if one user sends 10,000 requests per second to your API?

Your system crashes.
Unless you have a rate limiter.


What is Rate Limiting?

Rate limiting controls how many requests a user can make in a given time.

Example:
100 requests per minute per user
If the limit is exceeded:

  • extra requests are rejected (typically with HTTP 429 Too Many Requests)
  • or delayed until capacity frees up

Why It Matters

Without rate limiting:

  • APIs get overloaded
  • systems crash under traffic spikes
  • abuse (spam, brute force) increases

Common Approaches

1. Fixed Window

Limit requests in a fixed time window.

Example:
100 requests per minute
Problem:
Burst traffic at window edges. A client can send 100 requests just before the window resets and 100 more right after, briefly doubling the effective rate.
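A minimal in-memory sketch of the fixed window approach (class and method names here are illustrative, not from any particular library):

```python
import time

class FixedWindowLimiter:
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counts = {}  # (key, window number) -> request count

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        window_start = int(now // self.window)  # e.g. minute number
        bucket = (key, window_start)
        count = self.counts.get(bucket, 0)
        if count >= self.limit:
            return False  # limit reached for this window
        self.counts[bucket] = count + 1
        return True
```

Once `int(now // window)` ticks over, the counter effectively resets, which is exactly where the edge-burst problem comes from.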


2. Sliding Window

Tracks requests over a rolling time window.

Better accuracy than fixed window

Slightly more complex
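One common way to implement this is a sliding window log: keep each request's timestamp and count only the ones inside the rolling window. A sketch (names are illustrative):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.log = {}  # key -> deque of request timestamps

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        q = self.log.setdefault(key, deque())
        # Drop timestamps that have fallen out of the rolling window.
        while q and q[0] <= now - self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```

The extra accuracy costs memory: one timestamp per recent request, which is why large deployments often use a cheaper sliding window *counter* approximation instead.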


3. Token Bucket

Tokens are added at a fixed rate.

Each request consumes a token.

If no tokens:
request is rejected

Best for handling bursts
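The logic above fits in a few lines. A sketch of a token bucket, with a lazy refill computed on each request (the `now` parameter is an assumption added for testability):

```python
import time

class TokenBucket:
    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = capacity          # max tokens = max burst size
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = float(capacity)
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill based on time elapsed since the last request.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token for this request
            return True
        return False
```

A full bucket lets a client burst up to `capacity` requests at once, while the long-term rate stays capped at `refill_rate`.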


4. Leaky Bucket

Requests are processed at a constant rate.

Extra requests are queued or dropped.

Good for smoothing traffic
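A leaky bucket can be modeled as a level that drains at a constant rate; the drop-on-overflow variant is sketched below (a queueing variant would hold the extra requests instead):

```python
class LeakyBucket:
    def __init__(self, capacity, leak_rate):
        self.capacity = capacity    # how many requests the bucket can hold
        self.leak_rate = leak_rate  # requests processed per second
        self.level = 0.0
        self.last = 0.0

    def allow(self, now):
        # Drain at a constant rate since the last request.
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level < self.capacity:
            self.level += 1  # this request fits in the bucket
            return True
        return False  # bucket full: drop the request
```

However fast requests arrive, the output side only ever drains at `leak_rate`, which is what smooths the traffic.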


How It’s Implemented

Typical setup:

  • API Gateway or middleware
  • Redis for storing counters
  • Key = user/IP
  • Value = request count

Redis is used because:

  • it is in-memory, so reads and writes are fast
  • it supports atomic operations (e.g. INCR), avoiding race conditions between servers
  • counters can be shared across many app instances at scale
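The classic pattern is a fixed-window counter per user: atomically increment a key, set it to expire when the window closes, and reject once the count exceeds the limit. Sketched below with a tiny in-memory stand-in (`FakeRedis` is an assumption so the example runs without a server; with redis-py the calls would be `client.incr` and `client.expire`):

```python
class FakeRedis:
    """In-memory stand-in for a Redis client (illustrative only)."""
    def __init__(self):
        self.store = {}
    def incr(self, key):
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]
    def expire(self, key, seconds):
        pass  # real Redis deletes the key after `seconds`

def is_allowed(client, user_id, now, limit=100, window=60):
    # Key = user + current window number, e.g. "rate:u1:28934710"
    key = f"rate:{user_id}:{int(now // window)}"
    count = client.incr(key)        # atomic increment
    if count == 1:
        client.expire(key, window)  # clean up once the window closes
    return count <= limit
```

Because INCR is atomic, multiple app servers can share the same counter without a race; for multi-step logic, a Lua script keeps the whole check atomic too.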

Trade-offs

  • accuracy vs performance
  • memory usage
  • handling bursts

No single approach is perfect.


Where It’s Used

  • login attempts
  • public APIs
  • payment systems
  • search endpoints

Rate limiting is not just about blocking users.

It’s about protecting your system from overload.

A small feature that prevents big failures.
