DEV Community

DevCorner
DevCorner

Posted on

1 1 1 1

Rate Limiting Algorithms: A Deep Dive 2

Introduction

Rate limiting is a technique used to control the number of requests a system processes within a specific time frame. It helps prevent abuse, ensures fair usage, protects against DDoS attacks, and maintains system stability.

This blog explores the most commonly used rate-limiting algorithms, their advantages and disadvantages, and how to implement them in Java.


1️⃣ Token Bucket Algorithm

How It Works

  • A bucket holds a fixed number of tokens.
  • Tokens are added to the bucket at a constant rate.
  • Each request consumes a token.
  • If the bucket is empty, excess requests are denied until new tokens are added.

Pros & Cons

✅ Allows short bursts of traffic while controlling overall request rate.

✅ Efficient for distributed systems.

❌ If the bucket drains quickly, requests may be blocked until tokens are refilled.

Java Implementation

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class TokenBucketRateLimiter {
    private final Semaphore tokens;
    private final int capacity;

    public TokenBucketRateLimiter(int capacity, int refillRatePerSecond) {
        this.capacity = capacity;
        this.tokens = new Semaphore(capacity);

        // Refill tokens periodically
        new Thread(() -> {
            while (true) {
                tokens.release(refillRatePerSecond);
                if (tokens.availablePermits() > capacity) {
                    tokens.drainPermits();
                    tokens.release(capacity);
                }
                try {
                    TimeUnit.SECONDS.sleep(1);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }).start();
    }

    public boolean allowRequest() {
        return tokens.tryAcquire();
    }
}
Enter fullscreen mode Exit fullscreen mode

2️⃣ Leaky Bucket Algorithm

How It Works

  • Requests enter a queue (bucket).
  • Requests are processed at a fixed rate (like water leaking from a bucket).
  • If the queue overflows, extra requests are discarded.

Pros & Cons

✅ Ensures a steady flow of requests.
✅ Prevents sudden traffic spikes from overloading the system.
❌ Can introduce delays if the queue is full.

Java Implementation

import java.util.LinkedList;
import java.util.Queue;

class LeakyBucketRateLimiter {
    private final Queue<Long> queue;
    private final int capacity;
    private final long leakRateMillis;

    public LeakyBucketRateLimiter(int capacity, int leakRatePerSecond) {
        this.capacity = capacity;
        this.leakRateMillis = 1000L / leakRatePerSecond;
        this.queue = new LinkedList<>();
    }

    public synchronized boolean allowRequest() {
        long currentTime = System.currentTimeMillis();
        while (!queue.isEmpty() && queue.peek() <= currentTime - leakRateMillis) {
            queue.poll();
        }
        if (queue.size() < capacity) {
            queue.add(currentTime);
            return true;
        }
        return false;
    }
}
Enter fullscreen mode Exit fullscreen mode

3️⃣ Fixed Window Counter Algorithm

How It Works

  • A counter tracks the number of requests per fixed time window (e.g., per minute).
  • If the count exceeds the allowed limit, further requests are rejected until the next window.

Pros & Cons

✅ Easy to implement.

✅ Works well for predictable traffic patterns.

❌ Can lead to spikes at window boundaries.

Java Implementation

import java.util.concurrent.atomic.AtomicInteger;

class FixedWindowRateLimiter {
    private final int limit;
    private final long windowSizeMillis;
    private long windowStart;
    private final AtomicInteger requestCount;

    public FixedWindowRateLimiter(int limit, int windowSizeSeconds) {
        this.limit = limit;
        this.windowSizeMillis = windowSizeSeconds * 1000L;
        this.windowStart = System.currentTimeMillis();
        this.requestCount = new AtomicInteger(0);
    }

    public synchronized boolean allowRequest() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowSizeMillis) {
            windowStart = now;
            requestCount.set(0);
        }
        return requestCount.incrementAndGet() <= limit;
    }
}
Enter fullscreen mode Exit fullscreen mode

4️⃣ Sliding Window Counter Algorithm

How It Works

  • Uses smaller sub-windows within a fixed window.
  • More accurate and distributes requests evenly.

Pros & Cons

✅ More accurate than Fixed Window Counter.

✅ Reduces request bursts at window boundaries.

❌ More complex to implement.


5️⃣ Sliding Window Log Algorithm

How It Works

  • Stores timestamps of each request.
  • Removes timestamps outside the allowed time window.

Pros & Cons

✅ High accuracy in rate limiting.

❌ High memory usage due to storing request timestamps.


6️⃣ Adaptive Rate Limiting

How It Works

  • Uses machine learning or heuristics to adjust rate limits dynamically.
  • Can consider factors like server load, request patterns, and user behavior.

Pros & Cons

✅ Dynamically adjusts limits for better efficiency.

❌ Complex implementation.


Comparison Table

Algorithm Burst Handling Smoothness Memory Usage Complexity
Token Bucket ✅ Yes ✅ Yes 🔹 Low 🔹 Simple
Leaky Bucket ❌ No ✅ Yes 🔹 Low 🔹 Simple
Fixed Window Counter ❌ No ❌ No 🔹 Low 🔹 Simple
Sliding Window Counter ✅ Yes ✅ Yes 🔹 Medium 🔹 Medium
Sliding Window Log ✅ Yes ✅ Yes 🔴 High 🔴 Complex
Adaptive Rate Limiting ✅ Yes ✅ Yes 🔴 High 🔴 Complex

Interview Preparation

Commonly Asked Questions:

1️⃣ What is the difference between Token Bucket and Leaky Bucket?

2️⃣ Which algorithm prevents bursts better?

3️⃣ How would you implement rate limiting in a distributed system?

4️⃣ How do you ensure fairness in rate limiting?


Conclusion

Choosing the right rate-limiting algorithm depends on your system's requirements. For APIs, Token Bucket is widely used due to its efficiency. If smooth processing is essential, Leaky Bucket is a good choice. For better accuracy, Sliding Window methods work well.

Want a deeper dive into API Gateway-based rate limiting? Let me know! 🚀

Hostinger image

Get n8n VPS hosting 3x cheaper than a cloud solution

Get fast, easy, secure n8n VPS hosting from $4.99/mo at Hostinger. Automate any workflow using a pre-installed n8n application and no-code customization.

Start now

Top comments (0)

Qodo Takeover

Introducing Qodo Gen 1.0: Transform Your Workflow with Agentic AI

Rather than just generating snippets, our agents understand your entire project context, can make decisions, use tools, and carry out tasks autonomously.

Read full post