DEV Community

Vincent Tommi


5 Common Rate Limiting Algorithms: Pros, Cons, and Implementations (Day 47 of Learning System Design)

Rate limiting is a critical mechanism for protecting services from being overwhelmed by excessive requests from a single user or client. By controlling the rate of incoming requests, rate limiting ensures system stability, prevents abuse, and maintains fair usage. In this article, we'll explore five of the most common rate limiting algorithms, discuss their pros and cons, provide visual illustrations using diagrams, and include Python Django code samples for testing.

1. Token Bucket
Token Bucket is one of the most popular rate limiting algorithms due to its simplicity and effectiveness.

How it Works

  • Imagine a bucket that holds tokens, up to a maximum capacity.
  • Tokens are added to the bucket at a fixed rate (e.g., 1 token per second).
  • When a request arrives, it must consume a token to proceed.
  • If a token is available, the request is allowed and the token is removed.
  • If no token is available, the request is rejected.
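The steps above can be sketched as a small framework-free class (a minimal sketch; the class name and parameters are illustrative, not part of any library):

```python
import time

class TokenBucket:
    """A minimal token bucket: holds up to `capacity` tokens, refilled at `refill_rate` per second."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill tokens for the elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        # Consume one token if available
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A bucket with capacity 10 and refill rate 1 allows an initial burst of 10 requests, then roughly 1 request per second afterwards.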

Pros

  • Simple to implement and understand.
  • Supports bursts of requests up to the bucket capacity, handling short-term traffic spikes effectively.

Cons

  • Does not guarantee a perfectly smooth request rate.
  • Requires per-user bucket state, so memory usage grows with the number of tracked users.

Django Implementation
This Django view implements Token Bucket, refilling tokens over time and consuming one per request.

import time
from django.http import HttpResponse
from django.views.decorators.http import require_GET

# In-memory store for token buckets (use Redis in production)
token_buckets = {}

@require_GET
def token_bucket_limit(request):
    user_id = request.session.session_key or "anonymous"
    bucket_capacity = 10  # Max tokens in the bucket
    refill_rate = 1  # Tokens added per second
    current_time = time.time()

    # Initialize bucket
    if user_id not in token_buckets:
        token_buckets[user_id] = {"tokens": bucket_capacity, "last_refill": current_time}

    # Refill tokens based on elapsed time, capped at capacity
    bucket = token_buckets[user_id]
    elapsed = current_time - bucket["last_refill"]
    bucket["tokens"] = min(bucket_capacity, bucket["tokens"] + elapsed * refill_rate)
    bucket["last_refill"] = current_time

    # Consume a token if available
    if bucket["tokens"] >= 1:
        bucket["tokens"] -= 1
        return HttpResponse("Request allowed", status=200)
    return HttpResponse("Rate limit exceeded", status=429)

Test this endpoint at /rate-limit/token-bucket/. It allows bursts of up to 10 requests and refills 1 token per second.

2. Leaky Bucket
The Leaky Bucket algorithm focuses on smoothing out bursty traffic to ensure a steady request rate.

How it Works

  • Picture a bucket with a small hole at the bottom.
  • Requests enter the bucket at the top.
  • The bucket processes requests at a constant rate through the hole.
  • If the bucket is full, new requests are discarded.
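As with Token Bucket, the mechanics can be sketched framework-free (a minimal sketch with illustrative names; the Django view below keeps the same logic):

```python
import time
from collections import deque

class LeakyBucket:
    """A minimal leaky bucket: queue of up to `capacity` requests, drained at `leak_rate` per second."""

    def __init__(self, capacity, leak_rate):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.queue = deque()
        self.last_leak = time.monotonic()

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drain whole requests for the elapsed time
        leaks = int((now - self.last_leak) * self.leak_rate)
        if leaks > 0:
            for _ in range(min(leaks, len(self.queue))):
                self.queue.popleft()
            # Advance only by the drained amount so fractional time is not lost
            self.last_leak += leaks / self.leak_rate
        # Accept the request only if the bucket has room
        if len(self.queue) < self.capacity:
            self.queue.append(now)
            return True
        return False
```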


Pros

  • Ensures a consistent, predictable processing rate.
  • Prevents sudden bursts from overwhelming the system.

Cons

  • Slightly more complex to implement than Token Bucket.
  • Poor handling of sudden bursts: excess requests are dropped immediately.

Django Implementation
This Django view implements Leaky Bucket using a queue-like structure to process requests at a fixed rate.

import time
from collections import deque
from django.http import HttpResponse
from django.views.decorators.http import require_GET

# In-memory store for leaky bucket (use Redis in production)
leaky_buckets = {}

@require_GET
def leaky_bucket_limit(request):
    user_id = request.session.session_key or "anonymous"
    bucket_size = 10  # Max requests in bucket
    leak_rate = 1  # Requests per second
    current_time = time.time()

    # Initialize bucket
    if user_id not in leaky_buckets:
        leaky_buckets[user_id] = {"queue": deque(), "last_leak": current_time}

    # Leak requests based on elapsed time
    elapsed = current_time - leaky_buckets[user_id]["last_leak"]
    leaks = int(elapsed * leak_rate)
    if leaks > 0:
        for _ in range(min(leaks, len(leaky_buckets[user_id]["queue"]))):
            leaky_buckets[user_id]["queue"].popleft()
        # Advance last_leak only by the whole leaks consumed, so fractional
        # elapsed time still counts toward the next leak
        leaky_buckets[user_id]["last_leak"] += leaks / leak_rate

    # Add request to bucket
    if len(leaky_buckets[user_id]["queue"]) < bucket_size:
        leaky_buckets[user_id]["queue"].append(current_time)
        return HttpResponse("Request allowed", status=200)
    return HttpResponse("Rate limit exceeded", status=429)


Test this endpoint at /rate-limit/leaky-bucket/. It processes up to 10 queued requests at a rate of 1 per second.

3. Fixed Window Counter
The Fixed Window Counter algorithm divides time into fixed intervals and tracks requests within each window.

How it Works

  • Time is divided into fixed intervals (e.g., 1-minute windows).
  • Each window has a counter starting at zero.
  • Each incoming request increments the counter for the current window.
  • If the counter exceeds the limit, requests are denied until the next window.
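A minimal framework-free sketch of the counter reset at each boundary (illustrative names):

```python
import time

class FixedWindowCounter:
    """A minimal fixed-window counter: at most `max_requests` per `window_size` seconds."""

    def __init__(self, max_requests, window_size):
        self.max_requests = max_requests
        self.window_size = window_size
        self.window_start = None
        self.count = 0

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Align to the start of the current fixed window
        window_start = int(now) - (int(now) % self.window_size)
        # Reset the counter when a new window begins
        if window_start != self.window_start:
            self.window_start = window_start
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return True
        return False
```

Note the boundary weakness: a client can send the full limit at the end of one window and the full limit again at the start of the next.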

Pros

  • Simple to implement and understand.
  • Provides a clear rate limit for each window.

Cons

  • Struggles with bursts at window boundaries, potentially allowing up to twice the rate limit at the edges.

Django Implementation
This view implements Fixed Window Counter with a 60-second window and a limit of 10 requests.

import time
from django.http import HttpResponse
from django.views.decorators.http import require_GET

# In-memory store for counters (use Redis in production)
fixed_counters = {}

@require_GET
def fixed_window_limit(request):
    user_id = request.session.session_key or "anonymous"
    window_size = 60  # 60 seconds
    max_requests = 10
    current_time = int(time.time())
    window_start = current_time - (current_time % window_size)

    # Initialize or reset window
    if user_id not in fixed_counters or fixed_counters[user_id]["window_start"] != window_start:
        fixed_counters[user_id] = {"window_start": window_start, "count": 0}

    # Check and increment counter
    if fixed_counters[user_id]["count"] < max_requests:
        fixed_counters[user_id]["count"] += 1
        return HttpResponse("Request allowed", status=200)
    return HttpResponse("Rate limit exceeded", status=429)

Test this endpoint at /rate-limit/fixed-window/. It allows up to 10 requests per 60-second window.

4. Sliding Window Log
The Sliding Window Log algorithm maintains a log of request timestamps to enforce precise rate limits.

How it Works

  • A log stores a timestamp for each request.
  • When a request arrives, timestamps older than the window size are removed.
  • The remaining timestamps are counted.
  • If the count is below the limit, the request is allowed and its timestamp is added.
  • If the count is at the limit, the request is denied.
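The log check can be sketched framework-free with a per-user deque of timestamps (a minimal sketch; a sliding_window_log_limit Django view, as referenced in the URL configuration below, would wrap this same logic). The precision comes at a memory cost: one stored timestamp per request.

```python
import time
from collections import deque

# In-memory store of per-user timestamp logs (use Redis in production)
request_logs = {}

def sliding_window_log_allow(user_id, max_requests=10, window_size=60, now=None):
    """Allow a request if fewer than max_requests fall inside the sliding window."""
    now = time.time() if now is None else now
    log = request_logs.setdefault(user_id, deque())
    # Evict timestamps that have fallen out of the window
    while log and log[0] <= now - window_size:
        log.popleft()
    if len(log) < max_requests:
        log.append(now)
        return True
    return False
```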

5. Sliding Window Counter
The Sliding Window Counter algorithm combines Fixed Window Counter and Sliding Window Log for a balanced approach.

How it Works

  • Tracks request counts for the current and previous windows.
  • Calculates a weighted sum based on the overlap with the sliding window.

For example, if 75% of the current window has elapsed, the previous window's count is weighted at 25% and added to the full current-window count.

  • If the weighted sum is below the limit, the request is allowed.
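The weighted sum from the example above can be checked in a few lines (the counts here are made up for illustration):

```python
def weighted_count(previous_count, current_count, elapsed, window_size):
    # Fraction of the current window that has already elapsed
    weight = elapsed / window_size
    # The previous window contributes only its unexpired fraction
    return previous_count * (1 - weight) + current_count

# 45s of a 60s window elapsed (75%): previous count weighted at 25%
print(weighted_count(8, 4, 45, 60))  # 8 * 0.25 + 4 = 6.0
```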

Pros

  • More accurate than Fixed Window Counter.
  • More memory efficient than Sliding Window Log.
  • Smooths out bursts at window boundaries.

Cons

  • Slightly more complex to implement.

Django Implementation
This view implements Sliding Window Counter with a 60-second window and a 10-request limit.

import time
from django.http import HttpResponse
from django.views.decorators.http import require_GET

# In-memory store for counters (use Redis in production)
sliding_counters = {}

@require_GET
def sliding_window_counter_limit(request):
    user_id = request.session.session_key or "anonymous"
    window_size = 60  # 60 seconds
    max_requests = 10
    current_time = time.time()
    current_window = int(current_time / window_size) * window_size

    # Initialize counters
    if user_id not in sliding_counters:
        sliding_counters[user_id] = {"current_window": current_window, "current_count": 0, "previous_count": 0}

    # Update windows (reset previous_count if more than one window has passed)
    if sliding_counters[user_id]["current_window"] != current_window:
        if current_window - sliding_counters[user_id]["current_window"] == window_size:
            sliding_counters[user_id]["previous_count"] = sliding_counters[user_id]["current_count"]
        else:
            sliding_counters[user_id]["previous_count"] = 0
        sliding_counters[user_id]["current_count"] = 0
        sliding_counters[user_id]["current_window"] = current_window

    # Calculate weighted sum
    elapsed = current_time - current_window
    weight = elapsed / window_size
    weighted_count = sliding_counters[user_id]["previous_count"] * (1 - weight) + sliding_counters[user_id]["current_count"]

    # Check and increment
    if weighted_count < max_requests:
        sliding_counters[user_id]["current_count"] += 1
        return HttpResponse("Request allowed", status=200)
    return HttpResponse("Rate limit exceeded", status=429)

Test this endpoint at /rate-limit/sliding-window-counter/. It allows 10 requests within a sliding 60-second window.

Setting Up the Django Project

To test these implementations, set up a Django project with the following steps:

  1. Create a Django project: django-admin startproject rate_limit_demo.
  2. Create an app: python manage.py startapp rate_limiter
  3. Add the app to INSTALLED_APPS in settings.py:
INSTALLED_APPS = [
    ...
    'rate_limiter',
]

  4. Enable sessions in settings.py:
MIDDLEWARE = [
    ...
    'django.contrib.sessions.middleware.SessionMiddleware',
]

  5. Add the views to rate_limiter/views.py.
  6. Configure URLs in rate_limiter/urls.py:
from django.urls import path
from . import views

urlpatterns = [
    path('token-bucket/', views.token_bucket_limit, name='token_bucket'),
    path('leaky-bucket/', views.leaky_bucket_limit, name='leaky_bucket'),
    path('fixed-window/', views.fixed_window_limit, name='fixed_window'),
    path('sliding-window-log/', views.sliding_window_log_limit, name='sliding_window_log'),
    path('sliding-window-counter/', views.sliding_window_counter_limit, name='sliding_window_counter'),
]
  7. Include the app URLs in the project’s urls.py:
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('rate-limit/', include('rate_limiter.urls')),
]
  8. Run the server: python manage.py runserver
  9. Test endpoints using curl or a browser (e.g., http://localhost:8000/rate-limit/token-bucket/).

Note: The in-memory stores (token_buckets, leaky_buckets, etc.) are for demonstration. In production, use Redis or a database to persist state across requests.

*Choosing the Right Algorithm
*

When implementing rate limiting, consider:

  • System Scale: High-volume APIs may require memory-efficient algorithms like Token Bucket or Sliding Window Counter.

  • Traffic Patterns: Bursty traffic benefits from Token Bucket, while steady traffic suits Leaky Bucket.

  • Granularity Needs: Sliding Window Log offers precision but at a memory cost.

  • Client Communication: Communicate rate limits clearly to API users through response headers (e.g., X-Rate-Limit-Remaining), enabling clients to implement effective retry and backoff strategies.
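As a small illustrative sketch (these header names are conventional rather than standardized, and vary between APIs):

```python
def rate_limit_headers(limit, remaining, reset_epoch):
    """Build conventional rate-limit headers (illustrative names; real APIs vary)."""
    return {
        "X-Rate-Limit-Limit": str(limit),
        "X-Rate-Limit-Remaining": str(remaining),
        "X-Rate-Limit-Reset": str(reset_epoch),
    }

# In a Django view:
#   for name, value in rate_limit_headers(10, 3, reset_epoch).items():
#       response[name] = value
```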

Conclusion
Rate limiting is essential for maintaining service reliability and fairness. The five algorithms—Token Bucket, Leaky Bucket, Fixed Window Counter, Sliding Window Log, and Sliding Window Counter—offer unique trade-offs in simplicity, accuracy, and efficiency. Use the provided Django code samples to test these algorithms and the diagrams to visualize their mechanics.
