Arun Sai Veerisetty

Posted on Jun 2

# How to Throttle Like a Pro: 5 Rate Limiting Patterns in Python You Should Know 🚦🐍

In today’s world of high-scale APIs, bots, and distributed systems, rate limiting is essential rather than a nice-to-have. Whether you're protecting your server from abuse or controlling how often a user can take an action, rate limiting helps keep a service reliable and fair.

In this blog, we’ll explore 5 powerful rate limiting patterns with hands-on Python implementations. By the end, you’ll not only understand when and why to use each pattern but also walk away with real code to apply in your own projects.

🧠 What is Rate Limiting?

Rate limiting is the process of restricting how many requests or actions a system allows over a period of time. For example, “No more than 5 login attempts per minute” or “Only 100 API calls per hour”.

This is crucial for:

Avoiding abuse or spam.
Managing traffic spikes.
Ensuring fair resource usage.
Mitigating overloads and denial-of-service attacks.

🧩 Overview of Patterns

Here’s a quick glance at the patterns we'll cover:

| Pattern | Allows Bursts? | Description | Best For |
| --- | --- | --- | --- |
| Fixed Window | ❌ (except at window edges) | Simple time window | Basic rate limiting |
| Sliding Window | ❌ | Fairer than fixed window | API fairness |
| Leaky Bucket | ✅ (smoothed) | Queues excess traffic | Traffic shaping |
| Token Bucket | ✅ | Token-based burst tolerance | Most flexible rate limits |
| Distributed (Redis) | Depends on the algorithm | Multi-server rate limiting | Scalable systems |

1. 🪟 Fixed Window

Concept: Allow N actions per fixed time window (e.g., per minute).

Analogy: Like a parking garage whose counter resets at midnight: it doesn’t matter when you arrived, only how many cars came in since the last reset.

Pros: Simple to implement.

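Cons: Susceptible to bursts at window edges. A client can send N requests at the very end of one window and N more at the start of the next, so up to 2N requests get through in a short span.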
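Here is a minimal sketch of the fixed window idea, not necessarily how the repo implements it. The class name `FixedWindowLimiter`, its parameters, and the injectable `clock` argument are assumptions made for illustration; the clock is only there so the limiter can be tested without sleeping.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per fixed window of `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock      # injectable time source (assumption, for testing)
        self.window_id = None   # index of the window currently being counted
        self.count = 0

    def allow(self):
        # Every timestamp maps to a window index; a new index means a fresh counter.
        current = int(self.clock() // self.window_seconds)
        if current != self.window_id:
            self.window_id = current
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

With `FixedWindowLimiter(5, 60)`, five calls at second 59 and five more at second 60 would all be allowed, which is exactly the edge burst described above.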

2. 🪟 Sliding Window

Concept: Record the timestamp of each request and allow a new one only if fewer than N requests fall within the last T seconds.

Analogy: Like keeping a log of every visitor from the last 60 seconds, so the count is always measured from right now.

Pros: Fairer than fixed window.

Cons: Slightly more complex, and it stores one timestamp per request, so memory grows with the limit.
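The rolling log can be sketched with a `deque` of timestamps. This is an illustrative version under my own naming (`SlidingWindowLimiter`, `clock`), not a copy of the repo code.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window_seconds` span."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock          # injectable time source (assumption, for testing)
        self.timestamps = deque()   # accepted requests, oldest first

    def allow(self):
        now = self.clock()
        # Drop timestamps that have fallen out of the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Only accepted requests are logged here, so rejected calls do not extend the time a client has to wait. Logging rejected attempts too is a stricter variant.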

3. 🪣 Leaky Bucket

Concept: Adds requests to a queue, and processes them at a fixed rate.

Analogy: Like a bucket with a small hole in the bottom: water drips out at a steady rate no matter how fast you pour it in, and it overflows once the bucket is full.

Pros: Smooths traffic.

Cons: Can introduce latency, because requests wait in the queue, and requests are dropped once the queue is full.
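A rough sketch of the queue-based leaky bucket follows. The names `LeakyBucket`, `submit`, and `leak` are hypothetical, and the design assumes some worker loop calls `leak()` periodically to pick up the requests that are due.

```python
import time
from collections import deque


class LeakyBucket:
    """Queue up to `capacity` requests and release them at `leak_rate` per second."""

    def __init__(self, capacity, leak_rate, clock=time.monotonic):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.clock = clock          # injectable time source (assumption, for testing)
        self.queue = deque()
        self.last_leak = clock()

    def submit(self, request):
        """Queue a request; return False if the bucket is full (overflow)."""
        if len(self.queue) >= self.capacity:
            return False
        if not self.queue:
            # Start timing from the first queued item so idle time earns no credit.
            self.last_leak = self.clock()
        self.queue.append(request)
        return True

    def leak(self):
        """Return the requests that became due since the previous leak."""
        now = self.clock()
        due = int((now - self.last_leak) * self.leak_rate)
        if due <= 0:
            return []
        # Advance only by the time consumed, so fractional progress carries over.
        self.last_leak += due / self.leak_rate
        released = []
        while self.queue and len(released) < due:
            released.append(self.queue.popleft())
        return released
```

A bucket created with `LeakyBucket(capacity=3, leak_rate=1.0)` accepts three queued requests, rejects a fourth, and then hands them back one per second as `leak()` is called.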

4. 🎟️ Token Bucket

Concept: Tokens are added to a bucket at a fixed rate, up to a maximum capacity; each request consumes a token and is rejected when none are left.

Analogy: Like a vending machine that refills slowly: if you have saved up tokens you can spend them in a burst; otherwise, you wait for the next refill.

Pros: Flexible and burst-tolerant.

Cons: Requires token logic and state.
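The token logic is small in practice. The sketch below refills lazily on each call instead of running a background timer; `TokenBucket`, `refill_rate`, and the `cost` parameter are illustrative names of my own, not taken from the repo.

```python
import time


class TokenBucket:
    """Hold up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock              # injectable time source (assumption, for testing)
        self.tokens = float(capacity)   # start full so an initial burst is allowed
        self.last_refill = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Lazy refill: credit the tokens earned since the last call, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Here `capacity` controls how large a burst can be, while `refill_rate` sets the long-run average rate, which is why this pattern is usually considered the most flexible of the five.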

5. 🌐 Distributed Rate Limiting (with Redis)

Concept: Use a shared data store like Redis to manage limits across servers.

Analogy: Like a shared notebook in the cloud tracking user activity.

Pros: Scalable, central tracking.

Cons: Requires a running Redis instance and adds a network round trip to every check.

🛠️ Make sure Redis is running locally or remotely before testing.
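One common way to do this is a fixed-window counter stored in Redis, sketched below. It assumes `client` behaves like a redis-py `redis.Redis` instance (only `incr` and `expire` are used); the function name `allow_request`, the `rate:` key prefix, and the `now` parameter are hypothetical choices for illustration.

```python
import time


def allow_request(client, user_id, limit, window_seconds, now=None):
    """Fixed-window counter kept in Redis so every server sees the same count."""
    now = time.time() if now is None else now
    window_id = int(now // window_seconds)
    key = f"rate:{user_id}:{window_id}"  # hypothetical key naming scheme
    # INCR is atomic in Redis and creates the key with value 1 if it is missing.
    count = client.incr(key)
    if count == 1:
        # First hit in this window: let Redis delete the key once the window has passed.
        client.expire(key, window_seconds)
    return count <= limit


# Example wiring (assumes the `redis` package is installed and Redis is on localhost):
#   import redis
#   client = redis.Redis(host="localhost", port=6379)
#   allow_request(client, "alice", limit=100, window_seconds=3600)
```

Two caveats with this sketch: if a process dies between `incr` and `expire`, the key never expires, and servers with skewed clocks may disagree on the window index. Wrapping both calls in a Lua script or a transaction is a common way to close the first gap.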

💻 GitHub Project

Explore all these patterns in code here:

👉 GitHub Repo: https://github.com/arunsaiv/rate-limiter-patterns

Each rate limiter is implemented in Python with comments and test files to help you understand and experiment.

🔧 How to Run the Code

1. Clone the repo

git clone https://github.com/arunsaiv/rate-limiter-patterns.git
cd rate-limiter-patterns

2. Install dependencies

pip install -r requirements.txt

3. Run any pattern script

python fixed_window.py
python token_bucket.py
