Fazal Mansuri

🛡️ Rate Limiting in Web Applications

In today’s API-driven world, applications face thousands—or even millions—of requests every day. While high traffic is great, it can also create challenges:

  • 🔒 Security risks (brute force, DDoS attacks)
  • ⚖️ Fair usage enforcement (avoid abuse of free tiers)
  • 🚀 Performance stability (prevent one user from hogging resources)

This is where Rate Limiting comes in.

Rate limiting ensures that a user, IP address, or client can make only a defined number of requests within a specific time window. It’s a cornerstone of modern, scalable APIs.


🧩 Why Do We Need Rate Limiting?

  • 🔐 Security → Protect against brute force login attempts, API scraping, and DDoS attacks.
  • ⚖️ Fair Usage → Prevent abuse of APIs, especially for freemium services.
  • 🚀 Performance → Ensure backend resources are shared fairly among all users.
  • 💰 Cost Control → Reduce infrastructure bills by blocking excessive or abusive requests.

⚙️ Common Rate Limiting Algorithms

1️⃣ Token Bucket

  • Requests are allowed if tokens are available.
  • Tokens refill at a fixed rate.
  • Commonly used (flexible + efficient).

🔧 Example: Allow 10 requests per second. If unused, tokens accumulate up to a max limit (burst handling).
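A minimal token-bucket sketch in plain Go (no external packages; the `TokenBucket` type and its fields are illustrative, not from a library):

```go
package main

import (
	"fmt"
	"time"
)

// TokenBucket allows a request while tokens remain; tokens refill at a fixed rate.
type TokenBucket struct {
	capacity   float64   // maximum tokens (burst size)
	tokens     float64   // current tokens
	refillRate float64   // tokens added per second
	lastRefill time.Time // time of the last refill
}

func NewTokenBucket(capacity, refillRate float64) *TokenBucket {
	return &TokenBucket{capacity: capacity, tokens: capacity, refillRate: refillRate, lastRefill: time.Now()}
}

// Allow refills based on elapsed time, then spends one token if available.
func (b *TokenBucket) Allow() bool {
	now := time.Now()
	b.tokens += now.Sub(b.lastRefill).Seconds() * b.refillRate
	b.lastRefill = now
	if b.tokens > b.capacity {
		b.tokens = b.capacity // unused tokens accumulate only up to the cap
	}
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	b := NewTokenBucket(3, 10) // burst of 3, refills 10 tokens/sec
	for i := 0; i < 5; i++ {
		fmt.Println(b.Allow())
	}
}
```

The capacity cap is what gives token bucket its burst handling: an idle client saves up tokens, but only up to the limit.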


2️⃣ Leaky Bucket

  • Works like water dripping from a bucket at a fixed rate.
  • Bursts are smoothed out.
  • Useful for evenly distributing traffic.
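The same idea can be sketched with a water-level counter instead of a queue (a simplified variant; real leaky-bucket implementations often queue and drain requests):

```go
package main

import (
	"fmt"
	"time"
)

// LeakyBucket smooths bursts: each request adds "water"; the bucket
// drains at a fixed rate, and requests that would overflow are rejected.
type LeakyBucket struct {
	capacity float64 // maximum queued requests
	level    float64 // current water level
	leakRate float64 // requests drained per second
	lastLeak time.Time
}

func NewLeakyBucket(capacity, leakRate float64) *LeakyBucket {
	return &LeakyBucket{capacity: capacity, leakRate: leakRate, lastLeak: time.Now()}
}

func (b *LeakyBucket) Allow() bool {
	now := time.Now()
	b.level -= now.Sub(b.lastLeak).Seconds() * b.leakRate // drain since last check
	if b.level < 0 {
		b.level = 0
	}
	b.lastLeak = now
	if b.level+1 > b.capacity {
		return false // bucket full: reject
	}
	b.level++
	return true
}

func main() {
	b := NewLeakyBucket(2, 1) // holds 2 requests, drains 1/sec
	fmt.Println(b.Allow(), b.Allow(), b.Allow())
}
```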

3️⃣ Fixed Window Counter

  • Count requests in a fixed time window (e.g., 100 requests per minute).
  • ❌ Edge case: allows bursts at window boundaries.

4️⃣ Sliding Window Log

  • Keeps timestamps of requests.
  • Precise but memory-heavy for large scale.

5️⃣ Sliding Window Counter

  • Hybrid approach: averages counts across windows.
  • More accurate than fixed window, less heavy than logs.

🔧 Implementing Rate Limiting in Applications

Example in Go (Gin Framework)

package main

import (
    "net/http"

    "github.com/gin-gonic/gin"
    "golang.org/x/time/rate"
)

func main() {
    r := gin.Default()

    // Create a limiter: 5 requests/sec, burst up to 10
    limiter := rate.NewLimiter(5, 10)

    r.GET("/api", func(c *gin.Context) {
        if !limiter.Allow() {
            c.JSON(http.StatusTooManyRequests, gin.H{"error": "Too many requests"})
            return
        }
        c.JSON(http.StatusOK, gin.H{"message": "Request successful"})
    })

    r.Run(":8080")
}

✅ Note: this limiter is shared by all clients combined — together they get 5 requests per second with a burst of 10 across the whole endpoint. For true per-client limits, keep a separate limiter per IP or API key.
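A minimal stdlib-only sketch of per-client limiting, keyed by IP (the `visitor`/`PerClientLimiter` names are illustrative; in production you would also evict stale entries and might use a map of `rate.Limiter` values instead):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// visitor is a tiny token bucket tracked per client key (e.g. IP address).
type visitor struct {
	tokens   float64
	lastSeen time.Time
}

// PerClientLimiter keeps one bucket per key, guarded by a mutex.
type PerClientLimiter struct {
	mu       sync.Mutex
	visitors map[string]*visitor
	rate     float64 // tokens refilled per second
	burst    float64 // bucket capacity
}

func NewPerClientLimiter(rate, burst float64) *PerClientLimiter {
	return &PerClientLimiter{visitors: make(map[string]*visitor), rate: rate, burst: burst}
}

func (p *PerClientLimiter) Allow(key string) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	now := time.Now()
	v, ok := p.visitors[key]
	if !ok {
		v = &visitor{tokens: p.burst, lastSeen: now}
		p.visitors[key] = v
	}
	v.tokens += now.Sub(v.lastSeen).Seconds() * p.rate
	if v.tokens > p.burst {
		v.tokens = p.burst
	}
	v.lastSeen = now
	if v.tokens < 1 {
		return false
	}
	v.tokens--
	return true
}

func main() {
	lim := NewPerClientLimiter(5, 2)
	fmt.Println(lim.Allow("1.2.3.4"), lim.Allow("1.2.3.4"), lim.Allow("1.2.3.4"))
	fmt.Println(lim.Allow("5.6.7.8")) // a different client has its own bucket
}
```

In the Gin handler above, the key would come from `c.ClientIP()`.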


🚦 Rate Limiting with NGINX

NGINX can handle rate limiting at the web server or reverse proxy level, making it a powerful tool to protect your backend before requests hit your app.

⚙️ How It Works

NGINX uses two main directives:

  1. limit_req_zone → Defines a shared memory zone to track requests (by IP or custom key).
  2. limit_req → Applies the limit to endpoints.

🔧 Basic Example

http {
  # 1. Define rate limit zone: 10 MB of shared state, 1 request/sec per IP
  limit_req_zone $binary_remote_addr zone=api_limit:10m rate=1r/s;

  server {
    location /api/ {
      # 2. Apply rate limit
      limit_req zone=api_limit burst=5 nodelay;

      proxy_pass http://backend_service;
    }
  }
}

🔍 Explanation

  • Key: $binary_remote_addr → track by client IP (the binary form keeps the zone memory compact).
  • Rate: 1r/s → allows 1 request per second per IP.
  • Burst: 5 → short bursts of up to 5 extra requests are accepted.
  • nodelay → burst requests are served immediately instead of being queued and paced at 1 r/s.

✅ Advantages

  • Filters traffic before hitting your backend.
  • Can apply limits per endpoint (e.g., /login stricter than /products).
  • Lightweight & fast.
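As a sketch of per-endpoint limits (zone names, rates, and the `backend_service` upstream are placeholders), a stricter rule for /login than /products might look like:

```nginx
http {
  # Tight limit for the login endpoint, generous limit for browsing
  limit_req_zone $binary_remote_addr zone=login_limit:10m rate=1r/m;
  limit_req_zone $binary_remote_addr zone=products_limit:10m rate=10r/s;

  server {
    location /login {
      limit_req zone=login_limit burst=3;
      proxy_pass http://backend_service;
    }

    location /products {
      limit_req zone=products_limit burst=20 nodelay;
      proxy_pass http://backend_service;
    }
  }
}
```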

📊 Logs & Monitoring

  • Exceeded requests → rejected with 503 (Service Unavailable) by default; set limit_req_status 429; to return the more accurate 429.
  • Useful for tracking abuse & fine-tuning limits.

⚠️ Considerations

  • Avoid overly strict global limits (may block valid traffic).
  • Behind a proxy/load balancer, configure the realip module (set_real_ip_from, real_ip_header) so $binary_remote_addr reflects the real client IP instead of the proxy’s.
  • Always test before production rollout.

🔒 Best Practices for Rate Limiting

  • 🎯 Apply stricter limits on sensitive endpoints (like /login).
  • 🌍 Use different limits per client type (mobile app vs. server-to-server).
  • 🧩 Combine server-level (NGINX/API Gateway) + app-level checks.
  • 📊 Monitor logs & dashboards for blocked traffic trends.
  • 🚀 Use distributed stores (Redis) for shared limits in multi-instance apps.

💡 Bonus Tips

  • 429 Status Code: Always return HTTP 429 Too Many Requests with a helpful message.
  • Retry-After Header: Tell clients when to retry. Example:
HTTP/1.1 429 Too Many Requests
Retry-After: 60
  • Graceful Degradation: Don’t just block—offer reduced functionality for non-critical requests.

🎯 Final Thoughts

Rate limiting is not just a performance tool—it’s a security guard, cost saver and reliability booster.

Whether you use algorithms in code, NGINX or API Gateways (Kong, Apigee, AWS API Gateway), implementing proper rate limiting ensures:

✔️ Fair usage
✔️ Secure endpoints
✔️ Scalable systems

🚀 The bottom line? Smart rate limiting makes APIs faster, safer, and fairer—for everyone.

💬 What approach do you use for rate limiting in your backend? Have you tried using Redis or NGINX rules? Let’s discuss in the comments!
