API Rate Limiting Playbook: Protect Your Backend From Abuse
The Problem
Your API is live in production. Traffic is growing. Then one day, a bot discovers your endpoint and starts hammering it with 100,000 requests per second. Your database melts. Your users see 500 errors. You lose revenue and reputation.
Or worse: a malicious actor uses your API to brute-force user accounts. You didn't have rate limiting in place. You're liable.
This is the silent killer of indie SaaS. You ship the product. You don't ship the protection. Then production breaks.
Why Most Indie Teams Skip Rate Limiting
Rate limiting sounds complicated. "Distributed rate limiting"? "Token bucket algorithm"? "Redis backing stores"?
In reality, it's simple. And you don't need expensive tools. You don't need AWS API Gateway (roughly $3.50 per million requests for REST APIs at list price). You don't need third-party middleware.
You need a methodology. Once you have a methodology, the implementation is trivial.
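To show how small the core idea is, here is a minimal in-memory sketch of the token bucket algorithm named above. The class name, parameters, and injectable clock are illustrative assumptions, and a single-process bucket like this is not a substitute for the shared counters used later in this playbook.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and refills at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so tests do not need to sleep
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Each request spends a token. When the bucket is empty the request is rejected until enough time has passed to refill it.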
The Three-Layer Strategy
Layer 1: IP-Based Rate Limiting (Nginx)
First line of defense: block obvious bots and abusers at the edge.
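    # limit_req_zone belongs in the http {} context.
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=auth:10m rate=1r/s;

    server {
        # Return 429 instead of the default 503 when a request is rejected.
        limit_req_status 429;

        location /api/ {
            limit_req zone=general burst=20 nodelay;
        }

        # The longest matching prefix wins, so login requests use the stricter zone.
        location /api/auth/login {
            limit_req zone=auth burst=3 nodelay;
        }
    }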
Cost: $0 (Nginx is free).
Setup time: 15 minutes.
Blocks: most unsophisticated bot traffic and accidental request floods from a single IP. It does not stop attacks spread across many IPs.
Layer 2: User/Token-Based Rate Limiting (Redis + Python)
Your authenticated users have legitimate spikes. A single IP-based rule punishes them unfairly.
Instead, rate limit per API key or user ID:
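    import time

    import redis
    from flask import Flask
    from flask_login import current_user  # assumes Flask-Login; swap in your own auth layer

    app = Flask(__name__)
    r = redis.Redis()

    def is_rate_limited(user_id, limit=100, window_seconds=3600):
        # Fixed window: one counter per user per window_seconds bucket.
        window = int(time.time() // window_seconds)
        key = f"rate_limit:{user_id}:{window}"
        # Run INCR and EXPIRE in one transaction so a key is never left without a TTL.
        pipe = r.pipeline()
        pipe.incr(key)
        pipe.expire(key, window_seconds)
        current, _ = pipe.execute()
        return current > limit

    @app.route('/api/resource')
    def get_resource():
        if is_rate_limited(current_user.id):
            return {'error': 'Rate limit exceeded'}, 429
        return process_request()  # placeholder for your handler logic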
Cost: Redis Cloud free tier (up to 30MB).
Setup time: 30 minutes.
Blocks: abuse by authenticated users and API keys, including enumeration and brute-force attempts made with valid credentials.
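The fixed-window counter above has a known edge: a client can spend a full quota at the end of one window and another at the start of the next, briefly doubling the effective rate. One common alternative is a sliding-window log. The sketch below keeps timestamps in process memory purely for illustration. The class name and the injectable clock are assumptions, and a multi-instance deployment would need to keep the timestamps in shared storage such as Redis.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, limit=100, window_seconds=3600, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def is_rate_limited(self, user_id):
        now = self.clock()
        q = self.hits[user_id]
        # Drop timestamps that have aged out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        if len(q) >= self.limit:
            return True
        q.append(now)  # only accepted requests count against the limit
        return False
```

The trade-off is memory: this stores one timestamp per accepted request, while the fixed-window version stores a single integer per user.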
Layer 3: Endpoint-Specific Thresholds
Different endpoints have different abuse vectors:
- Public endpoints (search, info): 100 req/min per IP
- Auth endpoints (login, signup): 5 req/min per IP + distributed rate limit
- Resource creation (write APIs): 10 req/min per user
- Admin endpoints: 1000 req/day per user (tight control)
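One way to keep these thresholds in a single place is a small policy table that the middleware consults on each request. This is a sketch under assumptions: the path prefixes, the Policy fields, and the rule that any non-GET request counts as a write are all hypothetical. Only the numbers come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    limit: int
    window_seconds: int
    scope: str  # "ip" or "user": which identifier the counter is keyed on


# Hypothetical path prefixes, checked in order.
PREFIX_POLICIES = [
    ("/api/admin/", Policy(1000, 86400, "user")),  # 1000 req/day per user
    ("/api/auth/", Policy(5, 60, "ip")),           # 5 req/min per IP
]
WRITE_POLICY = Policy(10, 60, "user")              # 10 req/min per user
PUBLIC_POLICY = Policy(100, 60, "ip")              # 100 req/min per IP


def policy_for(method, path):
    for prefix, policy in PREFIX_POLICIES:
        if path.startswith(prefix):
            return policy
    if method in {"POST", "PUT", "PATCH", "DELETE"}:
        return WRITE_POLICY
    return PUBLIC_POLICY


def counter_key(policy, ip, user_id):
    # Key the counter on whichever identifier the policy names.
    ident = ip if policy.scope == "ip" else user_id
    return f"rate_limit:{policy.scope}:{ident}"
```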
Document these in your API spec. Expose rate limit headers to clients:
    response.headers['X-RateLimit-Limit'] = '100'
    response.headers['X-RateLimit-Remaining'] = '87'
    response.headers['X-RateLimit-Reset'] = str(reset_epoch_seconds)  # Unix time when the window resets
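The header values can be derived from the same counter that makes the allow/deny decision. The helper below is a sketch that assumes fixed windows aligned to multiples of window_seconds, matching the key scheme used in Layer 2. The function name and the added Retry-After header are my own additions rather than part of the original snippet.

```python
import math
import time


def rate_limit_headers(limit, used, window_seconds, now=None):
    """Build rate limit headers for a fixed window aligned to window_seconds."""
    now = time.time() if now is None else now
    # Start of the next window, as a Unix timestamp.
    reset = (int(now // window_seconds) + 1) * window_seconds
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, limit - used)),
        "X-RateLimit-Reset": str(reset),
    }
    if used > limit:
        # Retry-After is the standard HTTP header for "wait this many seconds".
        headers["Retry-After"] = str(max(1, math.ceil(reset - now)))
    return headers
```

Here used is the value returned by the counter increment, so the same number drives both the 429 decision and the headers.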
Real-World Cost Breakdown
| Component | Cost |
|---|---|
| Nginx configuration | $0 |
| Redis Cloud (free tier) | $0 |
| Monitoring + alerts | $0–10/month (CloudWatch or Datadog free tier) |
| Total | $0–10/month |
Compare to AWS API Gateway: at roughly $3.50 per million requests (REST API list price), 1 billion requests per month costs about $3,500.
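The arithmetic behind a per-request price is easy to check for your own traffic. The helper below is a sketch. The numbers in the example comments are illustrative inputs, not quotes of any provider's pricing, so plug in the currently published price for the service you are comparing against.

```python
def monthly_cost(requests_per_month, price_per_million):
    """Cost of a per-request-priced service for one month."""
    return requests_per_month / 1_000_000 * price_per_million


def break_even_requests(fixed_monthly_cost, price_per_million):
    """Request volume at which per-request pricing equals a fixed monthly bill."""
    return fixed_monthly_cost / price_per_million * 1_000_000


# Illustrative inputs only:
# monthly_cost(50_000_000, 1.00)   -> 50.0
# break_even_requests(10, 1.00)    -> 10000000.0
```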
Implementation Checklist
- [ ] Deploy Nginx rate limiting (zone + limit_req directive)
- [ ] Set up Redis account (free tier)
- [ ] Write rate limit middleware in your framework
- [ ] Define endpoint-specific limits
- [ ] Add rate limit headers to responses
- [ ] Test with a load testing tool such as ApacheBench (ab) or Vegeta
- [ ] Set up alerts (Slack notification when a user hits limits)
- [ ] Document rate limits in your API docs
Time to implement: 2–4 hours.
Cost: $0 (for 95% of use cases).
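If you would rather script the load test step than install a tool, a short standard-library harness is enough to confirm that limits trigger. This is a sketch: the function names and the localhost URL in the usage comment are hypothetical, and it is meant for verifying limits on your own service, not for benchmarking.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status code for a GET, including 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises on 429 and other error statuses


def burst(send, total=50, concurrency=10):
    """Call send() `total` times across worker threads and tally the results."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return Counter(pool.map(lambda _: send(), range(total)))


# Hypothetical usage against a local instance:
# counts = burst(lambda: http_status("http://localhost:8080/api/resource"))
# print(counts)  # a working limiter shows 429s once the limit is exceeded
```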
Common Mistakes to Avoid
- Only IP-based limiting: punishes everyone behind a shared address, such as corporate networks and VPNs.
- No graduated response: banning immediately instead of throttling first.
- Storing counts in the database: too slow. Use Redis or an in-memory cache.
- Not exposing rate limit headers: clients can't back off intelligently.
- Rate limiting health check endpoints: exempt your own monitoring from the limits.
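Two of these mistakes are cheap to avoid in code: escalating gradually and exempting monitoring paths. The sketch below maps a count of recent violations to an action. The thresholds, durations, and health check paths are hypothetical values chosen for illustration, not recommendations from the original playbook.

```python
EXEMPT_PATHS = {"/healthz", "/readyz"}  # hypothetical monitoring endpoints


def is_exempt(path):
    """Health checks should never count against a rate limit."""
    return path in EXEMPT_PATHS


def graduated_response(violations):
    """Map the number of recent limit violations to (action, seconds)."""
    if violations == 0:
        return ("allow", 0)
    if violations <= 3:
        return ("throttle", 2 ** violations)  # ask the client to wait 2, 4, 8 seconds
    if violations <= 10:
        return ("block", 15 * 60)             # temporary 15-minute block
    return ("block", 24 * 60 * 60)            # repeated abuse: block for a day
```

The seconds value can be sent back as Retry-After for a throttle, or used as the TTL of a block key.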
Debugging Rate Limit Issues
When a user reports "API blocked", here's how to troubleshoot:
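- Check Redis keys (use --scan rather than KEYS, which blocks the server while it walks the whole keyspace):
    redis-cli --scan --pattern "rate_limit:*"
- Inspect their request pattern: high burst vs sustained?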
- Whitelist their IP/user if it's a legitimate use case
- Adjust thresholds based on real traffic patterns
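The first troubleshooting step can be scripted so support requests get a quick answer. The sketch below assumes the rate_limit:{user_id}:{window} key scheme from Layer 2 and a redis-py client passed in as client. The function name and report fields are hypothetical. It iterates with SCAN because KEYS walks the whole keyspace in one blocking call.

```python
def inspect_user(client, user_id, limit=100):
    """List a user's rate limit counters with their TTLs."""
    report = []
    for key in client.scan_iter(match=f"rate_limit:{user_id}:*"):
        count = int(client.get(key) or 0)
        report.append({
            # redis-py returns bytes unless decode_responses=True is set
            "key": key.decode() if isinstance(key, bytes) else key,
            "count": count,
            "ttl_seconds": client.ttl(key),
            "over_limit": count > limit,
        })
    return report


# Hypothetical usage:
# import redis
# for row in inspect_user(redis.Redis(), user_id=42):
#     print(row)
```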
Next Steps
This playbook includes:
- Ready-to-deploy Nginx configs for all major frameworks
- Redis setup guide (AWS ElastiCache, DigitalOcean, Heroku)
- Complete Python/Node.js middleware code
- GitHub Actions workflow for load testing
- Real abuse patterns from production SaaS systems
- Cost optimization strategies (cache tiers, fallback limits)
- Comprehensive debugging guide
- Whitelist/bypass strategies for trusted partners
Implementing rate limiting takes 2–4 hours. Ignoring it costs you production incidents and security breaches.
Deploy today.