Charlie Hadley

Posted on May 18


API Rate Limiting Playbook: Protect Your Backend From Abuse

The Problem

Your API is live in production. Traffic is growing. Then one day, a bot discovers your endpoint and starts hammering it with 100,000 requests per second. Your database melts. Your users see 500 errors. You lose revenue and reputation.

Or worse: a malicious actor uses your API to brute-force user accounts. You didn't have rate limiting in place. You're liable.

This is the silent killer of indie SaaS. You ship the product. You don't ship the protection. Then production breaks.

Why Most Indie Teams Skip Rate Limiting

Rate limiting sounds complicated. "Distributed rate limiting"? "Token bucket algorithm"? "Redis backing stores"?

In reality, it's simple. And you don't need expensive tools. You don't need AWS API Gateway ($0.35 per million requests). You don't need third-party middleware.

You need a methodology. Once you have one, the implementation is trivial.

The Three-Layer Strategy

Layer 1: IP-Based Rate Limiting (Nginx)

First line of defense: block obvious bots and abusers at the edge.

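# These zones belong in the http context. Each is keyed by client IP (10 MB of state).
limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=auth:10m rate=1r/s;

server {
    # Return 429 Too Many Requests instead of the default 503.
    limit_req_status 429;

    location /api/ {
        limit_req zone=general burst=20 nodelay;
    }

    # Stricter limit for login; the longer prefix takes precedence over /api/.
    location /api/auth/login {
        limit_req zone=auth burst=3 nodelay;
    }
}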

Cost: $0 (Nginx is free).

Setup time: 15 minutes.

Blocks: 95% of bot traffic and accidental DDoS.
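
To build intuition for how rate and burst interact with nodelay, here is a simplified Python model of the leaky-bucket accounting that limit_req performs. It is a sketch for reasoning about the numbers, not Nginx's actual implementation; the LeakyBucket class and its field names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeakyBucket:
    """Simplified model of limit_req with nodelay (illustrative only)."""
    rate: float                    # requests drained per second (rate=10r/s -> 10)
    burst: int                     # extra requests tolerated above the rate
    excess: float = 0.0            # requests currently "in the bucket"
    last: Optional[float] = None   # time of the last accepted request

    def allow(self, now: float) -> bool:
        if self.last is None:      # first request seen from this client
            self.last = now
            return True
        # Drain the bucket for the elapsed time, then add this request.
        excess = max(self.excess - self.rate * (now - self.last) + 1.0, 0.0)
        if excess > self.burst:    # over the burst allowance: reject
            return False
        self.excess, self.last = excess, now
        return True


bucket = LeakyBucket(rate=10, burst=20)
accepted = sum(bucket.allow(now=0.0) for _ in range(25))
print(accepted)  # 21: the first request plus 20 burst slots
```

With the general zone's settings, a client can send 21 requests at once and is then held to roughly 10 per second as the bucket drains. Rejected requests do not refill or reset the bucket in this model.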

Layer 2: User/Token-Based Rate Limiting (Redis + Python)

Your authenticated users have legitimate spikes. A single IP-based rule punishes them unfairly.

Instead, rate limit per API key or user ID:

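import time

import redis
from flask import Flask
from flask_login import current_user  # assumes Flask-Login handles authentication

app = Flask(__name__)
r = redis.Redis()

def is_rate_limited(user_id, limit=100, window_seconds=3600):
    # Fixed-window counter: one Redis key per user per window.
    window = int(time.time() // window_seconds)
    key = f"rate_limit:{user_id}:{window}"
    # Send INCR and EXPIRE in one transaction so the key always gets a TTL.
    pipe = r.pipeline()
    pipe.incr(key)
    pipe.expire(key, window_seconds)
    current, _ = pipe.execute()
    return current > limit

@app.route('/api/resource')
def get_resource():
    if is_rate_limited(current_user.id):
        return {'error': 'Rate limit exceeded'}, 429
    return process_request()  # your existing handler logic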

Cost: Redis Cloud free tier (up to 30MB).

Setup time: 30 minutes.

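Blocks: Authenticated abuse such as scraping and runaway client loops. Brute-force and account-enumeration attempts against login happen before a user is authenticated, so those depend on the Layer 1 auth zone or on a counter keyed by the targeted username.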
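
To see the fixed-window behavior without a Redis server, the same logic can be sketched with an in-memory dictionary and an injectable clock. The InMemoryFixedWindow class and its clock parameter are hypothetical names for illustration; this version is meant for local experiments and tests, not production.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict


class InMemoryFixedWindow:
    """Same fixed-window logic as is_rate_limited, kept in a dict."""

    def __init__(self, limit: int = 100, window_seconds: int = 3600,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Unlike Redis keys with a TTL, old entries here are never evicted.
        self.counts: DefaultDict[str, int] = defaultdict(int)

    def is_rate_limited(self, user_id: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        key = f"rate_limit:{user_id}:{window}"
        self.counts[key] += 1
        return self.counts[key] > self.limit


# Drive the limiter with a fake clock so the window boundary is easy to see.
now = [0.0]
limiter = InMemoryFixedWindow(limit=3, window_seconds=60, clock=lambda: now[0])
first_window = [limiter.is_rate_limited("user-1") for _ in range(4)]
now[0] = 60.0  # the next window starts, so counting begins again
second_window = limiter.is_rate_limited("user-1")
print(first_window, second_window)  # [False, False, False, True] False
```

One property worth knowing: because the count restarts at each boundary, a client can send up to twice the limit in a short span that straddles two windows.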

Layer 3: Endpoint-Specific Thresholds

Different endpoints have different abuse vectors:

Public endpoints (search, info): 100 req/min per IP
Auth endpoints (login, signup): 5 req/min per IP, plus a limit shared across all servers (for example, a Redis counter)
Resource creation (write APIs): 10 req/min per user
Admin endpoints: 1000 req/day per user (tight control)
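
One way to keep these thresholds in a single place is a small policy table that middleware can consult. The sketch below uses the numbers from the list above; the path prefixes, the Limit class, and the limit_for helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limit:
    requests: int
    window_seconds: int
    scope: str  # "ip" or "user"


# Path prefixes are hypothetical; the numbers come from the list above.
LIMITS = {
    "/api/search": Limit(100, 60, "ip"),
    "/api/auth": Limit(5, 60, "ip"),
    "/api/resources": Limit(10, 60, "user"),
    "/api/admin": Limit(1000, 86400, "user"),
}


def limit_for(path: str) -> Optional[Limit]:
    """Return the limit for the longest matching prefix, or None."""
    matches = [prefix for prefix in LIMITS if path.startswith(prefix)]
    return LIMITS[max(matches, key=len)] if matches else None


print(limit_for("/api/auth/login"))
```

Returning None for unmatched paths leaves room to exempt routes such as health checks or to fall back to a default limit.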

Document these in your API spec. Expose rate limit headers to clients:

response.headers['X-RateLimit-Limit'] = '100'
response.headers['X-RateLimit-Remaining'] = '87'
response.headers['X-RateLimit-Reset'] = str(reset_timestamp)  # Unix time when the window resets
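
The header values can be derived from the fixed-window counter used in Layer 2. The helper below is a minimal sketch; rate_limit_headers and its parameters are hypothetical names, and it assumes current is the count returned by the Redis increment.

```python
from typing import Dict


def rate_limit_headers(limit: int, current: int,
                       window_seconds: int, now: float) -> Dict[str, str]:
    """Build X-RateLimit-* headers for a fixed-window counter."""
    window_start = int(now // window_seconds) * window_seconds
    return {
        "X-RateLimit-Limit": str(limit),
        # Never report a negative number once the client is over the limit.
        "X-RateLimit-Remaining": str(max(limit - current, 0)),
        # Unix time (seconds) at which the current window ends.
        "X-RateLimit-Reset": str(window_start + window_seconds),
    }


headers = rate_limit_headers(limit=100, current=13, window_seconds=3600, now=7250.0)
print(headers["X-RateLimit-Remaining"], headers["X-RateLimit-Reset"])  # 87 10800
```

All values are strings because HTTP header values are text; copy them onto the response before returning it.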

Real-World Cost Breakdown

Component	Cost
Nginx configuration	$0
Redis Cloud (free tier)	$0
Monitoring + alerts	$0–10/month (CloudWatch or Datadog free tier)
Total	$0–10/month

Compare to AWS API Gateway: at the $0.35 per million requests quoted above, the bill reaches $3,500/month once you serve 10 billion requests per month.
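
To run the comparison for your own traffic, the arithmetic is a single multiplication. The sketch below uses the article's $0.35 per million figure as an input; actual per-request pricing depends on the provider, API type, and region, so treat the price as a parameter.

```python
def monthly_cost(requests_per_month: int, price_per_million: float) -> float:
    """Per-request pricing: cost = (requests / 1,000,000) * price."""
    return requests_per_month / 1_000_000 * price_per_million


def requests_for_budget(budget: float, price_per_million: float) -> float:
    """Invert the formula: how many requests a monthly budget buys."""
    return budget / price_per_million * 1_000_000


print(round(monthly_cost(10_000_000_000, 0.35), 2))  # 3500.0
print(round(monthly_cost(50_000_000, 0.35), 2))      # 17.5
```

At 50 million requests per month the same price comes to $17.50, so the gap with a self-hosted setup only becomes large at very high volume.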

Implementation Checklist

[ ] Deploy Nginx rate limiting (zone + limit_req directive)
[ ] Set up Redis account (free tier)
[ ] Write rate limit middleware in your framework
[ ] Define endpoint-specific limits
[ ] Add rate limit headers to responses
[ ] Test with a load-testing tool such as ApacheBench (ab) or Vegeta
[ ] Set up alerts (Slack notification when a user hits limits)
[ ] Document rate limits in your API docs

Time to implement: 2–4 hours.

Cost: $0 (for 95% of use cases).

Common Mistakes to Avoid

Only IP-based limiting: Punishes corporate networks and VPNs.
No graduated response: Banning a client immediately instead of throttling first.
Storing counts in database: Too slow. Use Redis or in-memory cache.
Not exposing rate limit headers: Clients can't intelligently back off.
Ignoring health check endpoints: Don't rate limit your own monitoring.
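
A graduated response can be as simple as mapping a client's recent violation count to an escalating action. The sketch below is one possible shape; the thresholds, durations, and the Action and graduated_response names are illustrative assumptions, not recommendations from measured traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    status: int          # HTTP status to return
    retry_after: int     # seconds the client should wait (Retry-After header)
    alert: bool = False  # whether to notify the team


def graduated_response(violations: int) -> Action:
    """Map recent limit violations to an escalating response.

    All thresholds and durations here are illustrative assumptions.
    """
    if violations < 5:
        return Action(status=429, retry_after=60)        # throttle briefly
    if violations < 20:
        return Action(status=429, retry_after=15 * 60)   # longer cool-off
    # Persistent offenders get a temporary ban and a human looks at it.
    return Action(status=403, retry_after=24 * 3600, alert=True)


print(graduated_response(2))
```

Tracking the violation count can reuse the same Redis counter pattern as the request count, under a separate key.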

Debugging Rate Limit Issues

When a user reports "API blocked", here's how to troubleshoot:

Check Redis keys: redis-cli --scan --pattern "rate_limit:*" (prefer this over KEYS, which blocks Redis while it walks the whole keyspace)
Inspect their request pattern: high burst vs sustained?
Whitelist their IP/user if it's a legitimate use case
Adjust thresholds based on real traffic patterns
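
The first two steps can be scripted. The sketch below lists one user's counters with their values and remaining TTLs, assuming the rate_limit:{user_id}:{window} key format from Layer 2 and a redis-py client passed in as client; inspect_user_limits is a hypothetical helper name.

```python
from typing import Dict, List


def inspect_user_limits(client, user_id: str) -> List[Dict[str, object]]:
    """List a user's rate-limit counters with their values and TTLs.

    `client` is expected to be a redis-py client, e.g. redis.Redis().
    """
    rows = []
    # SCAN iterates incrementally, so it does not block Redis the way KEYS can.
    for key in client.scan_iter(match=f"rate_limit:{user_id}:*"):
        rows.append({
            "key": key.decode() if isinstance(key, bytes) else key,
            "count": int(client.get(key) or 0),
            "ttl_seconds": client.ttl(key),
        })
    return rows


# Example usage against a local Redis (requires the redis package):
#   import redis
#   for row in inspect_user_limits(redis.Redis(), "42"):
#       print(row)
```

A count far above the limit with a long TTL suggests a sustained pattern; a count just over the limit points to a short burst that a higher burst allowance might absorb.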

Next Steps

This playbook includes:

Ready-to-deploy Nginx configs for all major frameworks
Redis setup guide (AWS ElastiCache, DigitalOcean, Heroku)
Complete Python/Node.js middleware code
GitHub Actions workflow for load testing
Real abuse patterns from production SaaS systems
Cost optimization strategies (cache tiers, fallback limits)
Comprehensive debugging guide
Whitelist/bypass strategies for trusted partners

Implementing rate limiting takes 2–4 hours. Ignoring it costs you production incidents and security breaches.

Deploy today.
