---
title: "Building Rate-Limited APIs That Scale: A Practical Guide"
published: true
description: "Learn how to implement rate limiting for your APIs with in-memory, Redis-based, and API Gateway approaches. Includes token bucket algorithm and monitoring tips."
---
Rate limiting is critical for API stability, but implementing it wrong can frustrate users or fail when you need it most. Here's how to build rate limiting that actually works.
## Why Rate Limiting Matters
Without rate limiting, a single misbehaving client can bring down your entire API. I learned this the hard way when a buggy integration made 10,000 requests per second to our production API. Spoiler: it didn't end well.
## Three Approaches to Rate Limiting
### 1. In-Memory (Simple but Limited)
Good for single-instance APIs or development:
```javascript
const rateLimit = require('express-rate-limit');

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per window
  message: 'Too many requests, please try again later.'
});

app.use('/api/', limiter);
```
**Pros:** Easy to set up, no external dependencies
**Cons:** Doesn't work across multiple servers, resets on restart
### 2. Redis-Based (Production-Ready)
Best for distributed systems:
```javascript
const rateLimit = require('express-rate-limit');
const RedisStore = require('rate-limit-redis');
const redis = require('redis');

const client = redis.createClient({
  host: process.env.REDIS_HOST,
  port: 6379
});

const limiter = rateLimit({
  store: new RedisStore({
    client: client,
    prefix: 'rl:'
  }),
  windowMs: 15 * 60 * 1000,
  max: 100
});

app.use('/api/', limiter);
```
**Pros:** Works across multiple instances, persistent across restarts
**Cons:** Requires Redis infrastructure
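Under the hood, a Redis-backed store boils down to a fixed-window counter: an atomic increment plus an expiry per window (roughly `INCR key` followed by `EXPIRE key windowSeconds`). Here's a minimal sketch of that logic, simulated with an in-process map instead of a real Redis client, just to show what the store is doing for you:

```javascript
// Fixed-window counter: the same logic a Redis store runs via INCR + EXPIRE.
// `store` stands in for Redis; a real deployment swaps in atomic Redis calls.
class FixedWindowCounter {
  constructor(windowMs, max) {
    this.windowMs = windowMs;
    this.max = max;
    this.store = new Map(); // key -> { count, windowStart }
  }

  // Returns true if the request is allowed, false once the limit is exceeded.
  hit(key, now = Date.now()) {
    const entry = this.store.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // New window: reset the counter (EXPIRE handles this in Redis).
      this.store.set(key, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.max;
  }
}
```

Because Redis executes `INCR` atomically, every app instance sharing the same key sees the same counter, which is exactly what makes this approach safe across multiple servers.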
### 3. API Gateway Level (Enterprise)
Leverage your cloud provider's API Gateway:
```yaml
# AWS API Gateway example
x-amazon-apigateway-request-validators:
  all:
    validateRequestBody: true
    validateRequestParameters: true
paths:
  /users:
    get:
      x-amazon-apigateway-integration:
        type: aws_proxy
        httpMethod: POST
        uri: arn:aws:apigateway:region:lambda:path/function/arn
      x-amazon-apigateway-throttle:
        rateLimit: 100
        burstLimit: 200
```
**Pros:** No code changes, scales automatically, DDoS protection
**Cons:** Vendor lock-in, potential costs
## Advanced: Token Bucket Algorithm
For more sophisticated rate limiting, implement a token bucket:
```javascript
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens added per second
    this.lastRefill = Date.now();
  }

  consume(tokens = 1) {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens;
      return true;
    }
    return false;
  }

  refill() {
    const now = Date.now();
    const timePassed = (now - this.lastRefill) / 1000;
    const tokensToAdd = timePassed * this.refillRate;
    this.tokens = Math.min(this.capacity, this.tokens + tokensToAdd);
    this.lastRefill = now;
  }
}

// Usage
// Note: this map grows with every new IP; evict stale buckets in production.
const buckets = new Map();

app.use((req, res, next) => {
  const ip = req.ip;
  if (!buckets.has(ip)) {
    buckets.set(ip, new TokenBucket(100, 10)); // 100 capacity, 10 tokens/sec
  }
  const bucket = buckets.get(ip);
  if (bucket.consume(1)) {
    next();
  } else {
    res.status(429).json({ error: 'Rate limit exceeded' });
  }
});
```
## User-Friendly Rate Limiting
Always include helpful headers:
```javascript
// Example with static values; in practice, derive these from the limiter's state.
app.use((req, res, next) => {
  res.setHeader('X-RateLimit-Limit', '100');
  res.setHeader('X-RateLimit-Remaining', '95');
  res.setHeader('X-RateLimit-Reset', new Date(Date.now() + 900000).toISOString());
  next();
});
```
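Rather than hard-coding values, compute the headers from the current window state. A small sketch (the helper name `rateLimitHeaders` is mine, not from any library):

```javascript
// Compute rate-limit header values from window state.
// `limit` = max requests per window, `used` = requests consumed so far,
// `windowStart` and `windowMs` define the current window.
function rateLimitHeaders(limit, used, windowStart, windowMs) {
  const remaining = Math.max(0, limit - used);
  const resetAt = new Date(windowStart + windowMs);
  return {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(remaining),
    'X-RateLimit-Reset': resetAt.toISOString()
  };
}
```

If you're already using express-rate-limit, recent versions can emit these headers for you via the `standardHeaders` and `legacyHeaders` options, so you don't have to track the counts yourself.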
## Different Limits for Different Users
Implement tiered rate limiting:
```javascript
const getRateLimit = (user) => {
  if (user.tier === 'premium') return 1000;
  if (user.tier === 'standard') return 100;
  return 10; // free tier
};

// Create one limiter per tier up front: calling rateLimit() inside the
// request handler would create a fresh store (and fresh counts) per request.
const limiters = new Map();
const getLimiter = (limit) => {
  if (!limiters.has(limit)) {
    limiters.set(limit, rateLimit({
      windowMs: 15 * 60 * 1000,
      max: limit,
      keyGenerator: (req) => req.user.id
    }));
  }
  return limiters.get(limit);
};

app.use(async (req, res, next) => {
  req.user = await authenticateUser(req);
  const limiter = getLimiter(getRateLimit(req.user));
  limiter(req, res, next);
});
```
## Monitoring Your Rate Limits
Track who's hitting limits:
```javascript
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  handler: (req, res) => {
    // Log for analysis
    console.log(`Rate limit exceeded for ${req.ip}`);
    // Send to monitoring
    metrics.increment('rate_limit.exceeded', {
      ip: req.ip,
      endpoint: req.path
    });
    res.status(429).json({
      error: 'Too many requests',
      retryAfter: 900 // seconds
    });
  }
});
```
## Quick Implementation Checklist
- ✅ Choose the right approach for your scale
- ✅ Use Redis for multi-instance deployments
- ✅ Include rate limit headers in responses
- ✅ Implement different tiers for different users
- ✅ Monitor rate limit violations
- ✅ Provide clear error messages with retry information
- ✅ Consider geographic rate limiting for global APIs
## Common Pitfalls to Avoid
- **Rate limiting by IP alone:** Use user IDs when available
- **Too aggressive limits:** Start generous, tighten based on data
- **No burst allowance:** Allow short traffic spikes
- **Ignoring authenticated vs anonymous:** Different limits for each
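The first pitfall boils down to key selection: prefer an authenticated user ID and fall back to IP only for anonymous traffic. A quick sketch, assuming Express with an auth middleware that sets `req.user`:

```javascript
// Choose the rate-limit key: authenticated users are tracked by user ID,
// anonymous requests fall back to IP. Prefixes keep the keyspaces separate.
function rateLimitKey(req) {
  if (req.user && req.user.id) {
    return `user:${req.user.id}`;
  }
  return `ip:${req.ip}`;
}
```

With express-rate-limit, this plugs straight into the `keyGenerator` option, so authenticated users behind a shared NAT don't exhaust each other's quota.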
## Conclusion
Rate limiting is essential for API reliability. Start with a simple in-memory solution, move to Redis as you scale, and consider API Gateway for enterprise needs. Always include helpful headers and monitor violations to fine-tune your limits.
How do you handle rate limiting in your APIs? Any horror stories to share? 🚦