---
title: "Building Rate-Limited APIs That Scale: A Practical Guide"
published: true
description: "Learn how to implement rate limiting for your APIs with in-memory, Redis-based, and API Gateway approaches. Includes token bucket algorithm and monitoring tips."
---
Rate limiting is critical for API stability, but implementing it wrong can frustrate users or fail when you need it most. Here's how to build rate limiting that actually works.
## Why Rate Limiting Matters
Without rate limiting, a single misbehaving client can bring down your entire API. I learned this the hard way when a buggy integration made 10,000 requests per second to our production API. Spoiler: it didn't end well.
## Three Approaches to Rate Limiting
### 1. In-Memory (Simple but Limited)
Good for single-instance APIs or development:
```javascript
const rateLimit = require('express-rate-limit');

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per window
  message: 'Too many requests, please try again later.'
});

app.use('/api/', limiter);
```
**Pros:** Easy to set up, no external dependencies
**Cons:** Doesn't work across multiple servers, resets on restart
### 2. Redis-Based (Production-Ready)
Best for distributed systems:
```javascript
const rateLimit = require('express-rate-limit');
const RedisStore = require('rate-limit-redis');
const redis = require('redis');

const client = redis.createClient({
  host: process.env.REDIS_HOST,
  port: 6379
});

const limiter = rateLimit({
  store: new RedisStore({
    client: client,
    prefix: 'rl:'
  }),
  windowMs: 15 * 60 * 1000,
  max: 100
});

app.use('/api/', limiter);
```
**Pros:** Works across multiple instances, persistent across restarts
**Cons:** Requires Redis infrastructure
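Under the hood, a Redis-backed store boils down to a fixed-window counter: an atomic increment plus an expiry per window (roughly `INCR key` followed by `EXPIRE key windowSeconds`). Here's a minimal sketch of that logic, simulated with an in-process map instead of a real Redis client, just to show what the store is doing for you:

```javascript
// Fixed-window counter: the same logic a Redis store runs via INCR + EXPIRE.
// `store` stands in for Redis; a real deployment swaps in atomic Redis calls.
class FixedWindowCounter {
  constructor(windowMs, max) {
    this.windowMs = windowMs;
    this.max = max;
    this.store = new Map(); // key -> { count, windowStart }
  }

  // Returns true if the request is allowed, false once the limit is exceeded.
  hit(key, now = Date.now()) {
    const entry = this.store.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // New window: reset the counter (EXPIRE handles this in Redis).
      this.store.set(key, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.max;
  }
}
```

Because Redis executes `INCR` atomically, every app instance sharing the same key sees the same counter, which is exactly what makes this approach safe across multiple servers.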
### 3. API Gateway Level (Enterprise)
Leverage your cloud provider's API Gateway:
```yaml
# AWS API Gateway example
x-amazon-apigateway-request-validators:
  all:
    validateRequestBody: true
    validateRequestParameters: true
paths:
  /users:
    get:
      x-amazon-apigateway-integration:
        type: aws_proxy
        httpMethod: POST
        uri: arn:aws:apigateway:region:lambda:path/function/arn
      x-amazon-apigateway-throttle:
        rateLimit: 100
        burstLimit: 200
```
**Pros:** No code changes, scales automatically, DDoS protection
**Cons:** Vendor lock-in, potential costs
## Advanced: Token Bucket Algorithm
For more sophisticated rate limiting, implement a token bucket:
```javascript
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens added per second
    this.lastRefill = Date.now();
  }

  consume(tokens = 1) {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens;
      return true;
    }
    return false;
  }

  refill() {
    const now = Date.now();
    const timePassed = (now - this.lastRefill) / 1000;
    const tokensToAdd = timePassed * this.refillRate;
    this.tokens = Math.min(this.capacity, this.tokens + tokensToAdd);
    this.lastRefill = now;
  }
}

// Usage
// Note: this map grows with every new IP; evict stale buckets in production.
const buckets = new Map();

app.use((req, res, next) => {
  const ip = req.ip;
  if (!buckets.has(ip)) {
    buckets.set(ip, new TokenBucket(100, 10)); // 100 capacity, 10 tokens/sec
  }
  const bucket = buckets.get(ip);
  if (bucket.consume(1)) {
    next();
  } else {
    res.status(429).json({ error: 'Rate limit exceeded' });
  }
});
```
## User-Friendly Rate Limiting
Always include helpful headers:
```javascript
// Example with static values; in practice, derive these from the limiter's state.
app.use((req, res, next) => {
  res.setHeader('X-RateLimit-Limit', '100');
  res.setHeader('X-RateLimit-Remaining', '95');
  res.setHeader('X-RateLimit-Reset', new Date(Date.now() + 900000).toISOString());
  next();
});
```
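Rather than hard-coding values, compute the headers from the current window state. A small sketch (the helper name `rateLimitHeaders` is mine, not from any library):

```javascript
// Compute rate-limit header values from window state.
// `limit` = max requests per window, `used` = requests consumed so far,
// `windowStart` and `windowMs` define the current window.
function rateLimitHeaders(limit, used, windowStart, windowMs) {
  const remaining = Math.max(0, limit - used);
  const resetAt = new Date(windowStart + windowMs);
  return {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(remaining),
    'X-RateLimit-Reset': resetAt.toISOString()
  };
}
```

If you're already using express-rate-limit, recent versions can emit these headers for you via the `standardHeaders` and `legacyHeaders` options, so you don't have to track the counts yourself.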
## Different Limits for Different Users
Implement tiered rate limiting:
```javascript
const getRateLimit = (user) => {
  if (user.tier === 'premium') return 1000;
  if (user.tier === 'standard') return 100;
  return 10; // free tier
};

// Create one limiter per tier up front: calling rateLimit() inside the
// request handler would create a fresh store (and fresh counts) per request.
const limiters = new Map();
const getLimiter = (limit) => {
  if (!limiters.has(limit)) {
    limiters.set(limit, rateLimit({
      windowMs: 15 * 60 * 1000,
      max: limit,
      keyGenerator: (req) => req.user.id
    }));
  }
  return limiters.get(limit);
};

app.use(async (req, res, next) => {
  req.user = await authenticateUser(req);
  const limiter = getLimiter(getRateLimit(req.user));
  limiter(req, res, next);
});
```
## Monitoring Your Rate Limits
Track who's hitting limits:
```javascript
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  handler: (req, res) => {
    // Log for analysis
    console.log(`Rate limit exceeded for ${req.ip}`);
    // Send to monitoring
    metrics.increment('rate_limit.exceeded', {
      ip: req.ip,
      endpoint: req.path
    });
    res.status(429).json({
      error: 'Too many requests',
      retryAfter: 900 // seconds
    });
  }
});
```
## Quick Implementation Checklist
- ✅ Choose the right approach for your scale
- ✅ Use Redis for multi-instance deployments
- ✅ Include rate limit headers in responses
- ✅ Implement different tiers for different users
- ✅ Monitor rate limit violations
- ✅ Provide clear error messages with retry information
- ✅ Consider geographic rate limiting for global APIs
## Common Pitfalls to Avoid
- **Rate limiting by IP alone:** Use user IDs when available
- **Too aggressive limits:** Start generous, tighten based on data
- **No burst allowance:** Allow short traffic spikes
- **Ignoring authenticated vs anonymous:** Different limits for each
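The first pitfall boils down to key selection: prefer an authenticated user ID and fall back to IP only for anonymous traffic. A quick sketch, assuming Express with an auth middleware that sets `req.user`:

```javascript
// Choose the rate-limit key: authenticated users are tracked by user ID,
// anonymous requests fall back to IP. Prefixes keep the keyspaces separate.
function rateLimitKey(req) {
  if (req.user && req.user.id) {
    return `user:${req.user.id}`;
  }
  return `ip:${req.ip}`;
}
```

With express-rate-limit, this plugs straight into the `keyGenerator` option, so authenticated users behind a shared NAT don't exhaust each other's quota.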
## Conclusion
Rate limiting is essential for API reliability. Start with a simple in-memory solution, move to Redis as you scale, and consider API Gateway for enterprise needs. Always include helpful headers and monitor violations to fine-tune your limits.
How do you handle rate limiting in your APIs? Any horror stories to share? 🚦