Rate limiting is one of those things every senior engineer knows they need, yet far too many teams bolt it on at the last minute, usually right after their first DDoS incident or surprise AWS bill. In this tutorial, I'll walk you through building a robust sliding-window rate limiter from scratch using Node.js and Redis, the same pattern used in production at scale.
By the end, you'll have a reusable Express middleware that enforces per-user, per-route limits with graceful degradation when Redis is unavailable.
What We're Building
- A sliding window rate limiter (more accurate than fixed-window)
- Implemented as Express middleware
- Backed by Redis with an in-memory fallback
- Returns proper 429 responses with Retry-After headers
- Fully typed with TypeScript
Prerequisites
- Node.js 18+
- A running Redis instance (locally via Docker or a managed service)
- Familiarity with Express and async/await
Step 1: Project Setup
mkdir rate-limiter-demo && cd rate-limiter-demo
npm init -y
npm install express redis
npm install -D typescript @types/express @types/node ts-node
npx tsc --init
Update your tsconfig.json to set "target": "ES2020" and "moduleResolution": "node".
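For reference, the relevant portion of tsconfig.json looks like this (recent versions of npx tsc --init already set esModuleInterop, which the default-style imports below rely on):
{
  "compilerOptions": {
    "target": "ES2020",
    "moduleResolution": "node",
    "esModuleInterop": true
  }
}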
Step 2: Spin Up Redis Locally
If you don't have Redis running, the fastest path is Docker:
docker run -d --name redis-limiter -p 6379:6379 redis:7-alpine
Step 3: Create the Redis Client
Create src/redisClient.ts:
import { createClient } from 'redis';
const client = createClient({
url: process.env.REDIS_URL || 'redis://localhost:6379',
});
client.on('error', (err) => {
console.error('[Redis] Connection error:', err.message);
});
export const connectRedis = async () => {
if (!client.isOpen) await client.connect();
};
export default client;
One thing worth emphasizing: we're using an error listener rather than letting uncaught exceptions crash the process. In production, Redis blips happen — you want to log and degrade gracefully, not take down your entire API.
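If you want finer control over those blips, node-redis also lets you tune how the client reconnects. A minimal sketch (the backoff numbers here are arbitrary):
import { createClient } from 'redis';

// Back off linearly between reconnect attempts, capped at 3 seconds
const client = createClient({
  url: process.env.REDIS_URL || 'redis://localhost:6379',
  socket: {
    // Receives the retry count, returns the delay in ms before the next attempt
    reconnectStrategy: (retries) => Math.min(retries * 100, 3000),
  },
});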
Step 4: Implement the Sliding Window Algorithm
The fixed-window approach (e.g., "100 requests per minute, reset at :00") has a well-known edge case: a user can send 100 requests at 12:00:59 and another 100 at 12:01:01, effectively doubling your intended limit in a two-second window.
The sliding window algorithm solves this by tracking the actual timestamps of each request within a rolling time range.
Create src/rateLimiter.ts:
import { Request, Response, NextFunction } from 'express';
import redisClient from './redisClient';
interface RateLimiterOptions {
windowMs: number; // Time window in milliseconds
maxRequests: number; // Max requests per window
keyPrefix?: string; // Namespace for Redis keys
}
function getRateLimiter(options: RateLimiterOptions) {
const { windowMs, maxRequests, keyPrefix = 'rl' } = options;
return async function rateLimiterMiddleware(
req: Request,
res: Response,
next: NextFunction
) {
// Identify the requester — prefer authenticated user ID, fall back to IP
const identifier =
(req as any).user?.id || req.ip || 'anonymous';
const key = `${keyPrefix}:${req.path}:${identifier}`;
const now = Date.now();
const windowStart = now - windowMs;
try {
// Batch the commands in a MULTI/EXEC transaction so they execute atomically
const pipeline = redisClient.multi();
// Remove timestamps outside the current window
pipeline.zRemRangeByScore(key, 0, windowStart);
// Count remaining requests in the window
pipeline.zCard(key);
// Add current request timestamp
pipeline.zAdd(key, { score: now, value: `${now}-${Math.random()}` });
// Set key TTL to auto-expire (prevent orphaned keys)
pipeline.expire(key, Math.ceil(windowMs / 1000));
const results = await pipeline.exec();
const requestCount = results[1] as number;
// Set informational headers
res.setHeader('X-RateLimit-Limit', maxRequests);
res.setHeader('X-RateLimit-Remaining', Math.max(0, maxRequests - requestCount - 1));
res.setHeader('X-RateLimit-Reset', new Date(now + windowMs).toISOString());
if (requestCount >= maxRequests) {
const retryAfterSeconds = Math.ceil(windowMs / 1000);
res.setHeader('Retry-After', retryAfterSeconds);
return res.status(429).json({
error: 'Too Many Requests',
message: `Rate limit exceeded. Try again in ${retryAfterSeconds} seconds.`,
retryAfter: retryAfterSeconds,
});
}
next();
} catch (err) {
// Redis is down — fail open to avoid blocking legitimate traffic
console.error('[RateLimiter] Redis unavailable, failing open:', err);
next();
}
};
}
export default getRateLimiter;
A few senior-level decisions worth noting here:
Why zAdd with a random suffix? Members of a Redis sorted set are unique: calling zAdd with an existing member updates its score rather than adding a new entry. If two requests landed in the same millisecond and used the bare timestamp as the member, the second would silently overwrite the first and undercount. The random suffix keeps every request as a distinct entry. Note also that the zAdd runs whether or not the request is ultimately rejected, so over-limit requests still count toward the window; a client that keeps hammering stays locked out until it genuinely backs off.
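You can verify the deduplication yourself with a quick sketch against the same client (the demo key name is arbitrary):
import redisClient, { connectRedis } from './redisClient';

async function demo() {
  await connectRedis();
  // Same member twice: the second zAdd only updates the score
  await redisClient.zAdd('dedupe-demo', { score: 1, value: 'same-member' });
  await redisClient.zAdd('dedupe-demo', { score: 2, value: 'same-member' });
  console.log(await redisClient.zCard('dedupe-demo')); // prints 1, not 2
}

demo();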
Why fail open? If Redis goes down, the alternative is failing closed — which means your entire API goes down with it. For most use cases, briefly losing rate limiting protection is preferable to a full outage. For high-security endpoints, you might reverse this.
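If you do want to reverse it for a sensitive endpoint, one option is a failClosed flag on the options (hypothetical; it isn't in the RateLimiterOptions interface above). The catch block in rateLimiter.ts would then become:
// Assumes `failClosed` was destructured from options alongside windowMs and maxRequests
} catch (err) {
  console.error('[RateLimiter] Redis unavailable:', err);
  if (failClosed) {
    // Fail closed: reject the request rather than admit unthrottled traffic
    return res.status(503).json({ error: 'Service Unavailable' });
  }
  next(); // fail open, as before
}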
Step 5: Wire It Into Express
Create src/index.ts:
import express from 'express';
import { connectRedis } from './redisClient';
import getRateLimiter from './rateLimiter';
const app = express();
app.use(express.json());
// Global limit: 100 requests per minute per user/IP
const globalLimiter = getRateLimiter({
windowMs: 60 * 1000,
maxRequests: 100,
keyPrefix: 'global',
});
// Strict limit for auth endpoints: 10 per minute
const authLimiter = getRateLimiter({
windowMs: 60 * 1000,
maxRequests: 10,
keyPrefix: 'auth',
});
app.use(globalLimiter);
app.post('/login', authLimiter, (req, res) => {
res.json({ message: 'Login endpoint' });
});
app.get('/health', (req, res) => {
res.json({ status: 'ok' });
});
const start = async () => {
await connectRedis();
app.listen(3000, () => console.log('Server running on port 3000'));
};
start();
Step 6: Test It
Run the server:
npx ts-node src/index.ts
In a separate terminal, fire off rapid requests:
for i in {1..12}; do curl -s -o /dev/null -w "%{http_code}\n" -X POST http://localhost:3000/login; done
You'll see 200s for the first 10 requests, then 429s — exactly as intended.
Step 7: Verify the Headers
curl -s -D - -o /dev/null -X POST http://localhost:3000/login
You should see something like this (the exact values depend on how many requests are still in the current window):
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 8
X-RateLimit-Reset: 2024-11-15T14:32:00.000Z
These headers let well-behaved clients implement their own backoff logic without guessing.
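Here's a minimal client-side sketch of that backoff, using the built-in fetch from Node 18+ (a production client would cap the number of retries):
// Retry once after the server-suggested delay when the response is a 429
async function fetchWithBackoff(url: string, init?: RequestInit): Promise<Response> {
  const res = await fetch(url, init);
  if (res.status !== 429) return res;
  const retryAfterSeconds = Number(res.headers.get('Retry-After') ?? '1');
  await new Promise((resolve) => setTimeout(resolve, retryAfterSeconds * 1000));
  return fetch(url, init);
}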
Taking It Further in Production
Once this is working, a few things I'd add before shipping:
Tiered limits by user role. Your free tier might get 60 req/min while paid users get 600. You can pass a maxRequests resolver function instead of a static number and derive it from req.user.plan (see the sketch after this list).
Distributed key namespacing. In a multi-region setup, you might want per-region limits vs. global limits. Prefix your keys accordingly (us-east:rl:... vs. global:rl:...).
Metrics and alerting. Log a warning when any key hits 80% of its limit. That's your early signal of abuse or a runaway client before the 429s start.
Testing with a mocked Redis. Swap the Redis client for a mock in your test environment so your rate limiter unit tests don't require a live Redis instance (ioredis-mock targets the ioredis client, so with node-redis you'd reach for an equivalent mock or an in-memory Redis server).
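To make the first point concrete, here's a sketch of the tiered-limit idea. The plan names and numbers are invented, and it assumes RateLimiterOptions is widened so maxRequests can be a per-request resolver:
import { Request } from 'express';

// Hypothetical widening: maxRequests as a static number or a per-request resolver
type MaxRequests = number | ((req: Request) => number);

const PLAN_LIMITS: Record<string, number> = { free: 60, pro: 600 };

// Resolve the limit from the authenticated user's plan, defaulting to free tier
const maxRequestsForUser = (req: Request): number =>
  PLAN_LIMITS[(req as any).user?.plan as string] ?? PLAN_LIMITS.free;

// Inside rateLimiterMiddleware, resolving the limit per request is a one-line change:
// const limit = typeof maxRequests === 'function' ? maxRequests(req) : maxRequests;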
Conclusion
A well-built rate limiter isn't just a security control; it's a first-class citizen of your API's contract with clients. The sliding window approach gives you accuracy, Redis gives you shared state across instances, and the fail-open design keeps a cache blip from becoming a customer-facing incident.
The full code for this tutorial is available on GitHub. Drop a comment below if you have questions or want me to cover token bucket rate limiting next.