Muhammad
How to Build a Production-Ready API Rate Limiter with Node.js and Redis

Rate limiting is one of those things every senior engineer knows they need, but far too many teams bolt it on at the last minute — usually right after their first DDoS incident or surprise AWS bill. In this tutorial, I'll walk you through building a robust, sliding-window rate limiter from scratch using Node.js and Redis, the same pattern used in production at scale.

By the end, you'll have a reusable Express middleware that enforces per-user, per-route limits with graceful degradation when Redis is unavailable.


What We're Building

  • A sliding window rate limiter (more accurate than fixed-window)
  • Implemented as Express middleware
  • Backed by Redis with an in-memory fallback
  • Returns proper 429 responses with Retry-After headers
  • Fully typed with TypeScript

Prerequisites

  • Node.js 18+
  • A running Redis instance (locally via Docker or a managed service)
  • Familiarity with Express and async/await

Step 1: Project Setup

mkdir rate-limiter-demo && cd rate-limiter-demo
npm init -y
npm install express redis
npm install -D typescript @types/express @types/node ts-node
npx tsc --init

Update your tsconfig.json to set "target": "ES2020" and "moduleResolution": "node".
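For reference, here's roughly what the relevant portion of tsconfig.json looks like after those changes (the other options shown are typical defaults from `npx tsc --init` — keep whatever the generator gave you):

```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "strict": true,
    "outDir": "dist"
  }
}
```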


Step 2: Spin Up Redis Locally

If you don't have Redis running, the fastest path is Docker:

docker run -d --name redis-limiter -p 6379:6379 redis:7-alpine

Step 3: Create the Redis Client

Create src/redisClient.ts:

import { createClient } from 'redis';

const client = createClient({
  url: process.env.REDIS_URL || 'redis://localhost:6379',
});

client.on('error', (err) => {
  console.error('[Redis] Connection error:', err.message);
});

export const connectRedis = async () => {
  if (!client.isOpen) await client.connect();
};

export default client;

One thing worth emphasizing: we're using an error listener rather than letting uncaught exceptions crash the process. In production, Redis blips happen — you want to log and degrade gracefully, not take down your entire API.


Step 4: Implement the Sliding Window Algorithm

The fixed-window approach (e.g., "100 requests per minute, reset at :00") has a well-known edge case: a user can send 100 requests at 12:00:59 and another 100 at 12:01:01, effectively doubling your intended limit in a two-second window.

The sliding window algorithm solves this by tracking the actual timestamps of each request within a rolling time range.
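Before wiring in Redis, the core idea can be sketched as a plain in-memory counter. This is a hypothetical single-process version for illustration only — in the real implementation below, a Redis sorted set plays the role of the `timestamps` array so the state is shared across instances:

```typescript
// In-memory sliding-window sketch (single process only).
class SlidingWindowCounter {
  private timestamps: number[] = [];

  constructor(
    private windowMs: number,
    private maxRequests: number
  ) {}

  // Returns true if a request arriving at `now` (ms) is allowed.
  hit(now: number): boolean {
    const windowStart = now - this.windowMs;
    // Drop timestamps that have fallen out of the rolling window
    this.timestamps = this.timestamps.filter((t) => t > windowStart);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

Because the window rolls with each request, the 12:00:59 / 12:01:01 burst described above can no longer double the effective limit.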

Create src/rateLimiter.ts:

import { Request, Response, NextFunction } from 'express';
import redisClient from './redisClient';

interface RateLimiterOptions {
  windowMs: number;   // Time window in milliseconds
  maxRequests: number; // Max requests per window
  keyPrefix?: string;  // Namespace for Redis keys
}

function getRateLimiter(options: RateLimiterOptions) {
  const { windowMs, maxRequests, keyPrefix = 'rl' } = options;

  return async function rateLimiterMiddleware(
    req: Request,
    res: Response,
    next: NextFunction
  ) {
    // Identify the requester — prefer authenticated user ID, fall back to IP
    const identifier =
      (req as any).user?.id || req.ip || 'anonymous';
    const key = `${keyPrefix}:${req.path}:${identifier}`;
    const now = Date.now();
    const windowStart = now - windowMs;

    try {
      // Use a pipeline to batch Redis commands atomically
      const pipeline = redisClient.multi();

      // Remove timestamps outside the current window
      pipeline.zRemRangeByScore(key, 0, windowStart);

      // Count remaining requests in the window
      pipeline.zCard(key);

      // Add current request timestamp
      pipeline.zAdd(key, { score: now, value: `${now}-${Math.random()}` });

      // Set key TTL to auto-expire (prevent orphaned keys)
      pipeline.expire(key, Math.ceil(windowMs / 1000));

      const results = await pipeline.exec();
      // zCard was queued second in the pipeline, so its reply is at index 1
      const requestCount = Number(results[1]);

      // Set informational headers
      res.setHeader('X-RateLimit-Limit', maxRequests);
      res.setHeader('X-RateLimit-Remaining', Math.max(0, maxRequests - requestCount - 1));
      res.setHeader('X-RateLimit-Reset', new Date(now + windowMs).toISOString());

      if (requestCount >= maxRequests) {
        const retryAfterSeconds = Math.ceil(windowMs / 1000);
        res.setHeader('Retry-After', retryAfterSeconds);
        return res.status(429).json({
          error: 'Too Many Requests',
          message: `Rate limit exceeded. Try again in ${retryAfterSeconds} seconds.`,
          retryAfter: retryAfterSeconds,
        });
      }

      next();
    } catch (err) {
      // Redis is down — fail open to avoid blocking legitimate traffic
      console.error('[RateLimiter] Redis unavailable, failing open:', err);
      next();
    }
  };
}

export default getRateLimiter;

A few senior-level decisions worth noting here:

Why zAdd with a random suffix? Redis sorted sets use the value as a tiebreaker when scores are equal. If two requests arrive at the exact same millisecond, identical values would deduplicate. Adding a random suffix prevents that.
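Here's a tiny illustration of that pitfall, using a `Set` as a stand-in for the sorted set's member-uniqueness behavior:

```typescript
const now = 1700000000000; // two requests in the same millisecond

// Without a suffix, identical member values collapse into one entry —
// the second request would silently vanish from the count
const bare = new Set<string>([`${now}`, `${now}`]);

// With a random suffix, each request stays distinct
const suffixed = new Set<string>([
  `${now}-${Math.random()}`,
  `${now}-${Math.random()}`,
]);
```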

Why fail open? If Redis goes down, the alternative is failing closed — which means your entire API goes down with it. For most use cases, briefly losing rate limiting protection is preferable to a full outage. For high-security endpoints, you might reverse this.


Step 5: Wire It Into Express

Create src/index.ts:

import express from 'express';
import { connectRedis } from './redisClient';
import getRateLimiter from './rateLimiter';

const app = express();
app.use(express.json());

// Global limit: 100 requests per minute per user/IP
const globalLimiter = getRateLimiter({
  windowMs: 60 * 1000,
  maxRequests: 100,
  keyPrefix: 'global',
});

// Strict limit for auth endpoints: 10 per minute
const authLimiter = getRateLimiter({
  windowMs: 60 * 1000,
  maxRequests: 10,
  keyPrefix: 'auth',
});

app.use(globalLimiter);

app.post('/login', authLimiter, (req, res) => {
  res.json({ message: 'Login endpoint' });
});

app.get('/health', (req, res) => {
  res.json({ status: 'ok' });
});

const start = async () => {
  await connectRedis();
  app.listen(3000, () => console.log('Server running on port 3000'));
};

start();

Step 6: Test It

Run the server:

npx ts-node src/index.ts

In a separate terminal, fire off rapid requests:

for i in {1..12}; do curl -s -o /dev/null -w "%{http_code}\n" -X POST http://localhost:3000/login; done

You'll see 200s for the first 10 requests, then 429s — exactly as intended.


Step 7: Verify the Headers

curl -I -X POST http://localhost:3000/login

You should see headers like the following (the exact Remaining value depends on how many requests you've already made in the current window):

X-RateLimit-Limit: 10
X-RateLimit-Remaining: 8
X-RateLimit-Reset: 2024-11-15T14:32:00.000Z

These headers let well-behaved clients implement their own backoff logic without guessing.
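As a sketch of what such a client might do (the function name and the 1-second default here are hypothetical; the header names match the ones the middleware sets):

```typescript
// Given a 429 response's headers, decide how long to wait before retrying.
function backoffMs(headers: Record<string, string>): number {
  const retryAfter = headers['retry-after'];
  if (retryAfter !== undefined) {
    const seconds = Number(retryAfter);
    if (!Number.isNaN(seconds)) return seconds * 1000;
  }
  // Fall back to the reset timestamp if Retry-After is absent
  const reset = headers['x-ratelimit-reset'];
  if (reset) {
    return Math.max(0, Date.parse(reset) - Date.now());
  }
  return 1000; // no hint from the server: retry after 1s
}
```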


Taking It Further in Production

Once this is working, a few things I'd add before shipping:

Tiered limits by user role. Your free tier might get 60 req/min while paid users get 600. You can pass a maxRequests resolver function instead of a static number and derive it from req.user.plan.
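One possible shape for that resolver (the plan names and numbers here are illustrative, not prescriptive):

```typescript
type Plan = 'free' | 'pro' | 'enterprise';

// Hypothetical per-plan ceilings — derive these from your own pricing tiers
const PLAN_LIMITS: Record<Plan, number> = {
  free: 60,
  pro: 600,
  enterprise: 6000,
};

// Unknown or unauthenticated users get the free-tier limit
function resolveMaxRequests(plan: Plan | undefined): number {
  return plan ? PLAN_LIMITS[plan] : PLAN_LIMITS.free;
}
```

The middleware would then call this with `req.user?.plan` instead of reading a static `maxRequests` option.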

Distributed key namespacing. In a multi-region setup, you might want per-region limits vs. global limits. Prefix your keys accordingly (us-east:rl:... vs. global:rl:...).

Metrics and alerting. Log a warning when any key hits 80% of its limit. That's your early signal of abuse or a runaway client before the 429s start.

Testing with ioredis-mock. Swap the Redis client with a mock in your test environment so your rate limiter unit tests don't require a live Redis instance.


Conclusion

A well-built rate limiter isn't just a security control; it's a first-class citizen of your API's contract with clients. The sliding window approach gives you accuracy, Redis gives you shared state across instances, and the fail-open design keeps a cache blip from becoming a customer-facing incident.

The full code for this tutorial is available on GitHub. Drop a comment below if you have questions or want me to cover token bucket rate limiting next.
