Building a Production-Ready Rate Limiter in Node.js

Wilson Xu

Rate limiting is one of those things developers ignore until they get hit by a botnet, a runaway script, or a competitor scraping their API. By then, it's too late — your server is melting, your database is overwhelmed, and legitimate users are getting errors.

In this guide, we'll build a production-ready rate limiter from scratch in Node.js. We'll implement the token bucket algorithm, integrate Redis for distributed rate limiting across multiple servers, add a sliding window counter for precision, and package everything as a reusable Express middleware.

This isn't a "just install express-rate-limit" tutorial. We're going deep — understanding the algorithms, their tradeoffs, and how to make rate limiting work reliably at scale.


Why Rate Limiting Matters

Before we write code, let's be clear about what we're protecting against:

  1. DDoS mitigation — Limit how fast any single IP can send requests
  2. API abuse — Prevent one customer from consuming all your capacity
  3. Brute force protection — Slow down password-guessing attacks
  4. Cost control — Cap expensive operations (AI calls, SMS, emails) per user
  5. Fair usage — Ensure all users get their fair share of resources

Different threats need different strategies. A login endpoint needs aggressive limits (10 attempts/minute). A read API can be more generous (1000 requests/minute). We'll build a flexible system that handles both.


Algorithm Overview: Four Approaches

1. Fixed Window Counter

Simplest approach: count requests in fixed time windows (e.g., 0:00-0:01, 0:01-0:02).

Problem: A user can send 1000 requests at 0:00:59 and another 1000 at 0:01:01 — 2000 requests in 2 seconds, bypassing a "1000/minute" limit.
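To make the boundary problem concrete, here is a hypothetical minimal fixed-window counter (names and API are illustrative, not part of the code we build later):

```typescript
// Hypothetical fixed-window counter, for illustration only
class FixedWindowCounter {
  private counts = new Map<string, { window: number; count: number }>();

  constructor(private limit: number, private windowMs: number) {}

  allow(key: string, now = Date.now()): boolean {
    const window = Math.floor(now / this.windowMs);
    const entry = this.counts.get(key);
    if (!entry || entry.window !== window) {
      // New window: the old count is forgotten entirely
      this.counts.set(key, { window, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count++;
    return true;
  }
}

// 1000 requests at 0:59 and 1000 more at 1:01 all pass a "1000/minute" limit
const fw = new FixedWindowCounter(1000, 60_000);
let allowed = 0;
for (let i = 0; i < 1000; i++) if (fw.allow('user', 59_000)) allowed++;
for (let i = 0; i < 1000; i++) if (fw.allow('user', 61_000)) allowed++;
console.log(allowed); // 2000
```

Because the counter resets at the window boundary, the burst straddling the boundary is invisible to it.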

2. Sliding Window Log

Store timestamps of every request. Count how many fall within the last N seconds.

Problem: Memory-intensive. Storing timestamps for millions of users is expensive.
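A sliding window log can be sketched in a few lines (again illustrative, not reused later); note that every timestamp is retained per key, which is exactly the memory cost described above:

```typescript
// Hypothetical sliding-window-log limiter: keeps every request timestamp
class SlidingWindowLog {
  private logs = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  allow(key: string, now = Date.now()): boolean {
    // Drop timestamps that fell outside the rolling window
    const log = (this.logs.get(key) ?? []).filter(t => t > now - this.windowMs);
    const allowed = log.length < this.limit;
    if (allowed) log.push(now);
    this.logs.set(key, log);
    return allowed;
  }
}

const swl = new SlidingWindowLog(2, 1000); // 2 requests per rolling second
console.log(swl.allow('u', 0));    // true
console.log(swl.allow('u', 500));  // true
console.log(swl.allow('u', 900));  // false (two requests in the last second)
console.log(swl.allow('u', 1100)); // true (the t=0 entry has expired)
```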

3. Sliding Window Counter (Hybrid)

Best of both worlds. Use two fixed windows and interpolate based on position within the current window. Very accurate, memory-efficient.
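The interpolation itself is a one-liner; a sketch of the math (the Redis-backed version of this algorithm is built out in Part 3):

```typescript
// Weighted count for the hybrid sliding window: the previous window's
// count is discounted by how far we are into the current window
function weightedCount(
  previousCount: number,
  currentCount: number,
  elapsedInWindowMs: number,
  windowMs: number
): number {
  const previousWeight = 1 - elapsedInWindowMs / windowMs;
  return Math.floor(previousCount * previousWeight + currentCount);
}

// 30% into a 60s window: the previous window contributes 70% of its count
console.log(weightedCount(100, 20, 18_000, 60_000)); // 90
```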

4. Token Bucket

Each user has a "bucket" of tokens. Each request consumes one token. Tokens refill at a fixed rate. Users can burst up to the bucket size.

Advantage: Allows natural bursting while enforcing average rate limits. Great for APIs.

We'll implement token bucket as our primary algorithm and sliding window as an alternative.


Setting Up the Project

mkdir rate-limiter-demo && cd rate-limiter-demo
npm init -y
npm install express redis ioredis
npm install -D typescript @types/express @types/node ts-node

Part 1: In-Memory Token Bucket

Let's start with a pure in-memory implementation to understand the algorithm:

// src/algorithms/tokenBucket.ts

interface BucketState {
  tokens: number;
  lastRefill: number;
}

interface TokenBucketOptions {
  capacity: number;        // Max tokens (burst limit)
  refillRate: number;      // Tokens added per second
  refillInterval?: number; // How often to refill (ms), default 1000
}

export class TokenBucket {
  private buckets = new Map<string, BucketState>();
  private options: Required<TokenBucketOptions>;

  constructor(options: TokenBucketOptions) {
    this.options = {
      refillInterval: 1000,
      ...options
    };

    // Clean up stale buckets periodically; unref() so this timer
    // doesn't keep the Node process alive on its own
    setInterval(() => this.cleanup(), 60_000).unref();
  }

  /**
   * Attempt to consume tokens for a given key.
   * Returns { allowed: true } if within limit, or
   * { allowed: false, retryAfter: ms } if limited.
   */
  consume(key: string, tokens = 1): { allowed: boolean; remaining: number; retryAfter?: number } {
    const now = Date.now();
    const bucket = this.buckets.get(key) ?? { tokens: this.options.capacity, lastRefill: now };

    // Calculate tokens to add since last refill
    const elapsed = now - bucket.lastRefill;
    const tokensToAdd = (elapsed / 1000) * this.options.refillRate;

    // Refill the bucket (don't exceed capacity)
    const currentTokens = Math.min(
      this.options.capacity,
      bucket.tokens + tokensToAdd
    );

    if (currentTokens >= tokens) {
      // Allow the request
      this.buckets.set(key, {
        tokens: currentTokens - tokens,
        lastRefill: now
      });

      return {
        allowed: true,
        remaining: Math.floor(currentTokens - tokens)
      };
    } else {
      // Deny the request
      // Calculate how long until they have enough tokens
      const deficit = tokens - currentTokens;
      const waitMs = Math.ceil((deficit / this.options.refillRate) * 1000);

      // Update lastRefill even when denied (to track time accurately)
      this.buckets.set(key, {
        tokens: currentTokens,
        lastRefill: now
      });

      return {
        allowed: false,
        remaining: Math.floor(currentTokens),
        retryAfter: waitMs
      };
    }
  }

  // Remove stale entries to prevent memory leaks
  private cleanup(): void {
    const now = Date.now();
    const staleThreshold = 5 * 60 * 1000; // 5 minutes

    for (const [key, bucket] of this.buckets) {
      if (now - bucket.lastRefill > staleThreshold) {
        this.buckets.delete(key);
      }
    }
  }

  getStats(key: string): BucketState | null {
    return this.buckets.get(key) ?? null;
  }
}

Let's test this works correctly:

// Quick test
const bucket = new TokenBucket({ capacity: 10, refillRate: 2 }); // 2 tokens/sec, burst of 10

// Burst: first 10 requests succeed
for (let i = 0; i < 10; i++) {
  const result = bucket.consume('user-123');
  console.log(`Request ${i + 1}: ${result.allowed}, remaining: ${result.remaining}`);
}

// 11th request fails
const limited = bucket.consume('user-123');
console.log('11th:', limited); // { allowed: false, remaining: 0, retryAfter: ~500 }

// After 1 second, 2 new tokens available
setTimeout(() => {
  const result = bucket.consume('user-123');
  console.log('After 1s:', result); // { allowed: true, remaining: 1 }
}, 1000);

Part 2: Redis-Backed Token Bucket for Production

In-memory rate limiting breaks the moment you run multiple server instances. If you have 3 servers, a user can send 3x the limit. Redis solves this — all servers share the same rate limit state.

The key is making the token consumption atomic using Redis Lua scripts:

// src/algorithms/redisTokenBucket.ts
import Redis from 'ioredis';

const TOKEN_BUCKET_SCRIPT = `
-- KEYS[1]: the bucket key (e.g., "ratelimit:user:123")
-- ARGV[1]: bucket capacity
-- ARGV[2]: refill rate (tokens per second)
-- ARGV[3]: tokens to consume
-- ARGV[4]: current timestamp (milliseconds)

local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local requested = tonumber(ARGV[3])
local now = tonumber(ARGV[4])

-- Get current bucket state
local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
local current_tokens = tonumber(bucket[1])
local last_refill = tonumber(bucket[2])

-- Initialize if first request
if current_tokens == nil then
  current_tokens = capacity
  last_refill = now
end

-- Calculate token refill
local elapsed = (now - last_refill) / 1000  -- convert to seconds
local tokens_to_add = elapsed * refill_rate
current_tokens = math.min(capacity, current_tokens + tokens_to_add)

local allowed = 0
local retry_after = 0

if current_tokens >= requested then
  -- Allow the request
  current_tokens = current_tokens - requested
  allowed = 1
else
  -- Deny: calculate wait time
  local deficit = requested - current_tokens
  retry_after = math.ceil((deficit / refill_rate) * 1000)
end

-- Save updated state with TTL (auto-cleanup)
redis.call('HSET', key,
  'tokens', current_tokens,
  'last_refill', now
)
redis.call('PEXPIRE', key, math.ceil(capacity / refill_rate * 1000) + 5000)

return {allowed, math.floor(current_tokens), retry_after}
`;

interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  retryAfter?: number;
  resetAt?: Date;
}

interface RedisTokenBucketOptions {
  capacity: number;
  refillRate: number;
  keyPrefix?: string;
}

export class RedisTokenBucket {
  private redis: Redis;
  private options: Required<RedisTokenBucketOptions>;
  private scriptSha?: string;

  constructor(redis: Redis, options: RedisTokenBucketOptions) {
    this.redis = redis;
    this.options = {
      keyPrefix: 'ratelimit',
      ...options
    };
  }

  async initialize(): Promise<void> {
    // Load the Lua script and cache its SHA for efficiency
    this.scriptSha = await this.redis.script('LOAD', TOKEN_BUCKET_SCRIPT) as string;
    console.log('Rate limiter script loaded, SHA:', this.scriptSha);
  }

  async consume(identifier: string, tokens = 1): Promise<RateLimitResult> {
    const key = `${this.options.keyPrefix}:${identifier}`;
    const now = Date.now();

    try {
      let result: [number, number, number];

      if (this.scriptSha) {
        try {
          result = await this.redis.evalsha(
            this.scriptSha,
            1,
            key,
            this.options.capacity,
            this.options.refillRate,
            tokens,
            now
          ) as [number, number, number];
        } catch (err: any) {
          // Script cache may have been flushed (SCRIPT FLUSH, failover,
          // or a restart) — clear the SHA and retry via plain EVAL
          if (String(err?.message).includes('NOSCRIPT')) {
            this.scriptSha = undefined;
            return this.consume(identifier, tokens);
          }
          throw err;
        }
      } else {
        // Fallback: load inline (slower but safe)
        result = await this.redis.eval(
          TOKEN_BUCKET_SCRIPT,
          1,
          key,
          this.options.capacity,
          this.options.refillRate,
          tokens,
          now
        ) as [number, number, number];
      }

      const [allowed, remaining, retryAfter] = result;

      return {
        allowed: allowed === 1,
        remaining,
        retryAfter: retryAfter > 0 ? retryAfter : undefined,
        resetAt: retryAfter > 0 ? new Date(now + retryAfter) : undefined
      };
    } catch (err) {
      // On Redis failure, fail open (allow the request) rather than
      // block all traffic. Log the error for monitoring.
      console.error('Rate limiter Redis error:', err);
      return { allowed: true, remaining: -1 };
    }
  }
}

Part 3: Sliding Window Counter

For endpoints where precise rate limiting matters (login, payment, sensitive operations), use the sliding window algorithm:

// src/algorithms/slidingWindow.ts
import Redis from 'ioredis';

const SLIDING_WINDOW_SCRIPT = `
-- Sliding window rate limiter using two fixed windows
-- KEYS[1]: current window key
-- KEYS[2]: previous window key
-- ARGV[1]: max requests per window
-- ARGV[2]: window size in seconds
-- ARGV[3]: current timestamp (seconds)
-- ARGV[4]: window index (current_time // window_size)

local current_key = KEYS[1]
local previous_key = KEYS[2]
local max_requests = tonumber(ARGV[1])
local window_size = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local current_window = tonumber(ARGV[4])

-- Get counts from both windows
local current_count = tonumber(redis.call('GET', current_key) or '0')
local previous_count = tonumber(redis.call('GET', previous_key) or '0')

-- Calculate weight of previous window
-- If we're 30% into the current window, previous window contributes 70%
local window_start = current_window * window_size
local elapsed_in_window = now - window_start
local previous_weight = 1 - (elapsed_in_window / window_size)

-- Weighted request count
local weighted_count = math.floor(
  previous_count * previous_weight + current_count
)

if weighted_count >= max_requests then
  -- Calculate when the window resets enough to allow a request
  local next_window_start = (current_window + 1) * window_size
  local retry_after = next_window_start - now
  return {0, max_requests - weighted_count, retry_after}
end

-- Increment current window
local new_count = redis.call('INCR', current_key)
if new_count == 1 then
  -- Set TTL on first increment (2x window to keep previous window)
  redis.call('EXPIRE', current_key, window_size * 2)
end

return {1, max_requests - weighted_count - 1, 0}
`;

interface SlidingWindowOptions {
  windowSize: number;   // seconds
  maxRequests: number;
  keyPrefix?: string;
}

export class SlidingWindowRateLimiter {
  private redis: Redis;
  private options: Required<SlidingWindowOptions>;

  constructor(redis: Redis, options: SlidingWindowOptions) {
    this.redis = redis;
    this.options = { keyPrefix: 'sw', ...options };
  }

  async consume(identifier: string): Promise<{
    allowed: boolean;
    remaining: number;
    retryAfter?: number;
  }> {
    const { windowSize, maxRequests, keyPrefix } = this.options;
    const now = Math.floor(Date.now() / 1000);
    const currentWindow = Math.floor(now / windowSize);

    const currentKey = `${keyPrefix}:${identifier}:${currentWindow}`;
    const previousKey = `${keyPrefix}:${identifier}:${currentWindow - 1}`;

    const result = await this.redis.eval(
      SLIDING_WINDOW_SCRIPT,
      2,
      currentKey,
      previousKey,
      maxRequests,
      windowSize,
      now,
      currentWindow
    ) as [number, number, number];

    const [allowed, remaining, retryAfter] = result;

    return {
      allowed: allowed === 1,
      remaining: Math.max(0, remaining),
      // The script returns seconds; normalize to ms to match the token bucket
      retryAfter: retryAfter > 0 ? retryAfter * 1000 : undefined
    };
  }
}

Part 4: Express Middleware

Now let's wrap everything in a clean, configurable Express middleware:

// src/middleware/rateLimiter.ts
import { Request, Response, NextFunction } from 'express';
import Redis from 'ioredis';
import { RedisTokenBucket } from '../algorithms/redisTokenBucket';
import { SlidingWindowRateLimiter } from '../algorithms/slidingWindow';

type Algorithm = 'token-bucket' | 'sliding-window';

interface RateLimiterMiddlewareOptions {
  // Algorithm selection
  algorithm?: Algorithm;

  // Token bucket options
  capacity?: number;
  refillRate?: number;

  // Sliding window options
  windowSize?: number;
  maxRequests?: number;

  // Key extraction: what identifies a "user"?
  keyExtractor?: (req: Request) => string;

  // Skip rate limiting for certain requests
  skip?: (req: Request) => boolean;

  // Custom response when limited
  onLimited?: (req: Request, res: Response) => void;

  // Redis connection
  redis: Redis;

  // Namespace for this limiter (allows multiple limiters)
  name?: string;
}

// Default key extractor: use IP address.
// Only trust X-Forwarded-For when it is set by your own proxy/load balancer.
const defaultKeyExtractor = (req: Request): string => {
  const ip =
    req.headers['x-forwarded-for']?.toString().split(',')[0] ||
    req.headers['x-real-ip']?.toString() ||
    req.socket.remoteAddress ||
    'unknown';
  return `ip:${ip}`;
};

export function createRateLimiter(options: RateLimiterMiddlewareOptions) {
  const {
    algorithm = 'token-bucket',
    capacity = 100,
    refillRate = 10,
    windowSize = 60,
    maxRequests = 100,
    keyExtractor = defaultKeyExtractor,
    skip,
    onLimited,
    redis,
    name = 'default'
  } = options;

  // Initialize the rate limiter based on algorithm
  let limiter: RedisTokenBucket | SlidingWindowRateLimiter;

  if (algorithm === 'token-bucket') {
    const bucket = new RedisTokenBucket(redis, {
      capacity,
      refillRate,
      keyPrefix: `ratelimit:${name}`
    });

    // Initialize async (load Lua script)
    bucket.initialize().catch(console.error);
    limiter = bucket;
  } else {
    limiter = new SlidingWindowRateLimiter(redis, {
      windowSize,
      maxRequests,
      keyPrefix: `ratelimit:${name}`
    });
  }

  // Return the Express middleware
  return async (req: Request, res: Response, next: NextFunction): Promise<void> => {
    // Skip if configured
    if (skip?.(req)) {
      return next();
    }

    const key = keyExtractor(req);

    try {
      const result = await limiter.consume(key);

      // Set rate limit headers. RFC 6585 defines the 429 status code;
      // the X-RateLimit-* headers are a widely used de facto convention.
      const limit = algorithm === 'token-bucket' ? capacity : maxRequests;
      res.setHeader('X-RateLimit-Limit', limit);
      res.setHeader('X-RateLimit-Remaining', result.remaining);

      if (result.retryAfter) {
        res.setHeader('Retry-After', Math.ceil(result.retryAfter / 1000));
        // Reset time as a Unix timestamp in seconds, the common convention
        res.setHeader('X-RateLimit-Reset', Math.ceil((Date.now() + result.retryAfter) / 1000));
      }

      if (!result.allowed) {
        if (onLimited) {
          onLimited(req, res);
        } else {
          res.status(429).json({
            error: 'Too Many Requests',
            message: 'You have exceeded the rate limit. Please slow down.',
            retryAfter: result.retryAfter
              ? Math.ceil(result.retryAfter / 1000)
              : undefined
          });
        }
        return;
      }

      next();
    } catch (err) {
      // Log but don't block traffic on limiter failure
      console.error(`Rate limiter error for key ${key}:`, err);
      next();
    }
  };
}

Part 5: Real-World Usage

Here's how to use this in a real Express application:

// src/app.ts
import express from 'express';
import Redis from 'ioredis';
import { createRateLimiter } from './middleware/rateLimiter';

const app = express();
const redis = new Redis(process.env.REDIS_URL || 'redis://localhost:6379');

// Global rate limiter: 1000 requests/minute per IP
const globalLimiter = createRateLimiter({
  redis,
  name: 'global',
  algorithm: 'token-bucket',
  capacity: 1000,
  refillRate: 16.67, // ~1000/minute
});

// Auth limiter: strict sliding window for login
const authLimiter = createRateLimiter({
  redis,
  name: 'auth',
  algorithm: 'sliding-window',
  windowSize: 900,  // 15 minutes
  maxRequests: 10,  // 10 attempts per 15 min
  keyExtractor: (req) => {
    // Rate limit by IP AND username together
    const ip = req.socket.remoteAddress || 'unknown';
    const username = req.body?.username || 'unknown';
    return `${ip}:${username}`;
  },
  onLimited: (req, res) => {
    res.status(429).json({
      error: 'Account temporarily locked',
      message: 'Too many failed login attempts. Please wait 15 minutes.',
      unlockAt: new Date(Date.now() + 900_000).toISOString()
    });
  }
});

// API limiter: per-user token bucket
const apiLimiter = createRateLimiter({
  redis,
  name: 'api',
  algorithm: 'token-bucket',
  capacity: 500,
  refillRate: 8.33, // 500/minute
  keyExtractor: (req) => {
    // Use authenticated user ID if available, fall back to IP
    const userId = (req as any).user?.id;
    return userId ? `user:${userId}` : `ip:${req.socket.remoteAddress}`;
  },
  skip: (req) => {
    // Don't rate limit health checks
    return req.path === '/health';
  }
});

// AI/expensive operation limiter: very strict
const aiLimiter = createRateLimiter({
  redis,
  name: 'ai',
  algorithm: 'sliding-window',
  windowSize: 3600,  // 1 hour
  maxRequests: 20,   // 20 AI calls per hour per user
  keyExtractor: (req) => `user:${(req as any).user?.id || 'anon'}`
});

// Apply limiters
app.use(globalLimiter);

app.post('/auth/login', authLimiter, async (req, res) => {
  // Login logic here
});

app.use('/api', apiLimiter);

app.post('/api/ai/generate', aiLimiter, async (req, res) => {
  // Expensive AI operation
});

app.listen(3000, () => console.log('Server running on port 3000'));

Part 6: Tiered Rate Limits

Real APIs have different limits for different plan tiers:

// src/middleware/tieredRateLimiter.ts
import { Request, Response, NextFunction } from 'express';
import Redis from 'ioredis';
import { RedisTokenBucket } from '../algorithms/redisTokenBucket';

interface PlanLimits {
  capacity: number;
  refillRate: number;
}

const PLAN_LIMITS: Record<string, PlanLimits> = {
  free:       { capacity: 100,   refillRate: 1.67  }, // 100/min
  starter:    { capacity: 1000,  refillRate: 16.67 }, // 1000/min
  pro:        { capacity: 10000, refillRate: 166.7 }, // 10k/min
  enterprise: { capacity: 100000, refillRate: 1667 }  // 100k/min
};

export function createTieredRateLimiter(redis: Redis) {
  // Create a bucket per plan
  const limiters = new Map<string, RedisTokenBucket>();

  for (const [plan, limits] of Object.entries(PLAN_LIMITS)) {
    const bucket = new RedisTokenBucket(redis, {
      ...limits,
      keyPrefix: `ratelimit:${plan}`
    });
    bucket.initialize().catch(console.error);
    limiters.set(plan, bucket);
  }

  return async (req: Request, res: Response, next: NextFunction) => {
    const user = (req as any).user;
    const plan = user?.plan || 'free';
    const limiter = limiters.get(plan) || limiters.get('free')!;

    const result = await limiter.consume(`user:${user?.id || req.ip}`);

    res.setHeader('X-RateLimit-Plan', plan);
    res.setHeader('X-RateLimit-Limit', PLAN_LIMITS[plan].capacity);
    res.setHeader('X-RateLimit-Remaining', result.remaining);

    if (!result.allowed) {
      return res.status(429).json({
        error: 'Rate limit exceeded',
        plan,
        upgradeUrl: 'https://myapp.com/pricing'
      });
    }

    next();
  };
}

Part 7: Monitoring Rate Limits

Rate limiting generates valuable signals. Track them:

// src/middleware/rateLimitMonitor.ts
import { Request, Response, NextFunction } from 'express';

type RateLimiterMiddleware = (
  req: Request,
  res: Response,
  next: NextFunction
) => Promise<void> | void;

// Module-level so the stats endpoint below can read it
export const metrics = {
  totalRequests: 0,
  blockedRequests: 0,
  blocksByKey: new Map<string, number>()
};

export function wrapWithMonitoring(limiter: RateLimiterMiddleware) {

  return async (req: Request, res: Response, next: NextFunction) => {
    metrics.totalRequests++;

    const originalJson = res.json.bind(res);
    res.json = (body: any) => {
      if (res.statusCode === 429) {
        metrics.blockedRequests++;
        const key = req.ip || 'unknown';
        metrics.blocksByKey.set(key, (metrics.blocksByKey.get(key) || 0) + 1);

        // Alert on suspicious patterns (same IP blocked 100+ times)
        const blockCount = metrics.blocksByKey.get(key)!;
        if (blockCount > 100 && blockCount % 100 === 0) {
          console.warn(`Possible attack: ${key} has been blocked ${blockCount} times`);
          // Could trigger IP ban, alert Slack, etc.
        }
      }
      return originalJson(body);
    };

    return limiter(req, res, next);
  };
}

// Expose a metrics endpoint (in app.ts; assumes `metrics` is in scope,
// e.g. exported from the monitor module)
app.get('/internal/rate-limit-stats', (req, res) => {
  const topBlockedIPs = Array.from(metrics.blocksByKey.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, 10);

  res.json({
    totalRequests: metrics.totalRequests,
    blockedRequests: metrics.blockedRequests,
    blockRate: metrics.totalRequests > 0
      ? metrics.blockedRequests / metrics.totalRequests
      : 0,
    topBlockedIPs
  });
});

Testing the Rate Limiter

// tests/rateLimiter.test.ts
import { TokenBucket } from '../src/algorithms/tokenBucket';

describe('TokenBucket', () => {
  let bucket: TokenBucket;

  beforeEach(() => {
    bucket = new TokenBucket({ capacity: 10, refillRate: 2 });
  });

  it('should allow requests up to capacity', () => {
    for (let i = 0; i < 10; i++) {
      const result = bucket.consume('test-user');
      expect(result.allowed).toBe(true);
    }
  });

  it('should block requests over capacity', () => {
    for (let i = 0; i < 10; i++) bucket.consume('test-user');
    const result = bucket.consume('test-user');
    expect(result.allowed).toBe(false);
    expect(result.retryAfter).toBeGreaterThan(0);
  });

  it('should refill tokens over time', async () => {
    // Exhaust tokens
    for (let i = 0; i < 10; i++) bucket.consume('test-user');

    // Wait for refill (mock or real time)
    await new Promise(resolve => setTimeout(resolve, 1000));

    const result = bucket.consume('test-user');
    expect(result.allowed).toBe(true);
    // After 1s at 2 tokens/sec, should have ~2 tokens
    expect(result.remaining).toBeCloseTo(1, 0);
  });

  it('should track different users independently', () => {
    for (let i = 0; i < 10; i++) bucket.consume('user-a');

    // user-b should not be affected
    const result = bucket.consume('user-b');
    expect(result.allowed).toBe(true);
  });
});

Common Pitfalls

1. Not accounting for clock skew in distributed systems
The Lua scripts above use a timestamp passed in from the application server (ARGV), so two app servers with drifting clocks can disagree about how many tokens have refilled. Keep your servers synced via NTP, or have the script ask Redis itself for the time so there is a single authoritative clock.
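One way to get a single authoritative clock is to let the Lua script ask Redis for the time instead of passing a timestamp from the app. A sketch, not the version used above; redis.call('TIME') returns [seconds, microseconds] as strings:

```typescript
// Sketch only: derive the timestamp inside the script from Redis's clock
const CLOCK_SAFE_SNIPPET = `
-- On Redis < 5, call redis.replicate_commands() before writing after TIME
local t = redis.call('TIME')
local now = tonumber(t[1]) * 1000 + math.floor(tonumber(t[2]) / 1000)
-- ...use 'now' for the refill math instead of an ARGV timestamp...
return now
`;
```

The tradeoff: older Redis versions restrict writes after nondeterministic commands like TIME, which is why the ARGV approach is more portable.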

2. Forgetting to fail open
If Redis is down, should you block all requests? Almost certainly not. Fail open (allow requests) and log the error. The alternative — blocking all traffic when Redis restarts — is much worse.

3. Rate limiting by IP behind a load balancer
req.socket.remoteAddress gives you the load balancer's IP, not the user's. Always use X-Forwarded-For (and trust only your own load balancer's value).

4. Not setting proper TTLs
Without TTLs on Redis keys, your rate limit data grows forever. Every Lua script above sets appropriate TTLs.


Conclusion

A production rate limiter is more than a simple counter. The token bucket algorithm handles bursting gracefully; Redis atomicity ensures correctness under concurrent load; the sliding window gives precision for sensitive endpoints.

The middleware pattern we built is:

  • Algorithm-agnostic — swap implementations without changing your routes
  • Fail-safe — Redis failures don't bring down your API
  • Observable — rate limit events generate actionable metrics
  • Flexible — per-plan, per-endpoint, per-user limits with custom key extraction

Start with the in-memory version for local development, then add Redis for staging and production. Layer in monitoring early — the block patterns you see will tell you a lot about how your API is being used (and abused).


Wilson Xu is a backend engineer who builds distributed systems and developer tools. He writes about Node.js, Redis, and API design.
