Odunayo Dada

Protect Your Node.js API: Rate Limiting with Fixed Window, Sliding Window, and Token Bucket

Rate limiting is a strategy for capping the number of requests a client or user can make to a network, application, or API within a specified time window (for example, per second or per minute).

Why is Rate Limiting Important?

1. Protects Resources from Misuse

Without rate limiting, a single client (or bot) can make thousands of requests within seconds. This can crash your server, drive up costs (if you pay per API call or for compute time), and degrade performance for every other user. With rate limiting, you prevent any single client from monopolizing your system’s resources.

2. Stops Denial-of-Service (DoS) Attacks

Attackers often try to flood servers with traffic to make a service unavailable. Rate limiting reduces the impact of such an attack by rejecting abusive requests before they consume all of your bandwidth, memory, or CPU. While it’s not a complete solution against volumetric Distributed DoS (DDoS) attacks, it’s an essential first line of defense.

3. Enforces Fair Use

In multi-user systems or public APIs, you want everyone to have a fair share of access. Rate limiting ensures that one client doesn’t hog the service at the expense of others. For example, in an API that allows 100 requests per minute, every user gets the same opportunity, preventing abuse and maintaining a consistent experience across the board.

In this article, we’ll look at three (3) popular approaches for implementing rate limiting.

  1. Fixed window

  2. Sliding window

  3. Token bucket / Leaky bucket

1. Fixed Window

The Fixed Window strategy counts requests within a fixed time window (such as a second, a minute, or an hour). When the window ends, the count resets too.

For example, if an API is limited to 5 requests per minute, it accepts at most 5 requests from a client in each one-minute window, and the counter resets when the next window starts.

import { Request, Response, NextFunction } from 'express';

// Configuration
const WINDOW_SIZE_IN_MS = 60_000; // 1 minute
const MAX_REQUESTS = 5; // per IP per window

// Store: { ip -> { count, windowStart } }
type Entry = { count: number; windowStart: number };
const store = new Map<string, Entry>();

export function fixedWindowLimiter(req: Request, res: Response, next: NextFunction) {
  const ip = req.ip || req.socket.remoteAddress || 'unknown'; // req.connection is deprecated in favor of req.socket
  const now = Date.now();

  let entry = store.get(ip);

  if (!entry) {
    // First request for this IP
    store.set(ip, { count: 1, windowStart: now });
    return next();
  }

  // If current window expired → reset counter
  if (now - entry.windowStart >= WINDOW_SIZE_IN_MS) {
    entry = { count: 1, windowStart: now };
    store.set(ip, entry);
    return next();
  }

  // Still inside the window
  entry.count++;

  if (entry.count > MAX_REQUESTS) {
    const retryAfter = Math.ceil((entry.windowStart + WINDOW_SIZE_IN_MS - now) / 1000);

    res.setHeader('Retry-After', retryAfter.toString());
    return res.status(429).json({
      success: false,
      message: `Too many requests. Try again in ${retryAfter}s`
    });
  }

  return next();
}
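
To see the counting rule on its own, here is a minimal framework-free sketch of the same fixed-window logic. The `createFixedWindow` helper and the sample IP are hypothetical names used only for illustration, and the clock is passed in as a parameter so the behavior is easy to check by hand.

```typescript
// Hypothetical helper for illustration; not part of Express or any library.
type WindowEntry = { count: number; windowStart: number };

function createFixedWindow(maxRequests: number, windowMs: number) {
  const entries = new Map<string, WindowEntry>();

  // Returns true if a request from `key` is allowed at time `now` (in ms).
  return function allow(key: string, now: number): boolean {
    const entry = entries.get(key);
    if (!entry || now - entry.windowStart >= windowMs) {
      // First request, or the previous window expired: start a new window
      entries.set(key, { count: 1, windowStart: now });
      return true;
    }
    entry.count++;
    return entry.count <= maxRequests;
  };
}

const allow = createFixedWindow(5, 60_000);
const results: boolean[] = [];
for (let i = 0; i < 6; i++) {
  // Six requests that all fall inside the same one-minute window
  results.push(allow('203.0.113.7', 1_000 + i));
}
// Exactly one window after the first request, the counter starts over
const afterReset = allow('203.0.113.7', 61_000);

console.log(results, afterReset); // first five true, sixth false, then true
```

Passing `now` in explicitly, instead of calling `Date.now()` inside the function, is only a convenience for the example; the middleware above reads the clock itself.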

2. Sliding Window

Unlike fixed window rate limiting (which resets counters at regular intervals), the sliding window method continuously evaluates requests based on a moving time window.

For example, an API with a limit of 100 requests per minute:

If a user has sent 90 requests in the last 50 seconds, they can send only 10 more in the next 10 seconds.

As time passes, the window slides forward, dropping old requests and counting new ones.

import { Request, Response, NextFunction } from 'express';
import Redis from 'ioredis';

const redis = new Redis();
const WINDOW_SIZE = 60; // seconds
const MAX_REQUESTS = 100;

export async function slidingWindowLimiter(req: Request, res: Response, next: NextFunction) {
  const key = `sliding:${req.ip}`;
  const now = Date.now();

  const windowStart = now - WINDOW_SIZE * 1000;

  // Remove old requests outside window
  await redis.zremrangebyscore(key, 0, windowStart);

  // Count requests in window
  const count = await redis.zcard(key);

  if (count >= MAX_REQUESTS) {
    res.setHeader('Retry-After', String(WINDOW_SIZE));
    return res.status(429).json({ message: 'Too many requests' });
  }

  // Add current request; the random suffix keeps members unique when two requests share a millisecond
  await redis.zadd(key, now, `${now}-${Math.random()}`);
  await redis.expire(key, WINDOW_SIZE);

  next();
}
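
The 90-then-10 example above can be reproduced with a small in-memory version of the same idea. This is only a sketch: `createSlidingWindow` and the `client-a` key are hypothetical, timestamps are passed in by hand, and it keeps a plain array of timestamps per client where the middleware uses a Redis sorted set.

```typescript
// Hypothetical in-memory sliding-window log; names are illustrative only.
function createSlidingWindow(maxRequests: number, windowMs: number) {
  const log = new Map<string, number[]>(); // key -> timestamps of accepted requests

  return function allowRequest(key: string, now: number): boolean {
    const windowStart = now - windowMs;
    // Keep only the timestamps that are still inside the moving window
    const recent = (log.get(key) ?? []).filter((t) => t > windowStart);
    const allowed = recent.length < maxRequests;
    if (allowed) recent.push(now);
    log.set(key, recent);
    return allowed;
  };
}

const allowSliding = createSlidingWindow(100, 60_000);

// 90 requests arrive shortly after t = 10 s
let firstBatch = 0;
for (let i = 0; i < 90; i++) {
  if (allowSliding('client-a', 10_000 + i)) firstBatch++;
}

// 50 seconds later the client tries 15 more; only 10 fit in the window
let secondBatch = 0;
for (let j = 0; j < 15; j++) {
  if (allowSliding('client-a', 60_000 + j)) secondBatch++;
}

// Once the first 90 requests have slid out of the window, there is room again
const afterSlide = allowSliding('client-a', 70_100);

console.log(firstBatch, secondBatch, afterSlide); // 90 10 true
```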

3. Token Bucket / Leaky Bucket

A token bucket allows bursts up to a fixed capacity and then refills gradually. The closely related leaky bucket enforces a strict, constant rate of processing, smoothing out traffic.

Here is how the leaky bucket works:

  1. Requests are added to a queue (the bucket).
  2. The bucket leaks at a fixed rate (requests are processed one at a time at regular intervals).
  3. If the bucket overflows (too many requests), excess requests are dropped.
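
The three steps above can be sketched as a small queue. This is a minimal illustration, not a production limiter: `createLeakyBucket` and the request ids are hypothetical, and `leak()` is called by hand here, whereas a real server would call it from a timer at the fixed leak rate.

```typescript
// Hypothetical leaky-bucket queue; function and method names are illustrative.
function createLeakyBucket<T>(capacity: number) {
  const queue: T[] = [];

  return {
    // Steps 1 and 3: enqueue the request, or drop it if the bucket is full
    add(item: T): boolean {
      if (queue.length >= capacity) return false;
      queue.push(item);
      return true;
    },
    // Step 2: meant to be called at a fixed interval; releases at most one request
    leak(): T | undefined {
      return queue.shift();
    },
    size(): number {
      return queue.length;
    },
  };
}

const bucket = createLeakyBucket<string>(3);

// Four requests arrive at once; the bucket only holds three, so r4 is dropped
const acceptedIds = ['r1', 'r2', 'r3', 'r4'].filter((id) => bucket.add(id));

const first = bucket.leak();    // one tick later, the oldest request is processed
const retry = bucket.add('r5'); // the leak freed a slot, so r5 is accepted

console.log(acceptedIds, first, retry, bucket.size()); // [ 'r1', 'r2', 'r3' ] r1 true 3
```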

import { Request, Response, NextFunction } from 'express';
import { RateLimiterRedis } from 'rate-limiter-flexible';
import Redis from 'ioredis';

const redis = new Redis();

const limiter = new RateLimiterRedis({
  storeClient: redis,
  keyPrefix: 'bucket',
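  points: 150, // maximum points per duration (the bucket capacity)
  duration: 60, // seconds before consumed points reset (150 requests per minute)
  execEvenly: true // delay allowed requests so they are spread evenly over the duration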
});

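export async function tokenBucketLimiter(req: Request, res: Response, next: NextFunction) {
  try {
    await limiter.consume(req.ip ?? 'unknown', 1);
    next();
  } catch (rejRes) {
    // A thrown Error means the store (Redis) failed, not that the client is over the limit
    if (rejRes instanceof Error) {
      return next(rejRes);
    }
    const { msBeforeNext } = rejRes as { msBeforeNext: number };
    const retrySecs = Math.ceil(msBeforeNext / 1000) || 1;
    res.set('Retry-After', String(retrySecs));
    res.status(429).json({ message: 'Too many requests, retry later' });
  }
}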
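
The middleware above delegates the bookkeeping to a library. To show the token bucket idea itself (bursts up to a capacity, then gradual refill), here is a minimal in-memory sketch. `createTokenBucket`, its parameters, and the `client-a` key are hypothetical and are not how `rate-limiter-flexible` works internally; timestamps are passed in by hand.

```typescript
// Hypothetical in-memory token bucket; names are illustrative only.
function createTokenBucket(capacity: number, refillPerSecond: number) {
  const buckets = new Map<string, { tokens: number; last: number }>();

  return function allowRequest(key: string, now: number): boolean {
    // A new client starts with a full bucket
    const b = buckets.get(key) ?? { tokens: capacity, last: now };
    // Refill in proportion to the elapsed time, never above capacity
    const elapsedSec = (now - b.last) / 1000;
    b.tokens = Math.min(capacity, b.tokens + elapsedSec * refillPerSecond);
    b.last = now;
    // Each request costs one token
    const allowed = b.tokens >= 1;
    if (allowed) b.tokens -= 1;
    buckets.set(key, b);
    return allowed;
  };
}

// Burst of up to 3 requests, refilled at 1 token per second
const allowToken = createTokenBucket(3, 1);

const burst = [0, 0, 0, 0].map((t) => allowToken('client-a', t)); // 4th request finds the bucket empty
const later = allowToken('client-a', 1_000);   // one second later, one token is back
const tooSoon = allowToken('client-a', 1_200); // only 0.2 tokens have refilled

console.log(burst, later, tooSoon); // [ true, true, true, false ] true false
```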

Rate limiting is one of the simplest yet most effective tools for protecting your APIs. Whether you choose fixed window, sliding window or token bucket, the right strategy depends on your app’s traffic patterns and scaling needs.

Fixed Window: easy to implement, good for small projects.
Sliding Window: fairer distribution, great for APIs with steady load.
Token Bucket / Leaky Bucket: best for production, balances bursts and consistency.

Conclusion

Rate limiting is more than just a performance trick — it’s a safeguard against abuse, downtime, and unfair usage. We’ve explored three powerful strategies: Fixed Window, Sliding Window, and Token/Leaky Bucket.

But this is just the beginning. In my YouTube video, I’ll show you how to set up real rate limiting in a Node.js app, step by step, with live coding examples and best practices to keep your APIs secure and efficient.

👉 Watch the full video here
