Rate Limiting in Node.js: The Complete Guide for Production APIs (2026)

TL;DR: Rate limiting protects your API from abuse and ensures fair usage. This guide covers algorithms, implementation patterns, distributed systems, and common pitfalls—with practical code examples.


Every production API needs rate limiting. Without it, a single bad actor (or a bug in a client) can take down your entire service. This guide covers everything you need to implement rate limiting correctly in Node.js.

Why Rate Limiting Matters

Rate limiting serves three purposes:

1. Protection Against Abuse

Bots, scrapers, and attackers can flood your API with requests. Rate limiting caps how much damage they can do.

2. Fair Resource Allocation

Without limits, one aggressive client can starve others. Rate limiting ensures everyone gets a fair share.

3. API Monetization

SaaS products use tiered rate limits to differentiate pricing tiers. Free users get 100 req/hour, paid users get 10,000.

Real Examples of What Goes Wrong

  • Twitter (2013): API abuse caused widespread outages. They introduced aggressive rate limiting.
  • GitHub: Rate limits all API calls. Unauthenticated: 60 req/hour. Authenticated: 5,000 req/hour.
  • Stripe: 100 read operations/sec, 25 write operations/sec per API key.

If these companies need rate limiting, so do you.

Rate Limiting Algorithms Explained

Four main algorithms power rate limiters. Each has tradeoffs.

Fixed Window

Divide time into fixed slots (0:00-1:00, 1:00-2:00). Count requests per slot. Reset count at slot boundary.

Window 1 (0:00-1:00): ████████░░ 80/100 ✓
Window 2 (1:00-2:00): ██████████ 100/100 ✓
Window 3 (2:00-3:00): ███░░░░░░░ 30/100 ✓

Pros:

  • Simple to implement
  • O(1) time and space complexity
  • Easy to understand and debug

Cons:

  • Burst at boundaries (100 at 0:59 + 100 at 1:01 = 200 in 2 seconds)

Best for: Most general-purpose APIs where exact precision isn't critical.
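
To make the mechanics concrete, here is a minimal in-memory sketch (single process; the names are illustrative, and old slot keys still need sweeping, as covered under pitfalls later):

// Fixed window: one counter per key per time slot
const counters = new Map()

function fixedWindowAllow(key, limit, windowMs) {
  const slot = Math.floor(Date.now() / windowMs)  // index of the current window
  const slotKey = `${key}:${slot}`
  const count = (counters.get(slotKey) || 0) + 1
  counters.set(slotKey, count)  // NOTE: old slot keys are never removed here
  return count <= limit
}

// 100 requests per 60-second window
fixedWindowAllow('ip:1.2.3.4', 100, 60_000)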

Sliding Window Log

Store the timestamp of every request. Count requests in the last N seconds.

Requests: [0:15, 0:22, 0:45, 0:58, 1:03, 1:15]
Window (last 60s at 1:20): [0:22, 0:45, 0:58, 1:03, 1:15] = 5 requests

Pros:

  • Precise rate limiting
  • No burst at boundaries
  • Smooth traffic shaping

Cons:

  • Memory grows with traffic (store every timestamp)
  • O(n) lookups
  • Not practical for high-traffic APIs

Best for: Low-traffic, precision-critical scenarios (billing, security).
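
A minimal sketch of the log variant (in-memory, illustrative names); the per-request filter is where the O(n) cost and the memory growth come from:

// Sliding window log: keep every timestamp, evict the ones outside the window
const logs = new Map()

function slidingLogAllow(key, limit, windowMs) {
  const now = Date.now()
  // This filter is the O(n) cost: every request rescans the key's history
  const log = (logs.get(key) || []).filter((t) => t > now - windowMs)
  if (log.length >= limit) {
    logs.set(key, log)
    return false
  }
  log.push(now)
  logs.set(key, log)
  return true
}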

Sliding Window Counter

Hybrid approach. Use weighted average of current and previous window counts.

Previous window (0:00-1:00): 80 requests
Current window (1:00-2:00): 30 requests so far
Time into current window: 20 seconds (33%)

Estimated count = (80 * 0.67) + 30 = 83.6 requests

Pros:

  • Smooths the burst problem
  • O(1) operations
  • Good accuracy without storing every timestamp

Cons:

  • Approximation, not exact
  • Slightly more complex to implement

Best for: High-traffic APIs needing smoother limiting than fixed window.
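
A minimal sketch of the weighted estimate (in-memory, illustrative names); plugging in the numbers above (previous 80, current 30, 33% elapsed) gives the same 83.6:

// Sliding window counter: weight the previous window by how much of it still overlaps
const windows = new Map()  // key -> { slot, current, previous }

function slidingCounterAllow(key, limit, windowMs) {
  const now = Date.now()
  const slot = Math.floor(now / windowMs)
  let w = windows.get(key)

  if (!w || w.slot !== slot) {
    // New slot: the old current count becomes "previous" (0 if a slot was skipped)
    w = { slot, current: 0, previous: w && w.slot === slot - 1 ? w.current : 0 }
    windows.set(key, w)
  }

  const elapsed = (now % windowMs) / windowMs  // fraction of current window elapsed
  const estimated = w.previous * (1 - elapsed) + w.current

  if (estimated >= limit) return false
  w.current++
  return true
}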

Token Bucket

Tokens added at fixed rate. Each request consumes one token. Requests denied when bucket empty.

Bucket capacity: 100 tokens
Refill rate: 10 tokens/second

Time 0: 100 tokens
Burst of 100 requests: 0 tokens
After 5 seconds: 50 tokens (refilled)

Pros:

  • Allows controlled bursts
  • Smooth rate limiting
  • Flexible configuration

Cons:

  • More complex to implement
  • Requires tracking token count and last refill time

Best for: APIs that want to allow short bursts while limiting sustained traffic.
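
A minimal sketch with lazy refill (in-memory, illustrative names), matching the 100-capacity, 10-tokens/second example above:

// Token bucket: refill lazily based on time elapsed since the last request
const buckets = new Map()  // key -> { tokens, last }

function tokenBucketAllow(key, capacity, refillPerSec) {
  const now = Date.now()
  const b = buckets.get(key) || { tokens: capacity, last: now }

  // Top up for the elapsed time, never exceeding capacity
  b.tokens = Math.min(capacity, b.tokens + ((now - b.last) / 1000) * refillPerSec)
  b.last = now
  buckets.set(key, b)

  if (b.tokens < 1) return false
  b.tokens -= 1
  return true
}

// Capacity 100, refill 10 tokens/second
tokenBucketAllow('ip:1.2.3.4', 100, 10)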

Decision Tree

Need precision? → Sliding Window Log
High traffic + allow bursts? → Token Bucket
High traffic + simple? → Fixed Window
High traffic + smooth? → Sliding Window Counter

Most APIs? Start with Fixed Window. It's simple, fast, and good enough.

Implementing Rate Limiting in Node.js

The Naive Approach (Don't Do This)

// Don't do this in production
import express from 'express'

const app = express()
const requests = {}  // one entry per IP, never cleaned up

app.use((req, res, next) => {
  const ip = req.ip
  const now = Date.now()

  if (!requests[ip] || now - requests[ip].startTime > 60000) {
    // First request from this IP, or the 60s window expired: start a new window
    requests[ip] = { count: 1, startTime: now }
  } else {
    requests[ip].count++
  }

  if (requests[ip].count > 100) {
    return res.status(429).json({ error: 'Too many requests' })
  }

  next()
})

Problems:

  • Memory leak (entries never cleaned up)
  • No headers (clients don't know their limits)
  • No persistence (resets on server restart)
  • Race conditions in production

Using HitLimit (Recommended)

HitLimit handles all the edge cases:

import express from 'express'
import { hitlimit } from '@joint-ops/hitlimit'

const app = express()

// Zero-config: 100 requests per minute per IP
app.use(hitlimit())

// Or with custom limits
app.use(hitlimit({
  limit: 100,
  window: '15m',   // 15 minutes
  // Adds these headers to responses:
  // RateLimit-Limit: 100
  // RateLimit-Remaining: 95
  // RateLimit-Reset: 1706547600
}))

app.listen(3000)

Protecting Specific Routes

Don't rate limit everything equally. Health checks and static assets don't need limits. Auth endpoints need strict limits.

// Global default
app.use(hitlimit({ limit: 1000, window: '1h' }))

// Strict limit on login
app.use('/auth/login', hitlimit({
  limit: 5,
  window: '15m',
  // After 5 attempts, clients must wait out the 15-minute window
}))

// Strict limit on registration
app.use('/auth/register', hitlimit({
  limit: 3,
  window: '1h'
}))

// Skip health checks entirely
app.use(hitlimit({
  skip: (req) => req.path === '/health'
}))

Handling 429 Responses

When rate limited, return useful information:

app.use(hitlimit({
  limit: 100,
  window: '1h',
  response: (info) => ({
    error: 'RATE_LIMITED',
    message: 'Too many requests. Please slow down.',
    limit: info.limit,
    remaining: 0,
    resetAt: new Date(info.resetAt).toISOString(),
    retryAfter: info.resetIn  // seconds until reset
  })
}))

Clients receive:

{
  "error": "RATE_LIMITED",
  "message": "Too many requests. Please slow down.",
  "limit": 100,
  "remaining": 0,
  "resetAt": "2026-01-30T15:00:00.000Z",
  "retryAfter": 1847
}
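
On the client side, honor retryAfter instead of hammering the endpoint. A minimal sketch using fetch (the field names follow the response shape above):

// Retry once after the server-provided delay
async function fetchWithRetry(url) {
  const res = await fetch(url)
  if (res.status !== 429) return res

  const body = await res.json()
  // Wait the number of seconds the server asked for, then try again
  await new Promise((resolve) => setTimeout(resolve, body.retryAfter * 1000))
  return fetch(url)
}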

Tiered Rate Limits

SaaS products need different limits for different users. HitLimit has this built-in:

app.use(hitlimit({
  tiers: {
    anonymous: { limit: 10, window: '1h' },
    free: { limit: 100, window: '1h' },
    pro: { limit: 5000, window: '1h' },
    enterprise: { limit: Infinity }
  },
  tier: (req) => {
    if (!req.user) return 'anonymous'
    return req.user.plan || 'free'
  }
}))

Now your API automatically applies the right limits based on user context.

Custom Rate Limit Keys

By default, rate limiting uses IP address. But IPs aren't always the right key:

  • Shared IPs: Corporate networks might share one IP for thousands of users
  • API keys: You want to limit by API key, not IP
  • User accounts: Logged-in users should be limited by user ID

app.use(hitlimit({
  key: (req) => {
    // 1. API key takes precedence
    if (req.headers['x-api-key']) {
      return `api:${req.headers['x-api-key']}`
    }

    // 2. Logged-in user ID
    if (req.user?.id) {
      return `user:${req.user.id}`
    }

    // 3. Fall back to IP
    return `ip:${req.ip}`
  }
}))

Distributed Rate Limiting

Single-server rate limiting is easy. Rate limiting across multiple servers is hard.

The Problem

You have 5 servers. Each has its own in-memory rate limiter set to 100 req/min.

Result? Users can make 500 req/min (100 per server).

Solution: Shared Store

Use Redis as a shared counter:

import { hitlimit } from '@joint-ops/hitlimit'
import { redisStore } from '@joint-ops/hitlimit/stores/redis'

app.use(hitlimit({
  limit: 100,
  window: '1m',
  store: redisStore({
    url: 'redis://localhost:6379',
    prefix: 'rl:'  // Key prefix in Redis
  })
}))

All servers increment the same counter. True distributed rate limiting.
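
Under the hood, the pattern is an atomic increment on a per-window key. A minimal sketch of that pattern using ioredis directly (an assumed dependency here, not HitLimit's internals):

import Redis from 'ioredis'

const redis = new Redis('redis://localhost:6379')

// Fixed-window counter shared by every server: INCR is atomic in Redis
async function allow(key, limit, windowSec) {
  const slot = Math.floor(Date.now() / (windowSec * 1000))
  const redisKey = `rl:${key}:${slot}`

  const count = await redis.incr(redisKey)
  if (count === 1) {
    // First hit in this window: expire the key so Redis cleans up old windows
    await redis.expire(redisKey, windowSec)
  }
  return count <= limit
}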

Handling Redis Failures

Redis goes down. What happens?

Fail-closed: Reject all requests. Safe but your API stops working.

Fail-open: Allow all requests. Risky but service continues.

HitLimit lets you decide:

app.use(hitlimit({
  store: redisStore({ url: 'redis://localhost:6379' }),
  onStoreError: (error, req) => {
    console.error('Redis error:', error)

    // Protect critical routes even if Redis is down
    if (req.path.startsWith('/admin')) return 'deny'

    // Allow other traffic to continue
    return 'allow'
  }
}))

SQLite: Persistence Without Redis

Don't want to run Redis? SQLite gives you persistence without the ops overhead:

import { hitlimit } from '@joint-ops/hitlimit'
import { sqliteStore } from '@joint-ops/hitlimit/stores/sqlite'

app.use(hitlimit({
  store: sqliteStore({
    path: './rate-limits.db'  // File-based persistence
  })
}))

Good for single-server deployments where you want limits to survive restarts.

Common Pitfalls

1. Trusting X-Forwarded-For Blindly

Behind a proxy, req.ip might be wrong. But don't blindly trust X-Forwarded-For—it can be spoofed.

// Configure Express to trust your proxy
app.set('trust proxy', 1)  // Trust first proxy

// Or be specific about which proxies to trust
app.set('trust proxy', 'loopback, 10.0.0.0/8')

Only trust headers from proxies you control.

2. Not Handling Proxy Chains

Multiple proxies? The header looks like:

X-Forwarded-For: client, proxy1, proxy2

The leftmost IP is the client. But if you trust the wrong number of proxies, you'll rate limit the wrong IP.
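
If you need to resolve the client IP yourself, count hops from the right, since only the entries appended by your own proxies are trustworthy. A minimal sketch (trustedProxies is an illustrative parameter, not an Express setting):

// Resolve the client IP from X-Forwarded-For, trusting a known number of proxies
function clientIp(req, trustedProxies = 1) {
  const header = req.headers['x-forwarded-for']
  if (!header) return req.socket.remoteAddress

  const ips = header.split(',').map((ip) => ip.trim())
  // remoteAddress is the first trusted hop; each additional trusted proxy
  // consumes one entry from the right, so the client sits trustedProxies
  // positions from the end of the list
  return ips[Math.max(ips.length - trustedProxies, 0)]
}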

3. Overly Strict Limits

Too strict = frustrated legitimate users. Start generous and tighten based on data:

// Start here
{ limit: 1000, window: '1h' }

// Tighten if you see abuse
{ limit: 500, window: '1h' }

// But monitor for false positives

4. Memory Leaks from Unbounded Stores

Custom in-memory stores must clean up old entries. Otherwise, memory grows forever.

HitLimit handles this automatically with setTimeout-based cleanup.
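
If you do roll your own in-memory store, a periodic sweep keeps it bounded. A minimal sketch (illustrative, not HitLimit's internals):

// In-memory counters with an interval sweep so the Map stays bounded
const store = new Map()
const WINDOW_MS = 60_000

function hit(key) {
  const now = Date.now()
  const entry = store.get(key)
  if (!entry || now - entry.start > WINDOW_MS) {
    store.set(key, { count: 1, start: now })
    return 1
  }
  return ++entry.count
}

// Without this sweep, the Map grows by one entry per unique key, forever
setInterval(() => {
  const now = Date.now()
  for (const [key, entry] of store) {
    if (now - entry.start > WINDOW_MS) store.delete(key)
  }
}, WINDOW_MS).unref()  // unref() so the timer doesn't keep the process alive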

5. Not Rate Limiting Before Authentication

Attackers can brute-force login endpoints. Rate limit BEFORE authentication middleware:

// Good: Rate limit hits before auth check
app.use('/auth/login', hitlimit({ limit: 5, window: '15m' }))
app.post('/auth/login', authenticateUser)

// Bad: Auth runs first, then rate limit (attacker already hit your DB)
app.post('/auth/login', authenticateUser, hitlimit(...))

Testing Your Rate Limiter

Verify Limits Work

// test/rate-limit.test.js (assumes Jest/Vitest globals: it, expect)
import request from 'supertest'
import app from '../app.js'

it('blocks after limit exceeded', async () => {
  // Make 100 requests (the limit)
  for (let i = 0; i < 100; i++) {
    await request(app).get('/api/test')
  }

  // Request 101 should be blocked
  const response = await request(app).get('/api/test')
  expect(response.status).toBe(429)
})

Load Testing

Use tools like autocannon or k6:

npx autocannon -c 100 -d 30 http://localhost:3000/api/test

Watch for:

  • 429 responses appearing at expected rates
  • Memory not growing unbounded
  • Response times staying consistent

Monitor in Production

Track these metrics (see the sketch after this list):

  • 429 response rate
  • Rate limit header values
  • Store latency (if using Redis)
  • Memory usage of rate limit store
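
The first metric is easy to get without a metrics library. A minimal sketch that logs the 429 rate once a minute (swap the console.log for your metrics backend):

// Count responses as they finish and log the share of 429s
let total = 0
let limited = 0

app.use((req, res, next) => {
  res.on('finish', () => {
    total++
    if (res.statusCode === 429) limited++
  })
  next()
})

setInterval(() => {
  const pct = ((limited / Math.max(total, 1)) * 100).toFixed(2)
  console.log(`429 rate over the last minute: ${pct}%`)
  total = 0
  limited = 0
}, 60_000).unref()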

Quick Start with HitLimit

Installation

npm install @joint-ops/hitlimit

Basic Setup

import express from 'express'
import { hitlimit } from '@joint-ops/hitlimit'

const app = express()

// Zero-config default: 100 req/min per IP
app.use(hitlimit())

// Your routes
app.get('/api/data', (req, res) => {
  res.json({ message: 'Hello!' })
})

app.listen(3000)

That's it. Your API is now rate limited with sensible defaults.


Summary

Rate limiting is essential for production APIs. The key decisions:

  1. Algorithm: Start with fixed window. It's simple and works.
  2. Limits: Start generous, tighten based on data.
  3. Keys: IP for anonymous, user ID or API key for authenticated.
  4. Distribution: Use Redis for multi-server. SQLite for single-server persistence.
  5. Failures: Usually fail-open. Your API working matters more than perfect rate limiting.

HitLimit handles all of this in 7KB with zero configuration required.



Written by Muhammad Ali, Full-Stack & Web3 Gaming Engineer at JointOps. We build production-grade APIs and open-source tools.

Tags: #nodejs #ratelimiting #api #backend #javascript #expressjs #tutorial #security
