myougaTheAxo

API Rate Limiting with Claude Code: Redis Sliding Window Implementation

Without rate limiting, your API is one misconfigured client away from infrastructure failure. Claude Code can implement consistent rate limiting across every endpoint — if you give it the right rules in CLAUDE.md.


Why Rate Limiting Matters

Three failure modes hit APIs without rate limiting:

  • DDoS vulnerability: Attackers send thousands of requests per second, taking down your service
  • Cost explosion: A buggy client hammering your OpenAI or Stripe backend can generate hundreds of dollars in charges overnight
  • Fairness collapse: One heavy user monopolizes bandwidth while others get timeouts

Fixed Window rate limiting is a common first attempt — but it has a critical flaw: a burst of requests at the window boundary effectively doubles your limit. Sliding Window with Redis Sorted Sets solves this cleanly.


CLAUDE.md Rules for Rate Limiting

Add these rules to your CLAUDE.md so Claude Code enforces consistent implementation:

## API Rate Limiting Rules

### Mandatory
- Every public API endpoint MUST have rate limiting
- Implementation: Redis + Sliding Window only (Fixed Window is banned — burst vulnerable)
- Authenticated users: limited by user_id
- Unauthenticated: limited by IP address
- Redis key format: rl:{identifier}:{group}

### Default Limits
- GET (read):             100 requests/minute
- POST/PUT/PATCH (write): 20 requests/minute
- DELETE:                 10 requests/minute
- /auth/* endpoints:      5 requests/minute

### Required Response Headers
- X-RateLimit-Limit: the group maximum
- X-RateLimit-Remaining: requests left (never below 0)
- X-RateLimit-Reset: Unix timestamp (seconds) when window resets
- Retry-After: seconds to wait (429 responses only)

### 429 Response Body
{ "error": "rate_limit_exceeded", "retryAfter": N }

Sliding Window Implementation Prompt

Implement a Sliding Window rate limiting middleware using Redis Sorted Sets.

Requirements:
- TypeScript + Express + ioredis
- Follow CLAUDE.md rate limiting rules exactly
- Auto-detect endpoint group: auth / read / write / delete
- Attach X-RateLimit-* headers to ALL responses
- Attach Retry-After header only on 429 responses

Redis key design:
- Key: rl:{identifier}:{group}
- Value: Sorted Set (score = timestamp ms, member = unique request ID)
- TTL: window size in seconds

Group detection logic:
- Path starts with /auth/ → auth (5 req/min)
- Method is DELETE → delete (10 req/min)
- Method is POST, PUT, or PATCH → write (20 req/min)
- Otherwise → read (100 req/min)

Generated middleware:

// middleware/rateLimiter.ts
import { Request, Response, NextFunction } from 'express';
import Redis from 'ioredis';
import { v4 as uuidv4 } from 'uuid';

const redis = new Redis(process.env.REDIS_URL!);

type Group = 'auth' | 'read' | 'write' | 'delete';

const LIMITS: Record<Group, { max: number; windowSec: number }> = {
  auth:   { max: 5,   windowSec: 60 },
  read:   { max: 100, windowSec: 60 },
  write:  { max: 20,  windowSec: 60 },
  delete: { max: 10,  windowSec: 60 },
};

function getGroup(req: Request): Group {
  if (req.path.startsWith('/auth/')) return 'auth';
  if (req.method === 'DELETE') return 'delete';
  if (['POST', 'PUT', 'PATCH'].includes(req.method)) return 'write';
  return 'read';
}

function getIdentifier(req: Request): string {
  const userId = (req as any).user?.id;
  const ip = req.ip ?? req.socket.remoteAddress ?? 'unknown';
  return userId ? `user_${userId}` : `ip_${ip.replace(/:/g, '_')}`;
}

export async function rateLimiter(
  req: Request,
  res: Response,
  next: NextFunction
): Promise<void> {
  const group = getGroup(req);
  const { max, windowSec } = LIMITS[group];
  const key = `rl:${getIdentifier(req)}:${group}`;
  const now = Date.now();
  const windowStart = now - windowSec * 1000;

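  // MULTI/EXEC transaction (a plain pipeline only batches round trips, it is not atomic):
  // remove expired → add current → count → read oldest entry → set TTL
  const results = await redis
    .multi()
    .zremrangebyscore(key, 0, windowStart)
    .zadd(key, now, `${now}-${uuidv4()}`)
    .zcard(key)
    .zrange(key, 0, 0, 'WITHSCORES')
    .expire(key, windowSec)
    .exec();

  const count = (results?.[2]?.[1] as number) ?? 0;
  // zrange replies with [member, score]; fall back to `now` if the reply is missing
  const oldestMs = Number((results?.[3]?.[1] as string[] | undefined)?.[1] ?? now);
  const remaining = Math.max(0, max - count);
  const resetAt = Math.ceil((now + windowSec * 1000) / 1000);

  res.setHeader('X-RateLimit-Limit', max);
  res.setHeader('X-RateLimit-Remaining', remaining);
  res.setHeader('X-RateLimit-Reset', resetAt);

  if (count > max) {
    // Rejected requests stay in the set, so a client that keeps retrying stays limited.
    // The earliest a slot can free up is when the oldest entry ages out of the window.
    const retryAfter = Math.max(1, Math.ceil((oldestMs + windowSec * 1000 - now) / 1000));
    res.setHeader('Retry-After', retryAfter);
    res.status(429).json({ error: 'rate_limit_exceeded', retryAfter });
    return;
  }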
  next();
}
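
To check the window logic without a Redis instance, the sorted-set steps can be mirrored in memory. This is only a sketch: `createSlidingWindowLog` and `hit` are hypothetical names, timestamps are passed in explicitly so the behavior is deterministic, and rejected requests are recorded just as the middleware records them.

```typescript
type HitResult = { allowed: boolean; count: number; retryAfterSec: number };

// In-memory stand-in for the Redis sorted set (illustration only).
function createSlidingWindowLog(max: number, windowMs: number) {
  // key -> request timestamps in ms, in ascending order
  const log = new Map<string, number[]>();

  return function hit(key: string, now: number): HitResult {
    const windowStart = now - windowMs;
    // ZREMRANGEBYSCORE key 0 windowStart: drop entries that aged out
    const entries = (log.get(key) ?? []).filter((t) => t > windowStart);
    // ZADD key now member: record the current request
    entries.push(now);
    log.set(key, entries);
    // ZCARD key: count what is left in the window
    const count = entries.length;
    const allowed = count <= max;
    // The earliest a slot can free up is when the oldest entry leaves the window
    const retryAfterSec = allowed
      ? 0
      : Math.max(1, Math.ceil((entries[0] + windowMs - now) / 1000));
    return { allowed, count, retryAfterSec };
  };
}

// Limit of 3 requests per 60 seconds, one simulated client
const hit = createSlidingWindowLog(3, 60_000);
const t0 = 1_000_000;
const firstThree = [0, 1_000, 2_000].map((offset) => hit('ip_1', t0 + offset));
const blocked = hit('ip_1', t0 + 3_000);   // fourth request inside the window
const later = hit('ip_1', t0 + 61_500);    // the two oldest entries have aged out
```

Here the fourth request is rejected with a retry hint of 57 seconds, and the request at `t0 + 61_500` is allowed again because only the entries from the last 60 seconds still count.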

Rate Limit Header Prompt

Review the rateLimiter middleware and ensure:

1. X-RateLimit-Limit: reflects the current endpoint group's maximum
2. X-RateLimit-Remaining: Math.max(0, max - count), never negative
3. X-RateLimit-Reset: Unix timestamp in seconds (not milliseconds)
4. Retry-After: integer seconds, only on 429 responses
5. X-RateLimit-Policy: add in the "100;w=60" format from the IETF RateLimit header fields draft (RFC 6585 only defines the 429 status code)

All successful (2xx) responses must also carry X-RateLimit-* headers.
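
The header rules in this prompt can be sketched as a pure function, which makes them easy to unit test separately from Express and Redis. `buildRateLimitHeaders` and `RateLimitState` are hypothetical names, and the `oldestMs` field (timestamp of the oldest request still in the window) is an assumption about what the middleware can supply.

```typescript
type RateLimitState = {
  max: number;       // group maximum
  windowSec: number; // window length in seconds
  count: number;     // requests currently in the window, including this one
  nowMs: number;     // current time in ms
  oldestMs: number;  // timestamp of the oldest entry in the window, in ms
};

function buildRateLimitHeaders(state: RateLimitState): Record<string, string> {
  const { max, windowSec, count, nowMs, oldestMs } = state;
  const headers: Record<string, string> = {
    'X-RateLimit-Limit': String(max),
    // Never report a negative remaining count
    'X-RateLimit-Remaining': String(Math.max(0, max - count)),
    // Unix timestamp in seconds, not milliseconds
    'X-RateLimit-Reset': String(Math.ceil((nowMs + windowSec * 1000) / 1000)),
    // Policy string in the "limit;w=window" form used in the prompt
    'X-RateLimit-Policy': `${max};w=${windowSec}`,
  };
  if (count > max) {
    // Only rejected requests carry Retry-After (integer seconds, at least 1)
    const waitMs = oldestMs + windowSec * 1000 - nowMs;
    headers['Retry-After'] = String(Math.max(1, Math.ceil(waitMs / 1000)));
  }
  return headers;
}

const base = { max: 100, windowSec: 60, nowMs: 1_700_000_030_000, oldestMs: 1_700_000_000_000 };
const okHeaders = buildRateLimitHeaders({ ...base, count: 40 });
const limitedHeaders = buildRateLimitHeaders({ ...base, count: 101 });
```

Keeping the calculation separate from `res.setHeader` also makes it harder for a later change to drop the headers from successful responses.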

Per-Endpoint Override Prompt

Authentication endpoints need stricter limits than generic endpoints:

Add per-endpoint overrides to the rate limiter:

Strict (authentication):
- POST /auth/login:           5 req/min  (brute force protection)
- POST /auth/register:        3 req/hour
- POST /auth/forgot-password: 3 req/hour (keyed by email, not just IP)

Stricter for expensive operations:
- GET /api/search:   30 req/min  (heavy database query)
- POST /api/export:  5 req/hour  (background job trigger)

Admin endpoints:
- /admin/*: require authentication + 200 req/min for admin role

Implementation:
- Use an ENDPOINT_OVERRIDES map keyed by "METHOD:/path"
- Override merges with group defaults (partial override OK)
- Admin role check: req.user?.role === 'admin'

Override implementation:

type LimitConfig = { max: number; windowSec: number };

const ENDPOINT_OVERRIDES: Record<string, Partial<LimitConfig>> = {
  'POST:/auth/login':            { max: 5,  windowSec: 60   },
  'POST:/auth/register':         { max: 3,  windowSec: 3600 },
  'POST:/auth/forgot-password':  { max: 3,  windowSec: 3600 },
  'GET:/api/search':             { max: 30, windowSec: 60   },
  'POST:/api/export':            { max: 5,  windowSec: 3600 },
};

function resolveLimit(req: Request): LimitConfig {
  const overrideKey = `${req.method}:${req.path}`;
  const groupDefaults = LIMITS[getGroup(req)];

  if (ENDPOINT_OVERRIDES[overrideKey]) {
    return { ...groupDefaults, ...ENDPOINT_OVERRIDES[overrideKey] };
  }
  if (req.path.startsWith('/admin/') && (req as any).user?.role === 'admin') {
    return { max: 200, windowSec: 60 };
  }
  return groupDefaults;
}
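
One thing to watch when wiring overrides into the middleware: if the Redis key stays `rl:{identifier}:{group}`, a route with a 3600-second window shares a sorted set with routes that trim it to 60 seconds, so its hourly history is lost. The sketch below shows the partial merge and one possible fix, a route suffix on the key. The suffix, the `resolve` function, and the `POST:/auth/refresh` route are assumptions for illustration, not part of the rules above.

```typescript
type LimitConfig = { max: number; windowSec: number };
type Resolved = LimitConfig & { key: string };

// Defaults of the "auth" group from the CLAUDE.md rules
const AUTH_DEFAULTS: LimitConfig = { max: 5, windowSec: 60 };

const OVERRIDES: Record<string, Partial<LimitConfig>> = {
  'POST:/auth/register': { max: 3, windowSec: 3600 },
  'POST:/auth/refresh': { max: 10 }, // hypothetical route: overrides max only
};

function resolve(method: string, path: string, identifier: string): Resolved {
  const routeKey = `${method}:${path}`;
  const override = OVERRIDES[routeKey];
  if (!override) {
    // No override: group defaults and the shared group key
    return { ...AUTH_DEFAULTS, key: `rl:${identifier}:auth` };
  }
  // Partial merge: fields missing from the override fall back to the group defaults.
  // The route suffix gives the overridden route its own sorted set (assumed key format).
  return { ...AUTH_DEFAULTS, ...override, key: `rl:${identifier}:auth:${routeKey}` };
}

const login = resolve('POST', '/auth/login', 'user_1');
const register = resolve('POST', '/auth/register', 'user_1');
const refresh = resolve('POST', '/auth/refresh', 'user_1');
```

With this shape, `refresh` keeps the 60-second group window while raising the maximum to 10, and `register` counts its three requests per hour in a set that login traffic never trims.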

Why Sliding Window over Fixed Window?

| Approach | Burst behavior | Accuracy |
|---|---|---|
| Fixed Window | 2× limit at boundaries | Low |
| Sliding Window | Exact enforcement | High |
| Token Bucket | Configurable burst | Medium |

Fixed Window counts requests in discrete buckets (e.g., 0:00–1:00, 1:00–2:00). A client sending 100 requests at 0:59 and 100 more at 1:01 gets all 200 through within about two seconds, double what a "100/min" limit is meant to allow. Sliding Window counts the requests in the 60 seconds preceding each request, which closes this gap.
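
The boundary burst is easy to reproduce with a small simulation. This is a simplified in-memory sketch, not the Redis middleware: the function names are hypothetical, and unlike the middleware the sliding variant here records only accepted requests.

```typescript
const LIMIT = 100;
const WINDOW_MS = 60_000;

// Fixed window: one counter per aligned one-minute bucket
function fixedWindowAccepted(timestamps: number[]): number {
  const counters = new Map<number, number>();
  let accepted = 0;
  for (const t of timestamps) {
    const bucket = Math.floor(t / WINDOW_MS);
    const used = counters.get(bucket) ?? 0;
    if (used < LIMIT) {
      counters.set(bucket, used + 1);
      accepted++;
    }
  }
  return accepted;
}

// Sliding window: count accepted requests in the 60 s before each request
function slidingWindowAccepted(timestamps: number[]): number {
  const acceptedAt: number[] = [];
  for (const t of timestamps) {
    const inWindow = acceptedAt.filter((a) => a > t - WINDOW_MS).length;
    if (inWindow < LIMIT) acceptedAt.push(t);
  }
  return acceptedAt.length;
}

// 100 requests at 0:59 followed by 100 more at 1:01 (timestamps in ms)
const burst = [
  ...Array.from({ length: 100 }, () => 59_000),
  ...Array.from({ length: 100 }, () => 61_000),
];

const fixedAccepted = fixedWindowAccepted(burst);     // 200: both buckets fill independently
const slidingAccepted = slidingWindowAccepted(burst); // 100: the second batch is rejected
```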

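Each request costs a handful of sorted-set operations: ZADD is O(log N) and ZREMRANGEBYSCORE is O(log N + M), where N is the number of entries in the set and M is the number removed. For per-client limits in the tens or hundreds, that overhead is negligible for most workloads.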


Summary

Encoding rate limiting rules in CLAUDE.md gives you three advantages:

  1. Consistency: Every new endpoint automatically inherits the right limits
  2. Reviewability: Code reviewers can check "does this match CLAUDE.md?" in seconds
  3. Security by Design: Rate limiting becomes part of your architecture, not an afterthought

The combination of Sliding Window + Redis Sorted Sets + per-endpoint overrides covers the most common rate limiting needs without adding significant complexity.


Security Pack (¥1,480) includes /security-check for automated rate limiting gap detection. 👉 https://prompt-works.jp

Myouga (@myougatheaxo) — Security-focused Claude Code engineer.
