Without rate limiting, your API is one misconfigured client away from infrastructure failure. Claude Code can implement consistent rate limiting across every endpoint — if you give it the right rules in CLAUDE.md.
Why Rate Limiting Matters
Three failure modes hit APIs without rate limiting:
- DDoS vulnerability: Attackers send thousands of requests per second, taking down your service
- Cost explosion: A buggy client hammering your OpenAI or Stripe backend can generate hundreds of dollars in charges overnight
- Fairness collapse: One heavy user monopolizes bandwidth while others get timeouts
Fixed Window rate limiting is a common first attempt — but it has a critical flaw: a burst of requests at the window boundary effectively doubles your limit. Sliding Window with Redis Sorted Sets solves this cleanly.
CLAUDE.md Rules for Rate Limiting
Add these rules to your CLAUDE.md so Claude Code enforces consistent implementation:
## API Rate Limiting Rules
### Mandatory
- Every public API endpoint MUST have rate limiting
- Implementation: Redis + Sliding Window only (Fixed Window is banned — burst vulnerable)
- Authenticated users: limited by user_id
- Unauthenticated: limited by IP address
- Redis key format: rl:{identifier}:{group}
### Default Limits
- GET (read): 100 requests/minute
- POST/PUT/PATCH (write): 20 requests/minute
- DELETE: 10 requests/minute
- /auth/* endpoints: 5 requests/minute
### Required Response Headers
- X-RateLimit-Limit: the group maximum
- X-RateLimit-Remaining: requests left (never below 0)
- X-RateLimit-Reset: Unix timestamp (seconds) when window resets
- Retry-After: seconds to wait (429 responses only)
### 429 Response Body
{ "error": "rate_limit_exceeded", "retryAfter": N }
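To make the header contract above concrete, here is a minimal sketch of a helper that assembles the required headers. The name `buildRateLimitHeaders` is illustrative, not part of the middleware generated later in this article:

```typescript
// Illustrative helper for the header contract above (name is hypothetical).
type RateLimitHeaders = {
  'X-RateLimit-Limit': number;
  'X-RateLimit-Remaining': number;
  'X-RateLimit-Reset': number;   // Unix timestamp in seconds
  'Retry-After'?: number;        // present only on 429 responses
};

function buildRateLimitHeaders(
  max: number,
  count: number,
  resetAt: number,
  retryAfter?: number
): RateLimitHeaders {
  const headers: RateLimitHeaders = {
    'X-RateLimit-Limit': max,
    'X-RateLimit-Remaining': Math.max(0, max - count), // never below 0
    'X-RateLimit-Reset': resetAt,
  };
  if (retryAfter !== undefined) headers['Retry-After'] = retryAfter;
  return headers;
}
```

Note that Retry-After is conditional: attaching it to successful responses would be misleading to well-behaved clients.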
Sliding Window Implementation Prompt
Implement a Sliding Window rate limiting middleware using Redis Sorted Sets.
Requirements:
- TypeScript + Express + ioredis
- Follow CLAUDE.md rate limiting rules exactly
- Auto-detect endpoint group: auth / read / write / delete
- Attach X-RateLimit-* headers to ALL responses
- Attach Retry-After header only on 429 responses
Redis key design:
- Key: rl:{identifier}:{group}
- Value: Sorted Set (score = timestamp ms, member = unique request ID)
- TTL: window size in seconds
Group detection logic:
- Path starts with /auth/ → auth (5 req/min)
- Method is DELETE → delete (10 req/min)
- Method is POST, PUT, or PATCH → write (20 req/min)
- Otherwise → read (100 req/min)
Generated middleware:
// middleware/rateLimiter.ts
import { Request, Response, NextFunction } from 'express';
import Redis from 'ioredis';
import { v4 as uuidv4 } from 'uuid';
const redis = new Redis(process.env.REDIS_URL!);
type Group = 'auth' | 'read' | 'write' | 'delete';
const LIMITS: Record<Group, { max: number; windowSec: number }> = {
auth: { max: 5, windowSec: 60 },
read: { max: 100, windowSec: 60 },
write: { max: 20, windowSec: 60 },
delete: { max: 10, windowSec: 60 },
};
function getGroup(req: Request): Group {
if (req.path.startsWith('/auth/')) return 'auth';
if (req.method === 'DELETE') return 'delete';
if (['POST', 'PUT', 'PATCH'].includes(req.method)) return 'write';
return 'read';
}
function getIdentifier(req: Request): string {
const userId = (req as any).user?.id;
const ip = req.ip ?? req.socket.remoteAddress ?? 'unknown';
return userId ? `user_${userId}` : `ip_${ip.replace(/:/g, '_')}`;
}
export async function rateLimiter(
req: Request,
res: Response,
next: NextFunction
): Promise<void> {
const group = getGroup(req);
const { max, windowSec } = LIMITS[group];
const key = `rl:${getIdentifier(req)}:${group}`;
const now = Date.now();
const windowStart = now - windowSec * 1000;
// Atomic pipeline: remove expired → add current → count → set TTL
const pipeline = redis.pipeline();
pipeline.zremrangebyscore(key, 0, windowStart);
pipeline.zadd(key, now, `${now}-${uuidv4()}`);
pipeline.zcard(key);
pipeline.expire(key, windowSec);
const results = await pipeline.exec();
const count = (results?.[2]?.[1] as number) ?? 0;
const remaining = Math.max(0, max - count);
const resetAt = Math.ceil((now + windowSec * 1000) / 1000);
res.setHeader('X-RateLimit-Limit', max);
res.setHeader('X-RateLimit-Remaining', remaining);
res.setHeader('X-RateLimit-Reset', resetAt);
  if (count > max) {
    // Retry-After: time until the oldest request in the window expires.
    // (Computing it from windowStart alone always yields 0, since
    // now - windowStart === windowSec * 1000 by construction.)
    const oldest = await redis.zrange(key, 0, 0, 'WITHSCORES');
    const oldestTs = oldest.length >= 2 ? Number(oldest[1]) : now;
    const retryAfter = Math.max(
      1,
      Math.ceil((oldestTs + windowSec * 1000 - now) / 1000)
    );
    res.setHeader('Retry-After', retryAfter);
    res.status(429).json({ error: 'rate_limit_exceeded', retryAfter });
    return;
  }
next();
}
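The core counting logic of the pipeline (remove expired → add current → count) can be simulated in memory to reason about the algorithm without a Redis instance. This is a sketch for verification only; the array plays the role of the Sorted Set:

```typescript
// In-memory sketch of the sliding-window sorted-set logic (for reasoning only).
// 'timestamps' stands in for the Redis Sorted Set's scores.
function slidingWindowAllow(
  timestamps: number[], // prior request times in ms
  now: number,
  max: number,
  windowMs: number
): { allowed: boolean; count: number } {
  const windowStart = now - windowMs;
  // zremrangebyscore: drop entries that fell out of the window
  const live = timestamps.filter((t) => t > windowStart);
  // zadd + zcard: record this request and count everything in the window
  live.push(now);
  return { allowed: live.length <= max, count: live.length };
}
```

This mirrors why the Redis version rejects when `count > max`: the current request is added before counting, so the count already includes it.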
Rate Limit Header Prompt
Review the rateLimiter middleware and ensure:
1. X-RateLimit-Limit — reflects the current endpoint group's maximum
2. X-RateLimit-Remaining — Math.max(0, max - count), never negative
3. X-RateLimit-Reset — Unix timestamp in seconds (not milliseconds)
4. Retry-After — integer seconds, only on 429 responses
5. X-RateLimit-Policy — add the "100;w=60" format from the IETF RateLimit headers draft (RFC 6585 defines only the 429 status itself)
All successful (2xx) responses must also carry X-RateLimit-* headers.
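The policy string in item 5 encodes the limit and window together. A hypothetical formatter, assuming the `"<max>;w=<windowSec>"` syntax from the IETF httpapi RateLimit headers draft:

```typescript
// Hypothetical formatter for the draft RateLimit policy syntax: "<max>;w=<windowSec>".
// Field name and syntax follow the IETF httpapi RateLimit headers draft, not RFC 6585.
function rateLimitPolicy(max: number, windowSec: number): string {
  return `${max};w=${windowSec}`;
}
```

In the middleware this would be attached alongside the other headers, e.g. `res.setHeader('X-RateLimit-Policy', rateLimitPolicy(max, windowSec))`.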
Per-Endpoint Override Prompt
Authentication endpoints need stricter limits than generic endpoints:
Add per-endpoint overrides to the rate limiter:
Strict (authentication):
- POST /auth/login: 5 req/min (brute force protection)
- POST /auth/register: 3 req/hour
- POST /auth/forgot-password: 3 req/hour (keyed by email, not just IP)
Relaxed for specific use cases:
- GET /api/search: 30 req/min (heavy database query)
- POST /api/export: 5 req/hour (background job trigger)
Admin endpoints:
- /admin/*: require authentication + 200 req/min for admin role
Implementation:
- Use an ENDPOINT_OVERRIDES map keyed by "METHOD:/path"
- Override merges with group defaults (partial override OK)
- Admin role check: req.user?.role === 'admin'
Override implementation:
type LimitConfig = { max: number; windowSec: number };
const ENDPOINT_OVERRIDES: Record<string, Partial<LimitConfig>> = {
'POST:/auth/login': { max: 5, windowSec: 60 },
'POST:/auth/register': { max: 3, windowSec: 3600 },
'POST:/auth/forgot-password': { max: 3, windowSec: 3600 },
'GET:/api/search': { max: 30, windowSec: 60 },
'POST:/api/export': { max: 5, windowSec: 3600 },
};
function resolveLimit(req: Request): LimitConfig {
const overrideKey = `${req.method}:${req.path}`;
const groupDefaults = LIMITS[getGroup(req)];
if (ENDPOINT_OVERRIDES[overrideKey]) {
return { ...groupDefaults, ...ENDPOINT_OVERRIDES[overrideKey] };
}
if (req.path.startsWith('/admin/') && (req as any).user?.role === 'admin') {
return { max: 200, windowSec: 60 };
}
return groupDefaults;
}
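The "partial override OK" rule hinges on spread-merge semantics: fields the override omits fall back to the group defaults. A minimal sketch of just that merge, decoupled from Express:

```typescript
// Sketch of the partial-override merge used by resolveLimit above.
type LimitConfig = { max: number; windowSec: number };

function mergeLimit(
  defaults: LimitConfig,
  override?: Partial<LimitConfig>
): LimitConfig {
  // Spread order matters: override fields win, omitted fields keep defaults.
  return { ...defaults, ...override };
}
```

So an override of `{ windowSec: 3600 }` on a write endpoint keeps the group's `max: 20` while stretching the window to an hour.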
Why Sliding Window over Fixed Window?
| Approach | Burst behavior | Accuracy |
|---|---|---|
| Fixed Window | 2× limit at boundaries | Low |
| Sliding Window | Exact enforcement | High |
| Token Bucket | Configurable burst | Medium |
Fixed Window counts requests in discrete buckets (e.g., 0:00–1:00, 1:00–2:00). A client sending 100 requests at 0:59 and 100 at 1:01 bypasses a "100/min" limit entirely. Sliding Window tracks the actual 60-second window behind each request, closing this gap.
Each Redis Sorted Set operation in the pipeline is O(log N) (the range removal adds O(M) for the M expired entries it deletes) — negligible overhead for most workloads.
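The boundary-burst scenario can be simulated directly: count the same 200-request burst (100 at t=59s, 100 at t=61s) under both schemes against a 100/min limit. This is a self-contained sketch, not production code:

```typescript
// Simulation of the window-boundary burst described above.
function fixedWindowCount(times: number[], t: number, windowSec: number): number {
  // Discrete buckets: minute 0 is [0, 60), minute 1 is [60, 120), ...
  const bucket = Math.floor(t / windowSec);
  return times.filter((x) => Math.floor(x / windowSec) === bucket).length;
}

function slidingWindowCount(times: number[], t: number, windowSec: number): number {
  // Trailing window: every request in the last windowSec seconds counts
  return times.filter((x) => x > t - windowSec && x <= t).length;
}
```

Fixed Window sees each half of the burst in a different bucket (100 per bucket, both under the limit), while Sliding Window sees all 200 requests inside the trailing 60 seconds and rejects the excess.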
Summary
Encoding rate limiting rules in CLAUDE.md gives you three advantages:
- Consistency: Every new endpoint automatically inherits the right limits
- Reviewability: Code reviewers can check "does this match CLAUDE.md?" in seconds
- Security by Design: Rate limiting becomes part of your architecture, not an afterthought
The combination of Sliding Window + Redis Sorted Sets + per-endpoint overrides covers 95% of real-world rate limiting needs without adding significant complexity.
Security Pack (¥1,480) includes /security-check for automated rate limiting gap detection. 👉 https://prompt-works.jp
Myouga (@myougatheaxo) — Security-focused Claude Code engineer.