Redis Patterns for Node.js: Caching, Pub/Sub, and Rate Limiting in Production
Redis is the Swiss Army knife of backend infrastructure. It's fast, versatile, and battle-tested — but using it well in production requires more than slapping a SET and GET around your database calls. In this guide, we'll walk through five patterns that appear repeatedly in production Node.js systems, each with working code, error handling, and the gotchas that only emerge after things have broken.
All examples use ioredis, a widely used Node.js Redis client with solid cluster support, TLS, and built-in retry logic.
Prerequisites and Setup
npm install ioredis
Throughout this article we'll share a single Redis client instance. In production you'd provide this via dependency injection or a module singleton:
// redis.js
const Redis = require('ioredis');

const redis = new Redis({
  host: process.env.REDIS_HOST || '127.0.0.1',
  port: parseInt(process.env.REDIS_PORT || '6379', 10),
  password: process.env.REDIS_PASSWORD,
  maxRetriesPerRequest: 3,
  enableReadyCheck: true,
  lazyConnect: false,
  // Reconnect automatically with capped linear backoff
  retryStrategy(times) {
    const delay = Math.min(times * 50, 2000);
    return delay;
  },
});

redis.on('error', (err) => {
  console.error('[Redis] Connection error:', err.message);
});

redis.on('connect', () => {
  console.log('[Redis] Connected');
});

module.exports = redis;
This gives you automatic reconnection without crashing your process on transient network errors.
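The retryStrategy above is easy to sanity-check in isolation. A standalone sketch of the same function (same math, no Redis needed) shows the delay schedule it produces:

```javascript
// Same logic as the retryStrategy in redis.js: linear backoff capped at 2s.
function retryDelay(times) {
  return Math.min(times * 50, 2000);
}

// First attempts back off gently, then the cap takes over from attempt 40 on.
const schedule = [1, 2, 10, 40, 100].map(retryDelay);
console.log(schedule); // [50, 100, 500, 2000, 2000]
```

The cap matters: without it, a Redis outage of a few minutes would push retry delays into absurd territory and make recovery sluggish once the server comes back.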
Pattern 1: Cache-Aside with TTL and Stampede Prevention
The cache-aside pattern is the most common Redis usage: check the cache first, fall back to the database on a miss, then populate the cache for next time. Simple in theory. Brutal in production when your cache expires and 500 simultaneous requests all miss at once, hammering your database — the cache stampede (also called thundering herd).
Basic Cache-Aside
async function getUserById(userId) {
  const cacheKey = `user:${userId}`;

  // 1. Check cache
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // 2. Cache miss — hit the database
  const user = await db.query('SELECT * FROM users WHERE id = $1', [userId]);
  if (!user) return null;

  // 3. Populate cache with 5-minute TTL
  await redis.setex(cacheKey, 300, JSON.stringify(user));
  return user;
}
This works fine at low traffic. Under high concurrency, the gap between step 1 (miss) and step 3 (populate) means hundreds of requests can all read a null cache and all query the database simultaneously.
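Before reaching for a distributed lock, you can also collapse concurrent misses within a single process by sharing the in-flight promise. This is a sketch of per-process request coalescing (the coalesce helper and its names are mine, not part of any library); it complements, rather than replaces, cross-process stampede protection:

```javascript
// In-process request coalescing: concurrent callers for the same key
// share one in-flight promise instead of each hitting the database.
const inflight = new Map(); // key -> Promise

async function coalesce(key, fetchFn) {
  if (inflight.has(key)) return inflight.get(key);
  const promise = (async () => {
    try {
      return await fetchFn();
    } finally {
      inflight.delete(key); // allow future refreshes once settled
    }
  })();
  inflight.set(key, promise);
  return promise;
}
```

With this in place, 500 simultaneous misses inside one Node.js process become a single database query; a cross-process mechanism is still needed when you run multiple instances.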
Stampede Prevention with a Distributed Lock
The fix: use a short-lived lock so only one request regenerates the cache while others wait.
async function getUserByIdSafe(userId) {
  const cacheKey = `user:${userId}`;
  const lockKey = `lock:${cacheKey}`;
  const lockTTL = 5000; // 5 seconds max lock hold time (ms)
  const pollInterval = 100;
  const maxWait = 3000;

  // 1. Try cache first
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // 2. Try to acquire lock using SET NX PX (atomic)
  const lockValue = `${process.pid}-${Date.now()}-${Math.random()}`;
  const acquired = await redis.set(lockKey, lockValue, 'PX', lockTTL, 'NX');

  if (acquired === 'OK') {
    // We hold the lock — fetch and populate
    try {
      const user = await db.query('SELECT * FROM users WHERE id = $1', [userId]);
      if (user) {
        await redis.setex(cacheKey, 300, JSON.stringify(user));
      }
      return user;
    } finally {
      // Release lock only if we still own it (Lua script for atomicity)
      const releaseLua = `
        if redis.call("get", KEYS[1]) == ARGV[1] then
          return redis.call("del", KEYS[1])
        else
          return 0
        end
      `;
      await redis.eval(releaseLua, 1, lockKey, lockValue);
    }
  }

  // 3. Someone else holds the lock — wait for the cache to be populated
  const deadline = Date.now() + maxWait;
  while (Date.now() < deadline) {
    await new Promise((r) => setTimeout(r, pollInterval));
    const waited = await redis.get(cacheKey);
    if (waited) return JSON.parse(waited);
  }

  // 4. Fallback: lock wait timed out, fetch directly
  return db.query('SELECT * FROM users WHERE id = $1', [userId]);
}
The Lua script for lock release is critical — it prevents a race condition where your lock expires, another process acquires it, and you accidentally delete their lock.
Production Tip: Soft TTL with Stale-While-Revalidate
For less critical data, serve stale content and refresh in the background:
async function getCachedWithSWR(key, fetchFn, ttl = 300, staleTTL = 60) {
  const data = await redis.get(key);
  const meta = await redis.get(`${key}:meta`);

  if (data) {
    const { expiresAt } = meta ? JSON.parse(meta) : { expiresAt: 0 };
    const isStale = Date.now() > expiresAt - staleTTL * 1000;
    if (isStale) {
      // Return stale, refresh async
      setImmediate(async () => {
        try {
          const fresh = await fetchFn();
          await redis.setex(key, ttl, JSON.stringify(fresh));
          await redis.setex(
            `${key}:meta`,
            ttl,
            JSON.stringify({ expiresAt: Date.now() + ttl * 1000 })
          );
        } catch (err) {
          console.error('[SWR] Background refresh failed:', err.message);
        }
      });
    }
    return JSON.parse(data);
  }

  const fresh = await fetchFn();
  await redis.setex(key, ttl, JSON.stringify(fresh));
  await redis.setex(
    `${key}:meta`,
    ttl,
    JSON.stringify({ expiresAt: Date.now() + ttl * 1000 })
  );
  return fresh;
}
Pattern 2: Pub/Sub for Real-Time Features
Redis Pub/Sub enables event-driven communication between services or between server processes without a full message broker like Kafka. It's perfect for real-time notifications, live dashboards, and chat — where you need low latency and can tolerate messages being lost if a subscriber is offline.
Important: Use Separate Connections for Pub/Sub
A Redis connection in subscriber mode can only issue subscription-related commands (SUBSCRIBE, UNSUBSCRIBE, their pattern variants, plus PING and QUIT). Mixing pub/sub with regular commands on the same connection will throw errors. Always create a dedicated subscriber client.
// pubsub.js
const Redis = require('ioredis');

const publisher = new Redis({ host: process.env.REDIS_HOST });
const subscriber = new Redis({ host: process.env.REDIS_HOST });

// Publisher: emit an event
async function publishEvent(channel, data) {
  const payload = JSON.stringify({
    timestamp: Date.now(),
    data,
  });
  const receiverCount = await publisher.publish(channel, payload);
  return receiverCount; // number of clients that received the message
}

// Subscriber: listen for events
function subscribeToChannel(channel, handler) {
  subscriber.subscribe(channel, (err, count) => {
    if (err) {
      console.error('[PubSub] Subscribe error:', err.message);
      return;
    }
    console.log(`[PubSub] Subscribed to ${count} channel(s)`);
  });

  subscriber.on('message', (ch, message) => {
    if (ch !== channel) return;
    try {
      const parsed = JSON.parse(message);
      handler(parsed);
    } catch (err) {
      console.error('[PubSub] Failed to parse message:', err.message);
    }
  });
}

module.exports = { publishEvent, subscribeToChannel };
Real-World Example: Live Notifications via WebSocket
// server.js (Express + ws)
const express = require('express');
const { WebSocketServer } = require('ws');
const { subscribeToChannel, publishEvent } = require('./pubsub');

const app = express();
const server = app.listen(3000);
const wss = new WebSocketServer({ server });

// Track connected WebSocket clients by userId
const clients = new Map(); // userId -> Set<WebSocket>

wss.on('connection', (ws, req) => {
  const userId = req.url.replace('/?userId=', ''); // simplified auth
  if (!clients.has(userId)) clients.set(userId, new Set());
  clients.get(userId).add(ws);

  ws.on('close', () => {
    clients.get(userId)?.delete(ws);
    if (clients.get(userId)?.size === 0) clients.delete(userId);
  });
});

// Subscribe to notification channel
subscribeToChannel('notifications', ({ data }) => {
  const { userId, message } = data;
  const userSockets = clients.get(userId);
  if (!userSockets) return; // user not connected on this server instance
  for (const ws of userSockets) {
    if (ws.readyState === ws.OPEN) {
      ws.send(JSON.stringify(message));
    }
  }
});

// API endpoint to send a notification
app.post('/notify', express.json(), async (req, res) => {
  const { userId, message } = req.body;
  await publishEvent('notifications', { userId, message });
  res.json({ ok: true });
});
This scales horizontally — when you add more Node.js processes, each subscribes to the same channel, so whichever process holds the user's WebSocket connection will deliver the message.
Pattern Extension: Pattern Subscribe
For wildcard channel matching, use psubscribe:
subscriber.psubscribe('user:*:events', (err) => {
  if (err) console.error('[PubSub] Pattern subscribe error:', err.message);
});

subscriber.on('pmessage', (pattern, channel, message) => {
  // pattern = 'user:*:events', channel = 'user:42:events'
  const userId = channel.split(':')[1];
  console.log(`Event for user ${userId}:`, message);
});
Pattern 3: Rate Limiting with Sliding Window Algorithm
Fixed-window rate limiting has a well-known flaw: a user can exhaust their limit at the end of window N and again at the start of window N+1, effectively doubling the allowed rate at window boundaries. The sliding window algorithm eliminates this by tracking requests within a rolling time window.
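To make the boundary flaw concrete, here is a toy in-memory fixed-window counter (illustration only, not production code). With a limit of 100 per 60-second window, a client that bursts at the end of one window and the start of the next gets 200 requests through in a fraction of a second:

```javascript
// Toy fixed-window counter keyed by Math.floor(t / windowMs).
function makeFixedWindow(limit, windowMs) {
  const counts = new Map();
  return function allow(nowMs) {
    const windowId = Math.floor(nowMs / windowMs);
    const count = (counts.get(windowId) || 0) + 1;
    counts.set(windowId, count);
    return count <= limit;
  };
}

const allow = makeFixedWindow(100, 60_000);
let passed = 0;
// 100 requests at t = 59.9s (end of window 0)...
for (let i = 0; i < 100; i++) if (allow(59_900)) passed++;
// ...and 100 more at t = 60.1s (start of window 1): all allowed.
for (let i = 0; i < 100; i++) if (allow(60_100)) passed++;
console.log(passed); // 200 — double the intended rate, 200ms apart
```

The sliding-window implementation below closes exactly this hole, because the 60-second window always trails the current request rather than snapping to fixed boundaries.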
// ratelimit.js

/**
 * Sliding window rate limiter using Redis sorted sets.
 * Each request is stored with its timestamp as the score.
 * Expired entries are pruned on every check.
 */
async function slidingWindowRateLimit(identifier, limit, windowSeconds) {
  const key = `ratelimit:${identifier}`;
  const now = Date.now();
  const windowStart = now - windowSeconds * 1000;
  const requestId = `${now}-${Math.random()}`;

  // Lua script for atomicity: prune + count + add in one round trip
  const lua = `
    local key = KEYS[1]
    local now = tonumber(ARGV[1])
    local window_start = tonumber(ARGV[2])
    local limit = tonumber(ARGV[3])
    local request_id = ARGV[4]
    local ttl = tonumber(ARGV[5])

    -- Remove expired entries
    redis.call('ZREMRANGEBYSCORE', key, '-inf', window_start)

    -- Count current requests in window
    local count = redis.call('ZCARD', key)

    if count < limit then
      -- Add this request
      redis.call('ZADD', key, now, request_id)
      redis.call('EXPIRE', key, ttl)
      return {1, limit - count - 1} -- allowed, remaining
    else
      return {0, 0} -- denied
    end
  `;

  const [allowed, remaining] = await redis.eval(
    lua,
    1,
    key,
    now,
    windowStart,
    limit,
    requestId,
    windowSeconds + 1
  );

  return {
    allowed: allowed === 1,
    remaining: parseInt(remaining),
    resetAt: new Date(now + windowSeconds * 1000),
  };
}

module.exports = { slidingWindowRateLimit };
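The Lua script itself is awkward to unit-test without a Redis server. One option, sketched here under the assumption that your test suite runs without Redis, is to keep a pure-JavaScript mirror of the same prune/count/add logic and test that:

```javascript
// Pure-JS mirror of the Lua sliding-window logic, for unit tests only.
// `entries` plays the role of the sorted set: an array of score timestamps.
function slidingWindowCheck(entries, nowMs, limit, windowSeconds) {
  const windowStart = nowMs - windowSeconds * 1000;
  // ZREMRANGEBYSCORE equivalent: drop entries at or before windowStart
  const live = entries.filter((ts) => ts > windowStart);
  if (live.length < limit) {
    live.push(nowMs); // ZADD equivalent
    return { allowed: true, remaining: limit - live.length, entries: live };
  }
  return { allowed: false, remaining: 0, entries: live };
}
```

If the mirror and the Lua ever disagree, treat the Lua as the source of truth and fix the mirror; the mirror only exists to catch logic regressions cheaply.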
Express Middleware
// middleware/rateLimiter.js
const { slidingWindowRateLimit } = require('../ratelimit');

function createRateLimiter({ limit = 100, windowSeconds = 60, keyFn } = {}) {
  return async (req, res, next) => {
    const identifier = keyFn ? keyFn(req) : req.ip; // default: limit per IP

    try {
      const { allowed, remaining, resetAt } = await slidingWindowRateLimit(
        identifier,
        limit,
        windowSeconds
      );

      // Always set rate limit headers
      res.set({
        'X-RateLimit-Limit': limit,
        'X-RateLimit-Remaining': remaining,
        'X-RateLimit-Reset': Math.ceil(resetAt.getTime() / 1000),
      });

      if (!allowed) {
        return res.status(429).json({
          error: 'Too Many Requests',
          retryAfter: windowSeconds,
        });
      }

      next();
    } catch (err) {
      // On Redis failure, fail open (let request through) rather than blocking all traffic
      console.error('[RateLimit] Redis error, failing open:', err.message);
      next();
    }
  };
}

// Usage: per-IP limit on public API, per-user limit on authenticated routes
app.use('/api/public', createRateLimiter({ limit: 30, windowSeconds: 60 }));
app.use(
  '/api/user',
  createRateLimiter({
    limit: 1000,
    windowSeconds: 3600,
    keyFn: (req) => `user:${req.user.id}`,
  })
);
Production tip: Always fail open on Redis errors for rate limiting. It's better to let a few extra requests through during an outage than to block all legitimate traffic.
Pattern 4: Session Storage with Automatic Expiry
Storing sessions in Redis instead of memory means your sessions survive process restarts and work across multiple server instances — essential for any horizontally scaled Node.js app.
// session.js
const crypto = require('crypto');

const SESSION_TTL = 86400; // 24 hours in seconds

function generateSessionId() {
  return crypto.randomBytes(32).toString('hex');
}

async function createSession(userId, data = {}) {
  const sessionId = generateSessionId();
  const key = `session:${sessionId}`;
  const sessionData = {
    userId,
    createdAt: Date.now(),
    lastAccessedAt: Date.now(),
    ...data,
  };
  await redis.setex(key, SESSION_TTL, JSON.stringify(sessionData));
  return sessionId;
}

async function getSession(sessionId) {
  const key = `session:${sessionId}`;
  const data = await redis.get(key);
  if (!data) return null;

  const session = JSON.parse(data);
  // Sliding expiry: reset TTL on every access
  session.lastAccessedAt = Date.now();
  await redis.setex(key, SESSION_TTL, JSON.stringify(session));
  return session;
}

async function updateSession(sessionId, updates) {
  const session = await getSession(sessionId);
  if (!session) throw new Error('Session not found or expired');

  const updated = { ...session, ...updates, lastAccessedAt: Date.now() };
  await redis.setex(`session:${sessionId}`, SESSION_TTL, JSON.stringify(updated));
  return updated;
}

async function destroySession(sessionId) {
  await redis.del(`session:${sessionId}`);
}

module.exports = { createSession, getSession, updateSession, destroySession };
Integration with express-session
For apps using the popular express-session middleware, use connect-redis:
npm install express-session connect-redis
const session = require('express-session');
const RedisStore = require('connect-redis').default;

app.use(
  session({
    store: new RedisStore({ client: redis }),
    secret: process.env.SESSION_SECRET,
    resave: false,
    saveUninitialized: false,
    cookie: {
      secure: process.env.NODE_ENV === 'production',
      httpOnly: true,
      maxAge: 86400 * 1000, // 24 hours in ms
    },
  })
);
Session Invalidation Patterns
Sometimes you need to invalidate all sessions for a user (e.g., password change, account compromise). Track session IDs per user with a Redis set:
async function createSessionWithTracking(userId, data = {}) {
  const sessionId = await createSession(userId, data);
  // Track this session under the user's set
  const userSessionsKey = `user:${userId}:sessions`;
  await redis.sadd(userSessionsKey, sessionId);
  await redis.expire(userSessionsKey, SESSION_TTL);
  return sessionId;
}

async function invalidateAllUserSessions(userId) {
  const userSessionsKey = `user:${userId}:sessions`;
  const sessionIds = await redis.smembers(userSessionsKey);
  if (sessionIds.length === 0) return;

  // Delete all session keys in one round trip (a pipeline batches commands
  // but is not atomic — use MULTI if you need atomicity)
  const pipeline = redis.pipeline();
  sessionIds.forEach((sid) => pipeline.del(`session:${sid}`));
  pipeline.del(userSessionsKey);
  await pipeline.exec();

  console.log(`[Session] Invalidated ${sessionIds.length} sessions for user ${userId}`);
}
Production tip: Use Redis pipelines (batched commands) when deleting multiple keys. A single pipeline.exec() sends all commands in one round trip instead of N round trips.
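One subtlety with pipelines: in ioredis, pipeline.exec() resolves with an array of [err, result] pairs and does not reject just because one queued command failed, so check each pair. The mock below is mine (only the [err, result] convention comes from ioredis) and exists purely to illustrate how to consume the results:

```javascript
// Mock of the ioredis pipeline result shape: exec() resolves to one
// [err, result] pair per queued command, in order.
async function execMockPipeline(commands) {
  const results = [];
  for (const cmd of commands) {
    try {
      results.push([null, await cmd()]);
    } catch (err) {
      results.push([err, null]); // a failed command does not reject exec()
    }
  }
  return results;
}

// Consuming the results: surface per-command errors explicitly.
async function demo() {
  const results = await execMockPipeline([
    async () => 1,
    async () => { throw new Error('WRONGTYPE'); },
    async () => 'OK',
  ]);
  const failures = results.filter(([err]) => err !== null);
  return { total: results.length, failures: failures.length };
}
```

If you silently ignore the error slot, a WRONGTYPE or OOM error on one key in the batch disappears without a trace; at minimum, count and log the failed pairs.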
Pattern 5: Distributed Locks with Redlock Algorithm
Distributed locks prevent race conditions across multiple Node.js processes — things like ensuring only one instance processes a scheduled job, or preventing double-charging in payment flows.
The naive approach (SET NX EX) works for a single Redis instance but fails under replication and failover (including Redis Cluster, which replicates asynchronously): if the primary dies after writing the lock but before replicating it, the promoted replica has no record of the lock, and another client can acquire it. Redlock solves this by requiring a majority quorum across N independent Redis instances.
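The safety condition behind Redlock is plain arithmetic: the lock counts as held only if a majority of instances granted it, and the time spent acquiring it (plus a clock-drift allowance) still leaves usable TTL. A back-of-the-envelope sketch of that check (the helper function is mine; the formula follows the algorithm's published description):

```javascript
// Redlock validity check: quorum = majority of instances,
// validity = ttl - time spent acquiring - clock drift allowance.
function redlockValidity({ instances, acquired, ttlMs, elapsedMs, driftFactor = 0.01 }) {
  const quorum = Math.floor(instances / 2) + 1;
  const drift = Math.round(driftFactor * ttlMs) + 2; // small fixed pad per the algorithm
  const validityMs = ttlMs - elapsedMs - drift;
  return { ok: acquired >= quorum && validityMs > 0, quorum, validityMs };
}

// 5 instances, lock granted on 3, 10s TTL, 50ms spent acquiring:
console.log(redlockValidity({ instances: 5, acquired: 3, ttlMs: 10_000, elapsedMs: 50 }));
// → { ok: true, quorum: 3, validityMs: 9848 }
```

This is why very short TTLs are dangerous with Redlock: with a 500ms TTL, drift plus acquisition latency can consume the entire validity window before your critical section even starts.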
npm install redlock
// lock.js
const Redlock = require('redlock');
const Redis = require('ioredis');

// For production: use 3-5 independent Redis instances for quorum
// For single-instance dev/staging: one client is fine (without quorum guarantees)
const redisClients = [
  new Redis({ host: process.env.REDIS_HOST_1 || '127.0.0.1', port: 6379 }),
  // new Redis({ host: process.env.REDIS_HOST_2, port: 6379 }),
  // new Redis({ host: process.env.REDIS_HOST_3, port: 6379 }),
];

const redlock = new Redlock(redisClients, {
  driftFactor: 0.01, // clock drift compensation
  retryCount: 10, // retry up to 10 times
  retryDelay: 200, // wait 200ms between retries
  retryJitter: 100, // add random jitter to avoid thundering herd
  automaticExtensionThreshold: 500, // auto-extend if still running within 500ms of expiry
});

// redlock v5 (the version with acquire()/release()) emits 'error',
// not the older v4 'clientError' event
redlock.on('error', (err) => {
  console.error('[Redlock] Client error:', err.message);
});

/**
 * Execute a function while holding a distributed lock.
 * The lock is automatically released when the function completes.
 */
async function withLock(resource, ttlMs, fn) {
  let lock;
  try {
    lock = await redlock.acquire([`lock:${resource}`], ttlMs);
    return await fn();
  } catch (err) {
    if (err.name === 'ExecutionError') {
      throw new Error(`Could not acquire lock for resource: ${resource}`);
    }
    throw err;
  } finally {
    if (lock) {
      try {
        await lock.release();
      } catch (releaseErr) {
        // Log but don't rethrow — lock will expire naturally via TTL
        console.error('[Redlock] Lock release failed (will expire via TTL):', releaseErr.message);
      }
    }
  }
}

module.exports = { redlock, withLock };
Real-World Example: Idempotent Payment Processing
// payments.js
const { withLock } = require('./lock');

async function processPayment(orderId, amount, userId) {
  const resource = `payment:order:${orderId}`;
  const lockTTL = 30000; // 30 seconds max processing time

  return withLock(resource, lockTTL, async () => {
    // Check if already processed (idempotency)
    const processed = await redis.get(`processed:${orderId}`);
    if (processed) {
      console.log(`[Payment] Order ${orderId} already processed, skipping`);
      return JSON.parse(processed);
    }

    // Process the payment
    const result = await paymentGateway.charge({ orderId, amount, userId });

    // Mark as processed with a long TTL for idempotency window
    await redis.setex(
      `processed:${orderId}`,
      86400 * 7, // 7 days
      JSON.stringify(result)
    );

    return result;
  });
}
Cron Job Deduplication
// scheduler.js — ensure only one instance runs the daily report job
const { withLock } = require('./lock');

async function runDailyReport() {
  const today = new Date().toISOString().split('T')[0]; // YYYY-MM-DD
  const resource = `cron:daily-report:${today}`;

  try {
    await withLock(resource, 300000, async () => { // 5-minute lock
      console.log('[Cron] Acquired lock, running daily report...');
      await generateAndSendReport();
      console.log('[Cron] Daily report complete');
    });
  } catch (err) {
    if (err.message.includes('Could not acquire lock')) {
      console.log('[Cron] Daily report already running on another instance, skipping');
    } else {
      throw err;
    }
  }
}
Production Concerns
Connection Pooling
ioredis manages a single persistent connection per client instance. For high-throughput applications, use cluster mode (which distributes connections across shards) or create a small pool manually. For most apps, a single ioredis connection handles thousands of concurrent operations via pipelining — don't over-engineer this.
// For Redis Cluster:
const cluster = new Redis.Cluster(
  [
    { host: 'redis-node-1', port: 6379 },
    { host: 'redis-node-2', port: 6379 },
    { host: 'redis-node-3', port: 6379 },
  ],
  {
    redisOptions: {
      password: process.env.REDIS_PASSWORD,
    },
    clusterRetryStrategy(times) {
      return Math.min(times * 100, 3000);
    },
  }
);
Error Handling Strategy
Never let a Redis error crash your entire request handler. Define a circuit-breaker behavior per use case:
| Use case | On Redis failure |
|---|---|
| Cache-aside | Fall through to database |
| Rate limiting | Fail open (allow request) |
| Session storage | Return 503 or redirect to login |
| Distributed lock | Fail closed (reject operation) |
| Pub/Sub | Log and drop message |
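The cache-aside row of that table can be codified in a small wrapper. This is a sketch, with cacheGet and dbGet standing in for your real cache and database calls: any cache error degrades to a database read instead of a failed request:

```javascript
// Cache-aside with fall-through: a cache error is treated as a miss.
// cacheGet/dbGet are placeholders for your real lookup functions.
async function readThrough(cacheGet, dbGet, onCacheError = () => {}) {
  let cached;
  try {
    cached = await cacheGet();
  } catch (err) {
    onCacheError(err); // log/metric the outage, but never fail the request
    cached = null;
  }
  if (cached != null) return cached;
  return dbGet(); // cache miss OR cache outage: hit the database
}
```

During a full Redis outage this shifts the entire read load onto the database, so pair it with an alert on the cache error rate; fail-through keeps you up, but not indefinitely.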
Monitoring Key Metrics
Watch these in production:
- Memory usage: redis.info('memory') — set maxmemory and maxmemory-policy (use allkeys-lru for pure caches)
- Hit rate: track cache hits vs misses in your app and alert if the hit rate drops below 80%
- Keyspace: redis.dbsize() — unexpected key growth often indicates a TTL bug
- Slow log: redis.slowlog('get', 10) — find queries taking over 10ms
- Connected clients: redis.info('clients') — unexpected spikes indicate connection leaks
// Health check endpoint
app.get('/health/redis', async (req, res) => {
  try {
    const start = Date.now();
    await redis.ping();
    const latencyMs = Date.now() - start;

    const info = await redis.info('memory');
    const memMatch = info.match(/used_memory_human:(\S+)/);
    const mem = memMatch ? memMatch[1] : 'unknown';

    res.json({
      status: 'ok',
      latencyMs,
      usedMemory: mem,
    });
  } catch (err) {
    res.status(503).json({ status: 'error', error: err.message });
  }
});
Key Naming Conventions
Consistent key naming makes redis-cli debugging and SCAN-based maintenance infinitely easier:
{namespace}:{entity}:{id} → user:profile:42
{namespace}:{entity}:{id}:{subkey} → user:sessions:42
lock:{resource} → lock:payment:order:99
ratelimit:{identifier} → ratelimit:ip:192.168.1.1
cron:{job}:{date} → cron:daily-report:2025-04-01
Avoid letting : from raw data values leak into your keys — it clashes with this convention and confuses GUI tools (like RedisInsight) that group keys by namespace.
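A tiny helper can enforce the convention and refuse parts that would smuggle in an extra separator (rejecting rather than encoding is just one illustrative policy; the helper name is mine):

```javascript
// Build keys per the {namespace}:{entity}:{id} convention.
// Parts containing ':' are rejected so user data can't break the hierarchy.
function redisKey(...parts) {
  for (const part of parts) {
    if (String(part).includes(':')) {
      throw new Error(`Key part contains reserved ':' separator: ${part}`);
    }
  }
  return parts.join(':');
}

redisKey('user', 'profile', 42); // 'user:profile:42'
```

Centralizing key construction in one helper also gives you a single place to change the scheme later (for example, to add a hash tag like {userId} for Redis Cluster slot pinning).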
Putting It All Together
These five patterns cover the vast majority of production Redis use cases in Node.js:
- Cache-aside eliminates database load for reads, with stampede prevention protecting you at scale
- Pub/Sub enables cross-process real-time messaging without a full broker
- Sliding window rate limiting enforces fair usage without boundary-window exploits
- Session storage makes your sessions scale-out safe and survives restarts
- Redlock coordinates exclusive operations across distributed processes
The common thread: Redis is fast enough that the real complexity lies in handling failure gracefully — cache misses, lock contention, network errors, and Redis restarts. The code above addresses each of these explicitly, because production systems don't get the luxury of a happy path.
If you're new to Redis, start with the cache-aside pattern and add TTL hygiene. Once you're comfortable, layer in rate limiting — it pays immediate dividends in API stability. The distributed lock pattern is your last resort when all else fails: it's powerful but adds latency and operational complexity.
Wilson Xu is a full-stack engineer who writes about Node.js, distributed systems, and developer tooling. Find him on GitHub at chengyixu.