Most WebSocket demos die at 10,000 connections. Not because Node.js cannot handle more, but because the architecture is wrong.
Here are 7 production patterns that take you from “it works locally” to “it survives 1 million concurrent connections”.
1. Single Instance Chat → Redis Pub/Sub Fan-Out
A single ws server works until you add a second instance behind a load balancer.
Before (single instance only):
import { WebSocketServer, WebSocket } from 'ws';

const wss = new WebSocketServer({ port: 3000 });
const clients = new Set<WebSocket>();

wss.on('connection', (ws) => {
  clients.add(ws);

  // Broadcast every message to every client this process knows about
  ws.on('message', (msg) => {
    for (const client of clients) {
      if (client.readyState === WebSocket.OPEN) {
        client.send(msg.toString());
      }
    }
  });

  ws.on('close', () => clients.delete(ws));
});
Deploy this across 3 instances and messages disappear. Each process only knows about its own connections.
After (Redis pub/sub bridge):
import { WebSocketServer, WebSocket } from 'ws';
import { createClient } from 'redis';

const wss = new WebSocketServer({ port: 3000 });

// Two connections: a Redis client in subscriber mode cannot also publish
const pub = createClient({ url: process.env.REDIS_URL });
const sub = createClient({ url: process.env.REDIS_URL });
await pub.connect();
await sub.connect();

const localClients = new Set<WebSocket>();

wss.on('connection', (ws) => {
  localClients.add(ws);

  // Publish to Redis instead of broadcasting directly
  ws.on('message', async (msg) => {
    await pub.publish('chat', msg.toString());
  });

  ws.on('close', () => localClients.delete(ws));
});

// Every instance receives the message and fans out to its own sockets
await sub.subscribe('chat', (message) => {
  for (const client of localClients) {
    if (client.readyState === WebSocket.OPEN) {
      client.send(message);
    }
  }
});
Now any instance can publish. All instances broadcast locally. Horizontal scaling unlocked.
2. No Sticky Sessions → Session Affinity at the Load Balancer
Without sticky sessions, reconnects land on different instances and break in-memory state.
Before: default round-robin load balancer.
After: enable session affinity.
Nginx example:
upstream websocket_backend {
    ip_hash;
    server app1:3000;
    server app2:3000;
    server app3:3000;
}
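Note that ip_hash alone is not enough: for WebSockets to pass through Nginx at all, the server block must also forward the HTTP Upgrade handshake. A minimal sketch (the upstream name matches the block above; the path and timeout are illustrative):

```nginx
server {
    listen 80;

    location /ws {
        proxy_pass http://websocket_backend;
        proxy_http_version 1.1;
        # Forward the Upgrade handshake so the connection becomes a WebSocket
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        # Default read timeout cuts idle connections after 60s; raise it
        # for long-lived sockets (heartbeats also keep it alive)
        proxy_read_timeout 3600s;
    }
}
```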
Or use cookie-based affinity on managed platforms.
This keeps a client pinned to one instance, which drastically reduces cross-node coordination overhead.
3. No Heartbeat → Ping/Pong Cleanup
Dead connections silently consume memory and file descriptors.
Before:
wss.on('connection', (ws) => {
// no liveness tracking
});
This leaks connections on mobile network drops.
After:
const INTERVAL = 30000;

wss.on('connection', (ws) => {
  let alive = true;

  // Any pong proves the peer is still responsive
  ws.on('pong', () => {
    alive = true;
  });

  const heartbeat = setInterval(() => {
    if (!alive) {
      // No pong since the last ping: kill the socket immediately.
      // terminate() skips the closing handshake a dead peer can't answer.
      ws.terminate();
      return;
    }
    alive = false;
    ws.ping();
  }, INTERVAL);

  ws.on('close', () => clearInterval(heartbeat));
});
Without this, dead sockets pile up until the process exhausts memory or file descriptors. Every production server needs it.
4. Blocking the Event Loop → Worker Threads for Heavy Work
Heavy JSON parsing, crypto, compression: do that work on the main thread and latency spikes for every connection, because Node.js runs all of them on a single event loop.
Before:
ws.on('message', (data) => {
  const parsed = JSON.parse(data.toString());
  const result = expensiveOperation(parsed);
  ws.send(JSON.stringify(result));
});
This blocks the event loop.
After (worker threads):
import { Worker } from 'worker_threads';

function runWorker(payload: unknown): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const worker = new Worker('./worker.js', {
      workerData: payload
    });
    worker.on('message', resolve);
    worker.on('error', reject);
    // Reject if the worker dies without ever posting a result
    worker.on('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

ws.on('message', async (data) => {
  const parsed = JSON.parse(data.toString());
  const result = await runWorker(parsed);
  ws.send(JSON.stringify(result));
});
In production, use a worker pool like piscina. This isolates CPU spikes from your real-time loop.
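The worker.js side is not shown above. Here is both halves of the round trip in one self-contained sketch, using an inline eval worker for brevity and a hypothetical sum-of-squares loop standing in for the real CPU-heavy work:

```typescript
import { Worker } from 'worker_threads';

// Inline worker source for demo purposes only; in production this lives
// in its own file (./worker.js) and does your real heavy lifting.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // Hypothetical CPU-heavy step: sum of squares below n
  let total = 0;
  for (let i = 0; i < workerData.n; i++) total += i * i;
  parentPort.postMessage(total);
`;

function runWorker(n: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.on('message', resolve); // result posted back from the worker
    worker.on('error', reject);
  });
}

runWorker(1000).then((total) => console.log(total));
```

While the worker grinds through the loop, the main thread stays free to service other sockets.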
5. Stateless Reconnect → Exponential Backoff With Jitter
If 20,000 clients reconnect at once, your server never recovers.
Before:
ws.onclose = () => {
  setTimeout(connect, 1000);
};
All clients retry at the same time.
After:
function reconnect(attempt: number) {
  const base = 1000;
  const max = 30000;
  // Double the delay on every attempt, capped at 30s
  const delay = Math.min(base * 2 ** attempt, max);
  // Random jitter spreads retries so clients don't synchronize
  const jitter = Math.random() * 1000;
  setTimeout(connect, delay + jitter);
}
Backoff plus jitter prevents the thundering herd problem.
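One detail the snippet leaves out: the attempt counter must reset once a connection actually succeeds, or every later reconnect starts at the 30-second cap. A minimal client sketch (the URL is a placeholder; `computeDelay` mirrors the numbers above):

```typescript
// Capped exponential backoff with up to 1s of random jitter
function computeDelay(attempt: number, base = 1000, max = 30000): number {
  return Math.min(base * 2 ** attempt, max) + Math.random() * 1000;
}

let attempt = 0;

function connect(): void {
  const ws = new WebSocket('wss://example.com/socket'); // placeholder URL
  ws.onopen = () => {
    attempt = 0; // reset so the next outage starts back at ~1s
  };
  ws.onclose = () => {
    setTimeout(connect, computeDelay(attempt));
    attempt += 1;
  };
}
```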
6. Using WebSockets for Everything → Event Bus Architecture
A common mistake is replacing your REST API with WebSockets.
That does not scale cleanly.
Before:
ws.on('message', async (msg) => {
  const data = JSON.parse(msg.toString());
  const result = await db.query('SELECT * FROM users WHERE id = ?', [data.id]);
  ws.send(JSON.stringify(result));
});
Your WebSocket layer now owns business logic and database access.
After:
// REST handles writes
app.post('/update-user', async (req, res) => {
  const user = await updateUser(req.body);
  await redis.publish('user.updated', JSON.stringify(user));
  res.json(user);
});

// WebSocket only pushes events
await redis.subscribe('user.updated', (msg) => {
  broadcastToRoom('users', msg);
});
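`broadcastToRoom` is left undefined above; a minimal sketch over a per-instance room registry might look like this (the `Socket` type is structural so the sketch stands alone; in the real server these are ws WebSocket instances, and all names here are assumptions, not a library API):

```typescript
// Minimal structural type matching what the sketch needs from a socket
type Socket = {
  readyState: number;
  OPEN: number;
  send(data: string): void;
  on(event: 'close', cb: () => void): void;
};

// room name -> sockets on this instance subscribed to that room
const rooms = new Map<string, Set<Socket>>();

function joinRoom(room: string, ws: Socket): void {
  const members = rooms.get(room) ?? new Set<Socket>();
  rooms.set(room, members);
  members.add(ws);
  // Drop the socket from the room when it disconnects
  ws.on('close', () => members.delete(ws));
}

function broadcastToRoom(room: string, payload: string): void {
  const members = rooms.get(room);
  if (!members) return; // nobody on this instance cares about the room
  for (const ws of members) {
    if (ws.readyState === ws.OPEN) ws.send(payload);
  }
}
```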
WebSockets become a delivery layer for events. Not your query engine.
If you are designing large JavaScript systems, this separation mirrors the event-driven architecture patterns discussed in the JavaScript application architecture system design guide.
7. Ignoring OS Limits → Raising File Descriptors
You will hit the OS limit before you hit CPU.
Check:
ulimit -n
Typical default: 1024.
Raise it for the current shell:
ulimit -n 100000
Persist in /etc/security/limits.conf:
* soft nofile 100000
* hard nofile 100000
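One caveat worth knowing: processes managed by systemd ignore limits.conf, so for a service the limit goes in the unit file instead (a config sketch; the unit name is a placeholder):

```ini
# /etc/systemd/system/myapp.service (excerpt)
[Service]
LimitNOFILE=100000
```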
Each WebSocket connection consumes one file descriptor. At scale, this matters more than your Node.js code.
What Actually Gets You to 1 Million Connections
It is not a magic library.
It is:
- Horizontal scaling with Redis or a coordinator
- Sticky sessions
- Heartbeat cleanup
- Worker isolation for CPU work
- Backoff on reconnect
- Clear separation between API and event delivery
- Proper OS tuning
Most teams fail at one of these and blame Node.js.
You can implement all 7 patterns in a weekend. The difference between a 5,000 connection demo and a 1,000,000 connection system is not syntax. It is architecture.
If you are building real-time systems, start by load testing at 1,000 connections locally. Measure latency. Watch file descriptors. Then scale horizontally before production forces you to.