DEV Community

JSGuruJobs


7 WebSocket Scaling Patterns That Let Node.js Handle 1M Real-Time Connections

Most WebSocket demos die at 10,000 connections. Not because Node.js cannot handle more, but because the architecture is wrong.

Here are 7 production patterns that take you from “it works locally” to “it survives 1 million concurrent connections”.


1. Single Instance Chat → Redis Pub/Sub Fan-Out

A single ws server works until you add a second instance behind a load balancer.

Before (single instance only):

import { WebSocketServer, WebSocket } from 'ws';

const wss = new WebSocketServer({ port: 3000 });
const clients = new Set<WebSocket>();

wss.on('connection', (ws) => {
  clients.add(ws);

  ws.on('message', (msg) => {
    for (const client of clients) {
      if (client.readyState === ws.OPEN) {
        client.send(msg.toString());
      }
    }
  });

  ws.on('close', () => clients.delete(ws));
});

Deploy this across 3 instances and messages disappear. Each process only knows about its own connections.

After (Redis pub/sub bridge):

import { WebSocketServer, WebSocket } from 'ws';
import { createClient } from 'redis';

const wss = new WebSocketServer({ port: 3000 });

const pub = createClient({ url: process.env.REDIS_URL });
const sub = createClient({ url: process.env.REDIS_URL });

await pub.connect();
await sub.connect();

const localClients = new Set<WebSocket>();

wss.on('connection', (ws) => {
  localClients.add(ws);

  ws.on('message', async (msg) => {
    await pub.publish('chat', msg.toString());
  });

  ws.on('close', () => localClients.delete(ws));
});

await sub.subscribe('chat', (message) => {
  for (const client of localClients) {
    if (client.readyState === client.OPEN) {
      client.send(message);
    }
  }
});

Now any instance can publish. All instances broadcast locally. Horizontal scaling unlocked.


2. No Sticky Sessions → Session Affinity at the Load Balancer

Without sticky sessions, reconnects land on different instances and break in-memory state.

Before: default round-robin load balancer.

After: enable session affinity.

Nginx example:

upstream websocket_backend {
  ip_hash;
  server app1:3000;
  server app2:3000;
  server app3:3000;
}

Or use cookie-based affinity on managed platforms.

This keeps a client pinned to one instance, which drastically reduces cross-node coordination overhead.


3. No Heartbeat → Ping/Pong Cleanup

Dead connections silently consume memory and file descriptors.

Before:

wss.on('connection', (ws) => {
  // no liveness tracking
});

This leaks connections on mobile network drops.

After:

const INTERVAL = 30000;

wss.on('connection', (ws) => {
  let alive = true;

  ws.on('pong', () => {
    alive = true;
  });

  const heartbeat = setInterval(() => {
    if (!alive) {
      ws.terminate();
      return;
    }
    alive = false;
    ws.ping();
  }, INTERVAL);

  ws.on('close', () => clearInterval(heartbeat));
});

Without this, you will eventually crash under load. Every production server needs it.


4. Blocking the Event Loop → Worker Threads for Heavy Work

JSON parsing, crypto, compression: do that work on the main thread and latency spikes for every connection.

Before:

ws.on('message', (data) => {
  const parsed = JSON.parse(data.toString());
  const result = expensiveOperation(parsed);
  ws.send(JSON.stringify(result));
});

This blocks the event loop.

After (worker threads):

import { Worker } from 'worker_threads';

function runWorker(payload: unknown) {
  return new Promise((resolve, reject) => {
    // Note: spawning a Worker per message is expensive;
    // in production, reuse workers via a pool.
    const worker = new Worker('./worker.js', {
      workerData: payload
    });

    worker.on('message', resolve);
    worker.on('error', reject);
  });
}

ws.on('message', async (data) => {
  const parsed = JSON.parse(data.toString());
  const result = await runWorker(parsed);
  ws.send(JSON.stringify(result));
});

In production, use a worker pool like piscina. This isolates CPU spikes from your real-time loop.
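To show the shape of the pooled approach without pulling in a dependency, here is a minimal fixed-size pool built on Node's own worker_threads. It is a sketch of what a library like piscina does for you; the inline worker script and the summing task are illustrative stand-ins for your real CPU-bound work.

```typescript
import { Worker } from 'worker_threads';
import { cpus } from 'os';

// Inline worker (CommonJS, via eval): runs a simulated expensive task.
const workerScript = `
const { parentPort } = require('worker_threads');
parentPort.on('message', ({ id, n }) => {
  // Simulated CPU-bound work: sum 1..n.
  let sum = 0;
  for (let i = 1; i <= n; i++) sum += i;
  parentPort.postMessage({ id, result: sum });
});
`;

class WorkerPool {
  private workers: Worker[] = [];
  private next = 0;
  private pending = new Map<number, (result: number) => void>();
  private seq = 0;

  constructor(size = cpus().length) {
    for (let i = 0; i < size; i++) {
      const w = new Worker(workerScript, { eval: true });
      w.on('message', (msg: { id: number; result: number }) => {
        this.pending.get(msg.id)?.(msg.result);
        this.pending.delete(msg.id);
      });
      this.workers.push(w);
    }
  }

  run(n: number): Promise<number> {
    const id = this.seq++;
    return new Promise((resolve) => {
      this.pending.set(id, resolve);
      // Round-robin dispatch spreads tasks across workers.
      this.workers[this.next].postMessage({ id, n });
      this.next = (this.next + 1) % this.workers.length;
    });
  }

  async close() {
    await Promise.all(this.workers.map((w) => w.terminate()));
  }
}
```

The workers are created once and reused, so the per-message cost is a postMessage round trip instead of a process-like spawn.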


5. Stateless Reconnect → Exponential Backoff With Jitter

If 20,000 clients reconnect at once, your server never recovers.

Before:

ws.onclose = () => {
  setTimeout(connect, 1000);
};

All clients retry at the same time.

After:

function reconnect(attempt: number) {
  const base = 1000;
  const max = 30000;

  const delay = Math.min(base * 2 ** attempt, max);
  const jitter = Math.random() * 1000;

  // `connect` should re-open the socket and, on failure,
  // call reconnect(attempt + 1)
  setTimeout(connect, delay + jitter);
}

Backoff plus jitter prevents the thundering herd problem.
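For completeness, here is one way to wire the policy into a client, assuming a browser-style global WebSocket; the url is a placeholder, and `nextDelay` is kept pure so the backoff math can be tested in isolation.

```typescript
// Pure backoff policy: exponential growth capped at `max`, plus random jitter.
export function nextDelay(
  attempt: number,
  base = 1000,
  max = 30000,
  jitterMax = 1000
): number {
  const delay = Math.min(base * 2 ** attempt, max);
  return delay + Math.random() * jitterMax;
}

// Illustrative wiring: reset the attempt counter on a successful open.
export function connectWithBackoff(url: string): void {
  let attempt = 0;

  function connect(): void {
    const ws = new WebSocket(url);

    ws.onopen = () => {
      attempt = 0;
    };

    ws.onclose = () => {
      setTimeout(connect, nextDelay(attempt++));
    };
  }

  connect();
}
```

Resetting `attempt` on open matters: without it, a client that has failed many times in the past keeps waiting 30 seconds to recover from every future blip.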


6. Using WebSockets for Everything → Event Bus Architecture

A common mistake is replacing your REST API with WebSockets.

That does not scale cleanly.

Before:

ws.on('message', async (msg) => {
  const data = JSON.parse(msg.toString());
  const result = await db.query('SELECT * FROM users WHERE id = ?', [data.id]);
  ws.send(JSON.stringify(result));
});

Your WebSocket layer now owns business logic and database access.

After:

// REST handles writes
app.post('/update-user', async (req, res) => {
  const user = await updateUser(req.body);
  await redis.publish('user.updated', JSON.stringify(user));
  res.json(user);
});

// WebSocket layer only pushes events
// (node-redis needs a dedicated connection in subscriber mode)
await redis.subscribe('user.updated', (msg) => {
  broadcastToRoom('users', msg);
});

WebSockets become a delivery layer for events. Not your query engine.
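The `broadcastToRoom` helper above is left undefined; a minimal room registry might look like this. Sockets are duck-typed here so the logic is testable without a live server, and the names are illustrative.

```typescript
// Minimal room registry. `Sendable` is the small surface we need from ws.
type Sendable = { readyState: number; send(data: string): void };
const OPEN = 1; // matches WebSocket.OPEN

const rooms = new Map<string, Set<Sendable>>();

export function joinRoom(room: string, ws: Sendable): void {
  let members = rooms.get(room);
  if (!members) {
    members = new Set();
    rooms.set(room, members);
  }
  members.add(ws);
}

export function leaveRoom(room: string, ws: Sendable): void {
  rooms.get(room)?.delete(ws);
}

export function broadcastToRoom(room: string, msg: string): void {
  for (const ws of rooms.get(room) ?? []) {
    if (ws.readyState === OPEN) ws.send(msg);
  }
}
```

Call `leaveRoom` from each socket's close handler, or the room sets leak connections the same way unmonitored sockets do in pattern 3.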

If you are designing large JavaScript systems, this separation mirrors the event-driven architecture patterns discussed in the JavaScript application architecture system design guide.


7. Ignoring OS Limits → Raising File Descriptors

You will hit the OS limit before you hit CPU.

Check:

ulimit -n

Typical default: 1024.

Raise it:

ulimit -n 100000

Persist in /etc/security/limits.conf:

* soft nofile 100000
* hard nofile 100000

Each WebSocket connection consumes one file descriptor. At scale, this matters more than your Node.js code.
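It also helps to watch descriptor usage from inside the process so you see pressure before the limit bites. On Linux this can be done by counting entries in /proc/self/fd; the helper name here is illustrative.

```typescript
import { readdirSync } from 'fs';

// Counts file descriptors currently open in this process (Linux only).
export function openFdCount(): number {
  return readdirSync('/proc/self/fd').length;
}

// Example: log descriptor pressure periodically alongside connection counts.
// setInterval(() => console.log('open fds:', openFdCount()), 10_000);
```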


What Actually Gets You to 1 Million Connections

It is not a magic library.

It is:

  • Horizontal scaling with Redis or a coordinator
  • Sticky sessions
  • Heartbeat cleanup
  • Worker isolation for CPU work
  • Backoff on reconnect
  • Clear separation between API and event delivery
  • Proper OS tuning

Most teams fail at one of these and blame Node.js.

You can implement all 7 patterns in a weekend. The difference between a 5,000-connection demo and a 1,000,000-connection system is not syntax. It is architecture.

If you are building real-time systems, start by load testing at 1,000 connections locally. Measure latency. Watch file descriptors. Then scale horizontally before production forces you to.
