If you've ever deployed a Socket.IO app to Kubernetes and watched your real-time features silently break — no errors, no crashes, just messages vanishing — this is exactly what happened to me, and here's how I fixed it.
The Problem
We had a real-time feature built with WebSockets (Socket.IO). Everything worked perfectly in local and single-instance environments.
The moment we deployed to Kubernetes with multiple pods, things broke.
Symptoms
- Users connected to different pods couldn't receive each other's events
- Messages randomly "disappeared"
- Broadcasting only worked within the same pod
At first glance, everything looked fine — no errors, no crashes. But the system was fundamentally broken at the architectural level.
Root Cause: In-Memory Connections Don't Cross Pod Boundaries
Socket.IO, like any WebSocket server, keeps its connection state in process memory.
That means:
- Each pod has its own isolated set of connected clients
- There is no shared state between pods
So when User A connects to Pod 1 and User B connects to Pod 2, and Pod 1 emits an event — Pod 2 has no idea that event exists.
Each pod becomes a real-time island.
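To make the failure mode concrete, here is a minimal, dependency-free simulation of how the default in-memory adapter behaves. The `Pod` class and all names are illustrative only, not Socket.IO internals:

```javascript
// Each "pod" keeps its own in-memory client registry, just like the
// default Socket.IO adapter does.
class Pod {
  constructor() {
    this.clients = new Map(); // socketId -> inbox of received messages
  }
  connect(socketId) {
    this.clients.set(socketId, []);
  }
  // broadcast() only reaches clients connected to THIS pod
  broadcast(message) {
    for (const inbox of this.clients.values()) inbox.push(message);
  }
}

const pod1 = new Pod();
const pod2 = new Pod();

pod1.connect("userA"); // the load balancer sent User A to pod 1
pod2.connect("userB"); // ...and User B to pod 2

pod1.broadcast("hello"); // event emitted from pod 1

console.log(JSON.stringify(pod1.clients.get("userA"))); // ["hello"]
console.log(JSON.stringify(pod2.clients.get("userB"))); // [] (User B never sees it)
```

No error is thrown anywhere, which is exactly why the breakage is silent: delivery to User B simply never happens.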
The Architecture Problem
Kubernetes distributes traffic using a load balancer. Requests get routed to different pods randomly — and without a shared communication layer, those pods can never talk to each other.
Without a shared messaging layer, a real-time system breaks the moment you scale it horizontally.
The Solution: Redis Pub/Sub + Socket.IO Adapter
To fix this, we introduced Redis Pub/Sub using the official Socket.IO Redis adapter.
What Redis does here
- Acts as a message broker between all pods
- When Pod 1 emits an event → it's published to Redis
- Redis broadcasts it to all subscribed pods
- Every pod then emits it to its own connected clients
Result: All clients receive the event, regardless of which pod they're on.
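The same toy simulation with a broker in the middle shows why this works. The `Broker` class below stands in for Redis Pub/Sub; again, all names are illustrative, not the adapter's actual implementation:

```javascript
// A broker standing in for Redis Pub/Sub: fans each published message
// out to every subscriber.
class Broker {
  constructor() {
    this.handlers = [];
  }
  subscribe(handler) {
    this.handlers.push(handler);
  }
  publish(message) {
    for (const handler of this.handlers) handler(message);
  }
}

class Pod {
  constructor(broker) {
    this.clients = new Map(); // socketId -> inbox
    this.broker = broker;
    // Every pod subscribes; on receipt it delivers to its local clients only
    broker.subscribe((message) => this.deliverLocally(message));
  }
  connect(socketId) {
    this.clients.set(socketId, []);
  }
  deliverLocally(message) {
    for (const inbox of this.clients.values()) inbox.push(message);
  }
  // emit() now goes through the broker instead of delivering locally
  emit(message) {
    this.broker.publish(message);
  }
}

const redis = new Broker();
const pod1 = new Pod(redis);
const pod2 = new Pod(redis);

pod1.connect("userA");
pod2.connect("userB");

pod1.emit("hello"); // published once, delivered on every pod

console.log(JSON.stringify(pod2.clients.get("userB"))); // ["hello"]
```

The key design point: pods never talk to each other directly. Each one only talks to the broker, so adding a tenth pod requires no changes anywhere.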
Implementation
1. Install dependencies
npm install socket.io @socket.io/redis-adapter ioredis
2. Create Redis pub/sub clients
import Redis from "ioredis";
const pubClient = new Redis({ host: "redis-host", port: 6379 });
const subClient = pubClient.duplicate();
Two separate connections are required: one for publishing, one for subscribing. Once a Redis connection enters subscriber mode, it can only issue subscription-related commands, so it cannot also publish.
3. Attach the Redis adapter to Socket.IO
import { Server } from "socket.io";
import { createAdapter } from "@socket.io/redis-adapter";
const io = new Server(server, {
cors: { origin: "*" }
});
io.adapter(createAdapter(pubClient, subClient));
4. Emit events — nothing changes
io.emit("message", data);
No changes to your business logic. Redis handles the cross-pod sync entirely behind the scenes.
Result
After the fix:
- Events are synchronized across all pods
- Real-time communication works reliably in Kubernetes
- No more silent "missing messages"
- The fix required zero changes to business logic
Things You Should Know Before Using This
1. Redis is now a critical dependency
If Redis goes down → your real-time layer breaks. Plan for Redis high-availability (Redis Sentinel or Redis Cluster) in production.
2. Latency increases slightly
There's a small overhead introduced by the pub/sub round-trip. For most real-time apps, this is negligible — but worth knowing.
3. Sticky sessions are only needed for the polling fallback
With the Redis adapter, any pod can serve any user over WebSockets, so session affinity at the load balancer no longer does the cross-pod work. One caveat: if the HTTP long-polling transport stays enabled, Socket.IO still needs sticky sessions so that all polling requests of a session reach the same pod. Running WebSocket-only removes that requirement entirely.
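For reference, restricting the server to WebSocket-only transport is a one-line option. This is a configuration sketch (`server` is your HTTP server, as above); it sidesteps the sticky-session requirement that Socket.IO's HTTP long-polling fallback would otherwise impose, at the cost of losing that fallback for clients that cannot open a WebSocket:

```javascript
const io = new Server(server, {
  transports: ["websocket"], // disable the HTTP long-polling fallback
  cors: { origin: "*" },
});
```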
Alternative Approaches — And Why We Didn't Use Them
Sticky Sessions
- Routes each user to the same pod permanently
- Breaks horizontal scalability — defeats the purpose of multiple pods
- A temporary band-aid, not a real fix
Kafka / RabbitMQ
- Powerful, but significant operational overhead
- Overkill for a straightforward real-time sync requirement
- Redis Pub/Sub hits the sweet spot: simple, fast, battle-tested
Key Takeaway
If you're scaling a real-time system horizontally:
In-memory sockets won't scale across pods. You need a shared messaging layer.
Redis Pub/Sub is one of the simplest and most effective ways to bridge that gap.
Final Thought
This wasn't a bug — it was an architectural gap.
Once you understand that each pod is an isolated process with no awareness of other pods, you start designing distributed systems differently. You stop assuming "in-process = reliable" and start asking "where's the shared state?"
And that's when things actually scale.
