Real-time systems are unforgiving. In a standard web application, a 500-millisecond delay is an inconvenience. In a real-time multiplayer gaming platform, 500 milliseconds is the difference between a fair outcome and a furious user demanding a refund.
Between October 2019 and December 2021, I served as Senior Software Engineer at Lordwin Group, where I led a team of five backend developers building dice.ng — a real-time gaming platform where thousands of users placed wagers, watched outcomes resolve live, and expected the entire experience to feel instantaneous. The platform also encompassed an investment management system and hotel booking service, but the gaming backend was the most technically demanding piece of infrastructure I have ever designed.
This article is a technical deep dive into how I architected that real-time gaming backend from the ground up — the WebSocket infrastructure, the event-driven architecture, the scaling strategy, and the hard lessons learned from operating a system where latency directly impacts revenue.
The Problem: Sub-100ms Latency for Thousands of Simultaneous Players
The requirements were aggressive from day one. Lordwin Group needed a gaming platform that could:
- Support 3,000+ concurrent WebSocket connections during peak hours
- Deliver game events (dice rolls, bet confirmations, payout calculations) with sub-100ms latency
- Maintain absolute consistency in game state — in a wagering platform, a race condition is not a bug, it is a financial liability
- Handle burst traffic patterns — user activity would spike dramatically around evening hours and weekends, sometimes tripling within minutes
- Achieve 99.9% uptime — every minute of downtime was measurable lost revenue
The initial prototype used HTTP polling. Clients hit an API endpoint every two seconds to check for game state updates. At 500 concurrent users, this generated over 15,000 HTTP requests per minute, the database was drowning, and the "real-time" experience felt sluggish and broken. It was immediately clear that polling could not scale. We needed a fundamentally different architecture.
My Role: Technical Lead and System Architect
As the senior engineer leading a team of five, my responsibilities went beyond writing code. I made the core architecture decisions, designed the WebSocket infrastructure, established the event-driven messaging patterns, defined the horizontal scaling strategy, and mentored junior engineers on real-time system design.
I was not just a contributor on this project; I owned the technical direction. Every critical design decision described in this article was one I either made directly or guided the team toward after evaluating alternatives.
Technical Deep Dive: Building the Real-Time Infrastructure
1. WebSocket Server Architecture with Node.js
I chose Node.js as the runtime for the WebSocket server. Its event-driven, non-blocking I/O model is purpose-built for maintaining thousands of persistent connections with minimal resource overhead. PHP (our backend for the investment and hotel systems) was poorly suited for long-lived connections, so I designed the gaming layer as a separate Node.js service communicating with the main platform through Redis message channels.
The core WebSocket server was built on the ws library rather than Socket.IO. While Socket.IO offers convenience features like automatic reconnection and room management, it adds protocol overhead and abstractions that reduce control. For a latency-sensitive gaming platform, I needed raw WebSocket performance with custom protocol handling.
```javascript
const WebSocket = require('ws');
const Redis = require('ioredis');
const { v4: uuidv4 } = require('uuid');

const wss = new WebSocket.Server({
  port: 8080,
  maxPayload: 1024 * 16,
  perMessageDeflate: false,
  clientTracking: true,
});

const connections = new Map();

wss.on('connection', (ws, req) => {
  const connectionId = uuidv4();
  const clientIp = req.headers['x-forwarded-for'] || req.socket.remoteAddress;

  connections.set(connectionId, {
    ws,
    clientIp,
    userId: null,
    rooms: new Set(),
    connectedAt: Date.now(),
    lastHeartbeat: Date.now(),
  });

  ws.on('message', (raw) => {
    try {
      const message = JSON.parse(raw);
      routeMessage(connectionId, message);
    } catch (err) {
      ws.send(JSON.stringify({ type: 'error', code: 'INVALID_PAYLOAD' }));
    }
  });

  ws.on('close', () => {
    cleanupConnection(connectionId);
  });

  ws.on('pong', () => {
    const conn = connections.get(connectionId);
    if (conn) conn.lastHeartbeat = Date.now();
  });

  ws.send(JSON.stringify({
    type: 'connected',
    connectionId,
    serverTime: Date.now(),
  }));
});
```
I disabled perMessageDeflate deliberately. Compression adds CPU overhead per message, and since our payloads were small (typically under 500 bytes for game events), the bandwidth savings were negligible compared to the latency cost. This single configuration change reduced median message delivery time by 8ms across our benchmark tests.
2. Connection Management and Authentication
In a gaming platform handling real money, every WebSocket connection must be authenticated. I implemented a token-based authentication flow where clients first obtain a short-lived JWT via the REST API, then present it during WebSocket handshake.
```javascript
const jwt = require('jsonwebtoken');

// Maps an authenticated userId to its connectionId for direct messaging.
const userConnectionIndex = new Map();

function sendError(ws, code) {
  ws.send(JSON.stringify({ type: 'error', code }));
}

function routeMessage(connectionId, message) {
  const conn = connections.get(connectionId);
  if (!conn) return;

  switch (message.type) {
    case 'authenticate':
      handleAuthentication(connectionId, message.token);
      break;
    case 'join_game':
      if (!conn.userId) return sendError(conn.ws, 'NOT_AUTHENTICATED');
      handleJoinGame(connectionId, message.gameId);
      break;
    case 'place_bet':
      if (!conn.userId) return sendError(conn.ws, 'NOT_AUTHENTICATED');
      handlePlaceBet(connectionId, message);
      break;
    case 'ping':
      conn.ws.send(JSON.stringify({ type: 'pong', serverTime: Date.now() }));
      break;
    default:
      sendError(conn.ws, 'UNKNOWN_MESSAGE_TYPE');
  }
}

function handleAuthentication(connectionId, token) {
  const conn = connections.get(connectionId);
  if (!conn) return;

  try {
    const payload = jwt.verify(token, process.env.JWT_SECRET, {
      algorithms: ['HS256'],
      maxAge: '5m',
    });
    conn.userId = payload.userId;
    userConnectionIndex.set(payload.userId, connectionId);
    conn.ws.send(JSON.stringify({ type: 'authenticated', userId: payload.userId }));
  } catch (err) {
    conn.ws.send(JSON.stringify({ type: 'auth_failed', reason: 'INVALID_TOKEN' }));
    conn.ws.close(4001, 'Authentication failed');
  }
}
```
The JWT had a deliberately short expiry of five minutes. Since it was only used for the WebSocket handshake, there was no reason for it to live longer. Once authenticated, the connection was maintained through heartbeat pings rather than token refresh cycles.
3. Room-Based Game State Management
Each active game session functioned as a "room" — a logical grouping of WebSocket connections that should receive the same events. I implemented a lightweight room system without relying on external libraries:
```javascript
const rooms = new Map();

function handleJoinGame(connectionId, gameId) {
  const conn = connections.get(connectionId);
  if (!conn) return;

  if (!rooms.has(gameId)) {
    rooms.set(gameId, new Set());
  }
  rooms.get(gameId).add(connectionId);
  conn.rooms.add(gameId);

  conn.ws.send(JSON.stringify({
    type: 'game_joined',
    gameId,
    players: rooms.get(gameId).size,
  }));

  broadcastToRoom(gameId, {
    type: 'player_joined',
    userId: conn.userId,
    playerCount: rooms.get(gameId).size,
  }, connectionId);
}

function broadcastToRoom(gameId, payload, excludeConnectionId = null) {
  const room = rooms.get(gameId);
  if (!room) return;

  const message = JSON.stringify(payload);
  for (const connId of room) {
    if (connId === excludeConnectionId) continue;
    const conn = connections.get(connId);
    if (conn && conn.ws.readyState === WebSocket.OPEN) {
      conn.ws.send(message);
    }
  }
}
```
One critical optimisation: I serialised the message payload once with JSON.stringify outside the loop, then sent the same string buffer to every client. For a room of 500 players, this avoided 499 redundant serialisation operations per broadcast — a saving that compounds rapidly under load.
4. Event-Driven Game Engine with Redis Pub/Sub
The game logic itself — dice rolling, bet resolution, payout calculation — ran in a separate process from the WebSocket server. This was a deliberate architectural decision. The WebSocket server's only responsibility was managing connections and delivering messages. Game logic, financial calculations, and database writes happened in dedicated worker processes.
Redis Pub/Sub was the communication backbone:
```javascript
const publisher = new Redis(process.env.REDIS_URL);
const subscriber = new Redis(process.env.REDIS_URL);

subscriber.subscribe('game:events', 'game:results', 'system:announcements');

subscriber.on('message', (channel, message) => {
  const event = JSON.parse(message);
  switch (channel) {
    case 'game:events':
      broadcastToRoom(event.gameId, {
        type: 'game_event',
        event: event.eventType,
        data: event.payload,
        timestamp: event.timestamp,
      });
      break;
    case 'game:results':
      handleGameResult(event);
      break;
    case 'system:announcements':
      broadcastToAll({
        type: 'system_announcement',
        message: event.message,
      });
      break;
  }
});

function broadcastToAll(payload) {
  const message = JSON.stringify(payload);
  for (const conn of connections.values()) {
    if (conn.ws.readyState === WebSocket.OPEN) {
      conn.ws.send(message);
    }
  }
}

async function handlePlaceBet(connectionId, message) {
  const conn = connections.get(connectionId);
  if (!conn || !conn.userId) return;

  const betEvent = {
    gameId: message.gameId,
    userId: conn.userId,
    amount: message.amount,
    selection: message.selection,
    timestamp: Date.now(),
    idempotencyKey: message.idempotencyKey,
  };

  await publisher.publish('bets:incoming', JSON.stringify(betEvent));

  conn.ws.send(JSON.stringify({
    type: 'bet_acknowledged',
    idempotencyKey: message.idempotencyKey,
    status: 'processing',
  }));
}
```
This decoupling had three critical benefits:
- Fault isolation: If the game engine crashed, WebSocket connections remained alive. Users saw a brief pause rather than a disconnection.
- Independent scaling: I could run multiple game engine workers to process bets in parallel without changing the WebSocket layer.
- Auditability: Every game event flowed through Redis, creating a natural event stream that I logged for regulatory compliance and dispute resolution.
5. Horizontal Scaling Across Multiple WebSocket Servers
A single Node.js process can handle approximately 10,000 concurrent WebSocket connections before memory and CPU become constraints. But scaling WebSockets horizontally introduces a problem that REST APIs do not have: connection affinity. If Player A is connected to Server 1 and Player B to Server 2, a room broadcast must reach both servers.
I solved this with Redis as the cross-server message bus:
```javascript
const serverId = process.env.SERVER_ID || uuidv4();

subscriber.subscribe('ws:broadcast');

subscriber.on('message', (channel, raw) => {
  if (channel !== 'ws:broadcast') return;
  const message = JSON.parse(raw);
  if (message.originServer === serverId) return; // already delivered locally

  if (message.targetRoom) {
    broadcastToRoom(message.targetRoom, message.payload);
  } else if (message.targetUser) {
    sendToUser(message.targetUser, message.payload);
  }
});

function sendToUser(userId, payload) {
  const connectionId = userConnectionIndex.get(userId);
  const conn = connectionId ? connections.get(connectionId) : null;
  if (conn && conn.ws.readyState === WebSocket.OPEN) {
    conn.ws.send(JSON.stringify(payload));
  }
}

function clusterBroadcastToRoom(gameId, payload) {
  broadcastToRoom(gameId, payload);
  publisher.publish('ws:broadcast', JSON.stringify({
    originServer: serverId,
    targetRoom: gameId,
    payload,
  }));
}
```
Each WebSocket server instance subscribed to a shared Redis channel. When a game event needed to reach all players in a room, the originating server broadcast to its local connections and simultaneously published to Redis, where other servers picked up the message and relayed it to their local connections.
The originServer check prevented message loops — without it, a server would re-broadcast messages it had already delivered locally.
6. Heartbeat Monitoring and Connection Cleanup
Stale connections are a silent performance killer in WebSocket systems. Users close browser tabs, lose internet connectivity, or let their phones go to sleep. Without proactive cleanup, the server accumulates zombie connections that consume memory and distort room player counts.
I implemented a heartbeat interval that pinged every client every 30 seconds:
```javascript
const HEARTBEAT_INTERVAL = 30000;
const HEARTBEAT_TIMEOUT = 45000;

setInterval(() => {
  const now = Date.now();
  for (const [connectionId, conn] of connections) {
    if (now - conn.lastHeartbeat > HEARTBEAT_TIMEOUT) {
      conn.ws.terminate();
      cleanupConnection(connectionId);
      continue;
    }
    if (conn.ws.readyState === WebSocket.OPEN) {
      conn.ws.ping();
    }
  }
}, HEARTBEAT_INTERVAL);

function cleanupConnection(connectionId) {
  const conn = connections.get(connectionId);
  if (!conn) return;

  for (const gameId of conn.rooms) {
    const room = rooms.get(gameId);
    if (room) {
      room.delete(connectionId);
      if (room.size === 0) {
        rooms.delete(gameId);
      } else {
        broadcastToRoom(gameId, {
          type: 'player_left',
          userId: conn.userId,
          playerCount: room.size,
        });
      }
    }
  }

  if (conn.userId) {
    userConnectionIndex.delete(conn.userId);
  }
  connections.delete(connectionId);
}
```
This heartbeat mechanism maintained accurate player counts and ensured we never wasted resources on dead connections — critical when operating at 3,000+ concurrent users where even small inefficiencies compound.
Ensuring Game Integrity: The Financial Safety Layer
In a platform where real money is at stake, game outcome integrity is non-negotiable. I designed several safeguards:
Idempotent Bet Processing: Every bet carried a client-generated idempotency key. If a network hiccup caused a duplicate submission, the game engine would recognise the duplicate key and return the original response rather than processing the bet twice.
Atomic Balance Operations: All balance changes were serialised through Redis-based locking (with WATCH/MULTI/EXEC for compound in-memory updates) and database-level row locking for persistence, ensuring no user could ever bet more than their balance, even under concurrent requests.
```javascript
// `redis` is a shared ioredis client; acquireLock/releaseLock wrap a
// Redis-based distributed lock.
async function processBalanceDeduction(userId, amount, idempotencyKey) {
  const lockKey = `lock:balance:${userId}`;
  const lock = await acquireLock(lockKey, 5000);
  try {
    // Replay protection: duplicate keys get the original result back.
    const processed = await redis.get(`idempotency:${idempotencyKey}`);
    if (processed) return JSON.parse(processed);

    const raw = await redis.get(`balance:${userId}`);
    const balance = raw === null ? 0 : parseFloat(raw);
    if (!Number.isFinite(balance) || balance < amount) {
      return { success: false, reason: 'INSUFFICIENT_BALANCE' };
    }

    const newBalance = balance - amount;
    await redis.set(`balance:${userId}`, newBalance.toString());

    const result = { success: true, newBalance, deducted: amount };
    await redis.setex(`idempotency:${idempotencyKey}`, 3600, JSON.stringify(result));
    return result;
  } finally {
    await releaseLock(lock);
  }
}
```
Provably Fair Outcomes: Game results were generated using a combination of server seed and client seed, hashed together. Users could verify after each round that the outcome was not manipulated. This was both a regulatory requirement and a trust-building feature.
Monitoring and Observability in Production
Operating a real-time gaming system requires comprehensive observability. I built custom monitoring dashboards tracking:
- Active connections per server instance — to trigger auto-scaling
- Message throughput — messages sent per second across all rooms
- Event latency — time from game engine event emission to client delivery
- Redis pub/sub lag — early warning for message bus saturation
- Game round completion times — anomaly detection for stuck game sessions
I instrumented the WebSocket server with Prometheus metrics:
```javascript
const client = require('prom-client');

const activeConnections = new client.Gauge({
  name: 'ws_active_connections',
  help: 'Number of active WebSocket connections',
});

const messageLatency = new client.Histogram({
  name: 'ws_message_latency_ms',
  help: 'Message delivery latency in milliseconds',
  buckets: [5, 10, 25, 50, 100, 250, 500],
});

const messagesPerSecond = new client.Counter({
  name: 'ws_messages_total',
  help: 'Total WebSocket messages sent',
  labelNames: ['type'],
});
```
These metrics fed into Grafana dashboards that gave me real-time visibility into system health. On several occasions, a spike in message latency gave us a 10-minute early warning before a Redis memory issue would have caused visible user impact.
The Results: Production Metrics
After three months of iterative development and load testing, my team and I shipped the gaming backend into production. The measurable outcomes met or exceeded our original targets:
| Metric | Target | Achieved |
|---|---|---|
| Concurrent WebSocket connections | 3,000 | 3,200+ sustained peak |
| Event delivery latency (p50) | <100ms | 32ms |
| Event delivery latency (p99) | <200ms | 87ms |
| System uptime (monthly) | 99.9% | 99.9% |
| Bet processing throughput | 500/min | 850+/min |
| Connection recovery after deploy | <5s | ~2.5s average |
The sub-50ms median latency was particularly significant. Users experienced game events as truly instantaneous — dice rolls resolved, payouts appeared, and leaderboards updated in what felt like real time. This directly correlated with user engagement metrics: average session duration increased by 35% compared to the earlier HTTP polling prototype.
Lessons Learned
Building this system taught me principles that I have carried into every project since:
Separate connection management from business logic. The WebSocket server should be a dumb pipe. All intelligence belongs in backend workers communicating through message queues.
Design for graceful degradation. When the game engine was temporarily slow, the WebSocket layer continued serving heartbeats and acknowledged messages. Users experienced a delay, not a crash.
Serialise once, send many. In broadcast-heavy systems, the cost of JSON serialisation dwarfs the cost of network transmission. Serialise the payload once and reuse the buffer.
Monitor latency percentiles, not averages. A 30ms average means nothing if 5% of your users experience 500ms delays. The p99 is what your angriest users feel.
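The point is easy to demonstrate with a synthetic latency sample, using the nearest-rank percentile method:

```javascript
// Why averages mislead: a sample where the mean looks healthy while the
// tail is terrible. Percentile computed with the nearest-rank method.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

const latencies = [
  ...Array(95).fill(20),  // 95% of requests at 20ms
  ...Array(5).fill(500),  // 5% of requests at 500ms
];

const mean = latencies.reduce((a, b) => a + b, 0) / latencies.length;
console.log(mean);                      // 44 — looks acceptable
console.log(percentile(latencies, 50)); // 20
console.log(percentile(latencies, 99)); // 500 — what angry users feel
```

A 44ms average and a 500ms p99 describe the same system; only one of those numbers tells you that one user in twenty is having a terrible time.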
Test with realistic load patterns. Our load tests simulated bursty evening traffic with sudden spikes, not steady-state throughput. Production traffic is never smooth.
Final Thoughts
Architecting a real-time gaming backend was the most technically challenging and rewarding project of my career up to that point. It demanded a deep understanding of network protocols, distributed systems, concurrent programming, and financial transaction safety — all operating under latency constraints that left no room for architectural shortcuts.
The patterns I developed at Lordwin Group — event-driven architecture, Redis-backed horizontal scaling, idempotent financial operations — became foundational to how I approach every real-time system I have built since. At VacancySoft, I applied similar event-driven patterns to handle 50,000+ daily API requests. At 2am Tech, the idempotent transaction processing pattern directly informed the financial workflows I built for the Addio platform.
Real-time systems are hard. But the engineering discipline they demand makes you a better systems engineer in every other context. If you are building similar infrastructure, I hope this deep dive gives you a head start on the decisions and trade-offs that matter most.
Olamilekan Lamidi is a Senior Full-Stack Engineer with 9+ years of experience building scalable, high-performance web applications. He specialises in designing robust APIs, optimising systems for performance at scale, and leading engineering teams to deliver reliable production systems.
Tags: #WebSockets #NodeJS #RealTime #GameDev #SystemDesign #BackendEngineering #Redis #JavaScript #WebDevelopment #SoftwareArchitecture