TL;DR: I helped build Mstock, a stock trading platform handling 100K+ concurrent users during market hours. This post walks through the full system design — from WebSocket connections for live prices, to order matching, to the caching and database layers — with architecture diagrams, sequence flows, and the hard lessons we learned in production.
The Problem
Designing a stock trading platform isn't like building a typical web app. The constraints are brutal:
- Latency: Order placement must complete in under 100ms. Users lose real money on delays.
- Concurrency: 100K+ users online simultaneously during the 6.25-hour trading window (9:15 AM - 3:30 PM IST).
- Data velocity: Stock prices update roughly every second across 5,000+ symbols. That's about 5,000 messages/second to ingest, before fanning each one out to its subscribers.
- Accuracy: A rounding error in price or quantity isn't a bug — it's a financial incident.
- Availability: Downtime during market hours = regulatory scrutiny + angry traders + lost revenue.
I worked on Mstock as part of the frontend architecture team, but I had deep exposure to the full stack. Here's how a system like this is designed from the ground up.
High-Level Architecture
STOCK TRADING PLATFORM — HIGH LEVEL ARCHITECTURE
═══════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────┐
│ CLIENT LAYER │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │
│ │ Web │ │ Mobile │ │ Mobile │ │ Desktop │ │
│ │ (React) │ │ (iOS) │ │(Android) │ │ Terminal │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └─────┬──────┘ │
│ │ │ │ │ │
│ └──────────────┴──────┬───────┴──────────────┘ │
│ │ │
└─────────────────────────────┼───────────────────────────────┘
│ HTTPS + WebSocket
▼
┌─────────────────────────────────────────────────────────────┐
│ GATEWAY LAYER │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ API Gateway / Load Balancer │ │
│ │ (Nginx / AWS ALB / Kong / Traefik) │ │
│ │ │ │
│ │ - SSL termination │ │
│ │ - Rate limiting (1000 req/min per user) │ │
│ │ - WebSocket upgrade handling │ │
│ │ - IP whitelisting for exchange connections │ │
│ └──────────────────────┬───────────────────────────────┘ │
│ │ │
└─────────────────────────┼───────────────────────────────────┘
│
┌─────────────┼─────────────┐
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────┐
│ APPLICATION LAYER │
│ │
│ ┌────────────┐ ┌────────────┐ ┌────────────────────┐ │
│ │ Order │ │ Market │ │ Portfolio │ │
│ │ Service │ │ Data │ │ Service │ │
│ │ │ │ Service │ │ │ │
│ │ - Place │ │ - Live │ │ - Holdings │ │
│ │ - Cancel │ │ prices │ │ - P&L calculation │ │
│ │ - Modify │ │ - OHLCV │ │ - Margin check │ │
│ │ - History │ │ - Depth │ │ - Fund transfer │ │
│ └──────┬─────┘ └──────┬─────┘ └──────┬─────────────┘ │
│ │ │ │ │
│ ┌────────────┐ ┌────────────┐ ┌────────────────────┐ │
│ │ Auth │ │ WebSocket │ │ Notification │ │
│ │ Service │ │ Server │ │ Service │ │
│ │ │ │ │ │ │ │
│ │ - JWT │ │ - Price │ │ - Order status │ │
│ │ - 2FA │ │ feed │ │ - Price alerts │ │
│ │ - Session │ │ - Order │ │ - Margin calls │ │
│ │ │ │ updates │ │ - Push + Email │ │
│ └────────────┘ └────────────┘ └────────────────────┘ │
│ │
└──────────────────────────┬──────────────────────────────────┘
│
┌──────────────┼──────────────┐
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────┐
│ DATA LAYER │
│ │
│ ┌────────────┐ ┌────────────┐ ┌────────────────────┐ │
│ │ Redis │ │ PostgreSQL │ │ TimescaleDB │ │
│ │ Cluster │ │ (Primary) │ │ (Time-series) │ │
│ │ │ │ │ │ │ │
│ │ - Sessions │ │ - Users │ │ - OHLCV candles │ │
│ │ - Prices │ │ - Orders │ │ - Tick data │ │
│ │ - Pub/Sub │ │ - Trades │ │ - Historical │ │
│ │ - Queues │ │ - Holdings │ │ charts │ │
│ └────────────┘ └────────────┘ └────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ EXCHANGE LAYER │
│ │
│ ┌────────────┐ ┌────────────┐ ┌────────────────────┐ │
│ │ NSE │ │ BSE │ │ MCX │ │
│ │ Exchange │ │ Exchange │ │ (Commodities) │ │
│ └────────────┘ └────────────┘ └────────────────────┘ │
│ │
│ Connected via: FIX Protocol / Exchange API │
│ Latency requirement: < 5ms to exchange gateway │
│ │
└─────────────────────────────────────────────────────────────┘
Core Component Deep-Dives
1. WebSocket Server — The Real-Time Backbone
The WebSocket server is the heart of the real-time experience. Every user has a persistent connection that receives live price updates, order status changes, and alerts.
WEBSOCKET CONNECTION LIFECYCLE
══════════════════════════════
Client Server
│ │
│──── HTTP Upgrade Request ─────▶│
│ GET /ws │
│ Upgrade: websocket │
│ Connection: Upgrade │
│ Sec-WebSocket-Key: ... │
│ │
│◀── 101 Switching Protocols ───│
│ Upgrade: websocket │
│ Sec-WebSocket-Accept: ... │
│ │
│◀════ WebSocket Connected ═════▶│
│ │
│──── Subscribe: RELIANCE ──────▶│
│──── Subscribe: INFY ──────────▶│
│──── Subscribe: TCS ───────────▶│
│ │
│◀─── Price: RELIANCE 2450.50 ──│ (every ~1s)
│◀─── Price: INFY 1623.75 ──────│
│◀─── Price: TCS 3890.20 ───────│
│◀─── Depth: RELIANCE {...} ────│
│ │
│──── Place Order ──────────────▶│
│◀─── Order Confirmed ──────────│
│◀─── Order Executed ───────────│
│ │
│──── Heartbeat (ping) ─────────▶│ (every 30s)
│◀─── Heartbeat (pong) ─────────│
│ │
flowchart TD
A[Client Connects via WS] --> B[Authenticate JWT]
B -->|Valid| C[Add to Connection Pool]
B -->|Invalid| D[Reject + Close]
C --> E[Client Subscribes to Symbols]
E --> F[Add to Symbol Channels]
F --> G[Receive Price Updates via Redis Pub/Sub]
G --> H[Broadcast to Subscribed Clients]
H --> G
style D fill:#ef4444,color:#fff
style H fill:#22c55e,color:#fff
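The lifecycle diagram shows a 30-second heartbeat but not what the server does when pings stop arriving. Here's a minimal sketch, assuming the server records the time of each client's last message. `markAlive`, `sweepStale`, and the two-interval timeout are illustrative names and choices, not the production implementation. `terminate()` is the `ws` library's method for dropping a socket without a close handshake.

```javascript
const HEARTBEAT_INTERVAL_MS = 30_000;                   // the 30s ping from the diagram
const HEARTBEAT_TIMEOUT_MS = HEARTBEAT_INTERVAL_MS * 2; // assumption: tolerate one missed ping

// Record that a connection just sent something (a ping or any other message)
function markAlive(lastSeen, ws, now = Date.now()) {
  lastSeen.set(ws, now);
}

// Terminate every connection that has been silent longer than the timeout
function sweepStale(lastSeen, now = Date.now()) {
  const dropped = [];
  for (const [ws, seenAt] of lastSeen) {
    if (now - seenAt > HEARTBEAT_TIMEOUT_MS) {
      ws.terminate();
      lastSeen.delete(ws);
      dropped.push(ws);
    }
  }
  return dropped;
}
```

In a server you would call `markAlive` from the message handler and run `sweepStale` on a `setInterval`. Terminating the socket should then trigger the normal close handling, so subscriptions get cleaned up the same way as for a voluntary disconnect.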
WebSocket Server Implementation
// ws-server.js — WebSocket server with Redis Pub/Sub for price distribution
const WebSocket = require('ws');
const Redis = require('ioredis');
const jwt = require('jsonwebtoken');
const wss = new WebSocket.Server({ noServer: true });
const redisSub = new Redis(); // Subscriber connection
const redisPub = new Redis(); // Publisher connection
// Track subscriptions: symbol → Set of WebSocket clients
const subscriptions = new Map();
// Track clients: ws → { userId, subscribedSymbols }
const clients = new Map();
// Handle new WebSocket connections
wss.on('connection', (ws, req) => {
const userId = req.userId; // Set during upgrade authentication
clients.set(ws, { userId, subscribedSymbols: new Set() });
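ws.on('message', (raw) => {
let msg;
try {
msg = JSON.parse(raw);
} catch (err) {
// Ignore malformed frames rather than letting the exception crash the process
return;
}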
switch (msg.type) {
case 'subscribe':
handleSubscribe(ws, msg.symbols);
break;
case 'unsubscribe':
handleUnsubscribe(ws, msg.symbols);
break;
case 'order':
handleOrder(ws, msg.data);
break;
}
});
ws.on('close', () => {
// Clean up subscriptions
const client = clients.get(ws);
if (client) {
for (const symbol of client.subscribedSymbols) {
const subs = subscriptions.get(symbol);
if (subs) {
subs.delete(ws);
if (subs.size === 0) {
subscriptions.delete(symbol);
redisSub.unsubscribe(`price:${symbol}`);
}
}
}
clients.delete(ws);
}
});
});
function handleSubscribe(ws, symbols) {
const client = clients.get(ws);
for (const symbol of symbols) {
// Add to client's subscription set
client.subscribedSymbols.add(symbol);
// Add to symbol's subscriber set
if (!subscriptions.has(symbol)) {
subscriptions.set(symbol, new Set());
// First subscriber for this symbol — subscribe to Redis channel
redisSub.subscribe(`price:${symbol}`);
}
subscriptions.get(symbol).add(ws);
}
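}
// Mirror of handleSubscribe; the message switch above dispatches to it
function handleUnsubscribe(ws, symbols) {
const client = clients.get(ws);
for (const symbol of symbols) {
client.subscribedSymbols.delete(symbol);
const subs = subscriptions.get(symbol);
if (!subs) continue;
subs.delete(ws);
if (subs.size === 0) {
// Last subscriber gone: stop listening on the Redis channel
subscriptions.delete(symbol);
redisSub.unsubscribe(`price:${symbol}`);
}
}
}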
// When Redis publishes a price update, broadcast to all subscribed clients
redisSub.on('message', (channel, message) => {
const symbol = channel.replace('price:', '');
const subscribers = subscriptions.get(symbol);
if (subscribers) {
const payload = JSON.stringify({
type: 'price',
symbol,
data: JSON.parse(message),
timestamp: Date.now(),
});
for (const ws of subscribers) {
if (ws.readyState === WebSocket.OPEN) {
ws.send(payload);
}
}
}
});
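The server above is created with `noServer: true` and reads `req.userId`, but the upgrade step that sets it isn't shown. Here's one way it could look, as a sketch under a few assumptions: the token arrives as a `token` query parameter (browsers can't set custom headers on a WebSocket), and `verifyToken` is an injected function that would wrap `jwt.verify(token, secret)` in the real server. `authenticateUpgrade` and `attachUpgradeHandler` are hypothetical names.

```javascript
// Pull the token out of the upgrade request and verify it.
// Returns the userId on success, or null if the token is missing or invalid.
function authenticateUpgrade(req, verifyToken) {
  const url = new URL(req.url, 'http://localhost'); // base is only needed for parsing
  const token = url.searchParams.get('token');
  if (!token) return null;
  try {
    const claims = verifyToken(token);
    return claims && claims.userId ? claims.userId : null;
  } catch (err) {
    return null; // expired or tampered token
  }
}

// server is an http.Server; wss is the WebSocket.Server created with noServer: true
function attachUpgradeHandler(server, wss, verifyToken) {
  server.on('upgrade', (req, socket, head) => {
    const userId = authenticateUpgrade(req, verifyToken);
    if (!userId) {
      socket.write('HTTP/1.1 401 Unauthorized\r\n\r\n');
      socket.destroy();
      return;
    }
    req.userId = userId; // read later in the 'connection' handler
    wss.handleUpgrade(req, socket, head, (ws) => {
      wss.emit('connection', ws, req);
    });
  });
}
```

Rejecting before `handleUpgrade` means an unauthenticated client never gets a WebSocket at all, which matches the "Reject + Close" branch in the flowchart.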
Scaling WebSockets to 100K+ Connections
A single Node.js process can handle ~10K-50K WebSocket connections (depending on message frequency). For 100K+, you need horizontal scaling:
WEBSOCKET SCALING ARCHITECTURE
══════════════════════════════
┌───────────────────┐
│ Load Balancer │
│ (Sticky Sessions │
│ by IP hash) │
└────────┬──────────┘
│
┌────────────────┼────────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ WS Server 1 │ │ WS Server 2 │ │ WS Server 3 │
│ ~35K conns │ │ ~35K conns │ │ ~35K conns │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
└────────────────┼────────────────┘
│
┌────────▼────────┐
│ Redis Pub/Sub │
│ │
│ All 3 servers │
│ subscribe to │
│ same channels │
└─────────────────┘
Key insight: Redis Pub/Sub acts as the broadcast bus.
When a price update comes in, ALL servers receive it
and broadcast to their own connected clients.
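Why sticky sessions? A WebSocket is a single long-lived TCP connection, so once the upgrade succeeds every frame stays on the server that accepted it. Stickiness matters for the HTTP handshake and any fallback transport, and IP-hash load balancing provides it without shared session state. It does not preserve subscriptions across a reconnect: the server drops them when the socket closes, so the client has to re-subscribe, which the reconnection code later in this post does. One caveat: IP hashing can skew load when many users sit behind the same NAT'd address.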
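The "depending on message frequency" caveat on per-process capacity is easier to reason about with numbers. A back-of-the-envelope sketch: the 20-symbol average watchlist is an assumption for illustration, not a measured figure, and `estimateFanout` is a hypothetical helper.

```javascript
// Rough outbound message rate for the price feed.
// symbolsPerUser is an assumed average watchlist size.
function estimateFanout({ users, symbolsPerUser, updatesPerSecond, servers }) {
  const totalPerSecond = users * symbolsPerUser * updatesPerSecond;
  return {
    totalPerSecond,
    perServerPerSecond: Math.ceil(totalPerSecond / servers),
  };
}

const estimate = estimateFanout({
  users: 105_000,       // 3 servers x ~35K connections
  symbolsPerUser: 20,
  updatesPerSecond: 1,
  servers: 3,
});
console.log(estimate); // { totalPerSecond: 2100000, perServerPerSecond: 700000 }
```

Under those assumptions, 5,000 inbound messages per second turn into roughly 2.1 million outbound sends per second, which is why the connection count alone doesn't tell you how many servers you need.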
2. Order Flow — The Critical Path
The order flow is the most latency-sensitive part of the system. From the moment a user clicks "Buy" to the order reaching the exchange, every millisecond counts.
ORDER FLOW — END TO END
═══════════════════════
┌────────┐ ┌──────────┐ ┌──────────────┐ ┌──────────┐ ┌──────────┐
│ Client │────▶│ API │────▶│ Order │────▶│ Exchange │────▶│ Order │
│ (UI) │ │ Gateway │ │ Service │ │ Gateway │ │ Matching │
└────────┘ └──────────┘ └──────────────┘ └──────────┘ └──────────┘
│
│ Validates:
├─ Sufficient margin?
├─ Valid symbol?
├─ Market hours?
├─ Price within circuit limits?
├─ Quantity within limits?
└─ User not restricted?
│
▼
┌──────────────┐
│ Risk Check │
│ Engine │
└──────────────┘
Timeline:
─────────────────────────────────────────────────────▶ time
│ User clicks │ Validation │ Risk Check │ To Exchange │
│ "Buy" │ ~5ms │ ~3ms │ ~10ms │
│ │ │ │ │
│ └────────────┴────────────┴─────────────│
│ Total: ~20-50ms │
sequenceDiagram
participant Client
participant API as API Gateway
participant Order as Order Service
participant Risk as Risk Engine
participant Exchange
participant WS as WebSocket Server
participant DB as PostgreSQL
Client->>API: Place Order (BUY RELIANCE x100 @ 2450)
API->>Order: Validate & Forward
Order->>Order: Check market hours, symbol, quantity
Order->>Risk: Margin check
Risk-->>Order: Margin OK (available: ₹5,00,000)
Order->>DB: Save order (status: PENDING)
Order->>Exchange: Submit via FIX Protocol
Exchange-->>Order: Order ID: ORD123456
Order->>DB: Update status: SUBMITTED
Order->>WS: Notify client
WS-->>Client: Order Submitted ✓
Note over Exchange: Exchange matches order...
Exchange-->>Order: Execution Report (FILLED @ 2449.50)
Order->>DB: Update status: EXECUTED
Order->>DB: Update holdings + positions
Order->>WS: Notify client
WS-->>Client: Order Executed ✓ (100 x 2449.50)
Order Validation Code
// order-service.js — Order validation and submission
const MARKET_OPEN = { hour: 9, minute: 15 };
const MARKET_CLOSE = { hour: 15, minute: 30 };
async function placeOrder(userId, orderData) {
const { symbol, quantity, price, type, side } = orderData;
// Step 1: Basic validation
validateOrderParams(symbol, quantity, price, type, side);
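// Step 2: Check market hours (IST is UTC+5:30 with no daylight saving)
const now = new Date();
// Work in minutes since midnight so the +5:30 offset cannot overflow hours or minutes
const istMinutes = (now.getUTCHours() * 60 + now.getUTCMinutes() + 330) % 1440;
const openMinutes = MARKET_OPEN.hour * 60 + MARKET_OPEN.minute;
const closeMinutes = MARKET_CLOSE.hour * 60 + MARKET_CLOSE.minute;
// Weekends and exchange holidays need a separate calendar check (not shown)
if (istMinutes < openMinutes || istMinutes > closeMinutes) {
throw new Error('Market is closed. Trading hours: 9:15 AM - 3:30 PM IST');
}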
// Step 3: Margin check — can the user afford this order?
const requiredMargin = calculateMargin(symbol, quantity, price, side);
const availableMargin = await getAvailableMargin(userId);
if (requiredMargin > availableMargin) {
throw new Error(
`Insufficient margin. Required: ₹${requiredMargin}, Available: ₹${availableMargin}`
);
}
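// Step 4: Block margin (prevent double-spending). The check in step 3 is only a
// fast pre-check: two concurrent orders can both pass it, so blockMargin must
// re-verify the balance and deduct in a single atomic statement, and throw if short.
await blockMargin(userId, requiredMargin);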
// Step 5: Save order to database
const order = await db.query(
`INSERT INTO orders (user_id, symbol, quantity, price, type, side, status, created_at)
VALUES ($1, $2, $3, $4, $5, $6, 'PENDING', NOW())
RETURNING *`,
[userId, symbol, quantity, price, type, side]
);
// Step 6: Submit to exchange
let exchangeOrderId;
try {
exchangeOrderId = await exchangeGateway.submit({
orderId: order.rows[0].id,
symbol,
quantity,
price,
type, // LIMIT, MARKET, SL, SL-M
side, // BUY, SELL
});
} catch (err) {
// Submission failed: release blocked margin and mark the order rejected.
// A timeout is ambiguous (the order may have reached the exchange) and
// should be reconciled rather than treated as a clean rejection.
await releaseMargin(userId, requiredMargin);
await db.query(
'UPDATE orders SET status = $1 WHERE id = $2',
['REJECTED', order.rows[0].id]
);
throw err;
}
// The order is now live at the exchange. A failure past this point must not
// release margin or mark the order REJECTED, so it stays outside the try/catch.
await db.query(
'UPDATE orders SET exchange_order_id = $1, status = $2 WHERE id = $3',
[exchangeOrderId, 'SUBMITTED', order.rows[0].id]
);
// Notify client via WebSocket
notifyClient(userId, {
type: 'order_update',
orderId: order.rows[0].id,
status: 'SUBMITTED',
exchangeOrderId,
});
return { orderId: order.rows[0].id, status: 'SUBMITTED' };
}
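Checking margin and then blocking it in two separate steps leaves a window where two concurrent orders can both pass the check. One common way to close that window is a single conditional `UPDATE`. This is a sketch, not necessarily how the production helper worked: `blockMarginAtomic` is a hypothetical name, `db` is assumed to be a pg-style client whose `query` resolves to `{ rowCount, rows }`, and the column names follow the `accounts` table in the schema section.

```javascript
// Deduct and block margin in one statement. The WHERE clause is the margin
// check, so the database serializes concurrent orders for the same user.
async function blockMarginAtomic(db, userId, amount) {
  const result = await db.query(
    `UPDATE accounts
        SET available_margin = available_margin - $1,
            blocked_margin   = blocked_margin + $1
      WHERE user_id = $2
        AND available_margin >= $1
      RETURNING available_margin`,
    [amount, userId]
  );
  // rowCount is 0 when the user did not have enough available margin
  return result.rowCount === 1;
}
```

The caller treats a `false` return as "insufficient margin" and rejects the order before anything is sent to the exchange.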
3. Market Data Pipeline — 5,000 Price Updates Per Second
Stock exchanges push price data via their feed APIs. We need to process, store, and broadcast this data to 100K+ users in real-time.
MARKET DATA PIPELINE
════════════════════
┌──────────┐ ┌──────────────┐ ┌──────────┐ ┌──────────────┐
│ NSE │────▶│ Exchange │────▶│ Price │────▶│ Redis │
│ Exchange │ │ Feed Parser │ │ Processor│ │ Pub/Sub │
│ Feed │ │ │ │ │ │ │
│ │ │ - Binary │ │ - LTP │ │ price:INFY │
│ 5000 │ │ protocol │ │ - OHLCV │ │ price:TCS │
│ symbols │ │ - Decode │ │ - Change │ │ price:... │
│ ~1msg/s │ │ - Validate │ │ - Volume │ │ │
│ each │ │ │ │ - Depth │ │ ~5000 msg/s │
└──────────┘ └──────────────┘ └──────────┘ └──────┬───────┘
│
┌───────────┼───────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ WS │ │ WS │ │ WS │
│ Server 1 │ │ Server 2 │ │ Server 3 │
│ 35K conn │ │ 35K conn │ │ 35K conn │
└──────────┘ └──────────┘ └──────────┘
│ │ │
▼ ▼ ▼
Clients Clients Clients
// price-processor.js — Process exchange feed and publish to Redis
const Redis = require('ioredis');
const redis = new Redis();
class PriceProcessor {
constructor() {
this.lastPrices = new Map();
}
async processTickData(tick) {
const {
symbol,
ltp, // Last Traded Price
open,
high,
low,
close,
volume,
timestamp,
} = tick;
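// Calculate change and % change against the previous close. This assumes the
// feed's `close` field carries the previous session's close during market hours.
const prevClose = this.lastPrices.get(symbol)?.close || close;
// Round to paise so float noise (e.g. 1.1999999999998) never reaches clients
const change = Number((ltp - prevClose).toFixed(2));
// Guard against a zero previous close, which would produce "NaN" or "Infinity"
const changePercent = prevClose ? ((change / prevClose) * 100).toFixed(2) : '0.00';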
const priceData = {
symbol,
ltp,
open,
high,
low,
close: prevClose,
volume,
change,
changePercent: parseFloat(changePercent),
timestamp,
};
// Update local cache
this.lastPrices.set(symbol, priceData);
// Publish to Redis — all WS servers will receive this
await redis.publish(
`price:${symbol}`,
JSON.stringify(priceData)
);
// Store latest price in Redis hash (for API queries)
await redis.hset(`stock:${symbol}`, {
ltp: ltp.toString(),
open: open.toString(),
high: high.toString(),
low: low.toString(),
volume: volume.toString(),
change: change.toString(),
changePercent: changePercent,
updatedAt: timestamp.toString(),
});
}
}
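Since a rounding error counts as a financial incident here, it's worth showing one way to keep price arithmetic exact: do it in integer paise and format only at the edges. This is an illustrative sketch, not the processor's actual code. `toPaise` and `formatPaise` are hypothetical helpers, and the parser assumes prices have at most two decimal places.

```javascript
// Parse a decimal price into integer paise without doing float arithmetic.
// Throws on anything that is not a non-negative number with up to 2 decimals.
function toPaise(price) {
  const match = /^(\d+)(?:\.(\d{1,2}))?$/.exec(String(price).trim());
  if (!match) throw new Error(`Invalid price: ${price}`);
  const rupees = Number(match[1]);
  const paise = Number((match[2] || '').padEnd(2, '0'));
  return rupees * 100 + paise;
}

// Format integer paise back into a fixed two-decimal string
function formatPaise(paise) {
  const sign = paise < 0 ? '-' : '';
  const abs = Math.abs(paise);
  return `${sign}${Math.floor(abs / 100)}.${String(abs % 100).padStart(2, '0')}`;
}

const change = toPaise('2450.50') - toPaise('2449.30');
console.log(formatPaise(change)); // 1.20
console.log(2450.5 - 2449.3);     // the float version carries binary rounding noise
```

Integer subtraction can't drift, so the value a client sees is the value that was computed.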
4. Database Design
DATABASE SCHEMA (Simplified)
════════════════════════════
┌──────────────────┐ ┌──────────────────┐
│ users │ │ accounts │
├──────────────────┤ ├──────────────────┤
│ id PK │──┐ │ id PK │
│ email │ │ │ user_id FK │──┐
│ phone │ │ │ balance │ │
│ pan_number │ │ │ blocked_margin │ │
│ kyc_status │ │ │ available_margin │ │
│ created_at │ └───▶│ updated_at │ │
└──────────────────┘ └──────────────────┘ │
│
┌──────────────────┐ ┌──────────────────┐ │
│ orders │ │ holdings │ │
├──────────────────┤ ├──────────────────┤ │
│ id PK │ │ id PK │ │
│ user_id FK │──┐ │ user_id FK │◀─┘
│ symbol │ │ │ symbol │
│ quantity │ │ │ quantity │
│ price │ │ │ avg_price │
│ type (LIMIT/..) │ │ │ current_value │
│ side (BUY/SELL) │ │ │ pnl │
│ status │ │ │ updated_at │
│ exchange_order_id│ │ └──────────────────┘
│ executed_price │ │
│ executed_at │ │ ┌──────────────────┐
│ created_at │ │ │ trades │
└──────────────────┘ │ ├──────────────────┤
│ │ id PK │
└───▶│ order_id FK │
│ symbol │
│ quantity │
│ price │
│ side │
│ exchange_trade_id│
│ executed_at │
└──────────────────┘
┌──────────────────────────────────────────────┐
│ ohlcv_candles (TimescaleDB) │
├──────────────────────────────────────────────┤
│ time TIMESTAMPTZ (hypertable key) │
│ symbol TEXT │
│ open DECIMAL(12,2) │
│ high DECIMAL(12,2) │
│ low DECIMAL(12,2) │
│ close DECIMAL(12,2) │
│ volume BIGINT │
│ interval TEXT ('1m','5m','15m','1h','1d')│
└──────────────────────────────────────────────┘
erDiagram
USERS ||--o{ ACCOUNTS : has
USERS ||--o{ ORDERS : places
USERS ||--o{ HOLDINGS : owns
ORDERS ||--o{ TRADES : generates
USERS {
int id PK
string email
string phone
string pan_number
string kyc_status
}
ORDERS {
int id PK
int user_id FK
string symbol
int quantity
decimal price
string type
string side
string status
}
HOLDINGS {
int id PK
int user_id FK
string symbol
int quantity
decimal avg_price
}
TRADES {
int id PK
int order_id FK
string symbol
int quantity
decimal price
timestamp executed_at
}
Why Two Databases?
PostgreSQL vs TimescaleDB — DECISION MATRIX
═══════════════════════════════════════════
┌──────────────────┬──────────────────┬──────────────────────┐
│ │ PostgreSQL │ TimescaleDB │
├──────────────────┼──────────────────┼──────────────────────┤
│ Used for │ Users, Orders, │ Price candles, │
│ │ Holdings, Trades │ tick data, charts │
├──────────────────┼──────────────────┼──────────────────────┤
│ Query pattern │ Random access │ Time-range queries │
│ │ by user/order ID │ "RELIANCE, last 30d" │
├──────────────────┼──────────────────┼──────────────────────┤
│ Write pattern │ Low-medium │ Very high │
│ │ (orders/trades) │ (5000 writes/sec) │
├──────────────────┼──────────────────┼──────────────────────┤
│ Retention │ Forever │ 2 years, then │
│ │ │ aggregate + archive │
├──────────────────┼──────────────────┼──────────────────────┤
│ Why │ ACID for money │ Optimized for │
│ │ transactions │ time-series inserts │
│ │ │ + range scans │
└──────────────────┴──────────────────┴──────────────────────┘
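To make the time-series side concrete, here's a sketch of how raw ticks could be folded into the `ohlcv_candles` rows before they are written. It assumes ticks arrive in time order, `timestamp` is epoch milliseconds, and `volume` is the quantity traded in that tick. `aggregateCandles` is a hypothetical helper. TimescaleDB can also do this rollup inside the database with continuous aggregates.

```javascript
// Fold ticks ({ symbol, ltp, volume, timestamp }) into OHLCV buckets.
function aggregateCandles(ticks, intervalMs = 60_000) {
  const candles = new Map();
  for (const { symbol, ltp, volume, timestamp } of ticks) {
    // Align each tick to the start of its interval, e.g. the start of the minute
    const bucketStart = Math.floor(timestamp / intervalMs) * intervalMs;
    const key = `${symbol}:${bucketStart}`;
    const candle = candles.get(key);
    if (!candle) {
      candles.set(key, {
        time: bucketStart, symbol,
        open: ltp, high: ltp, low: ltp, close: ltp, volume,
      });
    } else {
      candle.high = Math.max(candle.high, ltp);
      candle.low = Math.min(candle.low, ltp);
      candle.close = ltp; // relies on ticks arriving in time order
      candle.volume += volume;
    }
  }
  return [...candles.values()];
}
```

Each returned object maps onto one row of the hypertable, with `interval` set by whichever `intervalMs` you aggregated at.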
5. Caching Strategy
I covered caching in detail in my Redis caching post, but here's how it fits into the trading platform specifically:
WHAT WE CACHE (AND WHAT WE DON'T)
══════════════════════════════════
CACHED (Redis):
┌──────────────────────────────────────────────────┐
│ stock:RELIANCE → { ltp, ohlcv, volume } │ TTL: 2s
│ depth:RELIANCE → { bids[], asks[] } │ TTL: 1s
│ watchlist:user123 → [RELIANCE, INFY, TCS] │ TTL: 5min
│ session:abc123 → { userId, permissions } │ TTL: 30min
│ portfolio:user123 → { holdings, pnl } │ TTL: 60s
│ config:margins → { symbol: margin% } │ TTL: 1hr
└──────────────────────────────────────────────────┘
NOT CACHED (Direct DB):
┌──────────────────────────────────────────────────┐
│ Order placement → Must be 100% accurate │
│ Account balance → Real-time for margin calc │
│ Trade execution → Exchange is source of truth │
│ Fund transfer → Financial transaction │
└──────────────────────────────────────────────────┘
Rule: If a stale value could cause financial loss,
don't cache it. Go to the database.
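For the entries in the "cached" box, the access pattern is a plain read-through cache with a TTL. A minimal sketch, assuming an ioredis-style client that supports `get(key)` and `set(key, value, 'EX', seconds)`; `readThrough` and the loader callback are hypothetical names.

```javascript
// Return the cached value if present; otherwise load it, cache it with a TTL,
// and return it. Values are stored as JSON strings.
async function readThrough(redis, key, ttlSeconds, loadFromDb) {
  const cached = await redis.get(key);
  if (cached !== null) return JSON.parse(cached);
  const fresh = await loadFromDb();
  await redis.set(key, JSON.stringify(fresh), 'EX', ttlSeconds);
  return fresh;
}
```

A portfolio read would go through something like `readThrough(redis, 'portfolio:user123', 60, loadPortfolio)`, while anything in the "not cached" box skips this helper and queries the primary database directly.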
6. Authentication & Security
AUTH FLOW — JWT + TOTP 2FA
══════════════════════════
┌────────┐ ┌──────────┐ ┌───────┐
│ Client │ │ Auth │ │ Redis │
└───┬────┘ │ Service │ └───┬───┘
│ └────┬─────┘ │
│ 1. POST /login │ │
│ { email, password } │ │
│─────────────────────────────▶│ │
│ │ 2. Verify password │
│ │ (bcrypt compare) │
│ │ │
│ 3. Prompt for TOTP code │ │
│◀─────────────────────────────│ │
│ │ │
│ 4. POST /verify-2fa │ │
│ { code: "482910" } │ │
│─────────────────────────────▶│ │
│ │ 5. Verify TOTP │
│ │ │
│ │ 6. Create session ──────▶│
│ │ SET session:uuid │
│ │ { userId, perms } │
│ │ TTL: 30 min │
│ 7. Return JWT │ │
│ { accessToken, │ │
│ refreshToken } │ │
│◀─────────────────────────────│ │
│ │ │
│ 8. Subsequent requests │ │
│ Authorization: Bearer jwt │ │
│─────────────────────────────▶│ │
│ │ 9. Verify JWT │
│ │ 10. Check session ──────▶│
│ │ in Redis │
│ │◀──────────────────────────│
│ 11. Response │ │
│◀─────────────────────────────│ │
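Step 5, "Verify TOTP", is the only step in the diagram that hides an algorithm. Here's a sketch of the standard one (RFC 6238, built on HOTP from RFC 4226) using only Node's `crypto` module. It assumes a per-user secret shared at enrollment, 6-digit codes, a 30-second step, and a tolerance of one step either side for clock drift. The function names are illustrative.

```javascript
const crypto = require('node:crypto');

// HOTP (RFC 4226): HMAC-SHA1 over an 8-byte counter, then dynamic truncation
function hotp(secret, counter, digits = 6) {
  const buf = Buffer.alloc(8);
  buf.writeBigUInt64BE(BigInt(counter));
  const hmac = crypto.createHmac('sha1', secret).update(buf).digest();
  const offset = hmac[hmac.length - 1] & 0x0f;
  const code = (hmac.readUInt32BE(offset) & 0x7fffffff) % 10 ** digits;
  return String(code).padStart(digits, '0');
}

// TOTP (RFC 6238): HOTP where the counter is the current 30-second time step
function verifyTotp(secret, submitted, nowMs = Date.now(), stepSeconds = 30, window = 1) {
  const counter = Math.floor(nowMs / 1000 / stepSeconds);
  const given = Buffer.from(String(submitted));
  for (let drift = -window; drift <= window; drift++) {
    if (counter + drift < 0) continue;
    const expected = Buffer.from(hotp(secret, counter + drift));
    // Constant-time comparison; lengths must match before timingSafeEqual
    if (expected.length === given.length && crypto.timingSafeEqual(expected, given)) {
      return true;
    }
  }
  return false;
}
```

A production version would also rate-limit attempts and reject a code that has already been used within its window.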
Handling Failure Scenarios
In production, things break. Here's how we designed for the common failure modes:
FAILURE SCENARIOS & MITIGATIONS
═══════════════════════════════
┌─────────────────────────────────────────────────────────────┐
│ SCENARIO │ IMPACT │ MITIGATION │
├───────────────────────┼───────────────┼─────────────────────┤
│ Redis goes down │ Slower reads │ Fall back to DB │
│ │ │ Serve stale prices │
├───────────────────────┼───────────────┼─────────────────────┤
│ WS server crashes │ Users │ Client auto- │
│ │ disconnected │ reconnect with │
│ │ │ exponential backoff │
├───────────────────────┼───────────────┼─────────────────────┤
│ Exchange feed drops │ Stale prices │ Show "last updated" │
│ │ │ timestamp, warn user│
├───────────────────────┼───────────────┼─────────────────────┤
│ Order rejected by │ Order fails │ Release margin, │
│ exchange │ │ notify user, retry │
│ │ │ if transient │
├───────────────────────┼───────────────┼─────────────────────┤
│ DB write fails during │ Inconsistent │ Use DB transaction │
│ order execution │ state │ + idempotency key │
├───────────────────────┼───────────────┼─────────────────────┤
│ Network partition │ Split brain │ Prefer consistency │
│ between services │ │ (reject orders │
│ │ │ rather than risk │
│ │ │ double execution) │
└───────────────────────┴───────────────┴─────────────────────┘
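The "idempotency key" mitigation in the table is what makes a retry safe: the same key must never cause a second submission. A minimal in-memory sketch of the idea follows. In a real system the map would be a database table with a unique constraint on the key, and `IdempotentExecutor` is a hypothetical name.

```javascript
// Runs an action at most once per key; repeat calls get the first result back.
class IdempotentExecutor {
  constructor() {
    this.results = new Map();
  }

  run(key, action) {
    if (!this.results.has(key)) {
      const pending = Promise.resolve().then(action);
      // Forget failed attempts so a deliberate retry can run again
      pending.catch(() => this.results.delete(key));
      this.results.set(key, pending);
    }
    return this.results.get(key);
  }
}
```

The client generates the key once per user action (for example, one UUID per click on "Buy") and sends it with every retry of that request. Note that forgetting failures is only safe for errors known to be clean rejections; an ambiguous timeout should keep its key until the order is reconciled.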
Client Reconnection Strategy
// client-reconnect.js — WebSocket reconnection with exponential backoff
class TradingWebSocket {
constructor(url) {
this.url = url;
this.retryCount = 0;
this.maxRetries = 10;
this.subscriptions = new Set();
this.connect();
}
connect() {
this.ws = new WebSocket(this.url);
this.ws.onopen = () => {
console.log('Connected to trading server');
this.retryCount = 0;
// Re-subscribe to all symbols after reconnection
if (this.subscriptions.size > 0) {
this.ws.send(JSON.stringify({
type: 'subscribe',
symbols: [...this.subscriptions],
}));
}
};
this.ws.onclose = () => {
if (this.retryCount < this.maxRetries) {
// Exponential backoff: 1s, 2s, 4s, 8s, 16s... capped at 30s
const delay = Math.min(1000 * Math.pow(2, this.retryCount), 30000);
console.log(`Reconnecting in ${delay}ms (attempt ${this.retryCount + 1})`);
this.retryCount++;
setTimeout(() => this.connect(), delay);
} else {
console.error('Max reconnection attempts reached');
}
};
this.ws.onmessage = (event) => {
const data = JSON.parse(event.data);
this.handleMessage(data);
};
}
subscribe(symbols) {
symbols.forEach((s) => this.subscriptions.add(s));
if (this.ws.readyState === WebSocket.OPEN) {
this.ws.send(JSON.stringify({ type: 'subscribe', symbols }));
}
}
handleMessage(data) {
switch (data.type) {
case 'price':
this.onPriceUpdate?.(data);
break;
case 'order_update':
this.onOrderUpdate?.(data);
break;
case 'alert':
this.onAlert?.(data);
break;
}
}
}
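One gap in plain exponential backoff: when a WebSocket server crashes, all of its ~35K clients see `onclose` at the same moment and retry on the same 1s, 2s, 4s schedule, hitting the surviving servers in synchronized waves. Adding random jitter spreads them out. Here's a sketch of the "full jitter" variant. `backoffWithJitter` is a hypothetical helper, and `random` is injectable only so the function can be tested deterministically.

```javascript
// Pick a random delay between 0 and the capped exponential ceiling.
function backoffWithJitter(retryCount, { baseMs = 1000, maxMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(baseMs * 2 ** retryCount, maxMs);
  return Math.floor(random() * ceiling);
}
```

In the class above, the `delay` computed in `onclose` could be replaced by `backoffWithJitter(this.retryCount)`.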
Scaling Considerations
SCALING FROM 10K TO 100K+ USERS
════════════════════════════════
10K Users (Single Instance)
┌──────────┐ ┌──────────┐ ┌──────────┐
│ 1 Node.js│ │ 1 Redis │ │ 1 PgSQL │
│ instance │ │ instance │ │ instance │
└──────────┘ └──────────┘ └──────────┘
50K Users (Horizontal + Read Replicas)
┌──────────┐ ┌──────────┐ ┌──────────┐
│ 3 Node.js│ │ Redis │ │ PgSQL │
│ instances│ │ Sentinel │ │ Primary │
│ + LB │ │ (HA) │ │ + 2 Read │
└──────────┘ └──────────┘ │ Replicas │
└──────────┘
100K+ Users (Full Cluster)
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ 5+ Node │ │ Redis │ │ PgSQL │ │Timescale │
│ instances│ │ Cluster │ │ Primary │ │ DB for │
│ + ALB │ │ (6 nodes │ │ + Read │ │ candles │
│ + Auto- │ │ 3M+3S) │ │ Replicas │ │ + ticks │
│ scale │ │ │ │ + PgPool │ │ │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
Key metrics to monitor:
├── WebSocket connections per server (target: <40K)
├── Redis memory usage and cache hit rate (hit rate target: >90%)
├── DB connection pool utilization (target: <80%)
├── Order processing latency P99 (target: <100ms)
└── Price broadcast latency (target: <50ms from exchange)
Common Mistakes in Trading System Design
1. Using REST for Real-Time Data
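Polling each symbol once a second means up to 5,000 HTTP requests/second per user if every symbol is fetched separately. Even batched into one request per second, 100K users generate 100K requests/second, most of which return nothing new. Use WebSockets.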
2. Not Separating Read and Write Databases
Order writes and portfolio reads have very different patterns. Use read replicas for dashboards, primary for order execution.
3. Caching Account Balances
A stale balance means a user might place an order they can't afford. Always read balance from the primary database.
4. No Circuit Breaker for Exchange Connection
If the exchange gateway is slow, your order queue backs up. Use a circuit breaker pattern to fail fast and inform users.
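A circuit breaker is small enough to sketch in full. This version counts consecutive failures, fails fast while open, and lets a trial call through after a cool-down. The thresholds are placeholder values, `now` is injectable for testing, and a production breaker would also limit how many trial calls run concurrently.

```javascript
class CircuitBreaker {
  constructor({ failureThreshold = 5, resetAfterMs = 10_000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null; // null means the circuit is closed
  }

  async call(action) {
    if (this.openedAt !== null && this.now() - this.openedAt < this.resetAfterMs) {
      // Open: fail fast without touching the exchange gateway
      throw new Error('Circuit open: exchange gateway unavailable');
    }
    // Closed, or cool-down elapsed (half-open): let the call through
    try {
      const result = await action();
      this.failures = 0;
      this.openedAt = null;
      return result;
    } catch (err) {
      this.failures++;
      if (this.failures >= this.failureThreshold) this.openedAt = this.now();
      throw err;
    }
  }
}
```

Wrapping the exchange submission as `breaker.call(() => exchangeGateway.submit(order))` turns a slow, backed-up queue into an immediate error the UI can show.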
5. Synchronous Order Processing
Order validation, margin check, and exchange submission can be pipelined. Validate first, then submit asynchronously and notify via WebSocket.
Key Takeaways
- WebSocket + Redis Pub/Sub is the backbone for real-time price distribution across multiple servers
- Separate your services: order execution, market data, portfolio, and auth have very different scaling and consistency requirements
- Cache aggressively but carefully: stock prices with a 2s TTL, yes; account balances, never
- Design for failure: exchanges go down, connections drop, databases lag — graceful degradation beats crashing
- Use the right database for the job: PostgreSQL for transactional data, TimescaleDB for time-series, Redis for hot data
- Horizontal scaling via sticky-session load balancing and Redis as the broadcast bus
- Latency budget: know where every millisecond goes in your order flow
Connect with Me
If you found this useful, I write deep-dive system design posts and visual explainers every week. Follow along:
- Twitter/X: @robinsingh — threads on system design, Node.js, and AI engineering
- LinkedIn: Robin Singh — longer-form posts and career insights
- GitHub: robins163 — code from all my posts
- Hashnode: unknowntoplay.hashnode.dev — full blog with diagrams and animations
Next post: "The Node.js Event Loop — Finally Explained Right" — where I break down the event loop with animated diagrams and show exactly why setTimeout(fn, 0) doesn't mean what you think it does.
Drop a comment if you have questions about any part of this architecture — happy to go deeper on any component.