When I first started building the real-time chat features for my project, the initial architectural decisions seemed simple. It was going to be a "real-time" app, so "obviously" I needed WebSockets. My first instinct was to use a socket event for everything, from authentication to loading chat history.
Eventually, I read a few articles on the topic and shifted to the dual architecture that is in place now: REST plus WebSockets, with each protocol used for what it's good at.
The Common First Thought
It is incredibly tempting to establish a WebSocket connection and route the entire application state through it. It feels clean to have a single pipe to the server.
But I quickly ran into the architectural realities of stateful WebSocket connections. While they are good for maintaining persistent duplex connections, they provide a much lower-level transport abstraction than HTTP.
When fetching older chat history, I immediately missed standard HTTP status codes. If a request failed, I had to design my own error-handling payload structure, because no 404 or 401 was sent automatically. Once a WebSocket connection is established, there are no HTTP responses anymore; it's all just messages. So I lost out on standard HTTP semantics, which meant a lot more work for me.
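To make that concrete, here is a minimal sketch of the kind of envelope I mean. The field names (`requestId`, `status`, `error`) and both helper functions are assumptions for illustration. They are not a standard and not the exact shape used in my codebase.

```javascript
// Hypothetical envelope: every reply echoes the request id and carries an
// HTTP-like status, because the socket itself has no status codes.
function makeReply(requestId, status, payload) {
  const ok = status >= 200 && status < 300;
  return JSON.stringify({
    requestId,
    status,
    ...(ok ? { data: payload } : { error: payload }),
  });
}

// Client side: match incoming replies to pending requests by id.
function createPendingRequests() {
  const pending = new Map();
  let nextId = 1;
  return {
    // Register a request; returns its id plus a promise for the reply.
    track() {
      const requestId = nextId++;
      const promise = new Promise((resolve, reject) => {
        pending.set(requestId, { resolve, reject });
      });
      return { requestId, promise };
    },
    // Feed every raw socket message through here.
    handle(raw) {
      const reply = JSON.parse(raw);
      const entry = pending.get(reply.requestId);
      if (!entry) return false; // not a reply to anything we sent
      pending.delete(reply.requestId);
      if (reply.status >= 200 && reply.status < 300) {
        entry.resolve(reply.data);
      } else {
        entry.reject(Object.assign(new Error(reply.error), { status: reply.status }));
      }
      return true;
    },
  };
}
```

All of this bookkeeping (ids, status ranges, error objects) is what HTTP would otherwise provide out of the box.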
WebSockets are inherently stateful, which makes load balancing and scaling much more complex. Each connection has to stay attached to one server, so scaling horizontally requires connection-aware load balancing and some shared mechanism (such as Redis Pub/Sub) to broadcast events across instances. Compare that with spinning up stateless REST APIs behind a standard reverse proxy.
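A rough sketch of why that shared mechanism is needed. The `createBus` function below is an in-memory stand-in for something like Redis Pub/Sub, and `createInstance` is a hypothetical model of one server process with fake sockets. Real code would use an actual Redis client and real connections.

```javascript
// In-memory stand-in for a shared channel such as Redis Pub/Sub (assumption).
function createBus() {
  const subscribers = new Set();
  return {
    subscribe(fn) { subscribers.add(fn); },
    publish(event) { subscribers.forEach((fn) => fn(event)); },
  };
}

// Hypothetical model of one server instance. Each instance only knows the
// sockets connected to it, so it relays every event through the shared bus.
function createInstance(name, bus) {
  const sockets = new Map(); // userId -> object with a send() method
  bus.subscribe((event) => {
    for (const socket of sockets.values()) socket.send(JSON.stringify(event));
  });
  return {
    name,
    connect(userId, socket) { sockets.set(userId, socket); },
    // Called when a locally connected user produces an event.
    broadcast(event) { bus.publish(event); },
  };
}
```

If one user is connected to instance A and another to instance B, an event broadcast from A only reaches B's user because both instances subscribe to the same bus. Without it, each instance would only ever notify its own connections.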
Why Not Just HTTP Polling?
On the other end of the spectrum, I very briefly considered avoiding WebSockets entirely. Why not just use REST for everything and implement polling?
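// The classic polling trap
let lastTimestamp = 0; // timestamp of the newest message seen so far

function startPolling(roomId) {
  // Return the timer id so the caller can stop polling with clearInterval
  return setInterval(async () => {
    const response = await fetch(`/api/rooms/${roomId}/messages?since=${lastTimestamp}`);
    const newMessages = await response.json(); // usually an empty array
    // Handle new messages...
  }, 3000);
}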
The issue with polling became apparent the moment I thought about the user experience and server load. With short polling, I would be sending an HTTP request every few seconds, and roughly 90% of the time the server's response would be an empty array because no new messages had arrived. This results in a lot of HTTP overhead, unnecessary server strain, and an inherent latency of up to one polling interval, which I did not like seeing when I tested a dummy app.
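A quick back-of-the-envelope model of that cost. The function name and every input value here are illustrative assumptions, not measurements from my app.

```javascript
// Rough cost model for short polling (all inputs are illustrative assumptions).
function pollingCost({ clients, intervalMs, emptyRatio }) {
  const requestsPerClientPerHour = 3600000 / intervalMs;
  const requestsPerHour = clients * requestsPerClientPerHour;
  return {
    requestsPerHour,
    wastedRequestsPerHour: requestsPerHour * emptyRatio,
    // A message arrives at a random point in the interval, so on average it
    // waits half an interval, and in the worst case a full one.
    averageDelayMs: intervalMs / 2,
    worstCaseDelayMs: intervalMs,
  };
}

// Example: 1000 clients polling every 3 seconds, 90% of responses empty.
const exampleCost = pollingCost({ clients: 1000, intervalMs: 3000, emptyRatio: 0.9 });
```

With those assumed numbers, that is 1.2 million requests per hour, most of which return nothing, plus an average delay of 1.5 seconds before a new message shows up.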
Even long-polling (where the server holds the request open until a message arrives) felt like a hacky workaround to simulate the persistent connection that WebSockets natively provide.
So I knew that polling, long or short, was not an option for my case.
The Ideal Architecture
                React Client
                      │
           ┌──────────┴──────────┐
           │                     │
           ▼                     ▼
REST (Request/Response) WebSocket (Events)
           │                     │
           │                     │
      Load state           Live updates
    (Login, Users,          (Messages,
    History, etc.)       Typing, Presence)
           └──────────┬──────────┘
                      ▼
             Backend + Database
I eventually landed on a dual-protocol approach. The rule was easy to understand: anything that comes out of the database (essentially older data) arrives over HTTP, and anything real-time arrives over the WebSocket.
1. REST for Initial State
I use standard REST endpoints for operations that follow a Request-Response model.
- Fetching a user's profile.
- Fetching the list of users.
- Loading the initial chat history for an open chat.
- Authenticating the user.
By doing this I could keep my backend mostly stateless for the heavy queries.
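One thing REST gives me for free on these endpoints is the status code. Here is a minimal sketch of a helper that surfaces it. `fetchJson` and its `fetchImpl` parameter are hypothetical names used for illustration, not part of my codebase.

```javascript
// Hypothetical helper: wraps fetch and turns non-2xx responses into errors
// that carry the HTTP status, so callers can branch on 401 vs 404.
// fetchImpl defaults to the global fetch and can be swapped out in tests.
async function fetchJson(url, options = {}, fetchImpl = fetch) {
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    const error = new Error(`Request failed with status ${response.status}`);
    error.status = response.status;
    throw error;
  }
  return response.json();
}

// Usage (inside an async function):
//   try {
//     const history = await fetchJson(`/api/rooms/${roomId}/history`);
//   } catch (error) {
//     if (error.status === 401) { /* redirect to login */ }
//   }
```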
2. WebSockets for the Real-Time Pipeline
Once the static state is loaded via REST, I open a WebSocket connection strictly as an event pipeline. The socket's only job is to push real-time events from the server to the client.
- new message
- typing event
- isOnline event
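Since the socket only carries a handful of event types, the client side can be a small lookup table. This is a sketch under assumptions: `createEventRouter` is a hypothetical helper, and `USER_ONLINE` is a made-up name for the isOnline event.

```javascript
// Builds an onmessage-style handler from a map of event type -> handler.
// Unknown types are ignored and malformed JSON is dropped instead of throwing.
function createEventRouter(handlers) {
  return function route(raw) {
    let event;
    try {
      event = JSON.parse(raw);
    } catch {
      return false; // not valid JSON
    }
    if (event === null || typeof event !== 'object') return false;
    if (!Object.prototype.hasOwnProperty.call(handlers, event.type)) return false;
    handlers[event.type](event);
    return true;
  };
}

// Usage:
//   const route = createEventRouter({
//     NEW_MESSAGE: (e) => setMessages((prev) => [...prev, e.message]),
//     USER_TYPING: (e) => showTypingIndicator(e.userId),
//     USER_ONLINE: (e) => markOnline(e.userId), // assumed event and helper
//   });
//   ws.onmessage = (event) => route(event.data);
```

Adding a new event type then means adding one entry to the map rather than another `else if` branch.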
Here is a simplified snippet (not from the codebase, because there is a lot of other stuff going on there) of how I orchestrate this dual approach when a user enters a chat room:
async function enterChatRoom(roomId) {
  // 1. Fetch the static history via REST
  try {
    const response = await fetch(`/api/rooms/${roomId}/history`);
    if (!response.ok) throw new Error('Failed to load history');
    const pastMessages = await response.json();
    setMessages(pastMessages);
  } catch (error) {
    handleRestError(error);
    return;
  }

  // 2. Open the WebSocket strictly for real-time events
  // We get low-latency, bi-directional communication without HTTP overhead
  const ws = new WebSocket(`wss://api.myproject.com/rooms/${roomId}/live`);

  ws.onmessage = (event) => {
    const data = JSON.parse(event.data);
    if (data.type === 'NEW_MESSAGE') {
      setMessages(prev => [...prev, data.message]);
    } else if (data.type === 'USER_TYPING') {
      showTypingIndicator(data.userId);
    }
  };

  // Resolves with a cleanup function the caller can use to close the socket
  // (undefined if loading the history failed)
  return () => ws.close();
}
Conclusion
Building this project taught me that just because a technology can do something doesn't mean it should. WebSockets are an incredible tool for real-time, low-latency communication, but they are a poor substitute for the robust, heavily standardized world of RESTful HTTP.
By utilizing a dual-protocol architecture, I avoided the heavy overhead of HTTP polling while simultaneously saving myself from the nightmare of reinventing routing, caching, and error handling over a raw socket connection.
I'll be back with more next week. Till then, stay consistent!