Neural Download

How Chat Apps Send Messages Instantly (WebSockets Breakdown)

https://www.youtube.com/watch?v=cMk2wrUu48s

HTTP is a phone call where you hang up after every sentence. WebSockets keep the line open. That one change is why chat apps, multiplayer games, and live dashboards actually work — and it all comes down to a single HTTP request that transforms into something else entirely.

The Problem With HTTP

HTTP is request-response. The client asks, the server answers, and the exchange is over. Even when the underlying connection is kept alive, the server cannot speak until the client asks again. And again. And again.

This works fine for loading a web page. It's a disaster for anything that needs to react the instant something changes on the server — a new message, an enemy moving, a stock price updating.

The old workaround was polling: the client asks "anything new?" every few seconds. Most of those requests come back empty. Each one carries hundreds of bytes of HTTP headers just to hear "nope." Wasted bandwidth, wasted battery, wasted server capacity. And even when there is new data, you only find out on the next poll.
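To put a rough number on that waste, here is a back-of-the-envelope sketch. The figures (one poll every 3 seconds, 500 bytes of headers per request, one hour of idle time) are assumptions chosen for illustration, not measurements.

```python
def polling_overhead_bytes(interval_s: int, duration_s: int, header_bytes: int) -> int:
    """Header bytes sent by one client while polling, ignoring any payload."""
    polls = duration_s // interval_s
    return polls * header_bytes


# Assumed figures: poll every 3 s for one hour, 500 bytes of headers per poll.
wasted = polling_overhead_bytes(interval_s=3, duration_s=3600, header_bytes=500)
print(wasted)  # 600000
```

Under those assumptions a single idle client sends about 600 KB of headers per hour just to hear "nope", before counting the responses.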

The Upgrade Handshake

WebSockets solve this with a clever trick. They start as HTTP — then upgrade.

The browser sends a normal HTTP request, but with two special headers:

Upgrade: websocket
Connection: Upgrade

It's saying: "Hey, I speak HTTP, but can we switch to something better?"

If the server supports WebSockets, it replies with 101 Switching Protocols. That's the handshake. One HTTP request, one HTTP response, and from that point on the connection is no longer HTTP. The TCP socket that carried the handshake stays open — but now both sides can send messages whenever they want.
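As a rough sketch, the client side of that exchange could look like the following. Besides the two headers above, RFC 6455 also requires Sec-WebSocket-Key and Sec-WebSocket-Version: 13 in the request. The host example.com, the path /chat, and both function names are placeholders invented for this example.

```python
import base64
import os


def build_upgrade_request(host: str, path: str = "/") -> bytes:
    """Build the HTTP/1.1 request that asks the server to switch to WebSocket."""
    # The key is 16 random bytes, base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    # HTTP header lines end with CRLF; a blank line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def is_switching_protocols(status_line: str) -> bool:
    """True if an HTTP status line carries status code 101."""
    parts = status_line.split(" ", 2)
    return len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1] == "101"


print(build_upgrade_request("example.com", "/chat").decode("ascii"))
print(is_switching_protocols("HTTP/1.1 101 Switching Protocols"))  # True
```

A real client would also check the response headers before treating the socket as upgraded; this sketch only checks the status line.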

The handshake also includes a challenge key. The client sends a random 16-byte value, base64-encoded, in the Sec-WebSocket-Key header. The server appends a fixed GUID (258EAFA5-E914-47DA-95CA-C5AB0DC85B11) to that string, takes the SHA-1 hash, base64-encodes the digest, and sends it back in the Sec-WebSocket-Accept header. This is not authentication. It proves the server actually understands the WebSocket protocol and is not some random HTTP server accidentally saying yes.
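The server's half of that key exchange fits in a few lines. This is a minimal sketch: the hash is SHA-1 and the result is base64-encoded, as RFC 6455 specifies, and the sample key and expected value are the ones given in the RFC's own example. The function name accept_value is made up here.

```python
import base64
import hashlib

# Fixed GUID defined by the WebSocket specification.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(client_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455.
print(accept_value("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client runs the same computation on the key it sent and rejects the connection if the server's answer does not match.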

The whole upgrade takes one round trip. After that, the connection is a persistent, full-duplex channel that stays open until one side closes it or the network drops.

Persistent Frames

Once upgraded, HTTP is gone. No more headers on every message. Instead, both sides speak in frames — tiny binary packets with just a few bytes of metadata:

  • A bit indicating if this is the final frame of a message
  • An opcode (continuation, text, binary, close, ping, pong)
  • A payload length
  • A 4-byte masking key (required client → server, absent server → client)
  • The actual data
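The fields above map onto bytes roughly as follows. This is a simplified encoder sketch that follows the RFC 6455 frame layout; it ignores extensions and the reserved bits, and the name encode_frame is invented for this example. The two printed frames match the "Hello" examples in the RFC.

```python
import struct
from typing import Optional

OPCODE_TEXT = 0x1


def encode_frame(payload: bytes, opcode: int = OPCODE_TEXT, fin: bool = True,
                 mask_key: Optional[bytes] = None) -> bytes:
    """Encode one WebSocket frame. Pass a 4-byte mask_key for client frames."""
    # Byte 0: FIN bit in the top bit, opcode in the low 4 bits.
    first = (0x80 if fin else 0x00) | (opcode & 0x0F)
    # Byte 1: MASK bit in the top bit, then a 7-bit length or a length marker.
    mask_bit = 0x80 if mask_key is not None else 0x00
    n = len(payload)
    if n < 126:
        header = bytes([first, mask_bit | n])
    elif n < 65536:
        header = bytes([first, mask_bit | 126]) + struct.pack("!H", n)
    else:
        header = bytes([first, mask_bit | 127]) + struct.pack("!Q", n)
    if mask_key is None:
        return header + payload
    # Masking XORs each payload byte with the key, cycling through its 4 bytes.
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return header + mask_key + masked


print(encode_frame(b"Hello").hex())  # 810548656c6c6f
print(encode_frame(b"Hello", mask_key=bytes.fromhex("37fa213d")).hex())  # 818537fa213d7f9f4d5158
```

The unmasked frame is 7 bytes for a 5-byte message: 2 bytes of header, then the data.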

A WebSocket frame header is 2 to 14 bytes, depending on the payload length and whether the frame is masked. Compare that to an HTTP request, where headers alone often run to 500–800 bytes. For anything chatty (chat apps, games, live feeds) this matters enormously.
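The 2 and 14 byte bounds fall out of the frame layout: 2 fixed bytes, plus 0, 2, or 8 bytes of extended length, plus 4 bytes of masking key on client frames. A small sketch of that arithmetic (the function name is made up for illustration):

```python
def frame_header_size(payload_len: int, masked: bool) -> int:
    """Header size in bytes for a WebSocket frame with the given payload length."""
    size = 2                      # FIN/opcode byte + MASK/length byte
    if payload_len > 65535:
        size += 8                 # 64-bit extended length
    elif payload_len > 125:
        size += 2                 # 16-bit extended length
    if masked:
        size += 4                 # masking key, client-to-server frames only
    return size


for n in (5, 200, 70000):
    print(n, frame_header_size(n, masked=False), frame_header_size(n, masked=True))
# 5 2 6
# 200 4 8
# 70000 10 14
```

A short chat message from the server costs 2 bytes of framing; the same message from the client costs 6.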

And because the connection is persistent, there's no TCP handshake, no TLS negotiation, no DNS lookup on every message. You pay those costs once, at connection time, then ride the same pipe for hours.

Why Chat Apps & Games Need This

Think about Discord. When your friend sends a message, it appears on your screen almost instantly. That's not polling — your browser isn't asking every 200ms "any new messages?" That would melt the servers. Instead, you hold a single open WebSocket, and the server pushes the message down the pipe the moment it arrives.

Multiplayer games take this further. Every player position update, every projectile, every state change — 30 to 60 times per second — flows through a WebSocket (or a similar persistent protocol). Request-response would be comically unusable at those rates.

The key insight: the server decides when to talk. That's the capability plain request-response HTTP can't give you.
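The push model can be sketched without any networking at all. In the toy example below, each asyncio.Queue stands in for one open connection, and the hypothetical Hub class plays the server. It is not a WebSocket implementation, only an illustration of who initiates the send.

```python
import asyncio
from typing import List, Set


class Hub:
    """Hypothetical in-process stand-in for a server holding open connections."""

    def __init__(self) -> None:
        self._connections: Set[asyncio.Queue] = set()

    def connect(self) -> asyncio.Queue:
        """Register a new 'connection' and return the client's end of it."""
        q: asyncio.Queue = asyncio.Queue()
        self._connections.add(q)
        return q

    def disconnect(self, q: asyncio.Queue) -> None:
        self._connections.discard(q)

    def broadcast(self, message: str) -> None:
        """Server-initiated send: no client asked for this message."""
        for q in self._connections:
            q.put_nowait(message)


async def demo() -> List[str]:
    hub = Hub()
    alice, bob = hub.connect(), hub.connect()
    hub.broadcast("hello")          # pushed the moment it arrives
    return [await alice.get(), await bob.get()]


print(asyncio.run(demo()))  # ['hello', 'hello']
```

Neither client polls. Each one waits on its own queue and wakes up only when the hub has something to deliver.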

SSE vs WebSockets

Server-Sent Events (SSE) is the other option for server-push. It's HTTP-native, simpler, and auto-reconnects. But it's one-way — server to client only. If you need the client to also send data (chat, game input, collaborative editing), you end up with two channels: SSE down, HTTP POST up. Messy.

WebSockets are full-duplex over one connection. Use them when you need bidirectional, low-latency messaging. Use SSE when you only need server → client push (live news feeds, notifications, dashboards).
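For comparison with WebSocket's binary frames, SSE messages are plain text on a long-lived HTTP response: optional "event:" and one or more "data:" lines, ended by a blank line. A minimal formatter sketch (the function name format_sse is invented here):

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Format one Server-Sent Events message for a text/event-stream response."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line data is sent as several data: lines.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line marks the end of the message.
    return "\n".join(lines) + "\n\n"


print(repr(format_sse("price=101.5", event="tick")))
# 'event: tick\ndata: price=101.5\n\n'
```

There is no equivalent for the client-to-server direction, which is why client input has to travel over a separate request.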

Watch the Full Video

The full video walks through the handshake visually, shows exactly how the 101 switch works, and compares frame sizes with real numbers.
