137Foundry

Posted on Jun 7

7 Essential Patterns for Production Real-Time Web Apps

#webdev #realtime #sse #websockets

The gap between "we shipped a real-time feature" and "we shipped a real-time feature that stays up under load" is usually a small set of patterns. Each one is cheap to implement. Skipping any of them creates a class of bug that is annoying to debug after the fact and easy to prevent up front.

This is the working list we keep at 137Foundry for real-time web work. Seven patterns, why each matters, and what good looks like.

1. Broadcast the bytes, not the object

The most common mistake in a real-time fan-out is to serialize the payload once per connected client: build a payload object, JSON-stringify it for client A, JSON-stringify it again for client B, and so on. At a few hundred connected clients per process, the serialization cost dominates the broadcast latency, and the server starts falling behind the producer.

The fix is to serialize once before the fan-out and write the same bytes to every connected client. For Server-Sent Events, build the event line as a string once and call res.write(line) in a loop. For WebSockets, serialize the frame once and call socket.send(frame) in a loop.

This single change can cut broadcast latency by an order of magnitude at the high end. The technique is well documented in the Node.js performance guide and in most WebSocket library guides.
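A minimal sketch of the serialize-once fan-out for SSE. The StreamWriter interface and the function names are hypothetical stand-ins for this illustration; in a Node.js server the writer would be the HTTP response object and write would be res.write.

```typescript
// Hypothetical minimal shape of a connected client: anything with a write method.
interface StreamWriter {
  write(chunk: string): void;
}

// Build one SSE frame as a string. JSON.stringify does not emit raw newlines,
// so a single "data:" line is enough for a JSON payload.
function formatSseFrame(eventId: number, payload: unknown): string {
  return `id: ${eventId}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Serialize once, then write the same string to every client.
// Returns the number of clients written to.
function broadcastFrame(writers: StreamWriter[], eventId: number, payload: unknown): number {
  const frame = formatSseFrame(eventId, payload); // the only serialization in the hot path
  let delivered = 0;
  writers.forEach((writer) => {
    writer.write(frame);
    delivered += 1;
  });
  return delivered;
}
```

The cost of JSON.stringify is now paid once per event instead of once per client; the loop only copies a string that already exists.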

2. Use a shared bus for multi-process fan-out

A single application process can hold a few thousand connections comfortably on a modern event-loop runtime. Past that, you need multiple processes behind a load balancer, and each process holds a subset of the active subscribers.

The standard scaling shape is to put a message bus between the event source and the application processes. The producer publishes once to the bus. Every application process subscribes to the bus. When a process receives a message, it runs the broadcast-the-bytes pattern (above) for its local subscribers.

Redis pub/sub is the canonical bus for this. NATS is the lower-overhead alternative. Kafka is the persistent option if you also need durability at the bus layer. All three work; pick the one your team already runs.
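The shape described above can be sketched without a broker. Everything here is an assumption for illustration: MessageBus is a hypothetical interface, InMemoryBus is an in-process stand-in for Redis pub/sub, NATS, or Kafka, and AppProcess models one application process with its own local subscribers.

```typescript
type BusHandler = (message: string) => void;

// Hypothetical bus interface; a real deployment would back it with a broker client.
interface MessageBus {
  publish(channel: string, message: string): void;
  subscribe(channel: string, handler: BusHandler): void;
}

// In-memory stand-in so the sketch runs on its own.
class InMemoryBus implements MessageBus {
  private handlers: { [channel: string]: BusHandler[] } = {};

  publish(channel: string, message: string): void {
    const list = this.handlers[channel] || [];
    list.forEach((handler) => handler(message));
  }

  subscribe(channel: string, handler: BusHandler): void {
    const existing = this.handlers[channel];
    if (existing) {
      existing.push(handler);
    } else {
      this.handlers[channel] = [handler];
    }
  }
}

// One application process: subscribes to the bus and relays each message
// to the subscribers connected to this process only.
class AppProcess {
  private localWriters: Array<(frame: string) => void> = [];

  constructor(bus: MessageBus, channel: string) {
    bus.subscribe(channel, (message) => this.fanOut(message));
  }

  addLocalSubscriber(writeFrame: (frame: string) => void): void {
    this.localWriters.push(writeFrame);
  }

  private fanOut(frame: string): void {
    // The frame was serialized by the producer; it is written unchanged.
    this.localWriters.forEach((writeFrame) => writeFrame(frame));
  }
}
```

The producer calls publish once per event, and no application process needs to know how many other processes exist.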

The longer guide on implementing Server-Sent Events for real-time web updates covers the multi-process fan-out in detail with the SSE-specific framing.

3. Send heartbeats every 15-30 seconds

A long-lived connection that passes through a load balancer with an idle timeout needs some traffic inside every timeout window to stay alive. Without heartbeats, quiet streams are closed by the balancer, and the next real event finds the previous connection already gone, so clients reconnect in a storm.

For SSE, a comment line (: ping\n\n) every fifteen to thirty seconds keeps the connection alive past most load balancer timeouts (usually 30-60 seconds idle). For WebSockets, the protocol-level ping/pong frame serves the same purpose.

Heartbeats also help the server detect dead clients. A write to a socket whose other end has gone silently away will eventually fail, and the server can clean up the dead subscriber. Without heartbeats, the server keeps writing to dead sockets for the full idle-timeout duration.
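A minimal sketch of the SSE heartbeat pass, under stated assumptions: HeartbeatTarget and sendDueHeartbeats are hypothetical names, and a failed write is modeled as a thrown exception. Real Node.js streams usually report a dead socket through 'error' or 'close' events rather than by throwing, so treat the try/catch as a simplification.

```typescript
// The SSE comment line from the text: ignored by EventSource, but it counts as traffic.
const SSE_HEARTBEAT_FRAME = ": ping\n\n";

// Hypothetical per-connection record.
interface HeartbeatTarget {
  write(chunk: string): void;
  lastWriteAtMs: number; // updated on every write, real events included
}

// Send a heartbeat to every connection that has been idle for at least intervalMs.
// Returns the connections whose write failed so the caller can clean them up.
function sendDueHeartbeats(
  targets: HeartbeatTarget[],
  nowMs: number,
  intervalMs: number
): HeartbeatTarget[] {
  const dead: HeartbeatTarget[] = [];
  targets.forEach((target) => {
    if (nowMs - target.lastWriteAtMs < intervalMs) return; // recent traffic, no ping needed
    try {
      target.write(SSE_HEARTBEAT_FRAME);
      target.lastWriteAtMs = nowMs;
    } catch (_err) {
      dead.push(target); // the write failed: treat the subscriber as gone
    }
  });
  return dead;
}
```

In a running server this would be called from a timer every few seconds with the current time and an interval of fifteen to thirty seconds.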

4. Cap per-user connections

One user opening twenty tabs to the same dashboard should not cost the server twenty long-lived connections. The cost compounds at scale and creates an interesting denial-of-service vector that needs no malice to trigger.

Pick a cap (often three to five per user), enforce it on the server side by tracking active connections per user, and close the oldest when a new one would exceed the cap. Surface the cap in the API contract so the client side knows what to expect.

The same pattern applies to anonymous connections, but the bound should be much tighter (one or two per IP) because the cost of distinguishing legitimate users is higher.
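One way to sketch the close-the-oldest policy. ConnectionLimiter and TrackedConnection are hypothetical names for this illustration, and the state is per-process; a cap enforced across multiple processes would need shared state, which this sketch does not cover.

```typescript
// Hypothetical handle for one live connection.
interface TrackedConnection {
  id: string;
  close(): void;
}

class ConnectionLimiter {
  // Connections per key (user ID or IP), oldest first.
  private byUser: { [userId: string]: TrackedConnection[] } = {};
  private maxPerUser: number;

  constructor(maxPerUser: number) {
    this.maxPerUser = maxPerUser;
  }

  // Register a new connection. If the user is now over the cap, the oldest
  // connections are closed and returned.
  add(userId: string, connection: TrackedConnection): TrackedConnection[] {
    const list = this.byUser[userId] || [];
    list.push(connection);
    const overflow = Math.max(0, list.length - this.maxPerUser);
    const evicted = list.splice(0, overflow); // removes from the front: oldest first
    evicted.forEach((old) => old.close());
    this.byUser[userId] = list;
    return evicted;
  }

  // Call when a connection ends on its own so the count stays accurate.
  remove(userId: string, connectionId: string): void {
    const list = this.byUser[userId];
    if (!list) return;
    const remaining = list.filter((c) => c.id !== connectionId);
    if (remaining.length === 0) {
      delete this.byUser[userId];
    } else {
      this.byUser[userId] = remaining;
    }
  }

  count(userId: string): number {
    const list = this.byUser[userId];
    return list ? list.length : 0;
  }
}
```

The same class works for the anonymous case by keying on IP address and constructing it with a tighter cap.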

5. Disable middlebox buffering everywhere

This is the single most common production failure mode for SSE and the second most common for WebSockets.

Some reverse proxies buffer the response body until they see a content-length or hit a threshold. For long-lived streams without content-length, the threshold may never be reached, and events sit in the proxy until the timeout closes the connection. The user sees "nothing for thirty seconds, then a burst, then nothing again."

The fix is one config line per middlebox.

For nginx: set proxy_buffering off; on the stream location, or send an X-Accel-Buffering: no header from the application.
For HAProxy: usually fine by default, but confirm with a slow-stream test.
For CDN edges: the per-platform docs from Cloudflare and equivalent providers all have a "disable buffering for this path" setting.
For ALBs and similar cloud load balancers: confirm the idle timeout is longer than your heartbeat interval.
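Of the fixes listed above, the nginx header is the one that lives in application code. A small sketch, assuming a hypothetical helper named sseResponseHeaders; the Cache-Control value is a common companion setting and an assumption here, not something the list above prescribes.

```typescript
// Response headers for an SSE endpoint.
function sseResponseHeaders(): { [header: string]: string } {
  return {
    "Content-Type": "text/event-stream", // the SSE media type
    "Cache-Control": "no-cache",         // assumption: keep intermediaries from caching the stream
    "X-Accel-Buffering": "no",           // tells nginx not to buffer this response
  };
}
```

The header only affects nginx; other middleboxes on the path still need their own setting.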

Test the path end to end with a slow stream (one event per minute) and watch the events arrive on the client. If they arrive in bursts, a middlebox is still buffering.
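The slow-stream check can be automated by recording arrival times on the client and looking for gaps that are much shorter than the send interval. This is a rough heuristic sketch; looksBuffered and its tolerance default are assumptions, not a standard test.

```typescript
// Given client-side arrival timestamps (ms) for events the server sent at a fixed
// interval, report whether any two arrived much closer together than they were sent.
function looksBuffered(
  arrivalTimesMs: number[],
  sendIntervalMs: number,
  toleranceRatio: number = 0.5
): boolean {
  const minExpectedGapMs = sendIntervalMs * (1 - toleranceRatio);
  let previous: number | undefined;
  let buffered = false;
  arrivalTimesMs.forEach((arrival) => {
    if (previous !== undefined && arrival - previous < minExpectedGapMs) {
      buffered = true; // two events landed together: something held the first one back
    }
    previous = arrival;
  });
  return buffered;
}
```

With one event per minute, evenly spaced arrivals pass, while three events that show up within a few milliseconds of each other are flagged.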

6. Plan a replay window

Real-time transports drop events when the client disconnects. SSE has automatic resumption via Last-Event-ID if the server retains the missed events. WebSockets have no built-in resumption; you write it yourself.

Either way, the server has to decide how far back to retain events and where to retain them. A common starting point: write every event to durable storage (database or event log) before publishing it, and serve replay from durable storage when a client reconnects with a Last-Event-ID header. The SSE side is documented thoroughly in the WHATWG HTML spec; WebSocket libraries usually have their own conventions.

Past the replay window, the client should re-fetch state via the same normal HTTP endpoint it uses on initial load. Real-time is a delta layer on top of an authoritative read endpoint, not a replacement for it.
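The replay decision can be sketched as follows. ReplayBuffer is a hypothetical in-memory stand-in for the durable store, with a fixed-size window and contiguous numeric event IDs assumed for simplicity; a real implementation would query the database or event log instead.

```typescript
interface StoredEvent {
  id: number;    // monotonically increasing, contiguous event ID
  frame: string; // pre-serialized payload
}

type ReplayResult =
  | { kind: "replay"; events: StoredEvent[] } // client can catch up from the store
  | { kind: "refetch" };                      // too far behind: reload state over HTTP

class ReplayBuffer {
  private events: StoredEvent[] = [];
  private nextId = 1;
  private oldestRetainedId = 1;
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Store the event before it is published, dropping the oldest past capacity.
  append(frame: string): StoredEvent {
    const stored: StoredEvent = { id: this.nextId, frame: frame };
    this.nextId += 1;
    this.events.push(stored);
    if (this.events.length > this.capacity) {
      this.events.shift();
      this.oldestRetainedId += 1;
    }
    return stored;
  }

  // lastEventId is the value from the client's Last-Event-ID header.
  since(lastEventId: number): ReplayResult {
    // If the event right after lastEventId was already dropped, there is a gap.
    if (lastEventId + 1 < this.oldestRetainedId) return { kind: "refetch" };
    return { kind: "replay", events: this.events.filter((e) => e.id > lastEventId) };
  }
}
```

The explicit "refetch" result keeps the server from silently replaying a partial history to a client that has fallen outside the window.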

7. Instrument the stream like a long-running system

A real-time stream is a long-running system, and the observability story should match. Treat it as you would a job queue or a database connection pool, not as a request-response endpoint.

Minimum metrics worth shipping: active connections per process, connection-open rate, connection-close rate (split by reason: client-initiated, server-initiated, timeout, error), per-event broadcast latency, replay-from-store latency, and dead-letter count for events that failed to broadcast.

Surface these in your normal monitoring dashboard. Alert on the same conditions you would for any other long-running system: error rate above threshold, throughput drop below threshold, latency tail above threshold. Tools that work well: Prometheus for metrics, any structured log aggregator for the per-connection lifecycle events.
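A minimal in-process sketch of a few of the metrics named above. StreamMetrics is a hypothetical class for illustration and does not use any Prometheus client API; in practice these values would be exported through whatever metrics library the team already runs, and latency would go into a histogram rather than an unbounded array.

```typescript
type CloseReason = "client" | "server" | "timeout" | "error";

class StreamMetrics {
  activeConnections = 0;
  opened = 0;
  closedByReason: { [R in CloseReason]: number } = { client: 0, server: 0, timeout: 0, error: 0 };
  private broadcastLatenciesMs: number[] = [];

  connectionOpened(): void {
    this.opened += 1;
    this.activeConnections += 1;
  }

  connectionClosed(reason: CloseReason): void {
    this.closedByReason[reason] += 1;
    this.activeConnections = Math.max(0, this.activeConnections - 1);
  }

  recordBroadcast(latencyMs: number): void {
    this.broadcastLatenciesMs.push(latencyMs);
  }

  // Nearest-rank percentile over recorded broadcast latencies; 0 if none recorded.
  latencyPercentile(p: number): number {
    const sorted = this.broadcastLatenciesMs.slice().sort((a, b) => a - b);
    if (sorted.length === 0) return 0;
    const rank = Math.ceil((p / 100) * sorted.length);
    const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
    const value = sorted[index];
    return value === undefined ? 0 : value;
  }
}
```

Splitting closes by reason is what makes a spike in "timeout" closes distinguishable from ordinary client navigation.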

"Every real-time feature we have shipped at 137Foundry that stayed reliable in production had these seven patterns in place from day one. Every one that did not start with them ended up with most of them added retroactively, after the first user-visible incident." - Dennis Traina, founder of 137Foundry

How to roll these out on an existing system

If you already have a real-time feature in production and you are not running all seven patterns, the order to adopt them matters.

First: disable middlebox buffering. It is the largest improvement available from a single config change. Many "laggy stream" complaints disappear the same day.

Second: send heartbeats. Solves the most common reconnect-storm cause and lets the server detect dead clients faster.

Third: broadcast the bytes, not the object. Eliminates the hidden O(n) serialization cost in the broadcast hot path.

Fourth: cap per-user connections. Removes a class of accidental denial-of-service caused by ordinary user behavior.

Fifth: stand up a shared bus. Required for any horizontal scale; one-process fan-out has a ceiling.

Sixth: plan a replay window. Required for any feature where missed events are user-visible (notifications, dashboards with persistent state).

Seventh: instrument the stream. Lets you measure whether the other six are actually helping and surfaces problems before users do.

Each step has a measurable win. The full set, in place, is the difference between a feature that needs a senior engineer babysitting it and a feature you ship and forget.

What we left off the list

We considered three more patterns for the list and chose not to include them.

Transport encryption. Real-time streams should run over HTTPS or WSS in production. We considered this baseline rather than essential; if you are not already running TLS, you have a bigger problem than your real-time transport. The Internet Engineering Task Force publishes the relevant protocol references.

Custom backoff. SSE's EventSource reconnects on its own, and many WebSocket client libraries handle reconnection backoff for you. We considered "tune the backoff" worth a line in the contract docs, but not its own pattern.

Per-event compression. Compression at the transport layer (gzip on the response, deflate on WebSocket frames) is usually a small win. We considered it worth doing but not worth promoting to "essential."

The honest summary

Real-time web is one of the areas of the modern web stack where the gap between "works in dev" and "works at scale" is mostly about discipline, not protocol choice. The seven patterns above cover most of that discipline.

Adopt them in the order listed and most production real-time features stay reliable through traffic spikes, deploy churn, and the long tail of network weirdness real users actually experience. Skip them and the same features become a recurring source of incidents that are expensive to debug and easy to prevent.

The transport (SSE, WebSockets, polling) is the visible choice. The patterns around the transport are the invisible choice that decides whether the feature is good or bad in production. Invest in the patterns first.

DEV Community