If you've built anything that streams LLM responses over SSE, you've probably hit this: the user refreshes the page, or their network blips, or the load balancer routes the reconnect to a different instance — and the stream is just gone. The generation keeps burning tokens on your backend, but the client sees nothing.
In the JS/TS world this is mostly solved. Vercel shipped resumable-stream, there's ai-resumable-stream, Ably has a whole token streaming product. But if your backend is in Go? Nothing.
I ran into this while working on a project where the LLM worker and the HTTP handler live in different processes. I needed something that:
- persists chunks so reconnecting clients can replay what they missed
- delivers cancel signals across instances (user clicks "stop" on one node, generation stops on another)
- prevents duplicate producers (two requests racing to start the same session)
So I built streamhub.
## How it works
Two Redis primitives, that's it:
- Redis Streams store chunks. New subscribers read history first, then get live data.
- Redis Pub/Sub carries cancel signals. Fast, fire-and-forget.
Each producer gets a generation ID that acts as a fencing token — if a stale producer tries to write after losing ownership, the writes are rejected.
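To make the fencing-token idea concrete, here's a toy in-memory model of it (this is my own illustration, not streamhub's implementation — in the real library the generation counter lives in Redis so it works across processes). A producer captures the generation at registration time; any later registration bumps it, and writes carrying a stale generation are refused:

```go
package main

import (
	"fmt"
	"sync"
)

// registry models session ownership with a per-session generation counter.
type registry struct {
	mu  sync.Mutex
	gen map[string]int
}

// acquire registers a new producer and returns its fencing token.
// Each call bumps the generation, invalidating any previous owner.
func (r *registry) acquire(session string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.gen[session]++
	return r.gen[session]
}

// write succeeds only if the caller still holds the latest generation.
func (r *registry) write(session string, gen int, chunk string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	return gen == r.gen[session]
}

func main() {
	r := &registry{gen: map[string]int{}}
	old := r.acquire("chat:123") // first producer starts
	cur := r.acquire("chat:123") // a second producer takes over
	fmt.Println(r.write("chat:123", old, "stale")) // false: fenced out
	fmt.Println(r.write("chat:123", cur, "live"))  // true
}
```

The point of the token is that the stale producer doesn't need to *know* it lost ownership — its writes simply stop landing, which is what makes the "two requests racing to start the same session" case safe.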
## What the code looks like
Producer side:
```go
stream, created, err := hub.Register("chat:123", func() {
	// called when someone cancels this session
})
if err != nil {
	return err
}
if !created {
	return nil // another instance already owns this session
}
defer stream.Close()

stream.Publish("hello")
stream.Publish(" world")
```
Consumer side (can be a completely different process):
```go
stream := hub.Get("chat:123")
chunks, unsubscribe := stream.Subscribe(128)
defer unsubscribe()

flusher, ok := w.(http.Flusher)
if !ok {
	http.Error(w, "streaming unsupported", http.StatusInternalServerError)
	return
}
for chunk := range chunks {
	// replays persisted chunks first, then streams live
	fmt.Fprint(w, chunk)
	flusher.Flush()
}
```
Cancel from anywhere:
```go
hub.Get("chat:123").Cancel()
```
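What makes this work across instances is that Cancel doesn't touch the producer directly — it publishes on a side-channel, and whichever process registered the session's callback gets notified. A minimal in-process model of that fan-out (illustrative only; streamhub carries this over Redis Pub/Sub so the caller and the producer can be on different machines):

```go
package main

import "fmt"

// bus models the Pub/Sub cancel side-channel: fire-and-forget fan-out.
type bus struct {
	subs map[string][]func()
}

// onCancel registers a callback for a session, like the one passed to Register.
func (b *bus) onCancel(session string, fn func()) {
	b.subs[session] = append(b.subs[session], fn)
}

// cancel notifies every registered callback for the session.
func (b *bus) cancel(session string) {
	for _, fn := range b.subs[session] {
		fn()
	}
}

func main() {
	b := &bus{subs: map[string][]func(){}}
	b.onCancel("chat:123", func() { fmt.Println("producer: stopping generation") })
	b.cancel("chat:123") // in streamhub this hop goes through Redis Pub/Sub
}
```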
## Why not just use X?
"Just use Redis Streams directly" — you can, but you'll end up reimplementing subscriber fan-out, replay-then-live handoff, generation fencing, and the cancel side-channel. That's what streamhub is.
"Use Centrifuge/Centrifugo" — great project, but it's a full real-time messaging framework. If all you need is to make your LLM streams durable, it's a lot of surface area.
"Use vercel/resumable-stream" — TypeScript only, tightly coupled to the Vercel AI SDK.
## Status
Early days. The API surface might still change. If you're dealing with this same problem in Go, I'd appreciate feedback: github.com/gtoxlili/streamhub