Gtio
I needed resumable LLM streams in Go — so I built streamhub

If you've built anything that streams LLM responses over SSE, you've probably hit this: the user refreshes the page, or their network blips, or the load balancer routes the reconnect to a different instance — and the stream is just gone. The generation keeps burning tokens on your backend, but the client sees nothing.

In the JS/TS world this is mostly solved: Vercel shipped resumable-stream, there's ai-resumable-stream, and Ably has a whole token-streaming product. But if your backend is in Go? Nothing.

I ran into this while working on a project where the LLM worker and the HTTP handler live in different processes. I needed something that:

  • persists chunks so reconnecting clients can replay what they missed
  • delivers cancel signals across instances (user clicks "stop" on one node, generation stops on another)
  • prevents duplicate producers (two requests racing to start the same session)

So I built streamhub.

How it works

Two Redis primitives, that's it:

  • Redis Streams store chunks. New subscribers read history first, then get live data.
  • Redis Pub/Sub carries cancel signals. Fast, fire-and-forget.
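
To make the replay-then-live behavior concrete, here is a minimal in-memory sketch of the pattern (hypothetical names, not streamhub's actual implementation — streamhub does this on top of Redis Streams, where the history survives process restarts):

```go
package main

import (
	"fmt"
	"sync"
)

// resumableStream sketches replay-then-live: published chunks are appended
// to a history log, and each new subscriber drains that history before
// receiving live chunks.
type resumableStream struct {
	mu      sync.Mutex
	history []string
	subs    []chan string
}

// Publish appends a chunk to history and fans it out to live subscribers.
func (s *resumableStream) Publish(chunk string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.history = append(s.history, chunk)
	for _, ch := range s.subs {
		ch <- chunk // sketch only: assumes the buffer never fills
	}
}

// Subscribe returns a channel that replays history first, then goes live.
func (s *resumableStream) Subscribe(buf int) (<-chan string, func()) {
	s.mu.Lock()
	defer s.mu.Unlock()
	ch := make(chan string, buf+len(s.history))
	for _, old := range s.history {
		ch <- old // replay: a reconnecting client sees what it missed
	}
	s.subs = append(s.subs, ch)
	unsubscribe := func() {
		s.mu.Lock()
		defer s.mu.Unlock()
		for i, c := range s.subs {
			if c == ch {
				s.subs = append(s.subs[:i], s.subs[i+1:]...)
				return
			}
		}
	}
	return ch, unsubscribe
}

func main() {
	s := &resumableStream{}
	s.Publish("hello") // published before anyone subscribed
	ch, unsubscribe := s.Subscribe(8)
	defer unsubscribe()
	s.Publish(" world") // live chunk
	fmt.Print(<-ch, <-ch) // prints "hello world"
}
```

The point of the handoff is that a reconnecting client doesn't need to track offsets itself — subscribing again is enough.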

Each producer gets a generation ID that acts as a fencing token — if a stale producer tries to write after losing ownership, the writes are rejected.
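
The fencing idea itself is simple enough to show in a few lines. This is an in-memory sketch with hypothetical names — in streamhub the generation check happens in Redis, so it holds across processes:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrFenced = errors.New("stale producer: write rejected")

// sessionRegistry sketches generation-ID fencing: each time ownership of a
// session is (re)acquired the generation increments, and writes carrying an
// older generation are rejected.
type sessionRegistry struct {
	mu  sync.Mutex
	gen map[string]int64 // session ID -> current generation
}

// Acquire claims (or re-claims) a session and returns a fencing token.
func (r *sessionRegistry) Acquire(session string) int64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.gen[session]++
	return r.gen[session]
}

// Write succeeds only if the caller still holds the latest generation.
func (r *sessionRegistry) Write(session string, gen int64, chunk string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if gen != r.gen[session] {
		return ErrFenced
	}
	// ... append chunk to the session's stream here ...
	return nil
}

func main() {
	r := &sessionRegistry{gen: map[string]int64{}}
	g1 := r.Acquire("chat:123") // first producer
	g2 := r.Acquire("chat:123") // a new producer takes over
	fmt.Println(r.Write("chat:123", g1, "late chunk")) // stale producer: write rejected
	fmt.Println(r.Write("chat:123", g2, "hello"))      // <nil>
}
```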

What the code looks like

Producer side:

stream, created, err := hub.Register("chat:123", func() {
    // called when someone cancels this session
})
if err != nil {
    return // e.g. Redis unreachable
}
if !created {
    return // another instance already owns this
}
defer stream.Close()

stream.Publish("hello")
stream.Publish(" world")

Consumer side (can be a completely different process):

stream := hub.Get("chat:123")
chunks, unsubscribe := stream.Subscribe(128)
defer unsubscribe()

for chunk := range chunks {
    // replays existing chunks first, then streams live
    fmt.Fprint(w, chunk)
    w.(http.Flusher).Flush()
}

Cancel from anywhere:

hub.Get("chat:123").Cancel()

Why not just use X?

"Just use Redis Streams directly" — you can, but you'll end up reimplementing subscriber fan-out, replay-then-live handoff, generation fencing, and the cancel side-channel. That's what streamhub is.

"Use Centrifuge/Centrifugo" — great project, but it's a full real-time messaging framework. If all you need is to make your LLM streams durable, it's a lot of surface area.

"Use vercel/resumable-stream" — TypeScript only, tightly coupled to the Vercel AI SDK.

Status

Early days. The API surface might still change. If you're dealing with this same problem in Go, I'd appreciate feedback: github.com/gtoxlili/streamhub
