BlackBull goes multi-protocol (part 1) — first it had to forget it was HTTP

#webdev #mqtt #http #python

What does it take to teach an HTTP server to speak a protocol that isn't HTTP at
all? Not "add a WebSocket upgrade" — that still starts with an HTTP handshake.
I mean a protocol that arrives on its own port, speaks its own wire format, and
has never heard of a request-line.

That's the problem I walked into with BlackBull, a pure-Python ASGI framework
with its own HTTP/1.1, HTTP/2, and WebSocket stack. I'd recently gotten it to a
stable, spec-passing HTTP server — and hit a wall. A clean HTTP server, however
well-built, is a solved problem; there are dozens. What I actually wanted to
know was whether BlackBull's foundation could carry something HTTP couldn't
reach: a second, unrelated protocol sharing the same process. And HTTP alone
couldn't tell me — every connection was being handled by machinery that quietly
assumed HTTP from the first byte.

So I decided to add an MQTT 5 broker (part 2 of this series) — a protocol with
no request-line, no status codes, no HTTP at all. The first thing I discovered:
the core didn't know how to welcome a second protocol. It assumed every
connection was HTTP — not as a bug, but as an identity. Before BlackBull could
host anything else, it had to forget it was an HTTP server.

And it had to do that surgery under a working HTTP server, without breaking a
single feature. That constraint — change the foundation, break nothing — is
the whole story. Whether the foundation actually held is the question the rest of
this post answers.

Where HTTP was hiding in the dispatcher

Every incoming TCP connection is owned by a ConnectionActor. Its job is to
figure out what protocol the peer is speaking and hand the socket to the right
handler. The trouble: "figure out what protocol" had HTTP baked into it in three
places.

The ALPN path read exactly 24 bytes — the length of the HTTP/2 connection preface (PRI * HTTP/2.0\r\n\r\n…, RFC 9113 §3.4).
The cleartext path read until \r\n — the HTTP/1.1 request-line delimiter.
The slowloris timeout wrote back HTTP/1.1 408 Request Timeout — an HTTP string, on a connection we hadn't yet proven was HTTP.

For an HTTP-only server this is fine; the assumptions are always true. But a
protocol whose detection needs a different number of bytes, or a different
delimiter, or that should never receive an HTTP 408, cannot participate at
all. The dispatcher wasn't a dispatcher — it was an HTTP parser wearing a
dispatcher's coat.

Peek, then replay

The fix is a detection primitive that reads from the socket without consuming
what it read: PrefixReader. It peeks at the opening bytes, lets each registered
protocol look at them, and then — crucially — replays those same bytes to
whichever protocol claims the connection, so the handler sees the stream from
byte zero as if nothing had been read.
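
To make the peek-then-replay idea concrete, here is a minimal sketch built on asyncio.StreamReader. It is not BlackBull's actual PrefixReader: the method names peek and read, and the internal buffer, are assumptions chosen for illustration.

```python
import asyncio


class PrefixReader:
    """Hypothetical peek-then-replay wrapper (not BlackBull's real class)."""

    def __init__(self, reader: asyncio.StreamReader) -> None:
        self._reader = reader
        self._buffer = bytearray()  # bytes peeked but not yet handed to a handler

    async def peek(self, n: int) -> bytes:
        """Return up to the first n bytes of the stream without consuming them."""
        while len(self._buffer) < n:
            chunk = await self._reader.read(n - len(self._buffer))
            if not chunk:  # peer closed before sending n bytes
                break
            self._buffer += chunk
        return bytes(self._buffer[:n])

    async def read(self, n: int) -> bytes:
        """Consume up to n bytes, replaying peeked bytes before touching the socket."""
        if self._buffer:
            out = bytes(self._buffer[:n])
            del self._buffer[:n]
            return out
        return await self._reader.read(n)
```

A handler that calls read() after detection gets the peeked bytes first and only then fresh socket data, which is what lets it behave as if detection never happened.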

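from __future__ import annotations  # ConnectionView is only an annotation here

class ProtocolBinding:
    # How many bytes detection needs before it can decide.
    detection_read: int = -1     # -1 = until \r\n, 0 = none, N = exactly N

    # Return True to claim the connection, given the peeked prefix and the ALPN result.
    def detect(self, prefix: bytes, alpn: str | None) -> bool: ...

    # Take over the connection; the peeked bytes are replayed from byte zero.
    async def serve(self, conn: ConnectionView) -> None: ...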

ConnectionActor no longer knows what 24 means, or why \r\n matters. It groups
the registered bindings by their declared detection_read, peeks exactly what
the group needs, offers the prefix to each binding's detect(), and calls
serve() on the winner. HTTP/1.1, HTTP/2, and WebSocket became three ordinary
bindings. The HTTP knowledge moved out of the dispatcher and into the HTTP
bindings, where it belongs.
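
The selection loop described above can be sketched roughly as follows. This is an illustration, not the real ConnectionActor: the Binding stand-ins, choose_binding, make_peek, the detect() heuristics, and the "fixed-size groups first" ordering are all assumptions, and timeouts are left out entirely.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, Optional

Peek = Callable[[int], Awaitable[bytes]]  # hypothetical: peek(detection_read) -> prefix


class Binding:
    """Stand-in for ProtocolBinding; only the detection half is modeled."""
    name = "base"
    detection_read = -1  # -1 = until \r\n, 0 = none, N = exactly N

    def detect(self, prefix: bytes, alpn: Optional[str]) -> bool:
        raise NotImplementedError


class Http2PriorKnowledge(Binding):
    name = "h2"
    detection_read = 24  # length of the HTTP/2 connection preface

    def detect(self, prefix: bytes, alpn: Optional[str]) -> bool:
        return prefix == b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"


class Http1(Binding):
    name = "http/1.1"
    detection_read = -1

    def detect(self, prefix: bytes, alpn: Optional[str]) -> bool:
        return prefix.endswith(b"\r\n") and b" HTTP/1." in prefix


async def choose_binding(bindings, peek: Peek, alpn: Optional[str] = None):
    """Group bindings by detection_read, peek once per group, return the first claimant."""
    groups = defaultdict(list)
    for binding in bindings:
        groups[binding.detection_read].append(binding)
    # Assumed ordering: fixed-size groups (smallest first), then the \r\n group.
    for need in sorted(groups, key=lambda n: (n < 0, n)):
        prefix = await peek(need)
        for binding in groups[need]:
            if binding.detect(prefix, alpn):
                return binding
    return None


def make_peek(data: bytes) -> Peek:
    """In-memory peek used only to exercise the sketch."""
    async def peek(need: int) -> bytes:
        if need < 0:
            end = data.find(b"\r\n")
            return data if end < 0 else data[: end + 2]
        return data[:need]
    return peek
```

Nothing in choose_binding mentions 24 or \r\n by meaning; those numbers live on the bindings. An unknown prefix, such as the opening bytes of an MQTT CONNECT packet, simply falls through to None.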

One honest caveat about that API: detection_read can express only a fixed byte
count (N, or 0 for none) or a read until \r\n. The \r\n sentinel is a pragmatic
default, not a universal delimiter; a protocol that frames on something else (a
C-style \0, say) can't yet describe its own detection need. The seam is in the
right place, since a binding declares what detection requires, but the
vocabulary isn't fully protocol-neutral. Generalizing detection_read to a
per-binding predicate is the natural next step, and a known one.
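
One possible shape for that per-binding predicate, purely as a sketch: the binding is shown a growing prefix and answers "mine", "not mine", or "need more". The names probe and feed_until_decided, the HELLO greeting, and the 64-byte cutoff are invented for illustration and are not part of BlackBull.

```python
from typing import Optional


class NulFramedBinding:
    """Hypothetical protocol whose first frame ends at a NUL byte."""

    MAX_PREFIX = 64  # assumed cap so an undecided peer cannot stall detection forever

    def probe(self, prefix: bytes) -> Optional[bool]:
        # None  -> undecided, show me more bytes
        # True  -> this connection is mine
        # False -> definitely not mine
        if b"\0" not in prefix:
            return None if len(prefix) < self.MAX_PREFIX else False
        return prefix.startswith(b"HELLO")


def feed_until_decided(binding, stream: bytes, step: int = 4):
    """Grow the prefix a few bytes at a time until the binding reaches a verdict."""
    seen = 0
    verdict = None
    while verdict is None and seen < len(stream):
        seen = min(seen + step, len(stream))
        verdict = binding.probe(stream[:seen])
    return verdict, seen
```

With this shape the dispatcher would no longer need to know about byte counts or delimiters at all; it would only keep peeking while some binding is still undecided.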

Two structural things fell out of this.

Why a whole class could be deleted

RawProtocolActor is gone, and why it could go is the real point. To appreciate
the deletion you have to know what the class did. In the old design ConnectionActor was
HTTP-shaped: it spoke in request-lines and response writers, and it could only
hand a connection to something that fit that shape. A non-HTTP handler didn't. So
there was an adapter — an extra actor layer everyone called "L2" — whose entire
job was to wrap a raw byte handler in enough HTTP-actor costume that the
dispatcher would deign to talk to it. It was pure translation, a class that
existed only because the dispatcher couldn't address anything that wasn't HTTP.

Once serve(conn) became the single door every protocol walks through, that
translation had no one left to translate for — there was only one shape now. L2
wasn't refactored; it was obviated. The clearest sign a seam is in the right
place is when a whole layer that existed to bridge two shapes wakes up to find
there's only one.

A lifecycle event that finally fires everywhere

connection_closed now fires for HTTP too. HTTP teardown used to live on its own
HTTP-specific path, one that never emitted the lifecycle event, so the event
fired only for the raw connections. Unifying the path fixed that asymmetry as a
side effect.
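
The reason a single path fixes this can be shown in a few lines. This is a sketch under assumptions: run_connection and the on_closed callback are hypothetical names, not BlackBull's real hook API.

```python
import asyncio


async def run_connection(binding, conn, on_closed) -> None:
    """Serve one connection and always report its close, whatever the protocol."""
    try:
        await binding.serve(conn)
    finally:
        # Runs on normal return, on handler errors, and on cancellation,
        # so no binding can forget to emit the event.
        on_closed(conn)
```

Because every protocol goes through the same serve() call, the teardown hook sits in exactly one place instead of one per protocol family.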

The headline payoff is forward-looking: a new protocol now registers a binding
and changes zero lines of ConnectionActor. gRPC, Redis RESP, a raw TCP echo —
each is a ProtocolBinding, not a dispatcher edit.
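
As an illustration of what "registers a binding" could look like, here is a toy Redis RESP binding that answers PING. It follows the detect()/serve() shape shown earlier, but the conn.read and conn.write methods are assumptions (the post does not show ConnectionView's API), and FakeConn exists only to exercise the sketch.

```python
import asyncio
from typing import Optional


class RespPingBinding:
    """Hypothetical binding: claims RESP connections and answers PING."""

    detection_read = 1  # one byte suffices: RESP command arrays start with '*'

    def detect(self, prefix: bytes, alpn: Optional[str]) -> bool:
        return alpn is None and prefix[:1] == b"*"

    async def serve(self, conn) -> None:
        # conn.read / conn.write are assumed; the real ConnectionView may differ.
        request = await conn.read(1024)
        if request.upper() == b"*1\r\n$4\r\nPING\r\n":
            await conn.write(b"+PONG\r\n")
        else:
            await conn.write(b"-ERR unsupported\r\n")


class FakeConn:
    """In-memory stand-in for a connection, for demonstration only."""

    def __init__(self, incoming: bytes) -> None:
        self._incoming = incoming
        self.sent = b""

    async def read(self, n: int) -> bytes:
        out, self._incoming = self._incoming[:n], self._incoming[n:]
        return out

    async def write(self, data: bytes) -> None:
        self.sent += data
```

Everything protocol-specific, from the one-byte detection need to the reply format, lives inside the binding.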

The part that actually mattered: breaking nothing

Rewriting the code that decides what every connection is — under an HTTP server
that already passes h2spec and Autobahn — is the kind of change that quietly
introduces a regression you discover in production six weeks later. So the
refactor was gated on two things, both built before the dispatcher was
touched:

A regression oracle. A golden-path test suite (test_connection_dispatch_golden.py) pins the exact detection outcome for every protocol/ALPN/prefix combination — HTTP/1.1 cleartext, HTTP/2 prior knowledge, ALPN h2, WebSocket upgrade, the slowloris timeout. The refactor had to keep every one of those outcomes byte-identical. The oracle ran red first (proving it tests something), then stayed green through the change.
A performance gate. The detection path is on every connection, so a careless rewrite taxes every request. The change was held to a ±2% budget against the pre-refactor HTTP/1.1 throughput baseline, measured on the HttpArena suite.
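
Both gates reduce to small, mechanical checks. The sketch below shows the idea only: classify is a toy detector standing in for the real dispatcher, the GOLDEN table is not the contents of test_connection_dispatch_golden.py, and the throughput numbers in the usage note are made up.

```python
from typing import Optional

H2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

# (prefix, alpn, expected outcome): illustrative rows, not the real suite.
GOLDEN = [
    (b"GET / HTTP/1.1\r\n", None, "http/1.1"),
    (H2_PREFACE, None, "h2"),
    (b"", "h2", "h2"),
    (b"\x10\x0c\x00\x04MQTT", None, "unknown"),  # start of an MQTT CONNECT packet
]


def classify(prefix: bytes, alpn: Optional[str]) -> str:
    """Toy detector standing in for the dispatcher under test."""
    if alpn == "h2" or prefix == H2_PREFACE:
        return "h2"
    if prefix.endswith(b"\r\n") and b" HTTP/1.1" in prefix:
        return "http/1.1"
    return "unknown"


def golden_failures(detector) -> list:
    """Return every golden row whose outcome changed; empty means no regression."""
    failures = []
    for prefix, alpn, expected in GOLDEN:
        actual = detector(prefix, alpn)
        if actual != expected:
            failures.append((prefix, alpn, expected, actual))
    return failures


def within_budget(baseline_rps: float, candidate_rps: float,
                  tolerance: float = 0.02) -> bool:
    """True if throughput stayed within +/- tolerance of the baseline."""
    return abs(candidate_rps - baseline_rps) <= tolerance * baseline_rps
```

For example, with a made-up baseline of 100,000 requests per second, a run at 98,500 passes the 2% gate and a run at 97,000 fails it.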

The result cleared both bars. HttpArena's validation run came back unchanged —
47 passed / 0 failed, the full suite, plus 7/7 WebSocket — and HTTP/1.1
throughput held within the ±2% budget against baseline. No regression. The
foundation was rebuilt under the house and not one wall cracked.

Why tell this story before the MQTT one?

Because the MQTT broker (part 2) is the easy part. It's a self-contained
extension. The hard, valuable engineering was making BlackBull's core able to
receive a second protocol without the core learning anything HTTP-specific
about it — and proving the existing HTTP behavior survived the operation
unchanged.

Next: the broker itself — a publish/subscribe engine with no locks and a
single owner, built entirely on top of this seam.

Source: github.com/TOKUJI/BlackBull
Docs: tokuji.github.io/BlackBull