Thuangf45

Posted on
Stop Thinking of HTTP as Request/Response. It's a Universal Data Layout — and It's Faster Than Binary Protocol.

Everyone "knows" binary protocols are faster than HTTP.

I used to believe that too. Until I stopped looking at HTTP as a wire protocol and started looking at it as what it actually is — a layout engine for the CPU.

That reframe changed everything.


0. The Mental Model That's Costing You Performance

The industry standard narrative goes like this:

"Binary protocols are fast because they're compact and machine-readable. HTTP is slow because it's human-readable text."

On the surface, it sounds reasonable. So engineers reach for Protocol Buffers, MessagePack, custom binary frames — anything to "get away from HTTP overhead."

But here's the question nobody asks: What exactly is the CPU doing when it parses your binary protocol?

Let's answer that honestly.


1. How Your CPU Actually Reads Data

Modern CPUs don't read byte by byte. They don't read int by int. They read 128 to 512 bits at a time — thanks to SIMD (Single Instruction, Multiple Data) registers. AVX2 can scan 256 bits per instruction. AVX-512 does 512 bits. The hardware wants to eat data in large chunks and run fast.
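To make that concrete: in .NET, span searches like `IndexOf` are internally vectorized on supported hardware, so a single call sweeps the buffer in wide chunks. A minimal standalone sketch (not part of HttpModel):

```csharp
using System;

// Span.IndexOf is internally vectorized on supported CPUs:
// one call sweeps the buffer in wide chunks, not byte by byte.
ReadOnlySpan<byte> buffer = "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"u8;
int crlf = buffer.IndexOf("\r\n"u8);   // hardware-speed pattern search
Console.WriteLine(crlf);               // prints 14 — the first delimiter's offset
```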

Now look at what a typical binary protocol demands:

// Hypothetical fixed binary protocol — 20 fields
[offset 0,  size 4]  = MessageType
[offset 4,  size 4]  = SequenceId
[offset 8,  size 2]  = Version
[offset 10, size 2]  = Flags
[offset 12, size 4]  = Timestamp
[offset 16, size 8]  = UserId
// ... 14 more fields

To read this, the CPU must:

  1. Go to offset 0, read 4 bytes → MessageType
  2. Go to offset 4, read 4 bytes → SequenceId
  3. Go to offset 8, read 2 bytes → Version
  4. ...repeat 17 more times

That's 20 sequential reads, each guided by hardcoded human-defined offsets. The CPU is operating at human speed — one field at a time, exactly as the programmer specified.

And if those 20 fields are pointers rather than inline values? The CPU reads a pointer at offset N, then jumps to a completely different memory location to fetch the actual value. Minimum 40 memory operations — each with potential cache misses. The pipeline stalls. Performance collapses.

A binary protocol forces a 512-bit-capable CPU to act like a human reading a checklist.


2. What HttpModel Does Instead

HttpModel is not HTTP/1.1. Let me be clear about this upfront.

RequestModel and ResponseModel are just two specific instances of HttpModel. The model itself is general. Its structure is:

[Token1] [Token2] [Token3]\r\n
[Key]: [Value]\r\n
[Key]: [Value]\r\n
\r\n
[Body]

That's it. A start-line with three tokens, an arbitrary number of key-value header pairs, and a body. Nothing is fixed. Nothing is locked.

For HTTP requests, the three tokens happen to be Method, URL, Protocol. For HTTP responses, they're Protocol, StatusCode, StatusPhrase. For a game server? They could be Hp, Atk, Def. For a custom RPC? They could be ServiceName, MethodName, RequestId. The layout is yours to fill.

This is what I mean by HttpModel as a universal data layout.


3. The Parse That Runs at Machine Speed

Now here's where it gets interesting. When HttpModel receives bytes, this is what happens:

// From ReceiveHeader() in HttpModel.cs
var headerEnd = Lucifer.IndexOf(span, CrLfCrLf);      // One SIMD scan
var firstLineEnd = Lucifer.IndexOf(span, CrLf);         // One SIMD scan

// Split start-line into three tokens
Lucifer.TrySplitAt(startLine, Space, out var first, out var rest1);
Lucifer.TrySplitAt(rest1, Space, out var second, out var third);

// Mark positions — no allocation
_first  = new Position(offset, length);
_second = new Position(offset, length);
_third  = new Position(offset, length);

The CPU runs IndexOf on the full buffer using SIMD. It doesn't inspect each byte individually — it sweeps 256 or 512 bits at a time looking for \r\n. The delimiter pattern is simple enough that the hardware can detect it at maximum throughput.

What does "marking positions" mean? It means storing an (offset, size) pair — two integers — pointing into the existing buffer. No string is created. No object is allocated. No copy happens. The data lives where it landed in the receive buffer.

// Position is just two ints
private Position _first;   // (Offset: 0, Size: 3)   → "GET"
private Position _second;  // (Offset: 4, Size: 1)   → "/"
private Position _third;   // (Offset: 6, Size: 8)   → "HTTP/1.1"

Accessing any field is a ReadOnlySpan<byte> slice — zero cost, zero allocation:

public ReadOnlySpan<byte> FirstSpan
{
    get => Cache.AsSpan(_first.Offset, _first.Size);  // span slice, no copy
}

4. Binary Protocol vs. HttpModel — The Real Comparison

Let me make this concrete.

Binary protocol, 20 fixed fields:

  • CPU reads field 1 at offset 0
  • CPU reads field 2 at offset 4
  • ...
  • CPU reads field 20 at offset N
  • 20 sequential operations, pace set by the programmer

Binary protocol, 20 pointer fields:

  • CPU reads pointer at offset 0, jumps to memory address X to get value
  • CPU reads pointer at offset 8, jumps to memory address Y to get value
  • ...
  • 40+ operations, many with cache misses, pipeline stalls

HttpModel, 20 headers:

  • CPU sweeps the entire buffer in one SIMD pass
  • \r\n delimiters are found; positions are marked
  • One fast scan, pace set by the hardware

Then the human reads:

// You asked for 3 specific headers. You jump to exactly those positions.
// You read zero-copy span slices. The other 17 headers? Never touched.
model.TryGetHeader(0, out var contentType, out var ctValue);
model.TryGetHeader(5, out var userId, out var userIdValue);
model.TryGetHeader(11, out var requestId, out var reqIdValue);

This is the key insight:

Parse is the machine's job. Read is the human's job. Don't mix them.

Binary protocol conflates the two. The programmer decides what to read, and that decision dictates how the CPU must parse. The machine works at human pace.

HttpModel separates them. The machine scans everything at full hardware speed — SIMD, no branching, no tiny offset math. The programmer then reads only what they need, from marked positions, with zero allocation.
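The separation can be sketched in a few lines of standalone C# (illustrative only — not the LuciferCore API): one forward sweep marks (offset, size) pairs, then reads are lazy slices from those marks.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Parse: one forward sweep; each IndexOf call is a vectorized scan.
ReadOnlySpan<byte> buf = "Hp: 80\r\nAtk: 120\r\nDef: 60\r\n"u8;
var marks = new List<(int Offset, int Size)>();
int pos = 0;
while (pos < buf.Length)
{
    int rel = buf[pos..].IndexOf("\r\n"u8);
    if (rel < 0) break;
    marks.Add((pos, rel));        // mark position: two ints, nothing copied
    pos += rel + 2;               // skip past the delimiter
}

// Read: slice only the field you need; the bytes never move.
// (GetString here is just for display.)
var (off, size) = marks[1];
Console.WriteLine(Encoding.ASCII.GetString(buf.Slice(off, size)));  // prints "Atk: 120"
```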


5. Zero Allocation by Architecture

Most parsers produce objects. You send bytes in, you get a ParsedMessage struct out — with strings, arrays, boxed values, GC pressure.

HttpModel produces nothing. The parsed "result" is just a set of (offset, size) pairs sitting in the same memory as the received bytes.

// No alloc. Just two ints per token/header, pointing into Cache.
internal List<(Position, Position)> _headers = [];

// Reading is a span slice — no heap involved
public bool TryGetHeader(int i, out ReadOnlySpan<byte> key, out ReadOnlySpan<byte> value)
{
    key   = _headers[i].Item1.GetData(Cache);   // span slice
    value = _headers[i].Item2.GetData(Cache);   // span slice
    return true;
}

This is what Buffer-Model Architecture means. The buffer is the source of truth. The model is a set of rules — offsets and sizes — layered on top. No materialization. No copying. Virtualized access to the same memory region.
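The idea fits in a few standalone lines (a sketch of the concept, not the actual LuciferCore types): the buffer holds the bytes; the "model" is just an (offset, size) pair layered on top.

```csharp
using System;
using System.Text;

// Sketch of the Buffer-Model idea: the buffer is the source of truth;
// the model is (offset, size) rules layered on top of it.
byte[] cache = "GET / HTTP/1.1"u8.ToArray();
(int Offset, int Size) first = (0, 3);   // the "Position": two ints, no object

// Virtualized access: a span slice into the same memory region, no copy.
ReadOnlySpan<byte> data = cache.AsSpan(first.Offset, first.Size);
Console.WriteLine(Encoding.ASCII.GetString(data));  // prints "GET"
```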

Even the Clone() operation is clean:

// Clone copies the cache bytes once, then shares position metadata
clone.Cache.Append(Cache.AsSpan());   // one memcpy
clone._first   = _first;              // two ints
clone._second  = _second;             // two ints
clone._headers = [.. _headers];       // list of int pairs

6. Unlimited Extensibility — The Part Binary Protocol Can Never Match

Here's the thing about a fixed binary schema: it's fixed. If you need a new field, you version the protocol, update all clients, deploy everywhere, handle backward compatibility. It's an engineering project just to add a field.

HttpModel has no schema. Adding a header is one line:

model.SetHeader("X-Game-Season"u8, "3"u8);
model.SetHeader("X-Player-Guild"u8, "ShadowBlade"u8);
model.SetHeader("X-Latency-Budget-Ms"u8, "50"u8);

These headers exist if present. If absent, they're absent. No version bump. No migration. No client update. The layout is infinitely extensible because it is not a schema — it's a pattern.

The same parser handles all of it. One function. ReceiveHeader() doesn't care how many headers you have or what they're called. It scans, marks, returns.


7. Nested Models — One Parser for Everything

Here's a capability that surprises people: HttpModel supports nesting.

The body of any HttpModel can itself contain one or more HttpModel instances. Each sub-model follows the same structure — start-line, headers, body. The body of a sub-model can contain further sub-models.

[Root Model]
  Token1: BatchRequest
  Token2: /api/game/sync
  Token3: v2
  Content-Type: multipart/model
  Model-Count: 3
  \r\n
  [Sub-Model 1]
    Hp: 80
    Atk: 120
    Def: 60
    \r\n
    [body data]
  [Sub-Model 2]
    ...
  [Sub-Model 3]
    ...

The same ReceiveHeader() / position-marking logic applies at every level. You don't write a new parser per payload type. You write one parser and reuse it recursively.

This means: one TCP connection, one buffer, one parse pass, heterogeneous payload types, multiplexed — and zero allocation on the parse side.
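A toy sketch of the recursion (a hypothetical helper, not the LuciferCore API): each level finds its own header terminator with one scan, then the same function descends into the body.

```csharp
using System;

// Count header blocks at every nesting level with one reusable function.
static int CountHeaderBlocks(ReadOnlySpan<byte> buf)
{
    int end = buf.IndexOf("\r\n\r\n"u8);            // one scan per level
    if (end < 0) return 0;                          // leaf body: no header block
    return 1 + CountHeaderBlocks(buf[(end + 4)..]); // recurse into the body
}

ReadOnlySpan<byte> msg = "Model-Count: 1\r\n\r\nHp: 80\r\nAtk: 120\r\n\r\ndata"u8;
Console.WriteLine(CountHeaderBlocks(msg));  // prints 2: root headers + one sub-model
```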


8. Real-World Demo: RequestModel and ResponseModel

The two most familiar instances of HttpModel are RequestModel and ResponseModel. Here's how they work in practice:

Building a request:

using var req = Lucifer.Rent<RequestModel>();

req.SetBegin("POST"u8, "/api/score/submit"u8)
   .SetHeader("X-Player-Id"u8, "player_98421"u8)
   .SetHeader("X-Season"u8, "3"u8)
   .SetBody("{\"score\":9800,\"level\":42}"u8);

// req.Cache is now the complete HTTP-formatted bytes
// ready to send over the wire — no serialization, no allocation

Building a response:

using var res = Lucifer.Rent<ResponseModel>();

res.SetBegin(200)
   .SetHeader("X-Request-Id"u8, requestId)
   .SetBody("{\"accepted\":true}"u8);

Parsing incoming bytes — the zero-alloc path:

// Incoming raw bytes land in a receive buffer
// ReceiveHeader() does one SIMD scan, marks positions, returns
bool headersDone = model.ReceiveHeader(buffer, offset, size);

// Now read only what you need — zero copy, zero alloc
var method  = model.MethodSpan;    // span slice into receive buffer
var url     = model.UrlSpan;       // span slice
var body    = model.BodySpan;      // span slice

// Need a specific header?
model.TryGetHeader(i, out var key, out var value);  // span slices

The tokens can be anything. This is what makes HttpModel general:

// For a game server session protocol:
model.SetBegin("CONNECT"u8, "room_44"u8, "GameProto/1"u8);

// For a pub/sub event stream:
model.SetBegin("PUBLISH"u8, "topic/sensor/temp"u8, "EventStream/2"u8);

// For a custom RPC:
model.SetBegin("CALL"u8, "UserService.GetProfile"u8, "RPC/1"u8);

Same model. Same parser. Same zero-alloc path. Different semantics — yours.


9. The Truth Nobody Says Out Loud: HttpModel IS a Binary Protocol

Here's the reframe that changes everything.

People draw a sharp line: "HTTP is text. Binary protocols are bytes."

That line is wrong.

HttpModel is all bytes. The receive buffer is raw bytes. The Position struct is two integers pointing into raw bytes. ReadOnlySpan<byte> accesses raw bytes. There is no string, no text, no Unicode anywhere in the hot path. HttpModel is a binary protocol.

The only difference from a "traditional" binary protocol is this:

|        | Traditional Binary    | HttpModel                         |
| ------ | --------------------- | --------------------------------- |
| Offset | Hardcoded constant    | Computed dynamically by SIMD scan |
| Size   | Hardcoded constant    | Computed dynamically by delimiter |
| Schema | Fixed at compile time | None — unlimited headers          |
| Fields | N fixed fields        | Infinite                          |

Traditional binary protocol: Offset = constant. Carved into the code at design time.

HttpModel: Offset = variable. Computed by the CPU at maximum hardware speed via delimiter scanning.

Same concept. Dynamic execution. Zero schema constraint.

This is the real definition of HttpModel: a binary protocol with variable offset/size, where the CPU computes the layout at runtime instead of the programmer hardcoding it at compile time.
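Side by side in standalone C# (illustrative, with hypothetical field names): the fixed protocol bakes the offset into the code; the delimiter-based layout computes it at runtime.

```csharp
using System;
using System.Buffers.Binary;

// Fixed binary protocol: the offset is a compile-time constant.
ReadOnlySpan<byte> fixedMsg = stackalloc byte[] { 0x2A, 0, 0, 0, 7, 0, 0, 0 };
int messageType = BinaryPrimitives.ReadInt32LittleEndian(fixedMsg);      // offset 0, hardcoded
int sequenceId  = BinaryPrimitives.ReadInt32LittleEndian(fixedMsg[4..]); // offset 4, hardcoded

// Delimiter-based layout: the offset/size pair is computed by a scan.
ReadOnlySpan<byte> delimited = "42 7 Proto/1\r\n"u8;
int space = delimited.IndexOf((byte)' ');           // boundary found at runtime
ReadOnlySpan<byte> firstToken = delimited[..space]; // (offset 0, size 2) → "42"

Console.WriteLine($"{messageType} {sequenceId} {space}");  // prints "42 7 2"
```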


10. Why HTTP Got a Bad Reputation — and Who's Actually Responsible

If HttpModel is this fast and this flexible, why does everyone say HTTP is slow?

Because the world turned it into an OOP nightmare.

Look at what a standard framework does when an HTTP request arrives:

// The OOP way — what most frameworks actually do
HttpContext context = new HttpContext(request);         // heap allocation
HttpRequest req     = new HttpRequest(context);         // heap allocation
Dictionary<string, string> headers = new();             // heap allocation
foreach (var header in rawHeaders)
{
    string key   = Encoding.UTF8.GetString(keyBytes);   // heap allocation
    string value = Encoding.UTF8.GetString(valueBytes); // heap allocation
    headers[key] = value;                               // heap allocation
}
string body = await new StreamReader(req.Body).ReadToEndAsync(); // heap allocation
MyDto dto   = JsonSerializer.Deserialize<MyDto>(body);           // heap allocation

Every single line allocates on the heap. One HTTP request → dozens of objects → GC pressure → slowdown.

Then engineers benchmark this and conclude: "HTTP is slow. Binary protocol is faster."

That conclusion is comparing the wrong things. They are not comparing HTTP vs. binary protocol. They are comparing OOP HTTP vs. DOD binary protocol. The variable being changed is not the wire format — it is the programming paradigm.

Flip it. Implement the binary protocol in OOP style:

// Binary protocol, OOP style — equally slow
var message = new Message();
message.Type      = BitConverter.ToInt32(buffer, 0);
message.UserId    = BitConverter.ToInt64(buffer, 4);
message.Timestamp = BitConverter.ToInt64(buffer, 12);
message.Name      = Encoding.UTF8.GetString(buffer, 20, nameLen); // heap
// ...20 more fields, all materialized into object properties

Same allocation pattern. Same GC pressure. Same performance collapse. Because the bottleneck was never the wire format. The bottleneck was object allocation.

Now implement HttpModel in DOD style — which is exactly what LuciferCore does — and the allocation count drops to zero. No strings. No dictionaries. No objects. Just two integers per field, pointing into a buffer that already exists.

HTTP was never slow. The OOP wrapper around it was slow. Those are not the same thing.

The binary protocol community earned its performance reputation by adopting DOD early — no alloc, no copy, span-based access. HttpModel takes that same DOD discipline and applies it to a layout that is infinitely more extensible. You get the speed of DOD binary protocol. You get the freedom of an unlimited schema. You get both, simultaneously.


11. Layout Is Not Just a Frontend Concept

We talk about "layout" in UI all the time — flex, grid, constraints, templates.

Backend engineers rarely use that word. But a data layout is precisely what HttpModel provides. It's a template with named slots: three start-line tokens, N key-value pairs, a body. Your job is to fill the slots. The model handles everything else — parsing, position tracking, span access, memory management.

Frontend has layout engines. Backend has binary protocols. HttpModel brings layout thinking to the backend — and that's why it parses at machine speed.

The layout doesn't constrain you. It liberates you. You define what the three tokens mean. You define what headers exist. You define the body format. The infrastructure never changes — only the semantics you layer on top.


12. Summary: Let the Machine Do Machine Work

Binary protocols make the programmer define exactly how the CPU reads memory. This is the CPU working at human speed — structured by human decisions, capped by human granularity.

HttpModel inverts this. The CPU scans at full hardware throughput, guided only by delimiters. The programmer reads from marked positions, on demand, touching only what they need.

|               | OOP HTTP (frameworks)          | DOD Binary Protocol      | HttpModel (DOD)                 |
| ------------- | ------------------------------ | ------------------------ | ------------------------------- |
| Parse unit    | Deserialize to object          | Per field (fixed offset) | Per delimiter (SIMD scan)       |
| Parse speed   | Slow (alloc-bound)             | Fast                     | Machine-paced                   |
| Allocation    | Massive (string, dict, object) | Low                      | Zero by architecture            |
| Extensibility | Limited by object model        | Schema change required   | Add a header, done              |
| Nesting       | Framework-dependent            | Requires new parser      | Recursive, one parser           |
| Universality  | HTTP semantics only            | Fixed per protocol       | Tokens are yours to define      |
| Offset/Size   | Object properties              | Hardcoded constants      | Dynamically computed at runtime |

The core philosophy:

HttpModel is a binary protocol. The offset and size are not hardcoded by the programmer — they are computed by the CPU at full SIMD speed. That is the only difference. And that difference gives you infinite extensibility for free.

Parse is the CPU's job. Do it at CPU speed — all at once, no human-defined order.
Read is the programmer's job. Do it at human speed — lazily, only what you need.

Binary protocol merges these two jobs. OOP HTTP drowns both of them in allocation. HttpModel keeps them separate and lets each run at its natural speed.


Implementation

Everything described here is implemented in LuciferCore: HttpModel, RequestModel, ResponseModel, Buffer, Position, and the full Buffer-Model Architecture.

"Let the machine scan. Let the human choose."

Top comments (3)

Thuangf45

Worth clarifying: HttpModel here is not tied to HTTP/1.1 the transport protocol. It's the layout pattern — start-line, headers, body, delimited by \r\n. HTTP/2 and HTTP/3 changed the framing layer, but the layout thinking described here operates one level below that. You can apply this same DOD approach regardless of which version sits underneath.

Thuangf45

FlatBuffers and Cap'n Proto are excellent — and yes, they achieve zero-copy too. The key difference is schema flexibility. Those formats require a compiled schema and a version contract. HttpModel requires neither. You add a header in one line, no codegen, no recompile, no client update. The zero-alloc guarantees are comparable; the extensibility is not.

Thuangf45

Raw Span gives you the memory access but not the structure. You still need to decide how to scan, where delimiters are, which offsets map to which fields, and how to handle nesting. HttpModel is that structure — a reusable, zero-alloc layout engine built on top of spans. It's the difference between having a fast car and having a road to drive it on.