Anapeksha Mukherjee

Posted on Jun 11

What Designing a Binary Protocol Actually Taught Me

#database #rust #distributedsystems #architecture

Most developers never have to design a network protocol from scratch. You use HTTP, gRPC, WebSockets, or something else that already exists and has been debugged by thousands of people over many years. That is the right call for most situations.

I did not take that path when building Vaylix, a key-value database engine. I designed a custom binary protocol called VTP2, and the process taught me things about networking that I would not have picked up any other way.

This is not an argument that you should also build a custom protocol. For most things, you should not. This is an honest account of what I ran into.

Why not HTTP

The first question anyone reasonably asks is: why not just use HTTP?

HTTP is everywhere. The tooling is excellent. Every language has a client. Debugging with curl is trivial. If I had used HTTP, I would have had working client libraries in a dozen languages before writing a single line of server code.

The problem is that HTTP is stateless by design. Every request is independent. Every request carries headers. Every response carries headers. The model assumes that each round trip is a fresh conversation with no memory of what came before.

A database session is the opposite of that. A client connects, authenticates, and then issues many commands over the same connection. The authentication should happen once. The session should carry state. Pipelining requests without waiting for each response to return should be natural, not something you fight the protocol to achieve.

HTTP/2 closes some of this gap. But using HTTP/2 correctly for a stateful session model involves working against the grain of what HTTP was designed for. I would have been spending a lot of time on infrastructure that exists to make HTTP behave less like HTTP.

The other issue is overhead. HTTP headers are verbose. For small key-value operations, the headers can easily exceed the payload. That felt wrong for something designed to be a tight operational data store.

So I went with TCP directly, with a custom framing layer on top.

The first thing TCP teaches you

TCP is a stream. Not a sequence of messages. A stream.

When a client sends two requests back to back, the server cannot assume they arrive as two separate chunks of bytes. They might arrive together. They might arrive in three pieces. One might arrive whole while the other is split across two read calls.

The first real problem a custom protocol has to solve is: where does one message end and the next begin?

The standard answer is a length-prefixed frame. Every message starts with a fixed-size header that includes the length of the payload that follows. The receiver reads the header, learns how long the body is, reads exactly that many bytes, and now has one complete message.

+--------+-------+---------+------ ... ------+
| magic  |  ver  |  flags  |    payload      |
| 4 bytes| 1 byte| 2 bytes | length bytes    |
+--------+-------+---------+------ ... ------+
         |                 |
         header (fixed)    body (variable)

Simple in theory. The implementation detail that catches you is the partial read. If you ask for 16 bytes and only 10 arrive, you wait. If you ask for 10,000 bytes and the connection closes after 9,000, you have a truncated frame. Both of these are normal TCP behavior and your parser needs to handle them without panicking or blocking forever.

In Rust with Tokio this is manageable, but it requires explicit handling. You cannot just call read() and assume the full frame arrives.
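A minimal sketch of that explicit handling, using only the standard library so the buffering logic stands on its own. The 4-byte big-endian length prefix, the `MAX_FRAME` limit, and every name here are assumptions for illustration, not VTP2's real header layout. The decoder accepts whatever each read returned and only yields a payload once the whole frame is buffered.

```rust
/// Hypothetical layout for this sketch: 4-byte big-endian length, then payload.
const LEN_PREFIX: usize = 4;
/// Hypothetical upper bound so a corrupt length cannot exhaust memory.
const MAX_FRAME: usize = 1 << 20;

#[derive(Debug, PartialEq)]
enum FrameError {
    TooLarge(usize),
    Truncated { buffered: usize },
}

/// Accumulates bytes from successive reads and yields complete payloads.
#[derive(Default)]
struct FrameDecoder {
    buf: Vec<u8>,
}

impl FrameDecoder {
    /// Append whatever a single read returned, however small.
    fn push(&mut self, chunk: &[u8]) {
        self.buf.extend_from_slice(chunk);
    }

    /// Ok(Some(payload)) when a whole frame is buffered, Ok(None) when more bytes are needed.
    fn next_frame(&mut self) -> Result<Option<Vec<u8>>, FrameError> {
        if self.buf.len() < LEN_PREFIX {
            return Ok(None); // header itself is still incomplete
        }
        let len =
            u32::from_be_bytes([self.buf[0], self.buf[1], self.buf[2], self.buf[3]]) as usize;
        if len > MAX_FRAME {
            return Err(FrameError::TooLarge(len));
        }
        let total = LEN_PREFIX + len;
        if self.buf.len() < total {
            return Ok(None); // partial body: wait for the next read
        }
        let payload = self.buf[LEN_PREFIX..total].to_vec();
        self.buf.drain(..total); // keep any bytes that belong to the next frame
        Ok(Some(payload))
    }

    /// Call when the peer closes the connection: leftover bytes mean a truncated frame.
    fn finish(&self) -> Result<(), FrameError> {
        if self.buf.is_empty() {
            Ok(())
        } else {
            Err(FrameError::Truncated { buffered: self.buf.len() })
        }
    }
}

/// Build a frame in the same hypothetical layout.
fn encode(payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(LEN_PREFIX + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    frame.extend_from_slice(payload);
    frame
}
```

The same shape carries over to async code: the read loop feeds `push` and drains `next_frame` until it returns `Ok(None)`, and a closed connection goes through `finish` so a truncated frame surfaces as an error instead of a hang.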

Versioning is a commitment you make to every future client

Once you have framing, the next thing you want is versioning. Not because you plan to break anything, but because you will, and you want a way to handle it gracefully when you do.

VTP2 includes the protocol version in every frame header. This sounds straightforward until you think about what compatibility actually means across versions.

A client built against protocol version 2 connects to a server running version 3. What should happen? There are two reasonable answers:

The server accepts the connection and negotiates down to the common version.
The server rejects the connection with a structured error explaining what versions it supports.

VTP2 uses a startup negotiation step. Before any command frames are exchanged, the client sends a hello frame with its protocol version, its client name and version, and the capabilities it wants to use. The server responds with what it accepts.

This means adding a new capability in a future version is safe because older clients simply do not request it. They get a connection without the new capability and everything works as before.
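The two outcomes listed earlier, accept at a common version or reject with a structured error, can be sketched as a single server-side decision. This assumes the server supports a contiguous range of protocol versions and the client announces exactly one; the type and function names are hypothetical and not taken from the Vaylix codebase.

```rust
/// What the server sends back after reading the client's hello (hypothetical shape).
#[derive(Debug, PartialEq)]
enum HelloReply {
    /// The connection proceeds using this protocol version.
    Accepted { version: u8 },
    /// Structured rejection that tells the client what the server can speak.
    Rejected { min_supported: u8, max_supported: u8 },
}

/// Accept the client's version if it falls inside the server's supported range.
fn negotiate(client_version: u8, server_min: u8, server_max: u8) -> HelloReply {
    if (server_min..=server_max).contains(&client_version) {
        HelloReply::Accepted { version: client_version }
    } else {
        HelloReply::Rejected {
            min_supported: server_min,
            max_supported: server_max,
        }
    }
}
```

Under these assumptions a version 2 client reaching a server that supports versions 2 through 3 gets a version 2 session, while a version 1 client gets a rejection that names the supported range instead of a dropped connection.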

What you cannot do without breaking things is change the meaning of an existing opcode or restructure an existing response format. Those are wire-breaking changes. The version number is the signal that something fundamentally changed.

I learned this the hard way. In an early version of VTP2, I changed the response format for EXEC to return structured typed results instead of a string list. That was a correctness improvement. It was also a silent breaking change for any client written to parse the old response format. That change is now a protocol version boundary: 0.2.x clients are not transaction-wire-compatible with 0.3.0 servers, and the changelog says so explicitly.

Request IDs are not optional when you pipeline

Early in the design, requests used a local counter for identification. It was simple. It was wrong.

When you pipeline requests over a single connection, you might have dozens of in-flight requests at the same time with responses arriving in any order depending on how long each operation takes. If two connections both generate request IDs from a local counter, they can collide. If one connection's counter resets, it can collide with itself.

VTP2 switched to UUIDs for request IDs. Every request carries a UUID. Every response echoes back the same UUID. The client correlates responses to requests using the UUID, not position.

This removes the ordering assumption entirely. Responses can arrive in any order. The client matches them correctly regardless.
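A minimal client-side sketch of that correlation, assuming the 16-byte ID is already generated elsewhere (for example by a UUID library) and that a pending request can be represented by a plain string. `RequestId` and `Pending` are names invented for this example.

```rust
use std::collections::HashMap;

/// 16 raw bytes, the size of a UUID. How the ID is generated is out of scope here.
type RequestId = [u8; 16];

/// Tracks requests that have been sent but not yet answered.
struct Pending {
    in_flight: HashMap<RequestId, String>,
}

impl Pending {
    fn new() -> Self {
        Pending { in_flight: HashMap::new() }
    }

    /// Record a request just before it is written to the socket.
    fn register(&mut self, id: RequestId, command: &str) {
        self.in_flight.insert(id, command.to_string());
    }

    /// Match a response to its request by the echoed ID, regardless of arrival order.
    /// Returns None for an ID that is unknown or already completed.
    fn complete(&mut self, id: &RequestId) -> Option<String> {
        self.in_flight.remove(id)
    }
}
```

Because lookup is by ID, a response to the second request can be completed before the first without any bookkeeping about position in the stream.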

The cost is 16 bytes per request and per response for the UUID. For the workloads Vaylix targets, that is irrelevant. For a high-throughput system doing millions of tiny operations per second, it might be worth revisiting. For coordination state, it is the right tradeoff.

Checksums are the difference between silent corruption and a caught error

A frame travels from the client through the OS, the network stack, maybe some middleware, and into the server. Bytes can be flipped. Not often. Not reproducibly. But it happens.

Without a checksum, a corrupted frame is processed as if it were valid. The server executes a command with wrong arguments, or writes garbage to the store, or produces a result nobody asked for. The error is silent and the consequences are unpredictable.

VTP2 includes a checksum in the frame header that covers the payload. If the checksum does not match, the frame is rejected before any processing happens. The client gets a structured error with the expected and actual checksum values. The server logs it. Nothing gets executed.

One subtlety: Vaylix uses zstd compression on outbound frames above a size threshold. The checksum validates the compressed payload, not the decompressed payload. This means corruption of the compressed bytes in transit is caught before decompression runs, but a compression or decompression bug that produces wrong bytes is not, because the checksum never sees the uncompressed data. That asymmetry is deliberate and documented, but it is the kind of thing that is easy to get backwards if you do not think it through.
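To make the reject-before-processing step concrete, here is a sketch that verifies the payload bytes exactly as they appear on the wire. The article does not say which checksum algorithm VTP2 uses, so CRC-32 (the common reflected polynomial 0xEDB88320) is an assumption chosen because it fits in a few lines; `verify` and `ChecksumMismatch` are hypothetical names.

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320). Assumed algorithm for this sketch.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Structured error carrying both values, so the client can report the mismatch.
#[derive(Debug, PartialEq)]
struct ChecksumMismatch {
    expected: u32,
    actual: u32,
}

/// Check the payload as received, before any decompression or command parsing.
fn verify(wire_payload: &[u8], expected: u32) -> Result<(), ChecksumMismatch> {
    let actual = crc32(wire_payload);
    if actual == expected {
        Ok(())
    } else {
        Err(ChecksumMismatch { expected, actual })
    }
}
```

Calling `verify` first means a frame with a flipped bit never reaches the command handler, and the error already holds the expected and actual values for the log line.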

Error codes need to be stable forever

Error handling is the part of protocol design that is easiest to under-invest in early and hardest to fix later.

The naive approach is to return error strings. A server returns "key not found" and the client parses the string. This works until you change the error message for any reason, which you will, and every client that pattern-matched the string silently breaks.

VTP2 uses structured errors with three fields: a stable numeric code, a stable string name, and a human-readable message that can change freely.

{
  code: 4001,
  name: "KEY_NOT_FOUND",
  message: "the key 'config:env' does not exist"
}

Client code matches on the numeric code or the name. The message is for humans debugging the problem. You can change the message text without breaking anything. You cannot change the code or the name without a versioned protocol change.
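A small client-side sketch of that rule. Only the 4001 / KEY_NOT_FOUND pair comes from the example above; the struct layout, the `ErrorKind` enum, and the `classify` function are assumptions made for illustration.

```rust
use std::fmt;

/// The three fields of a structured error as a client might decode them.
struct WireError {
    code: u16,
    name: String,
    message: String,
}

/// What client logic branches on. Derived from the stable code, never the message.
#[derive(Debug, PartialEq)]
enum ErrorKind {
    KeyNotFound,
    /// Any code this client does not know yet, kept so newer servers do not break it.
    Other(u16),
}

fn classify(err: &WireError) -> ErrorKind {
    match err.code {
        4001 => ErrorKind::KeyNotFound,
        other => ErrorKind::Other(other),
    }
}

/// The message only appears in human-facing output.
impl fmt::Display for WireError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} ({}): {}", self.name, self.code, self.message)
    }
}
```

Rewording the message changes what `Display` prints and nothing else: `classify` returns the same variant, so retry and fallback logic in the client keeps working.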

The error codes are now documented in ERROR_CODES.md and treated as a stability contract. Any code that shipped in a release will not be reused for a different failure class. Adding new codes is fine. Changing old ones is a breaking change.

This seems like a lot of discipline for a small project. It is. But the alternative is telling users that their error handling broke because I rewrote a string.

Capability negotiation solves the feature drift problem

As a protocol evolves, new features get added. Compression. Request deadlines. Trace context propagation. Metrics. Each of these is useful in some contexts and irrelevant or undesirable in others.

Hard-coding every feature into every connection creates two problems. Clients that do not need compression still pay its cost. Servers that add a new feature have no way to know which connected clients support it.

VTP2 uses capability negotiation in the startup hello. The client lists the capabilities it wants. The server lists what it accepts. The intersection is what the connection uses.

Current capabilities:

zstd — frame-level compression
request_deadline — per-request timeout propagation
server_metrics — server-side metric events
pipelining — explicit pipeline mode
trace_context — distributed trace ID propagation

Adding a new capability in a future release is safe because existing clients just do not request it. The server enables it only for clients that ask. There is no flag day where all clients must be updated simultaneously.
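The intersection rule is small enough to sketch directly. The capability names in the usage come from the list above; the function name and the use of plain string slices are assumptions, and the real hello frames presumably carry more than a list of names.

```rust
use std::collections::BTreeSet;

/// Capabilities active on a connection: those the client requested AND the server accepts.
fn negotiate_capabilities<'a>(requested: &[&'a str], accepted: &[&str]) -> BTreeSet<&'a str> {
    requested
        .iter()
        .copied()
        // Keep only the requested capabilities the server also lists.
        .filter(|cap| accepted.iter().any(|a| a == cap))
        .collect()
}
```

A client that asks for `zstd`, `pipelining`, and `trace_context` against a server that accepts `zstd`, `request_deadline`, and `pipelining` ends up with `zstd` and `pipelining`. An older client that never asks for a newer capability simply never gets it, which is the no-flag-day property described above.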

What I would do differently

The startup negotiation adds latency to connection establishment. One extra round trip before any commands can be sent. For long-lived connections this is a one-time cost and irrelevant. For workloads with short-lived connections, it adds up.

If I were starting over, I would think harder about whether the hello/server-hello round trip could be combined with the first command frame, or at least whether the client could send its first request without waiting for the server hello.

I also underestimated how heavy the per-language SDK burden would be. Every language binding starts from scratch with VTP2. There is no existing tooling, no existing parser, no existing test suite. A first-class TypeScript SDK exists now and a Go SDK is in progress, but each one is weeks of work that would have been avoided with RESP or gRPC.

Whether that tradeoff was worth it depends on what the protocol enables. For a system where the protocol needs to carry replication metadata, structured error codes, versioned CAS operations, and request deadlines on the same transport, building something that was designed for all of that from the start made the implementation cleaner than it would have been if those features had been retrofitted onto RESP.

But that is a judgment call that only makes sense in hindsight.

The short version

TCP gives you a stream, not messages. Framing is your problem.

Versioning is a commitment you make to every client that ever connects. Get it wrong early and you pay later.

Request IDs need to be globally unique if you pipeline.

Checksums catch corruption before it becomes a silent bug.

Error codes are forever. Treat them that way from the start.

Capability negotiation is the only sane way to evolve a protocol without breaking existing clients.

None of these are surprising in retrospect. Building VTP2 was the only way I was going to understand them properly.

VTP2 is the transport protocol powering Vaylix, an open source key-value database engine built for operational state that must survive crashes.

vaylix / vaylix

A key-value database engine in Rust. Custom binary protocol, RBAC, encrypted WAL persistence, Raft-style replication, TLS/mTLS, and binary-safe values with versioned compare-and-set.

Vaylix

Vaylix is a Rust key/value database built around a strict transport boundary:

client -> transport -> TCP/TLS -> transport -> server -> engine

The current server stores UTF-8 keys with opaque byte values using segmented WAL plus encrypted snapshot persistence. It includes a shared framed binary transport, a Tokio multi-client server, authentication with RBAC, optional TLS/mTLS, default-on frame compression, logical backup/restore commands, offline PITR-oriented storage subcommands, maintenance mode, hash-chained audit logging, and Raft-style HA replication with automatic leader election and quorum-backed writes.

Detailed architecture context lives in LLM.md. Benchmark guidance lives in BENCHMARKING.md. Stability and compatibility contracts live in STABILITY.md, COMPATIBILITY_1_0.md, ERROR_CODES.md, NON_GOALS.md, and DEPLOYMENT.md.

Downloads

Release binaries are published from tagged releases:

Server and client archives: https://github.com/vaylix/vaylix/releases
Server image: ghcr.io/vaylix/vaylix:latest
Versioned server image example: ghcr.io/vaylix/vaylix:0.9.0

Release builds also publish SBOMs and keyless Sigstore/cosign attestations.

Run with Docker

docker pull ghcr.io/vaylix/vaylix:latest
docker

…


Engine: github.com/vaylix/vaylix
TypeScript SDK: github.com/vaylix/vaylix-ts
Docs: vaylix.github.io