If you build web applications today, JSON is the air you breathe. It’s ubiquitous, human-readable and supported by every standard library on the planet. But there is a dark side to this convenience: JSON is an absolute CPU sink.
We’ve accepted a bizarre industry standard where we take perfectly structured, contiguous memory, blow it up into variable-length ASCII strings, hurl it across a network and force the receiving machine to burn precious clock cycles parsing those strings back into memory.
For 90% of standard CRUD apps the overhead is invisible. But when you start building low-latency systems, whether that's a decentralized time-sync tool, a terminal-based MIDI sequencer, or high-frequency inter-agent AI communication, JSON stops being a tool and starts being a liability.
Here is why I am leaving JSON behind for my next project and why you should seriously consider understanding custom binary serialization.
The Illusion of "Fast Enough" and the Cargo Cult of Schema
When developers realize JSON is too slow, the standard knee-jerk reaction is to reach for Protocol Buffers (usually via gRPC) or MessagePack.
These are massive improvements, absolutely. But they often come with a heavy philosophical tax. Protobuf requires dragging a massive toolchain into your project, compiling .proto files, and generating thousands of lines of boilerplate. It’s the "cargo cult" of schema—adopting enterprise-scale complexity when all you really need is raw, lean speed.
What if you just need to pass an 8-bit opcode and a tiny latent embedding between two local AI agents? Pulling in a massive framework for that is like using a sledgehammer to drive a thumbtack.
Enter Compact Binary: The Anatomy of BIT-S
In my recent work on the Binary Inter-agent Transport Schema (BIT-S), the goal was simple: achieve near zero-copy delta serialization without the bloat.
When you design a custom binary protocol, you get to strip away everything except the exact bytes you need. No curly braces, no quotation marks, no string-matching keys. Just pure data.
In BIT-S, the architecture is designed around fixed-width 8-bit opcodes and highly quantized latent embeddings. The payload maps perfectly to memory structures.
Let’s look at a practical example in Rust.
Imagine we need to send a command to an AI agent containing an instruction (opcode), a confidence score, and a small latent vector.
The JSON Approach
If we serialize this using a standard library like serde_json, the payload looks like this:
```json
{
  "op": 14,
  "conf": 0.985,
  "vec": [128, 64, 32, 16]
}
```
That single, simple message takes up roughly 55 bytes of network traffic as formatted above (43 bytes even when fully minified). Worse, to read it the receiver has to allocate memory dynamically, parse floats from strings, and match dictionary keys.
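You can check the minified figure yourself. The string below is what a minifying serializer like `serde_json::to_string` would produce for an equivalent struct (hand-written here so the sketch needs no external crates):

```rust
fn main() {
    // Minified equivalent of the pretty-printed payload above
    let json = r#"{"op":14,"conf":0.985,"vec":[128,64,32,16]}"#;
    println!("Minified JSON payload: {} bytes", json.len());
}
```

Even stripped of all whitespace, the payload is 43 bytes, and most of those bytes are key names, quotes, and brackets rather than actual data.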
The Custom Binary Approach (Rust)
Instead, let's map this directly to a lean binary format using Rust.
```rust
#[derive(Debug, PartialEq)]
struct AgentCommand {
    opcode: u8,      // 1 byte
    confidence: f32, // 4 bytes
    vector: [u8; 4], // 4 bytes
} // Total: 9 bytes

impl AgentCommand {
    // Serialize directly to a fixed-size byte array
    fn to_bytes(&self) -> [u8; 9] {
        let mut buffer = [0u8; 9];
        buffer[0] = self.opcode;
        buffer[1..5].copy_from_slice(&self.confidence.to_le_bytes());
        buffer[5..9].copy_from_slice(&self.vector);
        buffer
    }

    // Zero-allocation deserialization
    fn from_bytes(bytes: &[u8; 9]) -> Self {
        let mut conf_bytes = [0u8; 4];
        conf_bytes.copy_from_slice(&bytes[1..5]);
        let mut vec_bytes = [0u8; 4];
        vec_bytes.copy_from_slice(&bytes[5..9]);
        Self {
            opcode: bytes[0],
            confidence: f32::from_le_bytes(conf_bytes),
            vector: vec_bytes,
        }
    }
}

fn main() {
    let cmd = AgentCommand {
        opcode: 14,
        confidence: 0.985,
        vector: [128, 64, 32, 16],
    };
    let payload = cmd.to_bytes();
    println!("Binary Payload Size: {} bytes", payload.len());
    // Output: Binary Payload Size: 9 bytes
}
```
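It's worth sanity-checking the byte layout independently. This standalone sketch packs the same values with the same stdlib calls and confirms the float survives the round trip (an illustrative check, not part of the protocol itself):

```rust
fn main() {
    let opcode: u8 = 14;
    let confidence: f32 = 0.985;
    let vector: [u8; 4] = [128, 64, 32, 16];

    // Pack: [opcode][confidence as little-endian f32][vector] = 9 bytes
    let mut buf = [0u8; 9];
    buf[0] = opcode;
    buf[1..5].copy_from_slice(&confidence.to_le_bytes());
    buf[5..9].copy_from_slice(&vector);

    // Unpack and verify nothing was lost in transit
    let mut conf = [0u8; 4];
    conf.copy_from_slice(&buf[1..5]);
    assert_eq!(buf[0], opcode);
    assert_eq!(f32::from_le_bytes(conf), confidence);
    assert_eq!(&buf[5..9], &vector[..]);
    println!("Round trip OK: {} bytes total", buf.len());
}
```

One design note: both ends must agree on byte order. Using `to_le_bytes`/`from_le_bytes` pins the wire format to little-endian regardless of host architecture, so the same bytes decode identically everywhere.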
The Real-World Gains
By writing our own serialization we achieved a few critical things:
- Size Reduction: We dropped the payload from 55 bytes to 9 bytes. That is an 83% reduction in bandwidth.
- Zero Allocation: The `from_bytes` function requires exactly zero dynamic memory allocations (no `malloc` or `String` creation). It parses instantly.
- Predictability: Fixed-size structs mean we always know exactly how many bytes to read from our TCP stream before we have a complete message.
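That predictability point deserves a concrete illustration. Because every frame is exactly 9 bytes, a receiver can call `read_exact` and never deal with partial messages. A minimal sketch, where the `read_command` helper is hypothetical and an in-memory `Cursor` stands in for a real `TcpStream`:

```rust
use std::io::{Cursor, Read, Result};

// Read exactly one fixed-size frame from any byte stream.
// Works identically on a TcpStream, a file, or an in-memory buffer.
fn read_command<R: Read>(stream: &mut R) -> Result<[u8; 9]> {
    let mut frame = [0u8; 9];
    stream.read_exact(&mut frame)?; // errors if fewer than 9 bytes arrive
    Ok(frame)
}

fn main() -> Result<()> {
    // Two back-to-back frames on the "wire": opcode 14, then opcode 15
    let mut wire = Cursor::new(vec![
        14, 0, 0, 0, 0, 128, 64, 32, 16, // frame 1
        15, 0, 0, 0, 0, 1, 2, 3, 4, // frame 2
    ]);
    let first = read_command(&mut wire)?;
    let second = read_command(&mut wire)?;
    println!("opcodes: {} then {}", first[0], second[0]);
    Ok(())
}
```

No length prefix, no delimiter scanning: the frame boundary is a compile-time constant.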
Conclusion: Reclaiming the Stack
JSON is not going anywhere, and for public-facing web APIs, it remains the right tool for the job.
But as developers we need to stop treating text-based serialization as the default answer to every problem. As we move further into the era of agentic AI, edge computing and high-performance Rust backends, understanding how to pack a struct into a raw byte array isn't just a party trick—it’s an essential engineering skill.
Stop wasting CPU cycles parsing curly braces. Get closer to the metal.
You can check out the experimental work on lean, low-latency agent communication protocols at my GitHub repo.