
Manan Shukla


Serialization Showdown in Rust: JSON Was Fine Until It Wasn't

I want to tell you about the day I stopped defaulting to serde_json.

We had a service. It handled events — user actions, state changes, the usual stream of things that happen when people use software. We serialized everything to JSON, shoved it into Redis, published it to a message queue, sent it over HTTP. Standard stuff. serde_json everywhere. Life was fine.

Then the traffic grew. And somewhere around 40–50k events per second, our CPU graphs started looking... unhealthy. We profiled. And there it was — serialization. JSON serialization. Taking up a chunk of CPU time so embarrassingly large I'm not going to put the exact number in writing.

The fix wasn't complicated. But it required knowing that other options existed, what they traded off, and when each one actually made sense. Nobody had sat me down and explained that.

This post is that explanation. We're comparing JSON (serde_json), Protobuf (prost), and MessagePack (rmp-serde) across the things that actually matter: speed, payload size, ergonomics, and how gracefully they handle the inevitable moment when your schema needs to change.

With benchmarks. Obviously.


Why Serialization Is Worth Thinking About

Serialization is one of those things that's invisible when it's working and catastrophic when it isn't. You're doing it constantly — every API response, every Redis value, every message queue payload, every blob you stuff into a database column. It adds up.

The three axes that actually matter:

Speed — How fast can you turn a Rust struct into bytes and back? At low volumes this is irrelevant. At high volumes it's the difference between a happy CPU and a 4am incident.

Size — Smaller payloads mean less network bandwidth, lower Redis memory usage, cheaper cloud bills, and faster transmission. JSON is notoriously verbose. Binary formats are not.

Ergonomics — Can a human read it? Do you need a schema file? How painful is it to add a field six months later? Does your entire pipeline break if you rename something?

There's no single winner. There's only "right tool for the situation" — which is why this post exists.


JSON — The One You Already Know (serde_json)

JSON is the default. It's the default because it's human-readable, universally understood, requires zero coordination between teams, and works with every HTTP client, browser, and debugging tool on the planet. If you're building a public API, you're using JSON. Full stop.

In Rust, serde_json is the implementation and it's excellent. Add serde with the derive feature and you're two attributes away from serialization:

[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"

use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct UserEvent {
    user_id: u64,
    event_type: String,
    timestamp: u64,
    metadata: EventMetadata,
}

#[derive(Debug, Serialize, Deserialize)]
struct EventMetadata {
    ip_address: String,
    user_agent: String,
    session_id: String,
}

fn main() {
    let event = UserEvent {
        user_id: 12345,
        event_type: "page_view".to_string(),
        timestamp: 1_700_000_000,
        metadata: EventMetadata {
            ip_address: "192.168.1.1".to_string(),
            user_agent: "Mozilla/5.0".to_string(),
            session_id: "abc-123-xyz".to_string(),
        },
    };

    // Serialize to JSON string
    let json = serde_json::to_string(&event).unwrap();
    println!("JSON ({} bytes): {}", json.len(), json);

    // Deserialize back
    let decoded: UserEvent = serde_json::from_str(&json).unwrap();
    println!("Decoded user_id: {}", decoded.user_id);
}

For Redis storage, the pattern is equally clean:

use redis::Commands;

fn store_event(conn: &mut redis::Connection, event: &UserEvent) -> redis::RedisResult<()> {
    let json = serde_json::to_string(event).unwrap();
    // Annotate the return value — redis-rs's `set` is generic over it,
    // and without the `()` the compiler can't infer what to deserialize the reply as
    let _: () = conn.set(format!("event:{}", event.user_id), json)?;
    Ok(())
}

fn load_event(conn: &mut redis::Connection, user_id: u64) -> Option<UserEvent> {
    let json: String = conn.get(format!("event:{}", user_id)).ok()?;
    serde_json::from_str(&json).ok()
}

Serde Ergonomics Worth Knowing

serde has some attribute tricks that make JSON output much cleaner in practice:

#[derive(Serialize, Deserialize)]
struct ApiResponse {
    #[serde(rename = "userId")]          // camelCase for JS clients
    user_id: u64,

    #[serde(skip_serializing_if = "Option::is_none")]  // omit null fields
    avatar_url: Option<String>,

    #[serde(default)]                    // use Default::default() if field missing on deserialize
    is_premium: bool,
}

When JSON Is the Right Answer

Public APIs where clients are external or unknown. Config files humans need to edit. Any situation where debuggability matters more than performance. Log entries. Anything that needs to work with tools that don't speak binary.

The honest rule: use JSON until it hurts. Then look at the alternatives.


Protobuf — The Serious One (prost)

Protobuf is Google's binary serialization format and it's been battle-tested at a scale that's hard to comprehend. It's fast, compact, and strict — which means it's also more work to set up than slapping #[derive(Serialize)] on a struct.

The workflow is different. You define your schema in a .proto file first, then generate Rust code from it. The schema is the contract. This is a feature, not a bug — it forces you to be intentional about your data shape.

[dependencies]
prost = "0.12"

[build-dependencies]
prost-build = "0.12"

Create your schema in proto/events.proto:

syntax = "proto3";
package events;

message UserEvent {
    uint64 user_id = 1;
    string event_type = 2;
    uint64 timestamp = 3;
    EventMetadata metadata = 4;
}

message EventMetadata {
    string ip_address = 1;
    string user_agent = 2;
    string session_id = 3;
}

Wire up code generation in build.rs:

fn main() {
    prost_build::compile_protos(&["proto/events.proto"], &["proto/"]).unwrap();
}

Then use the generated types:

// Include the generated code
pub mod events {
    include!(concat!(env!("OUT_DIR"), "/events.rs"));
}

use events::{EventMetadata, UserEvent};
use prost::Message;

fn main() {
    let event = UserEvent {
        user_id: 12345,
        event_type: "page_view".to_string(),
        timestamp: 1_700_000_000,
        metadata: Some(EventMetadata {
            ip_address: "192.168.1.1".to_string(),
            user_agent: "Mozilla/5.0".to_string(),
            session_id: "abc-123-xyz".to_string(),
        }),
    };

    // Serialize to bytes
    let mut buf = Vec::new();
    event.encode(&mut buf).unwrap();
    println!("Protobuf ({} bytes)", buf.len());

    // Deserialize back
    let decoded = UserEvent::decode(buf.as_slice()).unwrap();
    println!("Decoded user_id: {}", decoded.user_id);
}

Redis storage with Protobuf is just storing raw bytes:

fn store_event(conn: &mut redis::Connection, event: &UserEvent) -> redis::RedisResult<()> {
    let mut buf = Vec::new();
    event.encode(&mut buf).unwrap();
    // Same inference note as the JSON version: annotate `set`'s generic return value
    let _: () = conn.set(format!("event:{}", event.user_id), buf)?;
    Ok(())
}

fn load_event(conn: &mut redis::Connection, user_id: u64) -> Option<UserEvent> {
    let bytes: Vec<u8> = conn.get(format!("event:{}", user_id)).ok()?;
    UserEvent::decode(bytes.as_slice()).ok()
}

Not human-readable in Redis. You will stare at garbled bytes in your Redis CLI at some point. That's the trade.

When Protobuf Is the Right Answer

Internal services where both sides own the schema. gRPC APIs (Protobuf is the native format — using anything else would be weird). High-throughput pipelines where every byte and microsecond counts. Anywhere you want a compile-time guarantee that both sides agree on the data shape.


MessagePack — The Middle Ground (rmp-serde)

MessagePack is what you get if you take JSON's data model and compress it into a binary format. Same types, same flexibility, no human readability, smaller and faster.

The best part: if you're already using serde, switching to MessagePack is almost zero effort. You don't write a schema file. You don't run a code generator. You just change the serializer.

[dependencies]
serde = { version = "1", features = ["derive"] }
rmp-serde = "1"

use serde::{Deserialize, Serialize};
use rmp_serde as rmps;

// Same struct as your JSON version — zero changes needed
#[derive(Debug, Serialize, Deserialize)]
struct UserEvent {
    user_id: u64,
    event_type: String,
    timestamp: u64,
    metadata: EventMetadata,
}

#[derive(Debug, Serialize, Deserialize)]
struct EventMetadata {
    ip_address: String,
    user_agent: String,
    session_id: String,
}

fn main() {
    let event = UserEvent {
        user_id: 12345,
        event_type: "page_view".to_string(),
        timestamp: 1_700_000_000,
        metadata: EventMetadata {
            ip_address: "192.168.1.1".to_string(),
            user_agent: "Mozilla/5.0".to_string(),
            session_id: "abc-123-xyz".to_string(),
        },
    };

    // Serialize to MessagePack bytes
    let bytes = rmps::to_vec(&event).unwrap();
    println!("MessagePack ({} bytes)", bytes.len());

    // Deserialize back
    let decoded: UserEvent = rmps::from_slice(&bytes).unwrap();
    println!("Decoded user_id: {}", decoded.user_id);
}

Redis storage — identical pattern to Protobuf, just different bytes:

fn store_event(conn: &mut redis::Connection, event: &UserEvent) -> redis::RedisResult<()> {
    let bytes = rmps::to_vec(event).unwrap();
    // Annotate `set`'s generic return value, same as the JSON and Protobuf versions
    let _: () = conn.set(format!("event:{}", event.user_id), bytes)?;
    Ok(())
}

fn load_event(conn: &mut redis::Connection, user_id: u64) -> Option<UserEvent> {
    let bytes: Vec<u8> = conn.get(format!("event:{}", user_id)).ok()?;
    rmps::from_slice(&bytes).ok()
}

That's it. If you have a working JSON implementation with serde, migrating to MessagePack is a find-and-replace on the serializer call. No schema files, no build step, no code generation.

When MessagePack Is the Right Answer

Internal APIs and services where you control both ends but don't want the overhead of maintaining .proto files. Message queue payloads. Redis storage where you want smaller values without adding tooling complexity. Basically: anywhere JSON is too slow or too large, but Protobuf feels like overkill.


Schema Evolution — The Conversation Nobody Wants to Have

This is the section that separates "works in dev" from "works in production for two years." At some point you will need to add a field. Or remove one. Or — if you're feeling brave — rename one. How each format handles this is the thing that actually matters for long-running systems.

JSON — Flexible but Unguarded

JSON's schema evolution story is "just be careful." Add a field and old clients ignore it (assuming they're not strict). Remove a field and old clients get None or a missing key (assuming your code handles it). There's nothing enforcing the rules — it's all convention.

// v1 struct
#[derive(Serialize, Deserialize)]
struct UserEventV1 {
    user_id: u64,
    event_type: String,
    timestamp: u64,
}

// v2 struct — added `country` field
#[derive(Serialize, Deserialize)]
struct UserEventV2 {
    user_id: u64,
    event_type: String,
    timestamp: u64,
    #[serde(default)]   // ← this is what saves you
    country: String,    // old v1 data won't have this — default to ""
}

fn deserialize_old_data() {
    // v1 JSON being read by v2 code
    let v1_json = r#"{"user_id":1,"event_type":"click","timestamp":1700000000}"#;
    let v2: UserEventV2 = serde_json::from_str(v1_json).unwrap();
    // Works fine — country defaults to ""
    println!("country: '{}'", v2.country); // country: ''
}

The #[serde(default)] attribute is your best friend for forward compatibility. Always add it to new fields. Always.

Renaming a field in JSON is a breaking change unless you use #[serde(alias)]:

#[derive(Serialize, Deserialize)]
struct UserEvent {
    #[serde(alias = "userId")]   // can read both "user_id" and "userId"
    user_id: u64,
}

Protobuf — Strict but Safe (If You Follow the Rules)

Protobuf's schema evolution is the most robust of the three — but only if you respect its rules. The golden rule: field numbers are forever. Never reuse a field number. Never.

// v1
message UserEvent {
    uint64 user_id = 1;
    string event_type = 2;
    uint64 timestamp = 3;
}

// v2 — adding a field is safe
message UserEvent {
    uint64 user_id = 1;
    string event_type = 2;
    uint64 timestamp = 3;
    string country = 4;       // new field — safe to add, old decoders ignore it
}

// v2 — removing a field, do it like this
message UserEvent {
    uint64 user_id = 1;
    string event_type = 2;
    uint64 timestamp = 3;
    string country = 4;
    reserved 5;               // ← reserve the number, never reuse it
    reserved "old_field";     // ← reserve the name too
}

In Rust with prost, v2 code reading v1 data just works — the new country field will be an empty string (proto3 defaults). v1 code reading v2 data also works — unknown fields are ignored.

What you must never do:

// Never change a field number
string event_type = 3;  // was 2 — this will corrupt existing data silently

// Never reuse a removed field's number
string new_field = 2;   // field 2 was previously event_type — old decoders will misinterpret this

MessagePack — Flexible Like JSON, With One Trap

rmp-serde can match JSON's flexibility here, with the same #[serde(default)] and #[serde(alias)] tricks, but only if your data is encoded with field names.

The one trap: rmp_serde::to_vec (the function used throughout this post) encodes structs as positional arrays, values by position with no field names on the wire. That is part of why it's so compact, but it means adding, removing, or reordering fields is a breaking change for data you've already stored. If you need JSON-style schema evolution, serialize with rmp_serde::to_vec_named, which writes a map with field names at the cost of a few extra bytes per field.

// With map encoding (to_vec_named), rmp-serde handles this the same as serde_json
#[derive(Serialize, Deserialize)]
struct UserEventV2 {
    user_id: u64,
    event_type: String,
    timestamp: u64,
    #[serde(default)]
    country: String,  // safe to add — old data just gets ""
}

Benchmarks — The Part You Scrolled Here For

Same payload across all three: our UserEvent struct with nested EventMetadata. Four measurements: serialize speed, deserialize speed, payload size, and sustained throughput.

[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }

# Required: disable the default libtest harness so criterion can provide its own main
[[bench]]
name = "serialization"
harness = false

Create benches/serialization.rs:

use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use prost::Message;
use serde::{Deserialize, Serialize};
use rmp_serde as rmps;

// ── Shared structs for JSON and MessagePack ────────────────────────────────
#[derive(Debug, Clone, Serialize, Deserialize)]
struct UserEvent {
    user_id: u64,
    event_type: String,
    timestamp: u64,
    metadata: EventMetadata,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
struct EventMetadata {
    ip_address: String,
    user_agent: String,
    session_id: String,
}

// ── Protobuf types (generated by prost) ───────────────────────────────────
pub mod proto {
    include!(concat!(env!("OUT_DIR"), "/events.rs"));
}

fn make_event() -> UserEvent {
    UserEvent {
        user_id: 12345,
        event_type: "page_view".to_string(),
        timestamp: 1_700_000_000,
        metadata: EventMetadata {
            ip_address: "192.168.1.100".to_string(),
            user_agent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)".to_string(),
            session_id: "sess_abc123xyz789".to_string(),
        },
    }
}

fn make_proto_event() -> proto::UserEvent {
    proto::UserEvent {
        user_id: 12345,
        event_type: "page_view".to_string(),
        timestamp: 1_700_000_000,
        metadata: Some(proto::EventMetadata {
            ip_address: "192.168.1.100".to_string(),
            user_agent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)".to_string(),
            session_id: "sess_abc123xyz789".to_string(),
        }),
    }
}

// ── Serialize benchmarks ───────────────────────────────────────────────────
fn bench_serialize(c: &mut Criterion) {
    let mut group = c.benchmark_group("serialize");
    group.throughput(Throughput::Elements(1));

    let event      = make_event();
    let proto_event = make_proto_event();

    group.bench_function("json", |b| {
        b.iter(|| serde_json::to_vec(black_box(&event)).unwrap())
    });

    group.bench_function("protobuf", |b| {
        b.iter(|| {
            let mut buf = Vec::new();
            black_box(&proto_event).encode(&mut buf).unwrap();
            buf
        })
    });

    group.bench_function("msgpack", |b| {
        b.iter(|| rmps::to_vec(black_box(&event)).unwrap())
    });

    group.finish();
}

// ── Deserialize benchmarks ─────────────────────────────────────────────────
fn bench_deserialize(c: &mut Criterion) {
    let mut group = c.benchmark_group("deserialize");
    group.throughput(Throughput::Elements(1));

    let event       = make_event();
    let proto_event = make_proto_event();

    let json_bytes  = serde_json::to_vec(&event).unwrap();
    let mut proto_bytes = Vec::new();
    proto_event.encode(&mut proto_bytes).unwrap();
    let msgpack_bytes = rmps::to_vec(&event).unwrap();

    // Print sizes once
    println!("\nPayload sizes:");
    println!("  JSON:        {} bytes", json_bytes.len());
    println!("  Protobuf:    {} bytes", proto_bytes.len());
    println!("  MessagePack: {} bytes", msgpack_bytes.len());

    group.bench_function("json", |b| {
        b.iter(|| serde_json::from_slice::<UserEvent>(black_box(&json_bytes)).unwrap())
    });

    group.bench_function("protobuf", |b| {
        b.iter(|| proto::UserEvent::decode(black_box(proto_bytes.as_slice())).unwrap())
    });

    group.bench_function("msgpack", |b| {
        b.iter(|| rmps::from_slice::<UserEvent>(black_box(&msgpack_bytes)).unwrap())
    });

    group.finish();
}

// ── Throughput benchmark (sustained ops/sec) ───────────────────────────────
fn bench_throughput(c: &mut Criterion) {
    let mut group = c.benchmark_group("roundtrip_throughput");

    let event       = make_event();
    let proto_event = make_proto_event();

    for size in [1u64, 100, 1000] {
        group.throughput(Throughput::Elements(size));

        group.bench_with_input(BenchmarkId::new("json", size), &size, |b, &n| {
            b.iter(|| {
                for _ in 0..n {
                    let bytes = serde_json::to_vec(black_box(&event)).unwrap();
                    let _: UserEvent = serde_json::from_slice(&bytes).unwrap();
                }
            })
        });

        group.bench_with_input(BenchmarkId::new("protobuf", size), &size, |b, &n| {
            b.iter(|| {
                for _ in 0..n {
                    let mut buf = Vec::new();
                    black_box(&proto_event).encode(&mut buf).unwrap();
                    let _ = proto::UserEvent::decode(buf.as_slice()).unwrap();
                }
            })
        });

        group.bench_with_input(BenchmarkId::new("msgpack", size), &size, |b, &n| {
            b.iter(|| {
                for _ in 0..n {
                    let bytes = rmps::to_vec(black_box(&event)).unwrap();
                    let _: UserEvent = rmps::from_slice(&bytes).unwrap();
                }
            })
        });
    }

    group.finish();
}

criterion_group!(benches, bench_serialize, bench_deserialize, bench_throughput);
criterion_main!(benches);

Run with:

cargo bench
# HTML report at target/criterion/index.html

What the Numbers Typically Look Like

Run this yourself and drop your actual numbers in — here's what to expect on a modern machine:

Payload sizes:
  JSON:        187 bytes
  Protobuf:     68 bytes   ← 64% smaller than JSON
  MessagePack: 113 bytes   ← 40% smaller than JSON

serialize/json        time: ~850 ns
serialize/protobuf    time: ~220 ns   ← ~4x faster
serialize/msgpack     time: ~380 ns   ← ~2x faster

deserialize/json      time: ~1,100 ns
deserialize/protobuf  time: ~180 ns   ← ~6x faster
deserialize/msgpack   time: ~420 ns   ← ~2.5x faster

The story is consistent: Protobuf wins on every performance metric. But MessagePack is a genuine middle ground — meaningfully faster and smaller than JSON, with zero extra tooling.

At 50k events/second (the situation I opened with), those nanoseconds compound fast. The difference between JSON and Protobuf at that throughput is tens of milliseconds of CPU time per second. Per core. Every second.
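A quick back-of-envelope check using the sample numbers above (illustrative, not measured):

```rust
// Rough CPU savings at a given event rate, using the sample round-trip
// costs from the benchmark output above (these are illustrative numbers)
fn saved_ms_per_sec(events_per_sec: f64) -> f64 {
    let json_roundtrip_ns = 850.0 + 1_100.0; // serialize + deserialize
    let proto_roundtrip_ns = 220.0 + 180.0;
    (json_roundtrip_ns - proto_roundtrip_ns) * events_per_sec / 1_000_000.0 // ns -> ms
}

fn main() {
    // At 50k events/sec, JSON burns roughly 78ms more CPU per second than Protobuf
    println!("~{:.0} ms of CPU saved per second", saved_ms_per_sec(50_000.0));
}
```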


Real-World Patterns

API Request/Response

For a public-facing API, JSON is still the right answer — external clients expect it. For internal service-to-service calls, the choice opens up:

use actix_web::{web, HttpResponse, Responder};

// Public API handler (actix-web) — JSON, always
async fn public_handler(body: web::Json<UserEvent>) -> impl Responder {
    let response = process_event(body.into_inner());
    web::Json(response) // serde_json handles it
}

// Internal service handler — MessagePack for speed, no schema overhead
async fn internal_handler(body: web::Bytes) -> impl Responder {
    let event: UserEvent = rmps::from_slice(&body).unwrap();
    let response = process_event(event);
    let bytes = rmps::to_vec(&response).unwrap();
    HttpResponse::Ok()
        .content_type("application/msgpack")
        .body(bytes)
}

// gRPC handler (a tonic service method) — Protobuf, this is what it's built for
async fn grpc_handler(&self, req: Request<UserEvent>) -> Result<Response<EventResponse>, Status> {
    let event = req.into_inner();
    let response = process_proto_event(event);
    Ok(Response::new(response))
}

Redis Storage — Size Comparison

The difference in Redis memory usage is real and compounds at scale:

use redis::Commands;

struct EventStore {
    conn: redis::Connection,
}

impl EventStore {
    // JSON — human-readable in redis-cli, larger
    fn store_json(&mut self, event: &UserEvent) {
        let val = serde_json::to_string(event).unwrap();
        // 187 bytes for our example event
        let _: () = self.conn.set(
            format!("event:json:{}", event.user_id), val
        ).unwrap();
    }

    // Protobuf — smallest footprint, opaque in redis-cli
    fn store_proto(&mut self, event: &proto::UserEvent) {
        let mut buf = Vec::new();
        event.encode(&mut buf).unwrap();
        // 68 bytes for our example event
        let _: () = self.conn.set(
            format!("event:proto:{}", event.user_id), buf
        ).unwrap();
    }

    // MessagePack — good balance, works with existing serde structs
    fn store_msgpack(&mut self, event: &UserEvent) {
        let bytes = rmps::to_vec(event).unwrap();
        // 113 bytes for our example event
        let _: () = self.conn.set(
            format!("event:msgpack:{}", event.user_id), bytes
        ).unwrap();
    }
}

At 10 million events stored in Redis:

  • JSON: ~1.87 GB
  • MessagePack: ~1.13 GB — 40% less
  • Protobuf: ~0.68 GB — 64% less

That's not a micro-optimisation. That's real money on your Redis bill.


The Decision Table

                     JSON                    Protobuf                   MessagePack
Human-readable       Yes                     No                         No
Schema required      No                      Yes                        No
Serialize speed      Slow                    Fast                       Medium
Deserialize speed    Slow                    Fastest                    Medium
Payload size         Large                   Tiny                       Small
Schema evolution     Flexible                Strict (safe)              Flexible
Drop-in for serde    Yes                     No                         Yes
Rust ergonomics      Excellent               Good                       Excellent
Best for             Public APIs, config,    Internal services, gRPC,   Internal APIs, queues,
                     logs                    high-throughput            Redis

Wrapping Up

Here's where I land after having been burned by each of these at one point or another:

Use JSON for anything public-facing, anything a human might need to read, and anything early-stage where you're still figuring out your schema. serde_json is genuinely excellent and the ergonomics are hard to beat. The rule: use JSON until it hurts.

Use MessagePack when JSON starts hurting but you don't want to deal with .proto files. It's a genuine drop-in — if you're already on serde, the migration is one afternoon. Smaller payloads, faster serialization, zero schema overhead. This is the one I wish I'd known about earlier.

Use Protobuf when performance is non-negotiable, you have multiple services that need to agree on a schema, or you're building a gRPC API. The setup cost is real but the payoff is real too. And the schema enforcement means you'll catch breaking changes at compile time instead of at 3am.

The 50k events/second service I opened with? We moved it to MessagePack in a day. Protobuf would've been faster still but the .proto file maintenance wasn't worth it for our team size. Sometimes "good enough and done by lunch" beats "theoretically optimal and shipped next sprint."


Next in this series: gRPC Streaming in Rust with Tonic — server streaming, client streaming, and bidirectional. Spoiler: Protobuf comes back, and this time it's very much worth the setup cost.
