I want to tell you about the day I stopped defaulting to serde_json.
We had a service. It handled events — user actions, state changes, the usual stream of things that happen when people use software. We serialized everything to JSON, shoved it into Redis, published it to a message queue, sent it over HTTP. Standard stuff. serde_json everywhere. Life was fine.
Then the traffic grew. And somewhere around 40–50k events per second, our CPU graphs started looking... unhealthy. We profiled. And there it was — serialization. JSON serialization. Taking up a chunk of CPU time so embarrassingly large I'm not going to put the exact number in writing.
The fix wasn't complicated. But it required knowing that other options existed, what they traded off, and when each one actually made sense. Nobody had sat me down and explained that.
This post is that explanation. We're comparing JSON (serde_json), Protobuf (prost), and MessagePack (rmp-serde) across the things that actually matter: speed, payload size, ergonomics, and how gracefully they handle the inevitable moment when your schema needs to change.
With benchmarks. Obviously.
Why Serialization Is Worth Thinking About
Serialization is one of those things that's invisible when it's working and catastrophic when it isn't. You're doing it constantly — every API response, every Redis value, every message queue payload, every blob you stuff into a database column. It adds up.
The three axes that actually matter:
Speed — How fast can you turn a Rust struct into bytes and back? At low volumes this is irrelevant. At high volumes it's the difference between a happy CPU and a 4am incident.
Size — Smaller payloads mean less network bandwidth, lower Redis memory usage, cheaper cloud bills, and faster transmission. JSON is notoriously verbose. Binary formats are not.
Ergonomics — Can a human read it? Do you need a schema file? How painful is it to add a field six months later? Does your entire pipeline break if you rename something?
There's no single winner. There's only "right tool for the situation" — which is why this post exists.
JSON — The One You Already Know (serde_json)
JSON is the default. It's the default because it's human-readable, universally understood, requires zero coordination between teams, and works with every HTTP client, browser, and debugging tool on the planet. If you're building a public API, you're using JSON. Full stop.
In Rust, serde_json is the implementation and it's excellent. Add serde with the derive feature and you're two attributes away from serialization:
[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"
use serde::{Deserialize, Serialize};
#[derive(Debug, Serialize, Deserialize)]
struct UserEvent {
user_id: u64,
event_type: String,
timestamp: u64,
metadata: EventMetadata,
}
#[derive(Debug, Serialize, Deserialize)]
struct EventMetadata {
ip_address: String,
user_agent: String,
session_id: String,
}
fn main() {
let event = UserEvent {
user_id: 12345,
event_type: "page_view".to_string(),
timestamp: 1_700_000_000,
metadata: EventMetadata {
ip_address: "192.168.1.1".to_string(),
user_agent: "Mozilla/5.0".to_string(),
session_id: "abc-123-xyz".to_string(),
},
};
// Serialize to JSON string
let json = serde_json::to_string(&event).unwrap();
println!("JSON ({} bytes): {}", json.len(), json);
// Deserialize back
let decoded: UserEvent = serde_json::from_str(&json).unwrap();
println!("Decoded user_id: {}", decoded.user_id);
}
For Redis storage, the pattern is equally clean:
use redis::Commands;
fn store_event(conn: &mut redis::Connection, event: &UserEvent) -> redis::RedisResult<()> {
let json = serde_json::to_string(event).unwrap();
conn.set(format!("event:{}", event.user_id), json)?;
Ok(())
}
fn load_event(conn: &mut redis::Connection, user_id: u64) -> Option<UserEvent> {
let json: String = conn.get(format!("event:{}", user_id)).ok()?;
serde_json::from_str(&json).ok()
}
Serde Ergonomics Worth Knowing
serde has some attribute tricks that make JSON output much cleaner in practice:
#[derive(Serialize, Deserialize)]
struct ApiResponse {
#[serde(rename = "userId")] // camelCase for JS clients
user_id: u64,
#[serde(skip_serializing_if = "Option::is_none")] // omit null fields
avatar_url: Option<String>,
#[serde(default)] // use Default::default() if field missing on deserialize
is_premium: bool,
}
When JSON Is the Right Answer
Public APIs where clients are external or unknown. Config files humans need to edit. Any situation where debuggability matters more than performance. Log entries. Anything that needs to work with tools that don't speak binary.
The honest rule: use JSON until it hurts. Then look at the alternatives.
Protobuf — The Serious One (prost)
Protobuf is Google's binary serialization format and it's been battle-tested at a scale that's hard to comprehend. It's fast, compact, and strict — which means it's also more work to set up than slapping #[derive(Serialize)] on a struct.
The workflow is different. You define your schema in a .proto file first, then generate Rust code from it. The schema is the contract. This is a feature, not a bug — it forces you to be intentional about your data shape.
[dependencies]
prost = "0.12"
[build-dependencies]
prost-build = "0.12"
Create your schema in proto/events.proto:
syntax = "proto3";
package events;
message UserEvent {
uint64 user_id = 1;
string event_type = 2;
uint64 timestamp = 3;
EventMetadata metadata = 4;
}
message EventMetadata {
string ip_address = 1;
string user_agent = 2;
string session_id = 3;
}
Wire up code generation in build.rs:
fn main() {
prost_build::compile_protos(&["proto/events.proto"], &["proto/"]).unwrap();
}
Then use the generated types:
// Include the generated code
pub mod events {
include!(concat!(env!("OUT_DIR"), "/events.rs"));
}
use events::{EventMetadata, UserEvent};
use prost::Message;
fn main() {
let event = UserEvent {
user_id: 12345,
event_type: "page_view".to_string(),
timestamp: 1_700_000_000,
metadata: Some(EventMetadata {
ip_address: "192.168.1.1".to_string(),
user_agent: "Mozilla/5.0".to_string(),
session_id: "abc-123-xyz".to_string(),
}),
};
// Serialize to bytes
let mut buf = Vec::new();
event.encode(&mut buf).unwrap();
println!("Protobuf ({} bytes)", buf.len());
// Deserialize back
let decoded = UserEvent::decode(buf.as_slice()).unwrap();
println!("Decoded user_id: {}", decoded.user_id);
}
Redis storage with Protobuf is just storing raw bytes:
fn store_event(conn: &mut redis::Connection, event: &UserEvent) -> redis::RedisResult<()> {
let mut buf = Vec::new();
event.encode(&mut buf).unwrap();
conn.set(format!("event:{}", event.user_id), buf)?;
Ok(())
}
fn load_event(conn: &mut redis::Connection, user_id: u64) -> Option<UserEvent> {
let bytes: Vec<u8> = conn.get(format!("event:{}", user_id)).ok()?;
UserEvent::decode(bytes.as_slice()).ok()
}
Not human-readable in Redis. You will stare at garbled bytes in your Redis CLI at some point. That's the trade.
When Protobuf Is the Right Answer
Internal services where both sides own the schema. gRPC APIs (Protobuf is the native format — using anything else would be weird). High-throughput pipelines where every byte and microsecond counts. Anywhere you want a compile-time guarantee that both sides agree on the data shape.
MessagePack — The Middle Ground (rmp-serde)
MessagePack is what you get if you take JSON's data model and compress it into a binary format. Same types, same flexibility, no human readability, smaller and faster.
The best part: if you're already using serde, switching to MessagePack is almost zero effort. You don't write a schema file. You don't run a code generator. You just change the serializer.
[dependencies]
serde = { version = "1", features = ["derive"] }
rmp-serde = "1"
use serde::{Deserialize, Serialize};
use rmp_serde as rmps;
// Same struct as your JSON version — zero changes needed
#[derive(Debug, Serialize, Deserialize)]
struct UserEvent {
user_id: u64,
event_type: String,
timestamp: u64,
metadata: EventMetadata,
}
#[derive(Debug, Serialize, Deserialize)]
struct EventMetadata {
ip_address: String,
user_agent: String,
session_id: String,
}
fn main() {
let event = UserEvent {
user_id: 12345,
event_type: "page_view".to_string(),
timestamp: 1_700_000_000,
metadata: EventMetadata {
ip_address: "192.168.1.1".to_string(),
user_agent: "Mozilla/5.0".to_string(),
session_id: "abc-123-xyz".to_string(),
},
};
// Serialize to MessagePack bytes
let bytes = rmps::to_vec(&event).unwrap();
println!("MessagePack ({} bytes)", bytes.len());
// Deserialize back
let decoded: UserEvent = rmps::from_slice(&bytes).unwrap();
println!("Decoded user_id: {}", decoded.user_id);
}
Redis storage — identical pattern to Protobuf, just different bytes:
fn store_event(conn: &mut redis::Connection, event: &UserEvent) -> redis::RedisResult<()> {
let bytes = rmps::to_vec(event).unwrap();
conn.set(format!("event:{}", event.user_id), bytes)?;
Ok(())
}
fn load_event(conn: &mut redis::Connection, user_id: u64) -> Option<UserEvent> {
let bytes: Vec<u8> = conn.get(format!("event:{}", user_id)).ok()?;
rmps::from_slice(&bytes).ok()
}
That's it. If you have a working JSON implementation with serde, migrating to MessagePack is a find-and-replace on the serializer call. No schema files, no build step, no code generation.
When MessagePack Is the Right Answer
Internal APIs and services where you control both ends but don't want the overhead of maintaining .proto files. Message queue payloads. Redis storage where you want smaller values without adding tooling complexity. Basically: anywhere JSON is too slow or too large, but Protobuf feels like overkill.
Schema Evolution — The Conversation Nobody Wants to Have
This is the section that separates "works in dev" from "works in production for two years." At some point you will need to add a field. Or remove one. Or — if you're feeling brave — rename one. How each format handles this is the thing that actually matters for long-running systems.
JSON — Flexible but Unguarded
JSON's schema evolution story is "just be careful." Add a field and old clients ignore it (assuming they're not strict). Remove a field and old clients get None or a missing key (assuming your code handles it). There's nothing enforcing the rules — it's all convention.
// v1 struct
#[derive(Serialize, Deserialize)]
struct UserEventV1 {
user_id: u64,
event_type: String,
timestamp: u64,
}
// v2 struct — added `country` field
#[derive(Serialize, Deserialize)]
struct UserEventV2 {
user_id: u64,
event_type: String,
timestamp: u64,
#[serde(default)] // ← this is what saves you
country: String, // old v1 data won't have this — default to ""
}
fn deserialize_old_data() {
// v1 JSON being read by v2 code
let v1_json = r#"{"user_id":1,"event_type":"click","timestamp":1700000000}"#;
let v2: UserEventV2 = serde_json::from_str(v1_json).unwrap();
// Works fine — country defaults to ""
println!("country: '{}'", v2.country); // country: ''
}
The #[serde(default)] attribute is your best friend for forward compatibility. Always add it to new fields. Always.
Renaming a field in JSON is a breaking change unless you use #[serde(alias)]:
#[derive(Serialize, Deserialize)]
struct UserEvent {
#[serde(alias = "userId")] // can read both "user_id" and "userId"
user_id: u64,
}
Protobuf — Strict but Safe (If You Follow the Rules)
Protobuf's schema evolution is the most robust of the three — but only if you respect its rules. The golden rule: field numbers are forever. Never reuse a field number. Never.
// v1
message UserEvent {
uint64 user_id = 1;
string event_type = 2;
uint64 timestamp = 3;
}
// v2 — adding a field is safe
message UserEvent {
uint64 user_id = 1;
string event_type = 2;
uint64 timestamp = 3;
string country = 4; // new field — safe to add, old decoders ignore it
}
// v3 — removing a field: reserve its number and name so they can never be reused
message UserEvent {
  uint64 user_id = 1;
  string event_type = 2;
  uint64 timestamp = 3;
  reserved 4;          // ← the removed field's number — never reuse it
  reserved "country";  // ← reserve the name too
}
In Rust with prost, v2 code reading v1 data just works — the new country field decodes to an empty string (the proto3 default). And v1 code reading v2 data also works — unknown fields are simply skipped.
What you must never do:
// Never change a field number
string event_type = 3; // was 2 — old payloads now decode with event_type silently empty
// Never reuse a removed field's number
string new_field = 2;  // field 2 used to be event_type — its old bytes will be misread as new_field
MessagePack — Flexible Like JSON, With One Trap
rmp-serde inherits serde's flexibility, with one important caveat: the encoding mode. The default rmp_serde::to_vec encodes structs as positional arrays — field names are dropped and field order becomes the contract. In that mode, appending a #[serde(default)] field at the end still works, but inserting or removing a field in the middle is a breaking change, and #[serde(alias)] has nothing to match against.
The fix is rmp_serde::to_vec_named, which encodes structs as maps keyed by field name. It costs a few bytes per field, but #[serde(default)] and #[serde(alias)] then work exactly as they do with JSON.
// With to_vec_named this evolves exactly like the serde_json version.
// With the default array encoding it still works — but only because
// `country` is appended at the end of the struct.
#[derive(Serialize, Deserialize)]
struct UserEventV2 {
user_id: u64,
event_type: String,
timestamp: u64,
#[serde(default)]
country: String, // safe to add — old data just gets ""
}
Benchmarks — The Part You Scrolled Here For
Same payload across all three: our UserEvent struct with nested EventMetadata. Four measurements: serialize speed, deserialize speed, payload size, and sustained throughput.
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }

[[bench]]
name = "serialization"
harness = false  # criterion provides its own main; disable the default test harness
Create benches/serialization.rs:
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use prost::Message;
use serde::{Deserialize, Serialize};
use rmp_serde as rmps;
// ── Shared structs for JSON and MessagePack ────────────────────────────────
#[derive(Debug, Clone, Serialize, Deserialize)]
struct UserEvent {
user_id: u64,
event_type: String,
timestamp: u64,
metadata: EventMetadata,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
struct EventMetadata {
ip_address: String,
user_agent: String,
session_id: String,
}
// ── Protobuf types (generated by prost) ───────────────────────────────────
pub mod proto {
include!(concat!(env!("OUT_DIR"), "/events.rs"));
}
fn make_event() -> UserEvent {
UserEvent {
user_id: 12345,
event_type: "page_view".to_string(),
timestamp: 1_700_000_000,
metadata: EventMetadata {
ip_address: "192.168.1.100".to_string(),
user_agent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)".to_string(),
session_id: "sess_abc123xyz789".to_string(),
},
}
}
fn make_proto_event() -> proto::UserEvent {
proto::UserEvent {
user_id: 12345,
event_type: "page_view".to_string(),
timestamp: 1_700_000_000,
metadata: Some(proto::EventMetadata {
ip_address: "192.168.1.100".to_string(),
user_agent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)".to_string(),
session_id: "sess_abc123xyz789".to_string(),
}),
}
}
// ── Serialize benchmarks ───────────────────────────────────────────────────
fn bench_serialize(c: &mut Criterion) {
let mut group = c.benchmark_group("serialize");
group.throughput(Throughput::Elements(1));
let event = make_event();
let proto_event = make_proto_event();
group.bench_function("json", |b| {
b.iter(|| serde_json::to_vec(black_box(&event)).unwrap())
});
group.bench_function("protobuf", |b| {
b.iter(|| {
let mut buf = Vec::new();
black_box(&proto_event).encode(&mut buf).unwrap();
buf
})
});
group.bench_function("msgpack", |b| {
b.iter(|| rmps::to_vec(black_box(&event)).unwrap())
});
group.finish();
}
// ── Deserialize benchmarks ─────────────────────────────────────────────────
fn bench_deserialize(c: &mut Criterion) {
let mut group = c.benchmark_group("deserialize");
group.throughput(Throughput::Elements(1));
let event = make_event();
let proto_event = make_proto_event();
let json_bytes = serde_json::to_vec(&event).unwrap();
let mut proto_bytes = Vec::new();
proto_event.encode(&mut proto_bytes).unwrap();
let msgpack_bytes = rmps::to_vec(&event).unwrap();
// Print sizes once
println!("\nPayload sizes:");
println!(" JSON: {} bytes", json_bytes.len());
println!(" Protobuf: {} bytes", proto_bytes.len());
println!(" MessagePack: {} bytes", msgpack_bytes.len());
group.bench_function("json", |b| {
b.iter(|| serde_json::from_slice::<UserEvent>(black_box(&json_bytes)).unwrap())
});
group.bench_function("protobuf", |b| {
b.iter(|| proto::UserEvent::decode(black_box(proto_bytes.as_slice())).unwrap())
});
group.bench_function("msgpack", |b| {
b.iter(|| rmps::from_slice::<UserEvent>(black_box(&msgpack_bytes)).unwrap())
});
group.finish();
}
// ── Throughput benchmark (sustained ops/sec) ───────────────────────────────
fn bench_throughput(c: &mut Criterion) {
let mut group = c.benchmark_group("roundtrip_throughput");
let event = make_event();
let proto_event = make_proto_event();
for size in [1u64, 100, 1000] {
group.throughput(Throughput::Elements(size));
group.bench_with_input(BenchmarkId::new("json", size), &size, |b, &n| {
b.iter(|| {
for _ in 0..n {
let bytes = serde_json::to_vec(black_box(&event)).unwrap();
let _: UserEvent = serde_json::from_slice(&bytes).unwrap();
}
})
});
group.bench_with_input(BenchmarkId::new("protobuf", size), &size, |b, &n| {
b.iter(|| {
for _ in 0..n {
let mut buf = Vec::new();
black_box(&proto_event).encode(&mut buf).unwrap();
let _ = proto::UserEvent::decode(buf.as_slice()).unwrap();
}
})
});
group.bench_with_input(BenchmarkId::new("msgpack", size), &size, |b, &n| {
b.iter(|| {
for _ in 0..n {
let bytes = rmps::to_vec(black_box(&event)).unwrap();
let _: UserEvent = rmps::from_slice(&bytes).unwrap();
}
})
});
}
group.finish();
}
criterion_group!(benches, bench_serialize, bench_deserialize, bench_throughput);
criterion_main!(benches);
Run with:
cargo bench
# HTML report at target/criterion/index.html
What the Numbers Typically Look Like
Run this yourself and drop your actual numbers in — here's what to expect on a modern machine:
Payload sizes:
JSON: 187 bytes
Protobuf: 68 bytes ← 64% smaller than JSON
MessagePack: 113 bytes ← 40% smaller than JSON
serialize/json time: ~850 ns
serialize/protobuf time: ~220 ns ← ~4x faster
serialize/msgpack time: ~380 ns ← ~2x faster
deserialize/json time: ~1,100 ns
deserialize/protobuf time: ~180 ns ← ~6x faster
deserialize/msgpack time: ~420 ns ← ~2.5x faster
The story is consistent: Protobuf wins on every performance metric. But MessagePack is a genuine middle ground — meaningfully faster and smaller than JSON, with zero extra tooling.
At 50k events/second (the situation I opened with), those nanoseconds compound fast. The difference between JSON and Protobuf at that throughput is tens of milliseconds of CPU time per second. Per core. Every second.
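The arithmetic behind that claim, using the illustrative timings above (a back-of-envelope sketch, not a measurement):

```rust
fn main() {
    // Illustrative per-event round-trip costs from the numbers above.
    let rate = 50_000.0;             // events per second
    let json_ns = 850.0 + 1_100.0;   // serialize + deserialize
    let proto_ns = 220.0 + 180.0;

    // CPU-seconds spent per wall-clock second, per core.
    let json_cpu = rate * json_ns * 1e-9;   // ≈ 0.098 s
    let proto_cpu = rate * proto_ns * 1e-9; // ≈ 0.020 s

    println!(
        "JSON: {:.1}% of a core, Protobuf: {:.1}% — a ~{:.0} ms/s difference",
        json_cpu * 100.0,
        proto_cpu * 100.0,
        (json_cpu - proto_cpu) * 1_000.0
    );
}
```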
Real-World Patterns
API Request/Response
For a public-facing API, JSON is still the right answer — external clients expect it. For internal service-to-service calls, the choice opens up:
// Public API handler — JSON, always
async fn public_handler(body: web::Json<UserEvent>) -> impl Responder {
let response = process_event(body.into_inner());
web::Json(response) // serde_json handles it
}
// Internal service handler — MessagePack for speed, no schema overhead
async fn internal_handler(body: web::Bytes) -> impl Responder {
let event: UserEvent = rmps::from_slice(&body).unwrap();
let response = process_event(event);
let bytes = rmps::to_vec(&response).unwrap();
HttpResponse::Ok()
.content_type("application/msgpack")
.body(bytes)
}
// gRPC handler — Protobuf, this is what it's built for
async fn grpc_handler(&self, req: Request<UserEvent>) -> Result<Response<EventResponse>, Status> {
let event = req.into_inner();
let response = process_proto_event(event);
Ok(Response::new(response))
}
Redis Storage — Size Comparison
The difference in Redis memory usage is real and compounds at scale:
use redis::Commands;
struct EventStore {
conn: redis::Connection,
}
impl EventStore {
// JSON — human-readable in redis-cli, larger
fn store_json(&mut self, event: &UserEvent) {
let val = serde_json::to_string(event).unwrap();
// 187 bytes for our example event
let _: () = self.conn.set(
format!("event:json:{}", event.user_id), val
).unwrap();
}
// Protobuf — smallest footprint, opaque in redis-cli
fn store_proto(&mut self, event: &proto::UserEvent) {
let mut buf = Vec::new();
event.encode(&mut buf).unwrap();
// 68 bytes for our example event
let _: () = self.conn.set(
format!("event:proto:{}", event.user_id), buf
).unwrap();
}
// MessagePack — good balance, works with existing serde structs
fn store_msgpack(&mut self, event: &UserEvent) {
let bytes = rmps::to_vec(event).unwrap();
// 113 bytes for our example event
let _: () = self.conn.set(
format!("event:msgpack:{}", event.user_id), bytes
).unwrap();
}
}
At 10 million events stored in Redis:
- JSON: ~1.87 GB
- MessagePack: ~1.13 GB — 40% less
- Protobuf: ~0.68 GB — 64% less
That's not a micro-optimization. That's real money on your Redis bill.
The Decision Table
| | JSON | Protobuf | MessagePack |
|---|---|---|---|
| Human-readable | Yes | No | No |
| Schema required | No | Yes | No |
| Serialize speed | Slow | Fast | Medium |
| Deserialize speed | Slow | Fastest | Medium |
| Payload size | Large | Tiny | Small |
| Schema evolution | Flexible | Strict (safe) | Flexible |
| Drop-in for serde | Yes | No | Yes |
| Rust ergonomics | Excellent | Good | Excellent |
| Best for | Public APIs, config, logs | Internal services, gRPC, high-throughput | Internal APIs, queues, Redis |
Wrapping Up
Here's where I land after having been burned by each of these at one point or another:
Use JSON for anything public-facing, anything a human might need to read, and anything early-stage where you're still figuring out your schema. serde_json is genuinely excellent and the ergonomics are hard to beat. The rule: use JSON until it hurts.
Use MessagePack when JSON starts hurting but you don't want to deal with .proto files. It's a genuine drop-in — if you're already on serde, the migration is one afternoon. Smaller payloads, faster serialization, zero schema overhead. This is the one I wish I'd known about earlier.
Use Protobuf when performance is non-negotiable, you have multiple services that need to agree on a schema, or you're building a gRPC API. The setup cost is real but the payoff is real too. And the schema enforcement means you'll catch breaking changes at compile time instead of at 3am.
The 50k events/second service I opened with? We moved it to MessagePack in a day. Protobuf would've been faster still but the .proto file maintenance wasn't worth it for our team size. Sometimes "good enough and done by lunch" beats "theoretically optimal and shipped next sprint."
Next in this series: gRPC Streaming in Rust with Tonic — server streaming, client streaming, and bidirectional. Spoiler: Protobuf comes back, and this time it's very much worth the setup cost.