Marshaling sounds like a solved problem.
You call json.Marshal or proto.Marshal, send the bytes across the wire, and move on with your life.
But once you hit real load — tens of thousands of messages per second, strict p95 budgets, or aggressive CPU constraints — marshaling becomes one of the biggest sources of latency, garbage, and inefficiency.
I didn’t believe it at first either.
Then I profiled a production system and saw 20–40% of CPU time spent on serialization alone.
In this final article of the series, I’ll walk through every marshaling strategy that actually matters, why it works, where it fails, and how to choose the right approach depending on your system’s requirements.
Let’s get into it.
1. The Truth About Marshaling: It’s Always on the Hot Path
You can usually optimize:
- DB queries
- cache lookups
- goroutine pools
- handlers
…but marshaling happens every single time you:
- respond to a client
- publish an event
- log structured data
- serialize to Redis
- write to Kafka or Redpanda
- store snapshots
Marshaling is unavoidable — which makes it one of the highest-ROI optimizations.
When we optimized it in a highload Go service, we saw:
- p95 latency: −40%
- CPU: −22%
- GC pauses: much smoother
- memory footprint: reduced
- node count: went from 10 → 7
All from marshaling improvements.
2. Strategy #1 — “Faster JSON” (But Still JSON)
A lot of teams start here because:
- JSON is universal
- JSON is easy
- JSON works
- JSON has tooling everywhere
And yet, encoding/json is painfully slow.
Reflection-heavy. Alloc-heavy. Predictably unpredictable.
If you must keep JSON, these are your options:
Option A: jsoniter (drop-in, fast)
import jsoniter "github.com/json-iterator/go"
// Drop-in: same API surface as encoding/json.
var json = jsoniter.ConfigCompatibleWithStandardLibrary
data, _ := json.Marshal(v)
Pros:
- drop-in replacement
- easy adoption
- faster than stdlib
Cons:
- not the fastest JSON possible
- still allocates
- still string-based
Good for: APIs, moderate load.
Option B: easyjson (codegen, zero reflection)
Generate struct-specific serializers:
easyjson -all model.go
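After generation, each struct gets reflection-free methods. A minimal sketch (the User type is illustrative):
// model.go
type User struct {
    ID   int64  `json:"id"`
    Name string `json:"name"`
}

// `easyjson -all model.go` emits model_easyjson.go; the generated
// MarshalJSON bypasses reflection entirely:
u := User{ID: 1, Name: "ada"}
data, err := u.MarshalJSON()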
Pros:
- massive speed increase (up to 2–3x)
- near-zero reflection
- fewer allocations
- stable, widely used
Cons:
- code generator boilerplate
- must remember to re-generate
Good for: High-performance JSON systems.
Option C: gojay (streaming-based, insane speed)
For very large JSON payloads, gojay's streaming design typically outperforms both options above.
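The trade-off is explicitness: each type implements gojay's encoder interface by hand. A minimal sketch (the User type is illustrative):
import "github.com/francoispqt/gojay"

type User struct {
    ID   int
    Name string
}

// MarshalJSONObject writes fields through the streaming encoder.
func (u *User) MarshalJSONObject(enc *gojay.Encoder) {
    enc.IntKey("id", u.ID)
    enc.StringKey("name", u.Name)
}

// IsNil lets the encoder skip nil objects.
func (u *User) IsNil() bool { return u == nil }

// data, err := gojay.MarshalJSONObject(&User{ID: 1, Name: "ada"})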
Good for: Huge arrays, logs, bulk data.
3. Strategy #2 — Switch to Binary Formats
JSON is text. Text is slow.
Binary formats solve the problem from both ends:
- smaller payload
- faster encode/decode
The two strongest contenders:
MessagePack
import "github.com/vmihailenco/msgpack/v5"
data, _ := msgpack.Marshal(v)
Pros:
- 3× faster than JSON
- schema-flexible
- smaller payloads
- near drop-in replacement for JSON
Cons:
- still not the fastest
- still allocates
- requires consumer compatibility
Protobuf
data, _ := proto.Marshal(msg)
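In context, a minimal round trip looks like this (a sketch; the pb package name is a stand-in for your protoc-generated code):
import (
    "google.golang.org/protobuf/proto"

    pb "example.com/gen/tradepb" // hypothetical generated package
)

msg := &pb.Trade{Price: 42.5, Quantity: 10}
data, err := proto.Marshal(msg) // compact binary encoding
// ...
var out pb.Trade
err = proto.Unmarshal(data, &out) // strongly typed round trip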
Pros:
- 5–10× faster than JSON
- smallest payloads
- strongly typed
- versioning support
- industry standard
Cons:
- requires .proto
- must maintain schemas
- learning curve
Good for: Microservices, highload, RPC, messaging.
4. Strategy #3 — Code Generation
Codegen is the “safe” way to get the performance of low-level code without actually writing unsafe.
Types of codegen serializers:
- easyjson (JSON)
- ffjson (JSON)
- msgp (MessagePack codegen)
- protoc (Protobuf)
- flatbuffers
- capnproto
Advantages:
- no reflection
- predictable performance
- fewer allocations
- static typing
- extremely fast
Disadvantages:
- codegen step
- build complexity
- more generated code to audit
If you want maximum speed without unsafe, this is the way.
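For instance, msgp turns a tagged struct into generated MessagePack encoders (a sketch; the Trade type and tags are illustrative):
//go:generate msgp
type Trade struct {
    Price    float64 `msg:"price"`
    Quantity int64   `msg:"qty"`
}

// `go generate` emits trade_gen.go; the generated method is reflection-free:
// buf, err := trade.MarshalMsg(nil) // appends the encoding to the given slice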
5. Strategy #4 — Buffer Reuse and Preallocation
Reflection is slow — but allocation is worse.
Most serialization libraries allocate temporary buffers on each call.
You can eliminate that by reusing buffers.
Example:
buf := bytes.NewBuffer(make([]byte, 0, 1024)) // preallocate capacity once
encoder := json.NewEncoder(buf)
encoder.Encode(v) // appends within existing capacity, so no mid-encode regrowth
Results:
- fewer allocations
- fewer GC cycles
- more stable tail latencies
This is a free optimization that many teams miss.
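To reuse buffers across calls rather than just preallocating per call, wrap them in a sync.Pool (a minimal sketch, assuming standard-library JSON):
import (
    "bytes"
    "encoding/json"
    "sync"
)

var bufPool = sync.Pool{
    New: func() any { return bytes.NewBuffer(make([]byte, 0, 1024)) },
}

func encode(v any) ([]byte, error) {
    buf := bufPool.Get().(*bytes.Buffer)
    defer bufPool.Put(buf)
    buf.Reset() // keep the underlying array, drop old contents
    if err := json.NewEncoder(buf).Encode(v); err != nil {
        return nil, err
    }
    // Copy out: the buffer returns to the pool and may be overwritten.
    return append([]byte(nil), buf.Bytes()...), nil
}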
6. Strategy #5 — Zero-Copy Techniques (Unsafe, Advanced)
This strategy delivers the most dramatic performance improvements.
Example: convert string <-> []byte without copying:
// Reinterpret the backing array as a string without copying (Go 1.20+).
// unsafe.SliceData handles empty slices; &b[0] would panic on them.
func BytesToString(b []byte) string {
    return unsafe.String(unsafe.SliceData(b), len(b))
}
Or reinterpret struct data:
// Overlay a fixed-layout struct onto the raw buffer: zero decode, zero copy.
h := (*Header)(unsafe.Pointer(&buf[0]))
You eliminate entire memory copies.
But you pay with safety.
Use only when:
- data is immutable
- you understand Go’s memory model
- you control the lifecycle of buffers
- you benchmarked gains
Unsafe can reduce CPU by 10–40% in serialization-heavy paths.
7. Strategy #6 — Custom Marshaling (Manual or Semi-Manual)
Sometimes you need full control.
Example: hand-written binary marshaler:
// Fixed 16-byte layout: price as float64 bits, then quantity.
func (t *Trade) MarshalBinary() []byte {
    b := make([]byte, 16)
    binary.LittleEndian.PutUint64(b[0:8], math.Float64bits(t.Price))
    binary.LittleEndian.PutUint64(b[8:16], uint64(t.Quantity))
    return b
}
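The matching decoder is just as mechanical (a sketch; it assumes Quantity is an int64, mirroring the fixed layout above):
func (t *Trade) UnmarshalBinary(b []byte) error {
    if len(b) < 16 {
        return io.ErrUnexpectedEOF
    }
    t.Price = math.Float64frombits(binary.LittleEndian.Uint64(b[0:8]))
    t.Quantity = int64(binary.LittleEndian.Uint64(b[8:16]))
    return nil
}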
Advantages:
- fastest possible
- smallest payload
- fully deterministic
- zero reflection
Disadvantages:
- extremely verbose
- must maintain manually
- brittle
- requires deep understanding
Use only for ultra-critical hot paths.
8. Strategy #7 — Streaming Marshaling for Large Payloads
For large responses (10 KB+), use streaming.
enc := json.NewEncoder(w) // w is any io.Writer, e.g. http.ResponseWriter
enc.Encode(v)
This avoids:
- giant temporary buffers
- multi-step copying
- unnecessary heap pressure
Streaming regularly reduces large-response latency by 10–25%.
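In an HTTP handler that means encoding straight into the ResponseWriter (a sketch; loadTrades is a hypothetical data source):
func listTrades(w http.ResponseWriter, r *http.Request) {
    trades := loadTrades() // hypothetical
    w.Header().Set("Content-Type", "application/json")
    // Encode streams into w and reuses encoding/json's pooled internal
    // buffer; there is no extra []byte like the one json.Marshal returns.
    if err := json.NewEncoder(w).Encode(trades); err != nil {
        log.Printf("encode trades: %v", err)
    }
}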
9. Strategy #8 — Removing Fields (Payload Hygiene)
The easiest and most underrated optimization:
Reduce payload size.
When we audited our payloads, we found:
- 25–30% of fields were unused
- 10–15% were redundant
- 5–10% could be computed on client side
Removing junk:
- shrank payloads
- reduced CPU
- reduced latency
- improved UX
This is the highest ROI improvement after switching formats.
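In Go, trimming usually starts with struct tags (a sketch; the fields are illustrative):
type OrderView struct {
    ID        string  `json:"id"`
    Price     float64 `json:"price"`
    Notes     string  `json:"notes,omitempty"` // dropped when empty
    DebugInfo string  `json:"-"`               // never serialized
}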
10. Real Benchmarks from Production
When we implemented these techniques (in stages), total improvements were:
- CPU: −28%
- p95 latency: 6.7ms → 2.4ms
- p50 latency: 2.4ms → 1.1ms
- allocs/op: down 40–70%
- network cost: −20–60% depending on format
All without touching the database or main logic.
Serialization alone made these gains.
11. Choosing the Right Strategy
Here’s the decision tree we use in real projects.
If you must keep JSON
Use jsoniter or easyjson.
If you want a fast drop-in binary format
Use MessagePack.
If you're designing microservices
Use Protobuf (ideal).
If you want maximum control
Use codegen or manual marshalers.
If you need extreme performance
Use unsafe zero-copy + custom binary format.
If payloads are massive
Use streaming.
If everything is slow
Start by removing fields.
12. Senior-Level Takeaways
- Marshaling is a major performance bottleneck in most systems.
- JSON is the slowest option, but JSON libs vary drastically in speed.
- Binary formats are the fastest real-world choice.
- Codegen is the sweet spot between safety and speed.
- Buffer reuse is essential for stable latency.
- Unsafe can be used safely, but only by experts.
- Manual marshaling is unbeatable when done well.
- Reducing payload size improves everything across the board.
High-performance marshaling is the combination of:
the right format + the right strategy + the right level of control.
13. Want to Go Deeper?
These Educative courses shaped the way I think about performance, distributed systems, and low-level engineering:
Highly recommended for backend engineers working at scale.
