I want to preface this by saying this is not a "look what I built"
post. This is more of a "I kept hitting the same wall and eventually
decided to do something about it" post.
How it started
I was playing Path of Exile 2 a few weeks back and noticed some
obvious desync — my character position on my screen not matching
what the server thought. Classic multiplayer networking bug.
I did not think much of it at the time.
Then I hit a similar issue using a Go-based VPN app I run on my
own infrastructure. Sessions dropping silently under load.
No obvious error. Just wrong behaviour.
Then the same class of bug showed up in a Go project I was
building at work. I was parsing tightly packed UDP binary packets
and somewhere in my manual bit shifting code I had an offset wrong.
One bit. Everything downstream was garbage.
Three times. Three different contexts. Same root cause.
At that point I stopped and actually looked for a proper solution.
What I found
Python developers have had construct since 2002.
You define your packet schema declaratively — field name,
type, bit width — and the library handles everything.
No bit math. No manual shifting. Used in production
by the Python network engineering community for over
two decades. It is the gold standard for this problem.
Go has nothing like it.
That is not an accident. Go's designers deliberately left
bitfields out of the language. C has native bitfields
but the C standard never defined which direction bits
are ordered in memory — every compiler decides for itself.
Go wanted no part of that ambiguity.
In 2019 someone filed a formal proposal, golang/go#29650,
requesting native bit and byte allocation in structs.
It was closed without action, labelled FrozenDueToAge.
The Go team's position is clear — this belongs in
a library, not the language.
So I looked for the library.
Three packages exist in the Go ecosystem:
- jmatsuzawa/go-bitfield: Unmarshal only. Zero stars. Marshal listed as a TODO in their own README.
- encodingx/binary: Has both Marshal and Unmarshal but requires wrapping every struct in nested word-structs. Awkward API, not idiomatic Go.
- Velocidex/vtypes: A forensics tool driven by JSON profiles. Not designed for production network code.
None of them have Validate, Explain, or Diff.
All effectively abandoned.
That was the gap. I decided to fill it.
What I built
I called it nibble — a nibble is 4 bits, half a byte.
Felt right for a library that works at the bit level.
The idea is simple. Python's construct lets you describe
a binary protocol in plain readable code and the library
does all the bit math for you. nibble brings that same
philosophy to Go using something Go developers already
know — struct tags.
The same pattern you use for JSON:
Name string `json:"name"`
nibble uses the same convention:
Health uint16 `bits:"9"`
You define your packet schema once as a struct.
Every field gets a bits tag telling nibble exactly
how wide that field is on the wire. That single
definition drives everything.
Here is a real game packet example:
type GamePacket struct {
	IsAlive  bool   `bits:"1"`  // 1 bit: true or false
	WeaponID uint8  `bits:"4"`  // 4 bits: up to 16 weapons
	TeamID   uint8  `bits:"2"`  // 2 bits: up to 4 teams
	Health   uint16 `bits:"9"`  // 9 bits: 0 to 511
	PosX     int16  `bits:"12"` // 12 bits: signed position
	PosY     int16  `bits:"12"` // 12 bits: signed position
	Rotation uint8  `bits:"8"`  // 8 bits: 0 to 255
	Score    uint32 `bits:"16"` // 16 bits: 0 to 65535
}
// Total: 64 bits = 8 bytes exactly
One struct. Eight fields. Eight bytes on the wire.
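To make the "one definition drives everything" idea concrete, here is a minimal standalone sketch of how a layout can be derived from bits tags using only the standard library. It is an illustration of the technique, not nibble's actual internals; fieldLayout and layoutOf are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// GamePacket mirrors the struct from the post.
type GamePacket struct {
	IsAlive  bool   `bits:"1"`
	WeaponID uint8  `bits:"4"`
	TeamID   uint8  `bits:"2"`
	Health   uint16 `bits:"9"`
	PosX     int16  `bits:"12"`
	PosY     int16  `bits:"12"`
	Rotation uint8  `bits:"8"`
	Score    uint32 `bits:"16"`
}

// fieldLayout is a hypothetical helper type, not part of nibble's API.
type fieldLayout struct {
	Name   string
	Offset int // bit offset from the start of the packet
	Width  int // width in bits
}

// layoutOf reads the `bits` tag of every field in struct value v and
// assigns each field a running bit offset. It also returns the total width.
func layoutOf(v any) ([]fieldLayout, int, error) {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return nil, 0, fmt.Errorf("layoutOf: expected a struct value, got %T", v)
	}
	var fields []fieldLayout
	offset := 0
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag, ok := f.Tag.Lookup("bits")
		if !ok {
			return nil, 0, fmt.Errorf("field %s has no bits tag", f.Name)
		}
		width, err := strconv.Atoi(tag)
		if err != nil || width <= 0 {
			return nil, 0, fmt.Errorf("field %s: invalid bit width %q", f.Name, tag)
		}
		fields = append(fields, fieldLayout{Name: f.Name, Offset: offset, Width: width})
		offset += width
	}
	return fields, offset, nil
}

func main() {
	fields, total, err := layoutOf(GamePacket{})
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("%-8s offset=%2d width=%2d\n", f.Name, f.Offset, f.Width)
	}
	fmt.Printf("total: %d bits = %d bytes\n", total, total/8)
}
```

Changing a single tag (say Health from 9 to 10) shifts every later offset in this table, which is exactly the bookkeeping that is easy to get wrong by hand.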
That definition gives you five functions automatically.
Marshal — pack your struct into raw bytes ready
to send over the wire:
data, err := nibble.Marshal(&p)
Unmarshal — parse incoming bytes back into your struct:
err := nibble.Unmarshal(data, &p)
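For a sense of what those two calls replace, here is a minimal hand-rolled sketch of packing and unpacking fields at arbitrary bit offsets. Everything in it is an assumption made for illustration: putBits, getBits and signExtend are hypothetical helpers, the bit order is LSB-first within each byte (matching the manual weaponID shift shown later in this post), and signed fields are assumed to be two's complement.

```go
package main

import "fmt"

// putBits writes the low `width` bits of v into buf starting at bit `offset`.
// Assumed ordering: packet bit i lives in bit i%8 of byte i/8 (LSB-first).
func putBits(buf []byte, offset, width int, v uint64) {
	for i := 0; i < width; i++ {
		pos := offset + i
		mask := byte(1) << uint(pos%8)
		if (v>>uint(i))&1 == 1 {
			buf[pos/8] |= mask
		} else {
			buf[pos/8] &^= mask
		}
	}
}

// getBits reads `width` bits starting at bit `offset` as an unsigned value.
func getBits(buf []byte, offset, width int) uint64 {
	var v uint64
	for i := 0; i < width; i++ {
		pos := offset + i
		bit := uint64(buf[pos/8]>>uint(pos%8)) & 1
		v |= bit << uint(i)
	}
	return v
}

// signExtend interprets the low `width` bits of v as a two's complement number.
func signExtend(v uint64, width int) int64 {
	shift := uint(64 - width)
	return int64(v<<shift) >> shift
}

func main() {
	buf := make([]byte, 8)

	var health uint16 = 100
	var posX int16 = -86

	putBits(buf, 7, 9, uint64(health)) // Health: bits 7-15
	putBits(buf, 16, 12, uint64(posX)) // PosX: bits 16-27, only the low 12 bits are stored

	fmt.Println("Health:", getBits(buf, 7, 9))
	fmt.Println("PosX:  ", signExtend(getBits(buf, 16, 12), 12))
}
```

Every offset and width in main has to be kept in sync with the protocol by hand. That is the part a tag-driven library takes over.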
Validate — nibble knows the maximum value for every
field from its bit width. A 9-bit field cannot exceed 511.
nibble enforces this for every field automatically with
no extra code from you:
err := nibble.Validate(&p)
// returns ErrFieldOverflow if any field exceeds its bit width
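The range check itself is simple arithmetic. Here is a minimal sketch of it, assuming unsigned fields span 0 to 2^n - 1 and signed fields use two's complement (-2^(n-1) to 2^(n-1) - 1). The helper names and errOverflow are stand-ins for illustration, not nibble's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// errOverflow is a stand-in for the ErrFieldOverflow mentioned in the post.
var errOverflow = errors.New("field overflow")

// checkUnsigned reports whether v fits in an unsigned field of 1..63 bits.
func checkUnsigned(v uint64, width uint) error {
	limit := uint64(1)<<width - 1
	if v > limit {
		return fmt.Errorf("%w: %d exceeds %d-bit maximum %d", errOverflow, v, width, limit)
	}
	return nil
}

// checkSigned reports whether v fits in a two's complement field of 2..63 bits.
func checkSigned(v int64, width uint) error {
	lo := -(int64(1) << (width - 1))
	hi := int64(1)<<(width-1) - 1
	if v < lo || v > hi {
		return fmt.Errorf("%w: %d outside %d-bit range [%d, %d]", errOverflow, v, width, lo, hi)
	}
	return nil
}

func main() {
	fmt.Println(checkUnsigned(511, 9)) // Health at its maximum: fits
	fmt.Println(checkUnsigned(512, 9)) // one past the maximum: overflow
	fmt.Println(checkSigned(-2048, 12)) // PosX at its minimum: fits
	fmt.Println(checkSigned(2048, 12))  // one past the maximum: overflow
}
```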
Explain — this is the one I personally find most useful.
Paste in raw hex bytes and nibble tells you exactly what
every single bit means in plain English:
result, err := nibble.Explain(data, GamePacket{})
Output:
Byte 0 [10110011]:
bit 0 → IsAlive: true (1)
bits 1-4 → WeaponID: 9 (1001)
bits 5-6 → TeamID: 1 (01)
bit 7 → Health: [continues...]
Byte 1 [00011111]:
bits 0-8 → Health: 225
I cannot tell you how many hours this would have saved me
staring at hex dumps trying to work out which bits
belong to which field.
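The breakdown of byte 0 can be checked by hand. This small sketch decodes 0xB3 (binary 10110011) with plain shifts and masks, assuming bit 0 is the least significant bit of the byte, which is the ordering the listing above implies. decodeFirstByte is a hypothetical helper written for this post, not a nibble function.

```go
package main

import "fmt"

// decodeFirstByte splits the first packet byte into its fields,
// treating bit 0 as the least significant bit.
func decodeFirstByte(b byte) (isAlive bool, weaponID, teamID, healthLow uint8) {
	isAlive = b&1 == 1        // bit 0
	weaponID = (b >> 1) & 0xF // bits 1-4
	teamID = (b >> 5) & 0x3   // bits 5-6
	healthLow = b >> 7        // bit 7: lowest bit of Health
	return
}

func main() {
	alive, weapon, team, healthLow := decodeFirstByte(0xB3)
	fmt.Printf("IsAlive=%t WeaponID=%d (%04b) TeamID=%d (%02b) Health low bit=%d\n",
		alive, weapon, weapon, team, team, healthLow)
	// prints: IsAlive=true WeaponID=9 (1001) TeamID=1 (01) Health low bit=1
}
```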
Diff — compare two packets field by field.
Invaluable when you are trying to understand what
changed between two captures:
diffs, err := nibble.Diff(&packetA, &packetB)
// Field Before After
// Health 100 75
// PosX -86 -90
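The idea behind a field-by-field diff is easy to sketch with reflection. The version below is a standalone illustration that works on any struct with exported fields. fieldDiff and diffStructs are hypothetical names, and this is not a description of how nibble implements Diff.

```go
package main

import (
	"fmt"
	"reflect"
)

// fieldDiff records one field whose value differs between two structs.
type fieldDiff struct {
	Field  string
	Before any
	After  any
}

// diffStructs compares two values of the same struct type field by field.
// It assumes all fields are exported, since Interface() panics otherwise.
func diffStructs(a, b any) ([]fieldDiff, error) {
	va, vb := reflect.ValueOf(a), reflect.ValueOf(b)
	if va.Kind() != reflect.Struct || vb.Kind() != reflect.Struct || va.Type() != vb.Type() {
		return nil, fmt.Errorf("diffStructs: need two values of the same struct type")
	}
	var diffs []fieldDiff
	for i := 0; i < va.NumField(); i++ {
		x, y := va.Field(i).Interface(), vb.Field(i).Interface()
		if !reflect.DeepEqual(x, y) {
			diffs = append(diffs, fieldDiff{Field: va.Type().Field(i).Name, Before: x, After: y})
		}
	}
	return diffs, nil
}

// state is a cut-down packet used only for this example.
type state struct {
	Health uint16
	PosX   int16
	TeamID uint8
}

func main() {
	before := state{Health: 100, PosX: -86, TeamID: 1}
	after := state{Health: 75, PosX: -90, TeamID: 1}

	diffs, err := diffStructs(before, after)
	if err != nil {
		panic(err)
	}
	for _, d := range diffs {
		fmt.Printf("%-8s %v -> %v\n", d.Field, d.Before, d.After)
	}
}
```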
When your protocol changes — say health expands from
8 bits to 9 bits — you update one struct tag.
nibble recalculates everything. The bit math,
the validation range, the marshal and unmarshal logic
— all updated automatically. No hunting through
manual bit shift code hoping you caught every place
that needs updating.
Truncated packet arriving from the network?
nibble returns ErrInsufficientData — not a panic.
Your server stays up.
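The difference between an error and a panic comes down to checking the length before indexing. Here is a minimal sketch of that guard, assuming the 8-byte GamePacket layout above with Rotation occupying bits 40 to 47 (byte 5). errInsufficientData is a stand-in for the library's ErrInsufficientData and parseRotation is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
)

// errInsufficientData is a stand-in for the ErrInsufficientData named in the post.
var errInsufficientData = errors.New("insufficient data")

// packetBytes is the size of the example GamePacket: 64 bits.
const packetBytes = 8

// parseRotation returns the Rotation field, or an error for a short packet.
// Without the length check, data[5] would panic on a truncated slice.
func parseRotation(data []byte) (uint8, error) {
	if len(data) < packetBytes {
		return 0, fmt.Errorf("%w: need %d bytes, got %d", errInsufficientData, packetBytes, len(data))
	}
	return data[5], nil
}

func main() {
	_, err := parseRotation([]byte{0xB3, 0x70, 0x00}) // truncated packet
	fmt.Println("short packet:", err)

	rot, err := parseRotation([]byte{0, 0, 0, 0, 0, 200, 0, 0})
	fmt.Println("full packet: ", rot, err)
}
```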
Field overflow from a malicious client?
Caught at the library level before it ever
reaches your application logic.
The benchmarks
I will be honest — the first version was not fast.
2,100 nanoseconds per packet. Around 350 times slower
than manual bit manipulation. That was reflection
running on every single call with no caching.
Not acceptable.
The fix was schema caching. Parse the struct tags once,
cache the full field layout in memory, reuse it on
every subsequent call.
Result: 182 nanoseconds per packet. An 11x improvement
from one optimisation.
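The caching pattern described here can be sketched with the standard library alone: key a sync.Map by reflect.Type so the tag parsing runs once per struct type. This is an illustration of the general technique under my own naming (schema, schemaFor, parseSchema and parseCount are hypothetical), not a copy of nibble's cache.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"sync"
)

// schema is a hypothetical cached layout: one bit width per field plus the total.
type schema struct {
	widths []int
	total  int
}

var (
	schemaCache sync.Map // reflect.Type -> *schema
	parseCount  int      // slow-path parses; demo counter only, not goroutine-safe
)

// parseSchema is the slow path: it walks the struct tags via reflection.
func parseSchema(t reflect.Type) (*schema, error) {
	parseCount++
	s := &schema{}
	for i := 0; i < t.NumField(); i++ {
		w, err := strconv.Atoi(t.Field(i).Tag.Get("bits"))
		if err != nil || w <= 0 {
			return nil, fmt.Errorf("field %s: bad bits tag", t.Field(i).Name)
		}
		s.widths = append(s.widths, w)
		s.total += w
	}
	return s, nil
}

// schemaFor returns the cached schema for v's type, parsing it only on first use.
func schemaFor(v any) (*schema, error) {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return nil, fmt.Errorf("schemaFor: expected a struct value, got %T", v)
	}
	if cached, ok := schemaCache.Load(t); ok {
		return cached.(*schema), nil // fast path: no tag parsing
	}
	s, err := parseSchema(t)
	if err != nil {
		return nil, err
	}
	// LoadOrStore keeps a single winner if two goroutines parse concurrently.
	actual, _ := schemaCache.LoadOrStore(t, s)
	return actual.(*schema), nil
}

// header is a cut-down packet used only for this example.
type header struct {
	IsAlive  bool  `bits:"1"`
	WeaponID uint8 `bits:"4"`
	TeamID   uint8 `bits:"2"`
}

func main() {
	for i := 0; i < 3; i++ {
		s, err := schemaFor(header{})
		if err != nil {
			panic(err)
		}
		fmt.Printf("call %d: total=%d bits, parses so far=%d\n", i+1, s.total, parseCount)
	}
}
```

The remaining per-call cost is the type lookup plus reading and writing field values through reflection, which is consistent with the cached version still being slower than hand-written shifts.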
Against go-bitfield — the closest existing library —
nibble is consistently 10 times faster across every
dataset size. 1,825 nanoseconds versus 182.
And unlike go-bitfield, nibble actually has
a Marshal path. They never implemented it.
Against manual bit manipulation — yes, manual wins
on raw speed. 6 nanoseconds versus 182.
I am not hiding that number. Manual bit math
at the CPU level will always be faster than
reflection-based code.
But here is what that gap means in practice.
At 100,000 packets per second — a realistic
production game server load — nibble consumes
less than 2% of one CPU core. The other 98%
is free for your actual application logic.
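The 2% figure is just the per-packet cost multiplied by the packet rate. A small sketch of that arithmetic, using the numbers quoted above (coreFraction is a hypothetical helper, and the model assumes a single core with no other overhead):

```go
package main

import "fmt"

// coreFraction returns the share of one CPU core consumed when each packet
// costs nsPerPacket nanoseconds and packets arrive at packetsPerSec.
func coreFraction(nsPerPacket, packetsPerSec float64) float64 {
	const nsPerSecond = 1e9
	return nsPerPacket * packetsPerSec / nsPerSecond
}

func main() {
	const rate = 100_000 // packets per second

	fmt.Printf("nibble: %.2f%% of one core\n", 100*coreFraction(182, rate))
	fmt.Printf("manual: %.2f%% of one core\n", 100*coreFraction(6, rate))
	// prints:
	// nibble: 1.82% of one core
	// manual: 0.06% of one core
}
```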
One thing worth being upfront about: nibble currently
has 2 heap allocations per operation. Manual has zero.
Eliminating those allocations is the next target
for v0.2.0.
The full interactive benchmark report is here if
you want to dig into the numbers:
https://pavankumarms.github.io/nibble-benchmark/
And if you want to reproduce them yourself:
git clone https://github.com/PavanKumarMS/nibble-benchmarks
cd nibble-benchmarks
go test -bench=. -benchmem -benchtime=10s ./...
Who this is for
If you work with any of these in Go — this is for you:
- Game server backends — real-time UDP state packets
- IoT device hubs — sensor payloads, CAN bus frames, BLE characteristics
- VPN tooling such as wireguard-go, Tailscale, and Nebula, all written in Go and all parsing binary headers
- Network security — packet analysis, protocol fuzzing, CTF tooling
- Automotive — CAN bus and EV telemetry backends
And honestly — if you have ever written something like
this in Go and stared at it wondering what it does:
weaponID := (data[0] >> 1) & 0xF
This is for you.
Installation
go get github.com/PavanKumarMS/nibble@v0.1.0
What is next
There is plenty of room to improve and I have
a clear list:
- Eliminate the 2 heap allocations per operation
- Pre-built schemas for common protocols (TCP, UDP, DNS, BLE, CAN bus)
- Streaming encoder and decoder
- Schema DSL for cross-language code generation
One more thing
This is also my first time making a video
instead of writing a post. Turns out talking
to a camera is harder than debugging bit fields.
If you prefer watching over reading, the 12-minute
walkthrough covers the whole story — the problem,
the existing solutions, how nibble works,
and what the benchmarks actually mean.
Links
- Library: https://github.com/PavanKumarMS/nibble
- Benchmarks: https://pavankumarms.github.io/nibble-benchmark/
- Go proposal that was closed: https://github.com/golang/go/issues/29650
- Python construct library: https://construct.readthedocs.io
If you are working with binary protocols in Go
I would genuinely love to hear your use case.
What are you parsing? Is there something nibble
is missing that would make it useful for you?
Open an issue, drop a comment, or just star the
repo if you think this fills a gap you have seen too.