pretty ncube

The Moment the JSON Config Parser Became the Enemy

The Problem We Were Actually Solving

The treasure-hunt server receives 50 MB/s of dynamic map events—player moves, loot spawns, fog-of-war reveals—and must broadcast deltas to 100 k sockets without re-serializing the entire world every tick.
The public docs show a simple YAML snippet under config.yaml:

world:
 width: 1024
 height: 1024
 chunk_size: 32

What they do not mention is the hidden oltp_workers: 4 knob that the YAML parser silently casts to a u16 and then divides by the core count.
Our profile at 28 k sessions, captured with perf record -F99 -g -p <pid>, showed 42 % of CPU time spent in serde_yaml::from_reader, waiting on the lock around the global IndexMap.
The real constraint was never CPU or GC; it was the JSON/YAML bridge that blocked on every config reload even though the server never changed those values at runtime.

What We Tried First (And Why It Failed)

We started with serde_yaml because the helm chart shipped a ConfigMap volume.
After profiling with flamegraph-rs we measured 1.8 μs per config reload. Multiplied across 28 k sessions (1.8 μs × 28,000 ≈ 50 ms) and triggered by every Kubernetes watch event, that added about 50 ms of tail latency each time the ConfigMap updated, even when the file content was identical.
The stack trace was:

serde_yaml::indexmap::IndexMap<K,V>::entry
└── _raw_vec::RawVec<T,A>::reserve

The IndexMap kept reallocating the backing array on every watch trigger.
We tried serde_json with the same file; the parser was 2× faster, but the blocking I/O still destroyed tail latency.
The benchmark at 10 k players showed p99 = 34 ms; we needed < 50 ms to pass the load-test gate.
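
One way to avoid re-parsing when a watch event fires on identical content is to hash the raw bytes and skip the parse if the hash has not changed. The sketch below illustrates that idea only; it is not the fix described in the next section. ReloadGuard and should_reload are hypothetical names, and DefaultHasher comes from the Rust standard library.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Remembers a hash of the last config bytes that were accepted.
/// (Hypothetical helper, not part of the server described here.)
struct ReloadGuard {
    last_hash: Option<u64>,
}

impl ReloadGuard {
    fn new() -> Self {
        Self { last_hash: None }
    }

    /// Returns true only when `bytes` hash differently from the last
    /// content seen, so a no-op watch event can skip the parser.
    /// A hash collision could hide a real change; a production guard
    /// might compare the bytes themselves instead.
    fn should_reload(&mut self, bytes: &[u8]) -> bool {
        let mut hasher = DefaultHasher::new();
        bytes.hash(&mut hasher);
        let digest = hasher.finish();
        if self.last_hash == Some(digest) {
            return false;
        }
        self.last_hash = Some(digest);
        true
    }
}
```

A watch handler would call should_reload on the file contents first and hand them to the parser only when it returns true.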

The Architecture Decision

We ripped out the whole config layer and replaced it with a two-part system:

  1. A compile-time constants module generated from a tiny TOML file (constants.toml) with build.rs.
  2. A sidecar gRPC service that only accepts runtime state diffs and streams them to the main process over a Unix domain socket.
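
The first part can be sketched as the generation step a build.rs might run. This is a minimal illustration under stated assumptions, not the project's actual build script: render_constants is a hypothetical helper, it handles only flat `key = integer` lines, and it ignores table headers, whereas a real script would more likely use a TOML parsing crate.

```rust
/// Turn `key = integer` lines into Rust `pub const` declarations.
/// Hypothetical helper: table headers such as `[world]` are skipped,
/// so keys are flattened into a single namespace.
fn render_constants(toml_src: &str) -> Result<String, String> {
    let mut out = String::from("// @generated from constants.toml; do not edit.\n");
    for (idx, raw) in toml_src.lines().enumerate() {
        // Drop trailing comments and surrounding whitespace.
        let line = raw.split('#').next().unwrap_or("").trim();
        if line.is_empty() || line.starts_with('[') {
            continue;
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected `key = value`", idx + 1))?;
        let value: u32 = value
            .trim()
            .parse()
            .map_err(|e| format!("line {}: {}", idx + 1, e))?;
        out.push_str(&format!(
            "pub const {}: u32 = {};\n",
            key.trim().to_uppercase(),
            value
        ));
    }
    Ok(out)
}
```

In a build.rs, main would read constants.toml, write the returned string to a file under the directory Cargo exposes as OUT_DIR, and print cargo:rerun-if-changed=constants.toml. The crate can then pull the file in with include!(concat!(env!("OUT_DIR"), "/constants.rs")), where constants.rs is an assumed file name.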

The constants are embedded in the binary, so the treasure-hunt server never parses anything at runtime.
We moved the dynamic knobs—collision radius, loot table seed, rate limits—into a separate protobuf schema served by the sidecar.
The protobuf schema is versioned and delta-encoded, and the sidecar is built on tonic, which runs on the tokio async runtime, so the config change path is lock-free and non-blocking.
The sidecar is also written in Rust, and the main server now spends zero CPU on config parsing and zero wall time on file I/O.
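
To make "versioned, delta-encoded" concrete, here is a minimal sketch of how the main process might apply a diff received from the sidecar. The struct and field names are hypothetical (they follow the knobs named above), plain Rust structs stand in for the generated protobuf types, and the version check is an assumed policy rather than the schema's documented behavior.

```rust
/// Hypothetical runtime knobs held by the main process.
#[derive(Debug, Clone, PartialEq)]
struct RuntimeKnobs {
    version: u64,
    collision_radius: f32,
    loot_table_seed: u64,
    rate_limit_per_sec: u32,
}

/// A delta carries only the fields that changed; `None` means "keep".
#[derive(Debug, Default)]
struct KnobsDelta {
    base_version: u64,
    collision_radius: Option<f32>,
    loot_table_seed: Option<u64>,
    rate_limit_per_sec: Option<u32>,
}

impl RuntimeKnobs {
    /// Apply a delta only if it was computed against the current version.
    fn apply(&mut self, delta: &KnobsDelta) -> Result<(), String> {
        if delta.base_version != self.version {
            return Err(format!(
                "stale delta: based on version {}, current is {}",
                delta.base_version, self.version
            ));
        }
        if let Some(radius) = delta.collision_radius {
            self.collision_radius = radius;
        }
        if let Some(seed) = delta.loot_table_seed {
            self.loot_table_seed = seed;
        }
        if let Some(limit) = delta.rate_limit_per_sec {
            self.rate_limit_per_sec = limit;
        }
        self.version += 1;
        Ok(())
    }
}
```

Because a delta names only the changed fields, an update that touches one rate limit stays a few bytes on the wire, and a delta replayed against a newer version is rejected instead of silently overwriting state.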

What The Numbers Said After

After the change we re-ran the 28 k session test with perf stat -e cache-misses,instructions -d and saw:

Before:
 42.1 % cache misses
 1.3 s p99 w/ config updates
 2 RTS (runtime scaling stalls)
After:
 11.8 % cache misses
 29 ms p99
 8 RTS (no stalls)

Tail latency at 1 ms granularity (collected with tokio-console) dropped from 48 ms to 6 ms.
The sidecar measured 120 B/s of traffic even under load, so the diff protocol is effectively free.
We also removed the jemalloc dependency in the main process because the config hot path was gone; RSS dropped from 1.4 GB to 920 MB.

What I Would Do Differently

We should have asked on day one: Which subsystems are actually dynamic?
The docs hint at a combined.yaml that mixes compile-time constants with runtime overrides; that hint is a footgun.
Next time I see a YAML file in the critical path I will pre-process it at build time with serde, emit a generated source file, and include! it: no runtime parsing, no locks, no surprises.
The only runtime configuration that survives will be the gRPC diff service, and that path is already async and lock-free by design.

The moment the JSON config parser became the enemy was the moment we stopped reading the docs and started profiling the real bottleneck.



