arcker

Originally published at arcker.org

Every elevator has a load plate. Tests are supposed to kill fear, not feed it.

Every elevator has a small metal plate near the doors: 8 persons — 630 kg. It doesn't mean the car snaps at the ninth person. It means an engineer loaded it, measured it, and signed an envelope: this is what I guarantee, within these limits, with margin. The plate isn't a confession of weakness. It's the exact thing that lets you step in without a second thought.

I build Lithair — a small, memory-first Rust web framework. This post isn't about its API; it stands on its own. It's about a sentence I've come to believe — DevOps without the test isn't DevOps — and the reason for it is probably not the one you expect. It's not about laziness. It's about fear.

The plate is permission to stop worrying

Look around and the load-bearing things in the physical world all carry their envelope stamped on the outside. A bridge has a weight rating. A cable has an amperage. A climbing carabiner has a kN number etched into the spine. None of those numbers are bureaucracy. Each one is permission: you don't have to be afraid of this, here's exactly how far it's been taken. The plate converts a private unknown into a public, re-checkable fact, and the reward for that conversion is that nobody has to feel the unknown anymore.

That is what a test is for. Not a gate. Not a ritual. A way to stop being afraid of your own system.

What I've actually seen

I've spent years around enterprise software, and when real testing existed, it usually wore one of two shapes — and both, underneath, were shaped by fear.

There were the dedicated performance teams: a silo you handed the question to, where the schedule was measured in quarters, and where sometimes the software itself had to be bent to fit the benchmark rather than the other way around. Real rigor, but distant — the envelope lived in someone else's backlog.

And there were the functional-test teams who panicked the moment anyone touched anything. Change the color of a logo and you'd get a straight face and the words "non-regression testing." Not because they were unreasonable — because they had no fast, trustworthy way to know whether your one-line change broke something three modules away. So they did the only thing fear leaves you: they said don't touch it.

I want to be careful here, because it's easy to read this as contempt and it isn't. That fear is completely rational when you can't verify. If I couldn't check, I'd guard the gate too. The tragedy isn't the fear. It's that the one thing built to dissolve it — automated tests, ideally backed by a dedicated tool — so often gets turned into another thing to be afraid of.

Tests are the cure for that fear

A test you actually trust does something quietly radical: it gives you back the right to change things. Change the logo color and the suite tells you, in seconds, that the checkout flow still works. Change the storage engine and the harness tells you the throughput still holds. The fear doesn't get managed or scheduled or escalated — it just leaves, because the unknown it fed on is now a number on a screen.

That's the whole point of the plate. You can step into the elevator because someone already took it to the edge and wrote down where the edge is. Automated tooling is how you stamp that plate cheaply enough to do it for everything, not just the one system important enough to rent a perf team. And DevOps — the actual discipline, not the job title — is made for exactly this: turning "don't touch it" into "touch it freely, the tooling will tell you the truth."

So we built the bench

Lithair's cluster had no performance team behind it and no QA gate in front of it. There was just the question — what does it hold? — and nobody to hand it to. So the test became the work: we wrote our own stress harness and ran it until the numbers stopped being opinions.

A three-node cluster. 170,200 writes. Zero drops, zero panics, zero replication divergence across all three nodes. Good numbers — but the one that actually matters isn't a success, it's a ceiling: ~210–240 single-writes per second, per leader. Past that wall, the latency you measure isn't instability — it's pure queueing against the ceiling. The mild decay over a long run (241 → 206 ops/s) traces to in-memory state growth, which is exactly what a memory-first design predicts — not a leak.
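
To make the arithmetic behind those figures concrete, here is a minimal Rust sketch. It is not Lithair's harness: the function names are hypothetical, and the fluid-queue model (backlog grows at offered load minus ceiling) is a simplifying assumption chosen only to show why latency past the ceiling is queueing rather than instability.

```rust
/// Relative throughput decay between the start and the end of a run,
/// as a fraction of the starting throughput.
fn relative_decay(start_ops: f64, end_ops: f64) -> f64 {
    (start_ops - end_ops) / start_ops
}

/// Writes still waiting after `elapsed_secs` when clients offer `offered`
/// writes/s to a leader that completes at most `ceiling` writes/s.
/// Assumption: a simple fluid model with constant rates.
fn backlog_after(offered: f64, ceiling: f64, elapsed_secs: f64) -> f64 {
    ((offered - ceiling) * elapsed_secs).max(0.0)
}

/// Time a newly arrived write spends waiting behind that backlog,
/// given that the leader drains it at `ceiling` writes/s.
fn queue_wait_secs(backlog: f64, ceiling: f64) -> f64 {
    backlog / ceiling
}
```

With the figures above, `relative_decay(241.0, 206.0)` is 35/241, roughly a 14.5% drop over the run. Under the assumed model, a hypothetical 300 writes/s offered to a 240 writes/s leader for 10 seconds leaves 600 writes queued, so a new write waits about 2.5 seconds even though nothing has failed.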

We wrote all of it down. That's the plate. And the real payoff isn't the throughput number — it's that I can now change the cluster code and not be afraid of it, because the harness will tell me the moment I leave the envelope.

A plate only kills fear if it's honest

This is the part that's tempting to skip. A load rating that hides its caveats is worse than none, because it manufactures fresh fear later — the kind that shows up at 3 a.m. So the runbook names the ugly bits out loud:

  • The leader election isn't Raft — it's static, lowest live node ID wins. Always keep a node 0.
  • There's a brief two-leaders window when an old leader rejoins after a partition. Documented and non-blocking — committed split-brain is prevented elsewhere, by majority-ack on writes — but it's real, so it's on the page.
  • A couple of endpoints are still stubs in this version. Named, not buried.
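
The first two caveats can be sketched in a few lines of Rust. This is an illustration of the rules as stated, not Lithair's actual code: the function names are hypothetical, and counting the leader's own copy as one acknowledgement is an assumption.

```rust
/// Static election as described above: the lowest live node ID wins.
/// Returns None when no node is reachable.
fn elect_leader(live_nodes: &[u64]) -> Option<u64> {
    live_nodes.iter().copied().min()
}

/// Smallest number of acknowledgements that forms a majority.
fn majority(cluster_size: usize) -> usize {
    cluster_size / 2 + 1
}

/// A write counts as committed only once a majority has acknowledged it.
/// Assumption: the leader's own copy counts as one acknowledgement.
fn is_committed(acks: usize, cluster_size: usize) -> bool {
    acks >= majority(cluster_size)
}
```

The two-leaders window falls out of stale views: a node that sees `[0, 1, 2]` elects 0, while a node still holding the older view `[1, 2]` elects 1. The majority rule is what keeps that from becoming committed divergence: in a three-node cluster, `is_committed(1, 3)` is false, so a leader that can reach only itself cannot commit a write.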

None of that is comfortable to publish under your own name. All of it belongs on the plate. A caveat you wrote down is one nobody has to discover in production.

Why the test is the deliverable, not the feature

The cluster code already worked before any of this started — replication, election, failover, all running. What it lacked wasn't capability. It lacked a plate, and you cannot stamp a plate you didn't measure.

"Production-ready" without an envelope test is an adjective: you believe it or you don't, and belief is just fear wearing optimism. With the test it becomes a falsifiable claim anyone can re-run — and a falsifiable claim is the only kind that lets a whole team relax. Strip the test out and what's left isn't speed or risk, it's the fear: the "don't touch it," the non-regression theater over a logo color, the silo nobody wants to disturb. DevOps was supposed to be the discipline that retires all of that. Without the test, you've kept the dashboards and the dread, and quietly thrown away the engineering.

You can put a plate on almost anything

The format generalizes to nearly every load-bearing thing you ship:

  • an API's requests/sec before tail latency cliffs,
  • a queue's depth before backpressure kicks in,
  • a batch job's row count before it OOMs,
  • a cluster's writes/sec before it's just queueing.

It's always the same shape: here's what I hold, here's why it moves, here's where you leave the tested region. And the reward is always the same too — not bragging rights, but the simple ability for you and the people around you to stop being afraid of your own system. Sometimes the entire tax is one afternoon with a stress harness and the nerve to write down what you actually find.
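
That shape is small enough to write down as a type. The following is a minimal sketch, assuming a plate is just a measured range plus a reason; every name here (`Plate`, `Reading`, `cluster_plate`) is hypothetical, and the only real data is the 241 and 206 ops/s figures from the cluster run.

```rust
/// One measured envelope: what it holds, in which unit, and why it moves.
#[derive(Debug)]
struct Plate {
    what: &'static str,
    unit: &'static str,
    low: f64,  // lowest sustained value seen during the run
    high: f64, // highest sustained value seen during the run
    why_it_moves: &'static str,
}

/// Where a fresh measurement lands relative to the tested region.
#[derive(Debug, PartialEq)]
enum Reading {
    Within,
    Below, // slower than anything measured: likely a regression
    Above, // faster than anything measured: untested, re-measure
}

impl Plate {
    /// Compare a new measurement against the stamped range.
    fn read(&self, measured: f64) -> Reading {
        if measured < self.low {
            Reading::Below
        } else if measured > self.high {
            Reading::Above
        } else {
            Reading::Within
        }
    }

    /// Render the plate as one line, the way it would be written down.
    fn stamp(&self) -> String {
        format!(
            "{}: {}-{} {} (moves with {})",
            self.what, self.low, self.high, self.unit, self.why_it_moves
        )
    }
}

/// Example plate using the throughput range reported for the cluster run.
fn cluster_plate() -> Plate {
    Plate {
        what: "single writes per leader",
        unit: "ops/s",
        low: 206.0,
        high: 241.0,
        why_it_moves: "in-memory state growth",
    }
}
```

A harness run after a code change would then call something like `cluster_plate().read(measured)` and fail the build on anything other than `Reading::Within`. Treating a reading above the range as a failure too is a deliberate choice in this sketch: it means the system has left the tested region, even if in a pleasant direction.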


The full numbers, the cluster runbook, and the v0.13.0 release that earned the plate are in the companion post on arcker.org (in French).
