The merge train was green. Canary baked six hours. Dashboards: healthy.
Friday morning, customers can't get through checkout.
Error rate: normal. Latency: normal. Nothing in the exception tracker. The system is healthy. The feature is broken.
Tests verify code paths. That's the job. But "does this function do what I said" is a different question from "is checkout working right now for real people." These fail independently. A team that trusts the first to answer the second has its confidence in the wrong place.
I walked into Brandfolder to find a product going down from a memory leak. The suite was green the whole time. Once we could watch the process behave under real load, the fix came fast. Before that, we were tightening bolts that weren't loose.
When a fleet opens dozens of clean PRs a day, nobody's reading every diff with the right fear in their gut. The only honest source of truth is the running system.
Instrument outcomes, not just exceptions. A user who didn't finish a flow is a signal, even when nothing threw. A 200 with an empty body is the happiest-looking failure in your whole system.
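Here is a minimal sketch of what that could look like, in Python. Everything in it is an illustrative assumption, not anything from a real system: the names `classify_response`, `FlowStats`, and `flow_is_unhealthy`, the rule that the endpoint should always return a body, and the threshold of half the baseline. The point is only that both signals are computed from what happened to the user, so they can fire when no exception did.

```python
from dataclasses import dataclass


def classify_response(status: int, body: bytes) -> str:
    """Label a response by its outcome, not just its status code."""
    if status >= 500:
        return "server_error"
    if status >= 400:
        return "client_error"
    # Assumption: this endpoint is always supposed to return content,
    # so a success status with an empty body counts as a failure.
    # (An endpoint that legitimately returns 204 would need its own rule.)
    if not body.strip():
        return "empty_success"
    return "ok"


@dataclass
class FlowStats:
    """Counts how many users started a flow and how many finished it."""
    started: int = 0
    completed: int = 0

    def completion_rate(self) -> float:
        # With no traffic there is nothing to judge, so report 1.0.
        return self.completed / self.started if self.started else 1.0


def flow_is_unhealthy(stats: FlowStats, baseline: float, tolerance: float = 0.5) -> bool:
    """True when the completion rate drops below a fraction of the usual baseline.

    `baseline` and `tolerance` are hypothetical knobs; real values would
    come from the flow's own history.
    """
    return stats.completion_rate() < baseline * tolerance
```

With a usual completion rate of 0.6, a window where 100 users started checkout and 20 finished would be flagged, even if every request returned 200 and the exception tracker stayed empty.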
Full essay: https://imacto.com/writing/the-tests-are-green