I'm a big fan of aviation, and one lesson from aviation safety has always stuck with me: accidents rarely happen because of a single mistake. Instead, they're usually the result of several small failures lining up in just the wrong way. Pilots and safety engineers often describe this through the Swiss cheese model: every safety layer has holes, but those holes normally don't matter because another layer catches the problem. An accident only becomes possible when all the holes align.
I accidentally learned the same lesson while working on a frontend application.
The accident involved an icon.
It Started as a DX Improvement
Our codebase used a large icon library. Rendering an icon required importing an SVG and passing it to a helper function. It worked, but it was repetitive and made components harder to read.
To improve the developer experience, I built an icon component. Instead of importing assets manually, developers could provide a name, size, and variant. The component also improved autocomplete and discoverability by narrowing available options based on the selected icon.
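A minimal sketch of how that kind of narrowing can be expressed in TypeScript. It is an illustration, not the actual component: the icon names, the variants, the `iconPath` helper, and the `/icons/` path layout are all assumptions.

```typescript
// Hypothetical icon catalog: names and variants are invented for illustration.
const ICONS = {
  "chevron-left": ["outline", "solid"],
  "sidebar": ["outline"],
} as const;

type IconName = keyof typeof ICONS;
type IconVariant<N extends IconName> = (typeof ICONS)[N][number];

// One props shape per icon name, so `variant` narrows once `name` is known.
type IconProps = {
  [N in IconName]: { name: N; variant: IconVariant<N>; size?: number };
}[IconName];

// Builds the asset path a component might request (path layout is assumed).
function iconPath(props: IconProps): string {
  return `/icons/${props.name}.${props.variant}.svg`;
}
```

With these types, `{ name: "sidebar", variant: "solid" }` is rejected at compile time, and an editor only suggests `"outline"` once `name` is `"sidebar"`.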
Behind the scenes, a build plugin scanned the codebase, detected which icons were being used, and bundled only those assets. The result was great: hundreds of import lines disappeared, icon usage became standardized across the application, and the codebase became easier to work with.
For months, everything worked perfectly.
The Incident
One afternoon I merged a small pull request that replaced an icon in the application's sidebar handle.
A few minutes later, reports started arriving. The sidebar was flickering, part of the application header was somehow rendering inside the sidebar, a giant black bar appeared on the screen (WTF?), navigation stopped working, network requests entered an infinite loop, and eventually browser tabs crashed.
The timing made the culprit obvious. I immediately reverted the change to stop the bleeding and assumed I had somehow broken the icon component itself.
That turned out to be only a small part of the story.
What I eventually discovered wasn't one bug. It was five independent failures that had quietly accumulated over time.
Failure #1: The Build Plugin Missed the Icon
The icon component relied on a build plugin that scanned the repository and bundled the required icon assets. The problem was that our codebase was a modular monolith and the plugin wasn't scanning every package. The newly introduced icon lived in a package outside the scan scope, so as far as the build process was concerned, that icon didn't exist.
No asset was generated. The first hole had appeared.
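The scan-scope gap can be reproduced with a toy scanner. This is a sketch under assumptions, not the real plugin: the `<Icon name="...">` syntax, the package paths, and the in-memory `repo` object are invented so the example runs without touching the file system.

```typescript
// Hypothetical in-memory repository: file path -> source text.
type Repo = Record<string, string>;

// Collects icon names used in files that live under one of the scan roots.
function collectIcons(repo: Repo, roots: string[]): Set<string> {
  const found = new Set<string>();
  for (const path of Object.keys(repo)) {
    // Files outside the scan scope are silently skipped.
    if (!roots.some((root) => path.startsWith(root))) continue;
    // Assumed usage syntax: <Icon name="..." ... />
    const usage = /<Icon\s+[^>]*name="([^"]+)"/g;
    let match: RegExpExecArray | null;
    while ((match = usage.exec(repo[path])) !== null) {
      found.add(match[1]);
    }
  }
  return found;
}

const repo: Repo = {
  "packages/app/Header.tsx": '<Icon name="menu" size={16} />',
  "packages/sidebar/Handle.tsx": '<Icon name="grip" size={16} />',
};

// Only one package is scanned, so "grip" never makes it into the bundle.
const partialScan = collectIcons(repo, ["packages/app/"]);
// Scanning every package finds both icons.
const fullScan = collectIcons(repo, ["packages/"]);
```

Nothing in the partial scan fails or warns. The icon is simply absent from the result, which is why this kind of gap can sit unnoticed for months.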
Failure #2: Missing Assets Didn't Return 404
At runtime, the component tried to fetch the generated SVG. The file didn't exist, so the server should have answered with a 404, which would have stopped the entire chain immediately.
Instead, our SPA routing configuration returned index.html for unknown routes. The request succeeded with 200 OK, but the response body contained an entire HTML document. The component expected an SVG and received the application itself.
A second hole aligned with the first.
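To see why the request looked healthy from the client's side, here is a small model of a catch-all SPA fallback. The `serve` function, the file table, and the HTML shell are hypothetical stand-ins for whatever the server or CDN is actually configured to do.

```typescript
type Reply = { status: number; contentType: string; body: string };

// Stand-in for the application's index.html.
const INDEX_HTML = '<!doctype html><html><body><div id="root"></div></body></html>';

// Hypothetical set of files that really exist on the server.
const FILES = new Map<string, Reply>([
  ["/icons/menu.svg", { status: 200, contentType: "image/svg+xml", body: "<svg></svg>" }],
]);

// Catch-all fallback: every unknown path is answered with the SPA shell.
function serve(path: string): Reply {
  return FILES.get(path) ?? { status: 200, contentType: "text/html", body: INDEX_HTML };
}

// The missing icon "succeeds", but the body is the application's HTML.
const missingIcon = serve("/icons/grip.svg");
```

A client that only looks at the status code cannot tell this reply apart from a real icon. Only the content type and the body give it away.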
Failure #3: The Component Trusted the Response
The icon component only checked whether the request succeeded. It never verified what was actually returned.
A successful response was assumed to contain valid SVG content, so the HTML document was injected into the page exactly as if it were an icon.
A third hole.
Failure #4: The Browser Did Exactly What It Was Supposed To
Up until this point, I thought I understood the bug. Missing icon, bad response, end of story.
I was wrong.
This was the part that surprised me the most.
When the browser encountered an HTML document inside an SVG context, it didn't simply ignore it. Instead, it broke out of SVG parsing and started creating real HTML elements. One of those elements happened to be our application's root component.
The browser effectively booted a brand-new copy of the application inside the icon. That new application rendered the same sidebar, which requested the same missing icon, which fetched the same HTML page, which created another application instance. The cycle repeated indefinitely until the browser eventually collapsed under its own weight.
Somehow, an icon had become a recursive application launcher.
At this point, four independent failures had lined up.
Failure #5: Our Tests Couldn't See the Problem
The final failure was the one that annoyed me the most.
Our visual tests never detected the issue because icon resolution worked differently outside production. In development and testing environments, icons were loaded directly from the library. In production, they were loaded from generated assets emitted by the build plugin.
The failing code path only existed in production. Our tests were validating a different system than the one users were actually running.
The Real Root Cause
When people hear this story, they usually ask which bug caused the outage.
I don't think that's the right question.
Every individual failure was relatively harmless:
- A plugin didn't scan every package.
- A missing asset returned HTML instead of a 404.
- A component trusted the response.
- A browser followed its parsing rules.
- A test environment differed from production.
None of these issues would have caused an outage on their own.
The outage happened because every layer assumed the previous layer had already validated something. The build process assumed icon references were correct. The server assumed unknown routes should fall back to the SPA. The component assumed successful responses contained SVG. The tests assumed production behaved like development.
Every layer trusted the previous layer.
Nobody verified.
That's what actually broke production.
Fixing It the Aviation Way
Once we understood the chain of failures, we stopped looking for a single fix. Aviation doesn't rely on a single safety mechanism, and neither should software systems.
First, we fixed the build plugin so it scanned every package in the repository. The missing icon could no longer disappear during the build.
Second, we fixed the runtime environment so missing icon requests returned a real 404 before any SPA fallback logic could run. Even if an icon somehow went missing again, the browser would receive an explicit failure instead of the application's HTML page.
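One way to express that ordering is as a routing decision. This is a sketch assuming generated icons live under a dedicated URL prefix. The `/icons/` prefix and the `route` function are illustrative, and the real rule would live in the server or CDN configuration.

```typescript
type Route = { status: 200 | 404; serves: "file" | "spa-shell" | "not-found" };

// Assumed convention: generated assets are served from these URL prefixes.
const ASSET_PREFIXES = ["/icons/"];

function route(path: string, existingFiles: Set<string>): Route {
  if (existingFiles.has(path)) return { status: 200, serves: "file" };

  // Asset paths are checked first and never fall through to the SPA shell.
  const isAsset = ASSET_PREFIXES.some((prefix) => path.startsWith(prefix));
  if (isAsset) return { status: 404, serves: "not-found" };

  // Everything else is treated as a client-side route.
  return { status: 200, serves: "spa-shell" };
}
```

For example, `route("/icons/grip.svg", new Set(["/icons/menu.svg"]))` yields a 404, while an unknown page such as `/settings/profile` still gets the SPA shell.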
Finally, we hardened the icon component itself. Before rendering any response, it now verifies that the payload actually looks like SVG content. We also cache failures so missing assets cannot trigger endless retries.
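A minimal sketch of what such a check and failure cache could look like. The function names, the `IconResponse` shape, and the injected `fetcher` are assumptions made to keep the example self-contained, and the SVG check is a heuristic rather than a full parser.

```typescript
type IconResponse = { ok: boolean; contentType: string | null; body: string };
type Fetcher = (url: string) => Promise<IconResponse>;

// Heuristic: the content type and the leading markup must both indicate SVG.
function looksLikeSvg(contentType: string | null, body: string): boolean {
  const type = (contentType ?? "").toLowerCase();
  if (!type.startsWith("image/svg+xml")) return false;
  // Skip an optional BOM, XML declaration, and comments before the root element.
  const head = body.replace(/^\uFEFF?\s*(<\?xml[^>]*\?>\s*)?(<!--[\s\S]*?-->\s*)*/, "");
  return /^<svg[\s>]/i.test(head);
}

// URLs that already failed; they are not requested again.
const failedUrls = new Set<string>();

function shouldFetch(url: string): boolean {
  return !failedUrls.has(url);
}

// Returns the SVG markup, or null (and remembers the failure) for anything else.
function acceptResponse(url: string, res: IconResponse): string | null {
  if (res.ok && looksLikeSvg(res.contentType, res.body)) return res.body;
  failedUrls.add(url);
  return null;
}

// Wiring: the fetcher is injected so the logic can be exercised without a network.
async function loadIcon(url: string, fetcher: Fetcher): Promise<string | null> {
  if (!shouldFetch(url)) return null;
  try {
    return acceptResponse(url, await fetcher(url));
  } catch {
    failedUrls.add(url); // network errors count as failures too
    return null;
  }
}
```

With this in place, a 200 response carrying `text/html` is rejected before anything is rendered, and the same URL is not requested a second time.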
Any one of those changes would have prevented the incident. Together, they make it significantly harder for a similar failure chain to form again.
Final Thoughts
The funny thing is that the original icon component was still a good idea. It improved the developer experience, reduced boilerplate, and made the codebase easier to work with.
The mistake wasn't building the abstraction. The mistake was assuming the abstraction only needed to work when everything else behaved correctly.
Looking back, what surprised me wasn't that an icon was missing. What surprised me was how many parts of the system silently assumed somebody else had already validated things. The build process trusted the source code, the server trusted the routing configuration, the component trusted the response, and the tests trusted that production behaved like development.
We fixed the build plugin, fixed the server, and hardened the component. The icon works now, but more importantly, the next missing icon has to get past at least three independent checks before it can take production down.