I Deleted 40,000 Lines of "Dead Code" — Production Broke in 3 Minutes
We all hate dead code. It's the junk drawer of your codebase — nobody knows what it does, nobody's touched it in years, and it's just sitting there taking up space.
So when I ran our code coverage tool and it showed 40,000 lines with zero references, I felt like a hero. I was about to clean up this mess. The PR description was literally: "Housekeeping — removing unused code."
I was so wrong.
The Setup
Our codebase had grown organically over 5 years. Multiple teams, multiple rewrites, at least two "we'll clean it up later" phases that never happened.
The coverage report was clear: these functions, classes, and modules had zero callers. Zero imports. Zero references. The tool even showed me the exact files. I spent maybe 20 minutes verifying — clicked through a few call chains, searched for dynamic references. Nothing.
I opened the PR. One reviewer approved in 5 minutes. The other didn't even look. We merged on a Thursday at 4 PM. Classic.
What Happened Next
Minute 1-3: The Silence
Deploy went green. All tests passed. No alerts fired. I closed my laptop feeling productive.
Minute 4: The First Page
Slack notification: "Checkout is failing."
Not "checkout is slow." Not "checkout is weird." Failing. As in, customers couldn't buy things.
Minute 10: The Investigation
I looked at the error logs. The stack trace pointed to a file I had just deleted. But that's impossible — the coverage tool said nothing called it.
Then I found it.
The Problem: Dynamic References
One of our "legacy" payment integrations used eval() — yes, eval — to dynamically construct payment processor class names from a configuration database.
# config table had: "processor_class": "StripeLegacyProcessor"
processor = eval(f"{config.processor_class}(api_key)")
The coverage tool couldn't see it because the reference was a string in the database, not code. The static analysis had no way to know that "StripeLegacyProcessor" in a database row meant from legacy_payments import StripeLegacyProcessor.
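For contrast, here is a minimal sketch of resolving the same configured name without eval(). The registry dict and the build_processor helper are hypothetical names for illustration, not code from our codebase. The point is that the class is now referenced by a real identifier, so reference-finding tools can see it, and deleting the class breaks at import time instead of at checkout.

```python
from typing import Dict


class StripeLegacyProcessor:
    """Stand-in for the real legacy processor class."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key


# Hypothetical explicit registry: config string -> class object.
# Because the class is named here as an identifier, static tools
# count this as a reference.
PROCESSOR_REGISTRY: Dict[str, type] = {
    "StripeLegacyProcessor": StripeLegacyProcessor,
}


def build_processor(processor_class: str, api_key: str):
    """Look up a configured class name in the registry instead of eval()."""
    try:
        cls = PROCESSOR_REGISTRY[processor_class]
    except KeyError:
        # Unknown names fail loudly instead of being executed as code.
        raise ValueError(f"Unknown processor class: {processor_class!r}") from None
    return cls(api_key)
```

A lookup like this also rejects anything that isn't a known class name, which eval() on a database value does not.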
But wait — it gets worse.
The Hidden Web
That one eval() was just the tip of the iceberg. Once I started searching for all dynamic references, I found:
- Database-driven feature routing — feature flags stored as class names, resolved at runtime
- Plugin system — a YAML file listed "active plugins" by class name
- Admin dashboard — dynamically loaded report generators based on user permissions
- Webhook handlers — URL paths mapped to handler classes via a JSON config
Each one was a string reference that my coverage tool couldn't see. Each one broke when I deleted the "dead" code.
The Fix
I reverted the entire PR in about 2 minutes. But the damage was done — we had maybe 15 minutes of checkout downtime, and I had to explain to the CTO why I'd broken the revenue pipeline for a "housekeeping" PR.
What I Learned
1. Coverage Tools Lie (About Dynamically Referenced Code)
Static analysis can only see static references, and a coverage report can only see what your test suite actually executes. If your codebase uses any form of dynamic loading (reflection, eval, metaprogramming, configuration-driven instantiation), a "zero references" result is incomplete.
2. "Dead Code" Is Often "Code That's Called Indirectly"
Before deleting anything, I should have:
- Searched for string literals matching class/function names
- Checked configuration files, database seeds, and migration scripts
- Looked for getattr(), eval(), exec(), importlib.import_module()
- Asked the team that originally wrote the code
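Part of that checklist can be automated for Python source. The sketch below uses the standard ast module to flag dynamic-loading calls and string literals that match the names you plan to delete. The function name find_dynamic_hints and the DYNAMIC_CALLS set are assumptions for illustration, and it only covers .py files. Config files, YAML, JSON, and database rows still need a plain text search.

```python
import ast
from typing import List, Set, Tuple

# Call names worth a manual look before deleting anything.
DYNAMIC_CALLS = {"eval", "exec", "getattr", "import_module", "__import__"}


def find_dynamic_hints(source: str, candidate_names: Set[str]) -> List[Tuple[int, str]]:
    """Return sorted (line, description) pairs for suspicious spots in one file."""
    hints = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Covers bare calls like eval(...) and attribute calls
            # like importlib.import_module(...).
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name in DYNAMIC_CALLS:
                hints.append((node.lineno, f"dynamic call: {name}()"))
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            # A string that exactly matches a deletion candidate is a red flag.
            if node.value in candidate_names:
                hints.append((node.lineno, f"string literal: {node.value!r}"))
    return sorted(hints)
```

Every hit is a prompt to go read the surrounding code, not proof either way. A name assembled from pieces at runtime will still slip past this.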
3. The Real Dead Code Test Is Runtime, Not Static
A better approach:
- Instrument the code — add logging to every "suspected dead" function
- Wait — run in production for at least one full business cycle
- Verify — only delete functions that logged zero calls over a meaningful period
- Feature flag it first — gate the code behind a flag, disable the flag, watch for breakage
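The instrumentation step can be as small as a decorator. This is a minimal sketch under some assumptions: the suspected_dead decorator, the CALL_COUNTS counter, and the logger name are all hypothetical, and the in-memory counter only covers a single process. In a real deployment you would rely on the log lines (or a metrics client) to aggregate across workers.

```python
import functools
import logging
from collections import Counter

logger = logging.getLogger("dead_code_audit")  # hypothetical logger name
CALL_COUNTS: Counter = Counter()  # per-process tally, for illustration only


def suspected_dead(func):
    """Mark a function as a deletion candidate and record every call to it."""
    key = f"{func.__module__}.{func.__qualname__}"

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALL_COUNTS[key] += 1
        logger.warning("suspected-dead code was called: %s", key)
        # Behavior is unchanged: the original function still runs.
        return func(*args, **kwargs)

    return wrapper


@suspected_dead
def legacy_report(total):
    """Example candidate; stands in for any 'zero references' function."""
    return total * 2
```

Anything that logs even once during the waiting period comes off the deletion list. For classes, decorating __init__ gives the same signal.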
4. PR Culture Matters
My reviewers approved a 40,000-line deletion in minutes. That's a process failure. Large deletions should get the same scrutiny as large additions — maybe more, because the risk is invisible.
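One way to make that scrutiny harder to skip is a small CI check on deletion size. The sketch below parses the text produced by git diff --numstat (tab-separated added, deleted, and path columns, with "-" for binary files) and flags PRs over a threshold. The function names and the 500-line default are assumptions; pick whatever threshold fits your team.

```python
def total_deleted_lines(numstat_output: str) -> int:
    """Sum the 'deleted' column from `git diff --numstat` output."""
    total = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        # Skip malformed lines and binary files, which report "-" for counts.
        if len(parts) < 3 or parts[1] == "-":
            continue
        total += int(parts[1])
    return total


def needs_extra_review(numstat_output: str, threshold: int = 500) -> bool:
    """Return True when a change deletes at least `threshold` lines."""
    return total_deleted_lines(numstat_output) >= threshold
```

A check like this doesn't review anything for you. It only makes sure a 40,000-line deletion can't be approved as if it were a typo fix.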
The Aftermath
We eventually cleaned up that codebase — but properly. We added runtime instrumentation, waited two weeks, identified the actually dead code (about 8,000 lines), and deleted it in small, verified batches.
The 32,000 lines that looked dead? They were all referenced dynamically. Every single one.
The Takeaway
Dead code is like a haunted house: it looks empty, but something's still living in there. Before you start demolishing walls, make sure nobody's home.
Next time I see "40,000 lines of unused code," I'm going to assume I just don't understand the codebase well enough yet.
Have you ever deleted code that wasn't actually dead? Share your war stories in the comments.