Every senior developer has stared at a file and thought: should I delete this?
You run vulture. You run deadcode. They both tell you the file has zero import
sites. You hover over the delete key.
And then you don't delete it. Because you don't actually know why it's there.
The real question isn't "is it dead?" — it's "why is it dead?"
Dead code falls into very different categories:
Category A — Accidentally orphaned. A file that got left behind when the
calling code was removed. Safe to delete immediately.
Category B — Intentionally parked. "keeping this around until Q2 rollout
completes." The holdback condition may or may not still apply.
Category C — Dynamically loaded. Appears dead to static analysis but gets
importlib.import_module()'d at runtime. Deleting it breaks production.
Category D — Replaced but not removed. The function was superseded by a better
implementation, the PR was merged, and nobody cleaned up. Safe to delete, but you
need to know what replaced it to be confident.
Existing tools treat all four identically: "unused." That's the gap fossil fills.
How fossil works
pip install fossil-code
fossil explain src/billing/legacy_processor.py
fossil runs five stages in under 3 seconds:
Stage 1: Static Analysis
Python's ast module builds a symbol table of everything the file exports. Then it
scans every other file in the repo for references — imports, calls, attribute access,
dynamic patterns (importlib, getattr, __import__). This is not grep; it's an
actual AST traversal that understands Python's import semantics.
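To make the difference from grep concrete, here is a minimal sketch of the kind of traversal Stage 1 describes. It is not fossil's actual code: `find_references`, `DYNAMIC_CALLS`, and the returned dictionary shape are hypothetical names chosen for illustration, and it only inspects a single source string rather than a whole repo.

```python
import ast

# Call names that suggest a module may be loaded or accessed dynamically.
# (Illustrative set, not fossil's real list.)
DYNAMIC_CALLS = {"import_module", "__import__", "getattr"}


def _call_name(node: ast.Call):
    """Return the bare name of the function being called, if it has one."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None


def find_references(source: str, target_module: str) -> dict:
    """Find static imports of target_module and dynamic-looking calls."""
    tree = ast.parse(source)
    prefix = target_module + "."
    static_imports = []
    dynamic_calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == target_module or alias.name.startswith(prefix):
                    static_imports.append(node.lineno)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"
            mod = node.module or ""
            if mod == target_module or mod.startswith(prefix):
                static_imports.append(node.lineno)
        elif isinstance(node, ast.Call):
            name = _call_name(node)
            if name in DYNAMIC_CALLS:
                dynamic_calls.append((name, node.lineno))
    return {
        "static_imports": sorted(static_imports),
        "dynamic_calls": sorted(dynamic_calls, key=lambda item: item[1]),
    }
```

A dynamic call does not prove the target is loaded at runtime, only that static analysis alone cannot rule it out, which is why it would lower confidence rather than settle the question.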
Stage 2: Git History Mining
GitPython traverses git log --follow for the target file. It walks commits
newest-to-oldest, checking at each step whether any other file in the repo was still
referencing the target. The first commit where all references drop to zero is the
death commit.
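The walk can be sketched as a pure function. This assumes one reading of the rule above: going back from HEAD, the death commit is the oldest commit in the unbroken run of commits with zero references. The function name and the `(sha, ref_count)` input shape are hypothetical; collecting those counts via GitPython is not shown.

```python
from typing import Optional, Sequence, Tuple


def find_death_commit(history: Sequence[Tuple[str, int]]) -> Optional[str]:
    """Locate the death commit in a newest-first history.

    Each entry is (commit_sha, number_of_references_at_that_commit).
    Returns None if the newest commit still references the target.
    """
    death = None
    for sha, ref_count in history:
        if ref_count > 0:
            # Going back in time, this commit still used the file,
            # so the previous (newer) one is where references hit zero.
            break
        death = sha
    return death
```

For example, `find_death_commit([("c4", 0), ("c3", 0), ("c2", 2)])` returns `"c3"`: `c2` still had two references, so `c3` is the commit that removed the last of them.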
Stage 3: Commit and PR Parsing
The death commit message is parsed for PR references (#441, PR 441,
pull request 441). If found, the PR title and merge context are extracted from the
commit body or (if a GitHub token is configured) via the GitHub API.
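The three reference styles listed above can be matched with one regular expression. This is a sketch under the assumption that matching is case-insensitive and that an optional `#` may follow "PR" or "pull request"; `PR_REF` and `extract_pr_numbers` are illustrative names, not fossil's API.

```python
import re

# Matches "#441", "PR 441", "PR #441", "pull request 441" (case-insensitive).
PR_REF = re.compile(r"(?:#|\bPR\s*#?|\bpull request\s*#?)(\d+)", re.IGNORECASE)


def extract_pr_numbers(message: str) -> list[int]:
    """Return PR numbers mentioned in a commit message, in order, de-duplicated."""
    seen: list[int] = []
    for match in PR_REF.finditer(message):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen
```

Note that a bare `#441` can also be an issue number on GitHub, so a match is only a candidate until the PR is actually looked up.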
Stage 4: Pattern Detection
The file's current content is scanned for deferred-deletion patterns:
- TODO: remove after X
- keep for now / keep around until X
- DEPRECATED / @deprecated will be removed in version X
- temporary / temp fix
For each pattern, fossil attempts to verify the condition: does a PR with that
description exist and is it merged? Has a git tag for that version been created?
Has the referenced date passed?
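A rough sketch of the detection half of this stage, using regular expressions that approximate the patterns listed above. The exact expressions, the labels, and `find_deferred_markers` are assumptions for illustration; the verification step (checking PRs, tags, or dates) is not shown.

```python
import re
from typing import List, Optional, Tuple

# Approximations of the deferred-deletion patterns; "cond" captures the
# condition text when the pattern carries one.
DEFERRED_PATTERNS = {
    "todo_remove": re.compile(r"TODO:?\s*remove after\s+(?P<cond>.+)", re.IGNORECASE),
    "keep": re.compile(
        r"keep(?:ing)?\s+(?:for now|(?:this\s+)?around until\s+(?P<cond>.+))",
        re.IGNORECASE,
    ),
    "deprecated": re.compile(r"\bdeprecated\b", re.IGNORECASE),
    "removed_in_version": re.compile(
        r"will be removed in version\s+(?P<cond>\S+)", re.IGNORECASE
    ),
    "temporary": re.compile(r"\btemporary\b|\btemp fix\b", re.IGNORECASE),
}


def find_deferred_markers(source: str) -> List[Tuple[int, str, Optional[str]]]:
    """Return (line_number, pattern_label, condition_or_None) for each hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in DEFERRED_PATTERNS.items():
            match = pattern.search(line)
            if match:
                cond = match.groupdict().get("cond")
                hits.append((lineno, label, cond.strip() if cond else None))
    return hits
```

The captured condition (for instance `Q2 rollout completes` or `2.0`) is what a verification step would then try to resolve against PRs, tags, or the calendar.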
Stage 5: Confidence Scoring
14 weighted signals are aggregated into a 0–100% score:
| Signal | Weight |
|---|---|
| Zero call sites | +30 |
| No dynamic references | +20 |
| Death commit identified | +15 |
| Temporary hold resolved | +10 |
| No reflection patterns | +10 |
| File age > 90 days dead | +8 |
| PR/migration context found | +7 |
| Dynamic import detected | −30 |
| Reflection detected | −20 |
| Modified < 30 days ago | −20 |
| Unresolved "keep for now" | −15 |
| Language unknown (fallback) | −15 |
| Test file references | −10 |
| Ambiguous death commit | −10 |
The output is a Rich panel with every signal explained, a risk label, and a
suggested action.
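One plausible way to combine the weights in the table is to sum the signals that fired and clamp the result to 0–100. The article does not spell out the aggregation rule, so the sum-and-clamp step is an assumption, and the signal keys below are made-up identifiers for the table rows.

```python
# Weights copied from the table above; keys are hypothetical identifiers.
WEIGHTS = {
    "zero_call_sites": 30,
    "no_dynamic_references": 20,
    "death_commit_identified": 15,
    "temporary_hold_resolved": 10,
    "no_reflection_patterns": 10,
    "dead_over_90_days": 8,
    "pr_context_found": 7,
    "dynamic_import_detected": -30,
    "reflection_detected": -20,
    "modified_under_30_days": -20,
    "unresolved_keep_for_now": -15,
    "language_unknown": -15,
    "test_file_references": -10,
    "ambiguous_death_commit": -10,
}


def confidence(signals: set[str]) -> int:
    """Sum the weights of the signals that fired, clamped to 0..100."""
    raw = sum(WEIGHTS[name] for name in signals)
    return max(0, min(100, raw))
```

With this rule the seven positive signals add up to exactly 100, so a file reaches full confidence only when every positive signal fires and no negative one does.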
Scan an entire codebase
fossil scan ./src --threshold 80
Returns a ranked table of all dead files above 80% confidence, with their
dead-since date and risk level. Exit code 4 means nothing above the threshold —
useful for CI gates.
Current state and roadmap
Live now (v0.2.0):
- fossil explain: full forensic report
- fossil scan: directory scan with confidence threshold
- fossil clean: prioritized deletion backlog
- Python deep analysis (AST), text fallback for JS/TS/Java/Go
- Pattern detection with condition verification
- Local SQLite caching (cache hits < 100ms)
Coming next:
- GitHub/GitLab API integration (PR title/body fetch, --yolo deletion PR creation)
- LLM narration for human-readable explanations
- tree-sitter for deep multi-language analysis
Install
pip install fossil-code # Python 3.11+, requires git
GitHub: https://github.com/iamvvekverma/fossil
The confidence scoring weights are a first attempt — I'd genuinely like feedback on
whether the calibration holds for your codebase. What signal should weigh more? What
am I missing?