Vivek Verma

Posted on Jun 20

I built a dead code forensics CLI because "this file is unused" is never enough

#cli #showdev #sideprojects #tooling

Every senior developer has stared at a file and thought: should I delete this?

You run vulture. You run deadcode. They both tell you the file has zero import
sites. You hover over the delete key.

And then you don't delete it. Because you don't actually know why it's there.

The real question isn't "is it dead?" — it's "why is it dead?"

Dead code falls into very different categories:

Category A — Accidentally orphaned. A file that got left behind when the
calling code was removed. Safe to delete immediately.

Category B — Intentionally parked. "keeping this around until Q2 rollout
completes." The holdback condition may or may not still apply.

Category C — Dynamically loaded. Appears dead to static analysis but gets
importlib.import_module()'d at runtime. Deleting it breaks production.

Category D — Replaced but not removed. The function was superseded by a better
implementation, the PR was merged, and nobody cleaned up. Safe to delete, but you
need to know what replaced it to be confident.

Existing tools treat all four identically: "unused." That's the gap fossil fills.

How fossil works

pip install fossil-code
fossil explain src/billing/legacy_processor.py

fossil runs five stages in under 3 seconds:

Stage 1: Static Analysis

fossil uses Python's ast module to build a symbol table of everything the file
exports. It then scans every other file in the repo for references: imports, calls,
attribute access, and dynamic patterns (importlib, getattr, __import__). This is not
grep; it's an actual AST traversal that understands Python's import semantics.
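
To make the idea concrete, here is a minimal sketch of an AST-based reference scan. It is not fossil's actual implementation: the function names, the DYNAMIC_CALLS set, and the sample sources are all assumptions made for illustration, and relative imports are ignored.

```python
import ast
from typing import Dict, List, Set

# Hypothetical list of call names treated as "dynamic" for this sketch.
DYNAMIC_CALLS = {"import_module", "__import__", "getattr"}


def exported_symbols(source: str) -> Set[str]:
    """Top-level functions, classes, and simple assigned names in a module."""
    symbols = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            symbols.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    symbols.add(target.id)
    return symbols


def find_references(source: str, module: str) -> Dict[str, List[int]]:
    """Line numbers of static imports of `module` and of dynamic-looking calls."""
    refs: Dict[str, List[int]] = {"static": [], "dynamic": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(a.name == module or a.name.startswith(module + ".") for a in node.names):
                refs["static"].append(node.lineno)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"; skipped in this sketch.
            if node.module and (node.module == module or node.module.startswith(module + ".")):
                refs["static"].append(node.lineno)
        elif isinstance(node, ast.Call):
            func = node.func
            name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
            if name in DYNAMIC_CALLS:
                refs["dynamic"].append(node.lineno)
    # ast.walk is breadth-first, so sort to get source order.
    return {kind: sorted(lines) for kind, lines in refs.items()}


# Made-up sources standing in for the target file and one of its callers.
target = "def charge(): ...\nRATE = 0.2\nclass Processor: ...\n"
caller = (
    "from billing.legacy_processor import charge\n"
    "import importlib\n"
    "mod = importlib.import_module('billing.' + name)\n"
)

symbols = exported_symbols(target)
refs = find_references(caller, "billing.legacy_processor")
```

The dynamic bucket only says "something in this file loads modules by name"; deciding whether that call can reach the target file needs more context than this sketch has.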

Stage 2: Git History Mining

GitPython traverses git log --follow for the target file. It walks commits
newest-to-oldest, checking at each step whether any other file in the repo was still
referencing the target. The death commit is the commit at which the last remaining
reference disappeared, so every commit from that point up to HEAD shows zero references.
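
The walk itself can be sketched without touching git. The function below is an illustration only: it assumes the per-commit reference counts have already been computed (in practice by re-running the reference scan at each commit), and the history list is made-up data.

```python
from typing import Optional, Sequence, Tuple


def find_death_commit(history: Sequence[Tuple[str, int]]) -> Optional[str]:
    """Locate the death commit in a newest-first list of (sha, reference_count).

    Returns the oldest commit in the unbroken run of zero-reference commits
    that starts at HEAD, or None if the file is still referenced at HEAD.
    """
    death = None
    for sha, ref_count in history:
        if ref_count > 0:
            break  # reached a commit where the file was still in use
        death = sha
    return death


# Hypothetical history, newest first: references disappeared in commit "c3".
history = [("e5", 0), ("d4", 0), ("c3", 0), ("b2", 2), ("a1", 3)]
death_commit = find_death_commit(history)
```

A file that was never referenced at any commit would make this function return the oldest commit in the list, which is one plausible source of an ambiguous result.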

Stage 3: Commit and PR Parsing

The death commit message is parsed for PR references (#441, PR 441,
pull request 441). If found, the PR title and merge context are extracted from the
commit body or (if a GitHub token is configured) via the GitHub API.
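
As a rough sketch, the three reference styles listed above can be matched with one regular expression. The pattern and function name are assumptions for illustration, not fossil's code, and a bare #N can also be an issue number, so a real implementation would need to confirm the match is a PR.

```python
import re
from typing import List

# Matches "#441", "PR 441", "PR #441", and "pull request 441" (case-insensitive).
PR_REF = re.compile(r"(?:#|\bPR\s*#?|\bpull request\s*#?)(\d+)", re.IGNORECASE)


def extract_pr_numbers(message: str) -> List[int]:
    """Return the distinct PR numbers mentioned in a commit message, in order."""
    seen: List[int] = []
    for match in PR_REF.finditer(message):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


numbers = extract_pr_numbers("Remove legacy processor (#441)")
```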

Stage 4: Pattern Detection

The file's current content is scanned for deferred-deletion patterns:

TODO: remove after X
keep for now / keep around until X
DEPRECATED / @deprecated
will be removed in version X
temporary / temp fix

For each pattern, fossil attempts to verify the condition: does a PR with that
description exist and is it merged? Has a git tag for that version been created?
Has the referenced date passed?
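
A minimal sketch of the detection half of this stage, assuming simple line-by-line regex matching. The pattern labels, the regexes, and the sample text are all hypothetical; the verification step (checking PRs, tags, and dates) is not shown.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical patterns modeled on the list above; `cond` captures the condition.
DEFERRAL_PATTERNS = {
    "todo_remove": re.compile(r"TODO:?\s*remove after\s+(?P<cond>.+)", re.I),
    "keep_for_now": re.compile(r"keep (?:for now|around until\s+(?P<cond>.+))", re.I),
    "deprecated": re.compile(r"@deprecated|\bDEPRECATED\b", re.I),
    "removal_version": re.compile(r"will be removed in version\s+(?P<cond>\S+)", re.I),
    "temporary": re.compile(r"\btemp(?:orary)?\b(?:\s+fix)?", re.I),
}


def scan_deferral_patterns(source: str) -> List[Tuple[int, str, Optional[str]]]:
    """Return (line_number, pattern_label, condition_or_None) for each hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in DEFERRAL_PATTERNS.items():
            match = pattern.search(line)
            if match:
                cond = match.groupdict().get("cond")
                hits.append((lineno, label, cond.strip() if cond else None))
    return hits


sample = (
    "# TODO: remove after Q2 rollout\n"
    "# keep around until PR 441 merges\n"
    "def charge(): ...  # will be removed in version 2.0\n"
)
hits = scan_deferral_patterns(sample)
```

Each captured condition ("Q2 rollout", "PR 441 merges", "2.0") is what the verification step would then try to resolve.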

Stage 5: Confidence Scoring

14 weighted signals are aggregated into a 0–100% score:

Signal	Weight
Zero call sites	+30
No dynamic references	+20
Death commit identified	+15
Temporary hold resolved	+10
No reflection patterns	+10
Dead for > 90 days	+8
PR/migration context found	+7
Dynamic import detected	−30
Reflection detected	−20
Modified < 30 days ago	−20
Unresolved "keep for now"	−15
Language unknown (fallback)	−15
Test file references	−10
Ambiguous death commit	−10

The output is a Rich panel with every signal explained, a risk label, and a
suggested action.
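
One simple way to aggregate the table is to sum the weights of the signals that fired and clamp the result to 0–100. That aggregation rule is an assumption for this sketch, as are the signal key names; only the weights come from the table above.

```python
from typing import Iterable

# Weights copied from the table; the key names are made up for this sketch.
SIGNAL_WEIGHTS = {
    "zero_call_sites": 30,
    "no_dynamic_references": 20,
    "death_commit_identified": 15,
    "temporary_hold_resolved": 10,
    "no_reflection_patterns": 10,
    "dead_over_90_days": 8,
    "pr_context_found": 7,
    "dynamic_import_detected": -30,
    "reflection_detected": -20,
    "modified_within_30_days": -20,
    "unresolved_keep_for_now": -15,
    "language_unknown": -15,
    "test_file_references": -10,
    "ambiguous_death_commit": -10,
}


def confidence_score(active_signals: Iterable[str]) -> int:
    """Sum the weights of the signals that fired, clamped to the 0-100 range."""
    raw = sum(SIGNAL_WEIGHTS[name] for name in active_signals)
    return max(0, min(100, raw))


# A file with no call sites and a known death commit, but touched recently.
score = confidence_score(
    ["zero_call_sites", "no_dynamic_references", "death_commit_identified", "modified_within_30_days"]
)
```

Under this rule the seven positive signals add up to exactly 100, so a single negative signal is enough to pull an otherwise perfect score below the maximum.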

Scan an entire codebase

fossil scan ./src --threshold 80

Returns a ranked table of all dead files above 80% confidence, with their
dead-since date and risk level. Exit code 4 means no file scored above the
threshold, which makes the command usable as a CI gate.
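
A CI step could wrap the command along these lines. This is a sketch under stated assumptions: exit code 4 is the only code described here, so the helper treats every other code as "not confirmed clean" rather than guessing what it means, and the function names are hypothetical.

```python
import subprocess

NOTHING_ABOVE_THRESHOLD = 4  # the only exit code this post describes


def scan_is_clean(returncode: int) -> bool:
    """True only when fossil reported nothing above the threshold."""
    return returncode == NOTHING_ABOVE_THRESHOLD


def run_scan(path: str = "./src", threshold: int = 80) -> bool:
    """Run `fossil scan` and report whether the tree is clean.

    Any exit code other than 4 is treated as needing a human look, since
    the meaning of the other codes is not covered here.
    """
    result = subprocess.run(["fossil", "scan", path, "--threshold", str(threshold)])
    return scan_is_clean(result.returncode)
```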

Current state and roadmap

Live now (v0.2.0):

fossil explain — full forensic report
fossil scan — directory scan with confidence threshold
fossil clean — prioritized deletion backlog
Python deep analysis (AST), text fallback for JS/TS/Java/Go
Pattern detection with condition verification
Local SQLite caching (cache hits < 100ms)

Coming next:

GitHub/GitLab API integration (PR title/body fetch, --yolo deletion PR creation)
LLM narration for human-readable explanations
tree-sitter for deep multi-language analysis

Install

pip install fossil-code  # Python 3.11+, requires git

GitHub: https://github.com/iamvvekverma/fossil

The confidence scoring weights are a first attempt — I'd genuinely like feedback on
whether the calibration holds for your codebase. What signal should weigh more? What
am I missing?
