AI code can look “production-ready” while implementing almost nothing.
- It passes lint.
- It follows clean architecture.
- It ships.
And yet—when you trace execution paths—there’s barely any load-bearing logic.
That’s AI slop: not broken code, just convincingly empty code.
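For a concrete picture, here’s a hypothetical example of the pattern (made up for this post, not taken from any real repo): it has a docstring, a clean interface, and no behavior.

class PaymentProcessor:
    """Production-grade payment processing with full validation."""

    def process(self, payment):
        # TODO: integrate with the actual gateway
        return True

    def validate(self, payment):
        # Always "valid": no checks are actually performed
        return True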
I built AI Slop Detector to turn that vague “this feels off” into inspectable signals.
Then I had to face the uncomfortable credibility test:
Would the tool accuse my own repo?
v2.6.1 is a trust release
This release is about auditability and stability:
- Dependency rules are no longer “magic in code”: they live in YAML and are loaded dynamically, so fewer assumptions stay hidden in the source (a quick sketch follows below).
- Test + coverage hardening: 165 tests, 85% overall coverage, and CI Gate coverage improved from 0% → 88%.
The goal: make the tool harder to regress, and easier to trust.
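To make the YAML point concrete, here’s a minimal sketch of the pattern, with a hypothetical schema (not the tool’s actual rule format): the rules are data, so changing them doesn’t mean touching detector code.

import yaml  # PyYAML

# Hypothetical rule schema, for illustration only
RULES_YAML = """
dependency_rules:
  - name: no-star-imports
    severity: high
  - name: no-mutable-defaults
    severity: medium
"""

rules = yaml.safe_load(RULES_YAML)["dependency_rules"]
for rule in rules:
    print(f"{rule['name']}: {rule['severity']}")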
The self-audit
I ran the detector on its own codebase the same way anyone would:
slop-detector --project .
Overall result: CLEAN
- Deficit Score: 19.78 (lower = better)
- Jargon Inflation: 0.11
Verbatim report note: I’m publishing the complete self-inspection report as a PDF exactly as generated — no edits, no redactions, no reformatting, and no interpretive changes.
“CLEAN” didn’t mean “perfect.”
It meant: there’s enough real implementation here that review is worthwhile.
The real experiment: I planted 3 “known-bad” files
A score alone is easy to doubt.
So I did something deliberate: I planted three intentionally bad fixtures inside the repo, so I’d know whether the detector goes off for the right reasons.
1) Empty shell
A 0-line file scored 100.00, and the report explains it plainly:
“Empty file: nothing to analyze → remove / implement / mark stub.”
Translation: sometimes code “exists” without doing anything — and that should be visible.
2) Dangerous structure (structural_issues.py)
Score 71.21, flagging patterns that rot systems quietly:
- catch-all errors
- mutable defaults
- star imports
- global state
Translation: code can run and still become hard to trust or debug later.
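For illustration, here’s a tiny hypothetical file (not the actual fixture) that packs all four patterns together; it imports and runs fine, which is exactly the problem:

from os.path import *       # star import: pollutes the namespace

CACHE = {}                   # module-level global state

def load(path, seen=[]):     # mutable default: shared across every call
    seen.append(path)
    try:
        CACHE[path] = open(path).read()
    except Exception:        # catch-all: every failure is silently swallowed
        pass
    return CACHE.get(path)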
3) Buzzword + slop cocktail (generated_slop.py)
Score 96.77, triggered by jargon like:
- “state-of-the-art”
- “transformer”
- “optimized”
…alongside classic slop patterns.
Translation: impressive language and weak substance often travel together.
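A hypothetical few lines in that spirit (invented for this post, not copied from the fixture):

def optimize(data):
    """State-of-the-art, transformer-inspired, fully optimized pipeline."""
    # "Optimized" here means: return the input untouched
    return data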
The twist
It even flagged my own wording: “production-ready” inside cli.py.
The tool didn’t just audit code.
It audited claims.
Why this is the AI-era review problem
Most tools answer yesterday’s questions:
- “Is it syntactically valid?”
- “Is it safe?”
- “Is it complex?”
AI-era workflows need a prior question:
Is there meaningful implementation here at all?
AI Slop Detector isn’t a verdict machine.
It’s a review signal that turns “looks good, but…” into measurable follow-ups.
Quick start
pip install -U ai-slop-detector
# single file
slop-detector mycode.py
# full project scan
slop-detector --project .
CI Gate modes (soft / hard / quarantine)
# Soft: report only (never fails)
slop-detector --project . --ci-mode soft --ci-report
# Hard: fail build on thresholds
slop-detector --project . --ci-mode hard --ci-report
# Quarantine: gradual enforcement
slop-detector --project . --ci-mode quarantine --ci-report
If you want one fast review question
“What does this function actually do differently from what it wraps?”
If the answer circles back to docstring promises instead of concrete behavior, it’s usually scaffolding.
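A hypothetical wrapper that fails that question: remove the docstring and nothing of its own remains.

import json

def load_config_enterprise_grade(path):
    """Robust, enterprise-grade configuration loading layer."""
    # No validation, no defaults, no error handling: adds nothing beyond json.load
    with open(path) as f:
        return json.load(f)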
If you try it
I’d love real-world cases:
- false positives you hit
- convincing emptiness it missed
- fixture ideas that represent what you see in PRs