Kwansub Yun

AI Slop Detector v2.6.1 Self-Audit: I planted 3 bad files — did it catch them?

AI code can look “production-ready” while implementing almost nothing.

  • It passes lint.
  • It follows clean architecture.
  • It ships.

And yet—when you trace execution paths—there’s barely any load-bearing logic.

That’s AI slop: not broken code, just convincingly empty code.

I built AI Slop Detector to turn that vague “this feels off” into inspectable signals.

Then I had to face the uncomfortable credibility test:

Would the tool accuse my own repo?


v2.6.1 is a trust release

This release is about auditability and stability:

  • Dependency rules are no longer “magic in code”: they’re moved into YAML and loaded dynamically, leaving fewer hidden assumptions (a sketch of the idea follows this list).
  • Test + coverage hardening: 165 tests, 85% overall coverage, and CI Gate coverage improved from 0% → 88%.
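
As a rough illustration, here is a minimal sketch of what “rules in YAML, loaded at runtime” can look like. The rule names and schema below are hypothetical, not the detector’s actual format:

# Hypothetical sketch: the real schema in ai-slop-detector may differ.
import yaml  # PyYAML

RULES_YAML = """
dependency_rules:
  - name: flag-wrapper-only-modules    # hypothetical rule name
    severity: warn
  - name: flag-unused-heavy-imports    # hypothetical rule name
    severity: error
"""

# Rules live in data, not code, so changing policy doesn't mean editing the analyzer.
rules = yaml.safe_load(RULES_YAML)["dependency_rules"]
for rule in rules:
    print(rule["name"], "->", rule["severity"])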

The goal: make the tool harder to regress, and easier to trust.


The self-audit

I ran the detector on its own codebase the same way anyone would:

slop-detector --project .

Overall result: CLEAN

  • Deficit Score: 19.78 (lower = better)
  • Jargon Inflation: 0.11

Verbatim report note: I’m publishing the complete self-inspection report as a PDF exactly as generated: no edits, no redactions, no reformatting, and no interpretive changes.

“CLEAN” didn’t mean “perfect.”
It meant: there’s enough real implementation here that review is worthwhile.


The real experiment: I planted 3 “known-bad” files

A score alone is easy to doubt.

So I did something deliberate: I planted three intentionally bad fixtures inside the repo, so I’d know whether the detector flags them for the right reasons.


1) Empty shell

A 0-line file scored 100.00, and the report explains it plainly:

“Empty file: nothing to analyze → remove / implement / mark stub.”

Translation: sometimes code “exists” without doing anything — and that should be visible.


2) Dangerous structure (structural_issues.py)

Score 71.21, flagging patterns that rot systems quietly:

  • catch-all errors
  • mutable defaults
  • star imports
  • global state

Translation: code can run and still become hard to trust or debug later.
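
For illustration only, a minimal hypothetical fixture exhibiting all four patterns (not the repo’s actual structural_issues.py) could look like this:

# Hypothetical fixture for illustration, not the actual structural_issues.py
from os.path import *          # star import: unclear what enters the namespace

CACHE = {}                     # module-level global state

def load_items(items=[]):      # mutable default: the list is shared across calls
    try:
        items.append(basename("/tmp/data.txt"))
        CACHE["last"] = items
    except Exception:          # catch-all: every error is silently swallowed
        pass
    return items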


3) Buzzword + slop cocktail (generated_slop.py)

Score 96.77, triggering jargon like:

  • “state-of-the-art”
  • “transformer”
  • “optimized”

…alongside classic slop patterns.

Translation: impressive language and weak substance often travel together.
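
A hypothetical version of this kind of file (not the actual generated_slop.py) reads roughly like:

# Hypothetical sketch, not the repo's actual generated_slop.py
class OptimizedTransformerPipeline:
    """State-of-the-art, optimized transformer pipeline."""

    def process(self, data):
        """Runs optimized, state-of-the-art transformer inference."""
        # No model, no inference: the docstrings carry all the weight.
        return data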


The twist

It even flagged my own wording: “production-ready” inside cli.py.

The tool didn’t just audit code.
It audited claims.


Why this is the AI-era review problem

Most tools answer yesterday’s questions:

  • “Is it syntactically valid?”
  • “Is it safe?”
  • “Is it complex?”

AI-era workflows need a prior question:

Is there meaningful implementation here at all?

AI Slop Detector isn’t a verdict machine.
It’s a review signal that turns “looks good, but…” into measurable follow-ups.


Quick start

pip install -U ai-slop-detector

# single file
slop-detector mycode.py

# full project scan
slop-detector --project .

CI Gate modes (soft / hard / quarantine)

# Soft: report only (never fails)
slop-detector --project . --ci-mode soft --ci-report

# Hard: fail build on thresholds
slop-detector --project . --ci-mode hard --ci-report

# Quarantine: gradual enforcement
slop-detector --project . --ci-mode quarantine --ci-report

If you want one fast review question

“What does this function actually do differently from what it wraps?”

If the answer circles back to docstring promises instead of concrete behavior, it’s usually scaffolding.
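
A made-up example of the kind of wrapper that fails that question (the function name is hypothetical):

import json

def parse_config_robustly(path):
    """Robust, enterprise-grade configuration parsing."""
    # Adds nothing beyond what json.load already does.
    with open(path) as f:
        return json.load(f)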


If you try it

I’d love real-world cases:

  • false positives you hit
  • convincing emptiness it missed
  • fixture ideas that represent what you see in PRs
