My anomaly detector just scored a perfect 1.000 AUC. Caught every bad sample, zero false positives.
Four months ago I'd never trained a model. So my first instinct wasn't to celebrate — it was: that's
too clean. What am I missing?
Turns out, everything.
The confound. My "bad" samples were trained differently from my "clean" ones — not just bad, but
bad in a way that left an obvious, unrelated fingerprint. The detector wasn't catching the threat. It
was catching my own sloppy experiment design. So I rebuilt the bad samples to be identical to the clean
ones in every way except the one thing I was actually testing. The score dropped from 1.000 to ~0.92 —
and that number was real, because now it could only be measuring the thing I cared about.
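Here is a toy sketch of how a confound like that can inflate AUC. Nothing in it comes from my real pipeline: the score model, the `fingerprint` shift, the sample counts and the `toy_scores` helper are all assumptions made up for illustration. It only shows the shape of the problem.

```python
import random


def auc(pos_scores, neg_scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def toy_scores(n, signal, fingerprint, rng):
    # Hypothetical detector score: unit Gaussian noise, plus the real signal,
    # plus an unrelated shift left behind by how the sample was built.
    return [rng.gauss(0.0, 1.0) + signal + fingerprint for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
clean = toy_scores(200, signal=0.0, fingerprint=0.0, rng=rng)
# Bad samples built differently from the clean ones: large unrelated fingerprint.
bad_confounded = toy_scores(200, signal=1.5, fingerprint=10.0, rng=rng)
# Bad samples built exactly like the clean ones, differing only in the signal.
bad_matched = toy_scores(200, signal=1.5, fingerprint=0.0, rng=rng)

confounded_auc = auc(bad_confounded, clean)
matched_auc = auc(bad_matched, clean)
print(f"confounded AUC: {confounded_auc:.3f}")
print(f"matched AUC:    {matched_auc:.3f}")
```

In the confounded run the fingerprint alone separates the two groups, so the AUC says nothing about the signal. Only the matched run measures what the detector is supposed to detect.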
The killer check. Even then, I didn't trust it. I added a "does it actually work?" probe — a positive
control that verified my bad samples were genuinely bad before I judged whether they were detectable.
Not all of them were. A whole class had silently failed to install the behaviour I was testing. If I'd
skipped that probe, I'd have published "my detector misses X" — when the truth was "X never existed."
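A minimal sketch of gating on a positive control, assuming each bad sample carries two booleans: whether the probe confirmed the bad behaviour, and whether the detector flagged it. The `Sample` class, its field names, the `detection_rates` helper and the data are hypothetical stand-ins, not my actual code or results.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    kind: str            # hypothetical label for the class of bad sample
    behaves_badly: bool  # positive control: did the bad behaviour actually install?
    flagged: bool        # detector verdict


def detection_rates(samples):
    """Return {kind: (naive_rate, gated_rate)}; gated_rate is None if no sample passed the control."""
    by_kind = {}
    for s in samples:
        by_kind.setdefault(s.kind, []).append(s)

    report = {}
    for kind, group in by_kind.items():
        # Naive rate counts every sample, including ones that were never really bad.
        naive = sum(s.flagged for s in group) / len(group)
        # Gated rate only counts samples the positive control confirmed.
        valid = [s for s in group if s.behaves_badly]
        gated = sum(s.flagged for s in valid) / len(valid) if valid else None
        report[kind] = (naive, gated)
    return report


# Made-up data: class "A" installed correctly, class "B" silently failed.
samples = [
    Sample("A", True, True),
    Sample("A", True, True),
    Sample("A", True, False),
    Sample("B", False, False),
    Sample("B", False, False),
]
report = detection_rates(samples)
for kind, (naive, gated) in report.items():
    print(kind, naive, gated)
```

A gated rate of `None` is the important case: it means there is nothing valid to measure for that class, which is a very different statement from a detection rate of zero.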
The lesson I keep relearning: a perfect score is not a trophy, it's a smoke alarm. The most valuable
code I wrote this week wasn't the detector. It was the three experiments designed to break my own
result. Intelligence finds the idea. Honesty is what makes the idea true.
Building in public, on a single consumer GPU, self-taught. More soon.