This story was shared by a fellow developer on DEV who asked to remain anonymous. If you've got a story to tell — come find me. Your name won't ap...
the CI/CD gatekeeper scenario without the conspiracy is just as frustrating - overfitted model, every auth fix blocked because auth modules touched historical vulns. no villain, just weeks of exception tickets.
You're right. The version without a villain is actually harder to deal with — at least with a villain you know who to yell at. An overfitted model blocking your auth fixes, and there's no one to complain to. Just a queue of exception tickets. And the worst part? You never know whether the next ticket will go through or get blocked. Its decision logic might as well be a coin flip.
yeah and the coin flip breaks planning more than the actual blocks do. people start batching auth changes just in case, the backlog fills with workarounds. it's not the blocked tickets that kill you, it's the process debt that forms around the unpredictability.
"Process debt" — that's the phrase I was missing. The blocks are surface-level. What actually compounds is the organizational scar tissue: people pre-batching, padding timelines, building workarounds for a system that doesn't know it's unpredictable. You stop planning around what's right and start planning around what might get through.
process debt is the right frame but it's actually worse: the debt compounds because the workarounds outlive the original unpredictability. team rewires around the bad behavior, system gets patched, but the pre-batching habits and padded timelines stay. you end up carrying the tax without the reason for it.
"Carrying the tax without the reason" — that's a brutal way to frame it. What makes it worse is eventually nobody even remembers where the habits came from. A new joiner inherits a "batch auth changes" rule from a wiki page written by someone who left two years ago. Ask them why — "that's just how we do it here." The original unpredictability is long gone, but the process lives on as culture. At that point, the debt has become identity.
Really appreciate you diggin
the "fix and exploit look identical to a model" framing hits the same wall we run into with MCP tool authorization.
call_payment_service(user_id=X) looks identical whether it's a legitimate agent step or prompt injection from upstream; the model sees code pattern, not intent.
the separate monitor was the right call. we ended up with an audit stream for every MCP tool call, keyed by request ID, completely outside the approval gate. gate can fail either way; the log does not care.
worst part: the gate was also the only escalation path. when the system that blocks you is also the appeal mechanism, there's nowhere to stand.
how are you handling where a human can actually override without going through the system that said no?
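A minimal sketch of the audit-stream idea from the comment above, assuming a simple in-process wrapper. The names `audited_call` and `AUDIT_LOG` and the `gate_verdict` strings are hypothetical, and nothing here uses a real MCP library. The point it illustrates is only that the record is written under its own request ID whether the gate allowed or denied the call.

```python
import json
import time
import uuid
from typing import Any, Callable, Dict, List

# Hypothetical in-memory sink; a real deployment would use an append-only store.
AUDIT_LOG: List[str] = []


def audited_call(tool_name: str, tool: Callable[..., Any],
                 gate_verdict: str, **kwargs: Any) -> Dict[str, Any]:
    """Record a tool call under a fresh request ID, whatever the gate decided."""
    request_id = str(uuid.uuid4())
    record = {
        "request_id": request_id,
        "tool": tool_name,
        "args": kwargs,
        "gate_verdict": gate_verdict,  # logged, never used to skip logging
        "ts": time.time(),
    }
    # Write the record before acting, so a crash in the tool still leaves a trace.
    AUDIT_LOG.append(json.dumps(record, sort_keys=True))
    result = tool(**kwargs) if gate_verdict == "allow" else None
    return {"request_id": request_id, "result": result}


def call_payment_service(user_id: str) -> str:
    # Stand-in for the tool named in the comment; it does not charge anything.
    return f"charged:{user_id}"


allowed = audited_call("call_payment_service", call_payment_service, "allow", user_id="X")
blocked = audited_call("call_payment_service", call_payment_service, "deny", user_id="X")
```

Both calls leave a record, so a reviewer can later look up a blocked request by its ID without asking the gate that blocked it.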
This is exactly why if you trust AI, you should trust a human more. The Sentinel was right, but it was too stupid to identify why it was right. Technical debt is where AI fails every time, because what looks right might have been wrong from the start.
"too stupid to identify why it was right" — that's the sharpest take on VoidSentinel I've read. It flagged the attack. Then cleared it three minutes later. Same model, same path, two different verdicts.
The "trust AI → trust human" part I'm less sure about. Mark was the human in this story, and he was the one using the AI as a shield. I think the scarier combination is: an AI that can't explain itself, plus a human who won't question it.
As for technical debt — VoidSentinel didn't fail because of code rot. It failed because its world only has one dimension: who's changing what. The real fight happens in another dimension: why are they changing it. That dimension doesn't exist in its model. Billions of training tokens can't buy you a plane you don't know is missing.
Exactly. But the problem is Mark trusted the AI over a human expert saying "This was a fix, not a flaw". Sure, trust the AI to flag suspicious code, but if you ask the implementer and they say it wasn't an accident, it was a security risk being patched, then at the very least it warrants looking into. Two portals sharing one API key from years ago is the worst kind of code rot, namely permissions code rot, because back in the day attacks weren't sophisticated enough to find the loophole. Half the reason I built V.A.L.I.D. is as an easy way to upgrade legacy systems to a secure standard (that happens to be AI native and much faster). You can train a model, but a model can't think outside its parameters, especially if it's a smaller model (sub 1T) with gaps in its knowledge (the missing dimension). When that's the case, always trust the human enough to double check it.
That's the real punchline — the AI got to be wrong twice (flag it, then clear the flag), but the human only got to be right once. The fix was intentional. The model couldn't tell the difference between "looks suspicious at first glance" and "prove it's wrong with evidence." So when both are in the loop and the system gives the model the final say, the implementer's context dies with their keystrokes.
V.A.L.I.D. sounds like exactly the kind of tool that shouldn't need to exist in a healthy engineering culture — but absolutely does in this landscape.
Exactly. If it's the AI vs the Expert, the back and forth should stop when the Expert provides proof. Not get shot down by the AI for 'it's not my policy'.
V.A.L.I.D. is an interesting one. I designed it because I hated CSLA's code bloat, so I decided to write a framework that fixes Blazor (also React compatible). It generates 85% of the code for you: you create your DTO and mark items as ValidObjects, with constraints and rules. Then you add a simple markup of your UI and it fills in the blanks, including unit tests and an MCP to make whatever you build webMCP compliant or, how I use it, as a great way to add an actually useful AI chatbot. It's written in F# and uses Roslyn for the generating. You can write the backend in C# or F# and it'll convert it to the most efficient one for the task; you can write the UI in Blazor or JS and it'll wire it up properly. It flags errors at compile time, not run time. It also has a fuzzer, so if you don't want to use the simple-scripting UI tests, you can just run the fuzzer and it'll blast a BO's objects and visually show you the affected properties. It ditches the WASM VDOM for unmanaged slabs and uses a 128-bit mask for the mutations. The result is that Blazor hits around 600 mutations per second while V.A.L.I.D. hits 3600+, and it does that while logging state, so if a user hits an error you can replay their exact state on your end, with a timeline. It natively supports PROPER undo, not that stupid implementation of CSLA. So it does a bit more than provide a skills file and MCP so a model can write it faster without breaking your old code 😅
If you're into writing .NET, give it a try. I'm currently using it for my automated accounting suite, which is sitting at around 600k LOC, but I only had to write less than 1/5th of it by hand.
The scariest part isn't the AI – it's the person who hides behind it
This story gave me chills. Not because of the $4.2M loss (though that's painful), but because of how perfectly it captures the real danger of AI gatekeeping: when someone uses "the AI said no" as an unassailable shield for their own ego or agenda.
The technical lesson is clear – VoidSentinel couldn't distinguish a fix from an exploit because that's not a solvable problem for a model trained purely on code patterns. Intent isn't in the diff. But the human lesson is even bigger.
Mark didn't block the PR because the AI was right. He blocked it because admitting the AI was wrong would mean admitting he was wrong – about the tool he sold to the board, about the PIP he issued, about his own judgment. The AI wasn't the decision‑maker. It was the excuse.
What strikes me is that Alex didn't need to build a better AI. He built a monitor – a separate, independent observer that didn't try to judge fixes vs attacks, just logged what happened. That's often more valuable than another "smart" system.
Two quick takeaways from this (for anyone building or buying AI security tools):
Always have a human override with teeth. If a developer with domain expertise says "this is a fix, not an attack," there needs to be an escalation path that doesn't end at the same VP who bought the system.
Independence in monitoring is non‑negotiable. Alex's monitor ran outside VoidSentinel. It didn't share the same blind spots. In safety‑critical systems, that's called redundancy. In AI governance, it's called not letting the fox guard the henhouse.
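One way to picture the first takeaway is a routing rule for override requests that can never land on the gate's owner. This is only a sketch under stated assumptions: the `Reviewer` type, the `owns_gate` flag, and the roster are hypothetical, with Mark and Alex borrowed from the story purely as labels and "Priya" invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Reviewer:
    name: str
    owns_gate: bool  # True for whoever bought or operates the gate


def pick_override_reviewer(author: str, reviewers: List[Reviewer]) -> Optional[Reviewer]:
    """Return the first reviewer who is neither the author nor a gate owner."""
    for reviewer in reviewers:
        if reviewer.name != author and not reviewer.owns_gate:
            return reviewer
    # No independent reviewer exists: report that, rather than
    # quietly falling back to the gate owner.
    return None


roster = [
    Reviewer("Mark", owns_gate=True),
    Reviewer("Alex", owns_gate=False),
    Reviewer("Priya", owns_gate=False),  # hypothetical name
]
chosen = pick_override_reviewer("Alex", roster)
```

Returning `None` when nobody independent is available is a deliberate choice in this sketch: the missing escalation path becomes visible instead of defaulting to the person who said no.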
Thanks for sharing this (and to the anonymous engineer who lived it). It's a reminder that the most dangerous line in any meeting is: "The system said no."
Cheers,
Jack
DEV.to/ggle.in
That line — "the AI wasn't the decision-maker, it was the excuse" — cuts to the deepest layer of the whole story.
But there's a quieter damage I keep coming back to. Once "the system said no" becomes a conversation ender, it doesn't just protect someone's ego. It kills the organization's ability to learn. Mark doesn't have to explain why he rejected the PR — "the gate didn't pass." Alex doesn't have to ask why it didn't pass, because he already knows the answer. Next time, the person after him won't ask "is this gate even right?" — they'll just route around it, or stop fixing things altogether.
The gate didn't just block that one fix. It blocked the conversation about why that fix was right in the first place. And a conversation that never happens is a loss no system can compensate for.
This hits close to home. When integrating AI APIs into automated security workflows, I have seen confident false negatives that would have been caught by a human reviewer. A simple secondary validation step using a different model or rule-based check has saved me from similar disasters. The key is never trusting a single AI call for critical decisions.
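A rough sketch of the secondary validation step described above, assuming two independent checks whose disagreement is sent to a human. The function names, the `auth/` path prefix, and the verdict strings are all hypothetical, and `confident_model` is a stand-in that does not call any real model API.

```python
from typing import Callable

Check = Callable[[str], bool]  # returns True when the diff looks risky


def touches_auth_path(diff: str) -> bool:
    # Rule-based check: hypothetical path prefix, not taken from the story.
    return any(line.startswith("+++ b/auth/") for line in diff.splitlines())


def decide(diff: str, primary: Check, secondary: Check) -> str:
    """Combine two independent checks; disagreement goes to a human."""
    first, second = primary(diff), secondary(diff)
    if first and second:
        return "block"
    if not first and not second:
        return "allow"
    return "human_review"


def confident_model(diff: str) -> bool:
    # Stand-in for a model call that returns a confident false negative.
    return False


sample_diff = "--- a/auth/keys.py\n+++ b/auth/keys.py\n+rotate_shared_key()\n"
verdict = decide(sample_diff, confident_model, touches_auth_path)
```

Here the model clears the change while the rule flags it, so the verdict is `human_review` rather than a silent pass. In the spirit of the thread, even the `block` outcome would still need an appeal path that does not run through either check.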
Nail on the head. The ironic part? VoidSentinel wasn't even wrong — it did detect someone modifying an auth path. It just couldn't tell if you're fixing a hole or punching a new one 😂 Same input, same pattern, opposite intent — system had no idea.
That line "never trust a single AI call for critical decisions" — I'm framing it
Btw, ever had the secondary validation also fall for it? First model was so confident it dragged the second one down the wrong path too? 🤣
AI security gates are useful, but “verdict is final” is where things break. A fix and an exploit can look almost identical to a model. Without human review, ownership context, and audit trails, you are not reducing risk. You are just automating blind spots.
That 'fix vs exploit look identical to a model' line cuts right to what I was trying to show. The team that built the gate assumed the model could tell the difference. It couldn't. And without the audit trail to prove the fix was legitimate, there was no way to overrule the gate — because the process to overrule it had been automated too. The gate didn't just block the fix. It blocked the conversation about whether the fix was right in the first place.
Security by AI alone is a costly bet; if it says no and the breach costs millions, bring humans back in the loop before the next incident.
Exactly. "Bring humans back in the loop before the next incident" — that line hits. The scary part is most orgs wait for the incident to happen before they do it.