Dariusz Newecki

My Audit Caught My Audit Being Wrong

And that's exactly what it's supposed to do.


A few days ago I ran a diagnostic on CORE — the governance system I'm building that supervises AI-generated code. The diagnostic was supposed to investigate why a specific audit rule appeared to be silently failing. Not firing. Producing zero findings against files it should have flagged.

I ran the investigation carefully. Stage by stage. I came to a conclusion.

The conclusion was wrong.

And I only found that out because the system itself told me so.


What I thought was happening

CORE has an audit rule called autonomy.tracing.mandatory. It checks that any class ending in Agent contains a mandatory call to self.tracer.record. The logic is straightforward: if an autonomous agent produces work, that work must be traceable. No tracing call — the rule flags it.
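The rule's logic can be sketched in a few lines with Python's ast module. This is my illustration of the idea, not CORE's actual engine; the sample class and the check function are invented for the example:

```python
import ast

SOURCE = '''
class SelfHealingAgent:
    def run(self):
        return "patched"   # no self.tracer.record anywhere
'''

def missing_mandatory_calls(tree: ast.AST) -> list[str]:
    """Flag any *Agent class that never calls self.tracer.record."""
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name.endswith("Agent"):
            calls = {
                ast.unparse(c.func)
                for c in ast.walk(node)
                if isinstance(c, ast.Call)
            }
            if "self.tracer.record" not in calls:
                findings.append(
                    f"{node.name}: missing mandatory call(s): ['self.tracer.record']"
                )
    return findings

print(missing_mandatory_calls(ast.parse(SOURCE)))
```

Select by shape (class name ends in Agent), then validate a requirement (a specific call must appear somewhere inside the class body). Deterministic, no model in the loop.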

My notes said the rule was producing zero findings against SelfHealingAgent — a class with, in fact, zero tracer references. A rule designed to catch exactly that situation, catching nothing.

That's a governance gap. If a rule exists and silently fails, you don't have an audit system. You have a theatrical one.

So I investigated.


What I actually found

The rule was firing. Correctly. Both findings were present, cleanly, in reports/audit_findings.json:

```json
{
  "check_id": "autonomy.tracing.mandatory",
  "severity": "warning",
  "message": "Line 51: missing mandatory call(s): ['self.tracer.record']",
  "file_path": "src/will/agents/self_healing_agent.py"
}
```

The system wasn't broken. The diagnostic's starting assumption was broken.

Here's where it came from. CORE's audit output is rendered through Rich — a Python library that produces beautiful terminal tables with color, alignment, and spacing. Rich also truncates long strings to fit columns. So autonomy.tracing.mandatory becomes autonomy.tracing.mandat… on screen.

When I ran grep 'tracing.mandatory' against the captured terminal output to verify the finding, I got zero matches. Not because the finding wasn't there — because Rich had silently eaten the last four characters of the rule name, and my grep pattern was looking for the full string.
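The failure mode is easy to reproduce without Rich at all. Here is a minimal simulation of column truncation; the fit helper is my stand-in for what a table renderer does to an over-wide cell, not Rich's API:

```python
def fit(cell: str, width: int) -> str:
    """Truncate a cell to a column width, ellipsis-style (simulated, not Rich)."""
    return cell if len(cell) <= width else cell[: width - 1] + "…"

check_id = "autonomy.tracing.mandatory"
rendered = fit(check_id, 24)

print(rendered)                         # autonomy.tracing.mandat…
print("tracing.mandatory" in rendered)  # False: grep on the screen misses it
print("tracing.mandatory" in check_id)  # True: the real value always matched
```

The pattern never had a chance. The string it was searching for stopped existing the moment the renderer fit it to a column.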

I used display output as an oracle. Display output lied.

The JSON source of truth never did.


The stage-by-stage result

I re-ran the diagnostic properly, going to primary sources instead of rendered output:

| Stage | Status |
| --- | --- |
| Rule loaded and mapped | PASS — rule extracted, bound to ast_gate engine |
| Scope resolution | PASS — self_healing_agent.py in scope |
| Engine dispatch | PASS — engine ran against the file |
| Auto-ignore | PASS — zero suppressions, nothing dropped silently |
| Finding emitted | PASS — present in audit_findings.json |

Every stage passed. The investigation had no failure to explain, because there was no failure. It was investigating a ghost.

Direct engine invocation confirmed it independently:

```python
# Standalone check — no orchestrator involved
import ast

# tree is the parsed target file; selector and requirement come from the loaded rule
for node in ast.walk(tree):
    if GenericASTChecks.is_selected(node, selector):
        err = GenericASTChecks.validate_requirement(node, requirement)
        print(type(node).__name__, getattr(node, "name", "?"), "->", err)

# Output:
# ClassDef SelfHealingAgent -> missing mandatory call(s): ['self.tracer.record']
```

Same verdict. No ambiguity.


Why this matters more than "I made a mistake"

I'm building a system where AI generates code and a deterministic governance layer audits it. The entire value proposition is that the governance layer is trustworthy. Not smart — trustworthy. You need to be able to look at a finding and know it reflects reality. You need to be able to look at a clean audit and know the system actually checked.

That's called instrument qualification. In regulated industries — pharmaceuticals, medical devices, aerospace — you don't just validate the product. You validate the instruments you used to measure the product. A thermometer that reads 37°C when the actual temperature is 39°C isn't a minor inconvenience. It's a systematic lie that compounds silently across every reading it ever produces.

I accidentally demonstrated the same principle in software.

When I used grep against Rich-rendered terminal output, I was reading from an instrument I hadn't qualified. Rich is a display library. It's not a data source. It's designed to make things readable to humans, not parseable by machines. Using it as a source of truth for a diagnostic is about as reliable as taking a patient's temperature with a ruler.

The JSON report is the qualified instrument. It's the canonical output. It doesn't truncate. It doesn't wrap. It doesn't abbreviate for column fit. It says what the system found.
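Reading from the qualified instrument takes one small helper. A hedged sketch, assuming the report is a JSON array of finding objects shaped like the one above; the function name and structure are mine, not CORE's API:

```python
import json
from pathlib import Path

def findings_for(report_path: str, check_id: str) -> list[dict]:
    """Return every finding with the given check_id from the canonical report."""
    findings = json.loads(Path(report_path).read_text())
    return [f for f in findings if f["check_id"] == check_id]

# Usage (path from the article):
# for f in findings_for("reports/audit_findings.json", "autonomy.tracing.mandatory"):
#     print(f["file_path"], "-", f["message"])
```

An exact match against structured data, instead of a substring match against whatever survived the renderer.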

A failing audit with real findings is more honest than a passing audit that never really looked. An instrument that gives you clean-looking output that misrepresents reality isn't helping you — it's flattering you.


What I changed

Two things.

One: I added the stale references explicitly to the diagnostic record. My notes had two wrong module paths that would have caused anyone running the diagnostic in the future to hit ImportError immediately. AuditorContext is not in mind.logic.engines.ast_gate.base — it's in mind.governance.audit_context. I documented both as stale references, with the correct paths. Constitutional debt is honest debt. Hiding it helps no one.

Two: I documented the grep-against-Rich anti-pattern. Not as a personal failure, but as a category. If I did it, someone else will do it, or I'll do it again in six months under pressure. The pattern needs a name so it can be recognized.


The uncomfortable version

Here's the uncomfortable version of this story: I almost propagated the wrong conclusion.

If I'd stopped at "zero grep matches, rule is not firing," I would have written a finding that said the governance system had a blind spot. I might have gone looking for a fix in the wrong place. I might have introduced a workaround that solved a problem that didn't exist, while leaving a different problem — the unreliable diagnostic method — completely intact.

In a system that supervises autonomous AI code generation, a wrong finding about your audit rules is worse than a missing finding. A missing finding is a gap. A wrong finding is a confidence injection. You become more certain the system is broken in a specific way, and that certainty guides you away from the actual state.

That's the failure mode I'm most worried about in AI-supervised systems generally. Not that the AI is wrong — everyone accepts the AI might be wrong. The failure mode is when the verification layer produces plausible-looking output that you stop checking.

CORE is built on the assumption that every layer lies until verified. Including the diagnostic layer. Including me.

I'm not a programmer. I'm closer to a lawmaker than a coder. I built a governance system because I understand governance better than I understand AST traversal. Swimming against a current you can't even see clearly is exactly the situation where you need your instruments to be honest. Flattery is the thing that drowns you.

The system didn't flatter me. That's not a bug. That's the only thing I actually need it to do.


CORE is an open-source, deterministic governance runtime for AI-generated code. You can find it at github.com/DariuszNewecki/CORE.
