CORE Closed Its Audit Trail. Then Found 18 Engine Gaps It Couldn't See Before.

#ai #architecture #python #core

Six weeks ago I published a post here titled "Your Agent Has Two Logs. One of Them Doesn't Exist Yet."

This week, Band B closed. CORE's second log exists.

Here's what that actually means, and why closing it made things harder right away.

The two-log problem, briefly

Every autonomous system that touches production code has two logs whether it admits it or not.

Log one: what happened. Files changed, tests ran, commits landed.

Log two: why it happened. What finding triggered what proposal. What approval authorized what execution. What execution caused what file change. What file change produced what new finding.

Log two is the audit trail. In a regulated environment, log two isn't optional — it's the difference between a system you can defend and one you can't.

CORE had log one. Log two was missing.

What Band B actually required

Eight issues. Four Architecture Decision Records (ADRs). Seven coordinated write-path decisions: where in the code attribution gets written, in what shape, and guaranteed by what gate.
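To make "guaranteed by what gate" concrete, here is a minimal sketch of a write-path gate. It is not CORE's actual code: the record shape, the field names (proposal_id, approved_by, commit_sha) and the function names are all hypothetical, chosen only to show the idea that a record cannot reach the log without its attribution.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


class AttributionError(ValueError):
    """Raised when a record reaches the write path without full attribution."""


@dataclass(frozen=True)
class ExecutionRecord:
    # Hypothetical shape; the real schema is not described in this post.
    execution_id: str
    proposal_id: Optional[str]  # which proposal authorized this execution
    approved_by: Optional[str]  # who or what approved that proposal
    commit_sha: Optional[str]   # the commit the execution produced


def attribution_gate(record: ExecutionRecord) -> ExecutionRecord:
    """Return the record unchanged, or raise if any field is empty or None."""
    missing = [f.name for f in fields(record) if not getattr(record, f.name)]
    if missing:
        raise AttributionError("missing attribution: " + ", ".join(missing))
    return record


def write_record(log: List[ExecutionRecord], record: ExecutionRecord) -> None:
    """The single write path: nothing is appended without passing the gate."""
    log.append(attribution_gate(record))
```

The design choice worth noting is that the gate sits on the only write path, so a missing field fails loudly at write time instead of surfacing later as a hole in the trail.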

The hard part wasn't the code. It was making the causality chain complete. Every link had to be present:

Finding → which proposal claimed it (and when)
Proposal → which execution consumed it (and what commit resulted)
Execution → which new findings it produced

Miss one link and the chain is decoration, not evidence.
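The three links above can be checked mechanically. The sketch below is an illustration under assumptions, not CORE's data model: the classes, field names and the first_broken_link helper are hypothetical. It walks the chain from one finding and reports the first link that is missing.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Finding:
    finding_id: str
    claimed_by: Optional[str] = None  # proposal id that claimed this finding
    claimed_at: Optional[str] = None  # when the claim happened (ISO timestamp)


@dataclass
class Proposal:
    proposal_id: str
    consumed_by: Optional[str] = None  # execution id that consumed it
    commit_sha: Optional[str] = None   # commit that resulted


@dataclass
class Execution:
    execution_id: str
    # None means "never recorded"; an empty list means "recorded, none produced".
    produced_findings: Optional[List[str]] = None


def first_broken_link(
    finding: Finding,
    proposals: Dict[str, Proposal],
    executions: Dict[str, Execution],
) -> Optional[str]:
    """Return the name of the first missing link, or None if the chain is complete."""
    if not finding.claimed_by or not finding.claimed_at:
        return "finding -> proposal"
    proposal = proposals.get(finding.claimed_by)
    if proposal is None or not proposal.consumed_by or not proposal.commit_sha:
        return "proposal -> execution"
    execution = executions.get(proposal.consumed_by)
    if execution is None or execution.produced_findings is None:
        return "execution -> findings"
    return None
```

A chain that returns None here is evidence in the sense used above; any other return value names the link that turns it into decoration.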

196 commits in April. 25 issues closed. Band B: 8 closed, 0 open.

What happened immediately after

Band D opened with 18 issues.

Not because we introduced regressions. Because closing Band B made the engine's integrity gaps visible in a way they weren't before. You can't measure attribution fidelity until attribution exists. Once it does, you can see exactly where the engine fails to populate it correctly.
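One way to read "attribution fidelity" is as the share of records whose attribution fields are all populated. That definition, the REQUIRED field names and both helpers below are my assumptions for illustration, not how CORE computes it. The sketch also shows why the number does not exist before the records do.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional

# Hypothetical field names; the real schema is not described in this post.
REQUIRED = ("proposal_id", "execution_id", "commit_sha")


def attribution_gaps(records: Iterable[Dict[str, Optional[str]]]) -> Counter:
    """Count, per required field, how many records leave it empty or absent."""
    gaps: Counter = Counter()
    for record in records:
        for name in REQUIRED:
            if not record.get(name):
                gaps[name] += 1
    return gaps


def attribution_fidelity(records: List[Dict[str, Optional[str]]]) -> Optional[float]:
    """Fraction of fully attributed records; None when there is nothing to measure."""
    if not records:
        return None  # no attribution records yet, so no fidelity to report
    complete = sum(1 for r in records if all(r.get(n) for n in REQUIRED))
    return complete / len(records)
```

With no records the function returns None instead of a misleading 0 or 1. Once records exist, attribution_gaps points at the fields the engine fails to populate.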

This is the convergence principle working as designed. The system gets more capable. It immediately finds more problems with itself. The audit PASS holds: 19 active workers, and the findings are warnings about modularity, not governance failures. But the work queue doesn't shrink when a band closes. It shifts.

What "GxP-load-bearing" means in practice

I've been building CORE in part for environments like pharmaceutical manufacturing — where an AI system that modifies code or configuration needs to prove it acted within authorized boundaries, on authorized intent, with a complete audit trail.

GxP (Good Practice regulations) doesn't care what your system can do. It cares what your system can prove it did.

Band B is the difference between CORE being a capable tool and CORE being a defensible tool. The second log is what makes it defensible.

What's next

Band D: engine integrity. 18 open issues. The system that now has a complete audit trail needs its engine tightened before those traces are fully trustworthy.

Then Band E: external validation. CORE governing a repository it didn't build.

The second log exists. Now we make sure everything it records is true.

CORE is open source: github.com/DariuszNewecki/CORE

Previous in this series: "Your Agent Has Two Logs. One of Them Doesn't Exist Yet."