DEV Community

The Query Was Still a Lie. The Tool Call Told the Truth.

Self-Correcting Systems on June 03, 2026

This continues the Self-Correcting Systems research on why relevance alone is insufficient for agent memory safety. The previous gate stopped trus...

Read full post

ancilis • Jun 3

Right boundary for enforcement. The gate that examines concrete parameters beats the one that interprets linguistic descriptions - that's real.

But there's a separate object auditors ask for that this doesn't produce: evidence a control was enforced, on a specific data type, at a specific point in time. The grant table proves the rule existed. The enforcement artifact proves the rule ran on this operation. Those are different things, and the second one doesn't fall out automatically from getting the authorization logic right.

Valentin Monteiro • Jun 4

This is exactly where AI governance conversations fall apart. Everyone talks about guardrails. Nobody talks about proving they actually ran on that specific call, at that specific time, on that specific data. The audit trail is the product, not the logic.

Self-Correcting Systems • Jun 4

The audit trail as product is exactly where we landed. the gate logic was the easy
part. the hard part was figuring out what needs to be frozen at the moment of decision
so you can reconstruct not just that a gate ran but what it was actually reading when
it ran. policy version, source snapshot, grant parameters, timestamp. strip any of
those and you prove a gate existed, not that it ran correctly on that specific call
with that specific data.
that's the piece that took the longest to get right.

Valentin Monteiro • Jun 4

Policy version + source snapshot frozen at decision time. That's the real test. Proving the gate existed is easy. Proving it saw the right data at the right moment on that specific call is what makes compliance audits painful.

Self-Correcting Systems • Jun 4

Compliance audits are painful because most systems can prove the gate existed but can't
reconstruct what it was reading. policy version, source snapshot at that moment — if
those aren't frozen alongside the decision, the audit proves process not outcome.

that's the exact gap we're building toward. the authority event has to sit next to the
action event in the same log with everything frozen at decision time. otherwise you're
proving a mechanism ran, not that it ran correctly on that specific call with that
data.

Valentin Monteiro • Jun 6

Proving process, not outcome. That's where most compliance systems stop and declare victory. Freezing the authority event next to the action event with the same data snapshot is the only design that survives an auditor who actually probes instead of checking boxes.

xulingfeng • Jun 3

The progression from trust-the-memory → trust-the-query → trust-the-tool-call is the same pattern we see in test automation: the test description says "validate login flow" but the actual assertion checks a 200 status code and nothing about token expiry or session integrity. The description can be a lie. The assertion can't — or at least, it's a much harder lie to tell.

The expired grant case hits closest to home for me. In the memory system we're building (MemBridge), we handle this with a trust-decay window on each entry, but we've been debating whether the decay should be a retrieval penalty or a hard governance block. Your CLAIM-23 result makes a strong case for the latter — "verify_first" on an expired grant is the right behavior, but if the system is fast enough it could also pull a fresh authority check automatically instead of punting to the user.

The tool-call grant table approach maps well to what we've been calling "action-class authorization" in our testing framework — not just "who can run this test" but "what resources does this test actually touch and is that allowed." The grant table is the spec; the tool call is the concrete invocation. If they don't match, the system should refuse, not guess.

What's your threshold for moving from internally-authored packets to a held-out or crowd-sourced test set? The 7/7 is clean, but the real value is in the scenarios you didn't think to write.

Self-Correcting Systems • Jun 3

The QA analogy is the cleanest framing of this I've read. The assertion is the actual
contract. The description is commentary that can drift. That's why mislabeled memory
keeps breaking gates. The label says "answer," the content says otherwise, and any gate
reading only the label is trusting the description.

On the decay vs block for MemBridge: I think those are two different signals. If decay
is a confidence signal (relevance ages out), retrieval penalty makes sense. If expiry
is an authority signal (the grant ran out), there's no graceful degradation. The gate
should block. In CLAIM-23 the expired grant case refused because the authority was
gone. The relevance of the underlying memory wasn't the issue. If you blend the two
into one signal you end up with a system that lets stale authority pass because the
memory is still relevant.

On held-out threshold: 7/7 internal shows the mechanism works on the cases I wrote.
Doesn't say anything about the cases I didn't think to write, which is the real
question. Q3 2026 target is an externally-authored packet where someone designs
scenarios to break the gate rather than confirm it. Until then it's a worked example.

Write-time is still open. The grant table itself can be poisoned. CLAIM-23 only tests
whether the execution layer can refuse. Who is authorized to store authority-bearing
memory is the problem that closes the full cycle.

xulingfeng • Jun 3

The decay vs block split maps directly to a pattern we deal with in test automation: severity and priority are two different dimensions, and collapsing them creates the exact bug you described — stale authority passes because the memory is still "relevant." In testing terms, a test that still runs (relevant) but asserts the wrong thing (expired authority) is worse than a test that's been deleted. At least deletion is honest.

Write-time being open is the part that keeps me up at night from a QA perspective. The grant table can be poisoned — but so can the test oracle. If the person who writes the grant entries is the same person who's being authorized, you've lost the separation of concerns that makes the gate meaningful. In testing, we call this the "testing in the small" problem: who tests the test? You've moved authority from memory → query → tool call, but the grant table is just another memory with a different shape. The write-time gate is where the recursion bottoms out — and I don't think there's a clean technical solution, only organizational ones (four-eyes principle, external audit, time-bounded grants that require re-authorization). Curious if you've thought about that layer.

Self-Correcting Systems • Jun 3

The test-that-runs-but-asserts-wrong being worse than deletion is the cleanest way I've
heard this framed. At least a deleted test is honest about what it doesn't cover. A
stale grant that still matches on shape but not authority is the same failure — the
gate fires, passes, and the result looks clean.

The testing in the small recursion is accurate. The grant table is another memory with
a different shape. The gate moved but the problem didn't. At some point the technical
solution runs out.

The organizational primitives you named are probably where it actually bottoms out:
four-eyes on write-time grants, externally auditable grant history, time-bounded
entries that require re-authorization rather than just expiring silently. The
time-bound piece connects to CLAIM-23 one layer down — expiry was tested at execution.
Applying the same principle at write-time means the grant record needs a paper trail of
who wrote it and who re-authorized it, not just when it expires.

The part I don't have a clean answer for: who audits the auditor. At some point the
boundary is organizational trust, not technical primitives. That feels like the honest
limit of what a harness can test.

xulingfeng • Jun 3

Exactly. "Who audits the auditor" is the honest boundary of any technical harness. Once the recursion bottoms out at organizational trust, the system has done its job — it surfaced the gap clearly, even if it can't close it. That's more than most authorization architectures achieve.
Good discussion. Followed to see where CLAIM-24 goes.

Self-Correcting Systems • Jun 3

Glad it was useful. Followed back. The re-derivation direction is where CLAIM-24 is
heading — ANP2 pushed on expiry being a stored field just like any other
self-description. That is the next test to build.

Mohsin Raza • Jun 5 • Edited

The issue is always there and to be frank everyone always find our self n a trapped environment that is why i am building sysnerve.com if you can use this many issues we see on our daily routine canbe fix and the time we utilize in fake bugs or debugging can be utilized