Every verification layer built around an AI agent tends to grab the same kind of handle: an artifact.
A log entry. A reviewer signature. A tool result. A structured output block. A reconciler compares one artifact against another and decides whether the system is still inside its rails.
That works for failures that leave residue.
A forged tool result can be rejected. A mismatched call ID can be flagged. A malformed JSON block can be quarantined. A signature over the wrong payload can fail verification. These are all comfortable failures because they produce something the system can inspect.
The nastier class ships no artifact at all.
Omission is hard because an append-only log renders several states as the same visible thing: it did not happen, it has not happened yet, and it happened but was never recorded. All three appear as absence. The log contains nothing. The audit query returns nothing. The detector has no string to match, no ID to compare, no block to reject.
Absence is ambiguous by default.
Silence ages badly
Start with an attestation ledger.
An agent takes actions: edits a file, sends a message, opens a ticket, queues a deploy. Reviewers are expected to attest to those actions after the fact. The ledger stores the action record, then later stores a reviewer signature or approval event.
On paper this is clean: signatures are queryable, and each attestation payload can be verified against its action hash.
Now ask what a missing attestation means.
Maybe the reviewer rejected the action verbally and never clicked anything. Maybe the reviewer has not seen it yet. Maybe the action should have been routed to a reviewer, but the routing rule skipped it. Maybe the organization has quietly learned that unsigned records are normal because nobody gets paged for them.
The ledger cannot tell.
A record nobody attested is byte-for-byte indistinguishable from a record whose reviewer just has not gotten to it. At scale, silence becomes consent. The dashboard still shows a healthy append-only history. The signatures that do exist verify cleanly. The audit trail has integrity over the records it contains.
The missing state is doing the damage.
The repair is to make silence expire. An unattested action needs to age into a positive state that can be queried and alerted on. Pending is allowed only inside a defined review window. After that, the system must append a terminal event such as REVIEW_UNRESOLVED, REVIEW_EXPIRED, or REVIEW_REPUDIATED.
That changes the reader's view of the log. The query no longer asks only for approved records. It asks for actions whose current review state is one of approved, repudiated, or unresolved. The bad case has a name.
This is not cosmetic. A state named unresolved can break a release gate. It can page the owner of the queue. It can be counted without pretending that pending is a harmless neutral value.
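Here is a minimal sketch of that expiry sweep, not a definitive implementation. The `Event` shape, the 24-hour `REVIEW_WINDOW`, and the `REVIEW_APPROVED` event name are assumptions made for illustration; only `REVIEW_UNRESOLVED` and `REVIEW_REPUDIATED` come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed review window; a real system would set this per policy.
REVIEW_WINDOW = timedelta(hours=24)

# REVIEW_APPROVED is a hypothetical name for the reviewer's positive signature event.
TERMINAL = {"REVIEW_APPROVED", "REVIEW_REPUDIATED", "REVIEW_UNRESOLVED"}


@dataclass(frozen=True)
class Event:
    action_id: str
    kind: str       # "ACTION" or one of the TERMINAL kinds
    at: datetime


def expire_silent_reviews(log: list[Event], now: datetime) -> list[Event]:
    """Return the REVIEW_UNRESOLVED events to append for actions whose
    review window lapsed without any terminal review event."""
    resolved = {e.action_id for e in log if e.kind in TERMINAL}
    return [
        Event(e.action_id, "REVIEW_UNRESOLVED", now)
        for e in log
        if e.kind == "ACTION"
        and e.action_id not in resolved
        and now - e.at > REVIEW_WINDOW
    ]
```

If the returned events are appended to the log, running the sweep again is harmless: an action that already carries a terminal event is skipped on the next pass. A release gate can then query for `REVIEW_UNRESOLVED` directly.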
Silence needs an expiry date.
Claims need provenance
A second failure looks different because it happens in prose.
An agent says, "the file was empty." Or: "I confirmed the deploy succeeded." Or: "the customer account has no open invoices."
There is no fake tool-output block. No forged observation ID. No counterfeit result with the wrong schema. The model did not fabricate a provenance marker; it skipped the provenance question entirely.
A detector that hunts forged artifacts has nothing to match.
This matters because many agent systems treat prose as a soft channel until it becomes operationally relevant. The agent writes an explanation, then a planner or policy engine reads that explanation, extracts intent, and proceeds. The sentence "I confirmed the deploy succeeded" can become a dependency for the next step even when no deploy-status tool call exists.
A smarter forged-output detector will not fix this. The problem is the definition of a well-formed claim.
If an assertion about world state can influence a downstream action, it must cite an observation. That observation might be a tool result ID, a file snapshot hash, a database read event, or another typed artifact with a clear producer. Without that citation, the message is malformed for operational purposes.
The enforcement point does not need to understand whether "the file was empty" is true. It only needs to know whether the claim carries a usable reference.
A simple shape is enough:
{
  "claim": "deploy succeeded",
  "subject": "service.api",
  "observation_id": "obs_48291",
  "supports_action": "promote_release"
}
The prose can still exist. People like prose. But anything that gates a side effect should depend on the structured claim, and the structured claim should fail closed when observation_id is missing or points to an observation of the wrong type.
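A fail-closed check over that claim shape might look like the sketch below. The `Observation` record, the `REQUIRED_KIND` policy table, and the `MalformedClaim` exception are hypothetical names invented for this example; the claim fields are the ones shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    id: str
    kind: str       # e.g. "deploy_status", "file_snapshot"
    producer: str   # which tool produced it


# Hypothetical policy: the observation kind each gated action requires.
REQUIRED_KIND = {"promote_release": "deploy_status"}


class MalformedClaim(Exception):
    """The claim cannot gate a side effect."""


def check_claim(claim: dict, observations: dict[str, Observation]) -> Observation:
    """Return the supporting observation, or raise. Never judges truth,
    only whether the claim carries a usable, correctly typed reference."""
    action = claim.get("supports_action")
    required = REQUIRED_KIND.get(action)
    if required is None:
        raise MalformedClaim(f"no policy for action {action!r}")
    obs_id = claim.get("observation_id")
    if not obs_id:
        raise MalformedClaim("claim cites no observation")
    obs = observations.get(obs_id)
    if obs is None:
        raise MalformedClaim(f"observation {obs_id!r} does not exist")
    if obs.kind != required:
        raise MalformedClaim(f"{action!r} needs {required!r}, got {obs.kind!r}")
    return obs
```

Every path that does not end in a matching observation raises, so a caller that forgets to handle the error still cannot proceed to the side effect.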
This converts an unverifiable semantics problem into a missing-citation problem. Missing citations are checkable.
That boundary is where a lot of agent safety work gets sharper. Do not try to infer from model text whether the agent "really checked." Make it impossible for a claim about external state to count unless it names the observation that supports it.
The claim can be wrong with a citation. The cited tool can be buggy. The external system can lie. Those are real problems. They are at least problems with artifacts attached.
An uncited claim is negative space pretending to be knowledge.
Intent before effect
The third failure is old, and agent systems make it easier to hit.
A worker sends an email, opens a pull request, charges a card, posts a comment, or triggers a deploy. Then it dies before appending the result event.
On replay, the log cannot tell whether the side effect already happened. It sees no outcome. Blind retry risks doing the action twice. Blind skip risks dropping it.
The intuitive version of event sourcing says "append the result after the work." That is too late for external side effects. The dangerous gap sits between the effect and the log write.
The repair is a two-event split.
First append INTENT, carrying an idempotency_key, the target, the operation, and enough parameters to reconcile later. Then perform the side effect. Then append OUTCOME with the external reference or error.
Now the log can represent the uncomfortable middle:
- INTENT exists, OUTCOME exists: the operation reached a terminal recorded state.
- INTENT exists, OUTCOME missing: reconciliation required.
- no INTENT: nothing should have been attempted at all, and any external trace is out of protocol.
That middle state is the whole point. Intent-without-outcome names a concrete piece of work, with a defined question to ask.
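The write path and the three states can be sketched in a few lines. This assumes an in-memory list standing in for the append-only log and dict-shaped events; `run_with_intent`, `classify`, and the field names are illustrative, not a prescribed schema.

```python
import uuid
from typing import Callable


def run_with_intent(log: list[dict], target: str, operation: str, params: dict,
                    effect: Callable[[str, dict], str]) -> str:
    """Append INTENT, perform the side effect, then append OUTCOME."""
    key = str(uuid.uuid4())
    log.append({"type": "INTENT", "idempotency_key": key,
                "target": target, "operation": operation, "params": params})
    # A crash anywhere below leaves INTENT in the log with no OUTCOME.
    try:
        external_ref = effect(key, params)
    except Exception as exc:
        # Records the error as the outcome. Ambiguous failures such as
        # timeouts may be better left for reconciliation instead.
        log.append({"type": "OUTCOME", "idempotency_key": key, "error": repr(exc)})
        raise
    log.append({"type": "OUTCOME", "idempotency_key": key,
                "external_ref": external_ref})
    return key


def classify(log: list[dict], key: str) -> str:
    """Map one idempotency key to the three states listed above."""
    kinds = {e["type"] for e in log if e["idempotency_key"] == key}
    if "INTENT" not in kinds:
        return "no_intent"
    return "terminal" if "OUTCOME" in kinds else "reconciliation_required"
```

In a real deployment the INTENT append would need to be durable before the effect runs; an in-memory list only illustrates the ordering.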
A reconciler can ask the external system, "do you have an operation with this idempotency_key?" If yes, append the observed outcome. If no, retry using the same key. If the external system cannot answer by key, escalate to manual resolution or a domain-specific compensating action.
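A rough sketch of that reconciler follows. The `ExternalSystem` protocol, the `LookupUnsupported` exception, and the `ESCALATED` event type are assumptions made for this example; real targets expose lookup and retry in their own ways.

```python
from typing import Optional, Protocol


class LookupUnsupported(Exception):
    """Hypothetical: the target cannot answer queries by idempotency key."""


class ExternalSystem(Protocol):
    def lookup(self, idempotency_key: str) -> Optional[str]: ...
    def execute(self, idempotency_key: str, params: dict) -> str: ...


def reconcile(log: list[dict], external: ExternalSystem) -> list[dict]:
    """Resolve every INTENT that has neither an OUTCOME nor an escalation."""
    settled = {e["idempotency_key"] for e in log
               if e["type"] in ("OUTCOME", "ESCALATED")}
    appended = []
    for e in log:
        key = e["idempotency_key"]
        if e["type"] != "INTENT" or key in settled:
            continue
        try:
            ref = external.lookup(key)
        except LookupUnsupported:
            # Cannot answer by key: hand off to manual resolution.
            appended.append({"type": "ESCALATED", "idempotency_key": key,
                             "reason": "no lookup by idempotency key"})
            continue
        if ref is None:
            # Not found: retry with the SAME key so a duplicate is deduplicated.
            ref = external.execute(key, e["params"])
            source = "retried"
        else:
            source = "observed"
        appended.append({"type": "OUTCOME", "idempotency_key": key,
                         "external_ref": ref, "source": source})
    log.extend(appended)
    return appended
```

The `source` field is optional, but it keeps the log honest about whether an outcome was observed after the fact or produced by a retry.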
There is an honest limit here: this only works if the downstream system honors the idempotency key or exposes enough query surface to reconcile by it. If the target system treats every retry as a fresh command and gives you no stable lookup path, no amount of log discipline will fully save you.
That boundary is the real design problem.
For agent systems, this bites whenever a tool call mutates external state and the worker dies before it can record anything. The replay system sees absence. Absence is not evidence.
An INTENT event gives absence a contour. It marks the place where the system crossed from planning into attempted mutation. Without it, the log asks future code to infer history from a blank space.
Unknown cannot be a warehouse
A dashboard that marks unverified claims as unknown is better than one that assumes success. For a while.
Suppose an agent reviews repository changes and emits facts: tests passed, dependency scan clean, migration generated, rollback path present. The dashboard refuses to show green unless each fact cites an observation. Missing observations render as unknown.
That is honest. It prevents false confidence. It also degrades quickly if unknowns never settle.
The first week, unknown means "needs follow-up." Later, it means "normal backlog." Eventually, it becomes the dominant state. The dashboard has stopped lying, but it has also stopped helping. Teams learn to filter unknown away because otherwise every view is noise.
Distinguishing zero from unknown has no value unless something forces unknowns to resolve.
Every unknown needs a reconciliation deadline and an owner. After the deadline, the system must append a positive artifact: CLAIM_VERIFIED, CLAIM_DISPROVED, CLAIM_UNRESOLVED, or a domain-specific terminal state. The dashboard should age unknowns visibly. A fresh unknown and a stale unknown are not the same operational condition.
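One way to sketch the aging and the deadline sweep, under assumed names: `UnknownClaim`, the three age buckets, and the halfway threshold for "stale" are all illustrative choices. Only the `CLAIM_UNRESOLVED` event name comes from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class UnknownClaim:
    claim_id: str
    owner: str            # who is on the hook to resolve it
    opened_at: datetime
    deadline: datetime    # reconciliation deadline


def age_bucket(c: UnknownClaim, now: datetime) -> str:
    """Bucket for the dashboard so fresh and stale unknowns render differently."""
    if now >= c.deadline:
        return "overdue"
    # Assumed threshold: stale once half the allowed time has elapsed.
    halfway = (c.deadline - c.opened_at) / 2
    return "stale" if now - c.opened_at >= halfway else "fresh"


def sweep(unknowns: list[UnknownClaim], now: datetime) -> list[dict]:
    """Emit a positive CLAIM_UNRESOLVED artifact for every unknown past deadline."""
    return [
        {"type": "CLAIM_UNRESOLVED", "claim_id": c.claim_id,
         "owner": c.owner, "at": now.isoformat()}
        for c in unknowns
        if now >= c.deadline
    ]
```

The sweep's output carries the owner, so the terminal event arrives already addressed to someone rather than landing in a shared backlog.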
This is the same shape as the attestation problem, but it bites in analytics and governance layers rather than approval flows. The system correctly refuses to invent a fact. Then it forgets to create the work needed to learn the fact.
Unknown is a staging state, not storage.
A useful dashboard makes the absence of evidence expensive to ignore. It does not let absence sit forever as a gray cell in a table that everyone scrolls past.
Make negative space queryable
The common move across these cases is simple: convert absence into a positive artifact that checking machinery can grab.
Deadlines turn silence into a terminal review state. Claim schemas turn missing provenance into a malformed message. Intent events turn "maybe it ran" into "intent recorded at step N, outcome missing." Reconciliation deadlines turn accumulated unknowns into assigned work.
The design rule is harsher than most logging guidelines: for every artifact your system emits on success, ask what the reader of the log sees when that artifact is missing.
If the answer is "nothing," you have a blind spot exactly where your worst incident will live.
This applies to audit systems too. An auditor can verify every hash in the chain and still miss that a third of the actions never produced records. A red team can check that forged tool outputs are caught and still miss that uncited prose is accepted as evidence. Integrity over existing records does not prove completeness of the set.
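The gap between integrity and completeness is easy to show in a toy hash chain. This is a sketch under assumptions: the chain layout and helper names are invented here, and `expected_action_ids` must come from a source independent of the log (a dispatcher's own counter, the external system's records), which this example simply takes as given.

```python
import hashlib
import json

GENESIS = "0" * 64


def _digest(prev: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev + body).encode()).hexdigest()


def chain(records: list[dict]) -> list[dict]:
    """Build a simple hash chain over the records that were actually written."""
    prev, entries = GENESIS, []
    for r in records:
        h = _digest(prev, r)
        entries.append({"record": r, "prev": prev, "hash": h})
        prev = h
    return entries


def verify_integrity(entries: list[dict]) -> bool:
    """True if every entry links to its predecessor and hashes correctly."""
    prev = GENESIS
    for e in entries:
        if e["prev"] != prev or _digest(prev, e["record"]) != e["hash"]:
            return False
        prev = e["hash"]
    return True


def missing_records(entries: list[dict], expected_action_ids: set[str]) -> set[str]:
    """Completeness check: which expected actions never produced a record?"""
    recorded = {e["record"]["action_id"] for e in entries}
    return expected_action_ids - recorded
```

A chain built from two of three actions passes `verify_integrity` and still reports the third from `missing_records`. The first check can only ever speak for records that exist.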
Completeness is where omission hides.
The hard part is that the absence has to be represented before the incident. Afterward, everyone can point at the empty place in the log and say a record should have been there. That is cheap hindsight. The system needs to know, while running, that the empty place is meaningful.
So design the negative states as first-class records. Give them names. Give them owners. Put them in queries. Make them fail gates.
Otherwise the log will say nothing, and nothing will be read as whatever is most convenient.
What does your system record when the most important thing is missing?