Posted on • Originally published at github.com

Why auditing AI agents requires causal AI, not another LLM

Your Logs Tell You What Happened. They Don't Tell You What Should Have Happened.

Haotian Liu · March 2026


The gap nobody talks about

Your AI agent ran overnight. The result is wrong. You open the terminal — and you see a wall of log lines telling you exactly what the agent did, step by step.

But none of those lines tell you what it was supposed to do. And none of them tell you where it started going off the rails.

This is not a logging problem. This is a structural gap in how we think about agent observability.

Logs record the execution. They do not record the intent. Without intent, you cannot measure deviation. Without measuring deviation, you are not auditing — you are just collecting noise.


Why existing tools don't solve this

Tools like LangSmith, Langfuse, and Arize are genuinely useful for what they do: tracing execution, tracking latency and cost, visualizing call chains. If you need to know how long your agent took or how many tokens it consumed, these tools are excellent.

But they are built on a flat timeline model. They record what happened. They do not record what the system intended to happen. And crucially, most of them evaluate output quality using another LLM as a judge.

This is the paradox: a probabilistic system cannot render deterministic judgment about another probabilistic system. An LLM evaluator is itself uncertain; its output varies between runs. Using it to audit an agent is like asking one suspect to verify another suspect's alibi.

You cannot build forensic-grade evidence on probabilistic foundations.


What causal auditing actually means

The alternative is to separate the question into two parts that can be answered deterministically:

1. What was the agent supposed to do? This is defined explicitly, before runtime, as a set of constraints: no staging URLs in production config, trade amount below 500, file writes only within the project directory.

2. What did the agent actually do, and how far did it deviate? This is recorded at runtime by comparing every action against the pre-defined constraints.

Neither question requires an LLM to answer. Both can be answered by deterministic, mathematical comparison.
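To make this concrete, the deviation check needs nothing beyond plain comparisons. A minimal sketch — the constraint keys and record shape here are illustrative assumptions, not k9's actual API:

```python
def check_action(action: dict, constraints: dict) -> list[str]:
    """Compare one recorded action against pre-defined constraints.

    Returns a list of violation messages; an empty list means compliant.
    Purely deterministic: the same inputs always yield the same verdict.
    """
    violations = []
    # Content constraints: forbidden substrings (e.g. staging hostnames)
    for pattern in constraints.get("deny_content", []):
        if pattern in str(action.get("content", "")):
            violations.append(f"content contains forbidden pattern '{pattern}'")
    # Numeric constraints: e.g. a trade-amount ceiling
    max_amount = constraints.get("max_amount")
    if max_amount is not None and action.get("amount", 0) > max_amount:
        violations.append(f"amount {action['amount']} exceeds limit {max_amount}")
    return violations

# A write that smuggles a staging URL into production config
bad = {"content": '{"endpoint": "https://api.staging.internal/v2"}'}
print(check_action(bad, {"deny_content": ["staging.internal"]}))
# → ["content contains forbidden pattern 'staging.internal'"]
```

No model call, no sampling, no variance between runs: the verdict is a function of the action and the contract, nothing else.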

This is the CIEU model — Causal Intent-Execution Unit. Every monitored action produces a five-tuple:

```text
X_t   — who acted, and under what conditions
U_t   — what the agent actually did
Y*_t  — what the agent was supposed to do (the intent contract)
Y_t+1 — what actually resulted
R_t+1 — how far the outcome diverged from intent, and why
```
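As a data structure, one such unit is just a record with five fields. A sketch of how it might look in Python — the field names follow the article's notation, but the concrete schema and values are illustrative assumptions:

```python
from dataclasses import dataclass, asdict

@dataclass
class CIEURecord:
    """One Causal Intent-Execution Unit (illustrative schema)."""
    x_t: dict       # X_t: actor and conditions
    u_t: dict       # U_t: the action actually taken
    y_star_t: dict  # Y*_t: the intent contract (constraints)
    y_t1: dict      # Y_{t+1}: the observed result
    r_t1: dict      # R_{t+1}: divergence from intent, and why

record = CIEURecord(
    x_t={"actor": "claude-code", "session": "backtest"},
    u_t={"tool": "_write_file", "path": "./project/config.json"},
    y_star_t={"deny_content": ["staging.internal"]},
    y_t1={"written": False, "flagged": True},
    r_t1={"severity": 0.9, "finding": "forbidden pattern matched"},
)
# asdict() gives a plain dict, ready to serialize as one JSONL line
print(asdict(record)["r_t1"]["severity"])  # → 0.9
```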

These five fields are written into a local ledger as a hash-chained record. Each record's SHA256 hash is embedded into the next record. Nothing can be silently modified after the fact. The chain is cryptographically verifiable.
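The hash chain itself is simple enough to sketch in a few lines. Assuming each entry stores its predecessor's hash (the `prev_hash` field name is hypothetical), any retroactive edit breaks every later link:

```python
import hashlib
import json

def chain_hash(record: dict, prev_hash: str) -> str:
    """SHA256 over the previous hash plus the canonicalized record."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def verify_chain(ledger: list[dict]) -> bool:
    """Recompute every link; a single silent edit invalidates the chain."""
    prev = "0" * 64  # genesis value
    for entry in ledger:
        if entry["prev_hash"] != prev:
            return False
        body = {k: v for k, v in entry.items() if k != "prev_hash"}
        prev = chain_hash(body, prev)
    return True

# Build a two-entry chain, then tamper with the first record
ledger = []
prev = "0" * 64
for rec in ({"seq": 1, "action": "write"}, {"seq": 2, "action": "trade"}):
    ledger.append({**rec, "prev_hash": prev})
    prev = chain_hash(rec, prev)
print(verify_chain(ledger))   # → True
ledger[0]["action"] = "read"  # retroactive edit
print(verify_chain(ledger))   # → False
```

This is why a ledger is categorically different from a log file: a log line can be edited in place, but here the edit is mathematically visible.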

This is not a new log format. It is a different category of infrastructure: tamper-evident causal evidence.


A real example: three silent writes

On March 4, 2026, during a routine quant backtesting session, Claude Code attempted three times — 41 minutes apart — to write a staging environment URL into a production config file:

```json
{"endpoint": "https://api.market-data.staging.internal/v2/ohlcv"}
```

The syntax was valid. No exception was thrown. A conventional logger would have recorded three "file write" events and moved on — quietly corrupting every subsequent backtest result.

Because the function was instrumented with a CIEU constraint:

```python
@k9(deny_content=["staging.internal"], allowed_paths=["./project/**"])
def write_config(path: str, content: dict) -> bool: ...
```

...all three attempts were flagged immediately, written to the ledger with severity 0.9, and made permanently traceable. The root cause was identified in under a second:

```text
k9log trace --last
→ seq=451  VIOLATION  _write_file
   finding: content contains forbidden pattern 'staging.internal'
   causal_proof: root cause traced to step #449, chain intact
```

Three attempts. 41 minutes apart. All recorded. All verifiable.


The three moments when this matters

When something goes wrong at 3am. You don't want to read 10,000 log lines. You want to run one command and see: which step deviated, from which constraint, by how much. That is what k9log trace --last gives you — in under a second.

When you need to show proof. Your agent caused a problem in production. Your client asks what happened. You pull up a terminal screenshot. It could have been edited. Nobody trusts it. A SHA256 hash-chained ledger, verified with k9log verify-log, is cryptographic proof that the record has not been tampered with since it was written. That is evidence a screenshot cannot provide.

When you need approval to deploy. Your manager asks: what happens if the agent goes out of bounds? Without a concrete answer, the project dies in the approval meeting. With CIEU constraints defined and a verifiable ledger in place, the answer is: every action is measured against explicit rules, deviations are flagged immediately, and the record cannot be altered retroactively. That answer gets projects approved.


What this looks like in practice

For a Python developer, instrumentation is one decorator:

```python
@k9(deny_content=['staging.internal'], amount={'max': 500})
def execute_trade(symbol: str, amount: float, endpoint: str) -> dict:
    ...
```

For a Claude Code user, it is one JSON file in the project root — no code changes required:

```json
{
  "hooks": {
    "PreToolUse":  [{"matcher": "*", "hooks": [{"type": "command", "command": "python -m k9log.hook"}]}],
    "PostToolUse": [{"matcher": "*", "hooks": [{"type": "command", "command": "python -m k9log.hook_post"}]}]
  }
}
```

The ledger is stored locally at ~/.k9log/logs/k9log.cieu.jsonl. No data leaves the machine. No tokens are consumed. No per-event billing.


The boundary worth stating clearly

CIEU auditing answers one question: did the agent violate the constraints you defined?

It does not answer: did the agent accomplish the goal you gave it? That question requires evaluation of task completion, which is a different problem — and one that legitimately benefits from LLM evaluation. The two approaches are not in competition. CIEU auditing provides the deterministic foundation; higher-level evaluation can be built on top.

The mistake is trying to use a probabilistic evaluator as a substitute for a deterministic record. These are not interchangeable.


Who is this for

| Scenario | Entry point | Key commands | Notes |
|---|---|---|---|
| ⭐ Claude Code user | One `.claude/settings.json` | `k9log trace` / `stats` | Zero Python required. Every tool call auto-recorded. Unique differentiator vs all competitors. |
| ✅ Python developer | `@k9` decorator | `k9log trace --last` / `report` | One decorator per function. Sync and async both supported. |
| ✅ LangChain agent | `K9CallbackHandler` (3 lines) | `k9log trace` / `verify-log` | Native callback hook. Full CIEU records per tool call. |
| ✅ High-risk business ops | `@k9` + JSON config | `k9log alerts` / `causal` | Finance, config writes, DB ops. Numeric + content constraints. |
| ✅ DevOps / CI pipeline | `@k9` + `ci_check.py` | `ci_check.py` / `verify-log` | Pipeline halts on violation. Exit code non-zero. No manual review. |
| ✅ Small team debugging | Any entry point | `k9log trace --last` / `stats` | Root cause in under a second. No log archaeology required. |
| ✅ Data security | `@k9` `deny_content` | `k9log verify-log` / `report` | File access control. Cryptographic proof of what was touched. |
| ✅ Teaching / tutorials | Any entry point | `k9log report` / `causal` | Easiest audience to reach today. HTML report is shareable. Demo violations visually. |
| 🔲 CrewAI / AutoGen | Wrapper pattern | `k9log trace` | Works via `@k9` on tool functions. Native adapters on roadmap. |
| 🔲 Enterprise compliance | Full audit chain | `k9log verify-log` / `report` | Future use case. Needs organisational trust-building first. |

⭐ = unique differentiator    ✅ = works today    🔲 = roadmap


Try it

K9 Audit is open source under AGPL-3.0. The CIEU architecture is covered by U.S. Provisional Patent Application No. 63/981,777.

If this resonates with a problem you have hit — or if you think the approach is wrong — I want to hear from you.
