Teller
Your AI agent called a tool. Can you prove it followed the rules?

Your agent just wrote a file. You have logs. But can you answer this:

Was the policy gate applied before the tool ran — or after?

Logs can't tell you that. Here's how we solved it.

The gap in current agent frameworks

Most frameworks give you a log line like:

[2026-07-07T09:00:01Z] tool:fs_write path=/tmp/report.txt status=ok

That tells you the tool ran. It doesn't tell you:

  • Whether a policy evaluated the call first
  • What the pre-state looked like before the write
  • Whether the agent was within its token and risk budget
  • Which agent in a delegation chain authorized this

For a hobby project, that's fine. For anything touching real data, it's not.

AEP: structured proof, not a log stream

WasmAgent's @wasmagent/aep package records every tool call as an ActionEvidence object — Zod-validated, schema-versioned, with pre/post state digests baked in.

import { AEPEmitter } from "@wasmagent/aep";

const emitter = new AEPEmitter({
  run_id: "run-abc123",
  repo_commit: "5c1102f",
  model_id: "claude-sonnet-4-6",
});

// Before tool execution:
emitter.addAction({
  tool_name: "fs_write",
  state_changing: true,
  capability_decision: {
    capability: "fs_write",
    subject: "agent:run-abc123",
    resource: "/tmp/report.txt",
    decision: "allow",
    reason_code: "policy:default-v1",
  },
  precondition_digest: "sha256:a1b2c3...",
  input_taint_labels: ["user_provided"],
});

// After tool execution:
emitter.addAction({
  tool_name: "fs_write",
  state_changing: true,
  post_state_digest: "sha256:d4e5f6...",
});

emitter.setBudgetLedger({
  token_budget: { limit: 4000, spent: 142 },
  risk_budget:  { limit: 1.0,  spent: 0.2 },
});

const record = emitter.build();

The capability_decision is part of the same record as the action — not a separate log entry that could be reordered or dropped.
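To make the "same record" idea concrete, here is a minimal sketch of a gate that evaluates the policy first, runs the tool only on "allow", and returns one object holding the decision and both digests. The field names mirror the example above, but `runGated`, `digest`, and `GatedEvidence` are hypothetical helpers written for illustration, not part of @wasmagent/aep. An in-memory string stands in for the file.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shapes for illustration; field names mirror the example above,
// but this is not the @wasmagent/aep API.
type Decision = "allow" | "deny";

interface GatedEvidence {
  tool_name: string;
  decision: Decision;
  reason_code: string;
  precondition_digest: string;
  post_state_digest?: string; // absent when the call was denied
}

// Digest a state snapshot in the "sha256:<hex>" form used in the example.
function digest(state: string): string {
  return "sha256:" + createHash("sha256").update(state).digest("hex");
}

// Evaluate the policy first, run the tool only on "allow", and return
// one object that carries the decision together with both digests.
function runGated(
  toolName: string,
  readState: () => string,
  policy: () => { decision: Decision; reason_code: string },
  tool: () => void,
): GatedEvidence {
  const precondition_digest = digest(readState());
  const { decision, reason_code } = policy();
  const evidence: GatedEvidence = {
    tool_name: toolName,
    decision,
    reason_code,
    precondition_digest,
  };
  if (decision === "allow") {
    tool();
    evidence.post_state_digest = digest(readState());
  }
  return evidence;
}

// Usage: an in-memory "file" stands in for /tmp/report.txt.
let file = "";
const allowed = runGated(
  "fs_write",
  () => file,
  () => ({ decision: "allow", reason_code: "policy:default-v1" }),
  () => { file = "report body"; },
);
const denied = runGated(
  "fs_write",
  () => file,
  () => ({ decision: "deny", reason_code: "policy:default-v1" }),
  () => { file = "should never be written"; },
);
```

Because the decision and the digests travel in one object, a denied call shows up as a record with no post_state_digest rather than as a missing log line.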

OTel spans for everything else

For real-time observability, AEP also emits named OpenTelemetry spans:

import { AEP_SPAN_NAMES } from "@wasmagent/otel-exporter";

// Plug into any OTel collector:
AEP_SPAN_NAMES.TOOL_CALL       // "tool.call"
AEP_SPAN_NAMES.POLICY_CHECK    // "policy.check"
AEP_SPAN_NAMES.SANDBOX_EXEC    // "sandbox.exec"
AEP_SPAN_NAMES.VERIFIER_CHECK  // "verifier.check"
AEP_SPAN_NAMES.LLM_GENERATE    // "llm.generate"
AEP_SPAN_NAMES.MCP_REQUEST     // "mcp.request"
// + 3 more

The spans go to Grafana/Jaeger/etc. The AEPRecord is what you keep for audit and training data.
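On the live side, the span names alone are enough for a rough ordering check. The sketch below assumes a simplified span shape (`SpanRecord`, hypothetical, not an OpenTelemetry type) and flags any "tool.call" span that has no "policy.check" span finishing before it started. Treat it as an alerting heuristic; the AEPRecord remains the thing you audit.

```typescript
// Hypothetical, simplified span shape; real OTel spans carry more fields.
interface SpanRecord {
  name: string;
  startMs: number;
  endMs: number;
}

// True when every "tool.call" span has at least one "policy.check" span
// that ended at or before the tool call started.
// Note: returns true for a trace with no tool calls at all.
function policyPrecededEveryTool(spans: SpanRecord[]): boolean {
  const checks = spans.filter((s) => s.name === "policy.check");
  return spans
    .filter((s) => s.name === "tool.call")
    .every((call) => checks.some((check) => check.endMs <= call.startMs));
}

// Usage: one trace where the gate ran first, one where it ran too late.
const gatedTrace: SpanRecord[] = [
  { name: "policy.check", startMs: 0, endMs: 5 },
  { name: "tool.call", startMs: 6, endMs: 9 },
];
const lateGateTrace: SpanRecord[] = [
  { name: "tool.call", startMs: 0, endMs: 4 },
  { name: "policy.check", startMs: 5, endMs: 7 },
];

const gatedOk = policyPrecededEveryTool(gatedTrace);
const lateOk = policyPrecededEveryTool(lateGateTrace);
```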

Multi-agent: delegation chain

In a single-agent setup, this evidence is useful. In a multi-agent setup, where an orchestrator delegates to a subagent, it becomes essential to record who authorized what:

run_context: {
  agent_id: "orchestrator",
  subagent_id: "coder-agent",
  delegation_chain: ["orchestrator", "planner", "coder-agent"],
  scope_lease_id: "lease-xyz",  // ← subagent can only do what parent granted
}
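A rough sketch of how a verifier might use these fields: check that the chain starts at the delegating agent and ends at the subagent, then check that the requested capability is covered by the lease. The field names come from the fragment above; the `LeaseTable` lookup and both functions are assumptions made for illustration, as is the rule that the chain runs from agent_id to subagent_id.

```typescript
// Field names follow the run_context fragment above; the checks are illustrative.
interface RunContext {
  agent_id: string;
  subagent_id: string;
  delegation_chain: string[];
  scope_lease_id: string;
}

// Hypothetical lease table: lease id -> capabilities the parent granted.
type LeaseTable = Map<string, Set<string>>;

// Assumption: the chain begins with the delegating agent and ends with the subagent.
function chainIsConsistent(ctx: RunContext): boolean {
  const chain = ctx.delegation_chain;
  return (
    chain.length > 0 &&
    chain[0] === ctx.agent_id &&
    chain[chain.length - 1] === ctx.subagent_id
  );
}

// The subagent may use a capability only if its lease exists and includes it.
function isWithinLease(ctx: RunContext, capability: string, leases: LeaseTable): boolean {
  const granted = leases.get(ctx.scope_lease_id);
  return chainIsConsistent(ctx) && granted !== undefined && granted.has(capability);
}

// Usage with the values from the fragment above.
const ctx: RunContext = {
  agent_id: "orchestrator",
  subagent_id: "coder-agent",
  delegation_chain: ["orchestrator", "planner", "coder-agent"],
  scope_lease_id: "lease-xyz",
};
const leases: LeaseTable = new Map([["lease-xyz", new Set(["fs_write"])]]);

const canWrite = isWithinLease(ctx, "fs_write", leases);
const canExec = isWithinLease(ctx, "shell_exec", leases);
```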

Try it

git clone https://github.com/WasmAgent/wasmagent-js
cd wasmagent-js
bun install
bun test packages/aep/src/

Next in this series: MCP Trust Pack — the gateway layer that enforces policy before tools execute.

Code: packages/aep · packages/otel-exporter
