DEV Community

ArkForge

Posted on • Originally published at arkforge.tech

MCP Execution Attestation: What Happens Between Tool Call and Tool Result

MCP gives you a tool_call and a tool_result. Everything in between—the actual execution—is a black box. Here's what happens there, why it matters for compliance and A2A trust, and how to attest it.


MCP gives you two events: tool_call and tool_result. The protocol is well-specified. The transport is observable. The schema is typed.

What you don't get is a record of what happened between those two events.

That gap has a name in AI governance work: the execution attestation problem. It's distinct from audit logging. You can log every tool call perfectly and still have no proof of what the tool did—whether it mutated state, which records it touched, whether the result it returned reflects what actually happened.

This matters as soon as agents start running with real authority.


The Black Box Between Call and Result

When a model issues a tool call—say search_orders(customer_id="C-4421", status="pending")—here's what the protocol captures:

  • The model's intent: the tool name and arguments it generated
  • The result: whatever the server returned

What's missing: everything that happened inside the tool.

The tool might have:

  • Queried a database (which records? which version of the data?)
  • Called an external API (which endpoint? what response code?)
  • Written a side effect (a log entry, a cache update, a webhook trigger)
  • Applied business logic that filtered, transformed, or redacted the raw result

None of this is captured. The tool_result you receive is the tool's self-report. You're trusting the tool's word that it ran correctly, returned accurate data, and had only the side effects it was supposed to have.

In a single-agent, developer-controlled setup, this is usually fine. You wrote the tool; you trust it.

In an A2A workflow—where an orchestrator calls tools via a subagent's MCP server—or in any regulated context, it's not fine at all.


Why Audit Logs Don't Solve This

The standard response is "log everything." Application logs capture execution activity. But logs have a fundamental limitation: they're self-generated by the system you're trying to verify.

If a tool logs executed_successfully: true, that log lives in the same trust boundary as the tool itself. The same infrastructure, the same admin access, the same failure modes. If the tool misbehaves—or is compromised—its logs can be inconsistent, missing, or actively misleading.

This is the same problem as with any self-attesting evidence. The compliance requirement isn't "show me your logs." It's "show me proof you can't have fabricated."

There's also a structural gap that logs miss entirely: the binding between model intent and execution result. Even with complete logs on both sides, you can't cryptographically prove that the tool_result the model received is the same result the tool produced—and that nothing modified it in transit.


What Execution Attestation Actually Requires

For an MCP tool execution to be attested, you need three things bound together:

  1. The call: what the model requested (tool name + arguments), with a timestamp and session identifier
  2. The execution proof: evidence from outside the tool's trust boundary that it ran—and what it did
  3. The result binding: cryptographic proof that the result delivered to the model is the same result produced by execution

The third point is the one most implementations miss. You can log the call and log the result separately. But without a signed binding that links them, you can't prove they correspond to the same execution event.
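As a concrete sketch, the three bound pieces might be modeled as a single record like this. The field names are illustrative, not from the MCP spec or any particular product:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # immutable once created, like a sealed record
class AttestedExecution:
    """Illustrative record binding the three required pieces together."""
    # 1. The call: what the model requested
    tool_name: str
    args_json: str
    call_id: str
    timestamp: float
    # 2. The execution proof: evidence from outside the tool's trust boundary
    execution_proof: str
    # 3. The result binding: hash chaining the result to this exact call
    result_hash: str
    # Signed by a key the tool server does not control
    signature: str
```

The point of keeping all seven fields in one signed record, rather than logging the call and result separately, is exactly the binding problem described above: a verifier checks one signature and gets all three guarantees at once.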


The Attestation Pattern

The simplest pattern is an execution envelope: a signed record that wraps the tool call and result together, generated by a component outside the tool's own trust domain.

Agent
  │  tool_call (tool_name, args, call_id)
  ▼
[Attesting Proxy]  ← outside tool's trust boundary
  │  forwards call + opens envelope (records call hash)
  ▼
MCP Server (Tool executes)
  │  tool_result
  ▼
[Attesting Proxy]  ← seals envelope closed with result hash
  │  returns result to agent + stores signed envelope
  ▼
Agent
  │  tool_result (unchanged)

The proxy generates an execution envelope that contains:

  • Call hash: H(tool_name || args || call_id || timestamp)
  • Result hash: H(result || call_hash) — chains to the call
  • Attestation signature: signed by a key the tool server doesn't control

The key point: the signature is generated by a separate process. If the tool server is compromised, it can't forge the attestation signature without also compromising the attesting proxy.
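The hash chain and signature can be sketched in a few lines of Python. This is a toy illustration of the pattern, not ArkForge's actual envelope format, and it uses a symmetric HMAC where a real proxy would use an asymmetric signature (e.g. Ed25519) so that verifiers need only a public key:

```python
import hashlib
import hmac
import json
import time
import uuid


def make_envelope(tool_name: str, args: dict, result: bytes,
                  signing_key: bytes) -> dict:
    """Build a signed execution envelope (illustrative field names)."""
    call_id = str(uuid.uuid4())
    ts = time.time()
    # Call hash: H(tool_name || args || call_id || timestamp)
    call_material = json.dumps(
        [tool_name, args, call_id, ts], sort_keys=True
    ).encode()
    call_hash = hashlib.sha256(call_material).hexdigest()
    # Result hash chains to the call hash, binding the result to this call
    result_hash = hashlib.sha256(result + call_hash.encode()).hexdigest()
    # The proxy signs both hashes; the tool server never holds signing_key,
    # so a compromised tool cannot forge this signature
    sig = hmac.new(signing_key, (call_hash + result_hash).encode(),
                   hashlib.sha256).hexdigest()
    return {"call_id": call_id, "timestamp": ts, "call_hash": call_hash,
            "result_hash": result_hash, "signature": sig}
```

Because the result hash includes the call hash, swapping in a result from a different execution changes the hash and invalidates the signature, which is the binding property the pattern depends on.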


Three Lines That Wire It In

If you're running a certifying proxy like ArkForge Trust Layer, instrumenting an existing MCP call is minimal:

from arkforge_trust import certify

result = certify(client.call_tool("search_orders", {"customer_id": "C-4421"}))
# result.attestation_id  → verifiable receipt
# result.value           → original tool output, unmodified

The certify() wrapper intercepts the call, routes it through the attesting proxy, and returns the original result alongside a verifiable receipt. The call behavior is unchanged; the attestation runs out of band.

The receipt is a signed JSON envelope. Any downstream system—another agent, a compliance endpoint, an external auditor—can verify it independently against the Trust Layer's public key without contacting your infrastructure.
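Independent verification is then a pure function of the receipt, the delivered result, and the verification key. A hypothetical sketch, assuming an HMAC-signed envelope with `call_hash`, `result_hash`, and `signature` fields (a public-key deployment, as described above, would verify a signature instead, so the verifier would hold only the public key):

```python
import hashlib
import hmac


def verify_envelope(envelope: dict, result: bytes, key: bytes) -> bool:
    """Check a receipt against the result that was actually delivered."""
    # Recompute the chained result hash from the delivered result
    expected_result_hash = hashlib.sha256(
        result + envelope["call_hash"].encode()
    ).hexdigest()
    if expected_result_hash != envelope["result_hash"]:
        return False  # result was modified after attestation
    # Recompute the signature over both hashes
    expected_sig = hmac.new(
        key, (envelope["call_hash"] + envelope["result_hash"]).encode(),
        hashlib.sha256,
    ).hexdigest()
    return hmac.compare_digest(expected_sig, envelope["signature"])
```

Note that the verifier never contacts the tool server or the proxy: everything it needs is in the receipt, the result, and the key, which is what makes the receipt portable across agents and auditors.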


Where This Shows Up in Practice

A2A workflows: When an orchestrator delegates to a subagent, the subagent's tool calls happen outside the orchestrator's direct observation. Execution attestation lets the orchestrator verify—after the fact—that the subagent ran specific tools with specific arguments and received specific results. This is the foundation of inter-agent accountability.

EU AI Act compliance (Art. 12): High-risk AI systems need logging "sufficient to ensure" that outputs are traceable. Self-generated logs don't satisfy a strict reading of "sufficient." Attested execution envelopes do.

OWASP ASVS for AI agents: The agentic security working groups (OWASP, NIST CBRN, the A2A spec discussions) are converging on the same requirement: tool calls in autonomous agents need integrity proofs, not just logs.

Insurance and contracting: As agents handle higher-value tasks, the question "can you prove what your agent did?" has legal and financial weight. An attestation receipt is the answer to that question.


The Trust Boundary Is the Real Problem

Most of the complexity in execution attestation comes down to trust boundaries. A tool that attests its own execution hasn't solved anything—it's still self-reporting.

The architectural requirement is: the attesting component must be outside the trust boundary of the thing being attested. This is why a sidecar proxy works better than instrumentation inside the tool server itself. And why a managed attestation service works better than a sidecar you also operate.

The three-line snippet above delegates the attestation to an external service. The tool server can't influence what gets signed. If the tool returns fabricated data, the attestation receipt reflects exactly what was returned—and that discrepancy is detectable.


The Gap Is Protocol-Level, Not Implementation-Level

This isn't an MCP bug. The protocol doesn't claim to provide execution attestation. The attestation gap exists in HTTP, in gRPC, in every RPC protocol—because protocols specify message exchange, not execution proof.

The difference now is that models are calling tools autonomously. The execution gap that was acceptable in a developer-controlled tool call is not acceptable when an agent calls transfer_funds or revoke_access without a human in the loop.

Tooling is catching up. The A2A spec is adding agent card verification. OWASP is drafting agentic security controls. MCP security working groups are discussing server-side attestation proposals.

The pattern exists today. The proxies work. The receipts are verifiable. What's missing is adoption—and the defaults in most MCP implementations don't give you any of this unless you build it yourself.

That's the gap worth closing.


ArkForge Trust Layer provides execution attestation for MCP tool calls via a certifying proxy. Receipts are verifiable independently of your infrastructure. arkforge.tech
