AI agents are everywhere — approving refunds, managing infrastructure, calling APIs, making decisions that affect real users and real money. Multiple frameworks build them (LangChain, CrewAI, AutoGen, OpenClaw, custom stacks), and multiple protocols power their actions (HTTP, MCP, A2A).
But there's a fundamental gap: there is no common observability protocol for what agents actually do.
The Problem
Every framework handles logging differently, and many don't log agent actions at all. Where logs do exist, they are rarely tamper-evident.
AI hallucination is still an unsolved problem. An agent can:
- Fabricate a tool call that never happened
- Misrepresent what the LLM actually returned
- Act without proper authorization
And you'd never know.
Now think about where agents are headed — healthcare, finance, defense, critical infrastructure. In these environments, a rogue agent or a cyber attack exploiting an agent isn't hypothetical. It's inevitable.
And it's not just agents within a single system. Soon it will be common for two different enterprises' agents to transact with each other. Personal agents will negotiate with company agents on your behalf. Without a common observability format, there's no shared source of truth about what happened.
Agent History Protocol (AHP)
I built AHP as an open standard for tamper-evident, hash-chained recording of every AI agent action.
Every HTTP call, MCP tool use, A2A message, LLM inference, and authorization decision gets recorded as a cryptographically linked, append-only record. Each record contains a SHA-256 hash of the previous record. If anything is modified, deleted, inserted, or reordered — the hash chain breaks.
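To make the chaining idea concrete, here is a minimal sketch using only the Python standard library. It is not the AHP record format: the field names (`seq`, `prev_hash`, `action`, `detail`), the all-zero genesis value, and the canonical-JSON encoding are assumptions chosen for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first record's prev_hash


def record_hash(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record."""
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append(chain: list, action: str, detail: dict) -> dict:
    """Append a record that commits to the hash of the previous record."""
    prev_hash = record_hash(chain[-1]) if chain else GENESIS
    record = {"seq": len(chain), "prev_hash": prev_hash, "action": action, "detail": detail}
    chain.append(record)
    return record


def verify(chain: list) -> bool:
    """Recompute every link and check that sequence numbers are contiguous."""
    expected_prev = GENESIS
    for seq, record in enumerate(chain):
        if record["seq"] != seq or record["prev_hash"] != expected_prev:
            return False
        expected_prev = record_hash(record)
    return True


chain = []
append(chain, "http", {"method": "GET", "url": "https://example.com/orders/1"})
append(chain, "inference", {"model": "example-model"})
append(chain, "authorization", {"decision": "allow"})
print(verify(chain))  # True

chain[1]["detail"]["model"] = "other-model"  # tamper with a middle record
print(verify(chain))  # False
```

In this simplified version nothing commits to the final record, so an edit to the tail alone would go unnoticed. Closing that gap takes something outside the chain itself, such as a signed checkpoint.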
Quick Start
```shell
pip install open-ahp
```

```python
from ahp.core.chain import ChainWriter
from ahp.core.records import BootPayload

# Start recording
writer = ChainWriter("my-agent.ahp")
writer.write_boot(BootPayload(
    agent_name="my-agent",
    sdk_name="ahp-py",
    sdk_version="0.1.0",
))

# Auto-instrumentation intercepts HTTP calls automatically
# Every action gets a hash-chained, append-only record
```
TypeScript SDK:
```shell
npm install open-ahp
```
Verify and inspect:
```shell
# Verify chain integrity
ahp verify --chain my-agent.ahp

# View the action log
ahp log --chain my-agent.ahp

# Export to JSON/CSV
ahp export --chain my-agent.ahp --format csv
```
Three Conformance Levels
Level 1 — Recording: Every action gets a hash-chained record with timestamps, sequence numbers, protocol type, tool name, parameter/result hashes, response times, and authorization details.
Level 2 — Signing: Ed25519 cryptographic signatures on checkpoint records. Forged records from a different agent fail signature verification. Merkle root signature catches any tampering across the entire chain.
Level 3 — Witness: Independent witness servers hold signed receipts of chain checkpoints. Even if an agent deletes its entire chain file, the witness has cryptographic proof that the chain existed.
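As a rough illustration of how one checkpoint value can commit to many records, the sketch below computes a Merkle root over record hashes and stores it in a receipt of the kind a witness might hold. The tree construction (duplicating the last node on odd levels) and the receipt fields are assumptions, not taken from the AHP specification. The Ed25519 signing step is left out so that the example needs only the standard library.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes: list) -> bytes:
    """Hash pairs of nodes level by level until a single root remains."""
    if not leaf_hashes:
        return sha256(b"")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: duplicate the last node on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Placeholder byte strings standing in for serialized records.
records = [b"boot", b"http GET /orders/1", b"inference", b"authorization allow"]
root = merkle_root([sha256(r) for r in records])

# Hypothetical receipt a witness could keep outside the agent's control.
receipt = {"record_count": len(records), "merkle_root": root.hex()}

tampered = list(records)
tampered[2] = b"inference (edited)"
tampered_root = merkle_root([sha256(r) for r in tampered])
print(tampered_root.hex() == receipt["merkle_root"])  # False
```

Because the receipt lives with the witness, deleting or rewriting the local chain file does not remove the evidence that a chain with that root once existed.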
What It Detects
| Tampering Attempt | Detection Method |
|---|---|
| Modified records (tool calls, inferences, authorizations) | Hash chain breaks |
| Deleted records from the middle | Sequence gap + hash mismatch |
| Inserted fake records | Hash chain breaks at insertion point |
| Reordered records | Sequence + hash mismatch |
| Agent lying about calls, authorization, or LLM responses | Recorded parameter/result hashes don't match the actual payloads |
| Forged records from different agent (Level 2) | Ed25519 signature verification fails |
| Checkpoint tampering (Level 2) | Merkle root signature catches it |
| Agent deleting its own chain (Level 3) | Witness server has independent signed receipts |
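The parameter/result hash row can be sketched as follows: an auditor who has the payload from an independent source (for example, the API provider's own logs) hashes it and compares against the recorded hash. The field names `params_hash` and `result_hash` and the canonical-JSON encoding are assumptions for illustration, not the AHP schema.

```python
import hashlib
import json


def payload_hash(payload) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def matches_record(record: dict, observed_params, observed_result) -> bool:
    """Compare independently observed payloads against the recorded hashes."""
    return (
        record["params_hash"] == payload_hash(observed_params)
        and record["result_hash"] == payload_hash(observed_result)
    )


# What the recorder stored at call time (hypothetical tool and fields).
params = {"order_id": 1, "amount": 25.0}
result = {"status": "refunded"}
record = {
    "tool": "issue_refund",
    "params_hash": payload_hash(params),
    "result_hash": payload_hash(result),
}

# Same payload with keys in a different order still matches.
print(matches_record(record, {"amount": 25.0, "order_id": 1}, {"status": "refunded"}))  # True

# A claim that the refund was for a different amount does not.
print(matches_record(record, {"order_id": 1, "amount": 250.0}, {"status": "refunded"}))  # False
```

Storing hashes rather than raw payloads keeps sensitive data out of the chain while still letting anyone who holds the original payload prove or disprove a claim about it.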
What It Does NOT Prevent
Being honest about limitations matters:
- Reading the chain: it's not encrypted
- A compromised agent writing false records going forward: the agent controls its own writer
- Bad actions themselves: AHP records what happened but cannot block an action before it runs
Think of it like a flight recorder on an airplane: it doesn't prevent the crash, but it makes after-the-fact changes to the record detectable. An enterprise auditor can run ahp verify on any chain and know immediately whether anything was changed.
Auto-Instrumentation
AHP is framework-agnostic. The SDKs auto-instrument at the protocol level:
Python: Intercepts urllib, requests, and httpx, which covers any framework whose HTTP traffic goes through those libraries.
TypeScript: Intercepts globalThis.fetch, which covers Node.js agent frameworks that make their calls through fetch.
Drop it into an existing agent system without changing the agent's own logic. It records what agents do, not how they're built.
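Protocol-level interception of this kind is usually done by wrapping the client function in place. The sketch below shows the general technique with a stand-in module so it runs offline. It is not the AHP SDK's implementation, and the `instrument` helper and log fields are hypothetical.

```python
import functools
import time
import types


def instrument(module, attr: str, log: list) -> None:
    """Replace module.attr with a wrapper that logs each call, then delegates."""
    original = getattr(module, attr)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            # Runs whether the call returned or raised.
            log.append({
                "call": attr,
                "args": [repr(a) for a in args],
                "elapsed_ms": (time.perf_counter() - started) * 1000,
            })

    setattr(module, attr, wrapper)


# Stand-in for an HTTP client module so the example needs no network access.
fake_http = types.SimpleNamespace(get=lambda url: f"200 OK from {url}")

calls = []
instrument(fake_http, "get", calls)

response = fake_http.get("https://example.com/orders/1")
print(response)          # 200 OK from https://example.com/orders/1
print(calls[0]["call"])  # get
```

The same helper could be pointed at a real client, for example `instrument(urllib.request, "urlopen", calls)`. One caveat of this approach is that code which ran `from urllib.request import urlopen` before the patch keeps a reference to the original function and bypasses the wrapper.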
Why a Common Protocol
Just like HTTP standardized web communication and OpenTelemetry standardized application observability, agents need a shared format for accountability. Not a framework-specific logger — a standard that any framework can write, any auditor can verify, and any tool can inspect.
AHP is that standard.
Links
- GitHub: github.com/iamanandsingh/agent-history-protocol
- Full Specification: Included in the repo
- License: Apache 2.0
Feedback, issues, and contributions welcome. If you're building agents in production, I'd especially love to hear what observability gaps you're dealing with today.