lulzasaur
Add Tamper-Proof Audit Logs to Your AI Agent in 10 Minutes

If you're building AI agents that make decisions, call tools, or interact with users, you need an audit trail. Not just logs — a cryptographically verifiable chain that proves no entries were modified or deleted after the fact.

This matters for compliance, debugging, and trust. When your agent makes a $10,000 trading decision or sends an email on behalf of a user, you need to prove exactly what happened and that the record hasn't been tampered with.

I built an audit log API with HMAC chain verification specifically for AI agents. Every log entry is cryptographically linked to the previous one — if anyone modifies a single entry, the chain breaks and verification fails.

How HMAC Chain Verification Works

Each log entry gets an HMAC (Hash-based Message Authentication Code) computed from:

  1. The entry's content (agent_id, action, details, outcome)
  2. The previous entry's HMAC
  3. A server-side secret key

This creates a chain where every entry depends on all previous entries. Modify entry #50, and entries #51 through #1000 all fail verification. It's the same principle behind a blockchain, but without the overhead.

Entry 1: HMAC(content_1 + "genesis")     → hmac_1
Entry 2: HMAC(content_2 + hmac_1)        → hmac_2
Entry 3: HMAC(content_3 + hmac_2)        → hmac_3
...
Tamper entry 2? → hmac_2 changes → hmac_3 invalid → chain broken
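The chaining rule above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the API's actual server code — the secret key, the canonical-JSON serialization, and the SHA-256 digest are all assumptions:

```python
import hashlib
import hmac
import json

SECRET_KEY = b"server-side-secret"  # assumption: the real key never leaves the server

def compute_hmac(content: dict, prev_hmac: str) -> str:
    # Canonical JSON (sorted keys) so identical content always hashes identically
    payload = json.dumps(content, sort_keys=True) + prev_hmac
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()

def build_chain(entries: list[dict]) -> list[str]:
    # Seed with "genesis", then feed each HMAC into the next entry's computation
    hmacs, prev = [], "genesis"
    for content in entries:
        prev = compute_hmac(content, prev)
        hmacs.append(prev)
    return hmacs

entries = [{"action": "session_start"}, {"action": "tool_call"}, {"action": "decision"}]
original = build_chain(entries)

# Tamper with entry 2: its HMAC, and every HMAC after it, no longer matches
entries[1]["action"] = "modified"
tampered = build_chain(entries)
assert tampered[0] == original[0]   # entry 1 untouched
assert tampered[1] != original[1]   # entry 2 breaks
assert tampered[2] != original[2]   # ...and the break cascades
```

Because each HMAC feeds into the next, an attacker without the secret key can't recompute the downstream chain to cover their tracks.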

Quick Start: Log Your First Agent Action

import requests

API_URL = "https://agent-audit-log-api-production.up.railway.app"

# Log an agent action
response = requests.post(f"{API_URL}/v1/logs", json={
    "agent_id": "trading-bot-001",
    "action": "tool_call",
    "details": {
        "tool": "execute_trade",
        "ticker": "AAPL",
        "shares": 100,
        "price": 185.50,
        "side": "buy"
    },
    "outcome": "success"
})

entry = response.json()
print(f"Logged: {entry['id']}")
print(f"HMAC: {entry['hmac']}")
print(f"Sequence: {entry['entry_sequence']}")

That's it. One POST request and you have a tamper-proof audit entry. The API handles HMAC computation, chain linking, and storage.

Supported Action Types

The API is flexible about what you log. Common patterns:

# Session lifecycle
requests.post(f"{API_URL}/v1/logs", json={
    "agent_id": "support-agent-42",
    "action": "session_start",
    "details": {"user_id": "usr_abc123", "channel": "slack"},
    "outcome": "success"
})

# Tool calls
requests.post(f"{API_URL}/v1/logs", json={
    "agent_id": "support-agent-42",
    "action": "tool_call",
    "details": {"tool": "search_knowledge_base", "query": "refund policy"},
    "outcome": "success"
})

# Decisions
requests.post(f"{API_URL}/v1/logs", json={
    "agent_id": "support-agent-42",
    "action": "decision",
    "details": {
        "decision": "escalate_to_human",
        "reason": "Customer requesting account deletion",
        "confidence": 0.92
    },
    "outcome": "success"
})

# Errors
requests.post(f"{API_URL}/v1/logs", json={
    "agent_id": "support-agent-42",
    "action": "error",
    "details": {"error": "API timeout", "retry_count": 3},
    "outcome": "failure"
})

Supported entry types: tool_call, tool_result, decision, action, error, session_start, session_end, memory, agent_action, agent_finish, llm_start, llm_end.
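If you want to catch typos in action names before they hit the API, a thin client-side wrapper can validate against that list. The wrapper below is a hypothetical convenience I'm adding for illustration, not part of the API itself:

```python
import requests

API_URL = "https://agent-audit-log-api-production.up.railway.app"

# The entry types the API documents
VALID_ACTIONS = {
    "tool_call", "tool_result", "decision", "action", "error",
    "session_start", "session_end", "memory", "agent_action",
    "agent_finish", "llm_start", "llm_end",
}

def log_action(agent_id: str, action: str, details: dict,
               outcome: str = "success") -> dict:
    # Fail fast on a bad action type instead of logging a malformed entry
    if action not in VALID_ACTIONS:
        raise ValueError(f"Unknown action type: {action!r}")
    resp = requests.post(f"{API_URL}/v1/logs", json={
        "agent_id": agent_id,
        "action": action,
        "details": details,
        "outcome": outcome,
    })
    resp.raise_for_status()  # surface HTTP errors immediately
    return resp.json()
```

Centralizing the POST in one helper also gives you a single place to add retries or auth headers later.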

Verify Chain Integrity

The whole point of HMAC chains is verification. One call tells you if the audit trail is intact:

# Verify the chain for an agent
verify = requests.post(f"{API_URL}/v1/verify", json={
    "agent_id": "trading-bot-001"
})

result = verify.json()
print(f"Chain intact: {result['valid']}")
print(f"Chain length: {result['chain_length']}")
print(f"Errors: {result['errors']}")

If valid is true, every entry in the chain is unmodified since creation. If it's false, the response tells you exactly where the chain broke.
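Under the hood, verification is just a replay of the chain: recompute each entry's HMAC from its stored content and the previous entry's HMAC, then compare against the stored value. Here's a minimal sketch, under the same assumptions as before about the key and serialization (this mirrors the response shape of `/v1/verify`, but is my reconstruction, not the service's code):

```python
import hashlib
import hmac
import json

SECRET_KEY = b"server-side-secret"  # assumption: held server-side

def compute_hmac(content: dict, prev_hmac: str) -> str:
    payload = json.dumps(content, sort_keys=True) + prev_hmac
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()

def verify_chain(entries: list[dict]) -> dict:
    # Each stored entry carries its content and the HMAC computed at write time
    prev, errors = "genesis", []
    for i, entry in enumerate(entries):
        expected = compute_hmac(entry["content"], prev)
        if not hmac.compare_digest(expected, entry["hmac"]):
            errors.append(f"entry {i}: HMAC mismatch")
        prev = entry["hmac"]
    return {"valid": not errors, "chain_length": len(entries), "errors": errors}
```

Note the use of `hmac.compare_digest` rather than `==`: constant-time comparison avoids leaking information through timing.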

JavaScript Integration

const API_URL = "https://agent-audit-log-api-production.up.railway.app";

class AuditLogger {
  constructor(agentId) {
    this.agentId = agentId;
  }

  async log(action, details, outcome = "success") {
    const res = await fetch(`${API_URL}/v1/logs`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        agent_id: this.agentId,
        action,
        details,
        outcome
      })
    });
    return res.json();
  }

  async verify() {
    const res = await fetch(`${API_URL}/v1/verify`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ agent_id: this.agentId })
    });
    return res.json();
  }

  async getHistory(limit = 50) {
    const res = await fetch(
      `${API_URL}/v1/logs?agent_id=${this.agentId}&limit=${limit}`
    );
    return res.json();
  }
}

// Usage
const audit = new AuditLogger("my-agent-001");

await audit.log("session_start", { user: "alice", channel: "api" });
await audit.log("tool_call", { tool: "web_search", query: "weather NYC" });
await audit.log("decision", { response: "Current temp is 72F", model: "gpt-4" });
await audit.log("session_end", { messages: 3, duration_ms: 4500 });

// Verify nothing was tampered with
const integrity = await audit.verify();
console.log(`Chain valid: ${integrity.valid} (${integrity.chain_length} entries)`);

LangChain Integration

If you're using LangChain, you can drop in an audit callback that automatically logs every LLM call, tool use, and agent decision:

from langchain.callbacks.base import BaseCallbackHandler
import requests

class AuditLogCallback(BaseCallbackHandler):
    def __init__(self, agent_id, api_url):
        self.agent_id = agent_id
        self.api_url = api_url

    def _log(self, action, details, outcome="success"):
        requests.post(f"{self.api_url}/v1/logs", json={
            "agent_id": self.agent_id,
            "action": action,
            "details": details,
            "outcome": outcome
        })

    def on_llm_start(self, serialized, prompts, **kwargs):
        self._log("llm_start", {"model": serialized.get("name", "unknown")})

    def on_llm_end(self, response, **kwargs):
        self._log("llm_end", {
            "tokens": response.llm_output.get("token_usage", {}) if response.llm_output else {}
        })

    def on_tool_start(self, serialized, input_str, **kwargs):
        self._log("tool_call", {"tool": serialized.get("name"), "input": input_str[:500]})

    def on_tool_end(self, output, **kwargs):
        self._log("tool_result", {"output": str(output)[:500]})

    def on_agent_finish(self, finish, **kwargs):
        self._log("agent_finish", {"output": finish.return_values.get("output", "")[:500]})


# Drop into any LangChain agent
from langchain.agents import initialize_agent

audit = AuditLogCallback(
    agent_id="langchain-agent-001",
    api_url="https://agent-audit-log-api-production.up.railway.app"
)

agent = initialize_agent(tools, llm, callbacks=[audit])
agent.run("What's the weather in NYC?")
# Every LLM call, tool use, and decision is now audit-logged with HMAC verification

Why Not Just Use CloudWatch / Datadog / Regular Logs?

Regular logging systems are designed for debugging. They're mutable — anyone with admin access can edit or delete entries. That's fine for application logs, but not for audit trails where the integrity of the record matters.

The HMAC chain gives you:

  • Tamper evidence: Any modification is immediately detectable
  • Non-repudiation: You can prove an action happened (or didn't)
  • Compliance: Meets audit trail requirements for regulated industries
  • Agent-native: Built specifically for the log-every-action pattern of AI agents

Use Cases

Financial agents: Log every trade decision with full context. Prove to regulators exactly what information the agent had when it made a decision.

Customer support bots: Track every response and escalation. When a customer disputes what the bot said, you have a verifiable record.

Healthcare AI: Document every recommendation and data access. HIPAA requires audit trails for systems accessing patient data.

Autonomous coding agents: Log every file edit, test run, and deployment. When something breaks, trace exactly what the agent did.

Multi-agent systems: Each agent gets its own chain. Verify individual agent behavior independently.

Try It

The API is live at https://agent-audit-log-api-production.up.railway.app.

The free tier gives you 1,000 entries/month — enough to instrument a prototype agent end to end.
