Implementing Visual Audit Trails for LLM Agents in Production — A Step-by-Step Guide

#mcp #aiagents #compliance #observability

Your LLM agent is live in production. It's handling 500+ customer requests per day. It accesses databases, calls APIs, writes to Slack. One day, a customer claims the agent took an unauthorized action. Your logs show: "Agent made API call." Your auditor asks: "What did the agent see? What did it decide?"

You have no answer.

This is the audit trail gap. Text logs show what happened. They don't show what the agent saw and decided. Video proof solves this.

Why Compliance Requires Visual Proof

Text audit logs are often insufficient on their own for high-risk AI scenarios. Here's where regulators and auditors ask for evidence that visual proof supplies more directly:

EU AI Act (August 2026 deadline): High-risk AI systems must maintain "readily available information on the operation of the system." Screenshots prove operation. Text logs require interpretation.

SOC 2 Type II: Auditors ask: "Show us the agent's view when it made that decision." A video showing the exact screen state, narrated step-by-step, answers the question. A log line doesn't.

HIPAA (healthcare): Healthcare agents handling PHI (protected health information) must prove they accessed/modified data correctly. Visual evidence is stronger in audits and legal discovery.

Fintech (SEC/FINRA): Trading agents must prove they followed execution rules. "Agent executed trade" doesn't prove "Agent verified customer had authority and balance before trading." A video does.

The Architecture: MCP + PageBolt Integration

The pattern is simple:

1. Agent runs its workflow (calls multiple MCP servers)
2. Wrapper captures before/after state for each step
3. PageBolt records screenshots + narration
4. Artifact becomes immutable proof

Here's the architecture:

┌─────────────────────────────────────────────────────────┐
│ LLM Agent (Claude, etc.)                                │
│ ├─ MCP Server 1 (Database)                              │
│ ├─ MCP Server 2 (API)                                   │
│ └─ MCP Server 3 (Notifications)                         │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
         ┌──────────────────┐
         │ Audit Wrapper    │ <- Capture state before/after
         │ (Python/JS)      │
         └────────┬─────────┘
                  │
                  ▼
         ┌──────────────────┐
         │ PageBolt API     │ <- Record screenshots + narration
         │ /screenshot      │
         │ /record_video    │
         └────────┬─────────┘
                  │
                  ▼
         ┌──────────────────┐
         │ Audit Trail      │
         │ (MP4 + metadata) │ <- Immutable proof
         └──────────────────┘
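
The wrapper layer in the diagram can be sketched independently of any capture backend. The version below is a minimal illustration under stated assumptions, not PageBolt's or MCP's API: `audited_step`, `StepRecord`, and the `capture` callable are hypothetical names, and `capture` stands in for whatever actually takes the screenshot. It records state before and after an action, and it still captures the after-state when the action raises.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, List, Optional


@dataclass
class StepRecord:
    """One audited step: what was captured around an action and how it ended."""
    name: str
    before: Any
    after: Any
    result: Any
    error: Optional[str]
    finished_at: str


def audited_step(
    name: str,
    action: Callable[[], Any],
    capture: Callable[[str], Any],  # hypothetical: label -> captured state
    trail: List[StepRecord],
) -> Any:
    """Run action() between two captures and append a StepRecord to trail."""
    before = capture(f"BEFORE_{name}")
    result: Any = None
    error: Optional[str] = None
    try:
        result = action()
        return result
    except Exception as exc:
        # Keep the failure in the record, then let the caller handle it.
        error = repr(exc)
        raise
    finally:
        # Runs on success and on failure, so every step leaves a record.
        trail.append(StepRecord(
            name=name,
            before=before,
            after=capture(f"AFTER_{name}"),
            result=result,
            error=error,
            finished_at=datetime.now(timezone.utc).isoformat(),
        ))
```

Passing `capture` in as a parameter keeps the wrapper testable: in staging it can be a stub that returns a label, and in production it can call the screenshot service.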

Implementation: Step-by-Step Code Example

Here's a wrapper for audit trail capture with visual narration:

import json
import urllib.request
import time
import hashlib
from datetime import datetime
from anthropic import Anthropic

class AuditedLLMAgent:
    def __init__(self, pagebolt_api_key: str, agent_name: str):
        self.pagebolt_api_key = pagebolt_api_key
        self.agent_name = agent_name
        self.client = Anthropic()
        self.audit_trail = []

    def execute_with_audit(self, workflow_description: str, mcp_servers: list) -> dict:
        """
        Execute LLM agent workflow with full visual audit trail.

        Args:
            workflow_description: What the agent is supposed to do
            mcp_servers: List of {name, mcp_endpoint, action_description}

        Returns:
            Audit trail artifact with video URL and metadata
        """

        workflow_id = hashlib.md5(
            (self.agent_name + str(time.time())).encode()
        ).hexdigest()[:8]

        print(f"\n[Audit Trail {workflow_id}] Starting workflow: {workflow_description}")

        # Reset the trail so a reused agent instance doesn't mix workflows
        self.audit_trail = []

        # Phase 1: Capture initial state
        initial_state = self._capture_state("WORKFLOW_START")
        self.audit_trail.append({
            "phase": "start",
            "timestamp": datetime.utcnow().isoformat(),
            "state": initial_state
        })

        # Phase 2: Execute agent for each MCP server
        for step_num, server in enumerate(mcp_servers, 1):
            print(f"\n[Step {step_num}] Executing: {server['action_description']}")

            # Capture BEFORE state
            state_before = self._capture_state(f"BEFORE_{server['name']}")

            # Execute agent action
            agent_result = self._execute_agent_action(
                workflow_description,
                server,
                state_before
            )

            # Capture AFTER state
            state_after = self._capture_state(f"AFTER_{server['name']}")

            # Record step in audit trail (including server URL for video)
            self.audit_trail.append({
                "step": step_num,
                "server": server["name"],
                "action": server["action_description"],
                "server_url": server.get("mcp_endpoint", "about:blank"),
                "state_before": state_before,
                "agent_result": agent_result,
                "state_after": state_after,
                "timestamp": datetime.utcnow().isoformat()
            })

            print(f"  ✓ State captured before/after")

        # Phase 3: Generate narrated video from audit trail
        print(f"\n[Finalizing] Generating narrated audit video...")

        video_url = self._generate_audit_video(workflow_id)

        return {
            "workflow_id": workflow_id,
            "agent_name": self.agent_name,
            "workflow_description": workflow_description,
            "timestamp": datetime.utcnow().isoformat(),
            "steps_completed": len(mcp_servers),
            "audit_video_url": video_url,
            "audit_trail_metadata": self.audit_trail
        }

    def _capture_state(self, label: str) -> dict:
        """Capture system state via PageBolt screenshot."""
        try:
            req = urllib.request.Request(
                "https://pagebolt.dev/api/v1/screenshot",
                data=json.dumps({
                    # Placeholder: "about:blank" renders an empty page. Point
                    # this at the dashboard or admin view that reflects the
                    # state you need as evidence.
                    "url": "about:blank",
                    "width": 1280,
                    "height": 720
                }).encode('utf-8'),
                headers={
                    'x-api-key': self.pagebolt_api_key,
                    "Content-Type": "application/json"
                }
            )
            # Time out rather than hang the agent if the capture call stalls
            with urllib.request.urlopen(req, timeout=30) as resp:
                filename = f"state_{label}_{int(time.time())}.png"
                with open(filename, 'wb') as f:
                    f.write(resp.read())
                return {"label": label, "file": filename}
        except Exception as e:
            # Record the failure in the trail instead of aborting the workflow
            return {"label": label, "error": str(e)}

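    def _execute_agent_action(self, workflow: str, server: dict, state_before: dict) -> str:
        """Ask Claude to perform and describe one step.

        Note: this simplified example only sends a prompt. It does not connect
        Claude to the MCP server, so the reply is a description rather than a
        real tool call. state_before is accepted for context but not yet used.
        """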
        prompt = f"""
        You are an audited LLM agent in production.

        Workflow: {workflow}
        Current step: {server['action_description']}
        MCP Server: {server['name']}

        Execute this action and explain what you did.
        Be specific about what data you accessed and what decision you made.
        """

        response = self.client.messages.create(
            model="claude-opus-4-6",
            max_tokens=512,
            messages=[{"role": "user", "content": prompt}]
        )

        return response.content[0].text

    def _generate_audit_video(self, workflow_id: str) -> "str | None":
        """Generate narrated video from audit trail; returns None on failure."""

        # Construct narration script
        narration_lines = [
            f"Audit trail for workflow {workflow_id}.",
            f"Agent: {self.agent_name}."
        ]

        # Build video steps with navigate + screenshot for each MCP server step
        video_steps = []

        for entry in self.audit_trail:
            if entry.get("phase") == "start":
                narration_lines.append("Starting workflow execution.")
                continue
            if "step" in entry:
                narration_lines.append(
                    f"Step {entry['step']}: {entry['action']}. "
                    f"Agent result: {entry['agent_result'][:60]}..."
                )
                # Add navigate + screenshot for this step
                video_steps.append({
                    "action": "navigate",
                    "url": entry.get("server_url", "about:blank")
                })
                video_steps.append({
                    "action": "screenshot",
                    "note": f"Step {entry['step']}: {entry['action']}"
                })

        narration_script = " ".join(narration_lines)

        # Call PageBolt record_video with navigate + screenshot steps
        req = urllib.request.Request(
            "https://pagebolt.dev/api/v1/record_video",
            data=json.dumps({
                "steps": video_steps,
                "audioGuide": {
                    "enabled": True,
                    "script": narration_script,
                    "voice": "aria"
                }
            }).encode('utf-8'),
            headers={
                'x-api-key': self.pagebolt_api_key,
                "Content-Type": "application/json"
            }
        )

        try:
            with urllib.request.urlopen(req) as resp:
                result = json.loads(resp.read())
                video_url = result.get('url')
                print(f"✓ Audit video generated: {video_url}")
                return video_url
        except Exception as e:
            print(f"✗ Video generation failed: {e}")
            return None

# Usage: Deploy an audited agent in production
if __name__ == "__main__":
    agent = AuditedLLMAgent(
        pagebolt_api_key="YOUR_API_KEY",
        agent_name="customer-support-agent-v1"
    )

    # Define MCP servers and actions
    mcp_workflow = [
        {
            "name": "CustomerDB",
            "mcp_endpoint": "https://api.example.com/customer-db",
            "action_description": "Retrieve customer record and verify authorization"
        },
        {
            "name": "BillingAPI",
            "mcp_endpoint": "https://api.example.com/billing",
            "action_description": "Check account balance and transaction history"
        },
        {
            "name": "Notifications",
            "mcp_endpoint": "https://api.example.com/notifications",
            "action_description": "Send confirmation to customer Slack"
        }
    ]

    # Execute with audit trail
    audit_result = agent.execute_with_audit(
        workflow_description="Process refund request for customer #12345",
        mcp_servers=mcp_workflow
    )

    # Result includes immutable video proof
    print(f"\nAudit trail complete.")
    print(f"Video: {audit_result['audit_video_url']}")
    print(f"Share with compliance team for review.")
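
The returned dictionary is only as trustworthy as the place it is stored. One way to make later edits detectable is to hash a canonical serialization of the record and keep the digest somewhere the record's writer cannot modify. The sketch below assumes the record is JSON-serializable, as the `execute_with_audit` return value is; `seal_audit_record` and `verify_audit_record` are hypothetical helper names, not part of PageBolt.

```python
import hashlib
import json


def _canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal records hash equally."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal_audit_record(record: dict) -> dict:
    """Return the record together with a SHA-256 digest of its canonical form."""
    digest = hashlib.sha256(_canonical_bytes(record)).hexdigest()
    return {"record": record, "sha256": digest}


def verify_audit_record(sealed: dict) -> bool:
    """Recompute the digest and compare it with the stored one."""
    expected = hashlib.sha256(_canonical_bytes(sealed["record"])).hexdigest()
    return expected == sealed["sha256"]


if __name__ == "__main__":
    sealed = seal_audit_record({"workflow_id": "ab12cd34", "steps_completed": 3})
    print(verify_audit_record(sealed))        # digest matches the record
    sealed["record"]["steps_completed"] = 2   # simulate a later edit
    print(verify_audit_record(sealed))        # digest no longer matches
```

A digest stored next to the record proves nothing on its own, since anyone who can edit one can edit the other. Writing the digest to a separate append-only log, and hashing the downloaded video file the same way, is what makes tampering detectable.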

Deployment Checklist

Before deploying audited agents to production:

[ ] Audit video storage: Store video artifacts in secure, immutable storage (S3 with versioning and Object Lock enabled, or equivalent write-once cloud storage)
[ ] Retention policy: Define how long to retain videos (HIPAA: 6 years for required documentation; EU AI Act: at least 6 months for automatically generated logs, longer where other applicable law requires it)
[ ] Access control: Restrict audit video access to compliance and security teams only
[ ] Performance impact: Test that audit capture doesn't add >5% latency to agent workflows
[ ] API key rotation: Rotate PageBolt API keys monthly; use a secrets manager (AWS Secrets Manager, HashiCorp Vault)
[ ] Monitoring: Alert if audit trail generation fails for any agent execution
[ ] Testing: Dry-run audited workflows in staging before production launch
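
The monitoring item above can be checked directly against the dictionary that `execute_with_audit` returns. In the class shown earlier, a failed screenshot is stored as a state with an `"error"` key, and a failed video render leaves `audit_video_url` as `None`. The sketch below scans for both; `find_audit_gaps` is a hypothetical helper name, and how you route its output to an alerting system is left open.

```python
from typing import List


def find_audit_gaps(audit_result: dict) -> List[str]:
    """Return human-readable problems found in one audit result."""
    gaps = []

    # A missing or empty URL means the video step failed.
    if not audit_result.get("audit_video_url"):
        gaps.append("no audit video was generated")

    for entry in audit_result.get("audit_trail_metadata", []):
        # Entries are either the start phase or a numbered step.
        if "step" in entry:
            where = f"step {entry['step']}"
        else:
            where = entry.get("phase", "unknown")

        for key in ("state", "state_before", "state_after"):
            state = entry.get(key)
            if isinstance(state, dict) and "error" in state:
                gaps.append(f"{where}: {key} capture failed ({state['error']})")

    return gaps
```

An empty list means every capture and the video succeeded; anything else is worth an alert, because a workflow that ran without evidence is the gap the audit trail exists to close.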

Real-World Impact

Fintech example: A trading agent executes 10,000 trades daily. During an SEC audit, regulators ask: "Prove this agent followed execution rules." Video proof from PageBolt shows every decision. Audit passes.

Healthcare example: A scheduling agent books patient appointments. A patient claims the agent double-booked her. Video proof shows the agent correctly verified availability and booked one slot. Dispute resolved instantly.

Enterprise example: A data access agent handles thousands of database queries. When a security incident occurs, forensic analysis of audit videos identifies which queries were authorized and which were not. Incident response accelerates by days.

Get Started

Step 1: Sign up free at pagebolt.dev — 100 API requests/month.

Step 2: Get your API key.

Step 3: Wrap your LLM agent with the AuditedLLMAgent class above.

Step 4: Deploy to production. Every agent execution now generates immutable visual proof.
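
For Steps 2 and 3, it helps to keep the key out of source code from the start. The sketch below reads it from an environment variable and refuses to run with the placeholder value; the variable name `PAGEBOLT_API_KEY` and the helper `load_pagebolt_key` are assumptions for illustration, not names PageBolt defines.

```python
import os
from typing import Mapping


def load_pagebolt_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, or raise if it is missing."""
    key = env.get("PAGEBOLT_API_KEY", "").strip()  # hypothetical variable name
    if not key or key == "YOUR_API_KEY":
        raise RuntimeError("PAGEBOLT_API_KEY is not set to a real key")
    return key
```

The result can be passed as `pagebolt_api_key` when constructing `AuditedLLMAgent`, and a secrets manager can populate the variable at deploy time.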

Compliance is no longer a theoretical requirement for AI agents — it's operational. The agents that win in regulated industries will be the ones with forensic proof of every decision.

Visual audit trails aren't optional anymore. They're infrastructure.

Ready to deploy audited agents? Try PageBolt free →