DEV Community

Custodia-Admin

Originally published at pagebolt.dev

How to Generate an Audit Trail for AI Agent Actions (With Visual Proof)

You've deployed an AI agent to handle customer refunds. It works perfectly in testing.

But your compliance officer asks: "How do we prove what the agent actually did in the browser?"

You show them text logs from LangSmith or Langfuse. They're not satisfied.

Text logs tell you what the agent claimed to do. Visual proof shows what it actually did.

This is the gap between logs and compliance.

The Problem: Text Logs Aren't Audit Proof

Observability tools (LangSmith, Langfuse, OpenTelemetry) capture:

  • Agent decisions
  • Tool calls and responses
  • Token usage
  • Latency metrics

But they don't capture what the agent actually saw or clicked.

Example: Your agent logs say "clicked refund button." But did it? What was on screen? Did the page load correctly?

For compliance (HIPAA, SOC 2, PCI-DSS, EU AI Act), you need visual evidence.

The Solution: Screenshot After Each Agent Step

Add a screenshot after every agent action:

import os
from datetime import datetime
from pathlib import Path

import anthropic

client = anthropic.Anthropic()

def agent_with_visual_proof(task: str):
    """Run agent and capture screenshot proof after each step."""

    audit_trail = {
        "task": task,
        "timestamp": datetime.now().isoformat(),
        "steps": []
    }

    # Define tools with screenshot capture
    tools = [
        {
            "name": "take_screenshot",
            "description": "Capture current page state for audit trail",
            "input_schema": {
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "URL to screenshot"},
                    "reason": {"type": "string", "description": "Why this screenshot matters"}
                },
                "required": ["url", "reason"]
            }
        },
        {
            "name": "process_refund",
            "description": "Process customer refund",
            "input_schema": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string"},
                    "amount": {"type": "number"}
                },
                "required": ["order_id", "amount"]
            }
        }
    ]

    messages = [
        {
            "role": "user",
            "content": task
        }
    ]

    step_count = 0

    while True:
        response = client.messages.create(
            model="claude-opus-4-5-20251101",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

        # Stop on anything other than a tool call (end_turn, max_tokens, ...)
        if response.stop_reason != "tool_use":
            break

        # Record the assistant turn once, then answer every tool call in it
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []

        for content_block in response.content:
            if content_block.type != "tool_use":
                continue

            # One audit step per tool call
            step_count += 1
            tool_name = content_block.name
            tool_input = content_block.input

            print(f"Step {step_count}: {tool_name}")
            print(f"  Input: {tool_input}")

            if tool_name == "take_screenshot":
                # Capture screenshot for audit trail
                tool_result = capture_screenshot(
                    tool_input["url"],
                    f"step-{step_count}",
                    tool_input["reason"]
                )

            elif tool_name == "process_refund":
                # Simulated refund; replace with a call to your payment system
                tool_result = {
                    "status": "approved",
                    "order_id": tool_input["order_id"],
                    "amount": tool_input["amount"],
                    "reference": f"REF-{step_count}-{tool_input['order_id']}"
                }

                # Screenshot after refund (capture_screenshot returns a dict)
                tool_result["proof_screenshot"] = capture_screenshot(
                    "https://app.example.com/refunds",
                    f"step-{step_count}-refund-proof",
                    f"Refund {tool_result['reference']} processed"
                )

            else:
                # Never leave a tool call unanswered
                tool_result = {"error": f"Unknown tool: {tool_name}"}

            # Record in audit trail
            audit_trail["steps"].append({
                "step": step_count,
                "action": tool_name,
                "input": tool_input,
                "result": tool_result,
                "timestamp": datetime.now().isoformat()
            })

            tool_results.append({
                "type": "tool_result",
                "tool_use_id": content_block.id,
                "content": str(tool_result)
            })

        # Return all tool results to the model in a single user message
        messages.append({"role": "user", "content": tool_results})

    return audit_trail

def capture_screenshot(url: str, step_id: str, reason: str) -> dict:
    """Capture screenshot via PageBolt API."""
    import requests

    response = requests.post(
        "https://api.pagebolt.dev/v1/screenshot",
        headers={
            "Authorization": f"Bearer {os.getenv('PAGEBOLT_API_KEY')}",
            "Content-Type": "application/json"
        },
        json={
            "url": url,
            "format": "png",
            "width": 1280,
            "height": 720,
            "fullPage": True,
            "blockBanners": True
        },
        timeout=30,  # fail instead of hanging the agent loop if the API stalls
    )

    if response.status_code != 200:
        return {"error": f"Screenshot failed: {response.status_code}"}

    # Save screenshot
    filename = f"audit-trail/{step_id}-{datetime.now().timestamp()}.png"
    Path(filename).parent.mkdir(parents=True, exist_ok=True)

    with open(filename, "wb") as f:
        f.write(response.content)

    return {
        "screenshot_path": filename,
        "reason": reason,
        "url": url
    }

# Run agent with visual audit trail
if __name__ == "__main__":
    import os

    audit = agent_with_visual_proof(
        "Process refund for order ORDER-12345 with amount $50"
    )

    # Save audit trail as JSON with screenshot references
    import json
    with open("audit-trail.json", "w") as f:
        json.dump(audit, f, indent=2)

    print(f"Audit trail saved with {len(audit['steps'])} steps")
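    for step in audit["steps"]:
        result = step.get("result", {})
        # Refund steps nest their screenshot under "proof_screenshot"
        shot = result.get("screenshot_path") or result.get("proof_screenshot", {}).get("screenshot_path", "N/A")
        print(f"  Step {step['step']}: {step['action']} -> {shot}")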
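A screenshot on disk is stronger evidence if an auditor can confirm it has not changed since capture. One way to do that, sketched here as an assumption rather than anything PageBolt provides, is to record a SHA-256 digest of each image next to its path in the audit trail. The helper names `fingerprint_file` and `verify_fingerprint` are hypothetical.

```python
import hashlib
from pathlib import Path


def fingerprint_file(path: str) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def verify_fingerprint(path: str, expected: str) -> bool:
    """Check that a stored screenshot still matches its recorded digest."""
    return Path(path).exists() and fingerprint_file(path) == expected
```

In the agent loop, the digest could be stored right after capture, for example `result["sha256"] = fingerprint_file(result["screenshot_path"])`, and checked again with `verify_fingerprint` when the report is built.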

Real Use Case: Autonomous Customer Service Refund

A customer initiates a refund request. The agent then works through six steps:

  1. Screenshot initial state — customer data page
  2. Retrieve order details — agent calls order API
  3. Screenshot order confirmation — verify customer info
  4. Process refund — submit refund form
  5. Screenshot refund confirmation — proof of success
  6. Send customer notification — email with refund ID

Each step has:

  • Tool call with input
  • Result (API response or form submission)
  • Screenshot evidence of what happened on screen

This creates a complete visual audit trail for compliance audits.
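The six steps above can be sketched as a plain function that records a tool input, a result, and an optional screenshot for each step. This is an illustration only: `record_step`, `run_refund_workflow`, the `capture` callback, and the `/customers` and `/orders/...` paths are hypothetical, and the order lookup, refund, and notification are stand-ins for real API calls.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


def record_step(trail: list, action: str, tool_input: dict, result: dict,
                screenshot: Optional[dict] = None) -> dict:
    """Append one audit entry: tool call, result, and optional screenshot."""
    entry = {
        "step": len(trail) + 1,
        "action": action,
        "input": tool_input,
        "result": result,
        "screenshot": screenshot,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    trail.append(entry)
    return entry


def run_refund_workflow(order_id: str, amount: float,
                        capture: Callable[[str, str], dict]) -> list:
    """Walk the six steps, capturing a screenshot where the list calls for one."""
    trail: list = []
    base = "https://app.example.com"  # placeholder host
    ref = {"order_id": order_id}

    # 1. Screenshot initial state
    record_step(trail, "screenshot_initial_state", ref, {},
                capture(f"{base}/customers", "initial state"))
    # 2. Retrieve order details (stand-in for an order API call)
    record_step(trail, "retrieve_order", ref, {"order_id": order_id, "total": amount})
    # 3. Screenshot order confirmation
    record_step(trail, "screenshot_order", ref, {},
                capture(f"{base}/orders/{order_id}", "order confirmation"))
    # 4. Process refund (stand-in for a form submission)
    record_step(trail, "process_refund", {"order_id": order_id, "amount": amount},
                {"status": "approved", "order_id": order_id, "amount": amount})
    # 5. Screenshot refund confirmation
    record_step(trail, "screenshot_refund", ref, {},
                capture(f"{base}/refunds", "refund confirmation"))
    # 6. Send customer notification (stand-in for an email call)
    record_step(trail, "notify_customer", ref, {"sent": True})
    return trail
```

Passing `capture` in as a callback keeps the workflow testable: a stub can return a fake path during development, and a real screenshot function can be swapped in later without touching the step logic.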

Compliance Frameworks: What They Require

| Framework | Requirement | Solution |
|-----------|-------------|----------|
| HIPAA | Audit logs with evidence | Screenshots of patient data access |
| SOC 2 | Detailed access logs | Before/after screenshots of changes |
| PCI-DSS | Transaction proof | Screenshots of payment processing |
| EU AI Act | Decision transparency | Screenshots of agent actions/reasoning |
| GDPR | Data handling proof | Screenshots of data deletion/handling |

Screenshots give auditors concrete evidence for each of these requirements, alongside the text logs.

Architecture: Visual Audit Trail System

┌─────────────────────────────────────────────────────────┐
│ AI Agent (Claude)                                       │
│ ├─ Observability (LangSmith/Langfuse)                   │
│ ├─ Text logs: "clicked refund button"                   │
│ └─ Visual proof: screenshot after each step             │
└──────────┬──────────────────────────────────────────────┘
           │
           ├─ Store logs in observability platform
           │
           └─ Capture screenshots via PageBolt
              ├─ Screenshot after tool calls
              ├─ Screenshot after form submissions
              └─ Screenshot after navigation
                 │
                 ▼
           ┌──────────────────────────┐
           │ Audit Trail Storage      │
           │ ├─ step-1.png            │
           │ ├─ step-2.png            │
           │ ├─ step-3.png            │
           │ └─ audit-trail.json      │
           └──────────────────────────┘
                 │
                 ▼
           ┌──────────────────────────┐
           │ Compliance Report        │
           │ - Text logs              │
           │ - Screenshots            │
           │ - Timeline               │
           │ - Decision points        │
           └──────────────────────────┘
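The storage box in the diagram only works as evidence if every screenshot named in `audit-trail.json` is actually on disk. A minimal check for that, assuming the JSON layout produced by the agent code earlier (paths stored under a `screenshot_path` key, possibly nested), might look like this. The function names are hypothetical.

```python
import json
from pathlib import Path


def find_screenshot_paths(value) -> list:
    """Recursively collect every "screenshot_path" string in nested data."""
    found = []
    if isinstance(value, dict):
        for key, item in value.items():
            if key == "screenshot_path" and isinstance(item, str):
                found.append(item)
            else:
                found.extend(find_screenshot_paths(item))
    elif isinstance(value, list):
        for item in value:
            found.extend(find_screenshot_paths(item))
    return found


def missing_screenshots(trail_file: str, root: str = ".") -> list:
    """Return referenced screenshot paths that do not exist under `root`."""
    trail = json.loads(Path(trail_file).read_text())
    return [p for p in find_screenshot_paths(trail)
            if not (Path(root) / p).exists()]
```

Running `missing_screenshots("audit-trail.json")` before generating a compliance report gives an early warning that a capture failed or a file was moved. An empty list means every reference resolves.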

Generating Compliance Reports

from html import escape

def generate_audit_report(audit_trail: dict) -> str:
    """Generate HTML report with screenshots for auditors."""

    # Escape logged values so task text or tool output cannot inject markup
    html = f"""
    <html>
    <head><title>AI Agent Audit Trail</title></head>
    <body>
        <h1>Audit Trail Report</h1>
        <p><strong>Task:</strong> {escape(str(audit_trail['task']))}</p>
        <p><strong>Timestamp:</strong> {escape(str(audit_trail['timestamp']))}</p>

        <h2>Agent Actions</h2>
    """

    for step in audit_trail["steps"]:
        result = step["result"]
        html += f"""
        <div style="border: 1px solid #ccc; margin: 20px 0; padding: 10px;">
            <h3>Step {step['step']}: {escape(str(step['action']))}</h3>
            <p><strong>Input:</strong> {escape(str(step['input']))}</p>
            <p><strong>Result:</strong> {escape(str(result))}</p>
            <p><strong>Time:</strong> {escape(str(step['timestamp']))}</p>
        """

        # Screenshot steps store the path directly; refund steps nest it
        # under "proof_screenshot"
        shot = None
        if isinstance(result, dict):
            shot = result.get("screenshot_path") or result.get("proof_screenshot", {}).get("screenshot_path")

        if shot:
            html += f"""
            <h4>Visual Proof</h4>
            <img src="{escape(shot)}" style="max-width: 100%; border: 1px solid #ddd;">
            """

        html += "</div>"

    html += "</body></html>"
    return html

# Generate report
report_html = generate_audit_report(audit)
with open("audit-report.html", "w") as f:
    f.write(report_html)

print("Audit report generated: audit-report.html")
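The report above points each `<img>` at a file path, so the HTML file and the `audit-trail/` folder have to travel together. If a single self-contained file is easier to hand to an auditor, one option is to inline each PNG as a base64 data URI. The helper below is a sketch of that idea. `image_data_uri` is a hypothetical name, and it assumes the screenshots are PNG files as requested in `capture_screenshot`.

```python
import base64
from pathlib import Path


def image_data_uri(path: str) -> str:
    """Encode a PNG file as a data URI usable in an <img src="..."> attribute."""
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:image/png;base64,{encoded}"
```

Using `image_data_uri(screenshot_path)` as the `src` value in the report trades a larger HTML file for one that can be emailed or archived on its own.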

Pricing

| Plan | Requests/Month | Cost | Best For |
|------|----------------|------|----------|
| Free | 100 | $0 | Testing, low-volume agents |
| Starter | 5,000 | $29 | 10–50 agent runs/month |
| Growth | 25,000 | $79 | 100–500 agent runs/month |
| Scale | 100,000 | $199 | 1000+ agent runs/month |

At 5 screenshots per agent run, Starter covers 1,000 agent executions.
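The same arithmetic applies to any plan or screenshot count. A quick sketch, using the request quotas from the pricing table (the function name `runs_covered` is just for illustration):

```python
# Monthly request quotas, taken from the pricing table
PLANS = {
    "Free": 100,
    "Starter": 5_000,
    "Growth": 25_000,
    "Scale": 100_000,
}


def runs_covered(requests_per_month: int, screenshots_per_run: int) -> int:
    """Whole agent runs a monthly quota covers, one request per screenshot."""
    if screenshots_per_run <= 0:
        raise ValueError("screenshots_per_run must be positive")
    return requests_per_month // screenshots_per_run


for plan, quota in PLANS.items():
    print(f"{plan}: {runs_covered(quota, 5)} runs at 5 screenshots each")
```

Agents that screenshot more often per run, such as the six-step refund flow with extra captures on errors, will cover proportionally fewer runs on the same plan.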

Summary

  • ✅ Text logs from LangSmith/Langfuse document agent decisions
  • ✅ Screenshots from PageBolt document agent actions
  • ✅ Together they create compliance-ready audit trails
  • ✅ Visual proof supports HIPAA, SOC 2, PCI-DSS, and EU AI Act audits
  • ✅ Generate HTML reports with embedded screenshots
  • ✅ Store alongside observability logs for complete evidence

Get started free: pagebolt.dev — 100 requests/month, no credit card required.
