DEV Community

Custodia-Admin

Originally published at pagebolt.dev

Building AI Agents That Pass Security Audits: Visual Proof + Governance

Your SOC 2 auditor just asked a question that stopped your deployment: "How do we know what the AI agent actually did?"

You have API logs. You have database transaction records. You have token usage metrics. But when an autonomous agent chains 5 tools together in 8 seconds and modifies customer data, your auditor wants one thing: visual proof of what happened on screen.

Text logs alone aren't enough. Compliance frameworks ask for more than logs: they ask for evidence of control. And for AI agents that act through a UI, that evidence is visual.

What Auditors Actually Ask For

When SOC 2 auditors see "AI agent," they think: "What if it malfunctions and we can't prove it didn't?"

Here's what they require:

SOC 2 Type II:

  • Demonstrable proof of what the system accessed
  • Evidence that controls prevented unauthorized actions
  • Ability to reproduce the audit trail (not just read logs)

HIPAA (healthcare):

  • Visual proof of what PHI the agent accessed
  • Evidence that the agent didn't exfiltrate data
  • Audit logs you can show to regulators in real time

GDPR (data privacy):

  • Proof that the agent respected user consent
  • Evidence of what data was processed
  • Ability to demonstrate compliance during inspections

SOC 3 / FedRAMP / NIST:

  • Behavioral audit trails (not just API call logs)
  • Real-time visibility into agent actions
  • Forensic evidence in case of breach

Text logs don't cut it. An API log shows GET /customer/12345. A visual audit trail shows: the agent fetched the customer page, extracted the name, ran a compliance check, and returned the result—all captured as screenshots.

Why Logs Aren't Enough

Scenario 1: Agent Hallucination
Your agent logs: "Processed payment of $10,000". Your auditor asks: "Did it actually process the payment? What did the payment page look like when the agent submitted it?"

Without visual proof, you can't answer.

Scenario 2: Silent Failures
API logs show: "HTTP 200 OK". But the webpage rendered an error that the agent missed. The transaction didn't actually complete.

Logs said success. The page said failure. Auditors ask which one is true.
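
One way to surface the mismatch in Scenario 2 is to compare the HTTP status with the text the page actually rendered. The sketch below is a minimal illustration, not part of any SDK: `detectSilentFailure` and the error patterns are hypothetical, and the rendered text is assumed to come from whatever browser tooling you already run.

```javascript
// Hypothetical helper: flags responses where the HTTP status reports success
// but the rendered page text contains an error message.
const ERROR_PATTERNS = [/payment failed/i, /something went wrong/i, /\berror\b/i];

function detectSilentFailure(httpStatus, renderedText) {
  const statusOk = httpStatus >= 200 && httpStatus < 300;
  const pageShowsError = ERROR_PATTERNS.some(pattern => pattern.test(renderedText));
  return {
    statusOk,
    pageShowsError,
    // A silent failure is a successful status paired with an error on screen.
    silentFailure: statusOk && pageShowsError
  };
}

// Example: the API returned 200, but the page told the user the payment failed.
const check = detectSilentFailure(200, "Payment failed: card declined.");
console.log(check);
```

A flagged result is a good trigger for capturing a screenshot, since that is the moment the log and the screen disagree.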

Scenario 3: Lateral Movement
Your agent logs show normal API calls. But did it access systems outside its intended scope? Did it exfiltrate credentials? Visual audit trails show exactly what the agent saw—and what it didn't.
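
For Scenario 3, a scope check can run alongside each capture so the audit trail records whether a visited URL was inside the agent's intended boundary. This is a sketch under stated assumptions: the host allowlist, `isInScope`, and `recordVisit` are hypothetical names for illustration.

```javascript
// Hypothetical allowlist: only these hosts count as in scope for the agent.
const ALLOWED_HOSTS = new Set(["app.example.com", "billing.example.com"]);

function isInScope(rawUrl, allowedHosts = ALLOWED_HOSTS) {
  try {
    const { protocol, hostname } = new URL(rawUrl);
    return protocol === "https:" && allowedHosts.has(hostname);
  } catch {
    // Unparseable URLs are treated as out of scope.
    return false;
  }
}

// Appends a visit record so out-of-scope access is visible in the trail.
function recordVisit(auditTrail, rawUrl) {
  const entry = {
    timestamp: new Date().toISOString(),
    url: rawUrl,
    inScope: isInScope(rawUrl)
  };
  auditTrail.push(entry);
  return entry;
}

const trail = [];
recordVisit(trail, "https://app.example.com/customers/12345");
recordVisit(trail, "https://unknown.example.net/export");
console.log(trail.map(e => `${e.url} inScope=${e.inScope}`));
```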

Scenario 4: Compliance Evidence
When regulators audit you in 18 months, can you replay what your agent did today? Logs are often rotated or truncated under retention policies. Screenshots archived with the audit trail can still be reviewed.

Real Production Example: Claude Agent + PageBolt MCP

Here's how to build agents that pass audits using Claude (through the Anthropic SDK's Messages API with tool use) and PageBolt:

Agent Code:

const Anthropic = require("@anthropic-ai/sdk").Anthropic;

const client = new Anthropic();

// Tool definitions for visual audit trails (take_screenshot is backed by PageBolt)
const tools = [
  {
    name: "take_screenshot",
    description: "Take a screenshot of the current page for audit proof",
    input_schema: {
      type: "object",
      properties: {
        url: { type: "string", description: "URL to screenshot" }
      },
      required: ["url"]
    }
  },
  {
    name: "fill_form",
    description: "Fill a form field on the current page",
    input_schema: {
      type: "object",
      properties: {
        selector: { type: "string" },
        value: { type: "string" }
      },
      required: ["selector", "value"]
    }
  }
];

async function auditableAgent(task) {
  const messages = [];
  const auditTrail = [];

  // Initial request with audit context
  messages.push({
    role: "user",
    content: `Task: ${task}\n\nIMPORTANT: Before and after each action, take a screenshot for compliance audit. This is proof of your actions for SOC 2/HIPAA auditors.`
  });

  let response = await client.messages.create({
    model: "claude-opus-4-5-20251101",
    max_tokens: 4096,
    tools: tools,
    messages: messages
  });

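  while (response.stop_reason === "tool_use") {
    // Echo the assistant turn once, then answer every tool_use block in it:
    // the API expects one tool_result per tool_use.
    messages.push({
      role: "assistant",
      content: response.content
    });
    const toolResults = [];

    for (const toolUse of response.content.filter(block => block.type === "tool_use")) {
      if (toolUse.name === "take_screenshot") {
        // PageBolt API call - captures visual audit trail.
        // fetch() has no `params` option, so the target URL goes in the query string.
        const query = new URLSearchParams({ url: toolUse.input.url });
        const screenshot = await fetch(`https://api.pagebolt.com/screenshot?${query}`, {
          method: "GET",
          headers: {
            "Authorization": `Bearer ${process.env.PAGEBOLT_API_KEY}`
          }
        }).then(r => r.json());

        auditTrail.push({
          timestamp: new Date().toISOString(),
          action: "screenshot",
          url: toolUse.input.url,
          imageId: screenshot.id
        });

        toolResults.push({
          type: "tool_result",
          tool_use_id: toolUse.id,
          content: `Screenshot captured: ${screenshot.id}`
        });
      } else {
        // fill_form has no handler in this example; report that back to the
        // model instead of leaving the tool_use unanswered (which would stall the loop).
        toolResults.push({
          type: "tool_result",
          tool_use_id: toolUse.id,
          content: `Tool ${toolUse.name} is not implemented in this example`,
          is_error: true
        });
      }
    }

    messages.push({
      role: "user",
      content: toolResults
    });

    // Continue agent loop
    response = await client.messages.create({
      model: "claude-opus-4-5-20251101",
      max_tokens: 4096,
      tools: tools,
      messages: messages
    });
  }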

  return {
    result: response.content,
    auditTrail: auditTrail  // Return visual proof
  };
}

// Execute with audit trail (top-level await is not available in CommonJS, so chain the promise)
auditableAgent("Check customer account and verify compliance flags")
  .then(result => console.log("Audit trail:", result.auditTrail))
  .catch(console.error);

What This Does:

  • The prompt instructs the agent to take screenshots before and after each action
  • Screenshots stored in PageBolt (tamper-proof cloud)
  • Audit trail includes timestamps, URLs, visual proof
  • Auditors can replay the entire agent session as a video
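
If you also keep the audit trail in your own systems, one way to make edits detectable is to hash-chain the entries. The sketch below is an assumption about how you might do that locally with Node's built-in `crypto` module; it says nothing about how PageBolt stores images, and `appendEntry` and `verifyChain` are hypothetical helper names.

```javascript
const crypto = require("node:crypto");

const sha256 = text => crypto.createHash("sha256").update(text).digest("hex");

// Each entry stores the hash of the previous one, so editing or removing
// an earlier entry invalidates every hash that follows it.
function appendEntry(chain, entry) {
  const prevHash = chain.length ? chain[chain.length - 1].hash : "GENESIS";
  const hash = sha256(JSON.stringify({ ...entry, prevHash }));
  chain.push({ ...entry, prevHash, hash });
  return chain;
}

// Recomputes every hash and checks each link back to its predecessor.
function verifyChain(chain) {
  return chain.every((item, i) => {
    const { hash, ...rest } = item;
    const expectedPrev = i === 0 ? "GENESIS" : chain[i - 1].hash;
    return rest.prevHash === expectedPrev && sha256(JSON.stringify(rest)) === hash;
  });
}

const chain = [];
appendEntry(chain, { timestamp: "2025-01-01T00:00:00.000Z", action: "screenshot", imageId: "img-1" });
appendEntry(chain, { timestamp: "2025-01-01T00:00:05.000Z", action: "screenshot", imageId: "img-2" });
console.log("chain valid:", verifyChain(chain));
```

A chain like this is tamper-evident rather than tamper-proof: it shows that something changed, but only if the latest hash is kept somewhere the editor cannot reach.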

When You Need Visual Proof vs. Logs

| Scenario | Logs enough? | Visual proof needed? |
| --- | --- | --- |
| Public API call (no sensitive UI) | ✅ Yes | ❌ No |
| Database transaction | ✅ Yes | ❌ No |
| Agent accesses customer data | ❌ No | ✅ Yes |
| Agent fills compliance form | ❌ No | ✅ Yes |
| Agent accesses regulated systems | ❌ No | ✅ Yes |
| Agent accesses healthcare records | ❌ No | ✅ MUST |
| Malfunction investigation | ❌ No | ✅ Yes |

Rule of thumb: If your agent touches a UI or regulated data, capture visual proof.

Cost + Effort

Self-hosted audit trails:

  • Puppeteer screenshots: 300–500MB per agent instance
  • Infrastructure: $3,500+/month
  • Ops: 20+ hours/month (debugging, monitoring, scaling)

PageBolt MCP approach:

  • Hosted screenshots: $29/month (10k requests)
  • Setup: 30 minutes (integrate MCP)
  • Ops: ~0 hours/month
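
To see what the hosted plan covers, a quick back-of-the-envelope calculation helps. The sketch below uses the $29 and 10k-request figures quoted above and assumes two screenshots per action (one before, one after) with no other requests counted; `planCapacity` is a hypothetical helper.

```javascript
const PLAN_PRICE_USD = 29;    // plan price quoted above
const PLAN_REQUESTS = 10_000; // requests included in that plan

function planCapacity(actionsPerRun) {
  // Assumption: one screenshot before and one after each action.
  const screenshotsPerRun = actionsPerRun * 2;
  const runsPerMonth = Math.floor(PLAN_REQUESTS / screenshotsPerRun);
  const costPerRunUsd = (PLAN_PRICE_USD / PLAN_REQUESTS) * screenshotsPerRun;
  return { screenshotsPerRun, runsPerMonth, costPerRunUsd };
}

console.log(planCapacity(5));
```

Under those assumptions, an agent that performs 5 actions per run uses 10 requests per run, so the plan covers 1,000 runs a month at roughly $0.03 per run.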

Passing Your Audit

When your SOC 2 auditor asks "Show us proof of agent actions," you'll have:

  • Visual replay of every step
  • Timestamps tied to business logs
  • Evidence of what the agent accessed
  • Proof it didn't exceed permissions
  • Real-time audit trail you can demo live

That's compliance evidence auditors actually trust.
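
Tying timestamps to business logs can be as simple as matching each audit entry to the log lines recorded around the same moment. The sketch below assumes hypothetical shapes: audit entries carry an ISO `timestamp` (as in the agent code earlier), business log entries carry an ISO `at` field, and `correlate` with its 2-second window is an illustrative choice.

```javascript
// Attaches to each audit entry the business log lines recorded within
// `windowMs` milliseconds of it.
function correlate(auditTrail, businessLogs, windowMs = 2000) {
  return auditTrail.map(entry => {
    const t = Date.parse(entry.timestamp);
    const related = businessLogs.filter(
      log => Math.abs(Date.parse(log.at) - t) <= windowMs
    );
    return { ...entry, related };
  });
}

const auditTrail = [
  { timestamp: "2025-01-01T12:00:00.000Z", action: "screenshot", imageId: "img-1" }
];
const businessLogs = [
  { at: "2025-01-01T12:00:01.000Z", event: "customer 12345 viewed" },
  { at: "2025-01-01T12:05:00.000Z", event: "nightly export started" }
];

console.log(JSON.stringify(correlate(auditTrail, businessLogs), null, 2));
```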

Next Steps

  1. Add PageBolt MCP to your Claude Agent SDK setup
  2. Wrap agent actions with screenshot calls
  3. Store audit trail in your compliance system
  4. Demo to your auditor before the engagement
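
Step 2 can be sketched as a small wrapper that captures before and after any action. This is a minimal illustration under stated assumptions: `withScreenshots` is a hypothetical helper, `capture` stands in for whatever function takes a screenshot and resolves to an image identifier, and `action` is the agent step being wrapped.

```javascript
// Hypothetical wrapper: records a before/after pair around one agent action.
async function withScreenshots(label, action, capture, auditTrail) {
  const before = await capture();
  let outcome = "ok";
  try {
    return await action();
  } catch (err) {
    outcome = `error: ${err.message}`;
    throw err;
  } finally {
    // The "after" capture runs even when the action throws,
    // so failures are documented as well as successes.
    const after = await capture();
    auditTrail.push({
      timestamp: new Date().toISOString(),
      label,
      before,
      after,
      outcome
    });
  }
}

// Usage with stand-in functions (replace with real capture and action code).
async function demo() {
  const auditTrail = [];
  let counter = 0;
  const capture = async () => `img-${++counter}`;
  await withScreenshots("fill email field", async () => "done", capture, auditTrail);
  return auditTrail;
}

demo().then(trail => console.log(trail));
```

Keeping the capture logic in one wrapper means the audit trail does not depend on the model remembering to request screenshots.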

Try PageBolt free — 100 requests/month, no credit card. Build one auditable agent. See how it changes the conversation with regulators.
