Custodia-Admin

Posted on Mar 7 • Originally published at pagebolt.dev

agentlens, unworldly, and the text audit trail gap — why visual replay is still missing

#compliance #observability #aiagents #audit

agentlens just shipped immutable audit trail logging for AI agents. unworldly launched weeks before with tamper-evident logs. Both solve a real problem: tracking what your agent did.

Problem solved: ✅ immutable log of every action
Problem not solved: visual proof of what the agent actually saw

This gap is bigger than it looks.

The Text Audit Trail Problem

agentlens logs every agent action:

{
  "timestamp": "2026-03-07T14:23:15Z",
  "agent_id": "refund_processor_v2",
  "action": "click",
  "selector": "button[name=submit]",
  "result": "success"
}

This is perfect for forensics. It's immutable. It's auditable. It's tamper-evident.
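To make "tamper-evident" concrete, here is a minimal sketch of one common way to get that property: a hash chain, where each record stores the hash of the record before it. This is an illustration only. It is not taken from agentlens or unworldly, and the function names (`entry_hash`, `append_entry`, `verify_chain`) and the second log entry are hypothetical.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def entry_hash(entry: dict, prev_hash: str) -> str:
    """Hash the entry's canonical JSON together with the previous hash."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list, entry: dict) -> None:
    """Append an entry, linking it to the hash of the record before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "entry": entry,
        "prev_hash": prev_hash,
        "hash": entry_hash(entry, prev_hash),
    })


def verify_chain(chain: list) -> bool:
    """Recompute every hash; an edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != entry_hash(record["entry"], prev_hash):
            return False
        prev_hash = record["hash"]
    return True


chain = []
append_entry(chain, {
    "timestamp": "2026-03-07T14:23:15Z",
    "agent_id": "refund_processor_v2",
    "action": "click",
    "selector": "button[name=submit]",
    "result": "success",
})
append_entry(chain, {  # hypothetical follow-up entry
    "timestamp": "2026-03-07T14:23:16Z",
    "agent_id": "refund_processor_v2",
    "action": "navigate",
    "result": "success",
})

# Simulate someone editing a stored entry after the fact.
tampered = copy.deepcopy(chain)
tampered[0]["entry"]["selector"] = "button[name=cancel]"

print(verify_chain(chain))     # True
print(verify_chain(tampered))  # False
```

Note what the check covers: it shows the log was not altered after it was written. It says nothing about what was on the screen when the click happened.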

But it answers one question, not two.

Question 1: What did the agent do?
Answer: Text log shows it (click, fill, navigate, submit).

Question 2: What did the agent see?
Answer: Text log does not show it.

Regulators ask both questions. Compliance teams ask both questions. Auditors ask both questions.

Real Scenario: The Compliance Interview

Compliance auditor: "On March 2, your agent processed a $500 refund. Show me the evidence."

You show the text log:

action: navigate, url: /refunds/123
action: fill, selector: input[name=amount], value: 500
action: click, selector: button[type=submit]
result: success

Auditor: "That shows what the agent said it did. But what did it actually see? What was on the screen? Did the form show $500? Did the confirmation say 'refund approved'? How do I know the agent filled the right field?"

You have no answer. Text logs don't capture screen state.

The Gap: Text vs Visual

Text audit trails (agentlens, unworldly, LangSmith) tell you:

What the agent decided to do
What API calls it made
What responses it got
Timestamps and metadata

Visual audit trails (screenshots, videos, PDFs) show you:

What was actually on the screen
What the agent clicked on
What the form looked like before/after
The full interaction sequence

For regulated workflows, you need both.

Why Regulators Demand Visual Proof

Three reasons:

1. Behavioral verification: Logs say "agent filled amount field with 500." A screenshot of the filled form corroborates that it actually happened. Logs can be faked or misinterpreted. Screenshots are harder to fake.

2. Compliance standards: SOC 2 Type II audits test whether controls operated effectively over a period of time, and auditors ask for evidence that they did. A text log is the system's own assertion about what happened. A screenshot is an independent record that backs the assertion up.

3. Liability: If something goes wrong and regulators investigate, your defense is: "Here's the immutable log AND here's the screenshot proving what the agent saw." Not: "Here's a log that says it worked."

Text logs alone put you in a weaker position.

The Opportunity for agentlens and unworldly

Both projects are solving the right problem. Immutable audit trails are essential. But they're solving half the problem.

The teams that win will pair text audit trails (for forensics) with visual replay (for proof).

If agentlens or unworldly ship screenshot/video capture, they move from "immutable logs" to "immutable logs + visual proof." That's a stronger moat.

Until then, they're incomplete.

The Complementary Solution

You don't replace text audit trails. You pair them with visual proof.

Pattern:

Your agent runs:
1. Navigate to /refunds
2. Fill amount field
3. Click submit
4. Get confirmation

agentlens logs: [all 4 steps with timestamps]
PageBolt captures: [screenshot before, screenshot after, video of full flow]

Auditor sees:
- Text log proves what happened (forensics)
- Screenshots prove what the agent saw (compliance)
- Video proves the sequence (behavior verification)
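One way to make the pairing checkable is to put a hash of each screenshot into the log entry for the same step, so the text record and the image point at each other. The sketch below assumes you already have the screenshot as bytes from whatever capture tool you use. The function name and the `screenshot_ref` / `screenshot_sha256` fields are hypothetical, not part of agentlens, unworldly, or PageBolt.

```python
import hashlib
from datetime import datetime, timezone


def log_step_with_visual(action: str, selector: str,
                         screenshot_bytes: bytes,
                         screenshot_ref: str) -> dict:
    """Build a log entry that also pins the screenshot taken at this step.

    screenshot_ref is wherever the image is stored (a path or object key);
    the SHA-256 digest lets an auditor confirm the stored image is the one
    that was captured when the entry was written.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "selector": selector,
        "screenshot_ref": screenshot_ref,
        "screenshot_sha256": hashlib.sha256(screenshot_bytes).hexdigest(),
    }


# Placeholder bytes stand in for a real PNG here.
entry = log_step_with_visual(
    action="click",
    selector="button[type=submit]",
    screenshot_bytes=b"placeholder image bytes",
    screenshot_ref="refunds/123/after-submit.png",
)
print(entry["screenshot_ref"], entry["screenshot_sha256"][:12])
```

If the entry is then written to an immutable log, the screenshot inherits some of that protection: swapping the image later changes its hash, and the mismatch shows up.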

Text audit trail tools are getting better every month. agentlens, unworldly, LangSmith, Helicone — they're all building comprehensive logging.

None of them appears to be building visual replay, probably because it requires:

Taking screenshots at the right moments
Recording video of multi-step sequences
Storing visual artifacts server-side
Syncing visual proof with text logs

That's infrastructure-heavy. Most audit trail projects stay focused on text.
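The last item, syncing visual proof with text logs, is the one that tends to break quietly. A rough sketch of an audit-time check, assuming each log entry carries a `screenshot_ref` and a `screenshot_sha256` field and that stored images can be looked up by reference (both are assumptions for illustration, as is the function name):

```python
import hashlib


def find_unverified_steps(log_entries: list, artifact_store: dict) -> list:
    """Return (index, reason) for every entry whose visual proof is
    missing or does not match the hash recorded in the log."""
    problems = []
    for index, entry in enumerate(log_entries):
        ref = entry.get("screenshot_ref")
        if ref is None:
            problems.append((index, "no screenshot linked"))
            continue
        image = artifact_store.get(ref)
        if image is None:
            problems.append((index, "artifact missing"))
        elif hashlib.sha256(image).hexdigest() != entry.get("screenshot_sha256"):
            problems.append((index, "hash mismatch"))
    return problems


# In-memory stand-ins for the log and the server-side image store.
good_image = b"form showing amount 500"
store = {
    "step-1.png": good_image,
    "step-2.png": b"a different image than the one logged",
}
log = [
    {"action": "fill", "screenshot_ref": "step-1.png",
     "screenshot_sha256": hashlib.sha256(good_image).hexdigest()},
    {"action": "click", "screenshot_ref": "step-2.png",
     "screenshot_sha256": hashlib.sha256(b"original image").hexdigest()},
    {"action": "navigate"},
    {"action": "submit", "screenshot_ref": "step-4.png",
     "screenshot_sha256": "0" * 64},
]

for index, reason in find_unverified_steps(log, store):
    print(index, reason)
```

Running a check like this before an audit, rather than during one, is the point: a log entry with no matching image is the same gap the auditor would find.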

The Strategic Implication

If you're using agentlens for audit trails, you still need visual proof for regulators. agentlens logs that the agent navigated to /refunds. PageBolt screenshots what the /refunds page looked like.

If you're using unworldly for tamper-evident logs, you still need to show auditors what the agent actually saw. unworldly proves the log is immutable. PageBolt proves what happened on screen.

They're not competing. They're complementary.

It's the same relationship we see with LangSmith + PageBolt. LangSmith captures what the LLM decided. PageBolt captures what the user actually saw as a result.

Text audit trails are table stakes. Visual replay is the next layer.

Getting Started

If you're building with agentlens or unworldly, add PageBolt for visual replay:

Capture screenshots at checkpoints: form filled, submission confirmed, error caught
Record videos of multi-step workflows: navigation → fill → submit → confirmation
Store visuals server-side: alongside your text audit trail
Pair them for compliance: auditors see logs AND screenshots
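The checkpoint idea above can be sketched as a small runner that captures a screenshot at marked steps and whenever a step fails. The `capture` argument is a function you supply that takes a step name and returns a reference to the stored image. It is a stand-in for whatever screenshot call you use, not a real PageBolt, agentlens, or unworldly API, and `run_workflow` is a hypothetical name.

```python
from typing import Callable, List, Optional, Tuple

# (step name, function that performs the step, capture a screenshot after it?)
Step = Tuple[str, Callable[[], None], bool]


def run_workflow(steps: List[Step],
                 capture: Callable[[str], Optional[str]]) -> List[dict]:
    """Run steps in order, capturing at checkpoints and on any error."""
    trail = []
    for name, run, is_checkpoint in steps:
        try:
            run()
            status = "success"
        except Exception as exc:
            status = f"error: {exc}"
        failed = status != "success"
        screenshot = capture(name) if (is_checkpoint or failed) else None
        trail.append({"step": name, "status": status, "screenshot": screenshot})
        if failed:
            break  # stop the workflow; the error screenshot is the evidence
    return trail


def fake_capture(step_name: str) -> str:
    """Stand-in for a real capture call; returns where the image would live."""
    return f"refund-123/{step_name}.png"


def do_nothing() -> None:
    pass


trail = run_workflow(
    [
        ("navigate", do_nothing, False),
        ("fill", do_nothing, True),      # checkpoint: form filled
        ("submit", do_nothing, True),    # checkpoint: submission confirmed
    ],
    fake_capture,
)
for record in trail:
    print(record)
```

Capturing only at checkpoints keeps the request count down, which matters on a metered plan.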

Free tier: 100 requests/month. That's enough to capture visual proof for 20–30 workflows per month, or roughly three to five captures per workflow if each capture counts as one request.

For teams building audit trail infrastructure: Add visual replay to your audit stack

For teams getting started: Get started at pagebolt.dev/signup

Text audit trails are necessary. Visual replay is non-negotiable for regulated workflows. The teams that build both win.
