Monitoring AI Agent Actions in Production: A Developer's Guide
You deploy an AI agent to production. It's supposed to fill out forms, make API calls, and report back. For the first week, everything works. Then on Wednesday, a customer reports: "The agent submitted my form twice and now my data is corrupted."
You check the logs. Your agent says:
2026-03-17T14:32:15Z Agent started task
2026-03-17T14:32:18Z Form filled
2026-03-17T14:32:19Z Submit clicked
2026-03-17T14:32:20Z Task completed
But the logs don't answer: What did the agent actually see on screen? Did the form really fill? Did the submit button click? Or did the page freeze after your agent clicked?
Text logs alone aren't enough. You need to see what your agent saw.
The Problem: Blind Agents
Right now, your agent monitoring probably includes:
- Log output (text statements)
- API call traces (what endpoints were hit)
- Error messages (if something broke)
But none of this answers: What did the UI actually show the agent?
Common blind spots:
- Form validation errors that logs missed
- Page redirects your agent didn't expect
- Visual elements (buttons, links) that moved or disappeared
- Stale page state (cached HTML)
Result: Agents make mistakes and you have no visual evidence of what went wrong.
The Solution: Screenshot-Based Monitoring
Every time your agent takes an action, capture a screenshot. The agent can use it to verify what it's seeing, and you can use it to audit what happened.
Here's the pattern:
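// node-fetch v2 is assumed here; v3 is ESM-only and cannot be loaded with require().
const fetch = require('node-fetch');
const fs = require('fs');
const path = require('path');

// Screenshots and the JSON audit trail are both written to this directory.
const AUDIT_DIR = './audit-trail';

class MonitoredAgent {
  constructor(apiKey) {
    this.apiKey = apiKey;
    this.screenshots = [];
    // Create the output directory up front so writeFileSync does not fail.
    fs.mkdirSync(AUDIT_DIR, { recursive: true });
  }

  // Request a screenshot of `url`, save it to disk, and record its metadata.
  // The request carries only the URL, so the state you want to capture must be
  // reachable from that URL; the remote renderer does not share the agent's session.
  async captureScreenshot(url, label) {
    const response = await fetch('https://api.pagebolt.com/v1/screenshot', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        url: url,
        viewport: { width: 1280, height: 720 },
        format: 'png'
      })
    });
    // Fail loudly instead of saving an error body as a .png file.
    if (!response.ok) {
      throw new Error(`Screenshot "${label}" failed: HTTP ${response.status}`);
    }
    const buffer = Buffer.from(await response.arrayBuffer());
    const filename = `screenshot-${label}-${Date.now()}.png`;
    fs.writeFileSync(path.join(AUDIT_DIR, filename), buffer);
    this.screenshots.push({
      timestamp: new Date().toISOString(),
      label: label,
      filename: filename,
      url: url
    });
    return buffer;
  }

  // fillForm() and clickSubmit() are your agent's own methods; they are not defined here.
  async executeTask(url) {
    console.log(`Agent starting task on ${url}`);
    // Screenshot 1: Initial state
    await this.captureScreenshot(url, 'initial-state');
    // Agent does work (fill form, click button, etc.)
    await this.fillForm(url);
    // Screenshot 2: After form fill
    await this.captureScreenshot(url, 'after-fill');
    // Click submit
    await this.clickSubmit(url);
    // Screenshot 3: After submit
    await this.captureScreenshot(url, 'after-submit');
    // Wait for confirmation
    await new Promise(resolve => setTimeout(resolve, 2000));
    // Screenshot 4: Final state
    await this.captureScreenshot(url, 'final-state');
    return this.getAuditTrail();
  }

  getAuditTrail() {
    return {
      task: 'form_submission',
      timestamp: new Date().toISOString(),
      screenshots: this.screenshots,
      status: 'completed'
    };
  }
}

// Usage (wrapped in an async function because CommonJS has no top-level await)
async function main() {
  const agent = new MonitoredAgent(process.env.PAGEBOLT_API_KEY);
  const auditTrail = await agent.executeTask('https://example.com/form');
  // Save audit trail
  fs.writeFileSync(
    path.join(AUDIT_DIR, `${Date.now()}.json`),
    JSON.stringify(auditTrail, null, 2)
  );
}

main().catch(console.error);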
Real-World Example: Procurement Workflow Agent
Let's say you have an agent that processes purchase requests. At each decision point, capture a screenshot:
// extractAmount(), submitForApproval(), and autoApprove() are assumed to be
// methods you add to MonitoredAgent; they are not defined above.
async function procurementAgent(prUrl) {
  const agent = new MonitoredAgent(process.env.PAGEBOLT_API_KEY);
  try {
    // Step 1: Read requisition
    await agent.captureScreenshot(prUrl, 'read-requisition');
    const amount = await agent.extractAmount(prUrl);
    // Step 2: Check approval rules
    const requiresApproval = amount > 10000;
    await agent.captureScreenshot(prUrl, 'approval-check');
    // Step 3: Route for approval
    if (requiresApproval) {
      await agent.submitForApproval(prUrl);
      await agent.captureScreenshot(prUrl, 'submitted-for-approval');
    } else {
      await agent.autoApprove(prUrl);
      await agent.captureScreenshot(prUrl, 'auto-approved');
    }
    // Step 4: Final state
    await agent.captureScreenshot(prUrl, 'final-state');
    // Return complete audit trail
    return agent.getAuditTrail();
  } catch (error) {
    // On error, capture a final screenshot. A failed capture is ignored so the
    // original error is the one that propagates.
    await agent.captureScreenshot(prUrl, 'error-state').catch(() => {});
    throw error;
  }
}
Audit trail includes:
- Screenshot at each decision point
- Timestamps of each action
- URL state at each step
- Complete visual record of what the agent saw
For compliance: "Here's what the agent saw when it approved the purchase." Auditors can literally see the UI the agent was interacting with.
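To make that record easier to hand to a reviewer, the audit trail can be condensed into an ordered list of steps. The following is a minimal sketch, assuming the object shape returned by getAuditTrail() above; summarizeAuditTrail is a hypothetical helper name, not part of any PageBolt SDK.

```javascript
// Hypothetical helper: turns an audit trail into an ordered step summary.
// Assumes `trail.screenshots` is an array of { timestamp, label, filename, url }
// entries with ISO 8601 timestamps, as produced by captureScreenshot().
function summarizeAuditTrail(trail) {
  const shots = trail.screenshots;
  if (shots.length === 0) return [];
  const start = Date.parse(shots[0].timestamp);
  return shots.map((shot, index) => ({
    step: index + 1,
    label: shot.label,
    // Milliseconds elapsed since the first screenshot of the task
    elapsedMs: Date.parse(shot.timestamp) - start,
    filename: shot.filename
  }));
}
```

Each row pairs a decision point with the file that shows it, so an auditor can step through the task in order without reading raw JSON.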
Why Screenshots > Text Logs
Text logs tell you what the agent thinks happened:
[14:32:19] Form field "amount" filled with value "5000"
[14:32:20] Submit button clicked
[14:32:21] Request successful
Screenshots show what actually happened:
- Screenshot 1: Form loaded correctly
- Screenshot 2: Form filled (but validation error showing in red)
- Screenshot 3: Submit button is disabled (grayed out)
- Screenshot 4: Modal popup blocked submission
Huge difference. The agent's logs say "submit clicked" but the screenshot shows "button is disabled." Text logs are incomplete.
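One way to surface that gap automatically is to flag log entries that have no screenshot taken shortly afterward. This is a rough sketch under stated assumptions: log entries are objects of the form { timestamp, message } (an assumed shape, not something the agent above produces), screenshots use the metadata shape from captureScreenshot(), and the 2-second window is an arbitrary default.

```javascript
// Hypothetical helper: returns the log entries that no screenshot backs up.
// A log entry counts as "verified" if at least one screenshot was captured
// at or after its timestamp, within `windowMs` milliseconds.
function findUnverifiedLogEntries(logEntries, screenshots, windowMs = 2000) {
  const shotTimes = screenshots.map(shot => Date.parse(shot.timestamp));
  return logEntries.filter(entry => {
    const loggedAt = Date.parse(entry.timestamp);
    const hasEvidence = shotTimes.some(
      shotAt => shotAt >= loggedAt && shotAt - loggedAt <= windowMs
    );
    return !hasEvidence;
  });
}
```

Entries returned by this function are claims the agent made without visual evidence; they are the first place to look when a log says "submit clicked" and the customer says otherwise.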
Governance & Compliance
In regulated industries (fintech, healthcare, insurance), visual proof matters:
- Audit trail: Regulators want to see: "Here's the exact UI state when the agent made decision X"
- Debugging: Support says "agent rejected my application" → you show 5 screenshots proving why
- Liability: "Agent made wrong decision" → you have visual evidence of exactly what information was available
Screenshots are the governance layer that text logs can't provide.
Cost & Implementation
PageBolt approach:
- Starter plan: $29/month (5,000 screenshots)
- Typical agent: 4-10 screenshots per task
- Volume: 500 tasks/month = 2,000-5,000 screenshots ✓ Covered by Starter
- Total: $29/month
Self-hosted approach:
- Puppeteer + Node.js: $50-100/month infrastructure
- Storage: $5-10/month for screenshots
- DevOps overhead: 2-4 hours/month
- Total: $55-110/month + time
At this volume (up to 5,000 screenshots/month), PageBolt's $29/month Starter plan is both cheaper and simpler than self-hosting.
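The volume arithmetic above is easy to rerun for your own workload. A small sketch, using only the figures quoted in this section (500 tasks/month, 4-10 screenshots per task, a 5,000-screenshot quota); estimateScreenshotUsage is a hypothetical helper name.

```javascript
// Hypothetical helper: estimates monthly screenshot volume and checks it
// against a plan quota. All inputs are plain numbers supplied by the caller.
function estimateScreenshotUsage(tasksPerMonth, minPerTask, maxPerTask, planQuota) {
  const min = tasksPerMonth * minPerTask;
  const max = tasksPerMonth * maxPerTask;
  // The plan only "fits" if the worst case stays within the quota.
  return { min, max, fitsPlan: max <= planQuota };
}

console.log(estimateScreenshotUsage(500, 4, 10, 5000));
// { min: 2000, max: 5000, fitsPlan: true }
```

Note that the worst case lands exactly on the quota: at 10 screenshots per task there is no headroom, so retries or error-state captures would push a 500-task month over 5,000.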
Getting Started
- Sign up free at pagebolt.dev/pricing (100 screenshots/month)
- Add the captureScreenshot() function to your agent
- Call it before and after critical actions
- Save screenshots + metadata to your audit trail
- Your compliance and governance layer is ready
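Calling captureScreenshot() before and after each critical action can be wrapped in one helper so the pattern is applied consistently. A minimal sketch, assuming `agent` exposes the captureScreenshot(url, label) method shown earlier; withScreenshots and the before-/after-/error- label prefixes are illustrative choices, not an established convention.

```javascript
// Hypothetical wrapper: captures a screenshot before and after `action`,
// and an extra one if the action (or the "after" capture) throws.
async function withScreenshots(agent, url, label, action) {
  await agent.captureScreenshot(url, `before-${label}`);
  try {
    const result = await action();
    await agent.captureScreenshot(url, `after-${label}`);
    return result;
  } catch (error) {
    // Best-effort capture; a failure here must not hide the original error.
    await agent.captureScreenshot(url, `error-${label}`).catch(() => {});
    throw error;
  }
}
```

Usage would look like `await withScreenshots(agent, url, 'submit', () => agent.clickSubmit(url))`, where clickSubmit is whatever method your agent uses to perform the action.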
Your agent is now fully transparent. Every action, every decision, every page state — captured and auditable.
Start free: pagebolt.dev/pricing. 100 screenshots/month, no credit card required.