
Custodia-Admin

Originally published at pagebolt.dev

How to Add Visual Proof to Your MCP Server (Screenshots + Replay)


Your MCP tool just ran 50 actions on a website. But your user only sees JSON output. They don't know what actually happened.

This is the MCP visibility problem.

MCP (Model Context Protocol) servers are powerful: they let clients such as Claude, Cursor, and Windsurf automate browser tasks. But the work they do is invisible. A user calls your tool, gets back structured data, and has to trust that what you reported is what actually happened.

What if you could show them?

The Problem: MCP Tools Are Text-Only

When you implement an MCP handler for browser automation (e.g., filling a form, navigating a site, capturing data), the output is JSON:

{
  "success": true,
  "filled_fields": ["name", "email", "submit"],
  "submitted": true
}

The user can't verify it. Did the form actually fill? Did the submit button actually click? Or did your tool encounter an error and return a lie?

MCP tools need visual proof.

The Solution: PageBolt API Inside Your MCP Handler

Add a screenshot call to your MCP tool. After each major action (navigation, form submission, data extraction), capture visual proof.

import fetch from 'node-fetch';
import fs from 'fs';

const PAGEBOLT_API_KEY = process.env.PAGEBOLT_API_KEY;

// MCP tool handler for form submission
async function fillAndSubmitForm(url, formData) {
  // Automated browser actions happen here (via Puppeteer, Playwright, etc.)
  // ...

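  // Capture visual proof. This request sends only the URL, so PageBolt renders
  // the page itself rather than reusing the automation session above.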
  const screenshotResponse = await fetch('https://api.pagebolt.dev/v1/screenshot', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${PAGEBOLT_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      url: url,
      format: 'png',
      width: 1280,
      height: 720
    })
  });

  if (!screenshotResponse.ok) {
    throw new Error(`Screenshot failed: ${screenshotResponse.status}`);
  }

  const buffer = await screenshotResponse.arrayBuffer();
  const filename = `form-submitted-${Date.now()}.png`;
  fs.writeFileSync(filename, Buffer.from(buffer));

  return {
    success: true,
    filled_fields: Object.keys(formData),
    submitted: true,
    visual_proof: filename,  // Return filename to user
    message: `Form submitted successfully. Visual proof: ${filename}`
  };
}

Now your MCP tool returns:

{
  "success": true,
  "filled_fields": ["name", "email"],
  "submitted": true,
  "visual_proof": "form-submitted-1700000000000.png",
  "message": "Form submitted successfully. Visual proof: form-submitted-1700000000000.png"
}

The user can verify exactly what happened.
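Both handlers in this post send the same request, so it can be factored into one helper. The following is a minimal sketch, not PageBolt's official client: the endpoint and body fields are taken from the examples in this post, while `captureScreenshot`, `timeoutMs`, and the `fetchImpl` parameter (an injection point for offline testing) are hypothetical names introduced here. It assumes Node 18+ for the global `fetch` and `AbortController`.

```javascript
// Sketch only: endpoint and body fields are copied from this post's examples.
const PAGEBOLT_SCREENSHOT_URL = 'https://api.pagebolt.dev/v1/screenshot';

// `fetchImpl` is a hypothetical injection point so the helper can be tested offline.
async function captureScreenshot(url, options = {}, fetchImpl = fetch) {
  const { apiKey = process.env.PAGEBOLT_API_KEY, timeoutMs = 30000, ...params } = options;
  if (!apiKey) {
    throw new Error('PAGEBOLT_API_KEY is not set');
  }

  // Abort the request if the API does not answer within timeoutMs
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(PAGEBOLT_SCREENSHOT_URL, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      // Remaining options (width, height, fullPage, ...) are passed through as-is
      body: JSON.stringify({ url, format: 'png', ...params }),
      signal: controller.signal
    });
    if (!response.ok) {
      throw new Error(`Screenshot failed: ${response.status}`);
    }
    return Buffer.from(await response.arrayBuffer());
  } finally {
    clearTimeout(timer);
  }
}
```

A handler could then call `await captureScreenshot(url, { width: 1280, height: 720 })` and decide separately where the returned buffer goes.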

Real Example: MCP Tool for Web Form Automation

Here's a complete MCP server that exposes one tool, screenshot_form, which captures a screenshot of a web form as visual proof of its current state:

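const { Server } = require('@modelcontextprotocol/sdk/server/index.js');
const { StdioServerTransport } = require('@modelcontextprotocol/sdk/server/stdio.js');
const { ListToolsRequestSchema, CallToolRequestSchema } = require('@modelcontextprotocol/sdk/types.js');
// node-fetch v2 supports require(); v3 is ESM-only. Node 18+ also ships a global fetch.
const fetch = require('node-fetch');
const fs = require('fs');

// The server must declare the tools capability before tool handlers can be registered
const server = new Server(
  { name: 'form-automator-with-proof', version: '1.0.0' },
  { capabilities: { tools: {} } }
);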

const PAGEBOLT_API_KEY = process.env.PAGEBOLT_API_KEY;

// List available tools
server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: 'screenshot_form',
        description: 'Take a screenshot of a web form to verify its current state',
        inputSchema: {
          type: 'object',
          properties: {
            url: {
              type: 'string',
              description: 'URL of the form to screenshot'
            }
          },
          required: ['url']
        }
      }
    ]
  };
});

// Handle tool calls
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === 'screenshot_form') {
    const { url } = request.params.arguments;

    const response = await fetch('https://api.pagebolt.dev/v1/screenshot', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${PAGEBOLT_API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        url: url,
        format: 'png',
        fullPage: true,
        blockBanners: true
      })
    });

    if (!response.ok) {
      // Flag the result as an error so the client does not treat it as success
      return {
        isError: true,
        content: [{
          type: 'text',
          text: `Screenshot failed: ${response.status}`
        }]
      };
    }

    const buffer = await response.arrayBuffer();
    const filename = `proof-${Date.now()}.png`;
    fs.writeFileSync(filename, Buffer.from(buffer));

    return {
      content: [{
        type: 'text',
        text: `Screenshot captured: ${filename}`
      }]
    };
  }

  // Reject unknown tool names instead of silently returning undefined
  throw new Error(`Unknown tool: ${request.params.name}`);
});

const transport = new StdioServerTransport();
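// connect() returns a promise; log startup failures to stderr (stdout carries the protocol)
server.connect(transport).catch((error) => {
  console.error('Failed to start MCP server:', error);
  process.exit(1);
});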

Claude can now call the screenshot_form tool to check a form's state before attempting to fill it.
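The server above reports only a filename, which a remote client may not be able to open. MCP tool results can also carry image content (base64 data plus a MIME type), so the screenshot can travel inside the response. A minimal sketch, where `buildProofResult` is a hypothetical helper name and not part of the MCP SDK:

```javascript
// Build an MCP tool result that carries the screenshot inline.
// `buildProofResult` is a hypothetical helper name, not part of the MCP SDK.
function buildProofResult(summary, pngBuffer) {
  return {
    content: [
      { type: 'text', text: summary },
      {
        type: 'image',
        // Image content is sent as base64 text plus its MIME type
        data: Buffer.from(pngBuffer).toString('base64'),
        mimeType: 'image/png'
      }
    ]
  };
}
```

In the handler this would replace the final return, for example `return buildProofResult('Screenshot captured', Buffer.from(buffer))`. Full-page screenshots can be large, so a viewport-sized capture may be the safer default when returning images inline.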

MCP Tools That Benefit from Visual Proof

Web Scraping Tools — Verify page load before extracting data:

{
  name: 'scrape_ecommerce_listings',
  description: 'Scrape product listings. Screenshot before extraction for proof.'
}

Form Automation Tools — Show filled fields:

{
  name: 'fill_application_form',
  description: 'Fill an application form and capture screenshot of completed form'
}

Testing Tools — Visual regression proof:

{
  name: 'test_checkout_flow',
  description: 'Walk through checkout and screenshot each step'
}

Data Extraction Tools — Prove extracted data matches page:

{
  name: 'extract_contact_info',
  description: 'Extract contact info and screenshot source for verification'
}
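
For tools that walk through several steps, such as a checkout flow, one screenshot per step gives a reviewable trail. A rough sketch of that loop follows; the step shape (`name`, `run`) and the `capture` callback are assumptions made for illustration, and `capture` would wrap whatever screenshot call the tool uses.

```javascript
// Run named steps in order and capture one screenshot after each.
// The step shape ({ name, run }) and `capture` are hypothetical, chosen for this sketch.
async function runStepsWithProof(steps, capture) {
  const proofs = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch (error) {
      // Stop at the first failing step and record why it failed
      proofs.push({ step: step.name, ok: false, error: error.message });
      return { success: false, proofs };
    }
    // Only successful steps get a screenshot entry
    proofs.push({ step: step.name, ok: true, screenshot: await capture(step.name) });
  }
  return { success: true, proofs };
}
```

The returned `proofs` array keeps the step order, so a reader can see exactly how far the flow got before anything went wrong.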

Why This Matters for AI Agents

When Claude or Cursor runs your MCP tool:

  1. Tool executes (fills forms, navigates pages, extracts data)
  2. Tool captures visual proof (screenshot via PageBolt API)
  3. Tool returns: result + visual evidence
  4. User/AI can verify the action actually worked

This creates accountability. The AI agent can see what happened. The user can trust the output.
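
The four steps above can be sketched as a small wrapper. This is one possible shape, not a prescribed pattern: `withVisualProof`, `action`, and `capture` are hypothetical names, and both callbacks are supplied by the caller. The wrapper reports the action's real outcome rather than a hard-coded `success: true`, and a failed screenshot does not hide that outcome.

```javascript
// Run an action, then attach a screenshot reference to whatever it reported.
// `action` and `capture` are caller-supplied async functions (hypothetical shapes).
async function withVisualProof(action, capture) {
  let result;
  try {
    result = { success: true, ...(await action()) };
  } catch (error) {
    // Report the failure instead of claiming success
    result = { success: false, error: error.message };
  }
  try {
    // Capture even after a failure: the error state is worth seeing too
    result.visual_proof = await capture();
  } catch (error) {
    // A failed screenshot must not mask the action's own outcome
    result.visual_proof = null;
    result.proof_error = error.message;
  }
  return result;
}
```

A form tool might call it as `withVisualProof(() => submitForm(url, formData), () => saveScreenshot(url))`, where both functions are placeholders for the tool's own logic.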

Implementation Checklist

  • ✅ Add PageBolt API key to your environment
  • ✅ Make fetch available (built in on Node.js 18+; import node-fetch on older versions)
  • ✅ In your MCP handler, call PageBolt /v1/screenshot after each major action
  • ✅ Use response.arrayBuffer() to read the binary PNG data
  • ✅ Save screenshot to file or upload to your storage
  • ✅ Return screenshot filename/URL in MCP tool response
  • ✅ Document the visual proof in tool description
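
For the "save screenshot to file" item, the examples in this post write into the process's working directory with a timestamp-only name. A slightly more careful sketch is shown below; the `saveProof` name, the default `mcp-proof` directory, and the label sanitizing rule are arbitrary choices made for illustration.

```javascript
const fs = require('node:fs/promises');
const os = require('node:os');
const path = require('node:path');
const crypto = require('node:crypto');

// Write a screenshot into a dedicated directory and return its absolute path.
// The default directory name is an arbitrary choice for this sketch.
async function saveProof(pngBuffer, label, dir = path.join(os.tmpdir(), 'mcp-proof')) {
  await fs.mkdir(dir, { recursive: true });
  // Keep only filename-safe characters from the label
  const safeLabel = String(label).replace(/[^a-z0-9_-]/gi, '_');
  // A random suffix avoids collisions when two calls land in the same millisecond
  const suffix = crypto.randomUUID().slice(0, 8);
  const fullPath = path.resolve(dir, `${safeLabel}-${Date.now()}-${suffix}.png`);
  await fs.writeFile(fullPath, pngBuffer);
  return fullPath;
}
```

Returning an absolute path makes the `visual_proof` value unambiguous no matter which directory the MCP client launched the server from.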

Pricing

Plan    | Requests/Month | Cost | Best For
Free    | 100            | $0   | Development & testing
Starter | 5,000          | $29  | Small MCP tools
Growth  | 25,000         | $79  | Production MCP servers
Scale   | 100,000        | $199 | Enterprise agent platforms

Summary

MCP servers give AI agents superpowers. Visual proof gives your MCP server credibility.

  • ✅ Add PageBolt screenshots to MCP tool handlers
  • ✅ Capture proof after critical actions (form submit, navigation, extraction)
  • ✅ Return screenshot + structured result
  • ✅ Users see what actually happened
  • ✅ AI agents can verify and trust the output

Get started: Try PageBolt free (100 requests/month, no credit card required).
