Custodia-Admin

Posted on Mar 7 • Edited on Mar 25 • Originally published at pagebolt.dev

How to Add Visual Proof to Your MCP Server in 5 Minutes

#ai #security #mcp #devtools

Your MCP server does useful work. It orchestrates agents, runs automations, calls APIs, navigates web apps.

You log everything. Tool calls, responses, state changes. Good.

But when someone asks "what did the agent actually see?" your logs are silent. You can tell them what methods were called. You can't show them what happened on screen.

That's the gap visual proof fills. In 5 minutes, you can add a PageBolt integration to your MCP server that captures screenshots at any step.

Why This Matters

MCP servers are invisible by default. When Claude uses your tools, there's no UI to watch, no screenshot to see, no proof of what actually happened.

For compliance, debugging, and user trust, that visibility matters.

A screenshot costs almost nothing: one API call, one binary response, one PNG file stored. But it answers the question: "Did the agent see what we expected?"

The Integration (5 Minutes)

Add one new tool to your MCP server that calls PageBolt's screenshot endpoint.

1. Create a screenshot tool in your tools list:

{
    "name": "capture_screenshot",
    "description": "Capture a screenshot of a URL. Saves a PNG and returns its path and timestamp.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "url": {
                "type": "string",
                "description": "URL to capture (e.g., https://example.com/form)"
            },
            "selector": {
                "type": "string",
                "description": "Optional CSS selector to capture specific element"
            }
        },
        "required": ["url"]
    }
}
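Before the handler runs, it can help to check incoming arguments against this schema. The sketch below hand-rolls only the two checks the schema above needs (required keys and string types). `SCHEMA` and `validate_args` are illustrative names, not part of any MCP SDK, and a real server would more likely rely on its framework's validation or a JSON Schema library.

```python
# Illustrative only: a tiny subset of JSON Schema checking for the tool above.
SCHEMA = {
    "type": "object",
    "properties": {
        "url": {"type": "string"},
        "selector": {"type": "string"},
    },
    "required": ["url"],
}


def validate_args(args: dict, schema: dict = SCHEMA) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    # Every key listed under "required" must be present.
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    # Known properties declared as strings must actually be strings.
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec and spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"argument {key} must be a string")
    return problems
```

Calling `validate_args({"url": "https://example.com/form"})` returns an empty list, while `validate_args({})` reports the missing `url`.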

2. Implement the handler:

import datetime
import os

import requests

def handle_capture_screenshot(url: str, selector: str = None) -> dict:
    """
    Call PageBolt API to capture screenshot.
    Store PNG locally with timestamp. Return path.
    """

    pagebolt_api_key = os.getenv("PAGEBOLT_API_KEY")
    if not pagebolt_api_key:
        return {"success": False, "error": "PAGEBOLT_API_KEY is not set"}

    payload = {
        "url": url,
        "format": "png",
        "width": 1280,
        "height": 720,
        "blockBanners": True,
        "blockAds": True
    }

    if selector:
        payload["selector"] = selector

    response = requests.post(
        "https://pagebolt.dev/api/v1/screenshot",
        headers={'x-api-key': pagebolt_api_key},
        json=payload,
        timeout=30  # avoid hanging the tool call if the API is slow
    )

    if response.status_code == 200:
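        # Store PNG with a UTC timestamp, e.g. 2026-03-07T14:32:15.555Z
        now = datetime.datetime.now(datetime.timezone.utc)
        timestamp = now.isoformat(timespec="milliseconds").replace("+00:00", "Z")
        # Colons are not allowed in filenames on some systems, so swap them out
        filename = f"screenshots/{timestamp.replace(':', '-')}.png"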
        os.makedirs("screenshots", exist_ok=True)

        with open(filename, 'wb') as f:
            f.write(response.content)

        return {
            "success": True,
            "path": filename,
            "url": url,
            "timestamp": timestamp
        }
    else:
        return {
            "success": False,
            "error": f"PageBolt API error: {response.status_code}"
        }
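The filename logic is the one part of the handler that is easy to test without a network call. As a sketch, it could be pulled into a small helper; `screenshot_filename` is a hypothetical name, and the millisecond precision and trailing `Z` are assumptions chosen to match the filenames shown in the workflow example later in this post.

```python
import datetime


def screenshot_filename(now: datetime.datetime, directory: str = "screenshots") -> str:
    """Build a filesystem-safe PNG path from a timezone-aware datetime."""
    # Normalize to UTC and trim to milliseconds, e.g. 2026-03-07T14:32:15.555+00:00
    stamp = now.astimezone(datetime.timezone.utc).isoformat(timespec="milliseconds")
    # Use the compact "Z" suffix, then replace colons, which some filesystems reject
    stamp = stamp.replace("+00:00", "Z").replace(":", "-")
    return f"{directory}/{stamp}.png"


moment = datetime.datetime(2026, 3, 7, 14, 32, 15, 555000, tzinfo=datetime.timezone.utc)
print(screenshot_filename(moment))  # screenshots/2026-03-07T14-32-15.555Z.png
```

Passing the time in as an argument, instead of reading the clock inside the function, is what makes the result deterministic in a test.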

3. Register in your MCP server:

import asyncio

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my_server")

@mcp.tool()
async def capture_screenshot(url: str, selector: str | None = None) -> str:
    # The handler does blocking I/O (requests), so run it off the event loop
    result = await asyncio.to_thread(handle_capture_screenshot, url, selector)
    if result["success"]:
        return f"Screenshot saved: {result['path']} at {result['timestamp']}"
    return f"Error: {result['error']}"

4. Set your API key:

export PAGEBOLT_API_KEY="your_api_key_here"

That's it. Your MCP server now captures visual proof.
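If the key is missing, the first sign is usually a failed tool call in the middle of an agent run. One option is to check for it when the server starts. This is a minimal sketch; `require_api_key` is a hypothetical helper, and the `env` parameter exists only so the check can be exercised without touching the real environment.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the PageBolt key, or raise a clear error if it is missing."""
    key = env.get("PAGEBOLT_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "PAGEBOLT_API_KEY is not set; export it before starting the server"
        )
    return key
```

Calling `require_api_key()` once before the server starts accepting requests turns a confusing runtime failure into an immediate startup error.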

What You Get

When an agent using your server calls capture_screenshot, it:

Takes a PNG of the URL at that exact moment
Stores it locally with ISO timestamp
Returns the path and timestamp to the agent
Gives you a timestamped visual record of what the page looked like at that moment

The agent can call it before form submission, after confirmation, whenever it needs proof.
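A PNG on local disk can still be overwritten, so for audit purposes it may be worth recording a hash of each file at capture time. The sketch below is one way to do that with the standard library. `record_digest`, `verify_digest`, and the `manifest.jsonl` path are hypothetical names, and the approach only helps if the manifest is kept somewhere the agent cannot modify.

```python
import hashlib
import json


def record_digest(png_path: str, manifest_path: str = "screenshots/manifest.jsonl") -> str:
    """Append the file's SHA-256 to a JSON Lines manifest and return the digest."""
    with open(png_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    # One JSON object per line keeps the manifest append-only and easy to scan
    with open(manifest_path, "a", encoding="utf-8") as manifest:
        manifest.write(json.dumps({"path": png_path, "sha256": digest}) + "\n")
    return digest


def verify_digest(png_path: str, expected: str) -> bool:
    """Recompute the hash and compare it with the recorded value."""
    with open(png_path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest() == expected
```

If `verify_digest` later returns `False`, the file has changed since it was captured.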

Practical Example

Agent workflow:

1. Agent: navigate to /refunds
2. Agent: capture_screenshot("https://yourapp.com/refunds")
   → Returns: screenshots/2026-03-07T14-32-15.555Z.png
3. Agent: fill form with refund details
4. Agent: click submit
5. Agent: capture_screenshot("https://yourapp.com/refunds?status=confirmed")
   → Returns: screenshots/2026-03-07T14-32-22.891Z.png
6. Agent: done — you have visual proof before and after

Later, when auditors ask "did the form actually submit?", you show them the PNG.
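The before-and-after pattern above can also be enforced in code instead of left to the agent. A rough sketch, assuming `capture` is any callable with the same shape as the handler (takes a URL, returns a dict); `with_proof` is a hypothetical helper name.

```python
from typing import Callable


def with_proof(action: Callable[[], None],
               capture: Callable[[str], dict],
               before_url: str,
               after_url: str) -> dict:
    """Capture a screenshot, run the action, then capture again."""
    before = capture(before_url)   # state before the action
    action()                       # e.g. fill and submit the form
    after = capture(after_url)     # state after the action
    return {"before": before, "after": after}
```

In practice `capture` could be `handle_capture_screenshot`, and `action` whatever function drives the form submission.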

Storing for Later

If you need persistence beyond local storage:

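# Upload to S3
import boto3

s3 = boto3.client('s3')

def upload_screenshot(filename: str, url: str, timestamp: str, trace_id: str) -> None:
    """Copy a saved screenshot to S3, tagged for correlation with agent logs."""
    with open(filename, 'rb') as f:
        s3.put_object(
            Bucket='audit-screenshots',
            Key=filename,
            Body=f,
            ContentType='image/png',
            Metadata={
                'timestamp': timestamp,
                'url': url,
                'agent-trace-id': trace_id  # correlate with agent logs
            }
        )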

Now your visual proof lives with your logs. Together, they tell the complete story.
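On the logging side, correlation works best when each capture also produces a log entry that carries the same trace ID. A minimal sketch using only the standard library; the `agent.audit` logger name, the field names, and both helper functions are assumptions for illustration.

```python
import json
import logging

logger = logging.getLogger("agent.audit")


def audit_record(trace_id: str, tool: str, screenshot_path: str, timestamp: str) -> str:
    """Serialize one audit entry linking a tool call to its screenshot."""
    return json.dumps({
        "trace_id": trace_id,
        "tool": tool,
        "screenshot": screenshot_path,
        "timestamp": timestamp,
    }, sort_keys=True)


def log_capture(trace_id: str, result: dict) -> None:
    """Log a successful capture result (the dict returned by the handler)."""
    logger.info(audit_record(trace_id, "capture_screenshot",
                             result["path"], result["timestamp"]))
```

Searching the logs for a trace ID then surfaces both the tool calls and the paths of the screenshots taken during that run.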

Next Steps

Get API key: https://pagebolt.dev/signup (100 requests/month free)
Add the tool to your MCP server (copy the handler above)
Set PAGEBOLT_API_KEY in your environment
Test: call capture_screenshot with any URL
Done
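For the test step, a quick sanity check is to confirm that the saved file really is a PNG and not an error page written to disk. Every PNG starts with the same 8-byte signature, so a sketch like the following is enough; `looks_like_png` is a hypothetical helper name.

```python
# The fixed 8-byte signature that begins every PNG file
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(path: str) -> bool:
    """Return True if the file starts with the PNG signature."""
    with open(path, "rb") as f:
        return f.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE
```

Run it against the path returned by `capture_screenshot` after your first test call.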

Your MCP server now has visual proof. Agents can show, not just tell, what they did.

MCP servers orchestrate automation invisibly. Screenshots make that invisible work visible. Five minutes of integration. Unlimited proof.
