DEV Community

Custodia-Admin

Originally published at pagebolt.dev

How to add browser automation to any MCP server using PageBolt

You're building an AI agent with Claude. Your agent needs to interact with the web: take screenshots, generate PDFs, record demo videos, and inspect page structure.

You could:

  1. Write Python scripts that shell out to Puppeteer, a Node.js library (fragile, maintenance burden)
  2. Manage your own headless browser pool (infrastructure overhead)
  3. Use PageBolt's MCP server (tool call, done)

Option 3 takes 5 minutes.

PageBolt provides an open MCP server that gives Claude, Cursor, and Windsurf native access to browser automation tools. Your agent calls take_screenshot() directly. No Python. No subprocess management. No infrastructure.

What is PageBolt MCP?

MCP (Model Context Protocol) is a standard for AI agents to call external tools. PageBolt implements the MCP spec so AI agents can:

  • Take screenshots of any URL
  • Generate PDFs from HTML or web pages
  • Record browser interactions as narrated videos
  • Inspect page structure (CSS selectors, element text)
  • Run multi-step browser sequences

All of these are exposed as direct tool calls in Claude Code, Cursor, and Windsurf. PageBolt MCP translates each call into a hosted API request, so there is no infrastructure to run on your end.

Installation: 2 minutes

Step 1: Install PageBolt MCP globally

npm install -g pagebolt-mcp

Step 2: Get your API key

Sign up at pagebolt.dev. Free tier: 100 requests/month.

Copy your API key.

Step 3: Configure Claude Desktop

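Edit claude_desktop_config.json. On macOS it lives at ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows, at %APPDATA%\Claude\claude_desktop_config.json: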

{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

Restart Claude Desktop. Done.
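
If your config file already lists other MCP servers, pasting the snippet over it would wipe them out. Here is a minimal sketch of merging the PageBolt entry into whatever is already there. The helper name add_pagebolt_server is made up for illustration and is not part of PageBolt.

```python
import json


def add_pagebolt_server(config_text: str, api_key: str) -> str:
    """Return config JSON with a "pagebolt" entry added, keeping other servers."""
    # An empty file is treated as an empty config object.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as the snippet above: the command plus the API key in env.
    servers["pagebolt"] = {
        "command": "pagebolt-mcp",
        "env": {"PAGEBOLT_API_KEY": api_key},
    }
    return json.dumps(config, indent=2)


# "other-server" is a hypothetical entry that was already configured.
existing = '{"mcpServers": {"other-server": {"command": "other-mcp"}}}'
merged = json.loads(add_pagebolt_server(existing, "YOUR_API_KEY_HERE"))
print(sorted(merged["mcpServers"]))
```

The existing entry survives and the PageBolt entry sits next to it.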

Step 4: Configure Cursor (optional)

Edit .cursor/mcp.json in your project root (or ~/.cursor/mcp.json to enable the server in every project):

{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

Restart Cursor. Now your agent can call PageBolt tools.
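
A stray comma or a forgotten placeholder is the usual reason a server fails to show up after a restart. The sketch below checks a config for the fields used in this post. The function name config_problems is hypothetical, and it only validates what the snippets above contain, not everything an MCP client accepts.

```python
import json


def config_problems(config_text: str) -> list[str]:
    """Return a list of problems found in an MCP config; empty means it looks fine."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} at line {exc.lineno}"]
    if not isinstance(config, dict):
        return ["top level of the config must be a JSON object"]
    entry = config.get("mcpServers", {}).get("pagebolt")
    if entry is None:
        return ['no "pagebolt" entry under "mcpServers"']
    problems = []
    if entry.get("command") != "pagebolt-mcp":
        problems.append('"command" is not "pagebolt-mcp"')
    key = entry.get("env", {}).get("PAGEBOLT_API_KEY", "")
    if not key or key == "YOUR_API_KEY_HERE":
        problems.append("PAGEBOLT_API_KEY is missing or still the placeholder")
    return problems


sample = """{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {"PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"}
    }
  }
}"""
print(config_problems(sample))
```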

Tools available

Once installed, your agent can call:

take_screenshot(url, options)

Capture a screenshot of any URL.

Agent: "Take a screenshot of example.com on mobile"
Tool: take_screenshot("https://example.com", {
  "device": "iphone_14_pro",
  "fullPage": true
})
Result: base64 PNG image
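
For the curious: an MCP client turns a tool call into a JSON-RPC 2.0 message with the method tools/call. The sketch below builds one for the screenshot call above. The envelope follows the MCP spec, but the argument layout (url, device, and fullPage as sibling keys) is an assumption based on the example; the real schema is whatever the PageBolt server advertises.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 message for MCP's tools/call method."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Assumed argument names, copied from the example call in this post.
message = tools_call_request(1, "take_screenshot", {
    "url": "https://example.com",
    "device": "iphone_14_pro",
    "fullPage": True,
})
print(message)
```

You never write this message yourself. Claude, Cursor, or Windsurf generates it when the agent decides to call the tool.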

generate_pdf(url, options)

Generate a PDF from a URL or HTML.

Agent: "Generate a PDF of the checkout page"
Tool: generate_pdf("https://mystore.com/checkout", {
  "format": "A4",
  "margin": "1in"
})
Result: base64 PDF
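
If you handle the result outside the agent, you will need to decode it first. This is a small sketch that assumes the PDF arrives as a plain base64 string; how PageBolt wraps it in the tool response is not shown in this post. The sample payload is a stand-in, not real API output.

```python
import base64


def decode_pdf(b64_data: str) -> bytes:
    """Decode a base64 string and confirm the bytes look like a PDF."""
    data = base64.b64decode(b64_data, validate=True)
    # Every PDF file begins with the marker "%PDF-".
    if not data.startswith(b"%PDF-"):
        raise ValueError("decoded data does not start with the PDF header")
    return data


# Stand-in payload so the sketch runs without calling the API.
sample = base64.b64encode(b"%PDF-1.7\n% stand-in bytes").decode("ascii")
pdf_bytes = decode_pdf(sample)
print(len(pdf_bytes), "bytes decoded")
```

From there, saving the file is one line with pathlib, for example Path("checkout.pdf").write_bytes(pdf_bytes). The same check works for screenshots if you test for the PNG signature b"\x89PNG\r\n\x1a\n" instead.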

record_video(steps, narration)

Record a browser interaction as an MP4 with narration.

Agent: "Record a demo of adding a product to cart and checking out"
Tool: record_video([
  {"action": "navigate", "url": "https://mystore.com"},
  {"action": "click", "selector": "button.add-to-cart"},
  {"action": "click", "selector": "a.checkout"}
], {
  "narration": true,
  "audioGuide": "Adding product to cart. Proceeding to checkout."
})
Result: base64 MP4 video
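
Keeping the narration next to the step it describes makes a script easier to maintain than one long audioGuide string. The sketch below pairs each step with an optional line and then splits them into the two arguments shown above. The helper build_recording is hypothetical, and treating audioGuide as a single joined string is an assumption taken from the example.

```python
def build_recording(scripted_steps):
    """Split (step, narration line) pairs into a steps list and narration options."""
    steps = [step for step, _ in scripted_steps]
    # Steps with an empty line are recorded silently.
    lines = [line for _, line in scripted_steps if line]
    options = {"narration": bool(lines), "audioGuide": " ".join(lines)}
    return steps, options


steps, options = build_recording([
    ({"action": "navigate", "url": "https://mystore.com"}, ""),
    ({"action": "click", "selector": "button.add-to-cart"}, "Adding product to cart."),
    ({"action": "click", "selector": "a.checkout"}, "Proceeding to checkout."),
])
print(options["audioGuide"])
```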

inspect_page(url)

Get a structured map of page elements: buttons, inputs, and links, each with its CSS selector.

Agent: "Inspect the login page and find the submit button"
Tool: inspect_page("https://myapp.com/login")
Result: {
  "forms": [{
    "selector": "form.login-form",
    "inputs": [
      {"selector": "#email", "type": "text", "placeholder": "Email"},
      {"selector": "#password", "type": "password", "placeholder": "Password"}
    ],
    "buttons": [{"selector": "button[type=submit]", "text": "Sign In"}]
  }]
}
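
Once you have that map, pulling out a selector is a short loop. The sketch below assumes the result has exactly the shape shown above (a forms list whose entries carry a buttons list); the helper find_button is made up for illustration.

```python
def find_button(page_map: dict, text: str):
    """Return the selector of the first button whose text matches, else None."""
    wanted = text.strip().lower()
    for form in page_map.get("forms", []):
        for button in form.get("buttons", []):
            if button.get("text", "").strip().lower() == wanted:
                return button["selector"]
    return None


# The inspect_page result from the example above.
result = {
    "forms": [{
        "selector": "form.login-form",
        "inputs": [
            {"selector": "#email", "type": "text", "placeholder": "Email"},
            {"selector": "#password", "type": "password", "placeholder": "Password"},
        ],
        "buttons": [{"selector": "button[type=submit]", "text": "Sign In"}],
    }]
}
print(find_button(result, "sign in"))
```

The selector it returns can be passed straight into a click step.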

run_sequence(steps, options)

Execute a multi-step browser automation sequence.

Agent: "Run through the checkout flow and tell me if it succeeds"
Tool: run_sequence([
  {"action": "navigate", "url": "https://mystore.com/checkout"},
  {"action": "fill", "selector": "input[name=email]", "value": "test@example.com"},
  {"action": "click", "selector": "button[type=submit]"},
  {"action": "wait_for", "selector": ".order-confirmation", "timeout": 5000}
], {
  "captureScreenshots": true
})
Result: success/failure + screenshot evidence
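
On the free tier every wasted request hurts, so it can pay to sanity-check a sequence before sending it. This sketch only knows the four actions that appear in this post (navigate, click, fill, wait_for) and the keys each one uses in the examples. The server may accept more actions, so treat the table as an assumption rather than the official schema.

```python
# Keys each action needs, inferred from the examples in this post.
REQUIRED_KEYS = {
    "navigate": {"url"},
    "click": {"selector"},
    "fill": {"selector", "value"},
    "wait_for": {"selector"},
}


def step_errors(steps):
    """Return one message per malformed step; an empty list means all steps pass."""
    errors = []
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in REQUIRED_KEYS:
            errors.append(f"step {index}: unknown action {action!r}")
            continue
        missing = REQUIRED_KEYS[action] - set(step)
        if missing:
            errors.append(f"step {index}: {action} is missing {sorted(missing)}")
    return errors


checkout = [
    {"action": "navigate", "url": "https://mystore.com/checkout"},
    {"action": "fill", "selector": "input[name=email]", "value": "test@example.com"},
    {"action": "click", "selector": "button[type=submit]"},
    {"action": "wait_for", "selector": ".order-confirmation", "timeout": 5000},
]
print(step_errors(checkout))
```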

Practical examples

Example 1: Screenshot-taking agent

User: "Take a screenshot of each competitor's pricing page"
Agent: Calls take_screenshot() for each URL
Result: PNG images for comparison analysis

Your agent can now autonomously capture and analyze screenshots without you writing any Puppeteer code.
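
Batch jobs like this are where the free tier's 100 requests/month runs out fastest. The sketch below trims a URL list to the remaining quota before the agent starts. It assumes each take_screenshot call counts as one request, which this post does not spell out, and the URLs are placeholders.

```python
def plan_screenshots(urls, used_this_month, monthly_limit=100):
    """Split URLs into those that fit the remaining quota and those that do not."""
    remaining = max(monthly_limit - used_this_month, 0)
    # dict.fromkeys drops duplicate URLs while keeping the original order.
    unique = list(dict.fromkeys(urls))
    return unique[:remaining], unique[remaining:]


competitors = [
    "https://example.com/pricing",
    "https://example.org/pricing",
    "https://example.com/pricing",  # duplicate, dropped before counting
    "https://example.net/pricing",
]
to_capture, deferred = plan_screenshots(competitors, used_this_month=98)
print(to_capture)
print(deferred)
```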

Example 2: Demo video generator

User: "Record a narrated demo of our checkout flow"
Agent: Calls record_video() with checkout steps + narration script
Result: MP4 demo video auto-generated, saved to S3

Instead of manually recording a 5-minute screencast, your agent produces one in about a minute.

Example 3: PDF report automation

User: "Generate a PDF report of all product pages"
Agent: 
  1. Inspect each product page (using inspect_page)
  2. Take screenshot
  3. Generate PDF (using generate_pdf)
  4. Combine into report
Result: Multi-page PDF report, no manual work

Example 4: Automated web testing

User: "Test if our checkout flow works on mobile and desktop"
Agent:
  1. Run checkout sequence on iPhone preset
  2. Run checkout sequence on desktop preset
  3. Compare screenshots for visual regression
Result: Pass/fail with evidence (screenshots)

Real-world use case: AI agent that demos products

Here's a practical agent workflow:

User: "Record a demo of the new dashboard feature"

Agent:
1. Calls inspect_page() on staging dashboard
2. Gets CSS selectors for "Add Widget" button, filter controls, etc.
3. Calls record_video() with step-by-step interactions
   - Navigate to dashboard
   - Click "Add Widget"
   - Select "Sales Chart"
   - Configure chart options
   - Click "Save"
4. Adds narration: "Adding a sales chart widget to your dashboard..."
5. Returns MP4 file

Result: Professional demo video, auto-generated, ready to ship

No manual screencast. No waiting on a video editor. Done in 30 seconds.

Cost comparison: Manual vs agent-driven

Task                    Manual        AI Agent (PageBolt MCP)
Screenshot 10 URLs      2 minutes     10 seconds
Record 5-minute demo    20 minutes    1 minute
Generate PDF report     15 minutes    2 minutes
Test checkout flow      10 minutes    30 seconds

Total time saved per week: 10+ hours
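
To see what the weekly figure implies, here is the arithmetic on the numbers from the table above. Doing each task once saves a little over 43 minutes, so 10 hours a week corresponds to running the full set about 14 times. The run count is derived here, not stated by PageBolt.

```python
import math

# (manual seconds, agent seconds) per task, taken from the table above.
tasks = {
    "Screenshot 10 URLs": (2 * 60, 10),
    "Record 5-minute demo": (20 * 60, 60),
    "Generate PDF report": (15 * 60, 2 * 60),
    "Test checkout flow": (10 * 60, 30),
}

manual_total = sum(manual for manual, _ in tasks.values())  # 2820 s = 47 min
agent_total = sum(agent for _, agent in tasks.values())     # 220 s = 3 min 40 s
saved_per_run = manual_total - agent_total                  # 2600 s

# Full runs per week needed before the savings reach 10 hours.
runs_for_ten_hours = math.ceil(10 * 3600 / saved_per_run)

print(f"{saved_per_run / 60:.1f} minutes saved per run")
print(f"{runs_for_ten_hours} runs per week to save 10 hours")
```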

Getting started

  1. Install: npm install -g pagebolt-mcp
  2. Get API key: Sign up at pagebolt.dev
  3. Configure: Add to Claude Desktop / Cursor config
  4. Use: Your agent now has browser tools natively

No infrastructure to run and almost no setup. Your AI agent is now a power user of the web.

Limits and caveats

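  • Authentication: Pages behind a login need cookies or headers passed along with the request
  • JavaScript rendering: Pages render in full Chromium, and capture waits for network idle
  • Localhost: Not reachable, because PageBolt is a hosted service and cannot see your machine
  • Rate limits: The free tier allows 100 requests/month; paid tiers scale beyond that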
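
The localhost caveat is easy to trip over when an agent is pointed at a dev server. The sketch below flags URLs that a hosted service cannot reach, using only the standard library. It is a rough pre-check under the assumption that plain hostnames are public: it does not resolve DNS, so an internal hostname would still slip through.

```python
import ipaddress
from urllib.parse import urlparse


def reachable_from_hosted_service(url: str) -> bool:
    """Return False for URLs that point at this machine or a private network."""
    host = urlparse(url).hostname
    if host is None:
        return False  # no scheme or no host, so nothing to fetch
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name rather than an IP literal; assumed public in this sketch.
        return True
    return not (address.is_loopback or address.is_private or address.is_link_local)


for candidate in ["https://example.com", "http://localhost:3000", "http://192.168.1.5/app"]:
    print(candidate, reachable_from_hosted_service(candidate))
```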

Next steps

Your agent can now:

  • ✅ Screenshot any website
  • ✅ Generate PDFs on demand
  • ✅ Record and narrate videos
  • ✅ Inspect page structure
  • ✅ Run complex browser sequences

Use these tools to automate tasks that previously required manual work or fragile Python scripts.

Start free — 100 requests/month, no credit card. Add browser automation to your MCP server in 5 minutes.
