Custodia-Admin

Posted on Mar 2 • Originally published at pagebolt.dev

How to add browser automation to any MCP server using PageBolt

#mcp #browserautomation #claude #aiagents

You're building an AI agent with Claude. Your agent needs to interact with the web — take screenshots, generate PDFs, record demo videos, inspect page structure.

You could:

1. Write your own scripts around Puppeteer (fragile, a maintenance burden)
2. Manage your own headless browser pool (infrastructure overhead)
3. Use PageBolt's MCP server (one tool call, done)

Option 3 takes 5 minutes.

PageBolt provides an open MCP server that gives Claude, Cursor, and Windsurf native access to browser automation tools. Your agent calls take_screenshot() directly. No Python. No subprocess management. No infrastructure.

What is PageBolt MCP?

MCP (Model Context Protocol) is a standard for AI agents to call external tools. PageBolt implements the MCP spec so AI agents can:

Take screenshots of any URL
Generate PDFs from HTML or web pages
Record browser interactions as narrated videos
Inspect page structure (CSS selectors, element text)
Run multi-step browser sequences

All of these are available as direct tool calls in Claude Code, Cursor, and Windsurf. PageBolt MCP translates each call into a hosted API request, so there is no infrastructure to run on your end.

Installation: 2 minutes

Step 1: Install PageBolt MCP globally

npm install -g pagebolt-mcp

Step 2: Get your API key

Sign up at pagebolt.dev and copy your API key.

Step 3: Configure Claude Desktop

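Edit claude_desktop_config.json (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows: %APPDATA%\Claude\claude_desktop_config.json):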

{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

Restart Claude Desktop. Done.
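
If the config file already lists other MCP servers, pasting the snippet above over it would drop them. Below is a minimal TypeScript sketch of a merge that keeps existing entries. The function name addPageboltServer and the sample "filesystem" entry are illustrative assumptions, not part of PageBolt or Claude Desktop.

```typescript
// Shape of the relevant part of claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

interface DesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Returns a new config with a "pagebolt" entry added, leaving other servers
// and unrelated settings untouched. (Hypothetical helper for illustration.)
function addPageboltServer(config: DesktopConfig, apiKey: string): DesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      pagebolt: {
        command: "pagebolt-mcp",
        env: { PAGEBOLT_API_KEY: apiKey },
      },
    },
  };
}

// Example: a config that already has one (made-up) server registered.
const existing: DesktopConfig = {
  mcpServers: { filesystem: { command: "some-other-server" } },
};
const merged = addPageboltServer(existing, "YOUR_API_KEY_HERE");
console.log(JSON.stringify(merged, null, 2));
```

The function returns a copy instead of mutating its input, so the original config can be kept as a backup before the new file is written.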

Step 4: Configure Cursor (optional)

Edit .cursor/mcp.json in your project root (or ~/.cursor/mcp.json to enable it for every project):

{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

Restart Cursor. Now your agent can call PageBolt tools.

Tools available

Once installed, your agent can call:

`take_screenshot(url, options)`

Capture a screenshot of any URL.

Agent: "Take a screenshot of example.com on mobile"
Tool: take_screenshot("https://example.com", {
  "device": "iphone_14_pro",
  "fullPage": true
})
Result: base64 PNG image
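
Because the result is described as a base64 PNG, one cheap sanity check before saving or forwarding it is to decode the string and compare the first eight bytes with the PNG file signature. The sketch below assumes the tool hands back a plain base64 string with no data: URI prefix, which this post does not spell out. The helper names are hypothetical.

```typescript
// Decode a base64 string into raw bytes.
// atob is a global in browsers and in Node 16 or later.
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Every PNG file starts with these eight bytes.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// True when the decoded payload begins with the PNG signature.
function looksLikePng(b64: string): boolean {
  const bytes = base64ToBytes(b64);
  return (
    bytes.length >= PNG_SIGNATURE.length &&
    PNG_SIGNATURE.every((byte, i) => bytes[i] === byte)
  );
}

console.log(looksLikePng("iVBORw0KGgo=")); // base64 of the bare signature: true
console.log(looksLikePng(btoa("not an image"))); // false
```

A check like this only confirms the file type. It says nothing about whether the page rendered as expected.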

`generate_pdf(url, options)`

Generate a PDF from a URL or HTML.

Agent: "Generate a PDF of the checkout page"
Tool: generate_pdf("https://mystore.com/checkout", {
  "format": "A4",
  "margin": "1in"
})
Result: base64 PDF

`record_video(steps, narration)`

Record a browser interaction as an MP4 with narration.

Agent: "Record a demo of adding a product to cart and checking out"
Tool: record_video([
  {"action": "navigate", "url": "https://mystore.com"},
  {"action": "click", "selector": "button.add-to-cart"},
  {"action": "click", "selector": "a.checkout"}
], {
  "narration": true,
  "audioGuide": "Adding product to cart. Proceeding to checkout."
})
Result: base64 MP4 video

`inspect_page(url)`

Get a structured map of page elements: buttons, inputs, and links, each with its CSS selector.

Agent: "Inspect the login page and find the submit button"
Tool: inspect_page("https://myapp.com/login")
Result: {
  "forms": [{
    "selector": "form.login-form",
    "inputs": [
      {"selector": "#email", "type": "text", "placeholder": "Email"},
      {"selector": "#password", "type": "password", "placeholder": "Password"}
    ],
    "buttons": [{"selector": "button[type=submit]", "text": "Sign In"}]
  }]
}
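
To turn that result into a selector an agent (or a script) can click, it is enough to walk the forms and match on button text. The sketch below assumes the result has exactly the shape shown in the example above; the interface names and findButtonSelector are hypothetical, and a real response may carry more fields.

```typescript
// Types modelled on the example result above (an assumption, not a published schema).
interface InspectedInput { selector: string; type: string; placeholder?: string; }
interface InspectedButton { selector: string; text: string; }
interface InspectedForm { selector: string; inputs: InspectedInput[]; buttons: InspectedButton[]; }
interface InspectResult { forms: InspectedForm[]; }

// Returns the selector of the first button whose text matches, ignoring case
// and surrounding whitespace, or undefined when nothing matches.
function findButtonSelector(result: InspectResult, text: string): string | undefined {
  const wanted = text.trim().toLowerCase();
  for (const form of result.forms) {
    for (const button of form.buttons) {
      if (button.text.trim().toLowerCase() === wanted) {
        return button.selector;
      }
    }
  }
  return undefined;
}

// The login page result from the example above.
const loginPage: InspectResult = {
  forms: [{
    selector: "form.login-form",
    inputs: [
      { selector: "#email", type: "text", placeholder: "Email" },
      { selector: "#password", type: "password", placeholder: "Password" },
    ],
    buttons: [{ selector: "button[type=submit]", text: "Sign In" }],
  }],
};

console.log(findButtonSelector(loginPage, "sign in")); // button[type=submit]
```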

`run_sequence(steps, options)`

Execute a multi-step browser automation sequence.

Agent: "Run through the checkout flow and tell me if it succeeds"
Tool: run_sequence([
  {"action": "navigate", "url": "https://mystore.com/checkout"},
  {"action": "fill", "selector": "input[name=email]", "value": "test@example.com"},
  {"action": "click", "selector": "button[type=submit]"},
  {"action": "wait_for", "selector": ".order-confirmation", "timeout": 5000}
], {
  "captureScreenshots": true
})
Result: success/failure + screenshot evidence
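
The step objects here and in record_video share a small vocabulary: navigate, click, fill, and wait_for. Assuming those four actions and their fields are the whole picture (the examples suggest this but do not guarantee it), a typed model can catch malformed steps before any request is spent. The Step type and validateSteps are hypothetical, and the rule that a sequence should start with navigate is an extra assumption of this sketch.

```typescript
// Step shapes inferred from the examples in this post; the real schema may differ.
type Step =
  | { action: "navigate"; url: string }
  | { action: "click"; selector: string }
  | { action: "fill"; selector: string; value: string }
  | { action: "wait_for"; selector: string; timeout?: number };

// Returns human-readable problems; an empty array means nothing obvious is wrong.
function validateSteps(steps: Step[]): string[] {
  const problems: string[] = [];
  steps.forEach((step, i) => {
    if (step.action === "navigate") {
      if (!/^https?:\/\//.test(step.url)) {
        problems.push(`step ${i}: url must start with http:// or https://`);
      }
      return;
    }
    // click, fill and wait_for all target an element by CSS selector.
    if (step.selector.trim() === "") {
      problems.push(`step ${i}: selector is empty`);
    }
    if (step.action === "wait_for" && step.timeout !== undefined && step.timeout <= 0) {
      problems.push(`step ${i}: timeout must be positive`);
    }
  });
  // Assumption of this sketch: a sequence should open a page before interacting.
  if (steps.length > 0 && steps[0]?.action !== "navigate") {
    problems.push("first step should be a navigate action");
  }
  return problems;
}

// The checkout sequence from the example above.
const checkout: Step[] = [
  { action: "navigate", url: "https://mystore.com/checkout" },
  { action: "fill", selector: "input[name=email]", value: "test@example.com" },
  { action: "click", selector: "button[type=submit]" },
  { action: "wait_for", selector: ".order-confirmation", timeout: 5000 },
];

console.log(validateSteps(checkout)); // []
console.log(validateSteps([{ action: "click", selector: "" }])); // two problems
```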

Practical examples

Example 1: Screenshot-taking agent

User: "Take a screenshot of each competitor's pricing page"
Agent: Calls take_screenshot() for each URL
Result: PNG images for comparison analysis

Your agent can now autonomously capture and analyze screenshots without you writing any Puppeteer code.

Example 2: Demo video generator

User: "Record a narrated demo of our checkout flow"
Agent: Calls record_video() with checkout steps + narration script
Result: MP4 demo video auto-generated, saved to S3

Instead of manually recording a 5-minute screencast, your agent produces it in about a minute.

Example 3: PDF report automation

User: "Generate a PDF report of all product pages"
Agent: 
  1. Inspect each product page (using inspect_page)
  2. Take screenshot
  3. Generate PDF (using generate_pdf)
  4. Combine into report
Result: Multi-page PDF report, no manual work

Example 4: Automated web testing

User: "Test if our checkout flow works on mobile and desktop"
Agent:
  1. Run checkout sequence on iPhone preset
  2. Run checkout sequence on desktop preset
  3. Compare screenshots for visual regression
Result: Pass/fail with evidence (screenshots)

Real-world use case: AI agent that demos products

Here's a practical agent workflow:

User: "Record a demo of the new dashboard feature"

Agent:
1. Calls inspect_page() on staging dashboard
2. Gets CSS selectors for "Add Widget" button, filter controls, etc.
3. Calls record_video() with step-by-step interactions
   - Navigate to dashboard
   - Click "Add Widget"
   - Select "Sales Chart"
   - Configure chart options
   - Click "Save"
4. Adds narration: "Adding a sales chart widget to your dashboard..."
5. Returns MP4 file

Result: Professional demo video, auto-generated, ready to ship

No manual screencast. No waiting for video editor. Done in 30 seconds.

Time comparison: manual vs agent-driven

| Task | Manual | AI agent (PageBolt MCP) |
| --- | --- | --- |
| Screenshot 10 URLs | 2 minutes | 10 seconds |
| Record 5-minute demo | 20 minutes | 1 minute |
| Generate PDF report | 15 minutes | 2 minutes |
| Test checkout flow | 10 minutes | 30 seconds |
| Total time saved per week | n/a | 10+ hours |
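
The weekly total depends on how often these tasks come up. As a rough worked example using only the estimates in the table above, the sketch below adds up the time saved by doing each task once and then works out how many such rounds per week it takes to reach 10 hours. The variable names are illustrative.

```typescript
interface TaskTiming {
  task: string;
  manualSeconds: number;
  agentSeconds: number;
}

// Estimates copied from the comparison table, converted to seconds.
const timings: TaskTiming[] = [
  { task: "Screenshot 10 URLs", manualSeconds: 2 * 60, agentSeconds: 10 },
  { task: "Record 5-minute demo", manualSeconds: 20 * 60, agentSeconds: 60 },
  { task: "Generate PDF report", manualSeconds: 15 * 60, agentSeconds: 2 * 60 },
  { task: "Test checkout flow", manualSeconds: 10 * 60, agentSeconds: 30 },
];

// Time saved by running every task in the table once.
const savedPerRound = timings.reduce(
  (sum, t) => sum + (t.manualSeconds - t.agentSeconds),
  0,
);

// Number of full rounds per week needed to save at least 10 hours.
const tenHoursInSeconds = 10 * 60 * 60;
const roundsForTenHours = Math.ceil(tenHoursInSeconds / savedPerRound);

console.log(`Saved per round: ${savedPerRound} seconds`); // 2600 seconds, about 43 minutes
console.log(`Rounds needed for 10 hours: ${roundsForTenHours}`); // 14
```

On these numbers, the 10+ hours figure corresponds to running all four tasks about twice every working day.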

Getting started

1. Install: npm install -g pagebolt-mcp
2. Get an API key: sign up at pagebolt.dev
3. Configure: add the server to your Claude Desktop or Cursor config
4. Use: your agent now has browser tools natively

No infrastructure to manage, and setup takes a few minutes. Your AI agent is now a power user of the web.

Limits and caveats

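Authentication: Pass cookies or headers for pages that require a login
JavaScript rendering: Pages render in full Chromium, and capture waits for network idle
Localhost: Not accessible, because PageBolt is a hosted service and cannot reach your machine
Rate limits: The free tier allows 100 requests/month; paid tiers scale beyond that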
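
Since a hosted service cannot reach localhost, and every request counts against the monthly quota, it can be worth rejecting obviously unreachable URLs before calling a tool. The sketch below is one way to do that. isReachableFromHostedService is a hypothetical helper, and the list of blocked ranges (loopback plus the common private IPv4 ranges) is an assumption of this sketch, not a statement of what PageBolt itself checks.

```typescript
// Returns the lowercased hostname, or null when the input is not an absolute URL.
function parseHost(rawUrl: string): string | null {
  try {
    return new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return null;
  }
}

// Heuristic: false for loopback and private-network targets, true otherwise.
function isReachableFromHostedService(rawUrl: string): boolean {
  const host = parseHost(rawUrl);
  if (host === null) {
    return false;
  }
  // localhost, *.localhost, and the IPv6 loopback address.
  if (/(^|\.)localhost$/.test(host) || host === "[::1]") {
    return false;
  }
  const ipv4 = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (ipv4 === null) {
    return true; // a regular hostname; assume it resolves publicly
  }
  const a = Number(ipv4[1]);
  const b = Number(ipv4[2]);
  const isPrivate =
    a === 127 ||
    a === 10 ||
    (a === 192 && b === 168) ||
    (a === 172 && b >= 16 && b <= 31);
  return !isPrivate;
}

console.log(isReachableFromHostedService("http://localhost:3000")); // false
console.log(isReachableFromHostedService("https://example.com")); // true
```

A hostname that resolves to a private address through DNS would still pass this check, so treat it as a first filter only.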

Next steps

Your agent can now:

✅ Screenshot any website
✅ Generate PDFs on demand
✅ Record and narrate videos
✅ Inspect page structure
✅ Run complex browser sequences

Use these tools to automate tasks that previously required manual work or fragile Python scripts.

Start free — 100 requests/month, no credit card. Add browser automation to your MCP server in 5 minutes.
