How to add browser automation to any MCP server using PageBolt
You're building an AI agent with Claude. Your agent needs to interact with the web — take screenshots, generate PDFs, record demo videos, inspect page structure.
You could:
- Write Python scripts to call Puppeteer (fragile, maintenance burden)
- Manage your own headless browser pool (infrastructure overhead)
- Use PageBolt's MCP server (tool call, done)
The third option takes about 5 minutes.
PageBolt provides an open MCP server that gives Claude, Cursor, and Windsurf native access to browser automation tools. Your agent calls take_screenshot() directly. No Python. No subprocess management. No infrastructure.
What is PageBolt MCP?
MCP (Model Context Protocol) is a standard for AI agents to call external tools. PageBolt implements the MCP spec so AI agents can:
- Take screenshots of any URL
- Generate PDFs from HTML or web pages
- Record browser interactions as narrated videos
- Inspect page structure (CSS selectors, element text)
- Run multi-step browser sequences
All of these are exposed as direct tool calls in Claude Code, Cursor, and Windsurf.
PageBolt MCP translates these function calls into hosted API requests. No infrastructure on your end.
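To make that translation concrete, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends when the agent invokes a tool. The `tools/call` method and the `name`/`arguments` fields come from the MCP specification. The helper name `build_tool_call` is hypothetical, the argument names (`url`, `device`, `fullPage`) are borrowed from the examples later in this post, and passing them as flat arguments is an assumption about PageBolt's schema.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids only need to be unique per session


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumption: the URL and options are passed as flat arguments.
request = build_tool_call(
    "take_screenshot",
    {"url": "https://example.com", "device": "iphone_14_pro", "fullPage": True},
)
print(json.dumps(request, indent=2))
```

In normal use you never write this message yourself; Claude Desktop or Cursor builds it when the model decides to call the tool.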
Installation: 2 minutes
Step 1: Install PageBolt MCP globally
npm install -g pagebolt-mcp
Step 2: Get your API key
Sign up at pagebolt.dev. Free tier: 100 requests/month.
Copy your API key.
Step 3: Configure Claude Desktop
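Edit claude_desktop_config.json. On macOS it lives at ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows, at %APPDATA%\Claude\claude_desktop_config.json.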
{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}
Restart Claude Desktop. Done.
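If the config file already lists other MCP servers, hand-editing risks overwriting them. The sketch below merges the `pagebolt` entry into an existing file instead. It assumes the config is plain JSON at whatever path applies on your machine; the function name `add_pagebolt_server` and the `other` server entry are hypothetical, and the demo runs against a temporary file rather than your real config.

```python
import json
import tempfile
from pathlib import Path


def add_pagebolt_server(config_path: Path, api_key: str) -> dict:
    """Add a "pagebolt" entry to an MCP config file, keeping other servers."""
    config = {}
    if config_path.exists():
        # Treat an empty file the same as an empty JSON object.
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")
    servers = config.setdefault("mcpServers", {})
    servers["pagebolt"] = {
        "command": "pagebolt-mcp",
        "env": {"PAGEBOLT_API_KEY": api_key},
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already contains a hypothetical server.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "other-mcp"}}}', encoding="utf-8")
    merged = add_pagebolt_server(path, "YOUR_API_KEY_HERE")
    on_disk = json.loads(path.read_text(encoding="utf-8"))

print(sorted(merged["mcpServers"]))
```

The same helper works for a Cursor config, since both files use the `mcpServers` layout shown in this post.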
Step 4: Configure Cursor (optional)
Edit .cursor/mcp.json in your project root (or ~/.cursor/mcp.json to make the server available in every project):
{
  "mcpServers": {
    "pagebolt": {
      "command": "pagebolt-mcp",
      "env": {
        "PAGEBOLT_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}
Restart Cursor. Now your agent can call PageBolt tools.
Tools available
Once installed, your agent can call:
take_screenshot(url, options)
Capture a screenshot of any URL.
Agent: "Take a screenshot of example.com on mobile"
Tool: take_screenshot("https://example.com", {
  "device": "iphone_14_pro",
  "fullPage": true
})
Result: base64 PNG image
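If you handle the result outside the chat, for example in a script that post-processes tool output, the base64 string has to be decoded before it is a usable image file. This is a minimal sketch assuming you already hold the base64 text; `save_png` is a hypothetical helper, and the sample payload is a stand-in made of a PNG signature plus filler bytes, not a real screenshot.

```python
import base64
import tempfile
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def save_png(b64_data: str, out_path: Path) -> int:
    """Decode a base64 string, check it looks like a PNG, and write it to disk."""
    raw = base64.b64decode(b64_data, validate=True)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    out_path.write_bytes(raw)
    return len(raw)  # number of bytes written


# Stand-in payload: signature plus filler, only to exercise the helper.
sample = base64.b64encode(PNG_SIGNATURE + b"placeholder").decode("ascii")
with tempfile.TemporaryDirectory() as tmp:
    written = save_png(sample, Path(tmp) / "screenshot.png")

print(written)
```

Checking the signature first catches the case where the tool returned an error message instead of image data.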
generate_pdf(url, options)
Generate a PDF from a URL or HTML.
Agent: "Generate a PDF of the checkout page"
Tool: generate_pdf("https://mystore.com/checkout", {
  "format": "A4",
  "margin": "1in"
})
Result: base64 PDF
record_video(steps, narration)
Record a browser interaction as an MP4 with narration.
Agent: "Record a demo of adding a product to cart and checking out"
Tool: record_video([
  {"action": "navigate", "url": "https://mystore.com"},
  {"action": "click", "selector": "button.add-to-cart"},
  {"action": "click", "selector": "a.checkout"}
], {
  "narration": true,
  "audioGuide": "Adding product to cart. Proceeding to checkout."
})
Result: base64 MP4 video
inspect_page(url)
Get a structured map of page elements: buttons, inputs, and links, each with its CSS selector.
Agent: "Inspect the login page and find the submit button"
Tool: inspect_page("https://myapp.com/login")
Result: {
  "forms": [{
    "selector": "form.login-form",
    "inputs": [
      {"selector": "#email", "type": "text", "placeholder": "Email"},
      {"selector": "#password", "type": "password", "placeholder": "Password"}
    ],
    "buttons": [{"selector": "button[type=submit]", "text": "Sign In"}]
  }]
}
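The agent normally reads this structure itself, but the lookup is easy to reproduce. The sketch below walks the example result above and returns the selector of a button by its visible text. It assumes the result has exactly the shape shown in this post; `find_button_selector` is a hypothetical helper.

```python
from typing import Optional

# The example inspect_page result from this post, as a Python dict.
inspect_result = {
    "forms": [{
        "selector": "form.login-form",
        "inputs": [
            {"selector": "#email", "type": "text", "placeholder": "Email"},
            {"selector": "#password", "type": "password", "placeholder": "Password"},
        ],
        "buttons": [{"selector": "button[type=submit]", "text": "Sign In"}],
    }]
}


def find_button_selector(result: dict, text: str) -> Optional[str]:
    """Return the selector of the first button whose text matches, ignoring case."""
    wanted = text.strip().lower()
    for form in result.get("forms", []):
        for button in form.get("buttons", []):
            if button.get("text", "").strip().lower() == wanted:
                return button.get("selector")
    return None  # no button with that text


print(find_button_selector(inspect_result, "sign in"))  # button[type=submit]
```

The returned selector can then be used as the `selector` of a `click` step.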
run_sequence(steps, options)
Execute a multi-step browser automation sequence.
Agent: "Run through the checkout flow and tell me if it succeeds"
Tool: run_sequence([
  {"action": "navigate", "url": "https://mystore.com/checkout"},
  {"action": "fill", "selector": "input[name=email]", "value": "test@example.com"},
  {"action": "click", "selector": "button[type=submit]"},
  {"action": "wait_for", "selector": ".order-confirmation", "timeout": 5000}
], {
  "captureScreenshots": true
})
Result: success/failure + screenshot evidence
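Since every call counts against your quota, it can be worth checking a step list before sending it. The sketch below validates steps locally. The action names and their required fields are inferred from the examples in this post only, so treat `REQUIRED_FIELDS` as an assumption rather than PageBolt's full schema; `validate_steps` is a hypothetical helper.

```python
# Assumption: actions and required fields as they appear in this post's examples.
REQUIRED_FIELDS = {
    "navigate": {"url"},
    "click": {"selector"},
    "fill": {"selector", "value"},
    "wait_for": {"selector"},
}


def validate_steps(steps: list) -> list:
    """Return a list of problems; an empty list means the steps look well-formed."""
    problems = []
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in REQUIRED_FIELDS:
            problems.append(f"step {index}: unknown action {action!r}")
            continue
        missing = REQUIRED_FIELDS[action] - set(step)
        if missing:
            problems.append(f"step {index}: {action} is missing {sorted(missing)}")
    return problems


checkout = [
    {"action": "navigate", "url": "https://mystore.com/checkout"},
    {"action": "fill", "selector": "input[name=email]", "value": "test@example.com"},
    {"action": "click", "selector": "button[type=submit]"},
    {"action": "wait_for", "selector": ".order-confirmation", "timeout": 5000},
]
broken = [{"action": "fill", "selector": "#email"}, {"action": "hover"}]

print(validate_steps(checkout))  # []
print(validate_steps(broken))
```

The second call reports the missing `value` on the `fill` step and the unrecognized `hover` action.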
Practical examples
Example 1: Screenshot-taking agent
User: "Take a screenshot of each competitor's pricing page"
Agent: Calls take_screenshot() for each URL
Result: PNG images for comparison analysis
Your agent can now autonomously capture and analyze screenshots without you writing any Puppeteer code.
Example 2: Demo video generator
User: "Record a narrated demo of our checkout flow"
Agent: Calls record_video() with checkout steps + narration script
Result: MP4 demo video auto-generated, saved to S3
Instead of manually recording a 5-minute screencast, your agent does it in 30 seconds.
Example 3: PDF report automation
User: "Generate a PDF report of all product pages"
Agent:
1. Inspect each product page (using inspect_page)
2. Take screenshot
3. Generate PDF (using generate_pdf)
4. Combine into report
Result: Multi-page PDF report, no manual work
Example 4: Automated web testing
User: "Test if our checkout flow works on mobile and desktop"
Agent:
1. Run checkout sequence on iPhone preset
2. Run checkout sequence on desktop preset
3. Compare screenshots for visual regression
Result: Pass/fail with evidence (screenshots)
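The comparison step can be approximated without an image library. The sketch below fingerprints each screenshot with SHA-256 and reports pages whose bytes no longer match a stored baseline. This is an exact-match check, not a perceptual diff: any change, including dynamic content such as timestamps, is flagged, and a tolerance-based pixel comparison would need an imaging library. The page names and byte strings are stand-ins.

```python
import hashlib


def fingerprint(png_bytes: bytes) -> str:
    """Return a SHA-256 hex digest that changes if any byte of the image changes."""
    return hashlib.sha256(png_bytes).hexdigest()


def changed_pages(baseline: dict, current: dict) -> list:
    """List pages whose current screenshot has no matching baseline fingerprint."""
    return sorted(
        name for name, data in current.items()
        if baseline.get(name) != fingerprint(data)
    )


# Stand-in bytes; in practice these would be decoded screenshots.
baseline = {
    "checkout-mobile": fingerprint(b"mobile-v1"),
    "checkout-desktop": fingerprint(b"desktop-v1"),
}
current = {"checkout-mobile": b"mobile-v1", "checkout-desktop": b"desktop-v2"}

print(changed_pages(baseline, current))  # ['checkout-desktop']
```

Storing only fingerprints keeps the baseline small; keep the images too if you want to look at what changed.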
Real-world use case: AI agent that demos products
Here's a practical agent workflow:
User: "Record a demo of the new dashboard feature"
Agent:
1. Calls inspect_page() on staging dashboard
2. Gets CSS selectors for "Add Widget" button, filter controls, etc.
3. Calls record_video() with step-by-step interactions
   - Navigate to dashboard
   - Click "Add Widget"
   - Select "Sales Chart"
   - Configure chart options
   - Click "Save"
4. Adds narration: "Adding a sales chart widget to your dashboard..."
5. Returns MP4 file
Result: Professional demo video, auto-generated, ready to ship
No manual screencast. No waiting for video editor. Done in 30 seconds.
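The hand-off from inspection to recording in this workflow can be sketched as a small function that turns selectors and narration lines into `record_video` arguments. The step and option shapes follow the `record_video` example earlier in this post; the staging URL, the selectors, and the helper name `build_demo` are hypothetical.

```python
def build_demo(start_url: str, interactions: list) -> tuple:
    """Turn (selector, narration line) pairs into record_video steps and options."""
    steps = [{"action": "navigate", "url": start_url}]
    lines = []
    for selector, line in interactions:
        steps.append({"action": "click", "selector": selector})
        lines.append(line)
    # Assumption: narration text is passed as a single string, as in the earlier example.
    options = {"narration": True, "audioGuide": " ".join(lines)}
    return steps, options


# Hypothetical selectors, as they might come back from inspect_page.
steps, options = build_demo(
    "https://staging.example.com/dashboard",
    [
        ("button.add-widget", "Adding a sales chart widget to your dashboard."),
        ("button.save", "Saving the widget."),
    ],
)
print(len(steps), options["audioGuide"])
```

Keeping each narration line next to its selector makes it easier to keep the script and the clicks in sync when the page changes.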
Time comparison: Manual vs agent-driven
| Task | Manual | AI Agent (PageBolt MCP) |
|---|---|---|
| Screenshot 10 URLs | 2 minutes | 10 seconds |
| Record 5-minute demo | 20 minutes | 1 minute |
| Generate PDF report | 15 minutes | 2 minutes |
| Test checkout flow | 10 minutes | 30 seconds |
| Total time saved per week | — | 10+ hours |
Getting started
- Install: npm install -g pagebolt-mcp
- Get API key: Sign up at pagebolt.dev
- Configure: Add to Claude Desktop / Cursor config
- Use: Your agent now has browser tools natively
No infrastructure to run, and setup takes a few minutes. Your AI agent is now a power user of the web.
Limits and caveats
- Authentication: Pass cookies/headers if needed
- JavaScript rendering: Full Chromium, waits for network idle
- Localhost: Not accessible (we're a hosted service)
- Rate limits: Free tier 100 requests/month, paid tiers scale
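The free-tier limit is easy to budget with a little arithmetic. The sketch below estimates how many times a workflow can run per month, under the assumption that each tool call counts as one request; this post does not say how multi-step sequences or videos are metered, so check your plan's details. `runs_per_month` is a hypothetical helper.

```python
FREE_TIER_REQUESTS = 100  # requests per month on the free tier, from the list above


def runs_per_month(calls_per_run: int, quota: int = FREE_TIER_REQUESTS) -> int:
    """Return how many complete runs fit in the monthly quota."""
    if calls_per_run <= 0:
        raise ValueError("calls_per_run must be positive")
    return quota // calls_per_run


# Example: screenshot 10 competitor pages on 2 device presets = 20 calls per run.
print(runs_per_month(10 * 2))  # 5
```

Under that assumption, a 20-call workflow fits five times a month on the free tier.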
Next steps
Your agent can now:
- ✅ Screenshot any website
- ✅ Generate PDFs on demand
- ✅ Record and narrate videos
- ✅ Inspect page structure
- ✅ Run complex browser sequences
Use these tools to automate tasks that previously required manual work or fragile Python scripts.
Start free — 100 requests/month, no credit card. Add browser automation to your MCP server in 5 minutes.