Custodia-Admin

Posted on • Originally published at pagebolt.dev

How to Use PageBolt MCP Tools in a LangChain or LlamaIndex Agent

MCP (Model Context Protocol) isn't just for IDE assistants like Claude Desktop and Cursor. You can connect any MCP server to a Python agent framework programmatically — and get every tool it exposes automatically, with no HTTP wrappers to write.

Here's how to give a LangChain or LlamaIndex agent access to PageBolt's 8 browser tools via MCP.

Why MCP over direct API calls

When you wrap the PageBolt API manually as LangChain tools, you define one function per endpoint and maintain the schemas yourself. With MCP:

  • All tools are discovered automatically from the server — add a new MCP tool and your agent gets it without a code change
  • Tool descriptions, parameter schemas, and validation come from the server
  • Same config pattern as Claude Desktop/Cursor — one server, many clients
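
That last bullet is literal: the same server definition drops into a Claude Desktop or Cursor config unchanged. A sketch of the standard `mcpServers` entry (put your real key in `env`):

```json
{
  "mcpServers": {
    "pagebolt": {
      "command": "npx",
      "args": ["-y", "pagebolt-mcp"],
      "env": { "PAGEBOLT_API_KEY": "your-api-key" }
    }
  }
}
```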

LangChain: using langchain-mcp-adapters

The langchain-mcp-adapters package bridges MCP servers and LangChain tool interfaces.

pip install langchain-mcp-adapters langchain-openai langgraph

Connect and load tools

import asyncio
import os
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from langchain_mcp_adapters.tools import load_mcp_tools
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

async def run_agent_with_mcp():
    # Start the PageBolt MCP server as a subprocess
    server_params = StdioServerParameters(
        command="npx",
        args=["-y", "pagebolt-mcp"],
        env={"PAGEBOLT_API_KEY": os.environ["PAGEBOLT_API_KEY"]},
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Load all PageBolt tools as LangChain-compatible tools
            tools = await load_mcp_tools(session)
            print(f"Loaded {len(tools)} tools: {[t.name for t in tools]}")
            # → ['take_screenshot', 'generate_pdf', 'create_og_image',
            #    'run_sequence', 'record_video', 'inspect_page',
            #    'list_devices', 'check_usage']

            # Build a ReAct agent with the tools
            model = ChatOpenAI(model="gpt-4o", temperature=0)
            agent = create_react_agent(model, tools)

            # Run a task
            result = await agent.ainvoke({
                "messages": [
                    {
                        "role": "user",
                        "content": (
                            "Screenshot https://news.ycombinator.com "
                            "and tell me the top 5 story titles"
                        ),
                    }
                ]
            })
            return result["messages"][-1].content


if __name__ == "__main__":
    print(asyncio.run(run_agent_with_mcp()))
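
One practical note: if PAGEBOLT_API_KEY is unset, the code above dies with an opaque `KeyError` or subprocess failure. A tiny fail-fast check (my own helper, not part of any SDK) surfaces the problem up front:

```python
import os

def require_env(name: str) -> str:
    """Return the value of a required env var, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the MCP server")
    return value

# Usage in StdioServerParameters:
#   env={"PAGEBOLT_API_KEY": require_env("PAGEBOLT_API_KEY")}
```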

Multi-task agent loop

import asyncio
import os
from contextlib import AsyncExitStack
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from langchain_mcp_adapters.tools import load_mcp_tools
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

async def build_agent(stack: AsyncExitStack):
    """Build a persistent agent with PageBolt MCP tools."""
    server_params = StdioServerParameters(
        command="npx",
        args=["-y", "pagebolt-mcp"],
        env={"PAGEBOLT_API_KEY": os.environ["PAGEBOLT_API_KEY"]},
    )
    # Keep the transport and session open for the agent's lifetime by
    # registering them on the caller's AsyncExitStack (calling __aenter__
    # by hand would leak the context managers and skip cleanup on exit)
    read, write = await stack.enter_async_context(stdio_client(server_params))
    session = await stack.enter_async_context(ClientSession(read, write))
    await session.initialize()

    tools = await load_mcp_tools(session)
    model = ChatOpenAI(model="gpt-4o", temperature=0)
    agent = create_react_agent(model, tools)
    return agent


async def main():
    async with AsyncExitStack() as stack:
        agent = await build_agent(stack)

        tasks = [
            "Take a full-page screenshot of https://example.com and describe the layout",
            "Inspect https://example.com — list all buttons and links with their selectors",
            "Generate a PDF of https://example.com",
            "What devices are available for mobile screenshots?",
            "How many API requests have I used this month?",
        ]

        for task in tasks:
            print(f"\nTask: {task}")
            result = await agent.ainvoke({"messages": [{"role": "user", "content": task}]})
            print(f"Result: {result['messages'][-1].content}")


asyncio.run(main())
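
The loop above runs tasks one at a time. Since each ainvoke call is a coroutine, independent prompts can also be fanned out with asyncio.gather. A sketch of the pattern (my helper; note that whether the calls actually overlap depends on the server, since a single stdio MCP session may serialize tool calls):

```python
import asyncio

async def run_tasks_concurrently(agent, tasks: list[str]) -> list[str]:
    """Dispatch independent prompts concurrently and collect each final message."""
    async def run_one(task: str) -> str:
        result = await agent.ainvoke(
            {"messages": [{"role": "user", "content": task}]}
        )
        return result["messages"][-1].content

    # gather preserves input order in its result list
    return await asyncio.gather(*(run_one(t) for t in tasks))
```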

LlamaIndex: native MCP tool support

LlamaIndex has a built-in BasicMCPClient that connects to MCP servers directly.

pip install llama-index llama-index-tools-mcp llama-index-llms-openai

import asyncio
import os
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

async def run_llamaindex_agent():
    # Connect to the PageBolt MCP server
    mcp_client = BasicMCPClient(
        "npx",  # first positional is command_or_url
        args=["-y", "pagebolt-mcp"],
        env={"PAGEBOLT_API_KEY": os.environ["PAGEBOLT_API_KEY"]},
    )

    # Wrap as LlamaIndex tools
    mcp_tool_spec = McpToolSpec(client=mcp_client)
    tools = await mcp_tool_spec.to_tool_list_async()
    print(f"Loaded {len(tools)} tools: {[t.metadata.name for t in tools]}")

    # Build a ReAct agent
    llm = OpenAI(model="gpt-4o", temperature=0)
    agent = ReActAgent.from_tools(tools, llm=llm, verbose=True)

    # Run tasks (use achat, since we're inside an async function)
    response = await agent.achat(
        "Screenshot https://pagebolt.dev on an iPhone 14 Pro and "
        "describe the mobile layout"
    )
    print(response)

    response = await agent.achat(
        "Inspect https://example.com/login and find the CSS selectors "
        "for the username and password fields"
    )
    print(response)


asyncio.run(run_llamaindex_agent())
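
Tool discovery is dynamic, so a quick sanity check that the expected PageBolt tools actually loaded can save debugging time later. The name set mirrors the tool list above; the helper itself is mine:

```python
EXPECTED_PAGEBOLT_TOOLS = {
    "take_screenshot", "generate_pdf", "create_og_image", "run_sequence",
    "record_video", "inspect_page", "list_devices", "check_usage",
}

def missing_tools(loaded_names, expected=EXPECTED_PAGEBOLT_TOOLS):
    """Return the expected tool names absent from what the server reported."""
    return sorted(expected - set(loaded_names))

# After loading:
#   missing_tools(t.metadata.name for t in tools)  # should be []
```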

LlamaIndex: workflow with sequential tool calls

import asyncio
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

async def visual_qa_workflow(url: str) -> str:
    """Inspect a page, then screenshot it, then report findings."""
    mcp_client = BasicMCPClient(
        "npx",  # first positional is command_or_url
        args=["-y", "pagebolt-mcp"],
        env={"PAGEBOLT_API_KEY": os.environ["PAGEBOLT_API_KEY"]},
    )
    mcp_tool_spec = McpToolSpec(client=mcp_client)
    tools = await mcp_tool_spec.to_tool_list_async()

    llm = OpenAI(model="gpt-4o", temperature=0)
    agent = ReActAgent.from_tools(tools, llm=llm, verbose=True)

    result = await agent.achat(f"""
    Perform a quick audit of {url}:
    1. Use inspect_page to map all interactive elements
    2. Use take_screenshot to capture the full page
    3. Report: how many forms, buttons, and links are on the page?
       Does the visual layout look complete?
    """)
    return str(result)


result = asyncio.run(visual_qa_workflow("https://example.com"))
print(result)

Using Claude as the LLM (Anthropic + LangChain)

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from langchain_mcp_adapters.tools import load_mcp_tools
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent

async def run_with_claude():
    server_params = StdioServerParameters(
        command="npx",
        args=["-y", "pagebolt-mcp"],
        env={"PAGEBOLT_API_KEY": os.environ["PAGEBOLT_API_KEY"]},
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await load_mcp_tools(session)

            # Claude is natively multimodal — it can see the screenshots it takes
            model = ChatAnthropic(model="claude-sonnet-4-6", temperature=0)
            agent = create_react_agent(model, tools)

            result = await agent.ainvoke({
                "messages": [{
                    "role": "user",
                    "content": (
                        "Screenshot https://example.com in dark mode and "
                        "describe the color scheme and typography"
                    ),
                }]
            })
            return result["messages"][-1].content


print(asyncio.run(run_with_claude()))
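
One caveat with Anthropic models: message content may come back as a list of content blocks rather than a plain string, so `result["messages"][-1].content` is not always directly printable. A small normalizer (mine, not LangChain's) covers both cases:

```python
def final_text(result: dict) -> str:
    """Extract the last message's text, handling string and block-list content."""
    content = result["messages"][-1].content
    if isinstance(content, str):
        return content
    # Anthropic-style list of blocks: keep only the text pieces
    return "".join(
        block.get("text", "") for block in content if isinstance(block, dict)
    )
```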

What tools you get

Once connected, your agent has access to all 8 PageBolt MCP tools:

| Tool | What it does |
| --- | --- |
| take_screenshot | Screenshot any URL, HTML, or Markdown — 30+ params including device, dark mode, ad blocking |
| generate_pdf | Convert any URL or HTML to PDF |
| create_og_image | Generate social card images from templates or custom HTML |
| run_sequence | Multi-step browser automation (navigate, click, fill, screenshot) |
| record_video | Record browser automation as MP4 with cursor effects and AI voice narration |
| inspect_page | Get all interactive elements with CSS selectors — use before run_sequence |
| list_devices | List 25+ device presets (iPhone, iPad, MacBook, Galaxy, etc.) |
| check_usage | Check current API quota |
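
You don't have to hand the agent all eight. Trimming the tool list narrows the agent's action space, which often improves tool selection. A plain-Python filter, nothing framework-specific (LangChain tools expose `.name`; for LlamaIndex use `t.metadata.name` instead):

```python
def filter_tools(tools, allowed):
    """Keep only the tools whose name is in the allowed set."""
    return [t for t in tools if t.name in allowed]

# e.g. a screenshot-only agent:
#   agent = create_react_agent(model, filter_tools(tools, {"take_screenshot", "list_devices"}))
```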

MCP vs direct HTTP: when to use each

|  | MCP approach | Direct HTTP approach |
| --- | --- | --- |
| Setup | Install MCP client package, start server subprocess | Write tool wrappers manually |
| Tools available | All 8 automatically | Only what you implement |
| Schema maintenance | Server-managed | Manual |
| Best for | Agent frameworks, reusable pipelines | Lightweight scripts, specific endpoints |
| Requires Node.js | Yes (for npx pagebolt-mcp) | No |

The MCP approach is better for agent frameworks where you want all tools available and don't want to maintain HTTP wrapper code. Direct HTTP (article 104) is better for targeted scripts or environments without Node.js.


Try it free — 100 requests/month, no credit card. → Get started in 2 minutes
