Get Your Hands Dirty - AgentCore - Browser

#agents #ai #automation #aws

Strands Agents with Bedrock AgentCore Browser

Integrating Strands Agents with the Amazon Bedrock AgentCore Browser enables AI agents to interact with web browsers dynamically in a managed environment. This capability allows agents to move beyond static text and execute complex web-based tasks such as navigating websites, programmatically filling out forms, and extracting real-time information from web pages.

The Browser tool provides a secure, isolated sandbox where the agent can utilize the Playwright library to connect to a remote browser session via a WebSocket. This allows the agent to interact with page elements, execute JavaScript, and capture screenshots to verify its actions.

A. Direct Browser Interaction (Non-Agentic)

You can use the AgentCore Browser to perform manual automation tasks by connecting to a session and using Playwright commands to navigate and capture data.

import boto3
from bedrock_agentcore.tools.browser_client import browser_session
from playwright.async_api import async_playwright

region = "us-east-1"

# Connect to a managed browser session
with browser_session(region) as client:
    ws_url, headers = client.generate_ws_headers()

    async with async_playwright() as playwright:
        browser = await playwright.chromium.connect_over_cdp(endpoint_url=ws_url, headers=headers)
        page = await browser.new_page()

        # Navigate and capture a screenshot
        await page.goto("https://builder.aws.com/")
        await page.screenshot(path="screenshot.jpg")

B. Autonomous Web Interaction with Strands Agent

By adding the AgentCoreBrowser tool to a Strands Agent, the agent can autonomously browse the web to find information, summarize articles, or perform transactions based on natural language instructions.

import boto3
from strands import Agent
from strands.models import BedrockModel
from strands_tools.browser import AgentCoreBrowser

# Initialize the browser tool
agentcore_browser = AgentCoreBrowser(region="us-east-1")

# Create an agent with browsing capabilities
agent = Agent(
    model=BedrockModel(model_id="us.amazon.nova-pro-v1:0"),
    tools=[agentcore_browser.browser],
)

# Request the agent to perform a multi-step web task
query = ("Go to https://builder.aws.com/learn/topics/amazon-bedrock-agentcore, "
         "find the first article link, and summarize its content.")
response = await agent.invoke_async(query)
print(response.message["content"][0]["text"])

Key Takeaway: Bedrock AgentCore Browser transforms agents into interactive web operators. Whether for automated quality assurance testing, real-time data collection, or complex business process automation, the agent can navigate the internet with the same visual and interactive capabilities as a human user.