DEV Community

vishalmysore

Selenium with A2A and MCP for AI Agents

Hey fellow developers! 👋 I'm excited to share something I've built, and I'd love your feedback and contributions. I've created a web automation system that combines the A2A (Agent-to-Agent) protocol and MCP (Model Context Protocol) with Selenium WebDriver. The best part? You can try it live right now!

🎮 Try the Live Demo

Want to see it in action before diving in? Here are some things to try (remember, results may vary based on your prompts):

  • Run automated web tests (start with simple navigation tasks)
  • Capture screenshots (works best with stable pages)
  • Execute natural language commands (be specific and clear in your instructions)
  • Watch agents communicate in real-time

💡 Pro Tip: When using the demo, try to:

  • Be specific in your instructions
  • Start with simple commands and gradually increase complexity
  • If something doesn't work, try rephrasing your prompt
  • Share what prompts worked best for you!

🚀 What I've Built (And You Can Too!)

I've created a web automation agent that's ready for you to use, extend, and improve. Here's what it can do:

Core Features (Try them live!):

  • Executes Selenium-based web automation tasks with natural language
  • Captures and validates UI elements automatically
  • Communicates between agents over the A2A and MCP protocols, using a proof-of-concept Java client
  • Integrates with your favorite AI models (Gemini, OpenAI, Claude, Grok)
  • Provides real-time agent communication and task monitoring

🎯 Live Demo Highlights:

  • Test web automation scenarios instantly
  • Watch agents collaborate in real-time
  • Experiment with different AI models
  • No setup required - just visit the demo URL!

🛠️ Technology Stack

  • a2ajava: The Swiss Army knife for building agent applications
  • Selenium WebDriver: For web automation
  • Spring Boot: For the application framework
  • AI Integration: Support for multiple LLM platforms

🔥 Key Features

  1. Multi-Protocol Support

    • A2A (Agent-to-Agent) protocol for agent communication
    • MCP (Model Context Protocol) for AI model integration
    • Seamless interoperability between protocols
  2. Multi-Language Support

    • Java (primary)
    • Kotlin
    • Groovy
  3. Multi-Platform AI Integration

    • Gemini
    • OpenAI
    • Claude
    • Grok
  4. Advanced Integration Features

    • Selenium automation
    • Human-in-the-loop workflows
    • Multi-LLM voting for consensus-based decisions
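
To make the "multi-LLM voting" idea above concrete, here is a minimal sketch of a strict-majority vote over answers from several models. It is not taken from a2ajava: the class `ConsensusVote` and its `majority` method are hypothetical names, and the replies are hard-coded stand-ins for real model calls.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not part of a2ajava): strict-majority vote over model answers. */
final class ConsensusVote {

    /**
     * Returns the answer given by more than half of the voters, or empty when
     * no answer reaches a strict majority. Answers are trimmed and lower-cased
     * first, so "YES" and "yes " count as the same vote.
     */
    static Optional<String> majority(List<String> answers) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String answer : answers) {
            counts.merge(answer.trim().toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() * 2 > answers.size()) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Simulated replies from three models; a real agent would call each LLM here.
        List<String> replies = List.of("yes", "Yes", "no");
        System.out.println(majority(replies).orElse("no consensus"));
    }
}
```

When no answer wins a strict majority, the sketch returns an empty result, which is a natural place to hand the decision to a human-in-the-loop step.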

💻 Get Started in Minutes

Want to join the development? Here's how to get started:

🌐 Try it Online First

Visit our live demo and try these sample commands:

  1. Navigate to a website
  2. Capture screenshots
  3. Validate UI elements
  4. Watch real-time agent communication

🔧 Local Setup for MCP

{
    "webbrowsingagent": {
        "command": "java",
        "args": [
            "-jar",
            "/work/a2a-mcp-bridge/target/mcp-connector-full.jar",
            "http://localhost:7860/"
        ],
        "timeout": 30000
    }
}
  1. Remote Server Connection
{
    "webbrowsingagent": {
        "command": "java",
        "args": [
            "-jar",
            "/work/a2a-mcp-bridge/target/mcp-connector-full.jar",
            "https://vishalmysore-a2amcpselenium.hf.space"
        ],
        "timeout": 30000
    }
}
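
Both entries above follow the same pattern: a `command`, plus `args` that point the connector jar at a server URL. As a rough sketch of what an MCP client presumably does with such an entry, the snippet below joins the two fields into a single process command. The class `McpLaunchCommand` is a hypothetical illustration, not code from this project or from any MCP client.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration: turns a server entry's fields into a process command. */
final class McpLaunchCommand {

    /** Joins the "command" field and the "args" array into one argument list. */
    static List<String> build(String command, List<String> args) {
        List<String> full = new ArrayList<>();
        full.add(command);
        full.addAll(args);
        return full;
    }

    public static void main(String[] ignored) {
        List<String> cmd = build("java", List.of(
                "-jar",
                "/work/a2a-mcp-bridge/target/mcp-connector-full.jar",
                "http://localhost:7860/"));
        // A client would typically start the process with: new ProcessBuilder(cmd).start()
        System.out.println(String.join(" ", cmd));
    }
}
```

Running the printed command by hand in a terminal is a quick way to check that the jar path and the server URL are correct before wiring the entry into a client.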

To set it up with A2A, add this server as a remote agent in your A2A client. I have tried to add as much documentation as possible, but if something is missing or not working, please let me know.
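
A2A clients usually discover a remote agent by fetching its agent card from a well-known path under the server's base URL. The sketch below only builds that URL. It assumes the `/.well-known/agent.json` path used by early versions of the A2A specification (later versions use `/.well-known/agent-card.json`), so check which one this server actually serves. `AgentCardLocator` is a hypothetical helper name.

```java
import java.net.URI;

/** Hypothetical helper: builds the URL where an A2A agent card is expected. */
final class AgentCardLocator {

    // Assumption: well-known path from early A2A spec versions; verify against the server.
    static final String WELL_KNOWN_PATH = "/.well-known/agent.json";

    /** Resolves the well-known agent card path against the agent's base URL. */
    static URI agentCardUri(String baseUrl) {
        return URI.create(baseUrl).resolve(WELL_KNOWN_PATH);
    }

    public static void main(String[] args) {
        // Base URL taken from the remote server configuration in this post.
        System.out.println(agentCardUri("https://vishalmysore-a2amcpselenium.hf.space"));
    }
}
```

If the card loads in a browser at the printed URL, the same base URL should be usable as the remote agent address in an A2A client.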

Source code is here

⚠️ Work in Progress Notice: This project is under active development and may have bugs or unstable features. The system's behavior is highly dependent on the quality and clarity of prompts provided. I am continuously improving it and welcome your feedback!
