Try It Yourself
BrowserAgent is open source on GitHub:
github.com/hcarrillo001/BrowserAgent-
Headed demo (watch the AI control the browser):
youtu.be/mJkOIn08f8Q
Headless parallel containers demo:
youtu.be/zaiAGPcbzH4
Drop a .txt file in the testcases/ folder, run one command, and watch it go.
I've been a QA engineer for a while now, and there's one problem that never goes away: QA teams are almost always outnumbered.
Development teams grow. Features ship faster. But QA stays small. Writing automated tests, maintaining them when the UI changes, keeping up with new features: it's a constant uphill battle. And the bottleneck is always the same: you need someone technical enough to write the code.
That frustration is what led me to build BrowserAgent.
What is BrowserAgent?
BrowserAgent is an AI-powered browser test automation framework where your test cases are plain English .txt files. No Selenium. No XPath. No CSS selectors.
Just this:
- Navigate to: https://the-internet.herokuapp.com/login
- Login with username "tomsmith" and password "SuperSecretPassword!"
- Verify the login was successful by checking for the message "You logged into a secure area!"
- Logout by clicking the logout button
- Verify the logout was successful
- Close the browser.
At the end output either "FINAL_RESULT: PASS" or "FINAL_RESULT: FAIL"
Claude AI reads that file, opens a real Chrome browser, and executes every step autonomously. When it's done, pytest evaluates the result and generates an HTML report with embedded JSON.
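The evaluation step can be sketched roughly like this; the function name and transcript handling here are illustrative assumptions, not BrowserAgent's actual API:

```python
import re

def evaluate_agent_output(output: str) -> bool:
    """Scan the agent's transcript for the final verdict line.

    Returns True for PASS, False for FAIL; raises if neither appears,
    so a crashed or truncated run never silently passes.
    """
    match = re.search(r"FINAL_RESULT:\s*(PASS|FAIL)", output)
    if match is None:
        raise ValueError("agent produced no FINAL_RESULT line")
    return match.group(1) == "PASS"

# A pytest test would then just assert on the helper, e.g.:
# def test_login_flow(agent_transcript):
#     assert evaluate_agent_output(agent_transcript)
```

Requiring an explicit verdict line is what makes a plain-English test machine-checkable: the agent narrates freely, but pass/fail is a single unambiguous token.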
Anyone can write a test case. Product managers, business analysts, even clients. You don't need to know how to code.
It's Not Just for Testing: It Can Scrape Too
Because BrowserAgent controls a real browser, it's not limited to test automation. You can use it to scrape websites and extract data in any format. Just describe what you want in plain English and the agent navigates the site, finds the data, and returns it as structured JSON: no API key, no scraping library required. Any site a human can browse, BrowserAgent can scrape.
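As a hypothetical example (not a file from the repo), a scraping task could be described in the same .txt format as a test case:

```
- Navigate to: https://the-internet.herokuapp.com/tables
- Extract every row of the first table
- Return the data as JSON with keys "last_name", "first_name", "email", and "due"
- Close the browser.
At the end output the JSON followed by "FINAL_RESULT: PASS"
```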
Where It Started
This isn't my first attempt at this. About six months ago I built an earlier version using the Playwright MCP server. It worked, but it was slow. MCP loads large tool schemas into context on every turn, which adds overhead and burns through tokens fast.
When the new Playwright CLI was introduced, everything changed. It's designed specifically for AI agents: compact commands, fast execution, token efficiency. I already had the foundation from my MCP project, so I iterated on it and BrowserAgent came together in a couple of days.
The Hardest Part: Running Multiple Tests at Once
The biggest challenge wasn't getting the AI to control the browser; that part was surprisingly smooth. The hard part was running multiple test cases in parallel.
My first instinct was to just tell the Playwright CLI to open multiple browsers at the same time. That didn't work. The CLI uses a single shared session, so multiple agents would bleed into each other, opening tabs in the same browser instead of separate windows.
The solution was Docker containers. Each test case gets its own completely isolated container with its own Chrome browser, filesystem, and network stack. No shared sessions, no interference. 3 test cases = 3 containers running simultaneously. The orchestrator spins them all up at once via Python threading and waits for everything to finish before generating the report.
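The orchestration pattern can be sketched roughly like this; the Docker image name, mount path, and function names are illustrative assumptions, not the project's actual code:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def container_cmd(testcase: Path) -> list[str]:
    """Build the `docker run` invocation for one test case.

    The image name and mount path are assumptions for illustration,
    not BrowserAgent's real configuration.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{testcase.resolve()}:/app/testcase.txt:ro",
        "browseragent:latest",  # hypothetical image name
    ]

def run_case(testcase: Path) -> tuple[str, int]:
    """Each thread just blocks on its own `docker run`; isolation comes from Docker."""
    result = subprocess.run(container_cmd(testcase), capture_output=True, text=True)
    return testcase.name, result.returncode

def run_all(testcase_dir: str = "testcases") -> dict[str, int]:
    cases = sorted(Path(testcase_dir).glob("*.txt"))
    # Plain threading is enough here: the Python side is pure I/O waiting,
    # and the real parallelism happens inside the containers.
    with ThreadPoolExecutor(max_workers=max(len(cases), 1)) as pool:
        return dict(pool.map(run_case, cases))
```

One container per test case means each agent gets its own Chrome, filesystem, and network stack, so the shared-session problem disappears by construction.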
It's also self-healing. Traditional automation breaks the moment a button moves or gets renamed: your selectors stop working and someone has to go fix the code. BrowserAgent uses Claude AI to interpret the page dynamically on every step. If something changes, the agent figures it out.
Why a Python Script and Not a UI Client?
A lot of people ask why I built this as a Python script instead of running it through Claude Desktop or another UI client. The answer is flexibility.
When you run it as a Python script you can do things a UI client can never do:
- Write results directly to a database
- Plug it into a CI/CD pipeline
- Schedule it to run every night automatically
- Trigger it on every deployment
- Integrate with Slack or email for notifications
Running it through a UI client locks you into that client. A Python script integrates with everything.
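For instance, writing results to a database is a few lines once the run is driven from Python. This is a minimal sketch using SQLite; the schema and function name are illustrative, not part of BrowserAgent:

```python
import sqlite3
from datetime import datetime, timezone

def record_result(db_path: str, testcase: str, passed: bool) -> None:
    """Append one test outcome so dashboards and CI can query run history."""
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS runs ("
            "  testcase TEXT, passed INTEGER, finished_at TEXT)"
        )
        conn.execute(
            "INSERT INTO runs VALUES (?, ?, ?)",
            (testcase, int(passed), datetime.now(timezone.utc).isoformat()),
        )

# e.g. after each agent run:
# record_result("results.db", "login.txt", verdict == "PASS")
```

The same hook point works for a Slack webhook, an email, or a CI status check; the script is just the place where all of those integrations attach.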
What Surprised Me
Honestly? How fast AI has moved.
Six months ago I was fighting with MCP server latency and token costs. Today I can write a plain English description of a test case and watch an AI agent execute it in a real browser in real time. That's remarkable.
But what really struck me is the bigger picture. AI is going to accelerate software development significantly. More features, faster release cycles, more code being written. That means more bugs. QA teams are already stretched thin, and that gap is only going to grow.
The answer isn't to hire more QA engineers who write Selenium scripts. The answer is to build tools that let anyone write tests. If a product manager can describe what they want to verify in plain English, they should be able to run that test themselves.
I think this might be how we test software in the future. And I'd love for someone else to look at this and see the same value.
If you're a QA engineer, SDET, or just someone frustrated with brittle test automation, I'd love to hear your thoughts. What would make this useful for your team?