DEV Community

Broxhq

I built an AI agent that runs manual test cases in a real browser

The problem

Every deploy meant the same manual test steps: log in, open the form,
fill the fields, check the result. Over and over.

I wanted to skip the Playwright/Selenium boilerplate and just
paste my existing test cases as plain text.

What I built

qpilot — an AI agent that reads your manual test case and
executes it in a real Chrome browser step by step.

You write this:

  1. Go to https://myapp.com/login
  2. Enter email and password
  3. Click Login
  4. Verify dashboard is visible

The agent opens Chrome, clicks, fills forms, and reports
pass/fail/warn per step with evidence from the page.
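
To make the input and output concrete, here is a minimal sketch of how a pasted test case could be split into steps, and what a per-step result might look like. The names `TestStep`, `StepResult`, and `parseTestCase` are hypothetical and chosen for illustration; the post does not show qpilot's internal types.

```typescript
// Hypothetical shapes; qpilot's real internals are not shown in the post.
type StepStatus = "pass" | "fail" | "warn";

interface TestStep {
  index: number; // 1-based position in the test case
  text: string;  // the instruction as the tester wrote it
}

interface StepResult {
  step: TestStep;
  status: StepStatus;
  evidence: string; // e.g. text found on the page that justifies the status
}

// Split a pasted test case into steps. A line starting with "1." or "1)"
// opens a new step; any other non-empty line continues the previous step.
// Text that appears before the first numbered line is ignored.
function parseTestCase(raw: string): TestStep[] {
  const steps: TestStep[] = [];
  for (const line of raw.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    const match = /^(\d+)[.)]\s+(.*)$/.exec(trimmed);
    if (match) {
      steps.push({ index: steps.length + 1, text: match[2] ?? "" });
    } else {
      const last = steps[steps.length - 1];
      if (last) last.text += " " + trimmed;
    }
  }
  return steps;
}
```

Steps are renumbered by position rather than by the number typed, so a test case with a skipped or repeated number still yields a clean sequence.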

If it hits an OTP or captcha, it pauses and asks you directly.

How it works

  • Playwright controls the real Chrome browser
  • Each step: snapshot → action → snapshot → report
  • Claude Haiku reads the snapshot (ARIA tree) and decides what to click
  • Element refs (e.g. e12) are used for precise targeting
  • Context window is managed to avoid hitting token limits
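
The snapshot → action → snapshot → report cycle from the list above can be sketched roughly as follows. This is not qpilot's actual code: `BrowserDriver`, `Model`, `Action`, `Verdict`, and `runStep` are hypothetical names, and the browser and the model sit behind small interfaces so the loop itself stays readable. A real driver would presumably be backed by Playwright and the model by a Claude Haiku call.

```typescript
// All names here are hypothetical; this only illustrates the loop's shape.
type StepStatus = "pass" | "fail" | "warn";

interface Action {
  kind: "click" | "fill" | "navigate" | "none";
  ref?: string;   // element ref taken from the snapshot, e.g. "e12"
  value?: string; // text to type, or URL to open
}

interface Verdict {
  status: StepStatus;
  evidence: string;
}

// Thin wrapper around the browser.
interface BrowserDriver {
  snapshot(): Promise<string>; // ARIA tree as text, including element refs
  perform(action: Action): Promise<void>;
}

// Thin wrapper around the model calls.
interface Model {
  decide(step: string, snapshot: string): Promise<Action>;
  judge(step: string, before: string, after: string): Promise<Verdict>;
}

async function runStep(
  step: string,
  browser: BrowserDriver,
  model: Model
): Promise<Verdict> {
  const before = await browser.snapshot();         // snapshot
  const action = await model.decide(step, before); // model picks a target
  try {
    await browser.perform(action);                 // action
  } catch (err) {
    // The action itself could not be carried out (e.g. the ref is gone).
    return { status: "fail", evidence: `action failed: ${String(err)}` };
  }
  const after = await browser.snapshot();          // snapshot
  return model.judge(step, before, after);         // report
}
```

Passing the element ref instead of a CSS selector means the model only has to name something it already saw in the snapshot.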

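The post does not say how the context window is managed. One common approach, shown here purely as an assumption, is to always send the latest snapshot in full and keep only as many earlier step summaries as fit in a token budget, dropping the oldest first. `HistoryEntry`, `estimateTokens`, and `trimHistory` are hypothetical names, and the four-characters-per-token estimate is a rough heuristic, not a real tokenizer.

```typescript
// Hypothetical sketch; qpilot's actual strategy is not described in the post.
interface HistoryEntry {
  step: number;
  summary: string; // short description of what happened in that step
}

// Rough estimate (about 4 characters per token). A real tokenizer will differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Reserve room for the latest snapshot, then keep the newest history
// entries that still fit in the budget. Older entries are dropped first.
function trimHistory(
  history: HistoryEntry[],
  latestSnapshot: string,
  budget: number
): HistoryEntry[] {
  let remaining = budget - estimateTokens(latestSnapshot);
  const kept: HistoryEntry[] = [];
  for (let i = history.length - 1; i >= 0; i--) {
    const entry = history[i];
    if (!entry) continue;
    const cost = estimateTokens(entry.summary);
    if (cost > remaining) break;
    remaining -= cost;
    kept.unshift(entry); // preserve chronological order
  }
  return kept;
}
```

If the snapshot alone exceeds the budget, this returns an empty history; a fuller version would also need to shrink the snapshot itself.
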
Try it

npx qpilot

No code. No config. No Selenium.

Stack

TypeScript, Playwright, Claude Haiku via Anthropic API.

Open source: qpilot

Curious what you think — especially about edge cases
you'd want it to handle.

Top comments (3)

xulingfeng

Same insight we had with deep-test — manual test steps as plain text should just work without framework boilerplate. Curious: when the agent hits a step it can't resolve (unseen UI element, captcha, dynamic state), does it fail hard or is there a fallback to human-in-the-loop?

Broxhq

Good question — both.

If an element isn't found or the state is wrong, the agent marks the step as failed and moves on (or stops immediately if it's a critical step like login).

If it needs something it can't know — OTP, SMS code, captcha — the run pauses and a dialog appears in the UI. You type the value, the agent continues from the same point.

Human-in-the-loop is built in, but only triggered when actually needed.
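
A rough sketch of that policy, under assumptions: each step carries a hypothetical `critical` flag, the executor can return a `needs_input` outcome, and `askHuman` stands in for the dialog. None of these names come from qpilot itself.

```typescript
// Hypothetical names throughout; this only illustrates the policy described.
type Outcome =
  | { kind: "pass" }
  | { kind: "fail"; reason: string }
  | { kind: "needs_input"; prompt: string };

interface Step {
  text: string;
  critical: boolean; // e.g. true for login
}

type Executor = (step: Step, humanInput?: string) => Promise<Outcome>;
type AskHuman = (prompt: string) => Promise<string>;

async function runAll(
  steps: Step[],
  execute: Executor,
  askHuman: AskHuman
): Promise<string[]> {
  const report: string[] = [];
  for (const step of steps) {
    let outcome = await execute(step);
    if (outcome.kind === "needs_input") {
      // Pause, ask the person, then retry the same step with their value.
      const value = await askHuman(outcome.prompt);
      outcome = await execute(step, value);
    }
    if (outcome.kind === "pass") {
      report.push(`pass: ${step.text}`);
      continue;
    }
    const reason =
      outcome.kind === "fail" ? outcome.reason : "input was not accepted";
    report.push(`fail: ${step.text} (${reason})`);
    if (step.critical) break; // stop the run; later steps depend on this one
  }
  return report;
}
```

A non-critical failure is recorded and the run continues, so one missing button does not hide the results of the remaining steps.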
