I built an AI agent that runs manual test cases in a real browser

#ai #automation #showdev #testing

The problem

Every deploy means the same manual test steps: log in, open the form,
fill in the fields, check the result. Over and over.

I wanted to skip the Playwright/Selenium boilerplate and just
paste my existing test cases as plain text.

What I built

qpilot — an AI agent that reads your manual test case and
executes it in a real Chrome browser step by step.

You write this:

Go to https://myapp.com/login
Enter email and password
Click Login
Verify dashboard is visible

The agent opens Chrome, clicks, fills forms, and reports
pass/fail/warn per step with evidence from the page.

If it hits an OTP or captcha, it pauses and asks you for the value directly.
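
To make the input and output concrete, here is a minimal sketch of how a pasted test case could be split into steps and how per-step results could roll up into one verdict. The names `StepReport`, `parseSteps` and `overallStatus` are hypothetical and not taken from the qpilot source, and the "one line = one step" and "fail beats warn beats pass" rules are assumptions for illustration.

```typescript
// Hypothetical shapes for illustration; not qpilot's actual types.
type Status = "pass" | "fail" | "warn";

interface StepReport {
  step: string;     // the plain-text step as written by the tester
  status: Status;
  evidence: string; // what was observed on the page
}

// Assumption: every non-empty line of the pasted text is one step.
function parseSteps(testCase: string): string[] {
  return testCase
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Assumption: any fail makes the run fail; otherwise any warn makes it warn.
function overallStatus(reports: StepReport[]): Status {
  if (reports.some((r) => r.status === "fail")) return "fail";
  if (reports.some((r) => r.status === "warn")) return "warn";
  return "pass";
}
```

Keeping the original step text inside each report makes the result readable next to the manual test case it came from.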

How it works

Playwright controls the real Chrome browser
Each step: snapshot → action → snapshot → report
Claude Haiku reads the snapshot (ARIA tree) and decides what to click
Element refs (e.g. e12) are used for precise targeting
Context window is managed to avoid hitting token limits
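
The per-step cycle above can be sketched roughly as follows. This is not qpilot's code: the snapshot line format (`- button "Login" [ref=e12]`), the `Deps` interface and every function name are assumptions, and the browser and model are injected as plain functions so the control flow can be read without Playwright or an API key. It is synchronous for brevity; real browser and model calls would be async.

```typescript
// Hypothetical types for illustration; not qpilot's actual API.
type Action =
  | { kind: "click"; ref: string }
  | { kind: "fill"; ref: string; value: string };

interface StepReport {
  step: string;
  status: "pass" | "fail" | "warn";
  evidence: string;
}

// Assumed snapshot line format: - button "Login" [ref=e12]
function findRef(snapshot: string, role: string, name: string): string | null {
  for (const line of snapshot.split("\n")) {
    const m = line.match(/^\s*-\s*(\w+)\s+"([^"]*)"\s+\[ref=(e\d+)\]/);
    if (m && m[1] === role && m[2] === name) return m[3] ?? null;
  }
  return null;
}

// Stand-ins for the browser and the model.
interface Deps {
  snapshot: () => string;                                      // current ARIA tree as text
  decide: (step: string, snapshot: string) => Action | null;   // the model's choice
  perform: (action: Action) => void;                           // the browser action
}

// One cycle per step: snapshot -> action -> snapshot -> report.
function runStep(step: string, deps: Deps): StepReport {
  const before = deps.snapshot();
  const action = deps.decide(step, before);
  if (action === null) {
    return { step, status: "fail", evidence: "no matching element in snapshot" };
  }
  deps.perform(action);
  const after = deps.snapshot();
  // Simplification: the first line of the new snapshot stands in for evidence.
  return { step, status: "pass", evidence: after.split("\n")[0] ?? "" };
}
```

Targeting by ref rather than by visible text means the action is aimed at the exact node the model picked from the snapshot, even when two elements share a label.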

Try it

npx qpilot

No code. No config. No Selenium.

Stack

TypeScript, Playwright, Claude Haiku via Anthropic API.

Open source: qpilot

Curious what you think — especially about edge cases
you'd want it to handle.

Top comments (3)

xulingfeng • Jun 3

Same insight we had with deep-test — manual test steps as plain text should just work without framework boilerplate. Curious: when the agent hits a step it can't resolve (unseen UI element, captcha, dynamic state), does it fail hard or is there a fallback to human-in-the-loop?

Broxhq • Jun 4

Good question — both.

If an element isn't found or the state is wrong, the agent marks the step fail and moves on (or stops immediately if it's a critical step like login).

If it needs something it can't know — OTP, SMS code, captcha — the run pauses and a dialog appears in the UI. You type the value, the agent continues from the same point.

Human-in-the-loop is built in, but only triggered when actually needed.
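
The behavior described in this reply could be sketched roughly like this. It is an illustration only: `Step`, `Hooks`, `attempt`, `askHuman` and the `critical` flag are hypothetical names rather than qpilot's API, and the dialog is modeled as a synchronous callback for brevity, where a real UI prompt would be async.

```typescript
// Hypothetical types for illustration; not qpilot's actual API.
interface Step {
  text: string;
  critical: boolean; // e.g. login: a failure here stops the run
}

type Outcome =
  | { kind: "done"; ok: boolean }
  | { kind: "needs-input"; prompt: string }; // OTP, SMS code, captcha

interface Hooks {
  // Tries a step; `input` carries a human-supplied value on retry.
  attempt: (step: Step, input?: string) => Outcome;
  // Stand-in for the dialog that appears in the UI.
  askHuman: (prompt: string) => string;
}

interface RunLog {
  step: string;
  status: "pass" | "fail";
}

function runAll(steps: Step[], hooks: Hooks): RunLog[] {
  const log: RunLog[] = [];
  for (const step of steps) {
    let outcome = hooks.attempt(step);
    // Pause only when the step needs something the agent cannot know,
    // then continue from the same step with the typed value.
    if (outcome.kind === "needs-input") {
      const value = hooks.askHuman(outcome.prompt);
      outcome = hooks.attempt(step, value);
    }
    const ok = outcome.kind === "done" && outcome.ok;
    log.push({ step: step.text, status: ok ? "pass" : "fail" });
    // A failed non-critical step is recorded and the run moves on;
    // a failed critical step ends the run immediately.
    if (!ok && step.critical) break;
  }
  return log;
}
```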
