DEV Community

NEE

AutoQA-Agent: Write Acceptance Tests in Markdown, Run Them with AI + Playwright

AutoQA-Agent is a Docs-as-Tests CLI: you write acceptance tests in plain Markdown, and it runs them via a lightweight Claude Agent SDK loop + Playwright.

It focuses on:

  • Reducing script fragility with snapshot-first, ref-first interactions
  • Letting non-engineers contribute (Markdown specs)
  • Leaving artifacts you can debug from (logs / snapshots / screenshots / traces)
  • Exporting passing runs into standard @playwright/test specs

TL;DR

  • Write test steps in Markdown.
  • Run: autoqa run <spec-or-dir> --url <baseUrl>
  • Get artifacts under .autoqa/runs/<runId>/...
  • If the spec passes, AutoQA-Agent can export to tests/autoqa/*.spec.ts

Why this exists

UI automation tends to break for boring reasons:

  • Locators become unstable after small UI refactors.
  • Test code is often unreadable for PMs/QAs.
  • Failures are hard to diagnose without good context.

AutoQA-Agent treats acceptance tests as living documentation, then uses an agent loop to drive a real browser and recover from transient failures.

Quick start

Prerequisites

  • Node.js >= 20
  • Claude Code authorized (recommended), or the ANTHROPIC_API_KEY environment variable set

Install & build

git clone https://github.com/terryso/AutoQA-Agent.git
cd AutoQA-Agent
npm install
npm run build
npm link # optional: puts the autoqa command on your PATH

Initialize

autoqa init

Run a spec

autoqa run specs/saucedemo-01-login.md --url https://www.saucedemo.com/

Debug (headed browser)

autoqa run specs/saucedemo-01-login.md --url https://www.saucedemo.com/ --debug

What a Markdown spec looks like

# Login

## Preconditions
- Test account exists

## Steps
1. Navigate to /login
2. Verify the login form is visible
3. Fill the username field with standard_user
4. Fill the password field with secret_sauce
5. Click the "Login" button
6. Verify the user is redirected to dashboard

Notes:

  • The base URL comes from --url; a “Base URL” line under Preconditions, if you include one, is there only for readability.
  • Steps starting with Verify/Assert (or the Chinese equivalents “验证/断言”, meaning verify/assert) are treated as assertions.
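
To make the spec format concrete, here is a minimal TypeScript sketch of how such a file could be split into preconditions and ordered steps, with Verify/Assert steps flagged as assertions. It is not AutoQA-Agent's actual parser: the names parseSpec, ParsedSpec, Step, and ASSERTION_PREFIX are invented for this example, and the real tool may handle headings and prefixes differently.

```typescript
// Hypothetical types for illustration; not AutoQA-Agent's real data model.
type StepKind = "action" | "assertion";

interface Step {
  index: number; // 1-based number taken from the numbered list
  text: string;
  kind: StepKind;
}

interface ParsedSpec {
  title: string;
  preconditions: string[];
  steps: Step[];
}

// Steps that start with Verify/Assert (or 验证/断言) count as assertions.
const ASSERTION_PREFIX = /^(verify|assert|验证|断言)/i;

function parseSpec(markdown: string): ParsedSpec {
  const spec: ParsedSpec = { title: "", preconditions: [], steps: [] };
  let section = ""; // lower-cased name of the current "## ..." section

  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;

    const h2 = /^##\s+(.+)$/.exec(line);
    if (h2) {
      section = (h2[1] ?? "").toLowerCase();
      continue;
    }
    const h1 = /^#\s+(.+)$/.exec(line);
    if (h1) {
      spec.title = h1[1] ?? "";
      continue;
    }

    const bullet = /^[-*]\s+(.+)$/.exec(line);
    if (section === "preconditions" && bullet) {
      spec.preconditions.push(bullet[1] ?? "");
      continue;
    }

    const numbered = /^(\d+)\.\s+(.+)$/.exec(line);
    if (section === "steps" && numbered) {
      const text = numbered[2] ?? "";
      spec.steps.push({
        index: Number(numbered[1]),
        text,
        kind: ASSERTION_PREFIX.test(text) ? "assertion" : "action",
      });
    }
  }
  return spec;
}

const exampleSpec = [
  "# Login",
  "",
  "## Preconditions",
  "- Test account exists",
  "",
  "## Steps",
  "1. Navigate to /login",
  "2. Verify the login form is visible",
  '3. Click the "Login" button',
].join("\n");

const parsed = parseSpec(exampleSpec);
console.log(parsed.steps.map((s) => `${s.index}:${s.kind}`).join(", "));
// prints "1:action, 2:assertion, 3:action"
```

Keeping the action/assertion split in the parsed data means a runner can treat a failed assertion as a test failure while treating a failed action as something worth retrying.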

How it works (high level)

  • Parse Markdown into preconditions + ordered steps.
  • Run an observe → act → recover loop using Claude Agent SDK.
  • Use accessibility snapshots (with stable refs) to drive ref-first actions.
  • When a tool/assertion fails, return structured errors (instead of crashing) so the agent can retry, bounded by guardrails.
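
The last point, returning structured errors and bounding retries, can be sketched in a few lines of TypeScript. This is only an illustration of the pattern, assuming nothing about AutoQA-Agent's internals: it uses neither the Claude Agent SDK nor Playwright, and ToolResult, callTool, runStepWithRetries, and the error code strings are all hypothetical. In the real tool the agent would presumably look at the error and choose a different action; this sketch simply retries the same one.

```typescript
// Hypothetical result type: a failure is a value, not a thrown exception.
type ToolResult =
  | { ok: true }
  | { ok: false; code: string; message: string };

type Tool = () => Promise<void>;

// Run a tool and convert any thrown exception into a structured error.
async function callTool(tool: Tool): Promise<ToolResult> {
  try {
    await tool();
    return { ok: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, code: "TOOL_FAILED", message };
  }
}

// Retry a step until it succeeds or the attempt budget (the guardrail) runs out.
async function runStepWithRetries(
  tool: Tool,
  maxAttempts: number,
): Promise<{ result: ToolResult; attempts: number }> {
  let result: ToolResult = {
    ok: false,
    code: "NOT_RUN",
    message: "step was never attempted",
  };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    result = await callTool(tool);
    if (result.ok) return { result, attempts: attempt };
    // A real agent loop would inspect `result` here and pick a new action.
  }
  return { result, attempts: maxAttempts };
}
```

Because the failure is an ordinary return value, the loop can hand it back to the agent as context, and the maxAttempts cap keeps a persistently failing step from looping forever.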

Artifacts you get for free

After a run:

.autoqa/runs/<runId>/
├── run.log.jsonl
├── ir.jsonl
├── screenshots/
├── snapshots/
└── traces/

This makes failures much easier to debug locally and in CI.
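
The .jsonl extension suggests that run.log.jsonl and ir.jsonl are JSON Lines files, with one JSON value per line. This post does not document their schema, so the TypeScript sketch below parses lines generically and assumes nothing about field names; parseJsonl is a hypothetical helper, not part of AutoQA-Agent.

```typescript
// Parse JSON Lines text: one JSON value per non-empty line.
// Malformed lines are counted rather than thrown, so a truncated log stays readable.
function parseJsonl(content: string): { records: unknown[]; malformed: number } {
  const records: unknown[] = [];
  let malformed = 0;

  for (const line of content.split(/\r?\n/)) {
    if (line.trim() === "") continue; // skip blank lines, e.g. a trailing newline
    try {
      records.push(JSON.parse(line));
    } catch {
      malformed++;
    }
  }
  return { records, malformed };
}
```

In a script you would read the contents of .autoqa/runs/<runId>/run.log.jsonl from disk, pass the text to parseJsonl, and then filter the records by whatever fields the log actually contains.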

Export to @playwright/test

Successful specs can be exported to:

tests/autoqa/*.spec.ts

So you can keep what worked and run it as classic Playwright tests later.

Roadmap (short)

  • Richer export (more semantic parsing + more assertion mappings)
  • More example specs and demo projects
  • Continuous improvement in docs and diagrams

Try it / contribute
