After 6 years of QA automation in fintech, I got tired of the same cycle:
Test fails in CI
Download the report
Spend 20 minutes reading stack traces
Realize it was a one-line selector issue
So I built a custom Playwright reporter that does the debugging for you.
What it does
When a test fails, the reporter automatically:
Captures the test name, error message, and stack trace
Sends it to an AI with a structured prompt
Gets back a root cause, a TypeScript fix, and a prevention tip
Saves everything to test-results/ai-diagnosis.md
Here's what the output looks like:
```
✘ Login › locked out user sees lock message
Analyzing failure: "locked out user sees lock message"
locked out user sees lock message
1. Root cause
The selector [data-test="error"] matched but the assertion
expected different text than what Sauce Demo returned.
2. Fix
await expect(loginPage.errorMessage).toContainText(
  'Epic sadface: Sorry, this user has been locked out.'
);
3. Prevention
Always assert the exact error string shown in the UI, not a partial guess.
Diagnosis saved → test-results/ai-diagnosis.md
```
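The structured prompt itself is not shown in this post. As a minimal sketch of what it could look like, the hypothetical `buildPrompt` helper and `FailureInfo` shape below (both names are mine, not taken from the repo) ask for the same three sections that appear in the output above:

```typescript
// Hypothetical shape for the data captured from a failed test.
interface FailureInfo {
  testName: string;
  errorMessage: string;
  stack?: string;
}

// Hypothetical helper: builds a prompt that requests three numbered sections.
function buildPrompt(failure: FailureInfo): string {
  // Keep only the first 15 lines of the stack so the prompt stays small.
  const stack = (failure.stack ?? '').split('\n').slice(0, 15).join('\n');
  return [
    'You are helping debug a failed Playwright test.',
    `Test: ${failure.testName}`,
    `Error: ${failure.errorMessage}`,
    stack ? `Stack trace:\n${stack}` : 'Stack trace: (not available)',
    '',
    'Answer with exactly three numbered sections:',
    '1. Root cause',
    '2. Fix (a TypeScript snippet)',
    '3. Prevention',
  ].join('\n');
}

console.log(
  buildPrompt({
    testName: 'locked out user sees lock message',
    errorMessage: 'expect(locator).toContainText(expected) failed',
  })
);
```

Asking for fixed, numbered sections is what makes the saved Markdown easy to skim; the 15-line stack limit is an arbitrary choice to keep token usage low.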
The architecture
The key design decision was making the AI provider swappable at runtime via an environment variable. No code changes needed, just update your .env:
```bash
# Free default
AI_PROVIDER=groq
GROQ_API_KEY=your_key
# Or swap to Claude / OpenAI
AI_PROVIDER=claude
ANTHROPIC_API_KEY=your_key
```
This is powered by a simple AIProvider interface:
```typescript
export interface AIProvider {
  // Resolves to the diagnosis text (root cause, fix, prevention tip).
  analyzeFailure(testName: string, error: string): Promise<string>;
}
```
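To make the contract concrete, here is a hedged sketch of one implementation. It is not the repo's `GroqProvider`: the class name `SketchGroqProvider`, the injectable `FetchLike` type, the prompt text, and the model id are all assumptions of mine. It relies on Groq's OpenAI-compatible chat completions endpoint and on the global `fetch` available in Node 18+.

```typescript
interface AIProvider {
  analyzeFailure(testName: string, error: string): Promise<string>;
}

// Minimal fetch signature so a fake can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<any> }>;

// Hypothetical provider, for illustration only.
class SketchGroqProvider implements AIProvider {
  private apiKey: string;
  private fetchFn: FetchLike;
  private model: string;

  constructor(
    apiKey: string,
    fetchFn: FetchLike = (url, init) => fetch(url, init),
    model = 'llama-3.1-8b-instant' // assumed model id; check Groq's current list
  ) {
    this.apiKey = apiKey;
    this.fetchFn = fetchFn;
    this.model = model;
  }

  async analyzeFailure(testName: string, error: string): Promise<string> {
    const response = await this.fetchFn(
      'https://api.groq.com/openai/v1/chat/completions',
      {
        method: 'POST',
        headers: {
          Authorization: `Bearer ${this.apiKey}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: this.model,
          messages: [
            {
              role: 'user',
              content:
                `Test "${testName}" failed with:\n${error}\n` +
                'Give a root cause, a TypeScript fix, and a prevention tip.',
            },
          ],
        }),
      }
    );
    if (!response.ok) {
      throw new Error(`Groq request failed with status ${response.status}`);
    }
    const data = await response.json();
    // Fall back to a placeholder if the response has no message content.
    return data.choices?.[0]?.message?.content ?? 'No diagnosis returned';
  }
}
```

Injecting the fetch function keeps the provider testable without network access or a real API key.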
Each provider (Groq, Claude, OpenAI) implements this interface. The factory reads the env var and returns the right one:
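```typescript
// Import paths follow the src/ai/ layout; named exports are an assumption.
import { AIProvider } from './AIProvider.interface';
import { ClaudeProvider } from './ClaudeProvider';
import { GroqProvider } from './GroqProvider';
import { OpenAIProvider } from './OpenAIProvider';

export function createProvider(): AIProvider {
  const provider = process.env.AI_PROVIDER ?? 'groq';
  switch (provider) {
    case 'claude': return new ClaudeProvider();
    case 'openai': return new OpenAIProvider();
    default: return new GroqProvider(); // free, no credit card
  }
}
```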
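One thing to be aware of: the `default` branch means a typo such as `AI_PROVIDER=claud` silently falls back to Groq. If that is not what you want, a stricter variant could look like the sketch below. `createProviderStrict`, `registry`, and `StubProvider` are hypothetical names; the stub stands in for the real provider classes so the snippet runs on its own.

```typescript
interface AIProvider {
  analyzeFailure(testName: string, error: string): Promise<string>;
}

// Stand-in so the sketch is self-contained; the real classes live in src/ai/.
class StubProvider implements AIProvider {
  name: string;
  constructor(name: string) {
    this.name = name;
  }
  async analyzeFailure(testName: string, error: string): Promise<string> {
    return `[${this.name}] ${testName}: ${error}`;
  }
}

// Maps each supported AI_PROVIDER value to a constructor function.
const registry: Record<string, () => AIProvider> = {
  groq: () => new StubProvider('groq'),
  claude: () => new StubProvider('claude'),
  openai: () => new StubProvider('openai'),
};

// Hypothetical strict factory: defaults to groq only when the variable is unset.
function createProviderStrict(
  env: Record<string, string | undefined> = process.env
): AIProvider {
  const name = (env.AI_PROVIDER ?? 'groq').trim().toLowerCase();
  // hasOwnProperty avoids matching inherited keys such as "toString".
  const known = Object.prototype.hasOwnProperty.call(registry, name);
  if (!known) {
    const options = Object.keys(registry).join(', ');
    throw new Error(`Unknown AI_PROVIDER "${name}". Expected one of: ${options}`);
  }
  return registry[name]();
}
```

Adding a provider then means adding one entry to `registry`, and a misconfigured CI job fails fast instead of quietly using a different model.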
The reporter itself
Playwright lets you build custom reporters by implementing the Reporter interface. The key hook is onTestEnd:
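```typescript
import * as fs from 'node:fs';
import type { Reporter, TestCase, TestResult } from '@playwright/test/reporter';
// Path follows the project structure (reporters/ next to src/ai/).
import { createProvider } from '../src/ai/providerFactory';

export class AIReporter implements Reporter {
  async onTestEnd(test: TestCase, result: TestResult) {
    // Passing and skipped tests return immediately; no AI call is made.
    if (result.status !== 'failed') return;
    console.log(`\nAnalyzing failure: "${test.title}"\n`);
    const provider = createProvider();
    const diagnosis = await provider.analyzeFailure(
      test.title,
      result.error?.message ?? 'Unknown error'
    );
    console.log(diagnosis);
    // Make sure the output directory exists before appending.
    fs.mkdirSync('test-results', { recursive: true });
    fs.appendFileSync('test-results/ai-diagnosis.md', diagnosis + '\n');
  }
}
```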
Clean and minimal: the AI call only happens when a test fails, so passing suites are not slowed down.
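One caveat with an async `onTestEnd`: Playwright's reporter documentation says a promise returned from `onEnd` is awaited, and I am not aware of the same guarantee for `onTestEnd`. A hedged sketch of a variant that collects the in-flight diagnoses and waits for them in `onEnd` follows. `AwaitingAIReporter`, `TestLike`, `ResultLike`, and the injectable `write` sink are illustrative names, and the local types stand in for Playwright's `TestCase` and `TestResult` so the snippet runs on its own. It also treats `timedOut` as a failure and passes the stack trace when one is available.

```typescript
import { appendFileSync, mkdirSync } from 'node:fs';
import { dirname } from 'node:path';

interface AIProvider {
  analyzeFailure(testName: string, error: string): Promise<string>;
}

// Structural stand-ins for the fields used from TestCase and TestResult.
interface TestLike {
  title: string;
}
interface ResultLike {
  status: string;
  error?: { message?: string; stack?: string };
}

const OUTPUT = 'test-results/ai-diagnosis.md';

// Default sink: appends one Markdown section to the diagnosis file.
function appendToReport(markdown: string): void {
  mkdirSync(dirname(OUTPUT), { recursive: true });
  appendFileSync(OUTPUT, markdown);
}

// Hypothetical variant of AIReporter that tracks in-flight diagnoses.
class AwaitingAIReporter {
  private pending: Promise<void>[] = [];
  private provider: AIProvider;
  private write: (markdown: string) => void;

  constructor(provider: AIProvider, write: (markdown: string) => void = appendToReport) {
    this.provider = provider;
    this.write = write;
  }

  onTestEnd(test: TestLike, result: ResultLike): void {
    if (result.status !== 'failed' && result.status !== 'timedOut') return;
    const details = result.error?.stack ?? result.error?.message ?? 'Unknown error';
    this.pending.push(
      this.provider
        .analyzeFailure(test.title, details)
        .then((diagnosis) => this.write(`## ${test.title}\n\n${diagnosis}\n\n`))
        // A failed AI call is recorded instead of breaking the test run.
        .catch((err) => this.write(`## ${test.title}\n\nDiagnosis failed: ${String(err)}\n\n`))
    );
  }

  // Resolves once every diagnosis started in onTestEnd has been written.
  async onEnd(): Promise<void> {
    await Promise.all(this.pending);
  }
}
```

In a real reporter the provider would come from `createProvider()` rather than a constructor argument; taking it as a parameter here only makes the sketch easy to exercise with a fake.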
The full project structure
The repo also includes a full working example with Sauce Demo:
```
playwright-ai-reporter/
├── src/ai/
│   ├── AIProvider.interface.ts   # shared contract
│   ├── GroqProvider.ts           # free default
│   ├── ClaudeProvider.ts
│   ├── OpenAIProvider.ts
│   └── providerFactory.ts
├── reporters/
│   └── AIReporter.ts
├── tests/saucedemo/
│   ├── login.spec.ts
│   ├── cart.spec.ts
│   └── checkout.spec.ts
├── pages/                        # Page Object Models
├── fixtures/                     # shared Playwright fixtures
└── .github/workflows/
    └── playwright.yml            # CI already configured
```
CI with GitHub Actions
The workflow runs on every push. It always uploads the HTML report as an artifact, and uploads ai-diagnosis.md only when there are failures, so you always know exactly what broke and why.
```yaml
- name: Upload AI diagnosis
  if: failure()
  uses: actions/upload-artifact@v4
  with:
    name: ai-diagnosis
    path: test-results/ai-diagnosis.md
```
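For the workflow to produce that file, the reporter has to be registered with Playwright. The post does not show this part, so treat the following `playwright.config.ts` as a sketch under assumptions: the path comes from the project structure above, and the `list` and `html` options are my choices, not the repo's actual config. As far as I know, Playwright loads a custom reporter from the module's default export, so the reporter file would also need `export default AIReporter`.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['list'],                        // console output during the run
    ['html', { open: 'never' }],     // HTML report uploaded by the workflow
    ['./reporters/AIReporter.ts'],   // assumed path to the custom reporter
  ],
});
```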
AI provider comparison
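| Provider | Cost | Speed | Quality | Best for |
| --- | --- | --- | --- | --- |
| Groq (Llama 3) | Free | Very fast | Good | Portfolio, small teams |
| Claude Haiku | ~$0.001/test | Fast | Very good | Medium teams |
| Claude Sonnet | ~$0.005/test | Medium | Excellent | Enterprise |
| GPT-4o-mini | ~$0.003/test | Fast | Very good | Alternative |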
I use Groq for local dev (free tier is more than enough) and would use Claude Sonnet for a production CI pipeline where diagnosis quality matters.
What's next
A few things I want to add:
Screenshot attachment in the diagnosis when available
Grouping repeated failures to avoid duplicate AI calls
Slack/Teams notification with the diagnosis embedded
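For the grouping idea, one possible approach is a caching wrapper around any provider, keyed by a normalized error message. This is only a sketch of something not yet in the repo: `CachedProvider` and `failureKey` are hypothetical names, and the normalization rules (collapsing durations, numbers, and whitespace) are guesses at what makes two failures "the same".

```typescript
interface AIProvider {
  analyzeFailure(testName: string, error: string): Promise<string>;
}

// Hypothetical normalizer: strips volatile details so similar errors share a key.
function failureKey(error: string): string {
  return error
    .replace(/\d+(\.\d+)?\s*ms/g, '<duration>')
    .replace(/\d+/g, '<n>')
    .replace(/\s+/g, ' ')
    .trim();
}

// Hypothetical wrapper: reuses one AI call for failures with the same key.
class CachedProvider implements AIProvider {
  private cache = new Map<string, Promise<string>>();
  private inner: AIProvider;

  constructor(inner: AIProvider) {
    this.inner = inner;
  }

  analyzeFailure(testName: string, error: string): Promise<string> {
    const key = failureKey(error);
    let cached = this.cache.get(key);
    if (!cached) {
      cached = this.inner.analyzeFailure(testName, error);
      // Drop failed lookups so a later failure can retry the call.
      cached.catch(() => this.cache.delete(key));
      this.cache.set(key, cached);
    }
    return cached;
  }
}
```

Caching the promise rather than the resolved string means concurrent failures with the same error also share a single request. The trade-off is that a reused diagnosis was generated for the first test that hit the error, so its wording may mention that test's name.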
Try it yourself
```bash
git clone https://github.com/sechavarriar/playwright-ai-reporter
cd playwright-ai-reporter
npm install
npx playwright install chromium
cp .env.example .env
# Add your free Groq key from console.groq.com
npm test
```
The repo is at https://github.com/sechavarriar/playwright-ai-reporter. Feedback and PRs are very welcome. I'm especially curious whether anyone has ideas for handling flaky tests differently.