Stop Drowning in CI Noise: QAI Agent Clusters Your Test Failures and Tells You What Actually Broke

#github #testing #playwright #devops

You open a PR. CI is red. There are 47 failed tests.

Now what?

You scroll through a wall of test names. Some look related. Some look flaky. Some are probably the same root cause repeated across 20 test cases. You don't know which to fix first, or whether it's even safe to merge.

This is CI noise — and it's eating engineering time every single day.

What QAI Agent does

QAI Agent is a GitHub Action that runs after your tests and posts an intelligent summary directly on the pull request.

It does three things:

1. Clusters failures by root cause

Instead of showing you 47 test names, it groups tests that failed for the same underlying reason. If 30 tests all hit the same null pointer, that's one cluster — one thing to fix.

It works by normalizing error messages: stripping timestamps, line numbers, UUIDs, memory addresses, file paths, and variable values, then hashing the result. Tests with the same normalized signature are the same failure.
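
To make the idea concrete, here is a minimal Python sketch of normalize-then-hash clustering. It is not QAI Agent's actual code: the regexes, the placeholder tokens, and the function names `normalize`, `signature`, and `cluster` are all assumptions chosen for illustration.

```python
import hashlib
import re
from collections import defaultdict

# Illustrative rules only (assumed, not taken from QAI Agent's source).
# Order matters: specific patterns run before the generic number rule.
_RULES = [
    (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I), "<uuid>"),
    (re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?"), "<ts>"),
    (re.compile(r"\b0x[0-9a-f]+\b", re.I), "<addr>"),
    (re.compile(r"(?:[A-Za-z]:)?(?:[\\/][\w.@-]+)+"), "<path>"),
    (re.compile(r"(['\"]).*?\1"), "<val>"),   # quoted variable values
    (re.compile(r"\d+"), "<n>"),              # line numbers, ports, counts
]


def normalize(message: str) -> str:
    """Replace run-specific details with placeholders and collapse whitespace."""
    text = message
    for pattern, placeholder in _RULES:
        text = pattern.sub(placeholder, text)
    return " ".join(text.split()).lower()


def signature(message: str) -> str:
    """Short, stable hash of the normalized message."""
    return hashlib.sha256(normalize(message).encode("utf-8")).hexdigest()[:12]


def cluster(failures: dict[str, str]) -> dict[str, list[str]]:
    """Group test names by the signature of their error message."""
    groups = defaultdict(list)
    for test_name, message in failures.items():
        groups[signature(message)].append(test_name)
    return dict(groups)
```

With rules like these, two null-dereference errors that differ only in file path, line number, request ID, and timestamp hash to the same signature, while a timeout error lands in its own cluster.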

2. Scores PR risk

Based on the fail rate and number of unique failure patterns, it outputs a risk level: low, medium, or high. You can use this to automatically block merges on high-risk PRs.
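
The exact formula isn't spelled out here, so treat the following as a rough sketch of how fail rate and cluster count could be combined into a level. The weights (0.7 and 0.3), the cap of five patterns, and the 0.2 / 0.5 thresholds are invented for illustration and are not QAI Agent's real values.

```python
def risk_level(total: int, failed: int, clusters: int) -> tuple[float, str]:
    """Return (score, level) from the fail rate and the number of distinct failure patterns.

    All constants below are hypothetical; tune them to your own tolerance.
    """
    if total <= 0 or failed <= 0:
        return 0.0, "low"

    fail_rate = failed / total
    # More distinct patterns suggests more independent problems; cap the effect at 5.
    pattern_factor = min(clusters, 5) / 5
    score = 0.7 * fail_rate + 0.3 * pattern_factor

    if score >= 0.5:
        level = "high"
    elif score >= 0.2:
        level = "medium"
    else:
        level = "low"
    return round(score, 3), level
```

The point of weighting in the cluster count is that 30 failures from one root cause are usually less alarming than 30 failures from five unrelated causes.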

3. Analyzes Playwright traces (optional)

If you're using Playwright and save traces on failure, QAI Agent will unzip and analyze them locally — no cloud required. It detects five failure categories:

Cause	How it's detected
UI Changed	Locator not found, strict mode violation
Backend Error	HTTP 5xx response during test
Test Bug	Assertion errors in console logs
Timing / Flaky	Timeout on step
Environment Failure	Network failures, ECONNREFUSED
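
A rule-based classifier along these lines can be sketched in a few lines of Python. This is an illustration, not the action's implementation: it assumes the interesting signals have already been pulled out of the trace into a hypothetical `TraceSignals` record, and the matched substrings and the order in which rules are checked are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class TraceSignals:
    """Hypothetical summary of what was extracted from one trace."""
    error_message: str = ""
    http_statuses: list[int] = field(default_factory=list)
    console_messages: list[str] = field(default_factory=list)
    network_errors: list[str] = field(default_factory=list)


def classify(signals: TraceSignals) -> str:
    """Map trace signals to one of the five categories in the table above.

    Rules are checked from most to least specific (an assumed ordering):
    a refused connection or a 5xx explains a later timeout, so those win.
    """
    error = signals.error_message.lower()

    if signals.network_errors or "econnrefused" in error:
        return "Environment Failure"
    if any(500 <= status <= 599 for status in signals.http_statuses):
        return "Backend Error"
    if "strict mode violation" in error or "locator not found" in error:
        return "UI Changed"
    if any("assertionerror" in msg.lower() for msg in signals.console_messages):
        return "Test Bug"
    if "timeout" in error:
        return "Timing / Flaky"
    return "Unclassified"
```

Because the rules only look at categories of evidence, the result tells you where to start looking, not which line of your code is at fault.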

Setup in 60 seconds

Add one step to your existing workflow, after your tests run:

- name: QAI Agent
  uses: useqai/qai-agent@v1
  if: always()
  with:
    junit-path: 'test-results/results.xml'

The job needs the pull-requests: write permission so the action can post its comment. A full job looks like this:

jobs:
  test:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read
    steps:
      - uses: actions/checkout@v4
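      - name: Run tests
        run: npx playwright test --reporter=junit
        env:
          # Without an output file, Playwright's JUnit reporter prints the XML to stdout
          PLAYWRIGHT_JUNIT_OUTPUT_NAME: test-results/results.xml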
      - name: QAI Agent
        uses: useqai/qai-agent@v1
        if: always()
        with:
          junit-path: 'test-results/results.xml'
          trace-path: 'test-results/**/*.zip'   # optional, for RCA

That's it. No account. No API key. No configuration beyond those two paths.

The PR comment it generates

Every PR gets a single summary comment. It shows:

Risk level and merge recommendation
Failed tests with their error messages
Failure clusters (grouped by root cause)
Root cause analysis (RCA) from Playwright traces (if provided)

The comment is upserted — it updates in place when you push new commits, so it doesn't spam your PR timeline.
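
A common way to implement this kind of upsert is to embed a hidden HTML marker in the comment and look for it on the next run. The sketch below shows that technique against GitHub's REST endpoints for issue comments. Whether QAI Agent does it exactly this way is an assumption, and the marker string and the `plan_upsert` helper are hypothetical.

```python
# Hypothetical hidden marker; HTML comments are not rendered in the PR timeline.
MARKER = "<!-- qai-agent-summary -->"


def plan_upsert(repo: str, pr_number: int, existing: list[dict], body: str) -> tuple[str, str, dict]:
    """Decide whether to update an earlier summary comment or create a new one.

    `existing` is the list of comment objects returned by
    GET /repos/{owner}/{repo}/issues/{issue_number}/comments.
    Returns (HTTP method, API path, JSON payload) for the caller to send.
    """
    payload = {"body": f"{MARKER}\n{body}"}
    for comment in existing:
        if MARKER in (comment.get("body") or ""):
            # Found our earlier comment: edit it in place.
            return "PATCH", f"/repos/{repo}/issues/comments/{comment['id']}", payload
    # First run on this PR: create the comment.
    return "POST", f"/repos/{repo}/issues/{pr_number}/comments", payload
```

Keeping the decision separate from the HTTP call makes it easy to test, and the marker survives edits because it is written back on every update.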

Block merges on high risk

QAI Agent exposes outputs you can use in subsequent steps:

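- name: QAI Agent
  id: qai
  uses: useqai/qai-agent@v1
  if: always()   # run even when the test step failed
  with:
    junit-path: 'test-results/results.xml'

- name: Block merge on high risk
  # always() is needed here too: without a status check function,
  # a step is skipped once an earlier step has failed
  if: always() && steps.qai.outputs.risk-level == 'high'
  run: |
    echo "High risk: investigate failures before merging"
    exit 1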

Available outputs: risk-level, risk-score, failed-tests, total-tests, cluster-count.

Works with any JUnit-compatible framework

Framework	How to get JUnit output
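Playwright	`--reporter=junit` (set `PLAYWRIGHT_JUNIT_OUTPUT_NAME=results.xml` to write a file)
Jest	`--reporters=jest-junit` (requires the `jest-junit` package)
Vitest	`--reporter=junit --outputFile=results.xml`
pytest	`--junitxml=results.xml`
Maven/JUnit	built-in (Surefire writes XML reports to `target/surefire-reports/`)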
Go (gotestsum)	`--junitfile results.xml`
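
Whichever framework writes it, the report has the same basic shape: `testcase` elements, with a `failure` or `error` child on the ones that broke. Here is a small Python sketch of reading that shape. It uses only the standard library, the `read_failures` name is made up for this post, and real-world JUnit variants differ enough that this should be read as a starting point rather than a full parser.

```python
import xml.etree.ElementTree as ET


def read_failures(xml_text: str) -> tuple[int, dict[str, str]]:
    """Return (total test count, {test id: error message}) from JUnit XML.

    Works whether the root element is <testsuites> or a single <testsuite>.
    """
    root = ET.fromstring(xml_text)
    total = 0
    failures: dict[str, str] = {}

    for case in root.iter("testcase"):
        total += 1
        # Compare with None explicitly: an Element with no children is falsy.
        problem = case.find("failure")
        if problem is None:
            problem = case.find("error")
        if problem is None:
            continue

        test_id = f"{case.get('classname', '')}::{case.get('name', '')}"
        # Some writers put the message in an attribute, others in the element text.
        message = problem.get("message") or (problem.text or "").strip()
        failures[test_id] = message

    return total, failures
```

The returned counts and messages are the kind of input the clustering and risk steps described earlier would work from.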

What it doesn't do without the cloud

No historical context — without connecting a cloud backend, QAI Agent only sees the current run. It can't tell you "this failure has been flaky for 3 weeks."
No LLM explanations — the RCA is rule-based, not AI-generated. It detects categories of failure, not the specific cause in your code.
Playwright traces only — the RCA analysis only works with Playwright trace zip files, not other test frameworks.

The cloud platform at useqai.dev adds historical trend charts across all runs, flakiness tracking, cross-repo visibility, and LLM-powered root cause analysis.

Try it

GitHub Action: useqai/qai-agent on the Marketplace
Source: github.com/useqai/qai-agent
Live PR comment demo: https://github.com/useqai/qai-agent/pull/2
Dashboard: https://useqai.dev

If you try it, open an issue or leave a comment here — especially if you run into a framework or JUnit variant that doesn't parse correctly. Happy to fix it.