
idavidov13

Posted on • Originally published at idavidov.eu

What Is Agentic QA and Why It Changes Everything

"I asked Claude Code to write a test and it gave me something that kind of works"

Sound familiar? That's AI as a code suggester. Useful, but not transformative. You still have to know what to ask, verify what it wrote, adapt it to your project, and repeat for every file.

Agentic QA is different. Instead of an AI that answers questions, you have an AI that takes actions. It reads your codebase, opens a browser, navigates your application, discovers elements, and generates code that fits directly into your existing structure.

It's the difference between a search engine and an employee.


🤔 Agent vs Chatbot: What's the Actual Difference?

A chatbot responds to a single message. It has no memory of your project, no access to your files, and no ability to take actions beyond generating text.

An agent operates in a loop:

  1. It receives a goal (create a test suite for the registration page)
  2. It takes actions to gather information (reads files, navigates the browser)
  3. It uses what it found to produce an output (generates the page object, Faker factory, test suite, etc.)
  4. It verifies the output (runs the tests, checks for linting errors)
  5. It iterates if something is wrong
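The loop above can be sketched in a few lines of TypeScript. Everything here is illustrative: `gatherContext`, `generate`, and `verify` are hypothetical stand-ins for the agent's real tool calls (file reads, browser navigation, running the test suite).

```typescript
// Illustrative sketch of the agentic loop: goal -> gather -> generate -> verify -> iterate.
// All tool functions are hypothetical stand-ins for real tool calls.

type Verdict = { passed: boolean; error?: string };

interface AgentTools {
  gatherContext(goal: string): string[];              // e.g. read files, navigate the browser
  generate(goal: string, context: string[]): string;  // e.g. produce a spec file
  verify(output: string): Verdict;                    // e.g. run the tests, lint the code
}

function runAgent(goal: string, tools: AgentTools, maxIterations = 3): string {
  let context = tools.gatherContext(goal);
  let output = tools.generate(goal, context);

  for (let i = 0; i < maxIterations; i++) {
    const verdict = tools.verify(output);
    if (verdict.passed) return output;      // objective confirmation: the tests pass
    // Feed the failure back in as new context and try again.
    context = [...context, `previous error: ${verdict.error}`];
    output = tools.generate(goal, context);
  }
  throw new Error(`Agent could not satisfy goal after ${maxIterations} iterations`);
}
```

The key design point is step 4: because the output is verifiable, the loop has an objective stopping condition instead of guessing when it is done.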

*[Image: Chatbot vs Agent]*

The agent isn't just writing code. It's operating inside your development environment, making decisions based on real information.

```text
You: Add an e2e test for the checkout flow

Agent:
  → Opens browser, navigates to the product page
  → Discovers: all the necessary actions to complete the checkout flow
  → Reads pages/ to understand existing page object conventions
  → Reads tests/ to understand the test structure and imports
  → Reads .claude/skills/test-standards to check test standards
  → Generates checkout.page.ts, checkout.spec.ts, factory functions, Zod schemas, etc.
  → Runs the test: npx playwright test checkout.spec.ts
  → Confirms: 1 passed, 0 failed
  → If something is wrong, it reads the error and iterates again
```

That entire loop happens without you writing a line. The agent is doing all the work.
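One of the artifacts in that list, the factory functions, follows a pattern the agent can reproduce mechanically: valid defaults plus per-test overrides. Here is a minimal, dependency-free sketch; the `Order` shape is invented for illustration, and the real scaffold would use Faker for random values.

```typescript
// Hypothetical data factory the agent might generate alongside checkout.spec.ts.
// The Order shape is invented; the scaffold itself would use Faker for random values.

interface Order {
  id: string;
  email: string;
  items: { sku: string; quantity: number }[];
  total: number;
}

// A factory returns valid defaults and lets each test override only what it cares about.
export function createOrder(overrides: Partial<Order> = {}): Order {
  return {
    id: `order-${Math.random().toString(36).slice(2, 10)}`,
    email: 'test.user@example.com',
    items: [{ sku: 'SKU-001', quantity: 1 }],
    total: 19.99,
    ...overrides,
  };
}
```

A test that only cares about the total writes `createOrder({ total: 199.99 })` and inherits everything else, which is exactly the kind of convention an agent can read once and repeat everywhere.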

*[Image: The Agentic Loop]*


🎯 Why Test Automation Is a Perfect Fit

Not every software task is well-suited for agents. But test automation has properties that make it almost ideal:

  • It's highly structured. Tests follow patterns. Page objects follow patterns. Data factories follow patterns. Patterns are exactly what AI agents are good at recognizing and reproducing.

  • It's verifiable. After generating a test, the agent can run it. A passing test is objective confirmation that the output is correct. The agent doesn't have to guess. It can check.

  • It's repetitive. Writing a page object for the tenth page in your app involves the same thinking as the first. That repetition is tedious for humans and trivial for agents.

  • It requires codebase context. A good test has to fit the existing project. An agent that can read your files produces output that integrates cleanly. A chatbot produces output you have to adapt manually.
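The verifiability point is worth making concrete: the agent's check reduces to "run a command, read the exit code." A hedged sketch of that step; in the scaffold the command would be `npx playwright test checkout.spec.ts`, but the helper itself is generic and the function name is invented.

```typescript
import { spawnSync } from 'node:child_process';

// Illustrative verification step: run a command and report pass/fail.
// Exit code 0 is the objective signal the agent iterates on.
export function verifyByCommand(
  cmd: string,
  args: string[],
): { passed: boolean; output: string } {
  const result = spawnSync(cmd, args, { encoding: 'utf8' });
  return {
    passed: result.status === 0,
    output: (result.stdout ?? '') + (result.stderr ?? ''),
  };
}

// e.g. verifyByCommand('npx', ['playwright', 'test', 'checkout.spec.ts'])
```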

It's important to note that a human in the loop is still mandatory. The agent is not a replacement for a human; it's a tool that helps you do your job faster and better. You still need to review the output and decide what to do next.

*[Image: Why agents are a perfect fit for automation]*


🧠 The Mental Shift

The biggest change is not technical. It is how you think about your role.

Before agentic QA: You are the writer. You design the test, write the page object, wire up the fixture, create the factory, run the tests.

With agentic QA: You are the reviewer and director. You built the architecture. You define the goal, review the output, catch anything the agent missed, and decide what to test next.

This doesn't mean less work. It means different work, higher-level work. You spend more time thinking about what to test and less time typing boilerplate.

*[Image: The Mental Shift]*


🛠️ What Makes an Agent Reliable?

Here's the thing nobody tells you: a raw AI agent, pointed at your codebase with no guidance, will produce inconsistent results. Sometimes good, sometimes generic, sometimes wrong in subtle ways.

What makes an agent reliable is context and constraints:

  • Rules about what it must always do
  • Rules about what it must never do
  • Detailed expertise for specific tasks
  • A workflow it follows before generating code

In this scaffold, all of that lives in two places: CLAUDE.md (the orchestrator) and .claude/skills/ (the expertise files). Together they turn a general-purpose AI into something that behaves like a senior QA engineer who has been on your project for months.
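To make "context and constraints" concrete, here is a sketch of what a rules section in such a file might contain. The specific rules below are invented for illustration, not quoted from the scaffold's actual CLAUDE.md, which the next article covers.

```markdown
## Rules (illustrative excerpt)

Always:
- Read the existing page objects in `pages/` before creating a new one
- Run the generated spec with `npx playwright test` before reporting done

Never:
- Use hard-coded waits (`page.waitForTimeout`)
- Invent selectors without discovering them in the browser first
```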

That's exactly what the next two articles are about.


🙏🏻 Thank you for reading! The concept of agentic QA sounds futuristic, but the tools exist today and the scaffold is built to use them. Next up: how CLAUDE.md works and why it's the most important file in the project.

You can find the Public README.md file for the scaffold on GitHub: Playwright Scaffold

You can get access to the private GitHub repository here: Get Access
