
Sara Bezjak

AI Tools for Existing Playwright + Pytest Frameworks: What Actually Works

Purpose

Research and evaluate AI-powered tools and workflows to improve test automation efficiency, specifically for test creation speed and reducing maintenance time when UI or business flows change. Focus on tools compatible with an existing Playwright + pytest (Python) stack and IntelliJ IDE.

Current Workflow & Pain Points

The two primary pain points in test automation are:

Creating new tests: Requires manually assembling context (page objects, fixture patterns, example tests) and writing tests that match existing conventions. The copy-paste workflow works but is slow and repetitive.

Updating tests when UI or flows change: When the product changes, tests break. Diagnosing which tests are affected, understanding what changed, and fixing them to match the new behavior consumes significant time.

Tools Evaluated

Claude Code (Anthropic) — Recommended

Claude Code is a terminal-based AI coding assistant that works with your entire codebase as context. It integrates with IntelliJ via a plugin (currently in beta) and can read, generate, and modify files directly in the project.

Key advantages:

  • Works in IntelliJ via plugin or integrated terminal. No IDE switch required.

  • Reads the full repository (page objects, fixtures, test files), so generated code matches existing patterns and conventions.

  • Supports a CLAUDE.md configuration file in the project root that defines framework conventions, naming patterns, fixture usage, and domain context. This keeps output framework-specific rather than generic.

  • Suggests changes via IntelliJ's native diff viewer, making review and approval straightforward.

  • Shares IDE diagnostics (lint errors, syntax issues) automatically.

  • Available on Pro plan ($20/month), which is sufficient for regular usage.

Used for: generating a billing-change test with Claude Code and full project context. The output followed existing page object patterns, used the correct fixtures, and required minimal manual adjustment.

Playwright MCP (Model Context Protocol)

Playwright MCP is a server that gives AI tools live browser access. Instead of manually inspecting the DOM for selectors or using codegen tools, Claude Code can navigate the application, interact with elements, and read the actual page structure.

Useful for: Discovering selectors on new or changed pages without manually opening DevTools / Codegen. Especially valuable when new UI elements are added as part of feature changes. Requires guidance on which flow to walk through (natural language instructions).
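For reference, MCP servers are typically registered through a small JSON config. A minimal sketch (the exact file location and registration mechanism depend on your Claude Code setup and may change between versions):

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

Once registered, a natural-language prompt like "open the billing page and list the selectors for the plan dropdown" lets the model drive a real browser session instead of guessing at the DOM.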

Playwright Agents (Planner / Generator / Healer) — Not Compatible Yet

Playwright v1.56 introduced three AI agents that can generate test plans, create test code, and automatically fix broken tests. The Healer agent is particularly interesting for maintenance. It replays failing tests, inspects the live UI, and patches selectors or waits.

However, these agents currently only support TypeScript/JavaScript. There is an open feature request for Python support but no timeline.

Cursor — Viable Alternative

Cursor is an AI-powered IDE (VS Code-based) that provides full codebase context and inline AI editing. Comparable to Claude Code in capabilities for test generation.

Disadvantage: Requires switching from IntelliJ to a VS Code-based editor, which means losing existing IDE configuration, shortcuts, and debugging setup. The functionality overlap with Claude Code did not justify the migration cost.

Platform-Based Tools (Testim, Mabl, Katalon, ContextQA)

These are full test automation platforms with AI features including self-healing selectors, test generation from natural language, and visual test builders.

Not recommended because:

  • They require adopting their platform and abandoning your existing framework.

  • Generated test code is generic and does not match existing page object structure, fixture patterns, or naming conventions.

  • You lose domain-specific knowledge already embedded in your current test suite.

  • Migrating away from a platform later is expensive.

Qase Aiden

Evaluated previously and joined a live demo call. Generates test code but it is generic and does not adapt to codebase patterns. Same limitation as the platform tools above.

Implementation

Completed:

  • Installed Claude Code CLI

  • Set up Playwright MCP server for live browser access during test creation

  • Created CLAUDE.md in project root with framework conventions, project structure, page object patterns, fixture descriptions, test naming conventions, and domain context

  • Successfully generated a test using Claude Code with full project context — output matched existing framework patterns
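To give a feel for the CLAUDE.md file, here is a condensed, hypothetical sketch of the kind of content it can hold. The headings, paths, and conventions below are examples, not the project's actual file:

```markdown
# CLAUDE.md

## Framework conventions
- Tests live in `tests/`, page objects in `pages/`, one class per screen.
- Reuse existing fixtures from `conftest.py`; never create ad-hoc logins.

## Naming
- Test files: `test_<feature>.py`; test functions: `test_<action>_<expected>`.

## Domain context
- "Plan" refers to a billing subscription tier; plan changes apply immediately.
```

Keeping this file short and declarative worked best: it is read on every session, so it should state conventions, not tutorials.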

Next steps:

  • Continue using Claude Code for upcoming test generation (both simple and complex tests, with a comparison between them)

  • Use Claude Code for upcoming test maintenance and updates to measure time savings vs manual approach

  • Continue monitoring Playwright Agents for Python support

  • Research and write about JavaScript agent healers

Takeaway

After evaluating the available tools, the best results came from bringing AI into the existing codebase rather than switching to a new platform. The CLAUDE.md file made the biggest difference: once the framework conventions were clearly described, the generated code matched existing patterns consistently. There is a clear improvement in speed for both test creation and maintenance, but the process still requires human guidance, architectural thinking, and review. It's a powerful assistant rather than a replacement, and it will be interesting to see what it becomes capable of next.


I'm a solo QA automation engineer and founder based in Slovenia. I build test frameworks, evaluate tooling, and write about what actually works in QA. Find me on LinkedIn.
