<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: nklars0</title>
    <description>The latest articles on DEV Community by nklars0 (@nikolarss0n).</description>
    <link>https://dev.to/nikolarss0n</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3785491%2Fb6471bf3-b70d-4161-a149-0726716fc2c0.jpeg</url>
      <title>DEV Community: nklars0</title>
      <link>https://dev.to/nikolarss0n</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nikolarss0n"/>
    <language>en</language>
    <item>
      <title>I Fixed 110 Failing E2E Tests in 2 Hours Without Writing a Single Line of Test Code</title>
      <dc:creator>nklars0</dc:creator>
      <pubDate>Sun, 22 Feb 2026 21:09:18 +0000</pubDate>
      <link>https://dev.to/nikolarss0n/i-fixed-110-failing-e2e-tests-in-2-hours-without-writing-a-single-line-of-test-code-2mfd</link>
      <guid>https://dev.to/nikolarss0n/i-fixed-110-failing-e2e-tests-in-2-hours-without-writing-a-single-line-of-test-code-2mfd</guid>
      <description>&lt;p&gt;110 failing Playwright tests. Login flows, multi-step form wizards, search filters, file uploads, complex user workflows. Some failures were missing UI steps. Some were dirty state from previous runs. Some were stale selectors. I fixed all of them in 2 hours. I didn't write a single line of test code.&lt;/p&gt;

&lt;p&gt;I built a plugin that does it: &lt;a href="https://github.com/kaizen-yutani/playwright-autopilot" rel="noopener noreferrer"&gt;https://github.com/kaizen-yutani/playwright-autopilot&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;How the debugging workflow actually works&lt;/p&gt;

&lt;p&gt;When you run a test through the plugin, a lightweight capture hook is injected into Playwright's worker process. It monkey-patches BrowserContext._initialize to add an instrumentation listener, so it needs no modifications to Playwright's source code and works with any existing installation.&lt;/p&gt;
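
&lt;p&gt;To make the monkey-patching idea concrete, here is a minimal sketch of the general technique. FakeContext and installCaptureHook are hypothetical stand-ins I made up for illustration; this is not Playwright's BrowserContext and not the plugin's actual hook.&lt;/p&gt;

```typescript
// Hypothetical stand-in for a class whose setup method we want to observe.
// It is NOT Playwright's BrowserContext; all names here are illustrative.
class FakeContext {
  listeners: string[] = [];
  initialized = false;

  _initialize(): void {
    this.initialized = true;
  }
}

// Wrap the original method so every instance also registers a listener.
function installCaptureHook(): void {
  const original = FakeContext.prototype._initialize;
  FakeContext.prototype._initialize = function (this: FakeContext): void {
    original.call(this);            // keep the original behaviour intact
    this.listeners.push("capture"); // then attach our instrumentation
  };
}

installCaptureHook();
const context = new FakeContext();
context._initialize();
console.log(context.initialized, context.listeners);
```

&lt;p&gt;The class itself is never edited. The wrapper runs the original method first and only then adds the listener, which is why this style of hook can sit on top of an installed library.&lt;/p&gt;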

&lt;p&gt;From that point, every browser action is recorded:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;DOM snapshots — full ARIA tree of the page captured before and after each click, fill, select, and navigation. When a test fails, you see exactly what the page looked like at the moment of failure, and what it looked like one step before.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Network requests — URL, method, status code, timing, request body, response body. Filter by status (400+ to find failed API calls), by URL pattern, or by method.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Console output — errors, warnings, and logs tied to the specific action that produced them. Not a wall of text — scoped to the step that matters.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Screenshots — captured at the point of failure.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
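
&lt;p&gt;The network filters described above can be sketched roughly like this. The NetworkRecord and NetworkFilter shapes and the filterRequests function are assumptions for illustration, not the plugin's real data model or API.&lt;/p&gt;

```typescript
// Assumed record shape, based on the fields listed above; the plugin's
// real data model may differ.
interface NetworkRecord {
  url: string;
  method: string;
  status: number;
}

// All filter fields are optional; an omitted field matches everything.
interface NetworkFilter {
  minStatus?: number;
  urlPattern?: RegExp;
  method?: string;
}

function filterRequests(records: NetworkRecord[], filter: NetworkFilter): NetworkRecord[] {
  return records.filter((r) => {
    if (filter.minStatus !== undefined) {
      if (filter.minStatus > r.status) return false;
    }
    if (filter.urlPattern !== undefined) {
      if (!filter.urlPattern.test(r.url)) return false;
    }
    if (filter.method !== undefined) {
      if (filter.method !== r.method) return false;
    }
    return true;
  });
}

// Made-up sample data.
const records: NetworkRecord[] = [
  { url: "/api/login", method: "POST", status: 200 },
  { url: "/api/cart", method: "GET", status: 404 },
  { url: "/api/checkout", method: "POST", status: 500 },
];

const failed = filterRequests(records, { minStatus: 400 });
const failedPosts = filterRequests(records, { minStatus: 400, method: "POST" });
console.log(failed.length, failedPosts.map((r) => r.url));
```

&lt;p&gt;With this sample data, the 400+ filter keeps the two failed calls, and adding the POST filter narrows that down to the checkout request.&lt;/p&gt;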

&lt;p&gt;The AI doesn't dump all of this into context at once. The plugin is built on MCP (Model Context Protocol), so the agent pulls data on demand: the action timeline first, then the specific failing step, its DOM snapshot, the network response, and the console output. There are 32 tools, each returning just what's needed. Token-efficient by design.&lt;/p&gt;

&lt;p&gt;It thinks in user flows, not selectors&lt;/p&gt;

&lt;p&gt;Before touching code, the agent maps the intended user journey: "a user logs in, fills out a multi-step form, uploads a file, submits." It walks through the steps a real user would perform and compares that against what the test actually did.&lt;/p&gt;

&lt;p&gt;When a step is missing — a dropdown never selected, a required field never filled, a radio button never clicked — it finds the existing page object method in your codebase and adds the call. No new abstractions. Minimal diff.&lt;/p&gt;
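
&lt;p&gt;One simple way to picture the comparison between the intended journey and what the test actually did is an ordered diff of step names. This is my own sketch under that assumption, not the agent's actual algorithm; findMissingSteps and the step names are hypothetical.&lt;/p&gt;

```typescript
// Returns the intended steps that never appear, in order, in the recorded actions.
function findMissingSteps(intended: string[], actual: string[]): string[] {
  const missing: string[] = [];
  let cursor = 0; // position in `actual` after the last matched step
  for (const step of intended) {
    const found = actual.indexOf(step, cursor);
    if (found === -1) {
      missing.push(step);
    } else {
      cursor = found + 1;
    }
  }
  return missing;
}

// Hypothetical step names for a checkout-style flow.
const intendedFlow = ["login", "fillAddress", "selectCountry", "uploadFile", "submit"];
const recordedActions = ["login", "fillAddress", "uploadFile", "submit"];
const missingSteps = findMissingSteps(intendedFlow, recordedActions);
console.log(missingSteps);
```

&lt;p&gt;Here the only missing step is selectCountry, which is the kind of gap the agent would then close by calling an existing page object method.&lt;/p&gt;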

&lt;p&gt;It follows your architecture&lt;/p&gt;

&lt;p&gt;Page Object Model, business/service layer, whatever pattern your team uses — it reads your codebase and works within it. Uses getByRole(), getByTestId(), web-first assertions. No page.evaluate() hacks, no waitForTimeout, no try/catch around Playwright actions.&lt;/p&gt;

&lt;p&gt;If the application itself is broken — 500s regardless of input, unhandled exceptions in app code — it tells you that instead of working around it.&lt;/p&gt;

&lt;p&gt;It learns and remembers&lt;/p&gt;

&lt;p&gt;After a test passes, the plugin automatically saves the verified user flow — the exact sequence of interactions that make up the happy path. Next time that test breaks, the agent already knows the intended journey and jumps straight to identifying what changed.&lt;/p&gt;

&lt;p&gt;Run e2e_build_flows once across your suite and it captures every test's journey. The agent gets faster over time.&lt;/p&gt;

&lt;p&gt;A real example&lt;/p&gt;

&lt;p&gt;A checkout test was failing with "locator resolved to hidden element." The usual debugging path: open trace viewer, find the step, read the DOM, realize a country dropdown was never selected so the shipping section never rendered. 20 minutes if you're fast.&lt;/p&gt;

&lt;p&gt;The plugin found the same root cause in one run. It pulled the DOM snapshot at the failing step, saw the unselected dropdown with its options sitting right there in the ARIA tree, searched the page objects for selectCountry(), found it, added the call in the service layer, re-ran the test. Passed. One fix, 12 seconds of AI thinking.&lt;/p&gt;

&lt;p&gt;Get started&lt;/p&gt;

&lt;h2&gt;Add the marketplace&lt;/h2&gt;

&lt;p&gt;/plugin marketplace add kaizen-yutani/playwright-autopilot&lt;/p&gt;

&lt;h2&gt;Install the plugin&lt;/h2&gt;

&lt;p&gt;/plugin install kaizen-yutani/playwright-autopilot&lt;/p&gt;

&lt;p&gt;Then prompt: Fix all failing e2e tests&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/kaizen-yutani/playwright-autopilot" rel="noopener noreferrer"&gt;https://github.com/kaizen-yutani/playwright-autopilot&lt;/a&gt; — star it, try it on your flakiest test, tell me what breaks.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>playwright</category>
      <category>testing</category>
      <category>automation</category>
    </item>
  </channel>
</rss>
