<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: anjo zulaybar</title>
    <description>The latest articles on DEV Community by anjo zulaybar (@anjo_zulaybar_0b0a0e967eb).</description>
    <link>https://dev.to/anjo_zulaybar_0b0a0e967eb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3956087%2Fd39dc7b1-807b-42ba-94fb-3f1f79ab9a1d.png</url>
      <title>DEV Community: anjo zulaybar</title>
      <link>https://dev.to/anjo_zulaybar_0b0a0e967eb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/anjo_zulaybar_0b0a0e967eb"/>
    <language>en</language>
    <item>
      <title>We stopped writing Playwright selectors and let AI figure it out</title>
      <dc:creator>anjo zulaybar</dc:creator>
      <pubDate>Thu, 28 May 2026 08:02:46 +0000</pubDate>
      <link>https://dev.to/anjo_zulaybar_0b0a0e967eb/we-stopped-writing-playwright-selectors-and-let-ai-figure-it-out-1hh8</link>
      <guid>https://dev.to/anjo_zulaybar_0b0a0e967eb/we-stopped-writing-playwright-selectors-and-let-ai-figure-it-out-1hh8</guid>
      <description>&lt;h2&gt;
  
  
  The problem with selector-based testing
&lt;/h2&gt;

&lt;p&gt;If you've maintained a Playwright or Cypress test suite for more than a few months, you know the drill. A designer renames a class, a developer restructures a form, and suddenly 30 tests are broken — not because the feature broke, but because .submit-btn became [data-action="submit"].&lt;/p&gt;

&lt;p&gt;You end up in a loop: fix selectors, ship, selectors break, fix selectors. The tests stop being useful because nobody trusts them.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we built
&lt;/h2&gt;

&lt;p&gt;We built Confidence Gate — an AI-powered test execution engine where you describe test steps in plain English and the system figures out the rest.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;p&gt;await page.locator('[data-testid="email-input"]').fill('user@example.com');&lt;br&gt;
await page.locator('button[type="submit"]').click();&lt;br&gt;
await expect(page).toHaveURL('/dashboard');&lt;/p&gt;

&lt;p&gt;You write:&lt;/p&gt;

&lt;p&gt;{ "action": "enter the email from the test data in the email field",&lt;br&gt;&lt;br&gt;
"expected": "the email field contains the entered address" }&lt;br&gt;
"expected": "the dashboard is displayed and the login form is gone" }&lt;/p&gt;

&lt;p&gt;The engine translates each step into a typed intent, resolves the target element from the accessibility tree, executes it in a real Playwright browser, takes a screenshot, and verifies the outcome visually.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the execution engine works
&lt;/h2&gt;

&lt;p&gt;Each step goes through four stages:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Intent generation:&lt;/strong&gt; The natural-language action is converted to a structured JSON object, for example { action: "click", target: { label: "Sign In", role: "button" }, value: null }. This separates intent from implementation.&lt;/p&gt;
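&lt;p&gt;As a rough sketch of what a typed intent could look like in Python (the backend language), the dataclasses below mirror the JSON shape from stage 1. The names Intent, Target, parse_intent, and the ALLOWED_ACTIONS vocabulary are hypothetical and are not taken from the Confidence Gate codebase.&lt;/p&gt;

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical types mirroring the JSON shape in the text; not the project's real classes.
ALLOWED_ACTIONS = {"click", "fill", "select", "hover"}  # assumed action vocabulary


@dataclass(frozen=True)
class Target:
    label: str
    role: Optional[str] = None


@dataclass(frozen=True)
class Intent:
    action: str
    target: Target
    value: Optional[str] = None


def parse_intent(raw: str) -> Intent:
    """Validate model output and turn it into a typed intent."""
    data = json.loads(raw)
    if data["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {data['action']!r}")
    target = Target(label=data["target"]["label"], role=data["target"].get("role"))
    return Intent(action=data["action"], target=target, value=data.get("value"))


intent = parse_intent(
    '{"action": "click", "target": {"label": "Sign In", "role": "button"}, "value": null}'
)
```

&lt;p&gt;Validating the action name up front means a malformed model response is rejected before anything touches the browser.&lt;/p&gt;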

&lt;p&gt;&lt;strong&gt;2. Element resolution:&lt;/strong&gt; A multi-tier resolver finds the element, trying the accessibility tree first (fast, reliable), CSS heuristics second, and an AI-assisted fallback third.&lt;/p&gt;
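&lt;p&gt;The tier ordering in stage 2 can be sketched as a chain of resolver functions tried in sequence. Everything here is an assumption made for illustration: the Resolver signature, the in-memory page dictionary standing in for a real browser page, and the selectors the tiers return. The AI-assisted tier is left out.&lt;/p&gt;

```python
from typing import Callable, Optional

# A resolver takes (page, label) and returns a selector string or None.
Resolver = Callable[[dict, str], Optional[str]]


def by_accessibility_tree(page: dict, label: str) -> Optional[str]:
    # Tier 1: match on accessible name and return a role-based selector.
    for node in page.get("a11y", []):
        if node["name"].lower() == label.lower():
            return f'role={node["role"]}[name="{node["name"]}"]'
    return None


def by_css_heuristics(page: dict, label: str) -> Optional[str]:
    # Tier 2: guess from class names that contain the label's words.
    slug = label.lower().replace(" ", "-")
    for css_class in page.get("classes", []):
        if slug in css_class:
            return f".{css_class}"
    return None


def resolve(page: dict, label: str, tiers: list[Resolver]) -> Optional[str]:
    """Return the first selector any tier produces, cheapest tier first."""
    for tier in tiers:
        selector = tier(page, label)
        if selector is not None:
            return selector
    return None


# An AI-assisted third tier would be appended to this list.
TIERS = [by_accessibility_tree, by_css_heuristics]
page = {"a11y": [{"role": "button", "name": "Sign In"}], "classes": ["submit-btn"]}
```

&lt;p&gt;Because each tier shares one signature, the expensive fallback only runs when the cheaper tiers return nothing.&lt;/p&gt;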

&lt;p&gt;&lt;strong&gt;3. Execution + behavior detection:&lt;/strong&gt; Playwright executes the action. A mutation observer watches for DOM changes, URL changes, and value changes to confirm that the action actually had an effect.&lt;/p&gt;
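&lt;p&gt;One way to picture behavior detection is as a diff between page snapshots taken before and after the action. The sketch below is illustrative only: PageSnapshot and detect_behavior are hypothetical names, and in a real browser the DOM signal would come from a mutation observer rather than a precomputed hash.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageSnapshot:
    url: str
    dom_hash: str        # assumed: a digest of the serialized DOM
    field_values: tuple  # assumed: (name, value) pairs for form fields


def detect_behavior(before: PageSnapshot, after: PageSnapshot) -> set[str]:
    """Return which kinds of change the action produced (empty set = nothing happened)."""
    signals = set()
    if before.url != after.url:
        signals.add("url_changed")
    if before.dom_hash != after.dom_hash:
        signals.add("dom_changed")
    if before.field_values != after.field_values:
        signals.add("value_changed")
    return signals


before = PageSnapshot("/login", "a1", (("email", ""),))
after = PageSnapshot("/login", "b2", (("email", "user@example.com"),))
signals = detect_behavior(before, after)
```

&lt;p&gt;An empty result is the interesting case: the action ran without error, yet nothing observable changed.&lt;/p&gt;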

&lt;p&gt;&lt;strong&gt;4. Verification:&lt;/strong&gt; A vision model looks at the post-action screenshot and checks it against the expected result. If behavior was detected but verification fails, the engine assumes it hit the wrong element, blacklists that selector, and retries.&lt;/p&gt;
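&lt;p&gt;The retry rule in stage 4 can be sketched as a loop that blacklists a selector whenever the action had an effect but the expected result was not observed. The callables passed to run_step are hypothetical stand-ins for the resolver, the Playwright action, and the vision check. Returning "fail" when no behavior is detected, and capping the loop at three attempts, are assumptions, since the text only describes the wrong-element case.&lt;/p&gt;

```python
from typing import Callable, Optional


def run_step(
    resolve: Callable[[set], Optional[str]],  # picks a selector, skipping blacklisted ones
    execute: Callable[[str], bool],           # returns True if behavior was detected
    verify: Callable[[], bool],               # returns True if the expected result is visible
    max_attempts: int = 3,
) -> str:
    blacklist: set = set()
    for _ in range(max_attempts):
        selector = resolve(blacklist)
        if selector is None:
            return "inconclusive"  # no candidates left to try
        behavior_detected = execute(selector)
        if verify():
            return "pass"
        if not behavior_detected:
            return "fail"  # assumption: no effect at all is a plain failure
        # Something happened, but not what was expected: assume wrong element.
        blacklist.add(selector)
    return "fail"


# Demo with fake callables: the first candidate is a decoy, the second is correct.
candidates = ["#newsletter-submit", "#login-submit"]
clicked = []


def demo_resolve(blacklist):
    return next((c for c in candidates if c not in blacklist), None)


def demo_execute(selector):
    clicked.append(selector)
    return True


def demo_verify():
    return clicked[-1] == "#login-submit"


result = run_step(demo_resolve, demo_execute, demo_verify)
```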

&lt;h2&gt;
  
  
  Self-healing selectors
&lt;/h2&gt;

&lt;p&gt;When a selector stops working between deploys, the repair loop kicks in. It re-queries the accessibility tree, scores candidate elements against the original target description, and picks the best match. The new selector is cached so the next run is fast.&lt;/p&gt;
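&lt;p&gt;A minimal sketch of the scoring step in the repair loop, assuming candidates are scored by role match plus fuzzy similarity of the accessible name. The 0.4 and 0.6 weights, the 0.6 threshold, and the names score_candidate and repair_selector are illustrative assumptions; the post does not describe the actual scoring function.&lt;/p&gt;

```python
from difflib import SequenceMatcher
from typing import Optional


def score_candidate(target: dict, candidate: dict) -> float:
    """Score in [0, 1]: 0.4 for a role match plus 0.6 times name similarity."""
    role_score = 0.4 if target.get("role") == candidate.get("role") else 0.0
    name_similarity = SequenceMatcher(
        None, target["label"].lower(), candidate["name"].lower()
    ).ratio()
    return role_score + 0.6 * name_similarity


def repair_selector(target: dict, candidates: list[dict], cache: dict,
                    threshold: float = 0.6) -> Optional[str]:
    """Pick the best-scoring candidate and cache its selector for the next run."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: score_candidate(target, c))
    if threshold > score_candidate(target, best):
        return None  # nothing close enough; report it rather than guess
    cache[target["label"]] = best["selector"]
    return best["selector"]


cache: dict = {}
target = {"label": "Sign In", "role": "button"}
candidates = [
    {"role": "link", "name": "Sign up", "selector": "a.signup"},
    {"role": "button", "name": "Sign in", "selector": '[data-action="submit"]'},
]
healed = repair_selector(target, candidates, cache)
```

&lt;p&gt;The threshold matters as much as the ranking: without it, the loop would always pick some element, even when the original one has been removed from the page.&lt;/p&gt;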

&lt;h2&gt;
  
  
  The confidence score
&lt;/h2&gt;

&lt;p&gt;After a run, every step result feeds into a score (0–100) built from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pass / fail / inconclusive ratio
&lt;/li&gt;
&lt;li&gt;Flakiness history (tests that flip between runs)&lt;/li&gt;
&lt;li&gt;Selector stability (how often repair had to run)
&lt;/li&gt;
&lt;li&gt;AI risk analysis against a PRD (optional)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The score maps to a gate decision: ship, caution, or block. You can call the API from CI and fail a deployment if the score drops below your threshold.&lt;/p&gt;
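&lt;p&gt;To make the mapping concrete, here is a sketch that combines the first three signals into a 0–100 score and maps it to a gate decision. The weights and the 80 and 60 cut-offs are assumptions chosen for illustration, not the project's actual formula, and the optional PRD risk analysis is left out.&lt;/p&gt;

```python
def confidence_score(passed: int, failed: int, inconclusive: int,
                     flaky_tests: int, repaired_steps: int) -> float:
    """Weighted 0-100 score; all weights are illustrative assumptions."""
    total = passed + failed + inconclusive
    if total == 0:
        return 0.0
    pass_ratio = passed / total
    flakiness = min(flaky_tests / total, 1.0)
    instability = min(repaired_steps / total, 1.0)
    score = 100 * (0.7 * pass_ratio + 0.15 * (1 - flakiness) + 0.15 * (1 - instability))
    return round(score, 1)


def gate_decision(score: float, ship_at: float = 80.0, caution_at: float = 60.0) -> str:
    """Map a score to one of the three gate outcomes named in the text."""
    if score >= ship_at:
        return "ship"
    if score >= caution_at:
        return "caution"
    return "block"


score = confidence_score(passed=18, failed=1, inconclusive=1, flaky_tests=2, repaired_steps=4)
decision = gate_decision(score)
```

&lt;p&gt;A CI job could run the same mapping on the score returned by the API and exit with a non-zero status when the decision is "block".&lt;/p&gt;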

&lt;h2&gt;
  
  
  Stack and setup
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Backend: Python 3.11 · FastAPI · Celery · MongoDB · Redis · MinIO · Playwright&lt;/li&gt;
&lt;li&gt;Frontend: Next.js 15 · React 19 · TypeScript · Tailwind CSS v4
&lt;/li&gt;
&lt;li&gt;AI: pluggable — OpenAI, Anthropic Claude, or Ollama for local models&lt;/li&gt;
&lt;li&gt;Auth: Firebase or local JWT (no Firebase account needed for dev)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;git clone &lt;a href="https://github.com/OaktreeInnovations/confidence-gate.git" rel="noopener noreferrer"&gt;https://github.com/OaktreeInnovations/confidence-gate.git&lt;/a&gt;&lt;br&gt;
cd confidence-gate&lt;br&gt;
cp .env.example .env&lt;br&gt;
make up&lt;/p&gt;

&lt;p&gt;Open &lt;a href="http://localhost:3001" rel="noopener noreferrer"&gt;http://localhost:3001&lt;/a&gt; and you're running.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;We're working on four things in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Stabilising execution (fewer inconclusive steps, better handling of edge cases)&lt;/li&gt;
&lt;li&gt;Better PRD coverage analysis (requirement-level traceability, not just a score)&lt;/li&gt;
&lt;li&gt;Browser recording (record your actions → auto-generate test cases)
&lt;/li&gt;
&lt;li&gt;Test generation from PRD (upload a spec → get a full test suite)
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The repo is MIT licensed and open to contributions. If any of this is interesting to you, especially the browser recording or the AI execution engine, come say hi on GitHub.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/OaktreeInnovations/confidence-gate" rel="noopener noreferrer"&gt;https://github.com/OaktreeInnovations/confidence-gate&lt;/a&gt;&lt;/p&gt;

</description>
      <category>testing</category>
      <category>ai</category>
      <category>playwright</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
