<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: anjo zulaybar</title>
    <description>The latest articles on DEV Community by anjo zulaybar (@anjo_zulaybar_0b0a0e967eb).</description>
    <link>https://dev.to/anjo_zulaybar_0b0a0e967eb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3956087%2Fd39dc7b1-807b-42ba-94fb-3f1f79ab9a1d.png</url>
      <title>DEV Community: anjo zulaybar</title>
      <link>https://dev.to/anjo_zulaybar_0b0a0e967eb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/anjo_zulaybar_0b0a0e967eb"/>
    <language>en</language>
    <item>
      <title>We stopped writing Playwright selectors and let AI figure it out</title>
      <dc:creator>anjo zulaybar</dc:creator>
      <pubDate>Thu, 28 May 2026 08:02:46 +0000</pubDate>
      <link>https://dev.to/anjo_zulaybar_0b0a0e967eb/we-stopped-writing-playwright-selectors-and-let-ai-figure-it-out-1hh8</link>
      <guid>https://dev.to/anjo_zulaybar_0b0a0e967eb/we-stopped-writing-playwright-selectors-and-let-ai-figure-it-out-1hh8</guid>
      <description>&lt;h2&gt;
  
  
  The problem with selector-based testing
&lt;/h2&gt;

&lt;p&gt;If you've maintained a Playwright or Cypress test suite for more than a few months, you know the drill. A designer renames a class, a developer restructures a form, and suddenly 30 tests are broken — not because the feature broke, but because .submit-btn became [data-action="submit"].&lt;/p&gt;

&lt;p&gt;You end up in a loop: fix selectors, ship, selectors break, fix selectors. The tests stop being useful because nobody trusts them.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we built
&lt;/h2&gt;

&lt;p&gt;We built Confidence Gate — an AI-powered test execution engine where you describe test steps in plain English and the system figures out the rest.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[data-testid="email-input"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user@example.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;                                                                                                                                         
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button[type="submit"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;      
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toHaveURL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/dashboard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You write:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"enter the email from the test data in the email field"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;                                                                                                                                                
&lt;/span&gt;&lt;span class="nl"&gt;"expected"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"the email field contains the entered address"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;engine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;translates&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;each&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;step&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;into&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;typed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;intent,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;resolves&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;element&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;accessibility&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;tree,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;executes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;it&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;real&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Playwright&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;browser,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;takes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;screenshot,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;verifies&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;outcome&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;visually.&lt;/span&gt;&lt;span class="w"&gt;      
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How the execution engine works
&lt;/h2&gt;

&lt;p&gt;Each step goes through four stages:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Intent generation&lt;/strong&gt;: The natural-language action is converted to structured JSON, for example &lt;code&gt;{ "action": "click", "target": { "label": "Sign In", "role": "button" }, "value": null }&lt;/code&gt;. This separates intent from implementation.&lt;/p&gt;
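&lt;p&gt;As a rough sketch of what a typed intent could look like in Python (the backend language), the snippet below parses the JSON shape from the example into dataclasses. The names &lt;code&gt;Intent&lt;/code&gt;, &lt;code&gt;Target&lt;/code&gt; and &lt;code&gt;parse_intent&lt;/code&gt;, and the set of allowed actions, are assumptions made for illustration and not Confidence Gate's actual schema.&lt;/p&gt;

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of supported actions; the real engine's list may differ.
ALLOWED_ACTIONS = {"click", "fill", "select", "navigate"}


@dataclass(frozen=True)
class Target:
    label: str
    role: Optional[str] = None


@dataclass(frozen=True)
class Intent:
    action: str
    target: Target
    value: Optional[str] = None


def parse_intent(raw: str) -> Intent:
    """Validate model output and turn it into a typed Intent."""
    data = json.loads(raw)
    action = data["action"]
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    target = Target(label=data["target"]["label"], role=data["target"].get("role"))
    return Intent(action=action, target=target, value=data.get("value"))


intent = parse_intent(
    '{"action": "click", "target": {"label": "Sign In", "role": "button"}, "value": null}'
)
print(intent)
```

&lt;p&gt;Rejecting unknown actions at this boundary means later stages only ever see a small, closed vocabulary, whatever the model happened to produce.&lt;/p&gt;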

&lt;p&gt;&lt;strong&gt;2. Element resolution&lt;/strong&gt;: A multi-tier resolver locates the element, trying the accessibility tree first (fast and reliable), CSS heuristics second, and an AI-assisted fallback third.&lt;/p&gt;
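&lt;p&gt;The tiered lookup can be pictured as an ordered list of resolver functions where the first hit wins. This is only a sketch: the page is faked as a list of dicts standing in for accessibility-tree nodes, and every function name here (&lt;code&gt;by_accessibility&lt;/code&gt;, &lt;code&gt;by_css_heuristic&lt;/code&gt;, &lt;code&gt;by_ai_fallback&lt;/code&gt;, &lt;code&gt;resolve&lt;/code&gt;) is hypothetical.&lt;/p&gt;

```python
from typing import Callable, Optional

# A node is a plain dict standing in for an accessibility-tree entry (assumption).
Node = dict
Resolver = Callable[[list, dict], Optional[Node]]


def by_accessibility(nodes: list, target: dict) -> Optional[Node]:
    """Tier 1: exact role + accessible-name match."""
    for node in nodes:
        if node.get("role") == target.get("role") and node.get("name") == target["label"]:
            return node
    return None


def by_css_heuristic(nodes: list, target: dict) -> Optional[Node]:
    """Tier 2: loose match of the label against a CSS-like selector string."""
    needle = target["label"].lower().replace(" ", "-")
    for node in nodes:
        if needle in node.get("css", "").lower():
            return node
    return None


def by_ai_fallback(nodes: list, target: dict) -> Optional[Node]:
    """Tier 3: placeholder for a model call; always gives up in this sketch."""
    return None


TIERS: list[tuple[str, Resolver]] = [
    ("accessibility", by_accessibility),
    ("css", by_css_heuristic),
    ("ai", by_ai_fallback),
]


def resolve(nodes: list, target: dict):
    """Return (tier_name, node) from the first tier that finds a match."""
    for name, resolver in TIERS:
        node = resolver(nodes, target)
        if node is not None:
            return name, node
    return None, None


page = [
    {"role": "textbox", "name": "Email", "css": "#email"},
    {"role": "generic", "name": "", "css": "div.sign-in-cta"},
]
print(resolve(page, {"label": "Sign In", "role": "button"}))
```

&lt;p&gt;Here the accessibility tier finds nothing because no node has the button role, so the CSS tier picks up the element instead. Returning the tier name alongside the node makes it easy to track how often the cheaper tiers are enough.&lt;/p&gt;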

&lt;p&gt;&lt;strong&gt;3. Execution + behavior detection&lt;/strong&gt;: Playwright executes the action. A mutation observer watches for DOM changes, URL changes, and value changes to confirm that the action actually had an effect.&lt;/p&gt;
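&lt;p&gt;One simple way to reason about behavior detection is to compare a snapshot of the page before and after the action. Note that this swaps the live mutation observer for a plain before/after comparison, purely to keep the example self-contained. &lt;code&gt;PageState&lt;/code&gt; and &lt;code&gt;detect_behavior&lt;/code&gt; are made-up names.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageState:
    url: str
    dom_hash: str          # e.g. a hash of the serialized DOM
    field_values: tuple    # (name, value) pairs for form fields


def detect_behavior(before: PageState, after: PageState) -> list:
    """List which kinds of change happened between two snapshots."""
    changes = []
    if before.url != after.url:
        changes.append("url")
    if before.dom_hash != after.dom_hash:
        changes.append("dom")
    if before.field_values != after.field_values:
        changes.append("value")
    return changes


before = PageState("https://app.test/login", "a1", (("email", ""),))
after = PageState("https://app.test/login", "a1", (("email", "user@example.com"),))
print(detect_behavior(before, after))  # ['value']
```

&lt;p&gt;An empty list would mean the action had no observable effect, which is a useful signal on its own before any visual verification runs.&lt;/p&gt;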

&lt;p&gt;&lt;strong&gt;4. Verification&lt;/strong&gt;: A vision model looks at the post-action screenshot and checks it against the expected result. If behavior was detected but verification fails, the engine assumes it hit the wrong element, blacklists that selector, and retries.&lt;/p&gt;
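&lt;p&gt;The retry-with-blacklist idea could look roughly like the loop below. It is a sketch under assumptions: &lt;code&gt;run_step&lt;/code&gt; is a hypothetical name, the &lt;code&gt;execute&lt;/code&gt; and &lt;code&gt;verify&lt;/code&gt; callbacks stand in for Playwright and the vision model, and for simplicity a selector is also blacklisted when nothing happened at all.&lt;/p&gt;

```python
from typing import Callable, Optional


def run_step(
    candidates: list,
    execute: Callable[[str], bool],
    verify: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Try candidate selectors in order, blacklisting any that fail.

    execute(selector) returns True when some behavior was detected.
    verify(selector) returns True when the outcome matches the expected result.
    """
    blacklist: set = set()
    for _ in range(max_attempts):
        selector = next((c for c in candidates if c not in blacklist), None)
        if selector is None:
            return None  # nothing left to try
        behavior_detected = execute(selector)
        if behavior_detected and verify(selector):
            return selector
        # Either nothing happened or the wrong element reacted: skip it next time.
        blacklist.add(selector)
    return None


# Fake page: the first selector reacts but is the wrong button.
winner = run_step(
    candidates=["button.cancel", "button.submit"],
    execute=lambda sel: True,
    verify=lambda sel: sel == "button.submit",
)
print(winner)  # button.submit
```

&lt;p&gt;Capping the number of attempts keeps a genuinely broken step from looping forever; when the candidates run out, the step can be reported as inconclusive or failed.&lt;/p&gt;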

&lt;h2&gt;
  
  
  Self-healing selectors
&lt;/h2&gt;

&lt;p&gt;When a selector stops working between deploys, the repair loop kicks in. It re-queries the accessibility tree, scores candidate elements against the original target description, and picks the best match. The new selector is cached so the next run is fast.&lt;/p&gt;
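&lt;p&gt;To make the scoring step concrete, here is a minimal sketch that ranks candidates by name similarity plus a role bonus and caches the winner. The scoring formula, the 0.6 threshold, and the names &lt;code&gt;score&lt;/code&gt; and &lt;code&gt;repair&lt;/code&gt; are all assumptions; the real repair loop may weigh things quite differently.&lt;/p&gt;

```python
from difflib import SequenceMatcher


def score(candidate: dict, target: dict) -> float:
    """Name similarity in [0, 1], plus a bonus when the role also matches."""
    name_score = SequenceMatcher(
        None, candidate["name"].lower(), target["label"].lower()
    ).ratio()
    role_bonus = 0.25 if candidate.get("role") == target.get("role") else 0.0
    return name_score + role_bonus


def repair(candidates: list, target: dict, cache: dict, min_score: float = 0.6):
    """Pick the best-scoring candidate and remember its selector for the next run."""
    best = max(candidates, key=lambda c: score(c, target), default=None)
    if best is None or min_score > score(best, target):
        return None  # no candidate is close enough to trust
    cache[target["label"]] = best["selector"]
    return best["selector"]


selector_cache: dict = {}
tree = [
    {"name": "Log in", "role": "link", "selector": "a.login"},
    {"name": "Sign in", "role": "button", "selector": "[data-action='submit']"},
]
healed = repair(tree, {"label": "Sign In", "role": "button"}, selector_cache)
print(healed, selector_cache)
```

&lt;p&gt;The minimum score matters as much as the ranking: without it, a repair loop would happily "heal" a test by pointing it at whatever element is least unlike the original.&lt;/p&gt;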

&lt;h2&gt;
  
  
  The confidence score
&lt;/h2&gt;

&lt;p&gt;After a run, every step result feeds into a score (0–100) built from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pass / fail / inconclusive ratio
&lt;/li&gt;
&lt;li&gt;Flakiness history (tests that flip between runs)&lt;/li&gt;
&lt;li&gt;Selector stability (how often repair had to run)
&lt;/li&gt;
&lt;li&gt;AI risk analysis against a PRD (optional)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The score maps to a gate decision: ship, caution, or block. You can call the API from CI and fail a deployment if the score drops below your threshold.&lt;/p&gt;
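&lt;p&gt;As an illustration of how such inputs might be combined, the sketch below computes a weighted 0 to 100 score and maps it to a gate decision. The weights, the thresholds (85 and 60), and the names &lt;code&gt;RunStats&lt;/code&gt;, &lt;code&gt;confidence_score&lt;/code&gt; and &lt;code&gt;gate&lt;/code&gt; are invented for this example, and the optional PRD risk analysis is left out.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    passed: int           # step results
    failed: int
    inconclusive: int
    repaired_steps: int   # steps where selector repair had to run
    flaky_tests: int      # tests whose result flipped between recent runs
    total_tests: int


# Hypothetical weights; they sum to 1.0.
WEIGHTS = {"pass_rate": 0.6, "stability": 0.2, "selectors": 0.2}


def confidence_score(s: RunStats) -> int:
    total_steps = s.passed + s.failed + s.inconclusive
    if total_steps == 0 or s.total_tests == 0:
        return 0  # nothing ran, so there is nothing to be confident about
    pass_rate = s.passed / total_steps
    stability = 1 - s.flaky_tests / s.total_tests
    selectors = 1 - s.repaired_steps / total_steps
    raw = (
        WEIGHTS["pass_rate"] * pass_rate
        + WEIGHTS["stability"] * stability
        + WEIGHTS["selectors"] * selectors
    )
    return round(100 * raw)


def gate(score: int, ship_at: int = 85, caution_at: int = 60) -> str:
    if score >= ship_at:
        return "ship"
    if score >= caution_at:
        return "caution"
    return "block"


stats = RunStats(passed=45, failed=3, inconclusive=2,
                 repaired_steps=5, flaky_tests=1, total_tests=10)
score = confidence_score(stats)
print(score, gate(score))  # 90 ship
```

&lt;p&gt;In CI, a wrapper script could exit non-zero whenever the decision is "block", which is enough for most pipelines to stop a deployment.&lt;/p&gt;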

&lt;h2&gt;
  
  
  Stack and setup
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Backend: Python 3.11 · FastAPI · Celery · MongoDB · Redis · MinIO · Playwright&lt;/li&gt;
&lt;li&gt;Frontend: Next.js 15 · React 19 · TypeScript · Tailwind CSS v4
&lt;/li&gt;
&lt;li&gt;AI: pluggable — OpenAI, Anthropic Claude, or Ollama for local models&lt;/li&gt;
&lt;li&gt;Auth: Firebase or local JWT (no Firebase account needed for dev)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/OaktreeInnovations/confidence-gate.git
&lt;span class="nb"&gt;cd &lt;/span&gt;confidence-gate                                                                                                                                                                                                  
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env                                      
make up
Open http://localhost:3001 and you&lt;span class="s1"&gt;'re running.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;We're working on four things in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Stabilising execution (fewer inconclusive steps, better handling of edge cases)&lt;/li&gt;
&lt;li&gt;Better PRD coverage analysis (requirement-level traceability, not just a score)&lt;/li&gt;
&lt;li&gt;Browser recording (record your actions → auto-generate test cases)
&lt;/li&gt;
&lt;li&gt;Test generation from PRD (upload a spec → get a full test suite)
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The repo is MIT-licensed and open to contributions. If any of this is interesting to you, especially the browser recording or the AI execution engine, come say hi on GitHub.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/OaktreeInnovations/confidence-gate" rel="noopener noreferrer"&gt;https://github.com/OaktreeInnovations/confidence-gate&lt;/a&gt;&lt;/p&gt;

</description>
      <category>testing</category>
      <category>ai</category>
      <category>playwright</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
