<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dylan Worrall</title>
    <description>The latest articles on DEV Community by Dylan Worrall (@dylanworrall).</description>
    <link>https://dev.to/dylanworrall</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3991560%2Fcc4d14c2-c7b1-472c-8099-56ded338c026.png</url>
      <title>DEV Community: Dylan Worrall</title>
      <link>https://dev.to/dylanworrall</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dylanworrall"/>
    <language>en</language>
    <item>
      <title>Browser Automation for AI Agents: What Actually Works</title>
      <dc:creator>Dylan Worrall</dc:creator>
      <pubDate>Thu, 18 Jun 2026 23:16:36 +0000</pubDate>
      <link>https://dev.to/dylanworrall/browser-automation-for-ai-agents-what-actually-works-100m</link>
      <guid>https://dev.to/dylanworrall/browser-automation-for-ai-agents-what-actually-works-100m</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://dylanworrall.com/blog/browser-automation-for-ai-agents" rel="noopener noreferrer"&gt;dylanworrall.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Most agent demos that involve a browser are shot in one take for a reason. The moment you try to make browser automation &lt;em&gt;reliable&lt;/em&gt; — running unattended, across sites you don't control, hundreds of times — it stops being a demo and starts being an engineering problem. I've spent a lot of time on that problem building the browser layer inside &lt;a href="https://froots.ai" rel="noopener noreferrer"&gt;Froots&lt;/a&gt;, and a handful of patterns made the difference between "works in the video" and "works at 3am while I'm asleep."&lt;/p&gt;

&lt;h2&gt;Prefer structured verbs over raw &lt;code&gt;eval&lt;/code&gt;&lt;/h2&gt;

&lt;p&gt;It's tempting to give the agent one giant escape hatch: run arbitrary JavaScript in the page and parse whatever comes back. It works right up until it doesn't, and when it fails it fails opaquely.&lt;/p&gt;

&lt;p&gt;A small vocabulary of structured commands beats one omnipotent one:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;navigate &amp;lt;url&amp;gt;
click &amp;lt;selector&amp;gt;
fill &amp;lt;selector&amp;gt; &amp;lt;value&amp;gt;
type &amp;lt;selector&amp;gt; &amp;lt;value&amp;gt;      # contenteditable-safe; rich-text composers often ignore plain fill
text &amp;lt;selector&amp;gt;              # read innerText back
wait_selector &amp;lt;selector&amp;gt;     # poll until it exists
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The point isn't that &lt;code&gt;eval&lt;/code&gt; is useless — it's the fallback, not the default. Structured verbs give you predictable error messages ("selector not found" beats a stack trace from inside a minified bundle), and they make the agent's intent legible.&lt;/p&gt;
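
&lt;p&gt;To make the error-message point concrete, here is a minimal Python sketch of a verb dispatcher. Everything in it is an assumption for illustration: &lt;code&gt;FakePage&lt;/code&gt;, &lt;code&gt;CommandError&lt;/code&gt;, and &lt;code&gt;run&lt;/code&gt; are hypothetical names, the page is an in-memory dictionary rather than a real browser, and selectors are assumed to contain no spaces.&lt;/p&gt;

```python
from dataclasses import dataclass


class CommandError(Exception):
    """Raised with a short, predictable message the agent can act on."""


@dataclass
class FakePage:
    """Hypothetical in-memory stand-in for a real browser page."""
    elements: dict  # selector -> current text or value
    url: str = "about:blank"


def run(page: FakePage, line: str) -> str:
    """Parse one structured command and apply it to the page."""
    verb, _, rest = line.strip().partition(" ")
    if verb == "navigate":
        if not rest:
            raise CommandError("navigate: missing url")
        page.url = rest
        return page.url
    if verb in ("click", "text", "fill", "type"):
        # Assumption: the selector is the first space-free token.
        selector, _, value = rest.partition(" ")
        if selector not in page.elements:
            raise CommandError(f"{verb}: selector not found: {selector}")
        if verb == "click":
            return "ok"
        if verb in ("fill", "type"):
            page.elements[selector] = value
        return page.elements[selector]
    raise CommandError(f"unknown verb: {verb}")
```

&lt;p&gt;In this sketch a missing element surfaces as a one-line &lt;code&gt;CommandError&lt;/code&gt; that names the verb and the selector, which is the kind of message an agent can reason about and retry on.&lt;/p&gt;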

&lt;h2&gt;Kill the &lt;code&gt;sleep&lt;/code&gt; instinct: wait on conditions&lt;/h2&gt;

&lt;p&gt;The single biggest source of flakiness is &lt;code&gt;sleep(2000)&lt;/code&gt;. Too short and you act before the element exists; too long and every run wastes seconds. Replace time with conditions: poll until the element exists, until the spinner is gone, or until navigation lands. An agent that waits on the &lt;em&gt;thing it actually needs&lt;/em&gt; is both faster and dramatically more reliable than one that guesses at timing.&lt;/p&gt;
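
&lt;p&gt;One way to replace a fixed sleep is a small polling helper with a deadline. The sketch below is an assumption about shape, not a specific library's API: &lt;code&gt;wait_until&lt;/code&gt;, &lt;code&gt;WaitTimeout&lt;/code&gt;, and the parameter names are hypothetical, and the condition is any zero-argument callable you supply (element exists, spinner gone, URL changed).&lt;/p&gt;

```python
import time
from typing import Callable


class WaitTimeout(Exception):
    """The condition never became true within the deadline."""


def wait_until(
    condition: Callable[[], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.1,
    label: str = "condition",
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll condition() until it is truthy; return the seconds waited."""
    start = clock()
    while True:
        if condition():
            return clock() - start
        if clock() - start >= timeout_s:
            raise WaitTimeout(f"timed out after {timeout_s}s waiting for {label}")
        sleep(interval_s)


# Usage: a stand-in condition that becomes true on the third poll.
polls = {"count": 0}


def spinner_gone() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


waited = wait_until(spinner_gone, timeout_s=2.0, interval_s=0.01, label="spinner to clear")
```

&lt;p&gt;The helper returns as soon as the condition holds, so a fast page costs almost nothing, and a slow one fails with a message that names what it was waiting for.&lt;/p&gt;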

&lt;h2&gt;Always read something back&lt;/h2&gt;

&lt;p&gt;This is the lesson I learned the hard way. A command would return success and I'd assume the work was done — then find the agent had been talking to a pane that wasn't there. Every call "succeeded" by doing nothing.&lt;/p&gt;

&lt;p&gt;The fix is a discipline: &lt;strong&gt;a write should be confirmed by a read.&lt;/strong&gt; After you fill a field, read it back. After you click submit, wait for the URL or a success node. Silent success is not the same as success.&lt;/p&gt;
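
&lt;p&gt;As a rough illustration of that discipline, the sketch below pairs every write with a read and fails loudly on a mismatch. &lt;code&gt;FakePage&lt;/code&gt;, &lt;code&gt;fill_and_verify&lt;/code&gt;, and &lt;code&gt;VerificationError&lt;/code&gt; are hypothetical names; the fake page simulates a detached pane that reports success while doing nothing.&lt;/p&gt;

```python
class VerificationError(Exception):
    """A write reported success but the read-back did not match."""


class FakePage:
    """Hypothetical page object; swap in your real browser driver."""

    def __init__(self, attached: bool = True):
        self.attached = attached
        self.values = {}

    def fill(self, selector: str, value: str) -> bool:
        # Simulated failure mode: a detached pane accepts the call,
        # reports success, and changes nothing.
        if self.attached:
            self.values[selector] = value
        return True

    def text(self, selector: str) -> str:
        return self.values.get(selector, "")


def fill_and_verify(page, selector: str, value: str) -> str:
    """Write a value, then read it back and fail loudly on a mismatch."""
    page.fill(selector, value)
    actual = page.text(selector)
    if actual != value:
        raise VerificationError(
            f"fill {selector}: expected {value!r}, read back {actual!r}"
        )
    return actual
```

&lt;p&gt;With an attached page the call returns the confirmed value; with &lt;code&gt;FakePage(attached=False)&lt;/code&gt; it raises instead of reporting a success that never happened.&lt;/p&gt;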

&lt;h2&gt;Use the session's own cookies for reads&lt;/h2&gt;

&lt;p&gt;A lot of useful data sits behind a login. Rather than scraping a login wall, do an in-page &lt;code&gt;fetch&lt;/code&gt; with &lt;code&gt;credentials: 'include'&lt;/code&gt; from the right origin — you reuse the existing session instead of re-authenticating or storing credentials. Probe for a login cookie &lt;em&gt;before&lt;/em&gt; you reach for authenticated data, so you can ask the human to sign in rather than silently scraping an error page.&lt;/p&gt;
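
&lt;p&gt;A possible shape for the probe-then-fetch flow, sketched in Python. It assumes a hypothetical &lt;code&gt;evaluate&lt;/code&gt; callable that runs JavaScript in the page and returns the resolved result, a URL on the page's own origin, and a login cookie that scripts can see. Cookies marked &lt;code&gt;HttpOnly&lt;/code&gt; never appear in &lt;code&gt;document.cookie&lt;/code&gt;, so on sites that use them you would probe a different signed-in marker.&lt;/p&gt;

```python
import json


class LoginRequired(Exception):
    """Raised so the agent can ask the human to sign in."""


def has_cookie(document_cookie: str, name: str) -> bool:
    """Check a document.cookie-style string ("a=1; b=2") for a cookie name."""
    for pair in document_cookie.split(";"):
        key, _, _value = pair.strip().partition("=")
        if key == name:
            return True
    return False


def build_fetch_script(url: str) -> str:
    """Return JavaScript that fetches url using the page's own session."""
    return (
        f"fetch({json.dumps(url)}, {{credentials: 'include'}})"
        ".then(r => r.ok ? r.text() : Promise.reject(new Error('HTTP ' + r.status)))"
    )


def read_with_session(evaluate, url: str, login_cookie: str) -> str:
    """Probe for the login cookie first, then fetch inside the page."""
    # evaluate is a hypothetical hook into your browser driver.
    if not has_cookie(evaluate("document.cookie"), login_cookie):
        raise LoginRequired(f"sign in first: cookie {login_cookie!r} not present")
    return evaluate(build_fetch_script(url))
```

&lt;p&gt;Raising &lt;code&gt;LoginRequired&lt;/code&gt; before the fetch gives the agent a clear reason to hand control back to the human, rather than parsing whatever the login wall returns.&lt;/p&gt;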

&lt;h2&gt;Screenshots are the honest fallback&lt;/h2&gt;

&lt;p&gt;When the DOM is hostile — shadow roots, canvas UIs, obfuscated class names — stop fighting selectors and take a screenshot. A vision model reading a picture of the page is sometimes the most robust path.&lt;/p&gt;
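
&lt;p&gt;The fallback order can be sketched as a single function: try the selectors you have, and only when none yields text, capture the page and ask a vision model. All four callables here (&lt;code&gt;query_text&lt;/code&gt;, &lt;code&gt;screenshot&lt;/code&gt;, &lt;code&gt;describe_image&lt;/code&gt;, and the &lt;code&gt;read_text&lt;/code&gt; wrapper itself) are hypothetical hooks you would wire to your own driver and model.&lt;/p&gt;

```python
from typing import Callable, Optional


def read_text(
    selectors: list,
    query_text: Callable[[str], Optional[str]],
    screenshot: Callable[[], bytes],
    describe_image: Callable[[bytes, str], str],
    question: str,
) -> tuple:
    """Try DOM selectors first; fall back to a screenshot plus vision model."""
    for selector in selectors:
        found = query_text(selector)
        if found:
            return ("dom", found)
    # No selector produced text, so stop fighting the DOM and look at pixels.
    return ("vision", describe_image(screenshot(), question))
```

&lt;p&gt;Returning the source alongside the answer (&lt;code&gt;"dom"&lt;/code&gt; or &lt;code&gt;"vision"&lt;/code&gt;) keeps it visible in logs which path produced a result.&lt;/p&gt;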

&lt;h2&gt;The meta-lesson&lt;/h2&gt;

&lt;p&gt;Reliable browser automation is less about clever selectors and more about &lt;strong&gt;closing the loop&lt;/strong&gt;: act, observe, confirm, and never trust a result you didn't verify.&lt;/p&gt;

&lt;p&gt;I write more about agent architecture — &lt;a href="https://dylanworrall.com/blog/giving-ai-agents-reliable-memory" rel="noopener noreferrer"&gt;reliable memory&lt;/a&gt;, &lt;a href="https://dylanworrall.com/blog/building-froots-agents-you-can-watch" rel="noopener noreferrer"&gt;agents you can watch work&lt;/a&gt;, and &lt;a href="https://dylanworrall.com/blog/building-toward-a-zero-employee-company" rel="noopener noreferrer"&gt;building toward a one-person company&lt;/a&gt; — over on &lt;a href="https://dylanworrall.com/blog" rel="noopener noreferrer"&gt;my blog&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;— Dylan Worrall, founder of &lt;a href="https://froots.ai" rel="noopener noreferrer"&gt;Froots&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
