<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Stephen Sebastian</title>
    <description>The latest articles on DEV Community by Stephen Sebastian (@stephen_sebastian_c85ea2b).</description>
    <link>https://dev.to/stephen_sebastian_c85ea2b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png</url>
      <title>DEV Community: Stephen Sebastian</title>
      <link>https://dev.to/stephen_sebastian_c85ea2b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/stephen_sebastian_c85ea2b"/>
    <language>en</language>
    <item>
      <title>**Hermes stopped feeling like AI — and became my assistant** #hermes #discuss #ai #devchallenge #hermesagentchallenge</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Fri, 29 May 2026 17:34:55 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/hermes-stopped-feeling-like-ai-and-became-my-assistant-hermes-discuss-ai-devchallenge-5hie</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/hermes-stopped-feeling-like-ai-and-became-my-assistant-hermes-discuss-ai-devchallenge-5hie</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" class="crayons-story__hidden-navigation-link"&gt;I gave Hermes Agent 30 days to learn my workflow. It didn't just remember — it got smarter&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
      &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" class="crayons-article__context-note crayons-article__context-note__feed"&gt;&lt;p&gt;Hermes Agent Challenge Submission: Write About Hermes Agent&lt;/p&gt;

&lt;/a&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" alt="stephen_sebastian_c85ea2b profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Stephen Sebastian
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Stephen Sebastian
                
              
              &lt;div id="story-author-preview-content-3765980" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/stephen_sebastian_c85ea2b" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Stephen Sebastian&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;May 27&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" id="article-link-3765980"&gt;
          I gave Hermes Agent 30 days to learn my workflow. It didn't just remember — it got smarter
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/hermesagentchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;hermesagentchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/devchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;devchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/agents"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;agents&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;16&lt;span class="hidden s:inline"&gt;&amp;nbsp;reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              21&lt;span class="hidden s:inline"&gt;&amp;nbsp;comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            6 min read
          &lt;/small&gt;
            
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
    </item>
    <item>
      <title>30 days. $5 server. An AI that actually learned my workflow — from memory layers to the GEPA self‑improvement loop. Stateless AI is broken. This isn't. Read why. 🔁🚀 #hermesagent #hermeschallenge #discuss #ai</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Wed, 27 May 2026 18:12:45 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/30-days-5-server-an-ai-that-actually-learned-my-workflow-from-memory-layers-to-the-gepa-33hp</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/30-days-5-server-an-ai-that-actually-learned-my-workflow-from-memory-layers-to-the-gepa-33hp</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" class="crayons-story__hidden-navigation-link"&gt;I gave Hermes Agent 30 days to learn my workflow. It didn't just remember — it got smarter&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
      &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" class="crayons-article__context-note crayons-article__context-note__feed"&gt;&lt;p&gt;Hermes Agent Challenge Submission: Write About Hermes Agent&lt;/p&gt;

&lt;/a&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" alt="stephen_sebastian_c85ea2b profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Stephen Sebastian
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Stephen Sebastian
                
              
              &lt;div id="story-author-preview-content-3765980" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/stephen_sebastian_c85ea2b" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Stephen Sebastian&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;May 27&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" id="article-link-3765980"&gt;
          I gave Hermes Agent 30 days to learn my workflow. It didn't just remember — it got smarter
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/hermesagentchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;hermesagentchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/devchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;devchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/agents"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;agents&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;16&lt;span class="hidden s:inline"&gt;&amp;nbsp;reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              21&lt;span class="hidden s:inline"&gt;&amp;nbsp;comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            6 min read
          &lt;/small&gt;
            
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
    </item>
    <item>
      <title>I gave Hermes Agent 30 days to learn my workflow. It didn't just remember — it got smarter</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Wed, 27 May 2026 17:45:36 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/i-gave-hermes-agent-30-days-to-learn-my-workflow-it-didnt-just-remember-it-got-smarter-409f</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Write About Hermes Agent&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The confession no one wants to make
&lt;/h2&gt;

&lt;p&gt;I've been lying to myself about AI agents.&lt;/p&gt;

&lt;p&gt;For two years, I've bounced between tools — ChatGPT, Claude, various open‑source experiments. I'd tell myself each new one was &lt;em&gt;the one&lt;/em&gt;. Then, inevitably, I'd hit the same wall:&lt;/p&gt;

&lt;p&gt;Every morning, I'd open the chat and be a stranger again.&lt;/p&gt;

&lt;p&gt;No memory of yesterday's debugging session. No recognition that I always want timestamps in UTC. No idea that I'd already spent three hours chasing that exact bug last week.&lt;/p&gt;

&lt;p&gt;We've normalized this amnesia. We call it "stateless" and pretend it's a feature. But a tool that forgets you every time you close the window isn't intelligent — it's a goldfish with a text box.&lt;/p&gt;

&lt;p&gt;Then I found Hermes Agent. And instead of another weekend fling, I gave it 30 days of real work. This is what happened — and why I'm never going back to rented AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three lies we've been sold about "agentic" AI
&lt;/h2&gt;

&lt;p&gt;Before I get into Hermes, let me name the lies that have become industry gospel:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lie #1: "Stateless is a feature."&lt;/strong&gt; No, it's a convenience for the provider and a tax on the user. Every session reset costs you time, context, and trust.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lie #2: "More parameters = better understanding."&lt;/strong&gt; A 1‑trillion‑parameter model that can't remember what you asked five minutes ago isn't "understanding" anything. It's pattern‑matching with amnesia.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lie #3: "You don't need memory — you just need a bigger context window."&lt;/strong&gt; Context windows are band-aids. They treat the symptom (short‑term forgetfulness) while ignoring the disease (no persistent learning).&lt;/p&gt;

&lt;p&gt;Hermes Agent is the first tool I've used that rejects all three lies. Not through marketing — through architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2phhof1sn2vj8vxz63h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2phhof1sn2vj8vxz63h.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The four‑memory model (and why most agents stop at one)
&lt;/h2&gt;

&lt;p&gt;Here's the mental model that changed everything for me.&lt;/p&gt;

&lt;p&gt;Every agent has &lt;strong&gt;working memory&lt;/strong&gt; — the current conversation. That's Layer 1. When you close the window, it's gone. Most agents stop here.&lt;/p&gt;

&lt;p&gt;Hermes adds three more layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: Procedural memory.&lt;/strong&gt; When Hermes completes a non‑trivial task — say, "watch this GitHub repo and summarize new PRs" — it automatically generates a &lt;strong&gt;skill document&lt;/strong&gt;: a Markdown file in &lt;code&gt;~/.hermes/skills/&lt;/code&gt; that captures the &lt;em&gt;how&lt;/em&gt;, not just the &lt;em&gt;what&lt;/em&gt;. Steps, tools, reasoning, even failure modes.&lt;/p&gt;

&lt;p&gt;This isn't caching. It's the agent learning &lt;em&gt;procedures&lt;/em&gt; from its own experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 3: Episodic memory.&lt;/strong&gt; Session summaries, project context, and user preferences live in a local SQLite database with full‑text search. When you return after two weeks, you can say "what were we working on with that authentication bug?" and it &lt;em&gt;knows&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 4: Semantic memory.&lt;/strong&gt; Over time, Hermes builds a model of &lt;em&gt;you&lt;/em&gt; — your coding style, your communication preferences, the frameworks you reach for, the mistakes you repeat. It doesn't just remember facts. It remembers &lt;em&gt;who you are as a developer&lt;/em&gt;.&lt;/p&gt;
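
&lt;p&gt;To make the episodic layer concrete, here is a minimal sketch of session recall using SQLite full‑text search. The table name, columns, and sample rows are hypothetical and not the actual Hermes schema, and it assumes your Python's SQLite build includes the FTS5 extension (most do).&lt;/p&gt;

```python
import sqlite3

# Hypothetical schema for illustration; not the actual Hermes database layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(day, summary)")
conn.executemany(
    "INSERT INTO sessions (day, summary) VALUES (?, ?)",
    [
        ("2026-05-01", "Debugged authentication bug: expired refresh token in login flow"),
        ("2026-05-03", "Tuned frontend bundle size and lazy loading"),
        ("2026-05-09", "Summarized new pull requests for the billing repo"),
    ],
)


def recall(query):
    """Return (day, summary) rows matching a full-text query, best match first."""
    rows = conn.execute(
        "SELECT day, summary FROM sessions WHERE sessions MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()


# Both words must appear in a row for it to match.
matches = recall("authentication bug")
print(matches)
```

&lt;p&gt;The point of the sketch is that recall is a local index lookup rather than a replay of old transcripts, which is why it can stay cheap on a small server.&lt;/p&gt;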

&lt;p&gt;But the real magic isn't the layers themselves. It's what happens &lt;em&gt;between&lt;/em&gt; them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The GEPA loop: when an agent learns to learn
&lt;/h2&gt;

&lt;p&gt;About two weeks into my experiment, I noticed something unsettling.&lt;/p&gt;

&lt;p&gt;I had asked Hermes to monitor a second repository — same structure, different team. Without any prompt from me, it adapted the PR‑summary skill from the first repo. Not just copying — &lt;em&gt;adapting&lt;/em&gt;. It changed the notification format because the second team preferred markdown tables over bullet points. It added a new step to check for stale dependencies, something the first team didn't care about.&lt;/p&gt;

&lt;p&gt;How? The &lt;strong&gt;GEPA loop&lt;/strong&gt; — a self‑improvement engine that runs every ~15 tasks. GEPA stands for Genetic‑Pareto Prompt Evolution. In plain English: it reads execution traces, identifies what failed (success rate &amp;lt; 90% or token waste &amp;gt; threshold), generates candidate improvements, evaluates them against a small set of held‑out tasks, and updates the skill if the new version is better.&lt;/p&gt;

&lt;p&gt;No GPU training. No human in the loop. Just an agent that gets better at &lt;em&gt;your&lt;/em&gt; workflows because it has learned &lt;em&gt;your&lt;/em&gt; success metrics.&lt;/p&gt;
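
&lt;p&gt;The control flow described above can be sketched as a simple accept‑if‑better loop. This is a deliberately simplified illustration, not GEPA or Hermes source code: every function name is hypothetical, the candidate generator is a stand‑in for an LLM rewrite, and the scoring is a toy check rather than real Pareto selection.&lt;/p&gt;

```python
# Every name here is hypothetical; this is not Hermes or GEPA source code.
SUCCESS_TARGET = 0.9  # the article's "success rate below 90%" trigger


def success_rate(traces):
    """Fraction of execution traces that ended in success."""
    return sum(1 for t in traces if t["ok"]) / len(traces)


def evaluate(skill, held_out_tasks):
    """Score a skill as the fraction of held-out tasks it handles.

    Toy rule: a task is handled when the skill text mentions every required step.
    """
    handled = sum(1 for task in held_out_tasks if all(step in skill for step in task))
    return handled / len(held_out_tasks)


def propose_candidates(skill, extra_steps):
    """Stand-in for the LLM rewrite step: append one extra step per candidate."""
    return [skill + "\n- " + step for step in extra_steps]


def improve(skill, traces, held_out_tasks, extra_steps):
    """Keep the original skill unless a candidate scores strictly higher."""
    if success_rate(traces) >= SUCCESS_TARGET:
        return skill  # healthy skill: leave it alone
    best, best_score = skill, evaluate(skill, held_out_tasks)
    for candidate in propose_candidates(skill, extra_steps):
        score = evaluate(candidate, held_out_tasks)
        if score > best_score:
            best, best_score = candidate, score
    return best


skill = "- fetch open PRs\n- summarize changes"
traces = [{"ok": True}, {"ok": False}, {"ok": True}, {"ok": False}]
held_out = [["fetch open PRs"], ["fetch open PRs", "check stale dependencies"]]
extras = ["post to chat", "check stale dependencies"]

improved = improve(skill, traces, held_out, extras)
print(improved)
```

&lt;p&gt;Because a candidate only replaces the skill when it scores strictly higher on the held‑out tasks, a bad rewrite is discarded instead of silently degrading a workflow.&lt;/p&gt;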

&lt;p&gt;After 30 days, Hermes had generated 17 custom skills. Tasks that took 4‑5 prompts the first time now took one. Sometimes zero — it would proactively run a scheduled check and surface results before I asked.&lt;/p&gt;

&lt;p&gt;That's the difference between automation and autonomy. Automation does what you tell it. Autonomy learns what you need and adapts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The "delegate and forget" pattern that saves my sanity
&lt;/h2&gt;

&lt;p&gt;Let me show you the code pattern that changed my daily workflow.&lt;/p&gt;

&lt;p&gt;Instead of forcing one agent to juggle everything (web search, API calls, file parsing, report generation), I now use &lt;code&gt;delegate_task&lt;/code&gt; to spawn parallel child agents:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Hermes skill snippet (simplified)
&lt;/span&gt;&lt;span class="n"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;goal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fetch latest news on topic X&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;goal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Query academic papers from arXiv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arxiv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;goal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Scan internal docs for relevant patterns&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;delegate_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;batch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_concurrent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each child runs in an isolated terminal session with its own context window and a restricted toolset, so children do not block one another or leak context between tasks. The parent only sees the final summaries.&lt;/p&gt;
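
&lt;p&gt;If you want to see the fan‑out shape without Hermes installed, here is a rough analogue built on Python's standard thread pool. It is only a sketch: &lt;code&gt;run_child&lt;/code&gt; and &lt;code&gt;delegate_batch&lt;/code&gt; are hypothetical names, the child does no real work, and threads do not provide the session isolation that real child agents get.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-in for a child agent; a real child would call an LLM and tools.
def run_child(task):
    allowed = set(task["tools"])  # the child may only use the tools it was given
    scratch = [f"used {tool}" for tool in sorted(allowed)]  # private working notes
    # Only the summary is returned; the scratch notes never reach the parent.
    return {"goal": task["goal"], "summary": f"done with {len(scratch)} tool(s)"}


def delegate_batch(tasks, max_concurrent=3):
    """Run tasks in parallel and return one summary per task, in input order."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(run_child, tasks))


tasks = [
    {"goal": "Fetch latest news on topic X", "tools": ["web_search"]},
    {"goal": "Query academic papers from arXiv", "tools": ["arxiv"]},
    {"goal": "Scan internal docs for relevant patterns", "tools": ["file_search"]},
]
results = delegate_batch(tasks)
for result in results:
    print(result["goal"], "->", result["summary"])
```

&lt;p&gt;The design choice worth copying is the narrow return value: each worker keeps its intermediate notes to itself and hands back a summary, which keeps the parent's context small.&lt;/p&gt;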

&lt;p&gt;This cut my research time by 60%. Not because the model got faster — because I stopped waiting for one agent to do everything sequentially.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where it still fails (honest section, because trust matters)
&lt;/h2&gt;

&lt;p&gt;I'm not here to sell you a dream. Hermes has real rough edges.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Silent failure is the worst.&lt;/strong&gt; I misconfigured a GitHub token — wrong scope. Hermes tried to run a PR summary, failed, and just... stopped. No error message. No "hey, your token is missing &lt;code&gt;repo:status&lt;/code&gt;." I spent 20 minutes debugging what should have been a one‑line error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Over‑engineering skills is real.&lt;/strong&gt; The GEPA loop once turned a one‑off "convert CSV to JSON" task into a 47‑step skill with validation, logging, and retry logic. For a file I processed once. I had to manually prune it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context bleed happens.&lt;/strong&gt; In a long conversation about frontend performance, it pulled a fact from a completely unrelated backend discussion earlier that day. Nothing sensitive — just wrong. The memory management isn't perfect.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reasoning has a ceiling.&lt;/strong&gt; I asked it to compare two cloud architectures for a fintech startup. It gave me a textbook answer — solid, but missing the battle‑tested "here's where each one actually breaks in production" nuance that a senior architect would add.&lt;/p&gt;

&lt;p&gt;I'd rather debug these limitations on my own server than be at the mercy of a cloud provider that can change its pricing or policies tomorrow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The economics that actually matter
&lt;/h2&gt;

&lt;p&gt;After 30 days, here's my P&amp;amp;L:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direct costs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$5/month VPS (DigitalOcean)&lt;/li&gt;
&lt;li&gt;$1.47 in API calls (OpenRouter, mostly GPT‑4o‑mini)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Total: $6.47&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Time saved:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repetitive tasks went from 20 minutes → 8 minutes on average&lt;/li&gt;
&lt;li&gt;12 minutes saved per task × ~45 tasks = 9 hours reclaimed&lt;/li&gt;
&lt;li&gt;At my consulting rate, that's over $2,000 of value&lt;/li&gt;
&lt;/ul&gt;
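
&lt;p&gt;For anyone who wants to check the arithmetic, here it is as a short script. All inputs come from the lists above; the only derived figure is the hourly rate that a $2,000 valuation implies, and the variable names are just for illustration.&lt;/p&gt;

```python
# Figures taken from the lists above; only the implied hourly rate is derived.
minutes_before = 20
minutes_after = 8
tasks = 45
claimed_value_usd = 2000
monthly_cost_usd = 5.00 + 1.47  # VPS plus API calls

minutes_saved = (minutes_before - minutes_after) * tasks
hours_saved = minutes_saved / 60
implied_hourly_rate = claimed_value_usd / hours_saved

print(f"Hours reclaimed: {hours_saved:.0f}")
print(f"Monthly cost: ${monthly_cost_usd:.2f}")
print(f"Hourly rate implied by $2,000: ${implied_hourly_rate:.2f}")
```

&lt;p&gt;So the "over $2,000" figure holds for anyone billing more than roughly $222 per hour. Plug in your own rate and task count to see where you land.&lt;/p&gt;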

&lt;p&gt;&lt;strong&gt;Intangible gains:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zero hours spent re‑explaining my preferences&lt;/li&gt;
&lt;li&gt;Zero anxiety about a tool shutting down or changing terms&lt;/li&gt;
&lt;li&gt;A growing library of skills that only I control&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The cloud AI business model depends on you starting over. Hermes depends on you compounding.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 7‑day challenge I'm giving you
&lt;/h2&gt;

&lt;p&gt;Stop reading. Go do this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Spin up a $5 VPS (or use WSL2 on your local machine).&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;hermes model&lt;/code&gt; to pick a provider (OpenRouter is easiest).&lt;/li&gt;
&lt;li&gt;Give Hermes ONE real, repetitive task you hate — monitoring a repo, summarizing a feed, checking logs.&lt;/li&gt;
&lt;li&gt;After 7 days, run &lt;code&gt;ls ~/.hermes/skills/&lt;/code&gt; and count the skills it auto‑generated.&lt;/li&gt;
&lt;li&gt;Come back and comment: &lt;em&gt;How many prompts did it save you? Did it learn anything about YOU that surprised you?&lt;/em&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I'll wait.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters beyond the tool
&lt;/h2&gt;

&lt;p&gt;We're at a strange inflection point in AI. The raw capabilities of models are advancing so fast that we've stopped asking an important question: &lt;em&gt;Capable at what?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;An agent that can write beautiful code but can't remember what it wrote yesterday isn't actually useful for real work. An assistant that nails every conversation but treats you like a stranger every morning isn't an assistant — it's a party trick.&lt;/p&gt;

&lt;p&gt;Hermes Agent represents a different bet. The bet is that intelligence isn't just about what you can do in a single session. It's about what you learn, remember, and improve over time. That's true for humans. It should be true for the AI systems we build.&lt;/p&gt;

&lt;p&gt;I'm not saying Hermes is perfect. I'm saying it's the first agent I've used that treats my time and context as something worth accumulating — not resetting.&lt;/p&gt;

&lt;p&gt;Your AI shouldn't forget you.&lt;/p&gt;

&lt;p&gt;Try it for a week. Give it real work. Then tell me if you ever want to go back to the goldfish.&lt;/p&gt;


&lt;p&gt;&lt;strong&gt;Resources:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🏠 &lt;a href="https://hermes-agent.nousresearch.com" rel="noopener noreferrer"&gt;Hermes Agent Home&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📦 &lt;a href="https://github.com/NousResearch/hermes-agent" rel="noopener noreferrer"&gt;GitHub Repo&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;What's your experience with persistent agents? Have you tried running one long‑term, or are you still bouncing between stateless tools? Drop a comment — I genuinely want to hear the counterarguments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>ai</category>
    </item>
    <item>
      <title>Why I’m Building a Privacy-First SOW Analyzer to Kill Scope Creep (Launching Next Month)</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Wed, 27 May 2026 12:30:21 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/why-im-building-a-privacy-first-sow-analyzer-to-kill-scope-creep-launching-next-month-3eb5</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/why-im-building-a-privacy-first-sow-analyzer-to-kill-scope-creep-launching-next-month-3eb5</guid>
      <description>&lt;p&gt;If you run a freelance dev business or a small agency, you already know the silent margin-killer: Scope Creep.&lt;/p&gt;

&lt;p&gt;You sign a 30-page Statement of Work (SOW), start sprinting on the code, and three weeks later the client points to a vaguely worded bullet point from page 14 that suddenly means you owe them an entire user authentication flow you didn't budget for.&lt;/p&gt;

&lt;p&gt;I got tired of seeing agencies eat thousands of dollars in unbilled hours, so I decided to build a tool to catch these loopholes before the contract gets signed.&lt;/p&gt;

&lt;p&gt;It is currently under construction and slated for official release next month, but I wanted to share the architecture and the core philosophy behind it—specifically why I chose to build it "Local-First."&lt;/p&gt;

&lt;p&gt;The Problem with Current AI Legal Tools&lt;br&gt;
There are plenty of enterprise tools that will analyze a contract for you. But many of them share a glaring red flag for small agencies: they ingest your data.&lt;/p&gt;

&lt;p&gt;When you are dealing with strict NDAs and highly sensitive client MSAs (Master Services Agreements), you cannot afford to upload those PDFs into a generic cloud database or an AI wrapper that uses your client's proprietary data to train its models.&lt;/p&gt;

&lt;p&gt;The Privacy-First Architecture&lt;br&gt;
I designed Scope Auditor from the ground up to respect the compliance perimeter.&lt;/p&gt;

&lt;p&gt;Instead of building a heavy backend that stores all your raw contract text, the app operates on a strict zero-retention pipeline:&lt;/p&gt;

&lt;p&gt;Local Browser Session: When you paste your contract into the scanner, the data lives strictly in your local browser state.&lt;/p&gt;

&lt;p&gt;Stateless API Routing: The payload is routed securely via a Cloudflare Worker directly to the LLM (using Gemini's massive context window).&lt;/p&gt;

&lt;p&gt;No Centralized SOW Storage: The raw text of your client's contract is never stored in my database. Supabase only stores the structured JSON output (the risk scores and the audit results) tied directly to your secure, multi-tenant agency ID.&lt;/p&gt;

&lt;p&gt;You get the full analytical power of a heavy LLM without ever compromising your client's data privacy.&lt;/p&gt;
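
&lt;p&gt;To make the zero-retention idea concrete, here is a minimal Python sketch of the principle. It is not the production code (which runs through a Cloudflare Worker, Gemini, and Supabase); the names &lt;code&gt;AuditRecord&lt;/code&gt;, &lt;code&gt;run_audit&lt;/code&gt;, and the &lt;code&gt;flagged_clauses&lt;/code&gt; field are hypothetical, and &lt;code&gt;analyze&lt;/code&gt; is a stand-in for the stateless LLM call.&lt;/p&gt;

```python
from dataclasses import dataclass, asdict
from typing import Callable


@dataclass(frozen=True)
class AuditRecord:
    """The only thing that gets persisted: structured results, no contract text."""
    agency_id: str
    risk_score: int
    flagged_clauses: int  # hypothetical field, for illustration only


def run_audit(agency_id: str, contract_text: str,
              analyze: Callable[[str], dict]) -> dict:
    """Pass the text to the analyzer and keep only its structured output.

    `analyze` stands in for the stateless LLM call; here it is assumed to
    return a dict with `risk_score` and `flagged_clauses` keys.
    """
    result = analyze(contract_text)
    record = AuditRecord(
        agency_id=agency_id,
        risk_score=int(result["risk_score"]),
        flagged_clauses=int(result["flagged_clauses"]),
    )
    # The raw contract text is not part of the record, so it is never stored.
    return asdict(record)


if __name__ == "__main__":
    # A fake analyzer so the sketch runs without any network access.
    fake_llm = lambda text: {"risk_score": 72, "flagged_clauses": 3}
    row = run_audit("agency-123", "Vendor shall build a robust UI.", fake_llm)
    print(row)
```

&lt;p&gt;The point of the design is that the row written to the database is built from the analyzer's output alone, so there is no code path that could persist the contract text by accident.&lt;/p&gt;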

&lt;p&gt;Core Features Under Construction&lt;br&gt;
Right now, I am wiring up the final integrations for next month's launch. Here is what is under the hood:&lt;/p&gt;

&lt;p&gt;Instant Risk Scoring: The engine scans for ambiguous deliverables (e.g., "build a robust UI") and flags them with a risk severity score so you can rewrite them with deterministic boundaries.&lt;/p&gt;

&lt;p&gt;Multi-Player Agency Vaults: Built on a secure PostgreSQL schema with strict Row Level Security (RLS), allowing you to invite your team and share an audit limit without leaking SOWs between different agency accounts.&lt;/p&gt;

&lt;p&gt;Payload Shields: Custom middleware designed to handle massive 50,000+ character legal documents while aggressively preventing API quota exhaustion.&lt;/p&gt;
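
&lt;p&gt;A payload guard of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual middleware: the limits &lt;code&gt;MAX_CHARS&lt;/code&gt; and &lt;code&gt;DAILY_AUDIT_LIMIT&lt;/code&gt; and the function &lt;code&gt;check_payload&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
MAX_CHARS = 200_000       # hypothetical upper bound on one document
DAILY_AUDIT_LIMIT = 20    # hypothetical per-agency quota


class PayloadRejected(Exception):
    """Raised when a request should not be forwarded to the LLM."""


def check_payload(text: str, audits_used_today: int) -> int:
    """Validate a scan request and return its length in characters."""
    if not text.strip():
        raise PayloadRejected("empty document")
    if len(text) > MAX_CHARS:
        raise PayloadRejected(f"document exceeds {MAX_CHARS:,} characters")
    if audits_used_today >= DAILY_AUDIT_LIMIT:
        raise PayloadRejected("daily audit limit reached")
    return len(text)


if __name__ == "__main__":
    # A 50,000-character document passes; the quota check happens before
    # any tokens are spent on the LLM call.
    print(check_payload("x" * 50_000, audits_used_today=0))
```

&lt;p&gt;Rejecting oversized or over-quota requests before they reach the model is what keeps a single runaway client from burning through the API budget.&lt;/p&gt;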

&lt;p&gt;What’s Next?&lt;br&gt;
Scope Auditor will officially launch next month. I’m currently finalizing the UI transitions and stress-testing the database logic to make the release as stable as possible.&lt;/p&gt;

&lt;p&gt;I’m building this solo and would love to hear from other devs or agency owners. How do you currently handle scope creep in your client contracts? Do you have any specific red flags you always look for in an SOW?&lt;/p&gt;

&lt;p&gt;Let me know in the comments!&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>freelance</category>
      <category>webdev</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Check out my article about Hermes Agent #discuss #hermesagent #agent</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Tue, 26 May 2026 10:25:19 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/checkout-my-article-about-hermes-agent-discuss-hermesagent-agent-34a6</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/checkout-my-article-about-hermes-agent-discuss-hermesagent-agent-34a6</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/the-5-ai-that-remembers-everything-517f" class="crayons-story__hidden-navigation-link"&gt;The $5 AI That Remembers Everything&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
      &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/the-5-ai-that-remembers-everything-517f" class="crayons-article__context-note crayons-article__context-note__feed"&gt;&lt;p&gt;Hermes Agent Challenge Submission: Write About Hermes Agent&lt;/p&gt;

&lt;/a&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" alt="stephen_sebastian_c85ea2b profile" class="crayons-avatar__image" width="96" height="96"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Stephen Sebastian
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Stephen Sebastian
                
              
              &lt;div id="story-author-preview-content-3749895" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/stephen_sebastian_c85ea2b" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" class="crayons-avatar__image" alt="" width="96" height="96"&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Stephen Sebastian&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/the-5-ai-that-remembers-everything-517f" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;May 25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/the-5-ai-that-remembers-everything-517f" id="article-link-3749895"&gt;
          The $5 AI That Remembers Everything
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag crayons-tag--filled  " href="/t/discuss"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;discuss&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/hermesagentchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;hermesagentchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/devchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;devchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/agents"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;agents&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/the-5-ai-that-remembers-everything-517f" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;5&lt;span class="hidden s:inline"&gt;&amp;nbsp;reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/the-5-ai-that-remembers-everything-517f#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              3&lt;span class="hidden s:inline"&gt;&amp;nbsp;comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            4 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
    </item>
    <item>
      <title>The $5 AI That Remembers Everything</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Mon, 25 May 2026 13:11:23 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/the-5-ai-that-remembers-everything-517f</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/the-5-ai-that-remembers-everything-517f</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Write About Hermes Agent&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The lie we've all been sold
&lt;/h2&gt;

&lt;p&gt;You know the one: "Our AI is free!" You sign up, start building workflows around it, then months later you’re staring at a new pricing page, a restricted API, or a feature that suddenly vanished.&lt;/p&gt;

&lt;p&gt;Last month, I asked a popular AI agent to debug a Python memory leak. It gave me a solid answer. The next day, I asked the exact same question — and it started from zero, suggesting the same solution like it had never seen the problem before.&lt;/p&gt;

&lt;p&gt;That’s not intelligence. That’s amnesia with a chat interface.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A system that can’t remember isn’t stateless. It’s broken.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I got tired of renting intelligence. So I spent 30 days running &lt;a href="https://hermes-agent.nousresearch.com" rel="noopener noreferrer"&gt;Hermes Agent&lt;/a&gt; on a $5 DigitalOcean droplet. This isn’t hype. This is what happened when I gave an open agent time to learn, and why I’m never going back.&lt;/p&gt;




&lt;h2&gt;
  
  
  Day 1: The setup that wasn’t painful
&lt;/h2&gt;

&lt;p&gt;I expected dependency hell. Instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
hermes model    &lt;span class="c"&gt;# Choose your provider&lt;/span&gt;
hermes chat     &lt;span class="c"&gt;# Start talking&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nine minutes from zero to running. No upgrade nags. No telemetry. No “connect your account” flow. Just my agent, on my server.&lt;/p&gt;

&lt;p&gt;No more praying that a cloud provider won’t change its terms mid‑project.&lt;/p&gt;




&lt;h2&gt;
  
  
  The five layers that fix AI amnesia
&lt;/h2&gt;

&lt;p&gt;What makes Hermes different isn’t raw model size — it’s memory architecture.&lt;/p&gt;

&lt;p&gt;Most agents stop at Layer 1 (short‑term context window). Hermes builds five:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Working memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Current conversation context (every agent has this).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Procedural Skill Docs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Auto‑generated Markdown skill files in &lt;code&gt;~/.hermes/skills/&lt;/code&gt; that record steps, tools, and reasoning.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Contextual Persistence&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Session summaries and project context, stored in SQLite with full‑text search.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Long‑term Semantic Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Builds a model of you — style, preferences, recurring patterns.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;GEPA Skill‑Learning Loop&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Every ~15 tasks, it reviews performance and refines or creates new skills.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
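
&lt;p&gt;To give a feel for what "SQLite with full‑text search" means in Layer 3, here is a minimal Python sketch using the standard &lt;code&gt;sqlite3&lt;/code&gt; module and SQLite's FTS5 extension. The table name &lt;code&gt;sessions&lt;/code&gt; and its columns are my own assumptions for illustration, not Hermes's actual schema.&lt;/p&gt;

```python
import sqlite3


def open_memory(path=":memory:"):
    """Create a session-summary store, using FTS5 when SQLite supports it."""
    conn = sqlite3.connect(path)
    try:
        # `day` is stored but not indexed, so searches only hit `summary`.
        conn.execute(
            "CREATE VIRTUAL TABLE IF NOT EXISTS sessions "
            "USING fts5(day UNINDEXED, summary)"
        )
        use_fts = True
    except sqlite3.OperationalError:
        # FTS5 is a compile-time option; fall back to a plain table.
        conn.execute("CREATE TABLE IF NOT EXISTS sessions (day TEXT, summary TEXT)")
        use_fts = False
    return conn, use_fts


def search(conn, use_fts, term):
    """Return (day, summary) rows whose summary mentions a single word `term`."""
    if use_fts:
        sql = "SELECT day, summary FROM sessions WHERE sessions MATCH ?"
    else:
        sql = "SELECT day, summary FROM sessions WHERE summary LIKE '%' || ? || '%'"
    return conn.execute(sql, (term,)).fetchall()


if __name__ == "__main__":
    conn, use_fts = open_memory()
    conn.executemany(
        "INSERT INTO sessions (day, summary) VALUES (?, ?)",
        [
            ("Mon", "Summarized open PRs for the API repo"),
            ("Tue", "Ran the CSV analysis on sales data"),
        ],
    )
    print(search(conn, use_fts, "csv"))
```

&lt;p&gt;A lookup like this is what lets an agent resolve a vague reference to past work without re-reading every old session.&lt;/p&gt;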

&lt;p&gt;By week 2, my agent wasn’t just remembering facts. It was getting better at working with me.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before vs. After: What 30 days actually changes
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Without Hermes&lt;/th&gt;
&lt;th&gt;With Hermes (after 30 days)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Re‑explain your preferences every session&lt;/td&gt;
&lt;td&gt;Agent adapts automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Repeat complex workflows manually&lt;/td&gt;
&lt;td&gt;One prompt triggers a saved skill&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No memory of past mistakes&lt;/td&gt;
&lt;td&gt;Learns what not to do&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud‑hosted, data at risk&lt;/td&gt;
&lt;td&gt;Self‑hosted, full control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$0 now, unpredictable later&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;$6.47/month&lt;/strong&gt;, predictable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Week 1–2: The compounding payoff
&lt;/h2&gt;

&lt;p&gt;I set Hermes up to monitor GitHub repos.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Week 1:&lt;/strong&gt; Basic summaries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Week 2:&lt;/strong&gt; It started grouping PRs by theme, adding deployment timestamps I always ask for, and applying my preferred formatting — unprompted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After a few days, I checked what it had learned:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; ~/.hermes/skills/
&lt;span class="c"&gt;# Output:&lt;/span&gt;
&lt;span class="c"&gt;# github-pr-summary.md&lt;/span&gt;
&lt;span class="c"&gt;# daily-news-brief.md&lt;/span&gt;
&lt;span class="c"&gt;# csv-analysis-template.md&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By day 10, I could say &lt;em&gt;"do that analysis thing from Tuesday"&lt;/em&gt; with zero extra context. It pulled the right skill document and adapted it.&lt;/p&gt;

&lt;p&gt;This is the shift from automation to autonomy.&lt;/p&gt;




&lt;h2&gt;
  
  
  Week 3: Where it still breaks
&lt;/h2&gt;

&lt;p&gt;Honesty time — it’s not perfect.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reasoning depth has limits&lt;/strong&gt; on highly nuanced architectural decisions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Silent failures&lt;/strong&gt; (e.g., bad GitHub token scopes) waste debugging time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Occasional context bleed&lt;/strong&gt; between unrelated tasks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skill generation&lt;/strong&gt; sometimes over‑engineers simple one‑offs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are fixable limitations on my own infrastructure. I’d rather debug something I control than pray a cloud provider won’t change terms overnight.&lt;/p&gt;




&lt;h2&gt;
  
  
  The real economics (after 30 days)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Direct costs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Server:&lt;/strong&gt; $5/month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API calls&lt;/strong&gt; (via OpenRouter): $1.47/month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total:&lt;/strong&gt; $6.47/month&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Actual gains:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Roughly 60% less time&lt;/strong&gt; on repetitive tasks (from 20 minutes to 8 minutes on average).&lt;/li&gt;
&lt;li&gt;A compounding colleague, not a vending machine.&lt;/li&gt;
&lt;li&gt;Full data ownership and zero risk of sudden deprecation.&lt;/li&gt;
&lt;/ul&gt;
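
&lt;p&gt;The arithmetic behind those figures is small enough to check in a few lines of Python. This sketch only restates the numbers above; the variable names are illustrative.&lt;/p&gt;

```python
# Monthly costs reported above, in USD.
SERVER_USD = 5.00
API_USD = 1.47

# Average time per repetitive task, in minutes.
MINUTES_BEFORE = 20
MINUTES_AFTER = 8


def percent_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


total_usd = SERVER_USD + API_USD
saving = percent_reduction(MINUTES_BEFORE, MINUTES_AFTER)
print(f"Total: ${total_usd:.2f}/month, time saved per task: {saving:.0f}%")
```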

&lt;p&gt;At my consulting rate, that time saved alone was worth thousands of dollars in one month.&lt;/p&gt;




&lt;h2&gt;
  
  
  What ownership actually changes
&lt;/h2&gt;

&lt;p&gt;Running Hermes on my own server shifted how I use AI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I gave it harder problems.&lt;/li&gt;
&lt;li&gt;I trusted it with more context.&lt;/li&gt;
&lt;li&gt;I learned from its failures instead of abandoning it when something went wrong.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;The agent that knows you best isn’t the one with the biggest model. It’s the one that’s been working with you longest, on infrastructure you control.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Who should run this?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Do this if:&lt;/strong&gt;&lt;br&gt;
✅ You’re tired of tools changing the rules mid‑game.&lt;br&gt;
✅ You value data privacy and long‑term compounding.&lt;br&gt;
✅ You think in infrastructure, not just tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start elsewhere if:&lt;/strong&gt;&lt;br&gt;
🔁 You just want a quick answer today. No shame, that’s what cloud chatbots are for.&lt;br&gt;
❌ You need enterprise SLAs or 24/7 support.&lt;br&gt;
❌ You switch tools every month anyway.&lt;/p&gt;




&lt;h2&gt;
  
  
  Your 7‑day challenge
&lt;/h2&gt;

&lt;p&gt;Here’s a concrete way to test this yourself:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Spin up the cheapest VPS you can find (e.g., a $5/month droplet).&lt;/li&gt;
&lt;li&gt;Install Hermes:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
hermes model
hermes chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;Give it one real, repetitive task you hate (e.g., daily PR summaries, CSV analysis, release‑note triage).&lt;/li&gt;
&lt;li&gt;On day 7, run:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; ~/.hermes/skills/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Count how many prompts it saved you vs. doing the task manually. Then come back and tell us: &lt;strong&gt;how many prompts did Hermes save you?&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Bottom line
&lt;/h2&gt;

&lt;p&gt;After 30 days on a $5 server, I’m done building on rented ground — and done trusting stateless agents with my workflows.&lt;/p&gt;

&lt;p&gt;Hermes isn’t flawless, but it’s the first agent I’ve used that treats intelligence as something that accumulates over time instead of resetting every session. Your AI shouldn’t forget you.&lt;/p&gt;

&lt;p&gt;Try Hermes for a week. Give it real work. Watch what it learns about you. Then tell me if you can ever go back to goldfish‑memory agents.&lt;/p&gt;




&lt;h3&gt;
  
  
  Resources to get started
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;🏠 &lt;strong&gt;Home:&lt;/strong&gt; &lt;a href="https://hermes-agent.nousresearch.com" rel="noopener noreferrer"&gt;hermes-agent.nousresearch.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📦 &lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/NousResearch/hermes-agent" rel="noopener noreferrer"&gt;github.com/NousResearch/hermes-agent&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📖 &lt;strong&gt;Docs &amp;amp; Community:&lt;/strong&gt; Check the official links from the repo.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What’s your experience?&lt;/strong&gt; Have you run a persistent agent long‑term, or are you still on stateless tools? Drop a comment — especially if you disagree with the ownership thesis.&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>discuss</category>
    </item>
    <item>
      <title>I spent $0.37 testing Google’s Antigravity 2.0 agent API on 14 microservices. It caught a critical CVE I’d missed for months—then hallucinated an unreleased package version. Every bug, fix &amp; runnable code here.... #discuss #google #googleI/Ochallenge</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Fri, 22 May 2026 12:29:07 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-antigravity-20-agent-api-on-14-microservices-it-caught-a-2bfh</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-antigravity-20-agent-api-on-14-microservices-it-caught-a-2bfh</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-agent-api-on-14-services-heres-every-bug-youll-hit-3nkh" class="crayons-story__hidden-navigation-link"&gt;I Spent $0.37 Testing Google’s Antigravity 2.0 Agent API — Here’s Every Bug You’ll Hit (and How to Fix Them)&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
      &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-agent-api-on-14-services-heres-every-bug-youll-hit-3nkh" class="crayons-article__context-note crayons-article__context-note__feed"&gt;&lt;p&gt;Google I/O Writing Challenge Submission&lt;/p&gt;

&lt;/a&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" alt="stephen_sebastian_c85ea2b profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Stephen Sebastian
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Stephen Sebastian
                
              
              &lt;div id="story-author-preview-content-3725474" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/stephen_sebastian_c85ea2b" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Stephen Sebastian&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-agent-api-on-14-services-heres-every-bug-youll-hit-3nkh" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;May 22&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-agent-api-on-14-services-heres-every-bug-youll-hit-3nkh" id="article-link-3725474"&gt;
          I Spent $0.37 Testing Google’s Antigravity 2.0 Agent API — Here’s Every Bug You’ll Hit (and How to Fix Them)
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag crayons-tag--filled  " href="/t/discuss"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;discuss&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/devchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;devchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/googleiochallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;googleiochallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-agent-api-on-14-services-heres-every-bug-youll-hit-3nkh" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;6&lt;span class="hidden s:inline"&gt;&amp;nbsp;reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-agent-api-on-14-services-heres-every-bug-youll-hit-3nkh#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              1&lt;span class="hidden s:inline"&gt;&amp;nbsp;comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            7 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial crayons-icon c-btn__icon"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success crayons-icon c-btn__icon"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
    </item>
    <item>
      <title>I Spent $0.37 Testing Google’s Antigravity 2.0 Agent API — Here’s Every Bug You’ll Hit (and How to Fix Them)</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Fri, 22 May 2026 12:11:32 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-agent-api-on-14-services-heres-every-bug-youll-hit-3nkh</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/i-spent-037-testing-googles-agent-api-on-14-services-heres-every-bug-youll-hit-3nkh</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; In my test run, Antigravity 2.0 cut a 90‑min dependency audit to 14 min for about $0.044 in token cost. It caught a critical CVE I’d missed for months — and hallucinated an unreleased package version. Here’s the production checklist, working code, and every bug I hit so you don’t have to.&lt;/p&gt;

&lt;p&gt;I tested Google’s Agent API on a real 14‑service workflow to see whether agentic tooling could actually reduce repetitive developer work — and the results were both promising and messy. This article isn’t about theory. It’s about what happened when I threw a new tool at a boring, everyday problem and measured everything that followed.&lt;/p&gt;

&lt;p&gt;Antigravity is Google’s preview managed-agent runtime, announced at I/O 2026, and it is the Agent API I tested here.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Tokens Actually Move
&lt;/h2&gt;

&lt;p&gt;Before diving into code, it helps to see how tokens flow between agents, the sandbox, and the outside world. The flow determines where your cost and latency actually come from.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Task 
   │
   ▼
Scanner Agent (17.3K tokens) 
   │ writes report.json
   ▼
Security Agent (13K tokens) 
   │ enriches with CVEs
   ▼
Changelog Agent (1.5K tokens) 
   │ generates markdown
   ▼
PR Agent (3K tokens) ──→ GitHub PRs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All four agents run inside the same managed sandbox, which persists across calls so you don’t re‑ingest everything. Token tally from one clean run:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scanner:&lt;/strong&gt; 5,600 input, 11,740 output (17,340 tokens)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security:&lt;/strong&gt; 2,100 input, 10,902 output (13,002 tokens)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Changelog:&lt;/strong&gt; 350 input, 1,200 output (1,550 tokens)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PR Agent:&lt;/strong&gt; 400 input, 2,600 output (3,000 tokens)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Total: 34,892 tokens&lt;/strong&gt; across four stages (8,450 input, 26,442 output). The environment itself costs nothing extra; you pay only for tokens at the model’s per‑token rate. For this preview, I used &lt;code&gt;gemini-3.5-flash-preview&lt;/code&gt; at $0.0005/1K input and $0.0015/1K output, which works out to about $0.044 for the run. The sandbox is bundled in, which changes the economics completely compared to spinning up your own container.&lt;/p&gt;
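
&lt;p&gt;To sanity-check the token total and the $0.044 per-run figure, the arithmetic can be sketched in a few lines. The token counts and rates are the ones quoted in this section; the dictionary keys and the function name &lt;code&gt;run_cost&lt;/code&gt; are illustrative labels, not part of any SDK.&lt;/p&gt;

```python
# Per-stage token counts from the clean run: (input, output).
STAGE_TOKENS = {
    "scanner": (5_600, 11_740),
    "security": (2_100, 10_902),
    "changelog": (350, 1_200),
    "pr": (400, 2_600),
}

# Preview rates quoted in the article, in USD per 1K tokens.
INPUT_RATE_PER_1K = 0.0005
OUTPUT_RATE_PER_1K = 0.0015


def run_cost(stage_tokens):
    """Return (total_tokens, cost_usd) for one pipeline run."""
    total_in = sum(inp for inp, _ in stage_tokens.values())
    total_out = sum(out for _, out in stage_tokens.values())
    cost = total_in / 1000 * INPUT_RATE_PER_1K + total_out / 1000 * OUTPUT_RATE_PER_1K
    return total_in + total_out, cost


total_tokens, cost_usd = run_cost(STAGE_TOKENS)
print(f"{total_tokens:,} tokens, ${cost_usd:.3f}")
```

&lt;p&gt;Output tokens dominate the bill here: they are roughly three quarters of the volume and are priced at three times the input rate.&lt;/p&gt;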

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt; Because all agents share the same environment, the Security Agent doesn’t need to re‑scan the repo. It just reads the JSON that’s already sitting there. That’s where a lot of cost gets saved.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Real Numbers: Time and Money
&lt;/h2&gt;

&lt;p&gt;I compared three ways of doing the exact same 14‑service audit: doing it manually (me, a human), using Antigravity’s Managed Agents, and running the same prompts on a cheap cloud VM with a normal Gemini API call and custom orchestration.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Manual&lt;/th&gt;
&lt;th&gt;Antigravity Agents&lt;/th&gt;
&lt;th&gt;Cloud VM + LLM&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scan time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;25 min&lt;/td&gt;
&lt;td&gt;4 min&lt;/td&gt;
&lt;td&gt;8 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CVE check&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;20 min&lt;/td&gt;
&lt;td&gt;3.5 min&lt;/td&gt;
&lt;td&gt;6 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Changelog&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;15 min&lt;/td&gt;
&lt;td&gt;2 min&lt;/td&gt;
&lt;td&gt;1 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PR creation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;50 min (5 services)&lt;/td&gt;
&lt;td&gt;4 min&lt;/td&gt;
&lt;td&gt;5 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total wall‑clock&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;90 min&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;14 min&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;20 min&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0 (but my time)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.044 (tokens)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.92 (VM + API)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup effort&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;A couple hours&lt;/td&gt;
&lt;td&gt;A week of DevOps&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Cost Comparison Per Audit Run
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Manual:&lt;/strong&gt; $0 (but 90 min @ $60/hr = $90 value)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Antigravity:&lt;/strong&gt; $0.044 (14 min human oversight)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud VM:&lt;/strong&gt; $0.92 (20 min + week of DevOps setup)&lt;/li&gt;
&lt;/ul&gt;

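&lt;p&gt;&lt;strong&gt;Winner:&lt;/strong&gt; Antigravity saves 76 min per run. Against the $90 manual labor value, that is $89.96 saved if you count only token cost, or about $75.96 if the 14 min of oversight is billed at the same $60/hr.&lt;/p&gt;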

&lt;p&gt;The cloud VM alternative was a $0.04/hr e2‑micro instance running a Python script that called the Gemini API with the same prompts, plus a GitHub CLI container. The token cost was higher because it had to re‑read the repo for each stage instead of reusing state in a sandbox.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Spot‑check:&lt;/em&gt; Of the first five dependencies I checked by hand, four matched the registries immediately, and one needed a correction that the verifier caught.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🛠️ Try This Yourself: Minimal Two‑Agent Example
&lt;/h2&gt;

&lt;p&gt;Below is a short Python script you can adapt. It creates a Scanner Agent that audits a single directory and a Verifier Agent that cross‑checks the version of every package found against the public registry.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Create a managed sandbox
&lt;/span&gt;&lt;span class="n"&gt;interaction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;interactions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;antigravity-preview-05-2026&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code_execution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web_browsing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_management&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sandbox&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;isolated_linux&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Note: The API structure shown is simplified for clarity.
# Check documentation for the latest SDK reference.
&lt;/span&gt;
&lt;span class="c1"&gt;# Scanner Agent: generate dependency report
&lt;/span&gt;&lt;span class="n"&gt;scan_task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Scan /workspace for package.json and requirements.txt.
Output a JSON list of {package, current_version, latest_version} to /workspace/deps.json.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;interactions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;interaction_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;interaction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;scan_task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Verifier Agent: cross-check against registry
&lt;/span&gt;&lt;span class="n"&gt;verify_task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Read /workspace/deps.json.
For each package, curl the public registry and confirm the latest version.
Output a corrected /workspace/verified_deps.json.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;interactions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;interaction_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;interaction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verify_task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Done. Check verified_deps.json in the sandbox.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Setup Snapshot
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; ~$0.02 for a test run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time:&lt;/strong&gt; 3 minutes to set up, 2 minutes to execute&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What you’ll learn:&lt;/strong&gt; Whether your dependencies are genuinely up‑to‑date and whether the agent hallucinates any version numbers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Production Readiness Checklist: 6 Things You Must Handle
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Endless Reasoning Loops
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; The agent’s stop condition is “the model decides it’s done.” Tell it to “check all package files recursively” and it may happily circle through node_modules forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; Build a wrapper that counts tool calls and force‑stops after 20.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it matters:&lt;/strong&gt; Without a ceiling, a single rogue prompt can burn dollars in minutes.&lt;/li&gt;
&lt;/ul&gt;
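
&lt;p&gt;The tool-call ceiling described above can be sketched as a small counter. This is a minimal sketch, not the SDK’s API: &lt;code&gt;ToolCallLimiter&lt;/code&gt;, &lt;code&gt;run_stage&lt;/code&gt;, and the &lt;code&gt;tool_events&lt;/code&gt; iterable are hypothetical names, and how you observe each tool call depends on what the real runtime exposes.&lt;/p&gt;

```python
class ToolCallLimitExceeded(RuntimeError):
    """Raised when an agent stage makes more tool calls than allowed."""


class ToolCallLimiter:
    """Counts tool calls and force-stops a stage once the ceiling is hit."""

    def __init__(self, max_calls=20):
        self.max_calls = max_calls
        self.calls = 0

    def record(self, tool_name):
        """Call once per tool invocation reported by the agent runtime."""
        self.calls += 1
        if self.calls > self.max_calls:
            raise ToolCallLimitExceeded(
                f"stopped after {self.max_calls} tool calls (last: {tool_name})"
            )


def run_stage(tool_events, limiter):
    """Consume tool-call events until the stage ends or the limiter trips.

    `tool_events` is a hypothetical iterable of tool names (assumption).
    """
    for tool_name in tool_events:
        limiter.record(tool_name)
    return limiter.calls
```

&lt;p&gt;Raising an exception rather than returning a flag means the caller cannot forget to check it; the stage stops at call 21 no matter what the model wanted to do next.&lt;/p&gt;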

&lt;h3&gt;
  
  
  2. Sandbox Filesystem Consistency
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; After writing a large JSON file, the next read sometimes returns a stale version, with duplicate entries or missing data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; Explicitly run &lt;code&gt;sync&lt;/code&gt; via the shell tool before every read.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it matters:&lt;/strong&gt; Stale state corrupts downstream agents and erodes trust in the entire pipeline.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Cost Unpredictability
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; The $0.37 figure was my entire weekend of testing; a clean run costs about $0.044. But one agent got stuck in a recursive retry loop parsing a malformed &lt;code&gt;package.json&lt;/code&gt; and ate $0.89 in seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; At the time of testing, I couldn’t find a native cost‑cap feature in the API. Wrap your API calls with a token budget tracker that raises an exception if limits are exceeded.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it matters:&lt;/strong&gt; Production pipelines need spend guarantees, not gambling.&lt;/li&gt;
&lt;/ul&gt;
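
&lt;p&gt;A token budget tracker of the kind described above might look like the following. It is a sketch under stated assumptions: the class and method names are hypothetical, the default rates are the preview rates quoted earlier, and you would feed it whatever usage numbers the API reports after each call.&lt;/p&gt;

```python
class TokenBudgetExceeded(RuntimeError):
    """Raised when cumulative spend passes the configured ceiling."""


class TokenBudget:
    """Tracks estimated spend across calls and fails fast at a hard cap."""

    def __init__(self, max_usd, input_rate_per_1k=0.0005, output_rate_per_1k=0.0015):
        self.max_usd = max_usd
        self.input_rate_per_1k = input_rate_per_1k
        self.output_rate_per_1k = output_rate_per_1k
        self.spent_usd = 0.0

    def charge(self, input_tokens, output_tokens):
        """Add one call's usage; raise if the running total exceeds the cap."""
        self.spent_usd += (
            input_tokens / 1000 * self.input_rate_per_1k
            + output_tokens / 1000 * self.output_rate_per_1k
        )
        if self.spent_usd > self.max_usd:
            raise TokenBudgetExceeded(
                f"spent ${self.spent_usd:.4f}, cap is ${self.max_usd:.2f}"
            )
        return self.spent_usd


# Example: a 10-cent cap, charged with the Scanner stage usage from the clean run.
budget = TokenBudget(max_usd=0.10)
budget.charge(input_tokens=5_600, output_tokens=11_740)
```

&lt;p&gt;Because the check runs after each call, the overshoot is bounded by the cost of a single call rather than by however long a retry loop keeps spinning.&lt;/p&gt;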

&lt;h3&gt;
  
  
  4. Hallucinations in Dependency Versioning
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; Gemini 3.5 Flash confidently reported &lt;code&gt;express@5.0.0&lt;/code&gt; (unreleased) and mis‑identified a Go module’s minor bump as a breaking change.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; Always run a verifier agent that hits the actual package registry via curl. I caught 3 out of 4 hallucinations this way.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it matters:&lt;/strong&gt; A hallucinated CVE or version can lead to unnecessary rollbacks or missed patches.&lt;/li&gt;
&lt;/ul&gt;
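
&lt;p&gt;The same registry check can also be done deterministically outside the agent. The sketch below covers npm only and assumes the public registry’s package metadata exposes the current release under &lt;code&gt;dist-tags.latest&lt;/code&gt;; the function names and the injectable &lt;code&gt;fetch&lt;/code&gt; parameter are illustrative choices, not part of the pipeline described in this article.&lt;/p&gt;

```python
import json
import urllib.parse
import urllib.request


def fetch_npm_metadata(package):
    """Fetch package metadata from the public npm registry."""
    # Scoped names such as @scope/name need the slash percent-encoded.
    url = "https://registry.npmjs.org/" + urllib.parse.quote(package, safe="@")
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def verify_npm_versions(claims, fetch=fetch_npm_metadata):
    """Compare agent-reported latest versions against the registry.

    `claims` maps package name to the version the agent reported.
    Returns corrections as {package: (claimed, registry_latest)}.
    """
    corrections = {}
    for package, claimed in claims.items():
        latest = fetch(package)["dist-tags"]["latest"]
        if claimed != latest:
            corrections[package] = (claimed, latest)
    return corrections
```

&lt;p&gt;Passing &lt;code&gt;fetch&lt;/code&gt; as a parameter keeps the comparison logic testable without network access; an empty result means every reported version matched the registry.&lt;/p&gt;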

&lt;h3&gt;
  
  
  5. Credential Scoping Is All‑or‑Nothing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; Secrets are scoped to the entire interaction, not per‑agent. My Scanner Agent technically had the same GITHUB_TOKEN as the PR Agent, violating least‑privilege.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; Until Google supports per‑agent secrets, use separate interactions for read‑only and write‑enabled stages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it matters:&lt;/strong&gt; A compromised scanner shouldn’t be able to open PRs.&lt;/li&gt;
&lt;/ul&gt;
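
&lt;p&gt;One way to keep the split between read‑only and write‑enabled interactions honest is to decide up front which secrets each stage may see. The mapping below is a hypothetical sketch: the stage names and &lt;code&gt;secrets_for_stage&lt;/code&gt; are illustrative, and how the resulting dictionary is handed to each interaction depends on the SDK.&lt;/p&gt;

```python
import os

# Hypothetical allow-list: which secret names each stage may receive.
STAGE_SECRETS = {
    "scanner": [],            # read-only stage: no GitHub credentials
    "security": [],
    "changelog": [],
    "pr": ["GITHUB_TOKEN"],   # the only stage allowed to write
}


def secrets_for_stage(stage, environ=None):
    """Return only the secrets the given stage is allowed to see."""
    environ = os.environ if environ is None else environ
    allowed = STAGE_SECRETS[stage]
    return {name: environ[name] for name in allowed if name in environ}
```

&lt;p&gt;An unknown stage name raises &lt;code&gt;KeyError&lt;/code&gt; instead of silently receiving every secret, which is the safer default for a least‑privilege setup.&lt;/p&gt;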

&lt;h3&gt;
  
  
  6. Debugging Opacity
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; No streaming log of agent tool calls. You either stare at a blank terminal or wait 8 minutes for the web dashboard replay.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; After every stage, call &lt;code&gt;client.interactions.get()&lt;/code&gt; and assert that &lt;code&gt;state == "COMPLETED"&lt;/code&gt; and that the output is non‑empty.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it matters:&lt;/strong&gt; Silent failures waste time and token budgets.&lt;/li&gt;
&lt;/ul&gt;
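
&lt;p&gt;The per‑stage check can be wrapped in a small helper so that silent failures stop the pipeline immediately. &lt;code&gt;check_stage&lt;/code&gt; is a hypothetical name, and it takes the state and output as plain values because the exact attribute names on the object returned by &lt;code&gt;client.interactions.get()&lt;/code&gt; are an assumption here.&lt;/p&gt;

```python
def check_stage(stage_name, state, output):
    """Fail loudly if a stage did not complete or produced nothing."""
    if state != "COMPLETED":
        raise RuntimeError(f"{stage_name}: unexpected state {state!r}")
    if not output or not str(output).strip():
        raise RuntimeError(f"{stage_name}: completed with empty output")
    return output
```

&lt;p&gt;Calling this after every stage costs nothing in tokens and turns a blank terminal into an error message that names the stage that went wrong.&lt;/p&gt;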

&lt;h2&gt;
  
  
  Why This Matters Beyond Dependency Audits
&lt;/h2&gt;

&lt;p&gt;Google didn’t demo 93 agents to sell you a dependency auditor. They signaled a platform bet: that the unit of compute is shifting from “a single model call” to “a managed runtime where agents persist, schedule themselves, and collaborate.”&lt;/p&gt;

&lt;p&gt;This is Google’s answer to LangChain and AutoGen — hosted, vertically integrated with TPU‑optimized models, and bundled into Gemini’s pricing. If OpenAI releases comparable managed agents soon, the pricing war will be brutal, and developers will benefit.&lt;/p&gt;

&lt;p&gt;My dependency audit is a trivial example. What becomes possible when you can spin up a verifier agent for every PR, or a security‑scanning agent that watches your monorepo continuously and opens fixes before you even wake up? That’s the real headline from I/O 2026, and it’s a shift that will matter long after the smart glasses stop making news.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I’m Building Next
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Automated Security Monitoring (Try This Version)
&lt;/h3&gt;

&lt;p&gt;Here’s the daily CVE scanner I’m deploying next week. It runs the Security Agent every morning, diffs the report against yesterday’s, and opens PRs only for new critical findings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="n"&gt;today&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;yesterday&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Run the Security Agent (same sandbox, same report path); abort if it fails
&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;security_agent.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--date=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Diff the CVE reports
&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;report_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;yesterday&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; 
    &lt;span class="n"&gt;old_report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;report_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; 
    &lt;span class="n"&gt;new_report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;new_critical&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;cve&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cve&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;new_report&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vulnerabilities&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cve&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CRITICAL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;cve&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;old_report&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vulnerabilities&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Open PRs for new critical CVEs
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cve&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;new_critical&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gh&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pr&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CRITICAL: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cve&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;package&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cve&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Automated PR. CVE details:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cve&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Fixed in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cve&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fixed_version&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--assignee&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@security-team&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Opened PR for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cve&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Checked &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_report&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;vulnerabilities&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; CVEs, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_critical&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; new critical.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; ~$0.15/day for 14 services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Setup:&lt;/strong&gt; 10 minutes to add your own report paths&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Expected result:&lt;/strong&gt; Newly published critical CVEs flagged within a day instead of weeks.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Cost‑Aware Orchestration Wrapper
&lt;/h3&gt;

&lt;p&gt;A thin Python wrapper that enforces token budgets per agent and sends Slack alerts when a pipeline approaches the ceiling. Essential for production peace of mind.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;The real lesson here is not that agents are perfect, but that they are already useful enough to change how developers approach repetitive operational work. If you adopt them, treat them like fast assistants that still need verification, guardrails, and cost awareness. That combination is what makes agentic workflows usable today, not just impressive in a demo.&lt;/p&gt;

&lt;h3&gt;
  
  
  📝 AI Transparency &amp;amp; Methodology
&lt;/h3&gt;

&lt;p&gt;I used AI to help scaffold boilerplate, clean up formatting, and refine presentation. The experiments, measurements, workflow observations, and conclusions in this article come from my own hands-on testing on a real 14-service setup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If this helped, share what you’re building with agentic tools.&lt;/strong&gt; I’d love to see real experiments.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>googleiochallenge</category>
      <category>ai</category>
      <category>discuss</category>
    </item>
    <item>
      <title>I replaced a $50/month OCR API with Gemma 4's native vision (4B model, local, free). Here's the exact script + preprocessing trick. #gemma #google</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Thu, 21 May 2026 17:37:33 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-4b-model-local-free-heres-the-1p97</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-4b-model-local-free-heres-the-1p97</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-and-you-can-too-4jnd" class="crayons-story__hidden-navigation-link"&gt;I Replaced a $50/Month OCR API with Gemma 4’s Native Vision (And You Can Too)&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
      &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-and-you-can-too-4jnd" class="crayons-article__context-note crayons-article__context-note__feed"&gt;&lt;p&gt;Gemma 4 Challenge: Write about Gemma 4 Submission&lt;/p&gt;

&lt;/a&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" alt="stephen_sebastian_c85ea2b profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Stephen Sebastian
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Stephen Sebastian
                
              
              &lt;div id="story-author-preview-content-3719344" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/stephen_sebastian_c85ea2b" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Stephen Sebastian&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-and-you-can-too-4jnd" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;May 21&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-and-you-can-too-4jnd" id="article-link-3719344"&gt;
          I Replaced a $50/Month OCR API with Gemma 4’s Native Vision (And You Can Too)
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/devchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;devchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/gemmachallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;gemmachallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/gemma"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;gemma&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/google"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;google&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-and-you-can-too-4jnd" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;5&lt;span class="hidden s:inline"&gt;&amp;nbsp;reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-and-you-can-too-4jnd#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              1&lt;span class="hidden s:inline"&gt;&amp;nbsp;comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            7 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>ai</category>
      <category>llm</category>
      <category>showdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>I Replaced a $50/Month OCR API with Gemma 4’s Native Vision (And You Can Too)</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Thu, 21 May 2026 17:33:08 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-and-you-can-too-4jnd</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/i-replaced-a-50month-ocr-api-with-gemma-4s-native-vision-and-you-can-too-4jnd</guid>
      <description>
&lt;p&gt;&lt;strong&gt;Everyone is talking about Gemma 4’s 128K context window. But the real sleeper architectural feature is its native client-side vision—and it just saved my side project $50 a month.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When Google announced Gemma 4, the developer world fixated on the massive context window, the Mixture‑of‑Experts efficiency metrics, and the flexible Apache 2.0 license. All of that praise is completely deserved.&lt;/p&gt;

&lt;p&gt;But almost no one is talking about the native multimodal input engine—the structural ability to feed the model images directly without a separate, fragile OCR pipeline or third-party captioning tool.&lt;/p&gt;

&lt;p&gt;I run a small local automation that extracts line items from messy, scanned contractor invoices. Until last week, I was paying a premium for a cloud-based OCR API that turned raw JPEGs into digital text. Then I tried deploying the &lt;strong&gt;Gemma 4 E4B (4B)&lt;/strong&gt; variant completely locally on my laptop.&lt;/p&gt;

&lt;p&gt;It worked. Not flawlessly (it scored 91% on my test batch against the cloud API's 94%), but well enough to replace the paid service, and it cost me nothing beyond a little local electricity.&lt;/p&gt;

&lt;p&gt;This is the story of how Gemma 4’s vision capabilities can replace expensive, closed cloud services, what the actual performance trade-offs are, and exactly how to implement the processing pipelines yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real-World Friction: Scanned Documents Have No Text Layers
&lt;/h2&gt;

&lt;p&gt;Most “smart” data extraction tutorials assume your source data is pristine. A clean CSV, a well-formatted HTML layout, or a digital PDF with crisp selectable text. &lt;/p&gt;

&lt;p&gt;Real-world operations look completely different. Invoices, receipts, and field contracts are often scanned photographs—creased, shadowed, skewed, or hand-filled with a pen. My contractor regularly sends me mobile phone photos of hand-filled invoices. The cloud OCR service I used previously handled them decently, but it carried significant friction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Financial Overhead:&lt;/strong&gt; $0.10 per page × 500 pages/month = $50/month.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network Latency:&lt;/strong&gt; The external API round-trip took 3–5 seconds per image payload.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy Exposure:&lt;/strong&gt; Sensitive accounting data and customer identifiers had to leave my machine on every single run.&lt;/li&gt;
&lt;/ul&gt;
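
&lt;p&gt;The cost line above is plain per-page arithmetic. As a minimal sketch (the function and parameter names are illustrative, not part of any billing API), it can be checked like this:&lt;/p&gt;

```python
def monthly_ocr_cost(price_per_page, pages_per_month):
    """Usage-based monthly cost in dollars for a per-page OCR API."""
    return price_per_page * pages_per_month

# Figures quoted in the list above: $0.10 per page, 500 pages per month.
print(f"${monthly_ocr_cost(0.10, 500):.2f} per month")
# The same rate expressed per 1,000 pages, the unit used for comparisons later on.
print(f"${monthly_ocr_cost(0.10, 1000):.2f} per 1,000 pages")
```

&lt;p&gt;Because the price is per page, the bill grows linearly with volume, which is why a local model with near-zero marginal cost becomes more attractive as page counts rise.&lt;/p&gt;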

&lt;p&gt;I needed a local, private, and zero-marginal-cost alternative. Gemma 4's vision framework turned out to be exactly that.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Gemma 4’s Native Vision Actually Does
&lt;/h2&gt;

&lt;p&gt;Gemma 4 handles image input natively: vision is built into the model rather than bolted on. You do not need a separate captioning model or a local Tesseract installation. You pass the image to the model alongside your prompt, and it reasons over the visual content directly.&lt;/p&gt;

&lt;p&gt;Under the hood, a vision encoder splits the image into patches and projects them into the same high-dimensional embedding space used by the text tokens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ Raw Image Ingestion ] ➔ [ 2D Patch Segmentation ] ➔ [ Vision Encoder Passes ] ➔ [ Shared Token Embedding Space ] ➔ [ Unified Text Decoders ]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
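
&lt;p&gt;To make the first stages of that flow concrete, here is a toy, pure-Python sketch of patch segmentation followed by a linear projection. It is an illustration only: the 4x4 image, the 2x2 patch size, the 3-dimensional embedding, and the random weights are all assumptions chosen for readability, and this is not Gemma 4's actual encoder.&lt;/p&gt;

```python
import random

def split_into_patches(image, patch):
    """Split a 2D grid (list of rows) into flattened patch vectors, row-major."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[top + dy][left + dx]
                            for dy in range(patch) for dx in range(patch)])
    return patches

def project(patches, weights):
    """Linear projection: each flattened patch becomes one embedding vector."""
    return [[sum(p * w for p, w in zip(vec, column)) for column in weights]
            for vec in patches]

# Toy 4x4 grayscale "image" and a 2x2 patch size (real encoders use far larger values).
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, patch=2)

# Random stand-in for learned projection weights: 3 output dims, 4 inputs each.
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
tokens = project(patches, weights)
print(len(tokens), len(tokens[0]))  # number of patch embeddings, then their dimension
```

&lt;p&gt;Each resulting vector plays the role of one "image token" that sits next to the text-token embeddings in the model's input sequence.&lt;/p&gt;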



&lt;p&gt;Because there is no intermediate text-conversion step, the model can reason over layout as well as content:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reading distorted, skewed text from raw photographs.&lt;/li&gt;
&lt;li&gt;Discerning complex structural layout hierarchies like tables, column boundaries, checkboxes, and signature lines.&lt;/li&gt;
&lt;li&gt;Describing and extracting trends from analytical diagrams, charts, and technical wireframes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For my specific pipeline requirements—extracting three core fields from an invoice photo—the lightweight Gemma 4 E4B model was remarkably capable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step‑by‑Step: Moving Your OCR Pipeline Local
&lt;/h2&gt;

&lt;p&gt;I run this setup on a standard MacBook Pro (M1, 16GB RAM) utilizing &lt;strong&gt;Ollama&lt;/strong&gt; as the local model runtime engine.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Pull the Model
&lt;/h3&gt;

&lt;p&gt;Ollama supports Gemma 4’s vision variants out of the box. The lightweight E4B model runs comfortably in under 8GB of memory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull gemma4:e4b

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Implement the Unified Python Extraction Script
&lt;/h3&gt;

&lt;p&gt;We use the official Python &lt;code&gt;ollama&lt;/code&gt; SDK. Its chat API accepts images as local file paths or base64-encoded data in the &lt;code&gt;images&lt;/code&gt; field of a message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_invoice_fields&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;gemma4:e4b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="s"&gt;You are a strict data extraction engine. Analyze this document image and return ONLY a single valid JSON object matching this exact schema:
{
  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YYYY-MM-DD&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,
  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: number_no_symbols,
  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;concise_summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
}
If a field is completely unidentifiable, set its value to null. Do not include markdown formatting wraps, conversational intros, or post-explanations.&lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;images&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}],&lt;/span&gt;
        &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Execution Run
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract_invoice_fields&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;contractor_invoice.jpg&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
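
&lt;p&gt;The script above returns the model's reply as a raw string, and a prompt alone cannot guarantee valid JSON. A minimal validation sketch follows; &lt;code&gt;parse_invoice_reply&lt;/code&gt; is a hypothetical helper (not part of the &lt;code&gt;ollama&lt;/code&gt; SDK) that assumes the three-field schema from the prompt and uses only the standard library.&lt;/p&gt;

```python
import json
from datetime import date

EXPECTED_KEYS = {"date", "amount", "description"}

def parse_invoice_reply(raw):
    """Parse the model's reply into a dict, raising ValueError if it is off-schema."""
    # Keep only the outermost {...} in case the model added a fence or an intro line.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(raw[start:end + 1])  # JSONDecodeError is a ValueError subclass

    missing = EXPECTED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    invoice_date = data["date"]
    if invoice_date is not None:
        if not isinstance(invoice_date, str):
            raise ValueError("date must be a YYYY-MM-DD string or null")
        date.fromisoformat(invoice_date)  # raises ValueError on malformed dates

    amount = data["amount"]
    if amount is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            raise ValueError("amount must be a number or null")
    return data

reply = '{"date": "2026-05-15", "amount": 1240.50, "description": "Electrical panel upgrade"}'
print(parse_invoice_reply(reply)["amount"])
```

&lt;p&gt;Rejecting off-schema replies up front lets the pipeline retry or flag a document for manual review instead of silently storing a bad value.&lt;/p&gt;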



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Hardware Performance Note:&lt;/strong&gt; On my M1 MacBook Pro (16GB RAM), inference takes ~2.3 seconds per image using the default Metal GPU acceleration. CPU-only or older machines may see 4-6 seconds per image, but the pipeline itself works the same way.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  📊 Real-World Processing Output
&lt;/h3&gt;

&lt;p&gt;When fed a shadowy, tilted smartphone photo of an invoice line item, the local model returns clean, structured JSON, which the script prints to the terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-05-15"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1240.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Electrical panel upgrade"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Production Benchmarks: Local Vision vs. Cloud OCR
&lt;/h2&gt;

&lt;p&gt;To see whether I could drop the paid cloud service, I benchmarked both options on a batch of 50 real contractor invoice photographs featuring heavy shadows, uneven contrast, folds, and handwriting.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Performance Metric&lt;/th&gt;
&lt;th&gt;Closed-Source Cloud OCR API&lt;/th&gt;
&lt;th&gt;Local Gemma 4 E4B Runtime&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Field Accuracy (Exact Match)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;94%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;91%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Average Pipeline Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4.2 seconds (Network Bound)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.3 seconds (Local Hardware)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Operational Cost (Per 1k Pages)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$100.00&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;$0.00&lt;/strong&gt; (Negligible Electricity Only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Privacy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data leaves the local machine&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Data stays 100% on the local machine&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Offline Operational Capability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
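
&lt;p&gt;For readers who want to run a similar comparison, here is one way exact-match field accuracy could be scored. This is a sketch under an assumed scoring rule (every field of every document counts equally, and a field scores only if it equals the hand-labeled value); the function name and the toy data are illustrative and are not the benchmark data behind the table above.&lt;/p&gt;

```python
FIELDS = ("date", "amount", "description")

def exact_match_accuracy(predicted, expected):
    """Fraction of fields whose predicted value equals the labeled value exactly."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    total = len(expected) * len(FIELDS)
    if total == 0:
        raise ValueError("no labeled documents to score")
    hits = sum(1
               for pred, truth in zip(predicted, expected)
               for field in FIELDS
               if pred.get(field) == truth[field])
    return hits / total

# Toy example: two documents, six fields, one wrong amount.
labels = [
    {"date": "2026-05-15", "amount": 1240.50, "description": "Electrical panel upgrade"},
    {"date": "2026-05-16", "amount": 300.00, "description": "Site inspection"},
]
outputs = [
    {"date": "2026-05-15", "amount": 1240.50, "description": "Electrical panel upgrade"},
    {"date": "2026-05-16", "amount": 800.00, "description": "Site inspection"},
]
print(f"{exact_match_accuracy(outputs, labels):.0%}")
```

&lt;p&gt;Exact match is strict for free-text fields such as the description, so a looser comparison (for example, case-insensitive matching) may be more appropriate there.&lt;/p&gt;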

&lt;p&gt;The cloud API kept a 3-percentage-point lead, mostly on edge-case handwriting styles (plausibly a result of its larger proprietary training set), while Gemma 4 won on speed, privacy, and cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  💡 Elevating Local Accuracy via OpenCV Preprocessing
&lt;/h3&gt;

&lt;p&gt;You can close that 3-point gap by cleaning up the image before inference. After I added this grayscale conversion and adaptive contrast (CLAHE) step using &lt;code&gt;opencv-python&lt;/code&gt;, local extraction accuracy rose to &lt;strong&gt;95%&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;preprocess_document_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_output_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Load image, convert to grayscale, and apply adaptive histogram equalization
&lt;/span&gt;    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;gray&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cvtColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COLOR_BGR2GRAY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;clahe&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createCLAHE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clipLimit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tileGridSize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;enhanced_contrast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clahe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gray&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imwrite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_output_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;enhanced_contrast&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Operational Trade-Off Matrix: When Local Vision Works
&lt;/h2&gt;

&lt;p&gt;Multimodal open-weight models are powerful, but they are not magic. Here is where local vision held up in my testing and where it struggled:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Visual Document Target&lt;/th&gt;
&lt;th&gt;Capability Level&lt;/th&gt;
&lt;th&gt;Engineering Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Clean Machine-Printed Text&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Excellent&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Matches OCR accuracy with cleaner formatting preservation.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Distinct Handwritten Numbers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Good&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Exceeds 90% accuracy when the digits are clearly separated and aligned.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursive or Connected Prose&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ &lt;strong&gt;Mixed&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Block printing parses easily; rapid cursive occasionally drops characters.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Complex Interleaved Tables&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Very Good&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Excels at keeping row-to-column context alignment intact.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Low-Light / Blurry Inputs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ &lt;strong&gt;Degrades Fast&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Requires active contrast preprocessing to prevent hallucinations.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-Axis Charts &amp;amp; Graphs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Excellent&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Can synthesize and describe visual trends natively (e.g., 'explain why Q3 revenue dipped').&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Barcodes &amp;amp; Matrix QR Codes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ &lt;strong&gt;Incompatible&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Do not use LLMs for this; leverage dedicated lightweight libraries.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Broader Paradigm Shift: Beyond Simple Text Extraction
&lt;/h2&gt;

&lt;p&gt;Once you realize that an open-weight model running locally can natively &lt;em&gt;see&lt;/em&gt;, your pipeline horizons expand past basic invoice automation. You can implement this exact same sandboxed visual loop to drive advanced engineering workflows completely offline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automated Accessibility Auditing:&lt;/strong&gt; Pipe screenshots of a live web application into Gemma 4 and prompt it to flag low-contrast text, clipped or truncated labels, and controls with no visible label. (A screenshot cannot reveal missing ARIA attributes; those still need a DOM-level check.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual Error Diagnosis:&lt;/strong&gt; Programmatically capture a screenshot of an app crash or a terminal stack trace, pass it to the model, and let it read the trace and suggest a fix.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mockup-to-Component Suggestions:&lt;/strong&gt; Feed a raw mockup screenshot of a specific frontend asset—like a three-column pricing table or a complex registration form—directly into your local model and ask: &lt;em&gt;"Write clean, responsive Tailwind CSS code to replicate this exact visual layout structure."&lt;/em&gt; It provides an instant starter component blueprint without leaving your secure workspace.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclqlxum8o9pvuj0ubdu2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclqlxum8o9pvuj0ubdu2.png" alt=" " width="799" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Gemma 4’s multimodal capability wasn't the loudest headline of the launch cycle, but it represents a massive workflow victory for independent software developers.&lt;/p&gt;

&lt;p&gt;By replacing a cloud-dependent service with a local 4B-parameter model, my pipeline runs faster, keeps all data on my machine, and cuts my API bill to zero. If you are still managing brittle Tesseract configurations or paying a recurring subscription for a proprietary OCR pipeline, pull down Gemma 4’s vision weights and start testing locally today.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔗 Resources &amp;amp; Tooling
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🛠️ &lt;strong&gt;Model Ecosystem Repository:&lt;/strong&gt; &lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;Ollama Multimodal Local Model Library&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📖 &lt;strong&gt;Reference Implementation:&lt;/strong&gt; &lt;a href="https://github.com/google/gemma_pytorch" rel="noopener noreferrer"&gt;Google's official Gemma PyTorch implementation (GitHub)&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  💬 Let's Talk Local Document Processing
&lt;/h2&gt;

&lt;p&gt;Are you currently relying on cloud API endpoints for document ingestion and image parsing in your apps, or have you started moving these pipelines to local models?&lt;/p&gt;

&lt;p&gt;Drop your processing speeds, hardware benchmarks, and preprocessing strategies in the comments below—let's build a clean blueprint for local-first visual automation!&lt;/p&gt;

&lt;h3&gt;
  
  
  🤖 AI Transparency Disclosure
&lt;/h3&gt;

&lt;p&gt;In full compliance with the challenge transparency criteria:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Writing Assistance:&lt;/strong&gt; I utilized an AI companion (Gemini) to restructure raw benchmark blocks into clean markdown tables, format unified parameter keys within code wrappers, and balance prose scannability. All core pipeline metrics, benchmarking logic, and design viewpoints are completely my own.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual Assets:&lt;/strong&gt; The split-screen verification cover image was generated using Gemini.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Originality Verification:&lt;/strong&gt; The software integration scripts, local open-weight runtime benchmarking passes, and image filter pipelines were implemented and executed entirely on my local development hardware.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
      <category>google</category>
    </item>
    <item>
      <title>Stop copy-pasting console errors for your AI agent. Chrome DevTools for Agents closes the loop. Your agent sees the crash, fixes it, verifies it. No messenger needed. Full walkthrough: 👇</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Thu, 21 May 2026 16:38:39 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/stop-copy-pasting-console-errors-for-your-ai-agent-chrome-devtools-for-agents-closes-the-loop-4790</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/stop-copy-pasting-console-errors-for-your-ai-agent-chrome-devtools-for-agents-closes-the-loop-4790</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/your-ai-coding-agent-has-been-flying-blind-google-io-2026-just-fixed-that-2fhl" class="crayons-story__hidden-navigation-link"&gt;Your AI Coding Agent Has Been Flying Blind. Google I/O 2026 Just Fixed That&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
      &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/your-ai-coding-agent-has-been-flying-blind-google-io-2026-just-fixed-that-2fhl" class="crayons-article__context-note crayons-article__context-note__feed"&gt;&lt;p&gt;Google I/O Writing Challenge Submission&lt;/p&gt;

&lt;/a&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" alt="stephen_sebastian_c85ea2b profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/stephen_sebastian_c85ea2b" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Stephen Sebastian
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Stephen Sebastian
                
              
              &lt;div id="story-author-preview-content-3718867" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/stephen_sebastian_c85ea2b" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Stephen Sebastian&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/your-ai-coding-agent-has-been-flying-blind-google-io-2026-just-fixed-that-2fhl" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;May 21&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/your-ai-coding-agent-has-been-flying-blind-google-io-2026-just-fixed-that-2fhl" id="article-link-3718867"&gt;
          Your AI Coding Agent Has Been Flying Blind. Google I/O 2026 Just Fixed That
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/devchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;devchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/googleiochallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;googleiochallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/antigravity"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;antigravity&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/your-ai-coding-agent-has-been-flying-blind-google-io-2026-just-fixed-that-2fhl" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;7&lt;span class="hidden s:inline"&gt;&amp;nbsp;reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/stephen_sebastian_c85ea2b/your-ai-coding-agent-has-been-flying-blind-google-io-2026-just-fixed-that-2fhl#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              6&lt;span class="hidden s:inline"&gt;&amp;nbsp;comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            6 min read
          &lt;/small&gt;
            
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
    </item>
    <item>
      <title>Your AI Coding Agent Has Been Flying Blind. Google I/O 2026 Just Fixed That</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Thu, 21 May 2026 16:04:05 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/your-ai-coding-agent-has-been-flying-blind-google-io-2026-just-fixed-that-2fhl</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/your-ai-coding-agent-has-been-flying-blind-google-io-2026-just-fixed-that-2fhl</guid>
      <description>&lt;p&gt;&lt;strong&gt;We have spent the last two years acting as high-priced data messengers between our browsers and our LLMs.&lt;/strong&gt; At Google I/O 2026, Chrome quietly closed that loop, fundamentally changing how software gets built and verified locally.&lt;/p&gt;

&lt;p&gt;Every developer who uses an AI coding agent knows this exact feeling.&lt;/p&gt;

&lt;p&gt;The agent writes a beautiful new UI component. You pull it into Chrome. It immediately breaks. The console lights up with a wall of angry red text. You copy the stack trace, paste it back into your LLM chat client, explain what went wrong, and ask it to try again. The agent refines the code. It breaks again, just differently this time. You copy &lt;em&gt;that&lt;/em&gt; error and paste it back.&lt;/p&gt;

&lt;p&gt;You are the messenger. You are carrying runtime execution realities back to a model that is completely isolated from the environment it’s targeting. The model can write code faster than any human in history, but the part that slows everything down is its complete lack of operational visibility.&lt;/p&gt;

&lt;p&gt;At Google I/O 2026, Google announced the structural fix for this disconnect. And almost all the mainstream coverage completely missed it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Shiny Future vs. The Immediate Reality
&lt;/h2&gt;

&lt;p&gt;The headline story from the developer keynote has been &lt;strong&gt;WebMCP&lt;/strong&gt;: an open web standard proposal designed to let frontend web apps expose native JavaScript functions directly to browser-based AI agents. Instead of an agent trying to parse HTML structures to click a checkout button, your application explicitly declares the functions it exposes and their input schemas.&lt;/p&gt;

&lt;p&gt;It’s an incredible architectural shift, and enterprise giants like Expedia, Shopify, and Instacart are already jumping into the origin trials. But it is exactly that—an origin trial starting in &lt;strong&gt;Chrome 149&lt;/strong&gt; this June. You cannot ship a web app using it to production today.&lt;/p&gt;

&lt;p&gt;Meanwhile, two other developer-focused updates dropped during the technical tracks that are live right now, cost nothing, and immediately fix that exhausting copy-paste feedback loop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chrome DevTools for Agents&lt;/strong&gt; (Stable)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Modern Web Guidance&lt;/strong&gt; (Early Preview)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Closing the Feedback Loop: DevTools for Agents
&lt;/h2&gt;

&lt;p&gt;Before this update, your local AI coding assistant was essentially deaf and blind. It could generate files, but it had no way of knowing what actually happened when the browser ran them.&lt;/p&gt;

&lt;p&gt;Chrome DevTools for Agents provides models with direct, programmatic access to the active browser tab sandbox using an optimized, hardened layer of the Chrome DevTools Protocol (CDP).&lt;/p&gt;

&lt;h3&gt;
  
  
  📊 How the Engineering Workflow Shifts
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workflow Area&lt;/th&gt;
&lt;th&gt;The Old Messenger Loop&lt;/th&gt;
&lt;th&gt;The Closed DevTools Loop&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Error Isolation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manual copying of console stack traces&lt;/td&gt;
&lt;td&gt;Agent reads exceptions in real time through native hooks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Network Auditing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Guessing why an API payload failed&lt;/td&gt;
&lt;td&gt;Direct inspection of response codes &amp;amp; headers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Verification Gate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Human refreshes page and runs checks&lt;/td&gt;
&lt;td&gt;Agent triggers autonomous Lighthouse testing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security Context&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High risk of clipboard credential leakage&lt;/td&gt;
&lt;td&gt;Isolated runtime sandbox with credential masking&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Instead of acting as a manual bridge, your local IDE assistant hooks into the browser, tracks active state machines, inspects failed fetch calls, and runs its own accessibility tree audits.&lt;/p&gt;
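
&lt;p&gt;To make that concrete, here is a minimal sketch of the filtering step an agent-side client might run over raw protocol events. It assumes the events arrive as JSON messages shaped like the public CDP events &lt;code&gt;Runtime.exceptionThrown&lt;/code&gt; and &lt;code&gt;Network.responseReceived&lt;/code&gt;. How Chrome DevTools for Agents actually delivers them is not covered here, so the sample payloads and the helper names &lt;code&gt;summarize_event&lt;/code&gt; and &lt;code&gt;collect_findings&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import json

# Sample messages shaped like CDP events. The values are made up for
# illustration and do not come from a real browser session.
SAMPLE_EVENTS = [
    json.dumps({
        "method": "Runtime.exceptionThrown",
        "params": {
            "timestamp": 1.0,
            "exceptionDetails": {
                "text": "Uncaught",
                "url": "http://localhost:3000/app.js",
                "lineNumber": 41,
                "columnNumber": 7,
                "exception": {"description": "TypeError: cart is undefined"},
            },
        },
    }),
    json.dumps({
        "method": "Network.responseReceived",
        "params": {
            "requestId": "1",
            "response": {"url": "http://localhost:3000/api/cart", "status": 500},
        },
    }),
    json.dumps({
        "method": "Network.responseReceived",
        "params": {
            "requestId": "2",
            "response": {"url": "http://localhost:3000/api/user", "status": 200},
        },
    }),
]


def summarize_event(raw_message):
    """Return a one-line finding for the model, or None if the event is benign."""
    event = json.loads(raw_message)
    method = event.get("method")
    params = event.get("params", {})

    if method == "Runtime.exceptionThrown":
        details = params["exceptionDetails"]
        description = details.get("exception", {}).get("description", details["text"])
        # CDP line numbers are zero-based, so add 1 for a readable location.
        location = "{}:{}".format(details.get("url", "unknown"), details["lineNumber"] + 1)
        return "exception at {}: {}".format(location, description)

    if method == "Network.responseReceived":
        response = params["response"]
        # Treat 4xx and 5xx status codes as failed requests.
        if response["status"] // 100 in (4, 5):
            return "failed request {}: HTTP {}".format(response["url"], response["status"])

    return None


def collect_findings(raw_messages):
    """Keep only the events worth feeding back to the model."""
    summaries = (summarize_event(message) for message in raw_messages)
    return [summary for summary in summaries if summary is not None]


if __name__ == "__main__":
    for finding in collect_findings(SAMPLE_EVENTS):
        print(finding)
```

&lt;p&gt;Run against the sample events, this reports the thrown exception and the HTTP 500 response and drops the successful request, which is roughly the triage you used to do by hand before pasting anything into a chat window.&lt;/p&gt;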

&lt;p&gt;&lt;strong&gt;LY Corporation&lt;/strong&gt;, one of the largest web platforms in Japan, integrated this closed-loop runtime into its internal pipeline and reduced manual frontend performance analysis overhead by &lt;strong&gt;96% to 98%&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The new operational flow is simple: the agent generates a feature branch, runs it inside the local browser, catches its own console errors, fixes the code and layout, and only flags you once the feature completes a clean run.&lt;/p&gt;

&lt;h2&gt;
  
  
  Guardrails on Silicon: Modern Web Guidance
&lt;/h2&gt;

&lt;p&gt;The second major update addressing quality of execution is &lt;strong&gt;Modern Web Guidance&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Left to their own devices, LLMs often write code that technically runs but relies on outdated APIs, non-standard layout behaviors, or accessibility patterns that fail basic WCAG audits.&lt;/p&gt;

&lt;p&gt;Modern Web Guidance provides over 100 expert-vetted architectural skills directly to your local model. Crucially, it links directly into &lt;strong&gt;Baseline&lt;/strong&gt;, the initiative that tracks which web platform features are available across major browsers. This allows your agent to understand your project’s browser support targets and apply the correct legacy fallbacks.&lt;/p&gt;

&lt;p&gt;It drops into your local workflow with a single terminal command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx modern-web-guidance &lt;span class="nb"&gt;install&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The aim is for your agent to stop generating code that only works on your own machine and start producing web code that is safe to ship to production.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Subtext: The Deprecation of Gemini CLI
&lt;/h2&gt;

&lt;p&gt;There is a major breaking pipeline update that has gone entirely unmentioned in the tech press: &lt;strong&gt;the official retirement of the Gemini CLI binary.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If your engineering setup relies on build scripts or local CI automations calling the standard &lt;code&gt;gemini&lt;/code&gt; command-line tools, those pipelines face a hard stop. The legacy CLI is being phased out completely in favor of the &lt;strong&gt;Antigravity 2.0 engine&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;While your custom plugins, skills, and subagent scripts will migrate cleanly, the core execution philosophy is different. The legacy tool required step-by-step user instructions. Antigravity 2.0 acts as a self-directed runtime engine: you define the target state of the code, and the system coordinates the local file changes and sandbox testing loops on its own. If you have active automation scripts running, now is the time to update how they invoke the tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Moving Past the Architecture Fluff
&lt;/h2&gt;

&lt;p&gt;When WebMCP was introduced, commentators rushed to label it as "Chrome turning into an operating system for machines." It's a great marketing hook, but it glosses over a massive engineering challenge: &lt;strong&gt;user legibility&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;WebMCP is an opt-in contract designed for programmatic efficiency. It makes it simple for an automated software system to complete an action inside a browser tab using the user's active session cookies. But as developers, we have to ask the hard user-experience questions: &lt;em&gt;When a client-side agent silently executes a financial transaction or other significant change in the background, how do we surface that step clearly to the human user who initiated it?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The engineering challenge over the next year isn't going to be making our frontends more efficient for AI engines to crawl—it’s going to be designing the UI guardrails and visible verification steps that ensure users actually trust what their client-side software is doing on their behalf.&lt;/p&gt;

&lt;h2&gt;
  
  
  Action Items for This Week
&lt;/h2&gt;

&lt;p&gt;You don’t have to wait for experimental web standards to go mainstream to change how you work. You can optimize your local workflow today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Expose the DevTools Port:&lt;/strong&gt; Connect your preferred local IDE agent to Chrome DevTools for Agents, which is listed above as stable. Let the model parse its own frontend layout bugs instead of manually triaging...&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run the Guidance Bundle:&lt;/strong&gt; Install &lt;code&gt;modern-web-guidance&lt;/code&gt; in your workspace to cut down on legacy code generation workarounds and enforce baseline accessibility rules automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refactor Build Pipelines:&lt;/strong&gt; Audit your build and deployment scripts for any dependencies on the old &lt;code&gt;gemini&lt;/code&gt; CLI binary and migrate those calls to Antigravity 2.0.&lt;/li&gt;
&lt;/ul&gt;
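
&lt;p&gt;The pipeline audit in the last item can start as a plain text scan. The sketch below is one rough way to do it, not an official migration tool: it assumes your automation lives in shell or YAML files and that the binary is invoked as the bare word &lt;code&gt;gemini&lt;/code&gt;. The names &lt;code&gt;find_gemini_calls&lt;/code&gt; and &lt;code&gt;audit_tree&lt;/code&gt;, the suffix list, and the placeholder arguments in the demo are all hypothetical.&lt;/p&gt;

```python
import re
from pathlib import Path

# Matches "gemini" as a standalone command word, optionally behind a path.
# Names such as "gemini-helper" or "my_gemini" are deliberately not matched.
GEMINI_CALL = re.compile(r"(?:^|[\s;|(/])gemini(?=\s|$)")

# Assumed set of file types that hold automation; adjust for your repository.
SCRIPT_SUFFIXES = {".sh", ".bash", ".yml", ".yaml"}


def find_gemini_calls(text):
    """Return (line_number, line) pairs that appear to invoke the gemini CLI."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore comment-only lines
        if GEMINI_CALL.search(stripped):
            hits.append((number, stripped))
    return hits


def audit_tree(root):
    """Scan script-like files under root and map each path to its hits."""
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in SCRIPT_SUFFIXES:
            hits = find_gemini_calls(path.read_text(errors="ignore"))
            if hits:
                report[str(path)] = hits
    return report


if __name__ == "__main__":
    # PLACEHOLDER_ARGS stands in for whatever arguments your scripts pass.
    sample = "\n".join([
        "#!/bin/sh",
        "# old step: gemini PLACEHOLDER_ARGS",
        "npm run build",
        "gemini PLACEHOLDER_ARGS",
        "echo done | gemini-helper",
    ])
    for number, line in find_gemini_calls(sample):
        print("line {}: {}".format(number, line))
```

&lt;p&gt;A scan like this only tells you where to look. Each hit still needs a human decision about what the equivalent Antigravity invocation should be.&lt;/p&gt;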

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Google I/O 2026 announced plenty of raw capability milestones: larger context sizes, multi-modal video parsing, and faster cloud inference.&lt;/p&gt;

&lt;p&gt;But the real transformation is taking place locally inside the client browser. By removing the friction from local development loops and providing open models with direct runtime visibility, Chrome has eliminated the plumbing overhead of building software with AI. The era of acting as a manual copy-paste messenger between your code editor and your browser console is officially over.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔗 Resources &amp;amp; Official Tracking
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🚀 &lt;strong&gt;Official Documentation:&lt;/strong&gt; &lt;a href="https://developer.chrome.com/docs/devtools/agents" rel="noopener noreferrer"&gt;Chrome DevTools for Agents Setup Guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;💻 &lt;strong&gt;Origin Trial Registration:&lt;/strong&gt; &lt;a href="https://www.google.com/search?q=https://github.com/google/webmcp-specification" rel="noopener noreferrer"&gt;WebMCP W3C Standard Proposal Status&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🏆 Submitted to the official Google I/O 2026 Writing Challenge.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pwa2ej61ignsbhdfder.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pwa2ej61ignsbhdfder.png" alt=" " width="799" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;💬 Let's Talk Workflow&lt;br&gt;
Have you had a chance to spin up Chrome DevTools for Agents in your local workspace yet? I'm genuinely curious to see how other teams are hooking this into their environments—are you pairing it with Antigravity, Claude Code, or a custom local orchestration pipeline?&lt;/p&gt;

&lt;p&gt;Drop a comment below with your setup or any friction points you ran into. Let’s map out the boundaries of this closed-loop setup together!&lt;/p&gt;

&lt;p&gt;🤖 AI Transparency Disclosure&lt;br&gt;
In accordance with the challenge rules, I'm transparent about my use of AI tools:&lt;/p&gt;

&lt;p&gt;Writing assistance: I used Gemini to help refine sentence structure, optimize markdown formatting, and polish the overall flow of this post. All technical critiques, workflow evaluations, and opinions expressed here are my own.&lt;/p&gt;

&lt;p&gt;Images: The cover image and the in‑content diagram were both generated using Gemini.&lt;/p&gt;

&lt;p&gt;I believe in using AI as a collaborative tool — not as a replacement for human judgment. The debugging workflows, tool comparisons, and actionable recommendations in this article came from my own testing and engineering experience.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>googleiochallenge</category>
      <category>ai</category>
      <category>antigravity</category>
    </item>
  </channel>
</rss>
