<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Juan Pablo Enriquez Ortiz</title>
    <description>The latest articles on DEV Community by Juan Pablo Enriquez Ortiz (@jpablortiz96).</description>
    <link>https://dev.to/jpablortiz96</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3846843%2F21ecb04b-d1ec-48ce-8480-ecb3645d37cb.png</url>
      <title>DEV Community: Juan Pablo Enriquez Ortiz</title>
      <link>https://dev.to/jpablortiz96</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jpablortiz96"/>
    <language>en</language>
    <item>
      <title>Museum of Dead Dreams: Turning Abandoned GitHub Repos into AI Revival Plans and Copilot Kits</title>
      <dc:creator>Juan Pablo Enriquez Ortiz</dc:creator>
      <pubDate>Thu, 04 Jun 2026 20:48:05 +0000</pubDate>
      <link>https://dev.to/jpablortiz96/museum-of-dead-dreams-turning-abandoned-github-repos-into-ai-revival-plans-and-copilot-kits-4e5d</link>
      <guid>https://dev.to/jpablortiz96/museum-of-dead-dreams-turning-abandoned-github-repos-into-ai-revival-plans-and-copilot-kits-4e5d</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-05-21"&gt;GitHub Finish-Up-A-Thon Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Museum of Dead Dreams: Turning Abandoned GitHub Repos into AI Revival Plans and Copilot Kits
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flcvpace8g4ufsfi28hgl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flcvpace8g4ufsfi28hgl.png" alt="Museum of Dead Dreams Hero" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What if abandoned repositories were not just dead code, but unfinished stories waiting for the right resurrection plan?&lt;/p&gt;

&lt;p&gt;That idea became &lt;strong&gt;Museum of Dead Dreams&lt;/strong&gt;: an AI-powered product that transforms forgotten GitHub repositories into interactive museum exhibits, grounded technical autopsies, revival strategies, branded PDFs, and GitHub Copilot-ready execution kits.&lt;/p&gt;

&lt;p&gt;This project started as a visually interesting concept. I finished it as a real product.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Museum of Dead Dreams is a virtual museum for abandoned software projects.&lt;/p&gt;

&lt;p&gt;A user enters a GitHub username, and the app:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;scans public repositories&lt;/li&gt;
&lt;li&gt;filters out forks&lt;/li&gt;
&lt;li&gt;ranks projects by abandonment&lt;/li&gt;
&lt;li&gt;collects real repo evidence like languages, commits, README excerpts, root files, and manifest snippets&lt;/li&gt;
&lt;li&gt;generates an AI-powered personalized museum&lt;/li&gt;
&lt;li&gt;turns each abandoned project into an exhibit&lt;/li&gt;
&lt;li&gt;lets the user ask a &lt;strong&gt;Copilot Curator&lt;/strong&gt; project-specific questions&lt;/li&gt;
&lt;li&gt;creates a &lt;strong&gt;Revival Plan&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;exports that plan as &lt;strong&gt;Markdown&lt;/strong&gt; or a &lt;strong&gt;branded PDF&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;saves selected projects into &lt;strong&gt;Resurrection Bay&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;generates a &lt;strong&gt;GitHub Copilot Resurrection Kit&lt;/strong&gt; that can be dropped into a real repository&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not just a portfolio viewer or a UI gimmick. It is a complete workflow for recovering value from unfinished code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why I built it
&lt;/h3&gt;

&lt;p&gt;Every developer has a graveyard.&lt;/p&gt;

&lt;p&gt;Old side projects. Half-finished tools. Hackathon builds. Startup experiments that almost became something.&lt;/p&gt;

&lt;p&gt;GitHub preserves the files, but it usually does not preserve the context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why did the project die?&lt;/li&gt;
&lt;li&gt;What still makes it valuable?&lt;/li&gt;
&lt;li&gt;What should be rebuilt first?&lt;/li&gt;
&lt;li&gt;How can a coding agent like GitHub Copilot actually help revive it?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Museum of Dead Dreams is my answer to that problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Live Links
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Live Experience:&lt;/strong&gt; &lt;a href="https://osiam2phyuryk.kimi.page" rel="noopener noreferrer"&gt;https://osiam2phyuryk.kimi.page&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Repository:&lt;/strong&gt; &lt;a href="https://github.com/jpablortiz96/Museum-of-Dead-Dreams" rel="noopener noreferrer"&gt;https://github.com/jpablortiz96/Museum-of-Dead-Dreams&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Video Walkthrough:&lt;/strong&gt; &lt;a href="https://youtu.be/qmLwdTxxjIA" rel="noopener noreferrer"&gt;https://youtu.be/qmLwdTxxjIA&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Product Flow
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fepi0aobgr9sshmp9h8vp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fepi0aobgr9sshmp9h8vp.png" alt="Product Flow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The experience works like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Enter a GitHub username
&lt;/li&gt;
&lt;li&gt;Scan public repositories
&lt;/li&gt;
&lt;li&gt;Rank the most abandoned projects
&lt;/li&gt;
&lt;li&gt;Generate a personalized AI museum
&lt;/li&gt;
&lt;li&gt;Explore exhibits, causes of death, and technical artifacts
&lt;/li&gt;
&lt;li&gt;Ask the Copilot Curator project-specific questions
&lt;/li&gt;
&lt;li&gt;Open a six-part Revival Plan
&lt;/li&gt;
&lt;li&gt;Export it as Markdown or branded PDF
&lt;/li&gt;
&lt;li&gt;Commit revived projects into Resurrection Bay
&lt;/li&gt;
&lt;li&gt;Download a Copilot Resurrection Kit and continue the rebuild in the original repo&lt;/li&gt;
&lt;/ol&gt;
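
&lt;p&gt;Steps 2 and 3 can be illustrated with a short Python sketch. It is not the app's actual TypeScript code: the sample data is invented, and using days since the last push as the abandonment signal is an assumption. The &lt;code&gt;fork&lt;/code&gt; and &lt;code&gt;pushed_at&lt;/code&gt; fields are the ones the GitHub REST API returns for each repository.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from datetime import datetime, timezone

# Hypothetical sample shaped like GitHub REST API repository objects.
# "fork" and "pushed_at" are real fields; the values are invented.
SAMPLE_REPOS = [
    {"name": "old-game", "fork": False, "pushed_at": "2019-03-01T12:00:00Z"},
    {"name": "forked-lib", "fork": True, "pushed_at": "2018-01-01T00:00:00Z"},
    {"name": "fresh-tool", "fork": False, "pushed_at": "2026-05-01T08:30:00Z"},
]


def days_since_push(repo, now):
    """Days between the last push and `now` (assumed abandonment signal)."""
    pushed = datetime.strptime(repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ")
    pushed = pushed.replace(tzinfo=timezone.utc)
    return (now - pushed).days


def rank_by_abandonment(repos, now):
    """Drop forks, then sort so the longest-idle repositories come first."""
    own = [repo for repo in repos if not repo["fork"]]
    return sorted(own, key=lambda repo: days_since_push(repo, now), reverse=True)


now = datetime(2026, 6, 4, tzinfo=timezone.utc)
ranked = rank_by_abandonment(SAMPLE_REPOS, now)
print([repo["name"] for repo in ranked])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the sample data this prints &lt;code&gt;['old-game', 'fresh-tool']&lt;/code&gt;: the fork is dropped and the repository idle since 2019 ranks first.&lt;/p&gt;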




&lt;h2&gt;
  
  
  Real Product Screenshots
&lt;/h2&gt;

&lt;p&gt;These are real screenshots captured from the running application.&lt;/p&gt;

&lt;h3&gt;
  
  
  Welcome Screen
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fngjntrgfwgi4w83la4p5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fngjntrgfwgi4w83la4p5.png" alt="Welcome Screen" width="800" height="590"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Loading Graveyard
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0hatocpd0yp0uowg7dz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0hatocpd0yp0uowg7dz.png" alt="Loading Graveyard" width="800" height="590"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Museum Hall
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqrqv3wv9b8z3oo89m8nf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqrqv3wv9b8z3oo89m8nf.png" alt="Museum Hall" width="800" height="590"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Exhibit Room
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6wmhrhd5ggwjex0snlsn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6wmhrhd5ggwjex0snlsn.png" alt="Exhibit Room" width="800" height="590"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Copilot Curator
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxu7vistp2hq2qpinbl8b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxu7vistp2hq2qpinbl8b.png" alt="Copilot Curator" width="800" height="647"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Revival Plan
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fx52p9q1vheoo5mzp1d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fx52p9q1vheoo5mzp1d.png" alt="Revival Plan" width="800" height="922"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Resurrection Bay
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk17o8lznkwv4selslzgo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk17o8lznkwv4selslzgo.png" alt="Resurrection Bay" width="800" height="590"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Comeback Story
&lt;/h2&gt;

&lt;p&gt;This is the part that matters most for the &lt;strong&gt;Finish-Up-A-Thon&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Museum of Dead Dreams did &lt;strong&gt;not&lt;/strong&gt; start as the polished product you see now.&lt;/p&gt;

&lt;p&gt;It started as a static concept: an atmospheric museum with four hardcoded rooms. It looked interesting, but it was not yet a complete, usable system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;a static concept museum&lt;/li&gt;
&lt;li&gt;four hardcoded rooms&lt;/li&gt;
&lt;li&gt;no live GitHub user analysis&lt;/li&gt;
&lt;li&gt;no grounded repo evidence&lt;/li&gt;
&lt;li&gt;no real AI revival workflow&lt;/li&gt;
&lt;li&gt;no persistent archive&lt;/li&gt;
&lt;li&gt;no Copilot execution handoff&lt;/li&gt;
&lt;li&gt;no exportable decision artifact&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  After
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;personalized museum generation for any GitHub username&lt;/li&gt;
&lt;li&gt;GitHub API repo ingestion&lt;/li&gt;
&lt;li&gt;abandonment ranking and repo evidence enrichment&lt;/li&gt;
&lt;li&gt;AI-generated exhibit narrative grounded in real repo context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Copilot Curator&lt;/strong&gt; for per-project Q&amp;amp;A&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Revival Plans&lt;/strong&gt; with diagnosis, architecture, stack, features, go-to-market (GTM), and score&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Markdown export&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Branded PDF export&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resurrection Bay&lt;/strong&gt; as a persistent archive of revived ideas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Copilot Resurrection Kits&lt;/strong&gt; ready to drop into a real repo&lt;/li&gt;
&lt;li&gt;shareable museum URLs&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Before vs After Visual
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9o4kkgjbmeewz0glcl9v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9o4kkgjbmeewz0glcl9v.png" alt="Before and After" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The challenge was not just adding more features.&lt;/p&gt;

&lt;p&gt;The real challenge was turning a concept into a complete product with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a clear workflow&lt;/li&gt;
&lt;li&gt;a stronger information architecture&lt;/li&gt;
&lt;li&gt;live repo analysis&lt;/li&gt;
&lt;li&gt;fallback-safe AI behavior&lt;/li&gt;
&lt;li&gt;exportable outputs&lt;/li&gt;
&lt;li&gt;and a real “next step” after inspiration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That “next step” became one of my favorite parts of the project: the &lt;strong&gt;Copilot Resurrection Kit&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Most Important Product Idea
&lt;/h2&gt;

&lt;p&gt;I did not want this to become “just another cool AI interface.”&lt;/p&gt;

&lt;p&gt;So I asked myself:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;What happens after the museum tells you a project is worth reviving?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The answer was: the app should help you act on it.&lt;/p&gt;

&lt;p&gt;That is why Museum of Dead Dreams generates a &lt;strong&gt;GitHub Copilot Resurrection Kit&lt;/strong&gt; with files like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;.github/copilot-instructions.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AGENTS.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.github/instructions/resurrection.instructions.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.github/skills/&amp;lt;project&amp;gt;-resurrection/SKILL.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;docs/revival-plan.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;docs/resurrection-backlog.md&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is the visual for that system:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8i6gb3dqeqphzhp7njsr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8i6gb3dqeqphzhp7njsr.png" alt="Copilot Kit" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This makes the project more than a diagnosis engine.&lt;/p&gt;

&lt;p&gt;It becomes an execution bridge between abandoned code and the next real commit.&lt;/p&gt;
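
&lt;p&gt;To make the kit layout concrete, here is a minimal Python sketch that packs the file paths listed above into a zip archive. It is only an illustration: the app itself uses JSZip, and &lt;code&gt;build_kit&lt;/code&gt;, the &lt;code&gt;old-game&lt;/code&gt; project name, and the placeholder file contents are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import io
import zipfile


def build_kit(project_slug, revival_plan_markdown):
    """Return the bytes of a zip holding the kit files at their repo paths."""
    # Paths follow the list above; every file body here is a placeholder.
    files = {
        ".github/copilot-instructions.md": f"# Copilot instructions for {project_slug}\n",
        "AGENTS.md": f"# Agents guide for {project_slug}\n",
        ".github/instructions/resurrection.instructions.md": "# Resurrection rules\n",
        f".github/skills/{project_slug}-resurrection/SKILL.md": "# Resurrection skill\n",
        "docs/revival-plan.md": revival_plan_markdown,
        "docs/resurrection-backlog.md": "# Backlog\n",
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


kit_bytes = build_kit("old-game", "# Revival plan\n")
with zipfile.ZipFile(io.BytesIO(kit_bytes)) as archive:
    kit_paths = sorted(archive.namelist())
print(kit_paths)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Unzipping the result at the root of a repository would place each file where the list above expects it.&lt;/p&gt;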




&lt;h2&gt;
  
  
  Technical Highlights
&lt;/h2&gt;

&lt;p&gt;Museum of Dead Dreams is built with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;React 19&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TypeScript&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Vite&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tailwind CSS&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;shadcn/ui&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Node.js&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenAI SDK&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zod&lt;/strong&gt; for structured output validation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GitHub API&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;html2canvas + jsPDF&lt;/strong&gt; for branded PDF export&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSZip&lt;/strong&gt; for Copilot kit downloads&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;localStorage&lt;/code&gt; for persistence and per-museum Resurrection Bay archives&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  High-Level Architecture
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1wo8obk2pe3ria2cwh6h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1wo8obk2pe3ria2cwh6h.png" alt="Architecture" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A few technical choices I’m especially proud of:&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Grounded repo analysis
&lt;/h4&gt;

&lt;p&gt;The app does not just ask an LLM to imagine a project story. It collects real context from the repo first:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;languages&lt;/li&gt;
&lt;li&gt;commit count&lt;/li&gt;
&lt;li&gt;last commit&lt;/li&gt;
&lt;li&gt;README excerpt&lt;/li&gt;
&lt;li&gt;root files&lt;/li&gt;
&lt;li&gt;manifest snippets&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  2. AI with fallbacks
&lt;/h4&gt;

&lt;p&gt;The app stays functional even when AI is unavailable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;museum exhibit fallback generation&lt;/li&gt;
&lt;li&gt;revival plan fallback generation&lt;/li&gt;
&lt;li&gt;curator fallback answers&lt;/li&gt;
&lt;li&gt;caching and timeouts&lt;/li&gt;
&lt;/ul&gt;
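
&lt;p&gt;The pattern behind that list can be sketched in Python. This is an illustration rather than the app's implementation: &lt;code&gt;generate_exhibit&lt;/code&gt;, &lt;code&gt;fallback_exhibit&lt;/code&gt;, the in-memory cache, and the 5-second default timeout are all hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from concurrent.futures import ThreadPoolExecutor

_cache = {}


def fallback_exhibit(repo_name):
    """Deterministic template used when the model call fails or times out."""
    return {"title": f"The Lost {repo_name}", "source": "fallback"}


def generate_exhibit(repo_name, ai_call, timeout_seconds=5.0):
    """Return a cached or AI-generated exhibit, degrading to the template."""
    if repo_name in _cache:
        return _cache[repo_name]
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        exhibit = pool.submit(ai_call, repo_name).result(timeout=timeout_seconds)
        _cache[repo_name] = exhibit  # only successful AI results are cached
    except Exception:
        # Covers both a timeout and any error raised by the model call.
        exhibit = fallback_exhibit(repo_name)
    finally:
        # Do not block on a slow call; its worker thread finishes on its own.
        pool.shutdown(wait=False)
    return exhibit


def broken_model(repo_name):
    raise RuntimeError("model unavailable")


def working_model(repo_name):
    return {"title": f"Exhibit: {repo_name}", "source": "ai"}


first = generate_exhibit("old-game", broken_model)
second = generate_exhibit("old-game", working_model)
print(first["source"], second["source"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The first call prints &lt;code&gt;fallback&lt;/code&gt; and the second prints &lt;code&gt;ai&lt;/code&gt;. Because fallback results are not cached, a later call can still reach the model once it is available again.&lt;/p&gt;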

&lt;h4&gt;
  
  
  3. Persistent Resurrection Bay
&lt;/h4&gt;

&lt;p&gt;Revived projects are not temporary UI state. They are saved to &lt;code&gt;localStorage&lt;/code&gt; as part of an archive scoped to that museum.&lt;/p&gt;

&lt;h4&gt;
  
  
  4. Exportable outcomes
&lt;/h4&gt;

&lt;p&gt;The output is not trapped in the interface. Users can export plans and download repo-ready guidance artifacts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Business / Product Value
&lt;/h2&gt;

&lt;p&gt;One of the most interesting things about abandoned code is that it often represents &lt;strong&gt;hidden capital&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Not every dead repo deserves to come back. But many deserve a second look.&lt;/p&gt;

&lt;p&gt;Museum of Dead Dreams helps compress the time required to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;discover forgotten projects worth revisiting&lt;/li&gt;
&lt;li&gt;understand what they are&lt;/li&gt;
&lt;li&gt;diagnose why they died&lt;/li&gt;
&lt;li&gt;define what should happen next&lt;/li&gt;
&lt;li&gt;and package that thinking into something actionable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s the framing I used:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftbwyjmrgyboh346kxmvy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftbwyjmrgyboh346kxmvy.png" alt="ROI Dashboard" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;solo developers with years of side projects&lt;/li&gt;
&lt;li&gt;hackathon builders reviewing unfinished experiments&lt;/li&gt;
&lt;li&gt;startup founders revisiting prototype graveyards&lt;/li&gt;
&lt;li&gt;engineering managers evaluating internal abandoned tools&lt;/li&gt;
&lt;li&gt;open-source maintainers trying to prioritize what to revive&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  My Experience with GitHub Copilot
&lt;/h2&gt;

&lt;p&gt;GitHub Copilot was a meaningful part of finishing this project.&lt;/p&gt;

&lt;p&gt;I do not mean that in the shallow “AI wrote everything” sense.&lt;/p&gt;

&lt;p&gt;I mean that Copilot helped me behave like a more effective finisher.&lt;/p&gt;

&lt;p&gt;It supported the process by helping me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;inspect and reason about code structure faster&lt;/li&gt;
&lt;li&gt;tighten workflows across multiple components&lt;/li&gt;
&lt;li&gt;refine technical documentation&lt;/li&gt;
&lt;li&gt;shape product-facing explanations&lt;/li&gt;
&lt;li&gt;prepare repository instructions&lt;/li&gt;
&lt;li&gt;polish submission materials&lt;/li&gt;
&lt;li&gt;and move the project from “interesting prototype” to “launch-ready experience”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last part matters.&lt;/p&gt;

&lt;p&gt;The whole spirit of this challenge is not just to build something new, but to &lt;strong&gt;finally finish something that was left unfinished&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That is exactly what happened here.&lt;/p&gt;

&lt;p&gt;Copilot was especially useful in the finish-up stage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;repo polish&lt;/li&gt;
&lt;li&gt;docs&lt;/li&gt;
&lt;li&gt;architecture communication&lt;/li&gt;
&lt;li&gt;developer onboarding&lt;/li&gt;
&lt;li&gt;GitHub-ready packaging&lt;/li&gt;
&lt;li&gt;and turning the repo into something others can understand quickly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So my experience with GitHub Copilot was not just about speed.&lt;/p&gt;

&lt;p&gt;It was about momentum.&lt;/p&gt;

&lt;p&gt;It helped reduce friction in the exact phase where many side projects usually die: the final stretch.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Project Fits the Challenge
&lt;/h2&gt;

&lt;p&gt;This challenge asks for a clear comeback story.&lt;/p&gt;

&lt;p&gt;Museum of Dead Dreams is, in a way, a meta submission.&lt;/p&gt;

&lt;p&gt;It is a project about abandoned software.&lt;/p&gt;

&lt;p&gt;And it was itself revived and finished for a challenge about reviving unfinished work.&lt;/p&gt;

&lt;p&gt;That made the completion arc very natural and very honest.&lt;/p&gt;

&lt;p&gt;I did not just “add a few features.”&lt;br&gt;
I turned a concept into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a functional full-stack product&lt;/li&gt;
&lt;li&gt;a real AI workflow&lt;/li&gt;
&lt;li&gt;a polished public repository&lt;/li&gt;
&lt;li&gt;a demoable experience&lt;/li&gt;
&lt;li&gt;and a stronger statement about how we should think about unfinished code&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;Most unfinished projects are not failures.&lt;/p&gt;

&lt;p&gt;They are frozen momentum.&lt;/p&gt;

&lt;p&gt;Museum of Dead Dreams is built around the belief that with the right context, diagnosis, and execution handoff, abandoned repositories can become future products.&lt;/p&gt;

&lt;p&gt;If GitHub is where software lives, then maybe it should also be where dead ideas get a second chance.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/jpablortiz96/Museum-of-Dead-Dreams" rel="noopener noreferrer"&gt;https://github.com/jpablortiz96/Museum-of-Dead-Dreams&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Video Demo:&lt;/strong&gt; &lt;a href="https://youtu.be/qmLwdTxxjIA" rel="noopener noreferrer"&gt;https://youtu.be/qmLwdTxxjIA&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Built by &lt;strong&gt;Juan Pablo Enríquez Ortiz / Eduky&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
      <category>githubcopilot</category>
      <category>ai</category>
    </item>
    <item>
      <title>Hermes Agent Changed How I Think About AI Agents: From Answer Engines to Skill-Building Systems</title>
      <dc:creator>Juan Pablo Enriquez Ortiz</dc:creator>
      <pubDate>Sun, 31 May 2026 02:22:53 +0000</pubDate>
      <link>https://dev.to/jpablortiz96/hermes-agent-changed-how-i-think-about-ai-agents-from-answer-engines-to-skill-building-systems-1gd2</link>
      <guid>https://dev.to/jpablortiz96/hermes-agent-changed-how-i-think-about-ai-agents-from-answer-engines-to-skill-building-systems-1gd2</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Write About Hermes Agent&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Hermes Agent Changed How I Think About AI Agents: From Answer Engines to Skill-Building Systems
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;The next leap in AI agents is not just better answers.&lt;br&gt;&lt;br&gt;
It is reusable experience.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When people talk about AI agents, the conversation often starts with automation.&lt;/p&gt;

&lt;p&gt;Can the agent use tools?&lt;br&gt;&lt;br&gt;
Can it open files?&lt;br&gt;&lt;br&gt;
Can it run commands?&lt;br&gt;&lt;br&gt;
Can it complete a multi-step task?&lt;/p&gt;

&lt;p&gt;Those questions matter.&lt;/p&gt;

&lt;p&gt;But after spending time building with &lt;strong&gt;Hermes Agent&lt;/strong&gt;, I think the more interesting question is this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Can an agent turn one task into reusable knowledge for the next one?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That shift sounds small, but it changes everything.&lt;/p&gt;

&lt;p&gt;It moves agents from being answer engines to becoming skill-building systems.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Problem With Most AI Agent Workflows
&lt;/h2&gt;

&lt;p&gt;Most AI assistants are useful, but temporary.&lt;/p&gt;

&lt;p&gt;They help you solve a task in the moment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explain this codebase&lt;/li&gt;
&lt;li&gt;Summarize this file&lt;/li&gt;
&lt;li&gt;Suggest a fix&lt;/li&gt;
&lt;li&gt;Generate a script&lt;/li&gt;
&lt;li&gt;Run a command&lt;/li&gt;
&lt;li&gt;Help me understand an error&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is valuable.&lt;/p&gt;

&lt;p&gt;But once the task is done, the learning usually disappears.&lt;/p&gt;

&lt;p&gt;The next time you ask a similar question, the agent starts from scratch again.&lt;/p&gt;

&lt;p&gt;That creates a weird pattern:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Human learns slowly.
Agent answers quickly.
But the system itself does not get much better.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The human has to remember the context.&lt;/p&gt;

&lt;p&gt;The repo does not become easier to understand.&lt;/p&gt;

&lt;p&gt;The workflow does not become more reusable.&lt;/p&gt;

&lt;p&gt;The agent helps, but it does not accumulate operational experience that the next session can reuse.&lt;/p&gt;

&lt;p&gt;Hermes Agent made me think about this differently.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Insight: Agents Need Reusable Experience
&lt;/h2&gt;

&lt;p&gt;The most interesting thing about Hermes Agent is not simply that it can use tools.&lt;/p&gt;

&lt;p&gt;Many agent systems can use tools.&lt;/p&gt;

&lt;p&gt;What stood out to me is the idea that an agentic workflow can move through a loop like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Observe
  ↓
Reason
  ↓
Act
  ↓
Extract reusable knowledge
  ↓
Use that knowledge in the next pass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That last step is the important one.&lt;/p&gt;

&lt;p&gt;If the agent can create or reuse skills, then the system is not only completing a task.&lt;/p&gt;

&lt;p&gt;It is improving the next task.&lt;/p&gt;

&lt;p&gt;That creates a very different product design philosophy.&lt;/p&gt;

&lt;p&gt;Instead of building an app that asks:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What should the agent answer?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You start asking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What should the agent learn from this interaction?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a much stronger framing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Hermes Agent Feels Different
&lt;/h2&gt;

&lt;p&gt;Hermes Agent feels less like a black-box chatbot and more like a local agentic operating layer.&lt;/p&gt;

&lt;p&gt;The parts that stood out to me were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;CLI-first workflow&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Local execution&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tool use&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Terminal access&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Skill-based workflows&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multi-step reasoning&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A structure that encourages repeatable agent behavior&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The CLI-first design matters because it makes the agent feel closer to the developer workflow.&lt;/p&gt;

&lt;p&gt;Developers already live in terminals, repositories, file systems, and local environments.&lt;/p&gt;

&lt;p&gt;A local agent that can inspect, reason, and act in that environment feels much more natural than a detached chat window.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tool Use Is Not Enough
&lt;/h2&gt;

&lt;p&gt;A common trap in agent design is thinking that tool use alone makes something agentic.&lt;/p&gt;

&lt;p&gt;It does not.&lt;/p&gt;

&lt;p&gt;An agent that can run a command is useful.&lt;/p&gt;

&lt;p&gt;But an agent that knows &lt;strong&gt;when&lt;/strong&gt;, &lt;strong&gt;why&lt;/strong&gt;, and &lt;strong&gt;how&lt;/strong&gt; to run that command as part of a larger workflow is much more interesting.&lt;/p&gt;

&lt;p&gt;The difference looks like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Basic Tool Use&lt;/th&gt;
&lt;th&gt;Agentic Workflow&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Run &lt;code&gt;ls&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Inspect a repository structure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Read a file&lt;/td&gt;
&lt;td&gt;Identify architectural areas&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Run tests&lt;/td&gt;
&lt;td&gt;Understand project verification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Suggest a change&lt;/td&gt;
&lt;td&gt;Scope a safe contribution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complete one task&lt;/td&gt;
&lt;td&gt;Create reusable knowledge for future tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The real value is not the command.&lt;/p&gt;

&lt;p&gt;The value is the reasoning loop around the command.&lt;/p&gt;

&lt;p&gt;Hermes Agent encourages that loop.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Skill Layer Is the Big Deal
&lt;/h2&gt;

&lt;p&gt;The most important concept for me was the skill layer.&lt;/p&gt;

&lt;p&gt;Skills change the shape of an agentic system.&lt;/p&gt;

&lt;p&gt;Without skills, every interaction is mostly isolated.&lt;/p&gt;

&lt;p&gt;With skills, an agent can preserve procedures, context, and patterns that are useful later.&lt;/p&gt;

&lt;p&gt;That matters because real work is repetitive.&lt;/p&gt;

&lt;p&gt;Developers do not only solve one-off problems.&lt;/p&gt;

&lt;p&gt;They revisit the same repositories, the same commands, the same architecture, the same testing patterns, and the same contribution flows.&lt;/p&gt;

&lt;p&gt;A skill turns that repeated work into a reusable asset.&lt;/p&gt;

&lt;p&gt;That is where agents start to feel less like assistants and more like infrastructure.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Mental Model: Agent Memory Is Not Enough
&lt;/h2&gt;

&lt;p&gt;Memory is useful.&lt;/p&gt;

&lt;p&gt;But memory alone is not always operational.&lt;/p&gt;

&lt;p&gt;A memory might say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“This repository uses Python and pytest.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A skill can say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“When working in this repository, inspect these files first, run this verification flow, avoid this common pitfall, and use this process to scope a first contribution.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a big difference.&lt;/p&gt;

&lt;p&gt;Memory stores information.&lt;/p&gt;

&lt;p&gt;Skills store procedure.&lt;/p&gt;

&lt;p&gt;And procedure is what turns information into action.&lt;/p&gt;
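
&lt;p&gt;A minimal Python sketch of that distinction. The &lt;code&gt;Memory&lt;/code&gt; and &lt;code&gt;Skill&lt;/code&gt; classes, their fields, and the skill name are hypothetical and are not part of Hermes Agent's API; they only illustrate the gap between a stored fact and a stored procedure.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Memory:
    """A stored fact: useful context, but nothing to execute."""
    fact: str


@dataclass
class Skill:
    """A stored procedure: ordered steps plus pitfalls to avoid."""
    name: str
    steps: list[str] = field(default_factory=list)
    pitfalls: list[str] = field(default_factory=list)

    def as_checklist(self) -> str:
        """Render the procedure as a numbered checklist followed by warnings."""
        lines = [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        lines += [f"Avoid: {pitfall}" for pitfall in self.pitfalls]
        return "\n".join(lines)


memory = Memory(fact="This repository uses Python and pytest.")
skill = Skill(
    name="first-contribution",  # hypothetical skill name
    steps=[
        "Inspect README.md and the test directory",
        "Run the test suite",
        "Scope one small change",
    ],
    pitfalls=["Editing files outside the sandbox"],
)
print(skill.as_checklist())
```

&lt;p&gt;The memory can only be read back. The skill can be followed step by step, which is what makes it reusable on the next task.&lt;/p&gt;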




&lt;h2&gt;
  
  
  What I Learned While Building With Hermes
&lt;/h2&gt;

&lt;p&gt;While experimenting with Hermes Agent, I learned that strong agentic products need five things.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. A Clear Workflow
&lt;/h3&gt;

&lt;p&gt;If the user cannot understand what the agent is doing, the product feels like magic in the bad sense.&lt;/p&gt;

&lt;p&gt;The workflow should be visible:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input → Agent reasoning → Tool use → Output → Reusable artifact
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user should know where the agent is in the process.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Real Tool Boundaries
&lt;/h3&gt;

&lt;p&gt;Agents that can act need boundaries.&lt;/p&gt;

&lt;p&gt;A powerful agent without safety rules can become unpredictable.&lt;/p&gt;

&lt;p&gt;For developer tools, that means asking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can the agent modify files?&lt;/li&gt;
&lt;li&gt;Where can it modify files?&lt;/li&gt;
&lt;li&gt;Can it install packages?&lt;/li&gt;
&lt;li&gt;Can it push code?&lt;/li&gt;
&lt;li&gt;Can it run destructive commands?&lt;/li&gt;
&lt;li&gt;Is there a sandbox?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The more capable the agent becomes, the more important the safety model becomes.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. Reusable Artifacts
&lt;/h3&gt;

&lt;p&gt;A great agentic workflow should leave something behind.&lt;/p&gt;

&lt;p&gt;Not just an answer.&lt;/p&gt;

&lt;p&gt;A useful artifact.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A skill&lt;/li&gt;
&lt;li&gt;A checklist&lt;/li&gt;
&lt;li&gt;A structured analysis&lt;/li&gt;
&lt;li&gt;A diff&lt;/li&gt;
&lt;li&gt;A test&lt;/li&gt;
&lt;li&gt;A report&lt;/li&gt;
&lt;li&gt;A reusable command flow&lt;/li&gt;
&lt;li&gt;A decision log&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where agentic systems become compounding systems.&lt;/p&gt;




&lt;h3&gt;
  
  
  4. A Second Pass
&lt;/h3&gt;

&lt;p&gt;The second pass is underrated.&lt;/p&gt;

&lt;p&gt;The first pass shows that the agent can understand.&lt;/p&gt;

&lt;p&gt;The second pass shows that the agent can improve.&lt;/p&gt;

&lt;p&gt;That is a more powerful story than a single output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;First pass: “I understand this.”
Second pass: “I can now use what I learned.”
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the beginning of agentic learning as a product experience.&lt;/p&gt;




&lt;h3&gt;
  
  
  5. Visible Reasoning Without Exposing Chaos
&lt;/h3&gt;

&lt;p&gt;Developer users need trust.&lt;/p&gt;

&lt;p&gt;They do not necessarily need to see every token or every internal detail, but they do need to see evidence.&lt;/p&gt;

&lt;p&gt;Good agent UX should show:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What was inspected&lt;/li&gt;
&lt;li&gt;What tools were used&lt;/li&gt;
&lt;li&gt;What files mattered&lt;/li&gt;
&lt;li&gt;What changed&lt;/li&gt;
&lt;li&gt;What was verified&lt;/li&gt;
&lt;li&gt;What the agent learned&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That visibility turns agent output into something users can trust.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Open Agentic Systems Mean for Developers
&lt;/h2&gt;

&lt;p&gt;Open agentic systems matter because developers need control.&lt;/p&gt;

&lt;p&gt;If agents are going to operate in real development environments, developers should be able to understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What model or provider is being used&lt;/li&gt;
&lt;li&gt;What tools are enabled&lt;/li&gt;
&lt;li&gt;What files are accessible&lt;/li&gt;
&lt;li&gt;What commands can be run&lt;/li&gt;
&lt;li&gt;Where outputs are stored&lt;/li&gt;
&lt;li&gt;How reusable skills are created&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Closed, opaque agent systems can be impressive.&lt;/p&gt;

&lt;p&gt;But open, inspectable agent systems are easier to trust, debug, extend, and integrate.&lt;/p&gt;

&lt;p&gt;Hermes Agent fits into that direction.&lt;/p&gt;

&lt;p&gt;It gives developers a way to build agentic workflows that feel closer to real software systems than isolated chat sessions.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Practical Pattern: Analyze → Skill → Improve → Act
&lt;/h2&gt;

&lt;p&gt;One pattern I found especially powerful is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Analyze
  ↓
Generate skills
  ↓
Run a second pass
  ↓
Act safely
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern can apply to many developer workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repository onboarding&lt;/li&gt;
&lt;li&gt;Code review&lt;/li&gt;
&lt;li&gt;Documentation generation&lt;/li&gt;
&lt;li&gt;Test planning&lt;/li&gt;
&lt;li&gt;Incident response&lt;/li&gt;
&lt;li&gt;DevOps runbooks&lt;/li&gt;
&lt;li&gt;Data pipeline debugging&lt;/li&gt;
&lt;li&gt;Release checklists&lt;/li&gt;
&lt;li&gt;Migration planning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The important thing is that the agent does not simply complete a task.&lt;/p&gt;

&lt;p&gt;It creates a workflow that can be reused.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where Hermes Agent Shines
&lt;/h2&gt;

&lt;p&gt;Based on my experience, Hermes Agent is especially interesting when the task requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local context&lt;/li&gt;
&lt;li&gt;Tool use&lt;/li&gt;
&lt;li&gt;Multi-step reasoning&lt;/li&gt;
&lt;li&gt;Reusable procedures&lt;/li&gt;
&lt;li&gt;Developer workflows&lt;/li&gt;
&lt;li&gt;Filesystem interaction&lt;/li&gt;
&lt;li&gt;Iterative improvement&lt;/li&gt;
&lt;li&gt;A visible bridge between reasoning and action&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes it a strong fit for projects where the agent is not just answering questions, but operating inside a workflow.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where You Still Need to Be Careful
&lt;/h2&gt;

&lt;p&gt;Powerful agents need careful design.&lt;/p&gt;

&lt;p&gt;A few lessons became clear very quickly:&lt;/p&gt;

&lt;h3&gt;
  
  
  Do not give write access too early
&lt;/h3&gt;

&lt;p&gt;Let the agent inspect first.&lt;/p&gt;

&lt;p&gt;Only allow modifications once the workflow is clear.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use sandboxes
&lt;/h3&gt;

&lt;p&gt;If an agent can modify code, isolate the changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Avoid hidden destructive commands
&lt;/h3&gt;

&lt;p&gt;Block or review commands like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo
rm -rf
git push
apt-get
global package installs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
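
&lt;p&gt;One way to enforce a list like this is a small guard that runs before any command is executed. The sketch below is an assumption about how such a check could look, not how Hermes Agent implements it; the names &lt;code&gt;BLOCKED_PROGRAMS&lt;/code&gt;, &lt;code&gt;BLOCKED_PREFIXES&lt;/code&gt;, and &lt;code&gt;is_blocked&lt;/code&gt; are hypothetical, and &lt;code&gt;npm install -g&lt;/code&gt; stands in for "global package installs".&lt;/p&gt;

```python
import shlex

# Hypothetical denylist mirroring the commands listed above.
BLOCKED_PROGRAMS = {"sudo", "apt-get"}
BLOCKED_PREFIXES = [
    ("rm", "-rf"),
    ("git", "push"),
    ("npm", "install", "-g"),  # one example of a global package install
]


def is_blocked(command: str) -> bool:
    """Return True if the command should be blocked or sent for human review."""
    tokens = shlex.split(command)
    if not tokens:
        return False
    if tokens[0] in BLOCKED_PROGRAMS:
        return True
    # Compare the leading tokens of the command against each blocked prefix.
    return any(tuple(tokens[: len(prefix)]) == prefix for prefix in BLOCKED_PREFIXES)


for candidate in ["git status", "git push origin main", "sudo ls"]:
    print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

&lt;p&gt;A denylist like this is easy to sidestep (reordered flags, shell chaining), so it works best as a review trigger on top of a sandbox rather than as the only safeguard.&lt;/p&gt;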



&lt;h3&gt;
  
  
  Validate outputs
&lt;/h3&gt;

&lt;p&gt;Structured JSON, tests, diffs, and verification commands make agent behavior easier to trust.&lt;/p&gt;

&lt;h3&gt;
  
  
  Build fallback paths
&lt;/h3&gt;

&lt;p&gt;Provider quotas, timeouts, and model errors are real.&lt;/p&gt;

&lt;p&gt;A good agentic product should fail gracefully.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bigger Shift
&lt;/h2&gt;

&lt;p&gt;The old way of thinking about AI assistants was:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“How can this model answer my question?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The new way of thinking about agents is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“How can this system complete a workflow, preserve what it learned, and improve the next workflow?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is why Hermes Agent is interesting.&lt;/p&gt;

&lt;p&gt;It points toward agents as systems that can accumulate useful operational experience.&lt;/p&gt;

&lt;p&gt;Not consciousness.&lt;/p&gt;

&lt;p&gt;Not magic.&lt;/p&gt;

&lt;p&gt;Just practical, reusable, developer-controlled experience.&lt;/p&gt;

&lt;p&gt;That is enough to be a big deal.&lt;/p&gt;




&lt;h2&gt;
  
  
  My Takeaway
&lt;/h2&gt;

&lt;p&gt;Hermes Agent made me think about agentic development in a more product-oriented way.&lt;/p&gt;

&lt;p&gt;The most exciting agent products will not be the ones that simply generate the longest answers.&lt;/p&gt;

&lt;p&gt;They will be the ones that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use tools responsibly&lt;/li&gt;
&lt;li&gt;Create reusable skills&lt;/li&gt;
&lt;li&gt;Make their process visible&lt;/li&gt;
&lt;li&gt;Improve over repeated use&lt;/li&gt;
&lt;li&gt;Act safely inside clear boundaries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The future of agents is not just automation.&lt;br&gt;&lt;br&gt;
It is reusable operational intelligence.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;Most agents answer.&lt;/p&gt;

&lt;p&gt;Better agents act.&lt;/p&gt;

&lt;p&gt;The most useful agents learn from action and turn that learning into reusable skills.&lt;/p&gt;

&lt;p&gt;That is the direction I want more developer tools to explore.&lt;/p&gt;

&lt;p&gt;And that is why Hermes Agent is worth paying attention to.&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>ai</category>
    </item>
    <item>
      <title>Hermes Repo Dojo: Most Agents Answer. Hermes Learns. Then It Safely Contributes.</title>
      <dc:creator>Juan Pablo Enriquez Ortiz</dc:creator>
      <pubDate>Sun, 31 May 2026 02:16:17 +0000</pubDate>
      <link>https://dev.to/jpablortiz96/hermes-repo-dojo-most-agents-answer-hermes-learns-then-it-safely-contributes-1kda</link>
      <guid>https://dev.to/jpablortiz96/hermes-repo-dojo-most-agents-answer-hermes-learns-then-it-safely-contributes-1kda</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Build With Hermes Agent&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Hermes Repo Dojo: Most Agents Answer. Hermes Learns. Then It Safely Contributes.
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;Turn any GitHub repo into a living onboarding academy.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Most AI coding agents answer questions about a repository.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hermes Repo Dojo does something different.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It lets Hermes Agent learn a GitHub repository, transform that understanding into reusable repo-specific skills, improve on a second pass, and safely create a first contribution inside a sandbox clone.&lt;/p&gt;

&lt;p&gt;This is not a repo chatbot.&lt;/p&gt;

&lt;p&gt;This is an agentic onboarding system.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Hermes Repo Dojo&lt;/strong&gt; is a local AI developer tool that turns any public GitHub repository into a guided onboarding academy for new contributors.&lt;/p&gt;

&lt;p&gt;A user pastes a GitHub repository URL, and Hermes Repo Dojo generates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🧬 &lt;strong&gt;Repo DNA&lt;/strong&gt; — purpose, stack, language, commands, and entrypoints&lt;/li&gt;
&lt;li&gt;🏗️ &lt;strong&gt;Architecture Map&lt;/strong&gt; — logical areas of the codebase&lt;/li&gt;
&lt;li&gt;🧭 &lt;strong&gt;Setup Timeline&lt;/strong&gt; — what to do first, second, and next&lt;/li&gt;
&lt;li&gt;🥋 &lt;strong&gt;Dojo Map&lt;/strong&gt; — a contributor learning path from Explorer to Contributor&lt;/li&gt;
&lt;li&gt;⚔️ &lt;strong&gt;Boss Fight&lt;/strong&gt; — a scoped first contribution with acceptance criteria&lt;/li&gt;
&lt;li&gt;🧠 &lt;strong&gt;Skill Forge&lt;/strong&gt; — reusable repo-specific Hermes skills&lt;/li&gt;
&lt;li&gt;🔁 &lt;strong&gt;Second Pass&lt;/strong&gt; — before/after improvement using generated skills&lt;/li&gt;
&lt;li&gt;🛡️ &lt;strong&gt;Patch Arena&lt;/strong&gt; — a safe sandbox where Hermes creates a first contribution&lt;/li&gt;
&lt;li&gt;🎬 &lt;strong&gt;Hermes Brain Replay&lt;/strong&gt; — a visual replay of the agent learning trace&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core idea:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A repository should not only store code. It should teach. It should remember. It should safely guide contribution.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Open-source onboarding is broken.&lt;/p&gt;

&lt;p&gt;Every new contributor usually has to repeat the same painful discovery process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read the README&lt;/li&gt;
&lt;li&gt;Guess the architecture&lt;/li&gt;
&lt;li&gt;Find the right entrypoints&lt;/li&gt;
&lt;li&gt;Figure out install and test commands&lt;/li&gt;
&lt;li&gt;Understand conventions&lt;/li&gt;
&lt;li&gt;Decide what a safe first contribution looks like&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Maintainers and senior contributors carry hidden knowledge in their heads.&lt;/p&gt;

&lt;p&gt;New contributors do not.&lt;/p&gt;

&lt;p&gt;AI agents can answer questions about a repo, but most of that knowledge disappears after the answer.&lt;/p&gt;

&lt;p&gt;Hermes Repo Dojo turns that one-time exploration into reusable operational knowledge.&lt;/p&gt;




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;🎥 Demo video:&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/6fxXv6-1Z6g"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;In the demo, I use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://github.com/karpathy/micrograd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hermes Repo Dojo analyzes &lt;code&gt;micrograd&lt;/code&gt;, a tiny educational autograd engine, and produces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stack: Python, Autograd, Neural Networks, Pytest&lt;/li&gt;
&lt;li&gt;Architecture:

&lt;ul&gt;
&lt;li&gt;Core Autograd Engine&lt;/li&gt;
&lt;li&gt;Neural Network Library&lt;/li&gt;
&lt;li&gt;Tests and Demos&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Skills:

&lt;ul&gt;
&lt;li&gt;MicrogradAutogradTrace&lt;/li&gt;
&lt;li&gt;MicrogradMLPTraining&lt;/li&gt;
&lt;li&gt;MicrogradGraphViz&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Patch Arena contribution:

&lt;ul&gt;
&lt;li&gt;Adds a &lt;code&gt;sigmoid&lt;/code&gt; activation to the &lt;code&gt;Value&lt;/code&gt; class&lt;/li&gt;
&lt;li&gt;Creates a smoke test&lt;/li&gt;
&lt;li&gt;Shows a diff&lt;/li&gt;
&lt;li&gt;Verifies the patch&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The final demo flow is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GitHub URL
   ↓
Hermes Repo Analysis
   ↓
Skill Forge
   ↓
Second Pass Improvement
   ↓
Patch Arena
   ↓
Hermes Brain Replay
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;GitHub repository:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/jpablortiz96/hermes-repo-dojo" rel="noopener noreferrer"&gt;https://github.com/jpablortiz96/hermes-repo-dojo&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  My Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Hermes Agent CLI&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hermes terminal toolset&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hermes skills toolset&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Next.js&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TypeScript&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tailwind CSS&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Node.js API routes&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Local JSON persistence&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sandboxed Git workspaces&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The app runs locally and invokes Hermes through the CLI from the backend.&lt;/p&gt;




&lt;h2&gt;
  
  
  How I Used Hermes Agent
&lt;/h2&gt;

&lt;p&gt;Hermes Agent is not decorative in this project.&lt;/p&gt;

&lt;p&gt;It is the operating intelligence behind the workflow.&lt;/p&gt;

&lt;p&gt;Hermes Repo Dojo invokes Hermes through commands like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes chat &lt;span class="nt"&gt;--toolsets&lt;/span&gt; &lt;span class="s2"&gt;"terminal,skills"&lt;/span&gt; &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="s2"&gt;"..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
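
&lt;p&gt;A rough sketch of how a backend might wrap that call, written in Python for brevity (the app's own backend uses Node.js API routes). Only the &lt;code&gt;chat&lt;/code&gt; subcommand and the &lt;code&gt;--toolsets&lt;/code&gt; and &lt;code&gt;-q&lt;/code&gt; flags come from the command above; the function names, the working-directory handling, and the 120-second timeout are assumptions for illustration.&lt;/p&gt;

```python
import subprocess


def build_hermes_command(
    prompt: str, toolsets: tuple[str, ...] = ("terminal", "skills")
) -> list[str]:
    """Build the argv list for the `hermes chat` invocation shown above."""
    return ["hermes", "chat", "--toolsets", ",".join(toolsets), "-q", prompt]


def run_hermes(prompt: str, cwd: str, timeout_s: float = 120.0) -> str:
    """Run Hermes inside a cloned repo directory and return its stdout.

    Raises subprocess.TimeoutExpired or subprocess.CalledProcessError on failure,
    so the caller can decide whether to retry or fall back.
    """
    completed = subprocess.run(
        build_hermes_command(prompt),
        cwd=cwd,
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return completed.stdout
```

&lt;p&gt;Passing the arguments as a list, rather than a single shell string, keeps the prompt from being interpreted by a shell.&lt;/p&gt;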



&lt;p&gt;Hermes is used to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inspect cloned repositories&lt;/li&gt;
&lt;li&gt;Reason about architecture and repository structure&lt;/li&gt;
&lt;li&gt;Generate structured onboarding analysis&lt;/li&gt;
&lt;li&gt;Create reusable repo-specific skills&lt;/li&gt;
&lt;li&gt;Improve the onboarding flow on a second pass&lt;/li&gt;
&lt;li&gt;Assist with safe contribution planning&lt;/li&gt;
&lt;li&gt;Power the agent learning narrative behind the product&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key Hermes capabilities I leaned on were:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hermes Capability&lt;/th&gt;
&lt;th&gt;How Repo Dojo Uses It&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Terminal tool use&lt;/td&gt;
&lt;td&gt;Inspecting local cloned repositories&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Skills&lt;/td&gt;
&lt;td&gt;Creating reusable repo-specific workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-step reasoning&lt;/td&gt;
&lt;td&gt;Analysis → skills → second pass → patch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CLI-first architecture&lt;/td&gt;
&lt;td&gt;Running Hermes locally from the backend&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agentic workflow&lt;/td&gt;
&lt;td&gt;Turning codebase exploration into a guided product experience&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Big Idea
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What if every repository could teach itself?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A repository is usually static.&lt;/p&gt;

&lt;p&gt;It stores code, but it does not preserve the contributor journey behind the code.&lt;/p&gt;

&lt;p&gt;Hermes Repo Dojo makes the repository operational.&lt;/p&gt;

&lt;p&gt;It lets Hermes Agent inspect a codebase, extract hidden onboarding knowledge, convert that knowledge into reusable skills, and use those skills to guide better future interactions.&lt;/p&gt;

&lt;p&gt;The result is a repository that does not just contain code.&lt;/p&gt;

&lt;p&gt;It becomes a living onboarding workflow.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core Features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🧬 Repo DNA
&lt;/h3&gt;

&lt;p&gt;Hermes extracts a repository profile:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Summary&lt;/li&gt;
&lt;li&gt;Main language&lt;/li&gt;
&lt;li&gt;Stack&lt;/li&gt;
&lt;li&gt;Entrypoints&lt;/li&gt;
&lt;li&gt;Commands&lt;/li&gt;
&lt;li&gt;Important files&lt;/li&gt;
&lt;li&gt;Architecture areas&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For &lt;code&gt;micrograd&lt;/code&gt;, Hermes identifies Python, Autograd, Neural Networks, Pytest, and the key files a contributor should inspect first.&lt;/p&gt;
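
&lt;p&gt;Because the profile is structured, it can be checked before the dashboard renders it. The sketch below assumes the agent returns the profile as a JSON object; the key names in &lt;code&gt;REQUIRED_FIELDS&lt;/code&gt; are hypothetical stand-ins for the fields listed above, not the project's actual schema.&lt;/p&gt;

```python
import json

# Hypothetical key names for the profile fields listed above.
REQUIRED_FIELDS = {
    "summary": str,
    "main_language": str,
    "stack": list,
    "entrypoints": list,
    "commands": list,
    "important_files": list,
    "architecture_areas": list,
}


def validate_repo_dna(raw: str) -> dict:
    """Parse agent output as JSON and check each profile field has the expected type."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("Repo DNA must be a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected):
            raise ValueError(f"field {name!r} is missing or not a {expected.__name__}")
    return data
```

&lt;p&gt;Rejecting malformed output at this boundary means every later step can rely on the same shape.&lt;/p&gt;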




&lt;h3&gt;
  
  
  🏗️ Architecture Map
&lt;/h3&gt;

&lt;p&gt;Instead of showing raw folders, Hermes transforms the structure into logical areas.&lt;/p&gt;

&lt;p&gt;For &lt;code&gt;micrograd&lt;/code&gt;, it maps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Core Autograd Engine&lt;/strong&gt; — &lt;code&gt;micrograd/engine.py&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Neural Network Library&lt;/strong&gt; — &lt;code&gt;micrograd/nn.py&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tests and Demos&lt;/strong&gt; — &lt;code&gt;demo.ipynb&lt;/code&gt;, &lt;code&gt;trace_graph.ipynb&lt;/code&gt;, &lt;code&gt;test/test_engine.py&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🧠 Skill Forge
&lt;/h3&gt;

&lt;p&gt;This is where the project becomes more than analysis.&lt;/p&gt;

&lt;p&gt;Hermes generates reusable repo-specific skills.&lt;/p&gt;

&lt;p&gt;For the demo, it creates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MicrogradAutogradTrace
MicrogradMLPTraining
MicrogradGraphViz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The point is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Repository understanding should not disappear after one answer. It should become reusable operational knowledge.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  🔁 Second Pass
&lt;/h3&gt;

&lt;p&gt;After generating skills, Hermes Repo Dojo runs a second pass.&lt;/p&gt;

&lt;p&gt;Before skills:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A contributor manually reads README files&lt;/li&gt;
&lt;li&gt;Inspects files without a clear path&lt;/li&gt;
&lt;li&gt;Guesses where to start&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After skills:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hermes knows the important files&lt;/li&gt;
&lt;li&gt;Hermes has repo-specific procedures&lt;/li&gt;
&lt;li&gt;Hermes can guide a scoped first contribution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most agents answer once.&lt;/p&gt;

&lt;p&gt;Hermes improves on the second pass.&lt;/p&gt;




&lt;h3&gt;
  
  
  🛡️ Patch Arena
&lt;/h3&gt;

&lt;p&gt;Patch Arena is where Hermes can safely contribute.&lt;/p&gt;

&lt;p&gt;This is not reckless auto-coding.&lt;/p&gt;

&lt;p&gt;Patch Arena creates a sandbox clone:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;workspace/patch-arena/{repo}-{timestamp}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
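
&lt;p&gt;A small sketch of that path convention, plus a containment check that could be run before any write. The helper names &lt;code&gt;sandbox_dir&lt;/code&gt; and &lt;code&gt;is_inside&lt;/code&gt; are hypothetical, and using a Unix timestamp for &lt;code&gt;{timestamp}&lt;/code&gt; is an assumption; it needs Python 3.9 or later.&lt;/p&gt;

```python
import time
from pathlib import Path
from typing import Optional

SANDBOX_ROOT = Path("workspace/patch-arena")


def sandbox_dir(repo: str, timestamp: Optional[int] = None) -> Path:
    """Return workspace/patch-arena/{repo}-{timestamp} for a new sandbox clone."""
    stamp = int(time.time()) if timestamp is None else timestamp
    return SANDBOX_ROOT / f"{repo}-{stamp}"


def is_inside(sandbox: Path, target: Path) -> bool:
    """True only if target resolves to a location inside the sandbox directory."""
    # resolve() collapses ".." segments, so path traversal is caught here.
    return target.resolve().is_relative_to(sandbox.resolve())


sandbox = sandbox_dir("micrograd", timestamp=1700000000)
print(sandbox.as_posix())
print(is_inside(sandbox, sandbox / "micrograd" / "engine.py"))
print(is_inside(sandbox, sandbox / ".." / ".." / "secrets.txt"))
```

&lt;p&gt;Checking the resolved path, rather than the string prefix, is what stops a relative path from escaping the sandbox.&lt;/p&gt;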



&lt;p&gt;Then Hermes can create or assist with a contribution &lt;strong&gt;inside that sandbox only&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Safety rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The original repo is never modified&lt;/li&gt;
&lt;li&gt;No &lt;code&gt;git push&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;No &lt;code&gt;sudo&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;No &lt;code&gt;apt-get&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;No destructive commands&lt;/li&gt;
&lt;li&gt;Diff is shown before the patch is presented&lt;/li&gt;
&lt;li&gt;Verification is run before showing the result&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For &lt;code&gt;micrograd&lt;/code&gt;, Patch Arena creates a safe first contribution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt;&lt;span class="gi"&gt;+ import math
+
&lt;/span&gt;  class Value:
      """ stores a single scalar value and its gradient """
&lt;span class="err"&gt;
&lt;/span&gt;&lt;span class="gi"&gt;+     def sigmoid(self):
+         s = 1 / (1 + math.exp(-self.data))
+         out = Value(s, (self,), 'sigmoid')
+
+         def _backward():
+             self.grad += (s * (1 - s)) * out.grad
+         out._backward = _backward
+         return out
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It also adds a smoke test and verifies the result.&lt;/p&gt;
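
&lt;p&gt;A smoke test for this patch could look roughly like the sketch below. It is not the test Patch Arena generated: to stay self-contained it uses a stripped-down stand-in for micrograd's &lt;code&gt;Value&lt;/code&gt; class that carries only the &lt;code&gt;sigmoid&lt;/code&gt; method from the diff, and the test name is hypothetical.&lt;/p&gt;

```python
import math


class Value:
    """Stripped-down stand-in for micrograd's Value, just enough to exercise sigmoid."""

    def __init__(self, data, _children=(), _op=""):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)
        self._op = _op

    def sigmoid(self):
        # Same body as the method added in the diff above.
        s = 1 / (1 + math.exp(-self.data))
        out = Value(s, (self,), "sigmoid")

        def _backward():
            self.grad += (s * (1 - s)) * out.grad

        out._backward = _backward
        return out


def test_sigmoid_smoke():
    x = Value(0.0)
    y = x.sigmoid()
    # Forward pass: sigmoid(0) is 0.5.
    assert math.isclose(y.data, 0.5)
    # Backward pass: seed the output gradient, then apply the local rule s * (1 - s).
    y.grad = 1.0
    y._backward()
    assert math.isclose(x.grad, 0.25)


test_sigmoid_smoke()
```

&lt;p&gt;Checking the input 0 keeps the expected values simple: the output is 0.5 and the derivative is 0.25.&lt;/p&gt;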

&lt;p&gt;This is the moment where the product becomes more than an onboarding dashboard.&lt;/p&gt;

&lt;p&gt;Hermes does not just explain the repo.&lt;/p&gt;

&lt;p&gt;It creates a safe first contribution.&lt;/p&gt;




&lt;h3&gt;
  
  
  🎬 Hermes Brain Replay
&lt;/h3&gt;

&lt;p&gt;Most agent systems hide the process in logs.&lt;/p&gt;

&lt;p&gt;Hermes Brain Replay turns the agent journey into a visible product experience.&lt;/p&gt;

&lt;p&gt;It shows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Repo Ingested
DNA Extracted
Architecture Mapped
Skills Forged
Second Pass Improved
Patch Created
Verification Passed
Learning Trace Completed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is designed to make the agent’s learning process understandable, visual, and demo-friendly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart TD
    A[GitHub Repo URL] --&amp;gt; B[Next.js UI]
    B --&amp;gt; C[Analyze API]
    C --&amp;gt; D[Clone Repository]
    D --&amp;gt; E[Hermes Agent CLI]
    E --&amp;gt; F[Repo DNA JSON]
    F --&amp;gt; G[Dashboard]
    F --&amp;gt; H[Skill Forge]
    H --&amp;gt; I[Generated SKILL.md Files]
    I --&amp;gt; J[Hermes Skills Directory]
    F --&amp;gt; K[Second Pass]
    K --&amp;gt; L[Before vs After]
    F --&amp;gt; M[Patch Arena]
    M --&amp;gt; N[Sandbox Clone]
    N --&amp;gt; O[Safe Patch Generation]
    O --&amp;gt; P[Diff Preview + Verification]
    F --&amp;gt; Q[Hermes Brain Replay]
    H --&amp;gt; Q
    K --&amp;gt; Q
    M --&amp;gt; Q
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Safety Model
&lt;/h2&gt;

&lt;p&gt;Hermes Repo Dojo separates learning from contribution.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scout Mode
&lt;/h3&gt;

&lt;p&gt;Scout Mode is read-oriented.&lt;/p&gt;

&lt;p&gt;It inspects the repository, extracts facts, generates onboarding analysis, and forges skills.&lt;/p&gt;

&lt;h3&gt;
  
  
  Patch Arena
&lt;/h3&gt;

&lt;p&gt;Patch Arena is write-enabled, but only inside a sandbox clone.&lt;/p&gt;

&lt;p&gt;The original repository stays untouched.&lt;/p&gt;

&lt;h3&gt;
  
  
  Demo-safe fallback
&lt;/h3&gt;

&lt;p&gt;Agent providers can hit quota limits, time out, or return unsafe output.&lt;/p&gt;

&lt;p&gt;To keep the product reliable, Hermes Repo Dojo can fall back to locally extracted repository facts and sandbox-safe patch generation.&lt;/p&gt;

&lt;p&gt;That fallback does not replace Hermes Agent. It protects the user experience and environment when provider limits interrupt the live flow.&lt;/p&gt;
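
&lt;p&gt;The shape of such a fallback can be sketched in a few lines of Python. The function &lt;code&gt;analyze_with_fallback&lt;/code&gt;, the two callables it takes, and the &lt;code&gt;source&lt;/code&gt; and &lt;code&gt;reason&lt;/code&gt; keys are all hypothetical; which exceptions the real app catches is also an assumption.&lt;/p&gt;

```python
from typing import Callable


def analyze_with_fallback(
    run_agent: Callable[[], dict],
    local_facts: Callable[[], dict],
) -> dict:
    """Try the live agent first; on failure, return locally extracted facts."""
    try:
        result = run_agent()
        result["source"] = "agent"
        return result
    except (TimeoutError, RuntimeError) as exc:
        # Record why the fallback fired so the UI can be honest about it.
        fallback = local_facts()
        fallback["source"] = "fallback"
        fallback["reason"] = str(exc)
        return fallback


def failing_agent() -> dict:
    raise TimeoutError("provider timed out")


print(analyze_with_fallback(failing_agent, lambda: {"main_language": "Python"}))
```

&lt;p&gt;Tagging the result with its source lets the interface show whether the user is looking at live agent output or the local fallback.&lt;/p&gt;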




&lt;h2&gt;
  
  
  What Makes This Different
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Typical AI Repo Tool&lt;/th&gt;
&lt;th&gt;Hermes Repo Dojo&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Answers one-off questions&lt;/td&gt;
&lt;td&gt;Creates reusable repo-specific skills&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Explains files&lt;/td&gt;
&lt;td&gt;Builds a contributor onboarding path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Suggests changes&lt;/td&gt;
&lt;td&gt;Generates sandboxed patches with diff and verification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hides agent process in logs&lt;/td&gt;
&lt;td&gt;Visualizes the full learning trace&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Static analysis&lt;/td&gt;
&lt;td&gt;Skill generation and second-pass improvement&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This is not a chatbot for repositories.&lt;/p&gt;

&lt;p&gt;It is an agentic onboarding system.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Building this project taught me that agentic products need more than model output.&lt;/p&gt;

&lt;p&gt;They need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clear workflows&lt;/li&gt;
&lt;li&gt;Safety boundaries&lt;/li&gt;
&lt;li&gt;Visible reasoning artifacts&lt;/li&gt;
&lt;li&gt;Reusable knowledge&lt;/li&gt;
&lt;li&gt;A way to show improvement over time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hermes Agent was a strong fit because the project is not just about generating text.&lt;/p&gt;

&lt;p&gt;It is about tool use, skill creation, repository understanding, and multi-step workflows.&lt;/p&gt;




&lt;h2&gt;
  
  
  Future Work
&lt;/h2&gt;

&lt;p&gt;The next versions could include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub pull request creation&lt;/li&gt;
&lt;li&gt;Maintainer-approved contribution templates&lt;/li&gt;
&lt;li&gt;Multi-repo organization onboarding&lt;/li&gt;
&lt;li&gt;Team skill libraries&lt;/li&gt;
&lt;li&gt;VS Code extension&lt;/li&gt;
&lt;li&gt;CI integration&lt;/li&gt;
&lt;li&gt;Repo health scoring&lt;/li&gt;
&lt;li&gt;Contributor progress tracking&lt;/li&gt;
&lt;li&gt;Enterprise internal repo academy mode&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Pitch
&lt;/h2&gt;

&lt;p&gt;Most agents answer.&lt;/p&gt;

&lt;p&gt;Hermes learns.&lt;/p&gt;

&lt;p&gt;Then it safely contributes.&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>ai</category>
    </item>
    <item>
      <title>5 production patterns for running Gemma 4 in the browser — what the docs don't tell you</title>
      <dc:creator>Juan Pablo Enriquez Ortiz</dc:creator>
      <pubDate>Sat, 23 May 2026 04:25:37 +0000</pubDate>
      <link>https://dev.to/jpablortiz96/5-production-patterns-for-running-gemma-4-in-the-browser-what-the-docs-dont-tell-you-2ai1</link>
      <guid>https://dev.to/jpablortiz96/5-production-patterns-for-running-gemma-4-in-the-browser-what-the-docs-dont-tell-you-2ai1</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Write About Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I spent 11 days shipping &lt;strong&gt;AULA&lt;/strong&gt; — an AI tutor that runs Gemma 4 entirely inside the browser for Latin American students without reliable internet. The build forced me to learn things about deploying Gemma 4 in production that the official documentation glosses over.&lt;/p&gt;

&lt;p&gt;This post distills the 5 patterns I wish someone had handed me on day one. Every one of them cost me hours (or in one case, an entire afternoon) to figure out. If you're shipping Gemma 4 to real users on real hardware, this is the playbook I would have wanted.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you want to see the result first, AULA's repo is here: &lt;a href="https://github.com/jpablortiz96/aula" rel="noopener noreferrer"&gt;github.com/jpablortiz96/aula&lt;/a&gt;. The Build with Gemma 4 submission has the full demo. This post is the technical postmortem.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Pattern 1 — MediaPipe is the right runtime, not transformers.js (yet)
&lt;/h2&gt;

&lt;p&gt;If you Google "run Gemma 4 in the browser", you'll mostly find tutorials using Transformers.js (the &lt;code&gt;@huggingface/transformers&lt;/code&gt; package). It's a fantastic library and the obvious starting point. I started there too.&lt;/p&gt;

&lt;p&gt;On my development laptop — a Windows machine with an NVIDIA RTX 3050 (Ampere, 6 GB VRAM) — &lt;code&gt;transformers.js&lt;/code&gt; with WebGPU gave me &lt;strong&gt;2 tokens per second&lt;/strong&gt;. The benchmarks I'd seen online claimed 20-30 tok/s on similar hardware. Something was very wrong.&lt;/p&gt;

&lt;p&gt;After a full afternoon of debugging (chrome://gpu, Task Manager GPU monitor, NVIDIA Control Panel, Vulkan flags, switching to Edge), I found the root cause: on NVIDIA Optimus laptops, &lt;strong&gt;dispatch was being routed through the integrated Intel UHD GPU&lt;/strong&gt;, not the discrete NVIDIA GPU. WebGPU's &lt;code&gt;requestAdapter({ powerPreference: 'high-performance' })&lt;/code&gt; hint is ignored on Windows (&lt;a href="https://bugs.chromium.org/p/chromium/issues/detail?id=369219127" rel="noopener noreferrer"&gt;Chromium bug 369219127&lt;/a&gt;). The model "worked" but ran on the wrong silicon.&lt;/p&gt;

&lt;p&gt;What fixed it: &lt;strong&gt;migrating to &lt;a href="https://www.npmjs.com/package/@mediapipe/tasks-genai" rel="noopener noreferrer"&gt;@mediapipe/tasks-genai&lt;/a&gt; with the WebGPU delegate.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;FilesetResolver&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;LlmInference&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@mediapipe/tasks-genai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fileset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;FilesetResolver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forGenAiTasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;LlmInference&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createFromOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fileset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;baseOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;modelAssetPath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://huggingface.co/litert-community/gemma-4-e2b-it/resolve/main/gemma-4-e2b-it-int4-web.task&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;maxTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2048&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;topK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same hardware. &lt;strong&gt;Same model. Jumped from 2 tok/s to 14-16 tok/s. A 7-8x speedup, just from switching runtimes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MediaPipe is Google's official runtime for Gemma on edge devices. The team optimized the dispatch path specifically for the WebGPU delegate. It's also the only path that supports the &lt;code&gt;.task&lt;/code&gt; artifact format Google publishes for browser deployment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; if you're targeting consumer hardware in 2026, MediaPipe is the production runtime. &lt;code&gt;transformers.js&lt;/code&gt; is excellent for prototyping but has not yet caught up on dispatch quality across all GPU/OS combinations. Use MediaPipe for the local engine; revisit &lt;code&gt;transformers.js&lt;/code&gt; in 6-12 months.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pattern 2 — Pick the right Gemma 4 variant for the constraint, not the benchmark
&lt;/h2&gt;

&lt;p&gt;Gemma 4 comes in four variants, and the marketing pages emphasize the 31B Dense and 26B MoE as the headline models. For a browser deployment, &lt;strong&gt;the only variant that actually matters is the E2B&lt;/strong&gt; (~2 billion effective parameters, quantized to ~1.5 GB).&lt;/p&gt;

&lt;p&gt;Here's the honest tradeoff matrix I built when picking the model for AULA's local engine:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variant&lt;/th&gt;
&lt;th&gt;Size on disk&lt;/th&gt;
&lt;th&gt;Runs in browser?&lt;/th&gt;
&lt;th&gt;Math/reasoning quality&lt;/th&gt;
&lt;th&gt;When to use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 E2B-IT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~1.5 GB (q4f16)&lt;/td&gt;
&lt;td&gt;✅ Yes, WebGPU&lt;/td&gt;
&lt;td&gt;Good for conversational tutoring&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Local browser deployments&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 E4B-IT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~3 GB (q4f16)&lt;/td&gt;
&lt;td&gt;⚠️ Only on 8 GB+ VRAM GPUs&lt;/td&gt;
&lt;td&gt;Slightly better than E2B&lt;/td&gt;
&lt;td&gt;Mid-range GPUs only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 26B-A4B-IT (MoE)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~13 GB&lt;/td&gt;
&lt;td&gt;❌ Cloud only&lt;/td&gt;
&lt;td&gt;Near-31B quality, lower latency&lt;/td&gt;
&lt;td&gt;Cloud API for structured output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 31B-IT (Dense)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~16 GB&lt;/td&gt;
&lt;td&gt;❌ Cloud only&lt;/td&gt;
&lt;td&gt;Best reasoning&lt;/td&gt;
&lt;td&gt;When latency doesn't matter&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For AULA's offline-first use case, the picking logic was straightforward: &lt;strong&gt;the model has to fit in memory on a Raspberry Pi 5 (8 GB, shared between CPU and GPU)&lt;/strong&gt;. E4B is too big the moment you account for KV cache + browser overhead. E2B fits with margin.&lt;/p&gt;

&lt;p&gt;The non-obvious learning: &lt;strong&gt;on my RTX 3050 (6 GB VRAM), I tried to ship with E4B because it scored better on benchmarks&lt;/strong&gt;. The model loaded but spilled into shared system memory via PCIe, dropping inference to ~1.8 tok/s. Switching to E2B (which actually fits in dedicated VRAM) jumped me back to 14+ tok/s.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Rule of thumb:&lt;/strong&gt; for in-browser inference, the right model is the largest one that fits entirely in dedicated VRAM after counting ~1.5 GB of browser/system overhead. Anything larger spills over PCIe into shared system memory and becomes too slow to use.&lt;/p&gt;
&lt;/blockquote&gt;
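&lt;p&gt;The rule of thumb reduces to a small budget calculation. A minimal sketch, assuming you have measured each variant's peak GPU footprint yourself (weights plus KV cache plus activations, which is larger than the on-disk size). &lt;code&gt;pickLargestFitting&lt;/code&gt;, &lt;code&gt;Variant&lt;/code&gt;, and &lt;code&gt;peakGb&lt;/code&gt; are hypothetical names; only the 1.5 GB default comes from the rule.&lt;/p&gt;

```typescript
// peakGb is the measured peak GPU footprint of a variant, supplied by the caller.
type Variant = { name: string; peakGb: number };

// Return the largest variant that fits in dedicated VRAM after reserving
// overheadGb for the browser and system, or undefined if none fits.
function pickLargestFitting(
  variants: Variant[],
  vramGb: number,
  overheadGb = 1.5,
): Variant | undefined {
  const budgetGb = vramGb - overheadGb;
  return variants
    .filter((v) => budgetGb >= v.peakGb)     // keep only variants inside the budget
    .sort((a, b) => b.peakGb - a.peakGb)[0]; // largest first; [0] is undefined when empty
}
```

&lt;p&gt;With 6 GB of VRAM and the default reserve, the budget is 4.5 GB, so a variant whose measured peak exceeds that is rejected even when its weights alone would fit.&lt;/p&gt;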

&lt;p&gt;For Cloud Boost (the optional half of AULA), I picked &lt;strong&gt;26B-A4B over 31B Dense&lt;/strong&gt; despite the lower parameter count. The mixture-of-experts architecture activates only ~4B parameters per forward pass, giving &lt;strong&gt;2-3x lower latency&lt;/strong&gt; at near-31B quality. For short structured outputs (a quiz JSON, a Mermaid diagram), this latency difference is the difference between "feels instant" and "user gives up".&lt;/p&gt;




&lt;h2&gt;
  
  
  Pattern 3 — Don't force small models into rigid structured output
&lt;/h2&gt;

&lt;p&gt;This is the pattern I had to relearn three times before accepting it.&lt;/p&gt;

&lt;p&gt;Gemma 4 E2B is &lt;em&gt;brilliant&lt;/em&gt; at conversational tasks: math explanations, language tutoring, Socratic dialogue, multi-step reasoning in plain text. It is &lt;strong&gt;not reliable&lt;/strong&gt; at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Producing valid JSON without surrounding prose&lt;/li&gt;
&lt;li&gt;Generating syntactically-valid Mermaid diagrams&lt;/li&gt;
&lt;li&gt;Outputting coherent SVG with proper geometry&lt;/li&gt;
&lt;li&gt;Following "respond ONLY with X" instructions consistently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a bug. It's a known property of small open models. The "instruction following" capability scales roughly with parameter count, and at 2B effective parameters, E2B sits at the edge.&lt;/p&gt;

&lt;p&gt;My first three approaches all failed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Stricter prompts.&lt;/strong&gt; "Respond ONLY with valid JSON, no markdown, no prose." This worked 70% of the time. The other 30% of the time, the model added an explanation paragraph or a &lt;code&gt;Here is the JSON:&lt;/code&gt; prefix.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Higher temperature for diversity, lower for structure.&lt;/strong&gt; Marginal improvement, but introduced its own failure modes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tolerant JSON parser that strips fences and reaches for the first &lt;code&gt;{&lt;/code&gt;.&lt;/strong&gt; Helped, but didn't fix the cases where the model produced &lt;em&gt;almost-valid&lt;/em&gt; JSON with unescaped quotes inside string values.&lt;/li&gt;
&lt;/ol&gt;
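&lt;p&gt;The tolerant parser from the third approach can be sketched roughly as follows. This is an illustration rather than AULA's actual code: &lt;code&gt;extractJsonObject&lt;/code&gt; is a hypothetical name, and it only handles a single top-level object.&lt;/p&gt;

```typescript
// Best-effort extraction: slice from the first "{" to the last "}", which drops
// any leading prose and any code fences around the payload, then try to parse.
// Returns the parsed value, or null when nothing parseable is found.
function extractJsonObject(raw: string): unknown {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || start > end) return null;
  try {
    return JSON.parse(raw.slice(start, end + 1));
  } catch {
    // Almost-valid JSON (for example unescaped quotes inside a string) ends up here.
    return null;
  }
}
```

&lt;p&gt;The &lt;code&gt;catch&lt;/code&gt; branch is the limitation described in the list: slicing removes the wrapper text, but it cannot repair a payload that is itself malformed.&lt;/p&gt;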

&lt;p&gt;What actually worked: &lt;strong&gt;route structured-output features to a larger model in the cloud&lt;/strong&gt; (26B-A4B), keep the local model for conversational features, and &lt;strong&gt;be transparent about the routing in the UI&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In AULA, every screen shows a badge: green for local, blue for cloud. The user always knows which engine answered. This is the design pattern I'd argue for as a general principle:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Don't pretend your small model can do something it can't. Make the limitation a UX surface, not a hidden failure mode.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Here's the shape of the routing logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Routing decision per feature, not per request&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;chooseEngine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Feature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hasApiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;EngineId&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;structuredOutputFeatures&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Feature&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;infinite-practice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;// requires JSON&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;svg-illustration&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;// requires valid SVG&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mermaid-mindmap&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// requires strict syntax&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;interactive-quiz&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;// requires JSON array&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;handwriting-ocr&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// requires vision (E2B is text-only)&lt;/span&gt;
  &lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;structuredOutputFeatures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;hasApiKey&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cloud-boost&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unavailable&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;local&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// chat, voice, calculator, Socratic, etc.&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the user always sees the routing decision, with an honest reason if cloud isn't available:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unavailable&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;showInfoMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;This feature needs Cloud Boost. Add your free Google AI Studio API key in Settings to unlock it. The rest of AULA works offline regardless.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Pattern 4 — &lt;code&gt;LlmInference&lt;/code&gt; is exclusive. Build a queue.
&lt;/h2&gt;

&lt;p&gt;This bit me on day 9 and cost me half a day to diagnose.&lt;/p&gt;

&lt;p&gt;MediaPipe's &lt;code&gt;LlmInference&lt;/code&gt; instance is &lt;strong&gt;a singleton with exclusive access&lt;/strong&gt;. It can process exactly one generation at a time. If you call &lt;code&gt;generateResponse()&lt;/code&gt; while a previous generation is still in flight, you get:&lt;br&gt;
&lt;code&gt;Previous invocation or loading is still ongoing.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;In a single-page app with multiple routes (chat, practice, mind maps), this is easy to trigger:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User starts a long response in /chat&lt;/li&gt;
&lt;li&gt;User navigates to /practice before it finishes&lt;/li&gt;
&lt;li&gt;/practice tries to generate an exercise&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The model is locked. Everything breaks.&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The fix: a FIFO queue with abort propagation across navigations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
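&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LocalEngine&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Assumed to be assigned elsewhere, once the model has finished loading&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;LlmInference&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;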
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;currentAbort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AbortController&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;GenerateOptions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Cancel any in-flight generation&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="c1"&gt;// small buffer&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentAbort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AbortController&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentAbort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shift&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;task&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentAbort&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Recovery path when the model gets stuck&lt;/span&gt;
  &lt;span class="nf"&gt;forceReset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentAbort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
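&lt;p&gt;One detail to watch in a queue like this: if pending tasks are dropped by emptying the array, the callers waiting on them hold promises that never settle. Below is a minimal sketch of a variation in which cancelled callers get a rejection instead. It is not AULA's code: &lt;code&gt;SerialEngine&lt;/code&gt;, &lt;code&gt;CancelledError&lt;/code&gt;, and &lt;code&gt;fakeGenerate&lt;/code&gt; are hypothetical names, &lt;code&gt;fakeGenerate&lt;/code&gt; stands in for the real model call, and the task that is already running is left to finish rather than being interrupted.&lt;/p&gt;

```typescript
// Hypothetical stand-in for llm.generateResponse(); swap in the real call.
async function fakeGenerate(prompt: string) {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `echo: ${prompt}`;
}
type GenerateFn = typeof fakeGenerate;

class CancelledError extends Error {}

class SerialEngine {
  private run: GenerateFn;
  private tail = Promise.resolve(); // settles when the last accepted task has finished
  private epoch = 0;                // bumped on every cancelPending()

  constructor(run: GenerateFn) {
    this.run = run;
  }

  // Tasks run strictly one at a time, in call order.
  generate(prompt: string) {
    const enqueuedAt = this.epoch;
    const result = this.tail.then(() => {
      if (enqueuedAt !== this.epoch) {
        throw new CancelledError("cancelled before it started");
      }
      return this.run(prompt);
    });
    // Keep the chain going whether this task resolved or rejected.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }

  // Rejects every task that has not started yet; the running task is left alone.
  cancelPending() {
    this.epoch += 1;
  }
}
```

&lt;p&gt;Chaining on a single &lt;code&gt;tail&lt;/code&gt; promise gives FIFO ordering without an explicit array, and the epoch counter lets a cleanup handler invalidate everything queued so far with one increment.&lt;/p&gt;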



&lt;p&gt;Critical: every component that uses the engine must call &lt;code&gt;abort()&lt;/code&gt; on unmount.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without this cleanup, navigating away mid-generation leaves the model locked, and the next page that wants to generate will silently hang.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pattern 5 — Gemma 4 26B does not stream reliably. Use &lt;code&gt;generateContent&lt;/code&gt;, not &lt;code&gt;streamGenerateContent&lt;/code&gt;.
&lt;/h2&gt;

&lt;p&gt;This one took an afternoon and a careful read of the DevTools Network tab to find.&lt;/p&gt;

&lt;p&gt;The Gemini API exposes two endpoints for Gemma 4 models:&lt;br&gt;
&lt;code&gt;POST .../models/gemma-4-26b-a4b-it:generateContent&lt;/code&gt; ← full response&lt;br&gt;
&lt;code&gt;POST .../models/gemma-4-26b-a4b-it:streamGenerateContent&lt;/code&gt; ← SSE chunks&lt;/p&gt;
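&lt;p&gt;When the streaming method is called with &lt;code&gt;alt=sse&lt;/code&gt;, the body arrives as Server-Sent Events, with one JSON chunk per &lt;code&gt;data:&lt;/code&gt; line. A minimal sketch of pulling the visible text out of such a body, assuming each chunk follows the &lt;code&gt;candidates[0].content.parts&lt;/code&gt; layout and that the input contains whole lines (buffering partial lines from the network is left out). &lt;code&gt;textFromSse&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```typescript
// Only the fields this sketch reads; everything else in a chunk is ignored.
type Part = { text?: string; thought?: boolean };
type Chunk = { candidates?: { content?: { parts?: Part[] } }[] };

// Concatenate the visible text carried by the "data:" lines of an SSE body.
function textFromSse(sseText: string): string {
  let out = "";
  for (const line of sseText.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank separators and other fields
    const payload = line.slice(5).trim();
    if (payload === "") continue;
    const chunk = JSON.parse(payload) as Chunk;
    const parts = chunk.candidates?.[0]?.content?.parts ?? [];
    for (const part of parts) {
      if (part.thought) continue; // drop parts flagged as "thought"
      out += part.text ?? "";
    }
  }
  return out;
}
```

&lt;p&gt;In a real client the same loop would run over each decoded network chunk and call a per-token callback instead of building one string.&lt;/p&gt;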

&lt;p&gt;For chat use cases, you obviously want streaming. So I wired everything through &lt;code&gt;:streamGenerateContent?alt=sse&lt;/code&gt; and assumed it would Just Work.&lt;/p&gt;

&lt;p&gt;It did, for chat. &lt;strong&gt;It returned &lt;code&gt;400 Bad Request&lt;/code&gt; for AULA's Practice and Mind Map features.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The DevTools investigation revealed: when the prompt requested structured output (JSON, SVG, Mermaid), the streaming endpoint failed with &lt;code&gt;400&lt;/code&gt; while the non-streaming endpoint succeeded with the same payload. I never got a clear root cause from the API — it may be a Gemma-specific quirk in how &lt;code&gt;streamGenerateContent&lt;/code&gt; handles certain &lt;code&gt;responseSchema&lt;/code&gt; configurations or thinking-mode trailers.&lt;/p&gt;

&lt;p&gt;The fix that unblocked everything: &lt;strong&gt;two separate API client paths&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// For chat — streaming, long responses&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;streamChat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;onToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;BASE&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/gemma-4-26b-a4b-it:streamGenerateContent?alt=sse&amp;amp;key=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...]&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// ...parse SSE chunks, call onToken per chunk&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// For structured output — single-shot, short responses&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;cloudGenerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CloudOptions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;BASE&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/gemma-4-26b-a4b-it:generateContent?key=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;systemInstruction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;system&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
      &lt;span class="na"&gt;generationConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.85&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;maxOutputTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;})}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="c1"&gt;// Filter out "thought" parts (Gemma 4 thinking mode)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;?.[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;parts&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;thought&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; for short structured outputs, streaming buys you nothing. The user is waiting for one complete artifact, not a slow reveal of text. Use &lt;code&gt;generateContent&lt;/code&gt;. Save streaming for genuine chat.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;One more detail worth flagging: Gemma 4 has a &lt;strong&gt;thinking mode&lt;/strong&gt; that emits "thought" parts in the response. If you naively concatenate all &lt;code&gt;parts[].text&lt;/code&gt;, you'll surface the model's chain-of-thought in the user-visible output. Filter on &lt;code&gt;part.thought === true&lt;/code&gt; and skip those. AULA's chat looked very weird until I added that filter — the model was literally showing its work to the student, which is not the goal.&lt;/p&gt;




&lt;h2&gt;
  
  
  What this means for developers shipping Gemma 4 today
&lt;/h2&gt;

&lt;p&gt;If you're building with Gemma 4 in 2026, the patterns I'd internalize before writing a single line of code:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;MediaPipe for browser, period.&lt;/strong&gt; Don't waste a week on &lt;code&gt;transformers.js&lt;/code&gt; benchmarks. Migrate or start there.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pick the model that fits in VRAM, not the model that benchmarks best.&lt;/strong&gt; Once weights spill out of VRAM into system memory over PCIe, throughput collapses. E2B is the only realistic browser model in 2026.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design routing as a UX surface.&lt;/strong&gt; Small models can't do everything. Make the limitation visible and let the user opt into cloud where it matters. Honesty beats hiding limitations behind retries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Treat &lt;code&gt;LlmInference&lt;/code&gt; as a single-threaded mutex.&lt;/strong&gt; Queue your requests, abort on unmount, expose a recovery path. The cost of not doing this is a frustrating "the AI broke" experience that the user can't diagnose.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming is for chat. &lt;code&gt;generateContent&lt;/code&gt; is for everything else.&lt;/strong&gt; Don't fight the API.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These five patterns probably saved me a week of additional debugging once I internalized them. AULA exists because Gemma 4 is genuinely good enough to run in a browser tab, but it only feels good to use because the patterns above turn the rough edges into smooth UX.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'm hopeful about
&lt;/h2&gt;

&lt;p&gt;The interesting thing about all five patterns above: &lt;strong&gt;none of them are about Gemma 4's quality.&lt;/strong&gt; They're about deployment ergonomics. The model itself is remarkable. A 2-billion-parameter open model that runs at 15 tok/s in a browser tab and can hold a real tutoring conversation with a high school student is a thing that genuinely did not exist 18 months ago.&lt;/p&gt;

&lt;p&gt;For my specific use case — students in rural Latin America who have no other access to AI tools — Gemma 4 is the first model that crosses the &lt;strong&gt;practical viability line&lt;/strong&gt;. It's small enough to download once over a school WiFi connection. It's capable enough to teach. It runs offline. It's free.&lt;/p&gt;

&lt;p&gt;If you're working on local-first AI for any underserved population, I'd encourage you to start with Gemma 4. The deployment patterns above will save you a week. The model will do the rest.&lt;/p&gt;

&lt;p&gt;If you want to see what the patterns look like in a finished product, AULA is open source under MIT: &lt;a href="https://github.com/jpablortiz96/aula" rel="noopener noreferrer"&gt;github.com/jpablortiz96/aula&lt;/a&gt;. Pull requests welcome.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About the author:&lt;/strong&gt; I'm a solo founder in Cali, Colombia building educational tech for Latin American students. AULA was built solo in 11 days for this challenge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Companion submission (Build track):&lt;/strong&gt; &lt;a href="https://dev.to/jpablortiz96/aula-the-ai-tutor-that-fits-in-a-browser-tab-built-for-the-students-the-internet-leaves-behind-253n"&gt;AULA — The AI tutor that fits in a browser tab&lt;/a&gt; — live demo, video walkthrough, full architecture.&lt;/p&gt;

&lt;p&gt;🇨🇴 Made in LATAM, for the students the world forgot.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
      <category>webdev</category>
    </item>
    <item>
      <title>AULA — The AI tutor that fits in a browser tab, built for the students the internet leaves behind</title>
      <dc:creator>Juan Pablo Enriquez Ortiz</dc:creator>
      <pubDate>Sat, 23 May 2026 04:13:16 +0000</pubDate>
      <link>https://dev.to/jpablortiz96/aula-the-ai-tutor-that-fits-in-a-browser-tab-built-for-the-students-the-internet-leaves-behind-253n</link>
      <guid>https://dev.to/jpablortiz96/aula-the-ai-tutor-that-fits-in-a-browser-tab-built-for-the-students-the-internet-leaves-behind-253n</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Build with Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AULA&lt;/strong&gt; is a complete AI tutoring platform that runs Google's Gemma 4 entirely inside the browser — no server, no account, no internet required after the first 1.5 GB download. It is designed for the &lt;strong&gt;65+ million Latin American students&lt;/strong&gt; living in areas where reliable internet is the exception, not the norm.&lt;/p&gt;

&lt;p&gt;The premise is simple: if Gemma 4 can run on a Raspberry Pi 5, it can run on a teacher's laptop in rural Boyacá, Colombia. With WebGPU and MediaPipe, this is now possible — and AULA is what that looks like as a finished product.&lt;/p&gt;

&lt;h3&gt;
  
  
  The problem AULA solves
&lt;/h3&gt;

&lt;p&gt;In Latin America, ~40% of students live with unreliable, capped, or non-existent connectivity. ChatGPT, Gemini, Khan Academy's AI tutor — all require a stable connection. The very tools that could close the global education gap are inaccessible exactly where they are needed most.&lt;/p&gt;

&lt;p&gt;AULA flips this: the AI runs &lt;em&gt;on the student's device&lt;/em&gt;, not on a server thousands of miles away.&lt;/p&gt;

&lt;h3&gt;
  
  
  What AULA does — offline (100% local, Gemma 4 E2B)
&lt;/h3&gt;

&lt;p&gt;After loading once, these features work with WiFi off, in airplane mode, in a rural school with no signal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🎓 &lt;strong&gt;Conversational tutor&lt;/strong&gt; — chat with Gemma 4 in natural language. Full LaTeX rendering for math and science. ~15 tokens/sec on a mid-range laptop GPU.&lt;/li&gt;
&lt;li&gt;🧮 &lt;strong&gt;Scientific calculator&lt;/strong&gt; that teaches — visual keypad with trig functions, exponents, roots. Gemma 4 doesn't just solve. It explains the why.&lt;/li&gt;
&lt;li&gt;🎙️ &lt;strong&gt;Voice tutoring (bidirectional)&lt;/strong&gt; — ask by speaking, listen to the response. Optional hands-free mode chains them together.&lt;/li&gt;
&lt;li&gt;🦉 &lt;strong&gt;Socratic mode&lt;/strong&gt; — Gemma 4 stops giving answers and only asks guiding questions. Pedagogy-first.&lt;/li&gt;
&lt;li&gt;🤔 &lt;strong&gt;"Explain it simpler"&lt;/strong&gt; — three escalating reformulation levels on demand.&lt;/li&gt;
&lt;li&gt;💡 &lt;strong&gt;Conceptual error detection&lt;/strong&gt; — Gemma 4 diagnoses &lt;em&gt;which&lt;/em&gt; concept the student misunderstood, not just "wrong, try again".&lt;/li&gt;
&lt;li&gt;📚 &lt;strong&gt;Persistent study sessions&lt;/strong&gt; in IndexedDB. No cloud sync ever.&lt;/li&gt;
&lt;li&gt;♿ &lt;strong&gt;Accessibility first&lt;/strong&gt; — high contrast, large text, easy reading mode (for dyslexia), auto-read responses.&lt;/li&gt;
&lt;li&gt;🌍 &lt;strong&gt;Spanish ↔ English&lt;/strong&gt; — full i18n. System prompts translate, not just the labels.&lt;/li&gt;
&lt;li&gt;🏆 &lt;strong&gt;Local gamification&lt;/strong&gt; — XP, levels, streak, achievements. All in the browser.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What AULA does — Cloud Boost (optional, Gemma 4 26B-A4B)
&lt;/h3&gt;

&lt;p&gt;For features that require strict structured output (which is beyond what a 2B-parameter model can do reliably), AULA routes through the user's own free Google AI Studio API key:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✍️ &lt;strong&gt;Handwritten whiteboard&lt;/strong&gt; — draw equations with finger or mouse, Gemma 4 reads and solves.&lt;/li&gt;
&lt;li&gt;📷 &lt;strong&gt;Photo OCR + reasoning&lt;/strong&gt; — point camera at a printed exercise, get a step-by-step solution.&lt;/li&gt;
&lt;li&gt;♾️ &lt;strong&gt;Infinite adaptive practice&lt;/strong&gt; — exercises that never repeat, with difficulty calibrated dynamically.&lt;/li&gt;
&lt;li&gt;🎯 &lt;strong&gt;Interactive student quiz&lt;/strong&gt; — self-assessment with scoring and per-error conceptual review.&lt;/li&gt;
&lt;li&gt;👩‍🏫 &lt;strong&gt;Teacher mode with PDF export&lt;/strong&gt; — generate quizzes, export student/teacher PDFs ready to print.&lt;/li&gt;
&lt;li&gt;🎨 &lt;strong&gt;SVG illustrations&lt;/strong&gt; — Gemma 4 generates educational diagrams.&lt;/li&gt;
&lt;li&gt;🗺️ &lt;strong&gt;Mermaid mind maps&lt;/strong&gt; — concept diagrams rendered interactively, downloadable as PNG/SVG.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Critical:&lt;/strong&gt; Cloud Boost is &lt;em&gt;always opt-in&lt;/em&gt;. AULA never sends data without an explicit API key configured by the user. The core educational experience never requires the internet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;🎥 &lt;strong&gt;Watch the 2-minute walkthrough:&lt;/strong&gt; &lt;a href="https://youtu.be/d0jN8Kw_Cz4" rel="noopener noreferrer"&gt;https://youtu.be/d0jN8Kw_Cz4&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Live demo:&lt;/strong&gt; &lt;a href="https://aula.run" rel="noopener noreferrer"&gt;https://aula.run&lt;/a&gt; &lt;em&gt;(or local: &lt;code&gt;pnpm dev -p 3100&lt;/code&gt; after cloning)&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Key screenshots
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Chat tutor running 100% locally with full LaTeX rendering&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3h84wj3p1tefo8or7uxk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3h84wj3p1tefo8or7uxk.png" alt="AULA chat with Gemma 4 local" width="757" height="787"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mermaid mind maps generated by Gemma 4 — click to enlarge, download as PNG&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tif53hcthb7renoxfc3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tif53hcthb7renoxfc3.png" alt="Mind map of photosynthesis" width="800" height="897"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SVG illustrations — educational diagrams generated by Gemma 4&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfujjl4gljda0x0t9mdo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfujjl4gljda0x0t9mdo.png" alt="Pythagoras illustration" width="800" height="899"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scientific calculator that explains, powered locally&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbafafksvwozkk8gpy4r0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbafafksvwozkk8gpy4r0.png" alt="Calculator solving sin(π/2) + 2^3" width="635" height="882"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Teacher mode with PDF export — ready for classroom&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5uxwzal5gj95jc36xkm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5uxwzal5gj95jc36xkm.png" alt="Teacher mode quiz" width="790" height="851"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accessibility built-in: high contrast mode&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fny207nftyvar075b5gje.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fny207nftyvar075b5gje.png" alt="High contrast mode" width="635" height="850"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;🔗 &lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/jpablortiz96/aula" rel="noopener noreferrer"&gt;https://github.com/jpablortiz96/aula&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repo includes a comprehensive README with architecture diagrams, hardware benchmarks across devices (Raspberry Pi 5 to RTX 3050 to MacBook M3), full tech stack documentation, and a roadmap for v1.1 through v3.0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;License:&lt;/strong&gt; MIT&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Used Gemma 4
&lt;/h2&gt;

&lt;p&gt;AULA uses a &lt;strong&gt;dual-engine architecture&lt;/strong&gt; with intentional model selection for each tier:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Size / architecture&lt;/th&gt;
&lt;th&gt;Where it runs&lt;/th&gt;
&lt;th&gt;What it powers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 E2B-IT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~1.5 GB (q4f16 quantized)&lt;/td&gt;
&lt;td&gt;Browser, via MediaPipe + WebGPU&lt;/td&gt;
&lt;td&gt;All offline features&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 26B-A4B-IT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cloud (MoE)&lt;/td&gt;
&lt;td&gt;Gemini API&lt;/td&gt;
&lt;td&gt;Structured-output features&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Why Gemma 4 E2B for local
&lt;/h3&gt;

&lt;p&gt;The E2B variant is the only Gemma 4 model that fits realistically on consumer hardware while preserving the multimodal capability path. It runs at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~15 tokens/sec on an NVIDIA RTX 3050 laptop&lt;/li&gt;
&lt;li&gt;~20-25 tokens/sec on a MacBook M3&lt;/li&gt;
&lt;li&gt;~7 tokens/sec on a Raspberry Pi 5 (CPU fallback)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This range covers &lt;strong&gt;every realistic device a Latin American student or teacher might have access to&lt;/strong&gt;, from an $80 SBC to a school laptop. The 31B Dense model would never fit in a browser tab; the 26B MoE requires server-grade resources. E2B is the &lt;em&gt;only&lt;/em&gt; viable choice for the rural offline use case, and that's exactly why I picked it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Gemma 4 26B-A4B for cloud-enhanced features
&lt;/h3&gt;

&lt;p&gt;Some features in AULA require strict structured output: JSON for quiz exercises, syntactically valid Mermaid for mind maps, coherent SVG for illustrations. &lt;strong&gt;Small models are unreliable for this&lt;/strong&gt;: they're brilliant at conversation but tend to add prose around JSON, produce malformed SVG, or break Mermaid syntax.&lt;/p&gt;

&lt;p&gt;Rather than fight this limitation or hide it, AULA makes the routing &lt;strong&gt;explicit and visible to the user&lt;/strong&gt;. Every screen shows which engine answered: green badge for local, blue badge for cloud. The 26B-A4B variant gives me near-31B quality at substantially lower latency thanks to its mixture-of-experts architecture — ideal for short structured outputs.&lt;/p&gt;
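
&lt;p&gt;A minimal sketch of what explicit routing could look like. The feature names and the &lt;code&gt;routeFeature&lt;/code&gt; helper are hypothetical, chosen for illustration rather than taken from AULA's code. Structured-output features go to the cloud engine only when the user has configured a key; everything else stays local.&lt;/p&gt;

```typescript
// Hypothetical feature names and routing helper; not AULA's actual code.
type Engine = "local" | "cloud";

interface Route {
  engine: Engine;
  badge: "green" | "blue"; // green = local, blue = cloud, as shown in the UI
}

// Features assumed to need strict structured output (JSON, SVG, Mermaid).
const STRUCTURED_FEATURES = new Set(["practice", "illustrator", "mindmap"]);

function routeFeature(feature: string, hasApiKey: boolean): Route | null {
  if (!STRUCTURED_FEATURES.has(feature)) {
    return { engine: "local", badge: "green" };
  }
  // Cloud Boost is opt-in: without a key the feature is unavailable,
  // so the UI can explain why instead of failing silently.
  return hasApiKey ? { engine: "cloud", badge: "blue" } : null;
}
```

&lt;p&gt;Returning &lt;code&gt;null&lt;/code&gt; instead of quietly falling back to the local model keeps the limitation visible, which is the point of treating routing as part of the UX.&lt;/p&gt;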

&lt;h3&gt;
  
  
  Technical challenges I solved
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. transformers.js was not viable on NVIDIA Optimus laptops.&lt;/strong&gt;&lt;br&gt;
My first prototype used &lt;code&gt;transformers.js&lt;/code&gt; + WebGPU. On an RTX 3050, I got 2 tokens/sec because dispatch was routing through the integrated GPU (iGPU) instead of the discrete card. Migrating to &lt;strong&gt;MediaPipe's WebGPU delegate&lt;/strong&gt; unlocked 14-16 tokens/sec on the same hardware, roughly a 7x improvement. MediaPipe is Google's official runtime for Gemma 4 on edge, and the difference is real.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Concurrency on &lt;code&gt;LlmInference&lt;/code&gt; is exclusive.&lt;/strong&gt;&lt;br&gt;
A single MediaPipe &lt;code&gt;LlmInference&lt;/code&gt; instance processes one prompt at a time. When &lt;code&gt;/chat&lt;/code&gt; and &lt;code&gt;/practice&lt;/code&gt; competed for the same singleton, the model locked with &lt;code&gt;Previous invocation or loading is still ongoing&lt;/code&gt;. I implemented a &lt;strong&gt;FIFO queue with abort propagation&lt;/strong&gt; across navigations, plus a &lt;code&gt;forceReset()&lt;/code&gt; recovery path.&lt;/p&gt;
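
&lt;p&gt;The queue idea can be sketched in a few lines. This is an illustration under assumptions, not AULA's implementation: &lt;code&gt;InferenceQueue&lt;/code&gt; and &lt;code&gt;fakeGenerate&lt;/code&gt; are hypothetical names, and &lt;code&gt;fakeGenerate&lt;/code&gt; stands in for the real model call. Requests are chained on one promise so they run strictly in arrival order, and a request aborted while waiting is skipped.&lt;/p&gt;

```typescript
// Hypothetical stand-in for a model call; a real engine would generate text here.
async function fakeGenerate(prompt: string, signal: AbortSignal) {
  await new Promise((resolve) => setTimeout(resolve, 5));
  if (signal.aborted) throw new Error("aborted");
  return `echo: ${prompt}`;
}

// Runs one request at a time, in arrival order (FIFO).
class InferenceQueue {
  // Tail of the promise chain; each new request waits for it to settle.
  private tail = Promise.resolve();

  enqueue(prompt: string, signal: AbortSignal, generate: typeof fakeGenerate) {
    const run = this.tail.then(() => {
      // Abort propagation: skip requests cancelled while they were waiting.
      if (signal.aborted) throw new Error("aborted before start");
      return generate(prompt, signal);
    });
    // A failed or aborted request must not block the requests behind it.
    this.tail = run.then(() => undefined, () => undefined);
    return run;
  }
}
```

&lt;p&gt;A page would create one &lt;code&gt;AbortController&lt;/code&gt; per request and call &lt;code&gt;abort()&lt;/code&gt; when it unmounts. A recovery path such as &lt;code&gt;forceReset()&lt;/code&gt; is not shown here.&lt;/p&gt;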

&lt;p&gt;&lt;strong&gt;3. Gemma 4 26B does not support &lt;code&gt;streamGenerateContent&lt;/code&gt; reliably.&lt;/strong&gt;&lt;br&gt;
This took an afternoon of DevTools debugging to identify: calling &lt;code&gt;:streamGenerateContent&lt;/code&gt; returned HTTP 400, while &lt;code&gt;:generateContent&lt;/code&gt; (no streaming) worked perfectly. The fix was a separate &lt;code&gt;cloudNoStream.ts&lt;/code&gt; helper for Practice, Illustrator, and Mermaid, features that don't benefit from streaming anyway since the user is waiting for one complete response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Easy Reading Mode is more than a CSS toggle.&lt;/strong&gt;&lt;br&gt;
For students with dyslexia or reading difficulties, AULA changes both the visual presentation (letter spacing, line height, max-width) &lt;em&gt;and&lt;/em&gt; the system prompt sent to Gemma 4 ("Short sentences. Simple vocabulary. One idea per line."). This is the kind of accessibility that AI uniquely enables — the model adapts its output style, not just the typography.&lt;/p&gt;
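
&lt;p&gt;A rough sketch of how one setting can drive both halves of the mode. The helper, the settings shape, and the CSS class name are hypothetical; only the quoted style rules come from the description above.&lt;/p&gt;

```typescript
// Hypothetical helper names; only the quoted style rules come from the article.
const EASY_READING_RULES = "Short sentences. Simple vocabulary. One idea per line.";

interface TutorSettings {
  easyReading: boolean;
}

// Returns both halves of the mode: a CSS class for typography and the prompt text.
function applyReadingMode(basePrompt: string, settings: TutorSettings) {
  if (!settings.easyReading) {
    return { cssClass: "", systemPrompt: basePrompt };
  }
  return {
    cssClass: "easy-reading", // hypothetical class for spacing, line height, max-width
    systemPrompt: `${basePrompt}\n\n${EASY_READING_RULES}`,
  };
}
```

&lt;p&gt;Keeping both outputs in one function makes it hard to change the typography without also changing what the model is asked to write.&lt;/p&gt;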

&lt;h3&gt;
  
  
  What Gemma 4 unlocked that wasn't possible 18 months ago
&lt;/h3&gt;

&lt;p&gt;Browser-native inference at this quality was genuinely impossible until WebGPU stabilized. AULA is &lt;strong&gt;only buildable in 2026&lt;/strong&gt;. The combination of Gemma 4's open weights, WebGPU's GPU access, and MediaPipe's optimized runtime is what makes a Pi-friendly AI tutor a real thing, not a thought experiment.&lt;/p&gt;

&lt;p&gt;For 65 million students in Latin America who have been excluded from the AI revolution, this matters more than I can describe in this post.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tech stack:&lt;/strong&gt; Next.js 15, TypeScript strict, Tailwind v4, MediaPipe LLM Inference, WebGPU, Gemini API (REST + SSE), Zustand, IndexedDB, jsPDF, Mermaid, tesseract.js, Web Speech API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Built solo in 11 days&lt;/strong&gt; for the DEV.to Gemma 4 Challenge.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;AULA is open source under MIT. Fork it, run it in your school, contribute to it. If you're a teacher in a low-connectivity region and want help deploying AULA, open an issue on GitHub.&lt;/p&gt;

&lt;p&gt;🇨🇴 Made in LATAM, for the students the world forgot.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
    </item>
    <item>
      <title>Building AccessBridge AI: How 5 AI Agents Collaborate to Make the Web Accessible</title>
      <dc:creator>Juan Pablo Enriquez Ortiz</dc:creator>
      <pubDate>Sat, 28 Mar 2026 02:04:39 +0000</pubDate>
      <link>https://dev.to/jpablortiz96/building-accessbridge-ai-how-5-ai-agents-collaborate-to-make-the-web-accessible-24kf</link>
      <guid>https://dev.to/jpablortiz96/building-accessbridge-ai-how-5-ai-agents-collaborate-to-make-the-web-accessible-24kf</guid>
      <description>&lt;p&gt;&lt;em&gt;Built for the JS AI Build-a-thon 2026 — Agents for Impact&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem That Inspired Us
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;96.3% of the top million websites fail basic accessibility standards.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That statistic, from the 2023 WebAIM Million report, stopped me cold. We're not talking about edge cases. We're talking about the overwhelming majority of the web being effectively inaccessible to the 1.3 billion people who live with some form of disability.&lt;/p&gt;

&lt;p&gt;The tools that exist today are part of the problem. Axe, WAVE, Lighthouse — these are excellent auditors. They'll tell you that you have 23 accessibility violations. What they won't do is fix a single one of them. The burden always falls back on the developer, who may not have the time, budget, or expertise to address every flag.&lt;/p&gt;

&lt;p&gt;We wanted to change the question from &lt;em&gt;"Where are the problems?"&lt;/em&gt; to &lt;em&gt;"Here's the fixed version — would you like to use it?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That's AccessBridge AI.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Built
&lt;/h2&gt;

&lt;p&gt;AccessBridge AI is a multi-agent system where 5 specialized AI agents collaborate in real time to transform any web page into universally accessible content. You paste a URL. Fifteen seconds later, you get back:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An &lt;strong&gt;accessibility score&lt;/strong&gt; (before and after) on a 0-100 scale&lt;/li&gt;
&lt;li&gt;A list of every issue found, with the agent that found it and the confidence score&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;automatically transformed HTML file&lt;/strong&gt; with fixes applied&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;full decision log&lt;/strong&gt; explaining every choice the system made&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;WCAG breakdown&lt;/strong&gt; across all four principles: Perceivable, Operable, Understandable, Robust&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you analyze a URL, here's what happens under the hood:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The &lt;strong&gt;Orchestrator&lt;/strong&gt; fetches the HTML server-side (15s timeout, custom User-Agent)&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Scanner&lt;/strong&gt;, &lt;strong&gt;Vision&lt;/strong&gt;, &lt;strong&gt;Simplifier&lt;/strong&gt;, and &lt;strong&gt;Navigator&lt;/strong&gt; agents all run in parallel&lt;/li&gt;
&lt;li&gt;The Orchestrator &lt;strong&gt;resolves conflicts&lt;/strong&gt; between agents (more on this below)&lt;/li&gt;
&lt;li&gt;High-confidence fixes are &lt;strong&gt;automatically applied&lt;/strong&gt; to the HTML&lt;/li&gt;
&lt;li&gt;Low-confidence suggestions are flagged for &lt;strong&gt;human review&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Scores are calculated and the full result is returned to the UI&lt;/li&gt;
&lt;/ol&gt;
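
&lt;p&gt;The parallel step in that pipeline can be sketched roughly as follows. The two agents here are hypothetical stand-ins that only do substring checks, and &lt;code&gt;makeAgent&lt;/code&gt; and &lt;code&gt;analyzePage&lt;/code&gt; are illustrative names, not the project's real functions. The sketch only shows the fan-out and fan-in shape.&lt;/p&gt;

```typescript
// Hypothetical stand-ins for the analysis agents; real agents would call
// models or parsers. Each returns the issues it found for the page.
interface Finding {
  agent: string;
  issues: string[];
}

function makeAgent(name: string, check: (html: string) => string[]) {
  return {
    name,
    // async so every agent exposes the same promise-based call
    analyze: async (html: string) => ({ agent: name, issues: check(html) }),
  };
}

const agents = [
  makeAgent("scanner", (html) => (html.includes("lang=") ? [] : ["missing lang attribute"])),
  makeAgent("vision", (html) => (html.includes("alt=") ? [] : ["image without alt text"])),
];

async function analyzePage(html: string) {
  // map() starts every analyze() call; Promise.all waits for all of them.
  const findings: Finding[] = await Promise.all(agents.map((agent) => agent.analyze(html)));
  // Keep only agents that actually reported something.
  return findings.filter((finding) => finding.issues.length > 0);
}
```

&lt;p&gt;With &lt;code&gt;Promise.all&lt;/code&gt;, one rejected agent rejects the whole batch. An orchestrator that has to survive a failing agent would more likely use &lt;code&gt;Promise.allSettled&lt;/code&gt;.&lt;/p&gt;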

&lt;p&gt;On our test runs: average score improvement of &lt;strong&gt;+31 to +42 points&lt;/strong&gt;, depending on how accessibility-challenged the original page was.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture Deep Dive
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The BaseAgent Contract
&lt;/h3&gt;

&lt;p&gt;Every agent in the system implements a single interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;BaseAgent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;AgentResult&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Every agent receives raw HTML and a URL, and returns a structured &lt;code&gt;AgentResult&lt;/code&gt; containing issues found, fixes proposed, metadata, and a confidence score. This contract is what makes the system composable — swapping the cloud Vision agent for an offline heuristic agent requires zero changes to the Orchestrator.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;AgentResult&lt;/code&gt; shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;AgentResult&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AccessibilityIssue&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="nl"&gt;fixes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;attribute&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;oldValue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;newValue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;startTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;endTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every fix carries a &lt;code&gt;selector&lt;/code&gt; (CSS selector targeting the element), the old value, the new value, and — critically — a human-readable &lt;code&gt;reason&lt;/code&gt;. This is what powers the Decision Log in the UI.&lt;/p&gt;

&lt;h3&gt;
  
  
  The &lt;code&gt;isEnhancement&lt;/code&gt; Flag: An Honest Score Model
&lt;/h3&gt;

&lt;p&gt;This is one of the subtler design decisions, and it took three iterations to get right.&lt;/p&gt;

&lt;p&gt;The problem: Vision and Simplifier agents find &lt;em&gt;opportunities&lt;/em&gt; — images that could have better alt text, paragraphs that could be simpler. These aren't pre-existing defects that the website owner created. They're improvements AccessBridge can make. If we counted them in the &lt;code&gt;scoreBefore&lt;/code&gt; calculation, we'd be artificially penalizing the site for things it never claimed to do.&lt;/p&gt;

&lt;p&gt;The solution: an &lt;code&gt;isEnhancement&lt;/code&gt; flag on &lt;code&gt;AccessibilityIssue&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;AccessibilityIssue&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
  &lt;span class="nl"&gt;fixApplied&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="cm"&gt;/** True for Vision / Simplifier issues that represent *improvements*
   *  AccessBridge found, not pre-existing defects. These are shown in the
   *  UI but never penalise scoreBefore, and their fixes (if applied)
   *  add to scoreAfter. */&lt;/span&gt;
  &lt;span class="nl"&gt;isEnhancement&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



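&lt;p&gt;The scoring model then becomes additive and honest. Applied fixes can only add points, so &lt;code&gt;scoreAfter&lt;/code&gt; never drops below &lt;code&gt;scoreBefore&lt;/code&gt;:&lt;br&gt;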
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// scoreBefore: only real pre-existing defects (Scanner + Navigator)&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;calcScoreBefore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IssueLike&lt;/span&gt;&lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;baseline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isEnhancement&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SCANNER&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;NAVIGATOR&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;baseline&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;severity&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;critical&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;severity&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;major&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;                              &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// scoreAfter: scoreBefore + points earned per applied fix&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;FIX_POINTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;vision&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// contextual alt text&lt;/span&gt;
  &lt;span class="na"&gt;navigator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// structural fixes have high WCAG impact&lt;/span&gt;
  &lt;span class="na"&gt;simplifier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// readability improvements&lt;/span&gt;
  &lt;span class="na"&gt;scanner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;calcScoreAfter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IssueLike&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="nx"&gt;before&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;gain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fixApplied&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;gain&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;FIX_POINTS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;before&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;gain&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Navigator gets the highest fix points because structural changes — adding landmark regions, fixing heading hierarchy, inserting skip links — have the biggest real-world impact for keyboard and screen reader users.&lt;/p&gt;
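
&lt;p&gt;To make the arithmetic concrete, here is a minimal worked example of the same additive model. The &lt;code&gt;SampleIssue&lt;/code&gt; shape, the string agent names, and the five sample issues are assumptions made up for illustration; only the penalty and fix-point values are taken from the functions above.&lt;/p&gt;

```typescript
// Hypothetical, simplified shapes for illustration; the real IssueLike type is not shown in this post.
type SampleAgent = "scanner" | "navigator" | "vision" | "simplifier";
type SampleSeverity = "critical" | "major" | "minor";

interface SampleIssue {
  agentType: SampleAgent;
  severity: SampleSeverity;
  fixApplied: boolean;
  isEnhancement?: boolean;
}

// Same numbers as the scoring functions in the post.
const SAMPLE_PENALTY: { [severity: string]: number } = { critical: 10, major: 5, minor: 2 };
const SAMPLE_FIX_POINTS: { [agent: string]: number } = { vision: 3, navigator: 4, simplifier: 2, scanner: 3 };

function clampScore(value: number): number {
  return Math.max(0, Math.min(100, value));
}

// Only Scanner and Navigator issues that are not enhancements lower the baseline.
function sampleScoreBefore(issues: SampleIssue[]): number {
  let score = 100;
  for (const issue of issues) {
    const isBaselineAgent = issue.agentType === "scanner" || issue.agentType === "navigator";
    if (issue.isEnhancement === true || !isBaselineAgent) continue;
    score -= SAMPLE_PENALTY[issue.severity] ?? 2;
  }
  return clampScore(score);
}

// Every applied fix, enhancement or not, adds its agent's points.
function sampleScoreAfter(issues: SampleIssue[], before: number): number {
  let gain = 0;
  for (const issue of issues) {
    if (issue.fixApplied) gain += SAMPLE_FIX_POINTS[issue.agentType] ?? 2;
  }
  return clampScore(before + gain);
}

const sampleIssues: SampleIssue[] = [
  { agentType: "scanner", severity: "critical", fixApplied: true },
  { agentType: "navigator", severity: "major", fixApplied: true },
  { agentType: "scanner", severity: "minor", fixApplied: false },
  { agentType: "vision", severity: "minor", fixApplied: true, isEnhancement: true },
  { agentType: "simplifier", severity: "minor", fixApplied: false, isEnhancement: true },
];

const sampleBefore = sampleScoreBefore(sampleIssues);             // 100 - 10 - 5 - 2 = 83
const sampleAfter = sampleScoreAfter(sampleIssues, sampleBefore); // 83 + 3 + 4 + 3 = 93
console.log(sampleBefore, sampleAfter);
```

&lt;p&gt;The two enhancement issues leave the baseline at 83, and only the three applied fixes move the score, to 93.&lt;/p&gt;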

&lt;h3&gt;
  
  
  Parallel Execution via Promise.all
&lt;/h3&gt;

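&lt;p&gt;All four specialist agents run concurrently. Because &lt;code&gt;Promise.all&lt;/code&gt; rejects as soon as any one promise rejects, the Orchestrator wraps each agent in a try/catch that returns &lt;code&gt;null&lt;/code&gt; on failure, so one failing agent (e.g., an Azure timeout) doesn't bring down the entire analysis:&lt;br&gt;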
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;settled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emitEvent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;WORKING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; started analyzing…`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emitEvent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DONE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; found &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; issues`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;issueCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;fixCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fixes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emitEvent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ERROR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; failed: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each &lt;code&gt;emitEvent&lt;/code&gt; call feeds the real-time agent timeline in the UI via an &lt;code&gt;EventEmitter&lt;/code&gt; pattern — the Orchestrator extends Node's &lt;code&gt;EventEmitter&lt;/code&gt;, and the API route streams events back to the browser using a readable stream.&lt;/p&gt;
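
&lt;p&gt;The failure handling in the &lt;code&gt;Promise.all&lt;/code&gt; block above can be tried in isolation. The sketch below is not the Orchestrator itself: &lt;code&gt;SketchJob&lt;/code&gt;, &lt;code&gt;SketchResult&lt;/code&gt;, and &lt;code&gt;fakeAnalyze&lt;/code&gt; are hypothetical stand-ins, and the final filtering step is an assumption about how the &lt;code&gt;null&lt;/code&gt; entries might be dropped before merging.&lt;/p&gt;

```typescript
// Hypothetical, simplified shapes; the real agent interface is richer.
interface SketchJob {
  agentName: string;
  shouldFail: boolean;
}

interface SketchResult {
  agentName: string;
  issueCount: number;
}

// Stand-in for agent.analyze(): rejects when asked to, to mimic a timeout.
async function fakeAnalyze(job: SketchJob) {
  if (job.shouldFail) {
    throw new Error(job.agentName + " timed out");
  }
  const result: SketchResult = { agentName: job.agentName, issueCount: 1 };
  return result;
}

// Same idea as the Orchestrator's try/catch: convert a rejection into null.
async function runSafely(job: SketchJob) {
  try {
    return await fakeAnalyze(job);
  } catch (error) {
    return null;
  }
}

// Type guard so the filtered array no longer includes null.
function isSketchResult(r: SketchResult | null): r is SketchResult {
  return r !== null;
}

// Promise.all never rejects here because every job resolves, possibly to null.
async function runAllSketch(jobs: SketchJob[]) {
  const settled = await Promise.all(jobs.map(runSafely));
  return settled.filter(isSketchResult);
}

const sketchJobs: SketchJob[] = [
  { agentName: "Scanner", shouldFail: false },
  { agentName: "Vision", shouldFail: true },
  { agentName: "Navigator", shouldFail: false },
];

runAllSketch(sketchJobs).then(function (results) {
  console.log(results.length); // 2: the failed Vision job was dropped
});
```

&lt;p&gt;Since &lt;code&gt;Promise.all&lt;/code&gt; preserves input order, the surviving results stay in the order the jobs were listed, regardless of which one finished first.&lt;/p&gt;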

&lt;h3&gt;
  
  
  The Conflict Resolution Engine
&lt;/h3&gt;

&lt;p&gt;Agents running in parallel will inevitably step on each other's toes. We handle two conflict types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Type 1: Same WCAG rule, same element, different agents.&lt;/strong&gt;&lt;br&gt;
The Scanner might flag &lt;code&gt;img:nth-of-type(3)&lt;/code&gt; for missing alt text (WCAG 1.1.1), and so might the Navigator. We deduplicate by keeping the first reporter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;seenIssues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;issue&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;::&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wcagRule&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;seenIssues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Log conflict, first-reporter wins&lt;/span&gt;
      &lt;span class="nx"&gt;conflicts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;winner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;First-reporter wins&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;seenIssues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
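
&lt;p&gt;The snippet above records the conflict; the step that actually drops the duplicate is not shown. A minimal sketch of that step, assuming a hypothetical &lt;code&gt;KeyedIssue&lt;/code&gt; shape and helper name, could look like this. Note that "first" means first in iteration order: if &lt;code&gt;results&lt;/code&gt; keeps the order of &lt;code&gt;this.agents&lt;/code&gt;, that is the agent listed first, not the one that finished first.&lt;/p&gt;

```typescript
// Hypothetical minimal issue shape; only the fields used for the key are included.
interface KeyedIssue {
  selector: string;
  wcagRule: string;
  agentType: string;
}

// Keep the first issue seen for each selector + WCAG rule pair, drop later ones.
function keepFirstReporter(issues: KeyedIssue[]): KeyedIssue[] {
  const seenKeys = new Set();
  const kept: KeyedIssue[] = [];
  for (const issue of issues) {
    const key = issue.selector + "::" + issue.wcagRule;
    if (seenKeys.has(key)) continue; // an earlier reporter already owns this key
    seenKeys.add(key);
    kept.push(issue);
  }
  return kept;
}

const reportedIssues: KeyedIssue[] = [
  { selector: "img:nth-of-type(3)", wcagRule: "1.1.1", agentType: "scanner" },
  { selector: "img:nth-of-type(3)", wcagRule: "1.1.1", agentType: "navigator" },
  { selector: "h3", wcagRule: "1.3.1", agentType: "navigator" },
];

const dedupedIssues = keepFirstReporter(reportedIssues);
console.log(dedupedIssues.length, dedupedIssues[0].agentType); // 2 "scanner"
```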



&lt;p&gt;&lt;strong&gt;Type 2: Vision vs Simplifier — the context preservation conflict.&lt;/strong&gt;&lt;br&gt;
This is the interesting one. Imagine Vision generates the alt text: &lt;em&gt;"Promotes transforming your future through education and growth opportunities"&lt;/em&gt; for an image inside a paragraph. Then Simplifier comes along and rewrites that paragraph to be shorter. Now the alt text no longer makes sense in context — screen reader users would hear the simplified text followed by the original (now out-of-context) alt text.&lt;/p&gt;

&lt;p&gt;Our rule: &lt;strong&gt;Vision always wins over Simplifier when their fixes overlap.&lt;/strong&gt; If a Vision-fixed image lives inside a paragraph that Simplifier wants to rewrite, the Simplifier fix for that paragraph is blocked:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Find text blocks where Vision has fixed an img inside&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;simplFix&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;simplifierResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fixes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;$block&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;simplFix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;imgSel&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;visionImgSelectors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;$block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;imgSel&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Vision wins — block the Simplifier fix&lt;/span&gt;
      &lt;span class="nx"&gt;blockedSimplifierSelectors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;simplFix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;conflicts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;winner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;VISION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Alt text generated by Vision Agent is calibrated to the image&lt;/span&gt;&lt;span class="se"&gt;\'&lt;/span&gt;&lt;span class="s1"&gt;s &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;surrounding context. Rewriting that context could make the alt text &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;misleading for screen reader users.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This conflict — and its resolution — is recorded in the Decision Log and surfaced in the Responsible AI panel so users can understand why a particular fix wasn't applied.&lt;/p&gt;
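
&lt;p&gt;Once the blocked selectors are collected, the remaining step is to skip those fixes when applying the Simplifier's output. The sketch below shows one way that filter might look; &lt;code&gt;TextFix&lt;/code&gt;, &lt;code&gt;SelectorLookup&lt;/code&gt;, and &lt;code&gt;dropBlockedFixes&lt;/code&gt; are hypothetical names, and the selectors are made-up sample data.&lt;/p&gt;

```typescript
// Hypothetical minimal fix shape for illustration.
interface TextFix {
  selector: string;
  newValue: string;
}

// Anything with a has() method works here, including a Set of strings.
interface SelectorLookup {
  has(value: string): boolean;
}

// Return only the Simplifier fixes whose selector was not blocked by Vision.
function dropBlockedFixes(fixes: TextFix[], blocked: SelectorLookup): TextFix[] {
  const allowed: TextFix[] = [];
  for (const fix of fixes) {
    if (!blocked.has(fix.selector)) allowed.push(fix);
  }
  return allowed;
}

const simplifierFixes: TextFix[] = [
  { selector: "p:nth-of-type(1)", newValue: "Shorter intro." },
  { selector: "p:nth-of-type(2)", newValue: "Shorter paragraph that contains an image." },
];

// Sample data: pretend Vision fixed an image inside the second paragraph.
const blockedBySample = new Set(["p:nth-of-type(2)"]);

const fixesToApply = dropBlockedFixes(simplifierFixes, blockedBySample);
console.log(fixesToApply.length, fixesToApply[0].selector); // 1 "p:nth-of-type(1)"
```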




&lt;h2&gt;
  
  
  The Secret Sauce: Contextual Alt Text
&lt;/h2&gt;

&lt;p&gt;The Vision Agent is where Azure OpenAI earns its place in the system.&lt;/p&gt;

&lt;p&gt;Most accessibility scanners will tell you: "This image has no alt text." The best ones will say: "Add meaningful alt text." But what does "meaningful" mean for a specific image on a specific page?&lt;/p&gt;

&lt;p&gt;Before even calling the API, we extract rich context from the DOM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;ExtractedContext&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;imageUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;imageType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;decorative&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;functional&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;informative&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;heading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// nearest ancestor or preceding h1-h6&lt;/span&gt;
  &lt;span class="nl"&gt;surroundingText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// text content of parent element&lt;/span&gt;
  &lt;span class="nl"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// figcaption if present&lt;/span&gt;
  &lt;span class="nl"&gt;linkText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// text of wrapping &amp;lt;a&amp;gt; if present&lt;/span&gt;
  &lt;span class="nl"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;           &lt;span class="c1"&gt;// title attribute&lt;/span&gt;
  &lt;span class="nl"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// the raw HTML element&lt;/span&gt;
  &lt;span class="nl"&gt;currentAlt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent classifies each image into one of three roles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Decorative&lt;/strong&gt; — purely visual, no information content → &lt;code&gt;alt=""&lt;/code&gt; (handled by Scanner)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Functional&lt;/strong&gt; — inside a link or button → alt text describes the destination/action&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Informative&lt;/strong&gt; — content image → alt text describes what the image communicates&lt;/li&gt;
&lt;/ul&gt;
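
&lt;p&gt;A minimal sketch of the three-way classification above, assuming the agent has already worked out two facts about the image from the DOM. The &lt;code&gt;RoleHints&lt;/code&gt; shape and the &lt;code&gt;classifyImageRole&lt;/code&gt; name are hypothetical and not taken from the project; checking the link case first is a choice made for this sketch.&lt;/p&gt;

```typescript
type ImageRole = 'decorative' | 'functional' | 'informative';

// Hypothetical inputs: a real agent would derive these from the parsed HTML.
interface RoleHints {
  insideLinkOrButton: boolean; // the image is wrapped by a link or button
  presentational: boolean;     // assumption: e.g. a spacer or purely ornamental graphic
}

function classifyImageRole(hints: RoleHints): ImageRole {
  // Functional first: an image inside a link should describe the destination or action.
  if (hints.insideLinkOrButton) return 'functional';
  // Decorative images carry no information and end up with alt="".
  if (hints.presentational) return 'decorative';
  // Everything else is treated as a content image.
  return 'informative';
}
```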

&lt;p&gt;This role classification shapes the system prompt sent to GPT-4o:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are an accessibility expert generating alt text for a web image.

Image role: FUNCTIONAL (this image is inside a link or button)
For functional images, describe the DESTINATION or ACTION, not just what you see.
Generate alt text that a screen reader user would find helpful.

RULES:
- Be concise (under 125 characters)
- Describe PURPOSE, not visual appearance
- If it's functional, what does it DO or WHERE does it go?
- Do NOT start with "Image of", "Picture of", "Photo of"
- Do NOT include quotes in your response

Context:
- Surrounding text: "Learn more about our engineering bootcamp programs"
- Link text: "Apply now"
- Nearest heading: "Transform Your Career in Tech"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; &lt;em&gt;"Apply for engineering bootcamp — Transform Your Career in Tech"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Without this context, a generic vision model might return: &lt;em&gt;"A button with text."&lt;/em&gt;&lt;/p&gt;
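
&lt;p&gt;To make the "Context:" part of that prompt concrete, here is a small sketch of how it might be assembled from the extracted fields. &lt;code&gt;PromptContext&lt;/code&gt; and &lt;code&gt;buildContextBlock&lt;/code&gt; are illustrative names, and skipping empty fields is an assumption rather than confirmed project behavior.&lt;/p&gt;

```typescript
// Subset of the image context fields that appear in the prompt above.
interface PromptContext {
  surroundingText: string;
  linkText: string;
  heading: string;
}

// Hypothetical helper: renders the "Context:" block and skips empty fields
// so the model is never shown a blank label.
function buildContextBlock(ctx: PromptContext): string {
  const entries: [string, string][] = [
    ['Surrounding text', ctx.surroundingText],
    ['Link text', ctx.linkText],
    ['Nearest heading', ctx.heading],
  ];
  const lines = entries
    .filter(([, value]) => value.trim().length > 0)
    .map(([label, value]) => `- ${label}: "${value.trim()}"`);
  return ['Context:'].concat(lines).join('\n');
}
```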

&lt;p&gt;The agent also penalizes its own confidence score when context is thin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hasContext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;heading&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;surroundingText&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;caption&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;linkText&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;hasContext&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mf"&gt;0.88&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.72&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A confidence below 0.5 means the fix is surfaced as a suggestion, never auto-applied. This is human-in-the-loop by design — the system acknowledges its own uncertainty.&lt;/p&gt;
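
&lt;p&gt;The routing rule is simple enough to sketch. The 0.5 threshold comes from the text above; the &lt;code&gt;ProposedFix&lt;/code&gt; and &lt;code&gt;RoutedFix&lt;/code&gt; shapes, the &lt;code&gt;routeFix&lt;/code&gt; name, and the wording of the reason string are assumptions made for illustration.&lt;/p&gt;

```typescript
const AUTO_APPLY_THRESHOLD = 0.5; // fixes at or above this confidence are auto-applied

// Hypothetical shapes, not the project's real types.
interface ProposedFix {
  selector: string;
  confidence: number; // expected range: 0 to 1
}

interface RoutedFix extends ProposedFix {
  action: 'auto-apply' | 'suggest';
  reason?: string;
}

function routeFix(fix: ProposedFix): RoutedFix {
  if (fix.confidence >= AUTO_APPLY_THRESHOLD) {
    return { selector: fix.selector, confidence: fix.confidence, action: 'auto-apply' };
  }
  // Below the threshold the fix is only surfaced to a human, with the score attached.
  return {
    selector: fix.selector,
    confidence: fix.confidence,
    action: 'suggest',
    reason: `Confidence ${fix.confidence.toFixed(2)}: flagged for human review.`,
  };
}
```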




&lt;h2&gt;
  
  
  Going Offline: Accessibility Without Internet
&lt;/h2&gt;

&lt;p&gt;We built two fully functional modes from day one, and the offline mode is not a degraded fallback — it's a genuine capability with a specific use case.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt; Because the communities that most need accessibility tooling — nonprofits, government agencies in emerging markets, small educational institutions — often have unreliable or metered internet connectivity. A tool that stops working without cloud connectivity isn't truly accessible.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;☁️ Cloud&lt;/th&gt;
&lt;th&gt;📡 Offline&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Scanner (20+ WCAG rules)&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Full (same code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Navigator (structure)&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Full (same code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vision (alt text)&lt;/td&gt;
&lt;td&gt;AI-powered via GPT-4o&lt;/td&gt;
&lt;td&gt;5-tier heuristic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Simplifier (readability)&lt;/td&gt;
&lt;td&gt;AI rewriting&lt;/td&gt;
&lt;td&gt;Deterministic splitting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Typical speed&lt;/td&gt;
&lt;td&gt;~12 seconds&lt;/td&gt;
&lt;td&gt;~2 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privacy&lt;/td&gt;
&lt;td&gt;Processed via Azure&lt;/td&gt;
&lt;td&gt;Zero external requests&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Offline Vision Heuristic
&lt;/h3&gt;

&lt;p&gt;When no API key is present, the Vision agent falls back to a 5-tier priority system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tier 1: &amp;lt;img&amp;gt; inside &amp;lt;a&amp;gt;
  → "Link to {link text}" or "Link to {domain name}"
  Rationale: functional images communicate navigation intent

Tier 2: &amp;lt;figure&amp;gt; with &amp;lt;figcaption&amp;gt;
  → Use the caption verbatim (the author already wrote it)

Tier 3: Meaningful filename
  → "hero-education-program.jpg" → "Hero education program image"
  (strip extension, convert hyphens/underscores to spaces, capitalize the first word, append "image")

Tier 4: Nearest heading in the DOM
  → "Image related to: {heading text}"

Tier 5: Image URL domain
  → "Image — cdn.example.com"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All offline Vision issues are marked &lt;code&gt;isEnhancement: true&lt;/code&gt; with confidence &lt;code&gt;0.5&lt;/code&gt;, which means they're auto-applied (the threshold is &lt;code&gt;≥ 0.5&lt;/code&gt;) but don't penalize the before-score.&lt;/p&gt;
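
&lt;p&gt;The five tiers can be sketched as a single function that returns at the first tier with usable data. This is an illustration under stated assumptions, not the project's code: the &lt;code&gt;FallbackContext&lt;/code&gt; shape (including &lt;code&gt;insideLink&lt;/code&gt; and &lt;code&gt;linkHref&lt;/code&gt;), the helper names, and the rule for what counts as a "meaningful" filename are all hypothetical.&lt;/p&gt;

```typescript
interface FallbackContext {
  imageUrl: string;
  filename: string;
  linkText: string;
  caption: string;
  heading: string;
  insideLink: boolean; // assumption: whether the image sits inside a link
  linkHref: string;    // assumption: used for the "Link to {domain name}" case
}

// Extracts the host from an absolute URL, or returns '' if there is none.
function domainOf(url: string): string {
  const match = url.match(/^[a-z][a-z0-9+.-]*:\/\/([^\/?#:]+)/i);
  return match ? match[1] || '' : '';
}

// Assumption: camera-style names such as "IMG_1234" are not meaningful.
function humanizeFilename(filename: string): string {
  const base = filename.replace(/\.[a-z0-9]+$/i, '').replace(/[-_]+/g, ' ').trim();
  const hasWord = /[a-z]{3,}/i.test(base);
  const isGeneric = /^(img|image|dsc|photo|screenshot)\s*\d*$/i.test(base);
  if (!hasWord || isGeneric) return '';
  return base.charAt(0).toUpperCase() + base.slice(1) + ' image';
}

function generateFallbackAlt(ctx: FallbackContext): string {
  // Tier 1: image inside a link, described by link text or link domain
  if (ctx.insideLink) {
    const target = ctx.linkText.trim() || domainOf(ctx.linkHref);
    if (target) return `Link to ${target}`;
  }
  // Tier 2: the author's own caption
  if (ctx.caption.trim()) return ctx.caption.trim();
  // Tier 3: meaningful filename
  const fromFile = humanizeFilename(ctx.filename);
  if (fromFile) return fromFile;
  // Tier 4: nearest heading
  if (ctx.heading.trim()) return `Image related to: ${ctx.heading.trim()}`;
  // Tier 5: image URL domain (\u2014 is the dash used in the Tier 5 example above)
  const domain = domainOf(ctx.imageUrl);
  return domain ? `Image \u2014 ${domain}` : 'Image';
}
```

&lt;p&gt;Ordering the checks from most to least specific means a weak signal, such as the CDN domain, is used only when nothing better exists.&lt;/p&gt;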

&lt;h3&gt;
  
  
  The Offline Simplifier
&lt;/h3&gt;

&lt;p&gt;The offline Simplifier uses a deterministic algorithm instead of calling GPT-4o. For any &lt;code&gt;&amp;lt;p&amp;gt;&lt;/code&gt; element with a sentence over 30 words, it attempts a three-pass split:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pass 1 — Natural break (comma near midpoint):
  Find the comma closest to the midpoint of the sentence (within ±30% of it).
  "The program, which was founded in 2019, has helped over 1,000 students..."
  → Split at the comma before "has"

Pass 2 — Conjunction split:
  Find the first coordinating/subordinating conjunction after the midpoint:
  (and, but, which, because, however, although, while, whereas...)
  → Split before the conjunction, add a period

Pass 3 — Hard midpoint:
  If no natural break found, split at the word nearest the midpoint.
  (Last resort — preserves meaning better than cutting arbitrarily)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
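
&lt;p&gt;Here is one way the three passes could look in code. It is a sketch, not the project's implementation: it assumes the 30% window is measured in words rather than characters, treats "after the midpoint" as "at or after", and the names &lt;code&gt;splitLongSentence&lt;/code&gt; and &lt;code&gt;joinParts&lt;/code&gt; are made up for this example.&lt;/p&gt;

```typescript
const MAX_WORDS = 30;     // only sentences over 30 words are split
const COMMA_WINDOW = 0.3; // assumption: the window is measured in words around the midpoint
const CONJUNCTIONS = ['and', 'but', 'which', 'because', 'however', 'although', 'while', 'whereas'];

// Builds two sentences from the words before and after splitAt.
function joinParts(words: string[], splitAt: number): string[] {
  const first = words.slice(0, splitAt).join(' ').replace(/[,;:]$/, '') + '.';
  const rest = words.slice(splitAt).join(' ');
  return [first, rest.charAt(0).toUpperCase() + rest.slice(1)];
}

function splitLongSentence(sentence: string): string[] {
  const words = sentence.trim().split(/\s+/);
  if (MAX_WORDS >= words.length) return [sentence];
  const mid = Math.floor(words.length / 2);
  const distance = (at: number): number => Math.abs(at - mid);

  // Pass 1: split after the comma-terminated word closest to the midpoint.
  const commaAt = words
    .map((word, i) => (/,$/.test(word) ? i + 1 : -1))
    .filter((at) => at > 0)
    .filter((at) => COMMA_WINDOW * words.length >= distance(at))
    .sort((a, b) => distance(a) - distance(b))[0];
  if (commaAt !== undefined) return joinParts(words, commaAt);

  // Pass 2: split before the first conjunction at or after the midpoint.
  const conjAt = words
    .map((word, i) => (CONJUNCTIONS.indexOf(word.toLowerCase().replace(/[^a-z]/g, '')) >= 0 ? i : -1))
    .filter((at) => at >= mid)[0];
  if (conjAt !== undefined) return joinParts(words, conjAt);

  // Pass 3: no natural break, so split at the midpoint word.
  return joinParts(words, mid);
}
```

&lt;p&gt;Because a number like "1,000" does not end in a comma, it is never mistaken for a clause boundary in Pass 1.&lt;/p&gt;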



&lt;p&gt;Result on Wikipedia: cloud mode gains 32 points after the 100-point cap (37 uncapped), and offline mode gains 31. The gap is real but smaller than you'd expect.&lt;/p&gt;




&lt;h2&gt;
  
  
  Responsible AI: Not an Afterthought
&lt;/h2&gt;

&lt;p&gt;We made a deliberate decision early: transparency and human oversight are architectural requirements, not features we'd add later.&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;AgentEvent&lt;/code&gt; is timestamped and stored:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;AgentEvent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// WORKING | DONE | ERROR | CONFLICT&lt;/span&gt;
  &lt;span class="nl"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;data&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
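
&lt;p&gt;As a rough illustration of how events with this shape could feed a chronological log, consider the sketch below. The &lt;code&gt;DecisionLog&lt;/code&gt; class is hypothetical, the enum values are assumed (the article only names the four statuses), and &lt;code&gt;AgentType&lt;/code&gt; is simplified to a string.&lt;/p&gt;

```typescript
// Enum values are assumptions; only the four status names come from the article.
enum AgentStatus { WORKING = 'WORKING', DONE = 'DONE', ERROR = 'ERROR', CONFLICT = 'CONFLICT' }
type AgentType = string; // simplification for this sketch

interface AgentEvent {
  timestamp: number;
  agentType: AgentType;
  status: AgentStatus;
  message: string;
  data?: unknown;
}

// Hypothetical in-memory log.
class DecisionLog {
  private events: AgentEvent[] = [];

  record(agentType: AgentType, status: AgentStatus, message: string, timestamp: number = Date.now()): void {
    this.events.push({ timestamp, agentType, status, message });
  }

  // Returns a chronological copy; conflicts are flagged so a UI can highlight them.
  entries(): { event: AgentEvent; highlight: boolean }[] {
    return this.events
      .slice()
      .sort((a, b) => a.timestamp - b.timestamp)
      .map((event) => ({ event, highlight: event.status === AgentStatus.CONFLICT }));
  }
}
```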



&lt;p&gt;The Decision Log in the UI renders every event — including conflicts — in chronological order. Conflict events are highlighted in amber. The Responsible AI panel shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Transparency&lt;/strong&gt;: total number of agent decisions logged&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-in-the-Loop&lt;/strong&gt;: count of suggestions vs auto-applied fixes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confidence Scoring&lt;/strong&gt;: breakdown of high/medium/low confidence fixes per agent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy&lt;/strong&gt;: mode used and data retention policy (none — all processing is ephemeral)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The confidence threshold for auto-apply is explicitly &lt;code&gt;≥ 0.5&lt;/code&gt;. Anything below that is shown as a suggestion with a reason: &lt;em&gt;"Confidence 0.42 — flagged for human review."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This design reflects a real belief: AI systems that affect people's lives — and accessibility directly affects how 1.3 billion people experience the web — need to be auditable, explainable, and humble about their own limitations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building with AI: Our Claude Code Workflow
&lt;/h2&gt;

&lt;p&gt;This section is the most honest part of this post.&lt;/p&gt;

&lt;p&gt;We used &lt;strong&gt;Claude Code&lt;/strong&gt; (Anthropic's CLI coding assistant) for the vast majority of this project. Here's what that actually looked like, warts included.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Worked Exceptionally Well
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Generating the type system.&lt;/strong&gt; We gave Claude Code the exact interfaces we wanted and it produced clean, idiomatic TypeScript on the first try. The &lt;code&gt;AccessibilityIssue&lt;/code&gt;, &lt;code&gt;AgentResult&lt;/code&gt;, and &lt;code&gt;AnalysisResult&lt;/code&gt; interfaces required almost no revision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Scanner Agent.&lt;/strong&gt; We asked for a WCAG 2.1 auditor covering all four principles. Claude generated 20+ detection rules using cheerio, each wrapped in its own try/catch, with proper severity and WCAG rule codes. This would have taken a week to research and write manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UI components with specific constraints.&lt;/strong&gt; When we described the exact visual behavior we wanted — "a segmented control using visually-hidden radio inputs, two options, with an offline disclaimer that animates in with aria-live='polite'" — we got exactly that. No hallucinated React libraries, no unnecessary dependencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debugging TypeScript errors across a multi-agent system.&lt;/strong&gt; When we hit a &lt;code&gt;TS2322&lt;/code&gt; error about &lt;code&gt;IssueSeverity&lt;/code&gt; string literals, we described the error and the surrounding code, and got the right fix immediately: import the enum and use &lt;code&gt;IssueSeverity.MAJOR&lt;/code&gt; instead of the string &lt;code&gt;'major'&lt;/code&gt;.&lt;/p&gt;
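
&lt;p&gt;For readers who haven't hit this error, here is a stripped-down reproduction of the pattern. The enum member values and the &lt;code&gt;Issue&lt;/code&gt; interface are assumptions; the article only confirms the enum name and the &lt;code&gt;MAJOR&lt;/code&gt; member.&lt;/p&gt;

```typescript
// Member values are assumed for illustration.
enum IssueSeverity {
  CRITICAL = 'critical',
  MAJOR = 'major',
  MINOR = 'minor',
}

interface Issue {
  severity: IssueSeverity;
}

const issue: Issue = { severity: IssueSeverity.MINOR };

// issue.severity = 'major';          // rejected by the compiler with TS2322
issue.severity = IssueSeverity.MAJOR; // accepted: the value is an enum member
```

&lt;p&gt;String enums are opaque to plain string literals even when the underlying value matches, which is why the import is needed.&lt;/p&gt;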

&lt;h3&gt;
  
  
  What Didn't Work (At First)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The scoring algorithm needed three iterations to get right.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our first attempt: a single &lt;code&gt;calcScore(issues, afterFixes: boolean)&lt;/code&gt; function that counted all issues and tried to subtract fixed ones. When we tested in offline mode, scores were going &lt;em&gt;down&lt;/em&gt; — from 72 to 51 — because Vision and Simplifier were generating issues that got counted against the baseline.&lt;/p&gt;

&lt;p&gt;Second attempt: separate before/after calculations. Better, but still wrong — the "after" score was recounting all unfixed issues instead of adding earned points.&lt;/p&gt;

&lt;p&gt;Third attempt: the additive model with the &lt;code&gt;isEnhancement&lt;/code&gt; flag described above. The key insight was identifying &lt;em&gt;why&lt;/em&gt; the model was wrong, not just that it was wrong.&lt;/p&gt;

&lt;p&gt;The lesson: &lt;strong&gt;AI-assisted coding works best when you can articulate the bug precisely.&lt;/strong&gt; "The offline score goes down" didn't help. "The before-score counts Vision issues that are improvements, not defects — they shouldn't appear in the baseline" produced an exact, correct solution.&lt;/p&gt;
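
&lt;p&gt;A rough sketch of the additive idea: the baseline ignores enhancements, and the "after" score adds earned points on top of the baseline instead of recounting what is still broken. The &lt;code&gt;penalty&lt;/code&gt; weights, the &lt;code&gt;fixed&lt;/code&gt; flag, and both function bodies are assumptions for illustration; only &lt;code&gt;isEnhancement&lt;/code&gt;, the &lt;code&gt;scoreBefore&lt;/code&gt; name, and the cap at 100 come from the article.&lt;/p&gt;

```typescript
interface ScoredIssue {
  penalty: number;        // hypothetical: points the issue costs, or earns back once fixed
  isEnhancement: boolean; // true for improvements such as generated alt text
  fixed: boolean;         // true if the fix was applied
}

// Baseline: only real defects count against the page.
function scoreBefore(issues: ScoredIssue[]): number {
  const lost = issues
    .filter((i) => !i.isEnhancement)
    .reduce((sum, i) => sum + i.penalty, 0);
  return Math.max(0, 100 - lost);
}

// Additive: start from the baseline, add points for every applied fix
// (enhancements included), then cap at 100.
function scoreAfter(issues: ScoredIssue[]): number {
  const earned = issues
    .filter((i) => i.fixed)
    .reduce((sum, i) => sum + i.penalty, 0);
  return Math.min(100, scoreBefore(issues) + earned);
}
```

&lt;p&gt;With this shape the "after" score can never drop below the "before" score, which is the property the first two attempts lacked.&lt;/p&gt;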

&lt;p&gt;&lt;strong&gt;Complex cheerio selectors were brittle.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Early versions of the agents generated selectors like &lt;code&gt;div.container &amp;gt; section:first-child &amp;gt; img:nth-child(3)&lt;/code&gt;. These worked on the test page but broke on real sites. We had to manually establish the selector priority rule (id &amp;gt; class &amp;gt; src attribute &amp;gt; nth-of-type) and explain it precisely before the generated code became stable.&lt;/p&gt;
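
&lt;p&gt;The priority rule itself is short. The sketch below assumes the element's attributes have already been read into a plain object; &lt;code&gt;ElementInfo&lt;/code&gt; and &lt;code&gt;buildSelector&lt;/code&gt; are hypothetical names, and escaping of special characters in ids, classes, or URLs is deliberately left out.&lt;/p&gt;

```typescript
// Hypothetical snapshot of the attributes needed to build a selector.
interface ElementInfo {
  tag: string;         // e.g. "img"
  id?: string;
  classes: string[];
  src?: string;
  indexOfType: number; // 1-based position among siblings with the same tag
}

// Priority: id, then class, then src attribute, then nth-of-type.
function buildSelector(el: ElementInfo): string {
  if (el.id) return `${el.tag}#${el.id}`;
  if (el.classes.length > 0) return `${el.tag}.${el.classes.join('.')}`;
  if (el.src) return `${el.tag}[src="${el.src}"]`;
  return `${el.tag}:nth-of-type(${el.indexOfType})`;
}
```

&lt;p&gt;Each selector is anchored to the element's own attributes where possible, so it survives changes to the surrounding layout; the positional form is only a last resort.&lt;/p&gt;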

&lt;p&gt;&lt;strong&gt;Conflict resolution logic needed manual refinement.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The initial conflict resolution was purely deduplication. The Vision-vs-Simplifier context preservation conflict — where rewriting a paragraph could make an adjacent alt text misleading — was a design decision we arrived at ourselves, then asked Claude to implement. The "what" came from us; the "how" came from Claude.&lt;/p&gt;

&lt;h3&gt;
  
  
  Our Prompting Strategy
&lt;/h3&gt;

&lt;p&gt;The difference between a prompt that works and one that doesn't, in our experience:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Specify the interface, not just the behavior.&lt;/strong&gt;&lt;br&gt;
Instead of: &lt;em&gt;"Create a Scanner Agent that checks accessibility"&lt;/em&gt;&lt;br&gt;
We used: &lt;em&gt;"Create a Scanner Agent that implements the BaseAgent interface below. It should use cheerio to parse the HTML and detect these specific WCAG violations, returning issues with these exact fields..."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Describe bugs with reproduction steps, not symptoms.&lt;/strong&gt;&lt;br&gt;
Instead of: &lt;em&gt;"The score is wrong"&lt;/em&gt;&lt;br&gt;
We used: &lt;em&gt;"The &lt;code&gt;scoreBefore&lt;/code&gt; function at line 35 is including Vision agent issues (marked &lt;code&gt;isEnhancement: true&lt;/code&gt;) in its baseline count. These should be excluded. The fix should modify the filter condition to check &lt;code&gt;!i.isEnhancement &amp;amp;&amp;amp;&lt;/code&gt; before the agentType check."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Iterate on real output, not hypothetical code.&lt;/strong&gt;&lt;br&gt;
We ran the app, analyzed a real URL, saw the output, identified what was wrong, then described that specific wrong output and the expected correct output. Every iteration was grounded in real behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example prompts we used:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create the Vision Agent in /src/agents/vision.ts. It must implement BaseAgent.
It should:
1. Use cheerio to find all &amp;lt;img&amp;gt; elements with missing or generic alt text
2. For each image, extract this context object: [interface definition]
3. Call Azure OpenAI with this exact system prompt: [prompt text]
4. Mark each issue with isEnhancement: true and confidence: 0.85
5. Fall back to generateFallbackAlt() if Azure is not configured
The fallback function should try: linkText → figcaption → filename → heading → domain
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;There is a TypeScript error in orchestrator.ts line 482:
  Type 'string' is not assignable to type 'IssueSeverity'
The line reads: issue.severity = 'major';
Fix it by importing IssueSeverity from @/types/agents and using IssueSeverity.MAJOR.
Apply the same fix wherever 'critical' and 'minor' string literals are used.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;We tested on a range of real websites. Here's a representative sample from a real run:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test site: eduky.co (education platform)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Score before: 51 / 100&lt;/li&gt;
&lt;li&gt;Score after: 93 / 100 (&lt;strong&gt;+42 points&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Issues detected: 21 across all 4 WCAG categories&lt;/li&gt;
&lt;li&gt;Fixes auto-applied: 13 (high confidence)&lt;/li&gt;
&lt;li&gt;Suggestions for review: 8 (lower confidence)&lt;/li&gt;
&lt;li&gt;Analysis time (cloud): 14.4 seconds&lt;/li&gt;
&lt;li&gt;Analysis time (offline): 1.6 seconds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;WCAG Breakdown (before → after):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Perceivable: 48 → 89 (+41)&lt;/li&gt;
&lt;li&gt;Operable: 71 → 85 (+14)&lt;/li&gt;
&lt;li&gt;Understandable: 62 → 78 (+16)&lt;/li&gt;
&lt;li&gt;Robust: 55 → 91 (+36)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Test site: Wikipedia (English article)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Score before: 68 / 100&lt;/li&gt;
&lt;li&gt;Cloud mode score after: 105 → capped at 100 (&lt;strong&gt;+32 points&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Offline mode score after: 99 (&lt;strong&gt;+31 points&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Analysis time (cloud): 18.2 seconds (many images → many API calls)&lt;/li&gt;
&lt;li&gt;Analysis time (offline): 2.1 seconds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The near-parity between cloud and offline on Wikipedia suggests that the heuristic offline agents are genuinely useful: most Wikipedia images follow predictable patterns (figures with captions, diagrams with descriptive filenames) that the heuristic system handles well.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;AccessBridge AI was built in five days for a hackathon. Here's where we'd take it with more time:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Browser Extension&lt;/strong&gt; — Run AccessBridge directly in the browser without pasting URLs. Inject the transformed HTML into the current tab so users can see the before/after in situ.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CI/CD Integration&lt;/strong&gt; — An API endpoint that returns a machine-readable WCAG report and exits non-zero when critical violations are detected. Plug it into your GitHub Actions pipeline: no PR gets merged if it regresses accessibility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Foundry Local Integration&lt;/strong&gt; — Replace the offline heuristics with actual on-device AI inference using Azure AI Foundry Local and Phi-4. True intelligence without any internet dependency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-language Support&lt;/strong&gt; — The Simplifier currently targets English readability (Flesch-Kincaid). Extending to Spanish, French, and Portuguese would dramatically expand the tool's impact in underserved markets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accessibility Score Tracking&lt;/strong&gt; — Store historical scores per domain. Show a site owner their accessibility trend over time, not just a single snapshot.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://accessbridge-ai.vercel.app/" rel="noopener noreferrer"&gt;🚀 Live Demo&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/jpablortiz96/accessbridge-ai" rel="noopener noreferrer"&gt;📦 GitHub Repository&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Drop any public URL into the analyzer and watch 5 agents work in real time. Try it on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A site you own (and care about improving)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;https://example.com&lt;/code&gt; (a minimal, intentionally bare page)&lt;/li&gt;
&lt;li&gt;A Wikipedia article (rich with images, complex structure)&lt;/li&gt;
&lt;li&gt;A government or nonprofit site (where accessibility matters most)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the analysis finds something, the fixed HTML is available for download immediately.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Accessibility is one of those problems where the technical solution is well-understood and the barrier is almost entirely friction. We know what good alt text looks like. We know what heading hierarchy should be. We know what ARIA landmarks do. The problem is that fixing 47 violations across a 200-page website is a week of tedious work.&lt;/p&gt;

&lt;p&gt;AI agents can absorb that friction. Not perfectly: our confidence scores and human-in-the-loop design reflect genuine humility about what the system can and can't do reliably. But they are good enough and fast enough that a small nonprofit's answer to "should our website be accessible?" no longer has to be "we can't afford the developer time."&lt;/p&gt;

&lt;p&gt;That's the goal. Everything else is an implementation detail.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with ❤️ for the JS AI Build-a-thon 2026 — because the web should work for everyone.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;— Juan Pablo Enriquez Ortiz&lt;/em&gt;&lt;/p&gt;

</description>
      <category>a11y</category>
      <category>agents</category>
      <category>ai</category>
      <category>devchallenge</category>
    </item>
  </channel>
</rss>
