<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Juan Pablo Enriquez Ortiz</title>
    <description>The latest articles on DEV Community by Juan Pablo Enriquez Ortiz (@jpablortiz96).</description>
    <link>https://dev.to/jpablortiz96</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3846843%2F21ecb04b-d1ec-48ce-8480-ecb3645d37cb.png</url>
      <title>DEV Community: Juan Pablo Enriquez Ortiz</title>
      <link>https://dev.to/jpablortiz96</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jpablortiz96"/>
    <language>en</language>
    <item>
      <title>5 production patterns for running Gemma 4 in the browser — what the docs don't tell you</title>
      <dc:creator>Juan Pablo Enriquez Ortiz</dc:creator>
      <pubDate>Sat, 23 May 2026 04:25:37 +0000</pubDate>
      <link>https://dev.to/jpablortiz96/5-production-patterns-for-running-gemma-4-in-the-browser-what-the-docs-dont-tell-you-2ai1</link>
      <guid>https://dev.to/jpablortiz96/5-production-patterns-for-running-gemma-4-in-the-browser-what-the-docs-dont-tell-you-2ai1</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Write About Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I spent 11 days shipping &lt;strong&gt;AULA&lt;/strong&gt; — an AI tutor that runs Gemma 4 entirely inside the browser for Latin American students without reliable internet. The build forced me to learn things about deploying Gemma 4 in production that the official documentation glosses over.&lt;/p&gt;

&lt;p&gt;This post distills the 5 patterns I wish someone had handed me on day one. Every one of them cost me hours (in some cases, an entire afternoon) to figure out. If you're shipping Gemma 4 to real users on real hardware, this is the playbook I would have wanted.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you want to see the result first, AULA's repo is here: &lt;a href="https://github.com/jpablortiz96/aula" rel="noopener noreferrer"&gt;github.com/jpablortiz96/aula&lt;/a&gt;. The Build with Gemma 4 submission has the full demo. This post is the technical postmortem.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Pattern 1 — MediaPipe is the right runtime, not transformers.js (yet)
&lt;/h2&gt;

&lt;p&gt;If you Google "run Gemma 4 in the browser", you'll mostly find tutorials using &lt;code&gt;transformers.js&lt;/code&gt; (published on npm as &lt;code&gt;@huggingface/transformers&lt;/code&gt;). It's a fantastic library and the obvious starting point. I started there too.&lt;/p&gt;

&lt;p&gt;On my development laptop — a Windows machine with an NVIDIA RTX 3050 (Ampere, 6 GB VRAM) — &lt;code&gt;transformers.js&lt;/code&gt; with WebGPU gave me &lt;strong&gt;2 tokens per second&lt;/strong&gt;. The benchmarks I'd seen online claimed 20-30 tok/s on similar hardware. Something was very wrong.&lt;/p&gt;

&lt;p&gt;After a full afternoon of debugging (chrome://gpu, Task Manager GPU monitor, NVIDIA Control Panel, Vulkan flags, switching to Edge), I found the root cause: on NVIDIA Optimus laptops, &lt;strong&gt;dispatch was routing through the integrated Intel UHD GPU&lt;/strong&gt;, not the discrete NVIDIA. WebGPU's &lt;code&gt;requestAdapter({ powerPreference: 'high-performance' })&lt;/code&gt; is ignored on Windows (&lt;a href="https://bugs.chromium.org/p/chromium/issues/detail?id=369219127" rel="noopener noreferrer"&gt;Chromium bug 369219127&lt;/a&gt;). The model "worked" but ran on the wrong silicon.&lt;/p&gt;
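
&lt;p&gt;A quick way to confirm which GPU the page actually received is to inspect the adapter the browser hands back. The following is a minimal sketch, not AULA's code: &lt;code&gt;isIntelAdapter&lt;/code&gt; and &lt;code&gt;AdapterInfoLike&lt;/code&gt; are hypothetical names, the string matching is only a rough heuristic, and it assumes a browser recent enough to expose &lt;code&gt;adapter.info&lt;/code&gt;.&lt;/p&gt;

```typescript
// Subset of the fields WebGPU exposes on GPUAdapterInfo (assumed shape).
type AdapterInfoLike = { vendor: string; description: string };

// Hypothetical heuristic: does this adapter report itself as an Intel GPU?
function isIntelAdapter(info: AdapterInfoLike): boolean {
  const text = (info.vendor + " " + info.description).toLowerCase();
  return text.includes("intel");
}

// In the browser (not executed in this sketch):
//   const adapter = await navigator.gpu.requestAdapter({ powerPreference: "high-performance" });
//   if (adapter !== null) console.log(adapter.info, isIntelAdapter(adapter.info));
```

&lt;p&gt;On an Optimus laptop, getting &lt;code&gt;true&lt;/code&gt; here despite requesting &lt;code&gt;high-performance&lt;/code&gt; would be consistent with the symptom described above.&lt;/p&gt;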

&lt;p&gt;What fixed it: &lt;strong&gt;migrating to &lt;a href="https://www.npmjs.com/package/@mediapipe/tasks-genai" rel="noopener noreferrer"&gt;@mediapipe/tasks-genai&lt;/a&gt; with the WebGPU delegate.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;FilesetResolver&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;LlmInference&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@mediapipe/tasks-genai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fileset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;FilesetResolver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forGenAiTasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;LlmInference&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createFromOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fileset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;baseOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;modelAssetPath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://huggingface.co/litert-community/gemma-4-e2b-it/resolve/main/gemma-4-e2b-it-int4-web.task&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;maxTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2048&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;topK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same hardware. &lt;strong&gt;Same model. Throughput jumped from 2 tok/s to 14-16 tok/s: a 7-8x speedup, just from switching runtimes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MediaPipe is Google's official runtime for Gemma on edge devices. The team optimized the dispatch path specifically for the WebGPU delegate. It's also the only path that supports the &lt;code&gt;.task&lt;/code&gt; artifact format Google publishes for browser deployment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; if you're targeting consumer hardware in 2026, MediaPipe is the production runtime. &lt;code&gt;transformers.js&lt;/code&gt; is excellent for prototyping but has not yet caught up on dispatch quality across all GPU/OS combinations. Use MediaPipe for the local engine and revisit &lt;code&gt;transformers.js&lt;/code&gt; in 6-12 months.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pattern 2 — Pick the right Gemma 4 variant for the constraint, not the benchmark
&lt;/h2&gt;

&lt;p&gt;Gemma 4 comes in four variants, and the marketing pages emphasize the 31B Dense and 26B MoE as the headline models. For a browser deployment, &lt;strong&gt;the only variant that actually matters is the E2B&lt;/strong&gt; (~2 billion effective parameters, quantized to ~1.5 GB).&lt;/p&gt;

&lt;p&gt;Here's the honest tradeoff matrix I built when picking the model for AULA's local engine:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variant&lt;/th&gt;
&lt;th&gt;Size on disk&lt;/th&gt;
&lt;th&gt;Runs in browser?&lt;/th&gt;
&lt;th&gt;Math/reasoning quality&lt;/th&gt;
&lt;th&gt;When to use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 E2B-IT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~1.5 GB (q4f16)&lt;/td&gt;
&lt;td&gt;✅ Yes, WebGPU&lt;/td&gt;
&lt;td&gt;Good for conversational tutoring&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Local browser deployments&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 E4B-IT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~3 GB (q4f16)&lt;/td&gt;
&lt;td&gt;⚠️ Only on 8 GB+ VRAM GPUs&lt;/td&gt;
&lt;td&gt;Slightly better than E2B&lt;/td&gt;
&lt;td&gt;Mid-range GPUs only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 26B-A4B-IT (MoE)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~13 GB&lt;/td&gt;
&lt;td&gt;❌ Cloud only&lt;/td&gt;
&lt;td&gt;Near-31B quality, lower latency&lt;/td&gt;
&lt;td&gt;Cloud API for structured output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 31B-IT (Dense)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~16 GB&lt;/td&gt;
&lt;td&gt;❌ Cloud only&lt;/td&gt;
&lt;td&gt;Best reasoning&lt;/td&gt;
&lt;td&gt;When latency doesn't matter&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For AULA's offline-first use case, the selection logic was straightforward: &lt;strong&gt;the model has to fit in memory on a Raspberry Pi 5 (8 GB unified memory)&lt;/strong&gt;. E4B is too big once you account for the KV cache and browser overhead. E2B fits with margin.&lt;/p&gt;

&lt;p&gt;The non-obvious lesson: &lt;strong&gt;on my RTX 3050 (6 GB VRAM), I tried to ship with E4B because it scored better on benchmarks&lt;/strong&gt;. The model loaded but spilled over PCIe into shared system memory, dropping inference to ~1.8 tok/s. Switching to E2B (which actually fits in dedicated VRAM) brought me back to 14+ tok/s.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Rule of thumb:&lt;/strong&gt; for in-browser inference, the right model is the largest one that fits entirely in dedicated VRAM after reserving ~1.5 GB for browser/system overhead. Anything larger spills over PCIe into shared system memory and is unusable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For Cloud Boost (the optional half of AULA), I picked &lt;strong&gt;26B-A4B over 31B Dense&lt;/strong&gt; despite the lower parameter count. The mixture-of-experts architecture activates only ~4B parameters per forward pass, giving &lt;strong&gt;2-3x lower latency&lt;/strong&gt; at near-31B quality. For short structured outputs (a quiz JSON, a Mermaid diagram), this latency difference is the difference between "feels instant" and "user gives up".&lt;/p&gt;




&lt;h2&gt;
  
  
  Pattern 3 — Don't force small models into rigid structured output
&lt;/h2&gt;

&lt;p&gt;This is the pattern I had to relearn three times before accepting it.&lt;/p&gt;

&lt;p&gt;Gemma 4 E2B is &lt;em&gt;brilliant&lt;/em&gt; at conversational tasks: math explanations, language tutoring, Socratic dialogue, multi-step reasoning in plain text. It is &lt;strong&gt;not reliable&lt;/strong&gt; at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Producing valid JSON without surrounding prose&lt;/li&gt;
&lt;li&gt;Generating syntactically valid Mermaid diagrams&lt;/li&gt;
&lt;li&gt;Outputting coherent SVG with proper geometry&lt;/li&gt;
&lt;li&gt;Following "respond ONLY with X" instructions consistently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a bug. It's a known property of small open models. The "instruction following" capability scales roughly with parameter count, and at 2B effective parameters, E2B sits at the edge.&lt;/p&gt;

&lt;p&gt;My first three approaches all failed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Stricter prompts.&lt;/strong&gt; "Respond ONLY with valid JSON, no markdown, no prose." This worked 70% of the time. The other 30% of the time, the model added an explanation paragraph or a &lt;code&gt;Here is the JSON:&lt;/code&gt; prefix.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Higher temperature for diversity, lower for structure.&lt;/strong&gt; Marginal improvement, but introduced its own failure modes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tolerant JSON parser that strips fences and reaches for the first &lt;code&gt;{&lt;/code&gt;.&lt;/strong&gt; Helped, but didn't fix the cases where the model produced &lt;em&gt;almost-valid&lt;/em&gt; JSON with unescaped quotes inside string values.&lt;/li&gt;
&lt;/ol&gt;
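
&lt;p&gt;The third approach can be sketched roughly as follows. This is an illustrative reconstruction, not AULA's actual parser, and &lt;code&gt;extractJson&lt;/code&gt; is a hypothetical name. It also shows where the approach tops out: almost-valid JSON still fails at &lt;code&gt;JSON.parse&lt;/code&gt;.&lt;/p&gt;

```typescript
// Hypothetical helper: pull the first JSON object out of a chatty model reply.
function extractJson(reply: string): unknown {
  // Build the markdown fence marker programmatically (three backticks).
  const fence = "`".repeat(3);
  // Drop fence markers, with or without a "json" language tag.
  const withoutFences = reply.split(fence + "json").join("").split(fence).join("");
  // Reach for the outermost braces and ignore any surrounding prose.
  const start = withoutFences.indexOf("{");
  const end = withoutFences.lastIndexOf("}");
  if (start === -1 || end === -1 || start > end) return null;
  try {
    return JSON.parse(withoutFences.slice(start, end + 1));
  } catch {
    // Almost-valid JSON (for example unescaped quotes in a string) lands here.
    return null;
  }
}
```

&lt;p&gt;Stripping prose and fences recovers the easy cases, but no amount of trimming repairs a payload that is malformed inside the braces.&lt;/p&gt;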

&lt;p&gt;What actually worked: &lt;strong&gt;route structured-output features to a larger model in the cloud&lt;/strong&gt; (26B-A4B), keep the local model for conversational features, and &lt;strong&gt;be transparent about the routing in the UI&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In AULA, every screen shows a badge: green for local, blue for cloud. The user always knows which engine answered. This is the design pattern I'd argue for as a general principle:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Don't pretend your small model can do something it can't. Make the limitation a UX surface, not a hidden failure mode.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Here's the shape of the routing logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Routing decision per feature, not per request&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;chooseEngine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Feature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hasApiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;EngineId&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;structuredOutputFeatures&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Feature&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;infinite-practice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;// requires JSON&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;svg-illustration&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;// requires valid SVG&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mermaid-mindmap&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// requires strict syntax&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;interactive-quiz&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;// requires JSON array&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;handwriting-ocr&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// requires vision (E2B is text-only)&lt;/span&gt;
  &lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;structuredOutputFeatures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;hasApiKey&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cloud-boost&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unavailable&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;local&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// chat, voice, calculator, Socratic, etc.&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the user always sees the routing decision, with an honest reason if cloud isn't available:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unavailable&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;showInfoMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;This feature needs Cloud Boost. Add your free Google AI Studio API key in Settings to unlock it. The rest of AULA works offline regardless.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Pattern 4 — &lt;code&gt;LlmInference&lt;/code&gt; is exclusive. Build a queue.
&lt;/h2&gt;

&lt;p&gt;This bit me on day 9 and cost me half a day to diagnose.&lt;/p&gt;

&lt;p&gt;MediaPipe's &lt;code&gt;LlmInference&lt;/code&gt; instance is &lt;strong&gt;a singleton with exclusive access&lt;/strong&gt;. It can process exactly one generation at a time. If you call &lt;code&gt;generateResponse()&lt;/code&gt; while a previous generation is still in flight, you get:&lt;br&gt;
&lt;code&gt;Previous invocation or loading is still ongoing.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;In a single-page app with multiple routes (chat, practice, mind maps), this is easy to trigger:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User starts a long response in /chat&lt;/li&gt;
&lt;li&gt;User navigates to /practice before it finishes&lt;/li&gt;
&lt;li&gt;/practice tries to generate an exercise&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The model is locked. Everything breaks.&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The fix: a FIFO queue with abort propagation across navigations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LocalEngine&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;currentAbort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AbortController&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="kd"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;LlmInference&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="c1"&gt;// the MediaPipe instance used below&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;GenerateOptions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Cancel any in-flight generation&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="c1"&gt;// small buffer&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentAbort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AbortController&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentAbort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shift&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;task&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentAbort&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Recovery path when the model gets stuck&lt;/span&gt;
  &lt;span class="nf"&gt;forceReset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isGenerating&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentAbort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Critical: every component that uses the engine must call &lt;code&gt;abort()&lt;/code&gt; on unmount.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without this cleanup, navigating away mid-generation leaves the model locked, and the next page that wants to generate will silently hang.&lt;/p&gt;
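
&lt;p&gt;The queueing half of this pattern can also be expressed as a promise chain, which avoids tracking a separate busy flag. This is a minimal sketch under assumptions, not AULA's implementation: &lt;code&gt;SerialQueue&lt;/code&gt; is a hypothetical name, &lt;code&gt;job&lt;/code&gt; stands in for a function that calls &lt;code&gt;llm.generateResponse(prompt)&lt;/code&gt;, and the sketch does not attempt cancellation.&lt;/p&gt;

```typescript
// Hypothetical sketch: run async jobs strictly one at a time, in arrival order.
class SerialQueue {
  // Tail of the promise chain; every new job is appended behind it.
  private tail = Promise.resolve();

  // Typed loosely on purpose to keep the sketch short; callers narrow the result.
  enqueue(job: () => unknown) {
    // The job starts only after everything queued before it has settled.
    const result = this.tail.then(job);
    // Swallow the outcome on the chain so one failure does not block later jobs.
    this.tail = result.then(() => undefined, () => undefined);
    // The caller still sees the real value or the real rejection.
    return result;
  }
}
```

&lt;p&gt;Because each caller gets its own promise back, a failed generation rejects only for the route that requested it, and the next queued request still runs.&lt;/p&gt;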




&lt;h2&gt;
  
  
  Pattern 5 — Gemma 4 26B does not stream reliably. Use &lt;code&gt;generateContent&lt;/code&gt;, not &lt;code&gt;streamGenerateContent&lt;/code&gt;.
&lt;/h2&gt;

&lt;p&gt;This one took an afternoon and a careful read of the DevTools Network tab to find.&lt;/p&gt;

&lt;p&gt;The Gemini API exposes two endpoints for Gemma 4 models:&lt;br&gt;
&lt;code&gt;POST .../models/gemma-4-26b-a4b-it:generateContent&lt;/code&gt; ← full response&lt;br&gt;
&lt;code&gt;POST .../models/gemma-4-26b-a4b-it:streamGenerateContent&lt;/code&gt; ← SSE chunks&lt;/p&gt;

&lt;p&gt;For chat use cases, you obviously want streaming. So I wired everything through &lt;code&gt;:streamGenerateContent?alt=sse&lt;/code&gt; and assumed it would Just Work.&lt;/p&gt;

&lt;p&gt;It did, for chat. &lt;strong&gt;It returned &lt;code&gt;400 Bad Request&lt;/code&gt; for AULA's Practice and Mind Map features.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The DevTools investigation revealed: when the prompt requested structured output (JSON, SVG, Mermaid), the streaming endpoint failed with &lt;code&gt;400&lt;/code&gt; while the non-streaming endpoint succeeded with the same payload. I never got a clear root cause from the API — it may be a Gemma-specific quirk in how &lt;code&gt;streamGenerateContent&lt;/code&gt; handles certain &lt;code&gt;responseSchema&lt;/code&gt; configurations or thinking-mode trailers.&lt;/p&gt;
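
&lt;p&gt;For context on what the streaming path involves on the client: with &lt;code&gt;alt=sse&lt;/code&gt;, the response arrives as &lt;code&gt;data:&lt;/code&gt; lines, each carrying one JSON event. The following is a minimal sketch, assuming each event follows the &lt;code&gt;candidates[0].content.parts[0].text&lt;/code&gt; shape of Gemini API responses. &lt;code&gt;textFromSse&lt;/code&gt; is a hypothetical helper, not AULA's code.&lt;/p&gt;

```typescript
// Hypothetical helper: collect the text carried by complete SSE `data:` lines.
function textFromSse(raw: string): string {
  let out = "";
  for (const line of raw.split("\n")) {
    const trimmed = line.trim();
    // Skip blank separator lines and any non-data fields.
    if (!trimmed.startsWith("data:")) continue;
    try {
      const event = JSON.parse(trimmed.slice("data:".length));
      // Assumed event shape; a missing field contributes an empty string.
      out += event?.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
    } catch {
      // Ignore lines that are not complete JSON (e.g. an event split mid-chunk).
    }
  }
  return out;
}
```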

&lt;p&gt;The fix that unblocked everything: &lt;strong&gt;two separate API client paths&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// For chat — streaming, long responses&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;streamChat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;onToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;BASE&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/gemma-4-26b-a4b-it:streamGenerateContent?alt=sse&amp;amp;key=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...]&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// ...parse SSE chunks, call onToken per chunk&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// For structured output — single-shot, short responses&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;cloudGenerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CloudOptions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;BASE&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/gemma-4-26b-a4b-it:generateContent?key=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;systemInstruction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;system&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
      &lt;span class="na"&gt;generationConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.85&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;maxOutputTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;})}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
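  &lt;span class="c1"&gt;// Surface HTTP errors (such as the 400 described above) instead of parsing an error body&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="s2"&gt;generateContent failed: HTTP &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;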
  &lt;span class="c1"&gt;// Filter out "thought" parts (Gemma 4 thinking mode)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;?.[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;parts&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;thought&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; for short structured outputs, streaming buys you nothing. The user is waiting for one complete artifact, not a slow reveal of text. Use &lt;code&gt;generateContent&lt;/code&gt;. Save streaming for genuine chat.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;One more detail worth flagging: Gemma 4 has a &lt;strong&gt;thinking mode&lt;/strong&gt; that emits "thought" parts in the response. If you naively concatenate all &lt;code&gt;parts[].text&lt;/code&gt;, you'll surface the model's chain-of-thought in the user-visible output. Filter on &lt;code&gt;part.thought === true&lt;/code&gt; and skip those. AULA's chat looked very weird until I added that filter — the model was literally showing its work to the student, which is not the goal.&lt;/p&gt;
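
&lt;p&gt;The same filter matters on the streaming path. Here is a minimal sketch, assuming each SSE event arrives as a &lt;code&gt;data:&lt;/code&gt; line whose JSON payload has the same &lt;code&gt;candidates[0].content.parts&lt;/code&gt; shape as the non-streaming response. The helper name &lt;code&gt;visibleTextFromSseLine&lt;/code&gt; is hypothetical and not taken from AULA's source.&lt;/p&gt;

```typescript
// Hypothetical helper for illustration; not taken from AULA's source.
type Part = { text?: string; thought?: boolean };

// Returns the user-visible text carried by one SSE line ("data: {...}").
// Lines that are not data lines, or that do not hold complete JSON, yield "".
function visibleTextFromSseLine(line: string): string {
  if (!line.startsWith("data:")) return "";
  const payload = line.slice("data:".length).trim();
  if (payload === "") return "";

  let parsed: any;
  try {
    parsed = JSON.parse(payload);
  } catch {
    return ""; // incomplete or non-JSON line; the caller may buffer and retry
  }

  // Assumed shape: same candidates/content/parts layout as generateContent.
  const parts: Part[] = parsed?.candidates?.[0]?.content?.parts ?? [];
  return parts
    .filter((p) => p.thought !== true) // drop thinking-mode parts
    .map((p) => p.text ?? "")
    .join("");
}
```

&lt;p&gt;Inside &lt;code&gt;streamChat&lt;/code&gt;, a helper like this would run on every received line, and &lt;code&gt;onToken&lt;/code&gt; would only be called when the result is non-empty.&lt;/p&gt;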




&lt;h2&gt;
  
  
  What this means for developers shipping Gemma 4 today
&lt;/h2&gt;

&lt;p&gt;If you're building with Gemma 4 in 2026, the patterns I'd internalize before writing a single line of code:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;MediaPipe for browser, period.&lt;/strong&gt; Don't waste a week on &lt;code&gt;transformers.js&lt;/code&gt; benchmarks. Migrate or start there.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pick the model that fits in VRAM, not the model that benchmarks best.&lt;/strong&gt; Spilling to PCIe destroys throughput. E2B is the only realistic browser model in 2026.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design routing as a UX surface.&lt;/strong&gt; Small models can't do everything. Make the limitation visible and let the user opt into cloud where it matters. Honesty beats hiding limitations behind retries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Treat &lt;code&gt;LlmInference&lt;/code&gt; as a single-threaded mutex.&lt;/strong&gt; Queue your requests, abort on unmount, expose a recovery path. The cost of not doing this is a frustrating "the AI broke" experience that the user can't diagnose.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming is for chat. &lt;code&gt;generateContent&lt;/code&gt; is for everything else.&lt;/strong&gt; Don't fight the API.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Once I internalized them, these five patterns probably saved me a week of additional debugging. AULA exists because Gemma 4 is genuinely good enough to run in a browser tab, but it only feels good to use because the patterns above turn the rough edges into smooth UX.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'm hopeful about
&lt;/h2&gt;

&lt;p&gt;The interesting thing about all five patterns above: &lt;strong&gt;none of them are about Gemma 4's quality.&lt;/strong&gt; They're about deployment ergonomics. The model itself is remarkable. A 2-billion-parameter open model that runs at 15 tok/s in a browser tab and can hold a real tutoring conversation with a high school student is a thing that genuinely did not exist 18 months ago.&lt;/p&gt;

&lt;p&gt;For my specific use case — students in rural Latin America who have no other access to AI tools — Gemma 4 is the first model that crosses the &lt;strong&gt;practical viability line&lt;/strong&gt;. It's small enough to download once over a school WiFi connection. It's capable enough to teach. It runs offline. It's free.&lt;/p&gt;

&lt;p&gt;If you're working on local-first AI for any underserved population, I'd encourage you to start with Gemma 4. The deployment patterns above will save you a week. The model will do the rest.&lt;/p&gt;

&lt;p&gt;If you want to see what the patterns look like in a finished product, AULA is open source under MIT: &lt;a href="https://github.com/jpablortiz96/aula" rel="noopener noreferrer"&gt;github.com/jpablortiz96/aula&lt;/a&gt;. Pull requests welcome.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About the author:&lt;/strong&gt; I'm a solo founder in Cali, Colombia building educational tech for Latin American students. AULA was built solo in 11 days for this challenge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Companion submission (Build track):&lt;/strong&gt; &lt;a href="https://dev.to/jpablortiz96/aula-the-ai-tutor-that-fits-in-a-browser-tab-built-for-the-students-the-internet-leaves-behind-253n"&gt;AULA — The AI tutor that fits in a browser tab&lt;/a&gt; — live demo, video walkthrough, full architecture.&lt;/p&gt;

&lt;p&gt;🇨🇴 Made in LATAM, for the students the world forgot.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
      <category>webdev</category>
    </item>
    <item>
      <title>AULA — The AI tutor that fits in a browser tab, built for the students the internet leaves behind</title>
      <dc:creator>Juan Pablo Enriquez Ortiz</dc:creator>
      <pubDate>Sat, 23 May 2026 04:13:16 +0000</pubDate>
      <link>https://dev.to/jpablortiz96/aula-the-ai-tutor-that-fits-in-a-browser-tab-built-for-the-students-the-internet-leaves-behind-253n</link>
      <guid>https://dev.to/jpablortiz96/aula-the-ai-tutor-that-fits-in-a-browser-tab-built-for-the-students-the-internet-leaves-behind-253n</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Build with Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AULA&lt;/strong&gt; is a complete AI tutoring platform that runs Google's Gemma 4 entirely inside the browser — no server, no account, no internet required after the first 1.5 GB download. It is designed for the &lt;strong&gt;65+ million Latin American students&lt;/strong&gt; living in areas where reliable internet is the exception, not the norm.&lt;/p&gt;

&lt;p&gt;The premise is simple: if Gemma 4 can run on a Raspberry Pi 5, it can run on a teacher's laptop in rural Boyacá, Colombia. With WebGPU and MediaPipe, this is now possible — and AULA is what that looks like as a finished product.&lt;/p&gt;

&lt;h3&gt;
  
  
  The problem AULA solves
&lt;/h3&gt;

&lt;p&gt;In Latin America, ~40% of students live with unreliable, capped, or non-existent connectivity. ChatGPT, Gemini, Khan Academy's AI tutor — all require a stable connection. The very tools that could close the global education gap are inaccessible exactly where they are needed most.&lt;/p&gt;

&lt;p&gt;AULA flips this: the AI runs &lt;em&gt;on the student's device&lt;/em&gt;, not on a server thousands of miles away.&lt;/p&gt;

&lt;h3&gt;
  
  
  What AULA does — offline (100% local, Gemma 4 E2B)
&lt;/h3&gt;

&lt;p&gt;After loading once, these features work with WiFi off, in airplane mode, in a rural school with no signal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🎓 &lt;strong&gt;Conversational tutor&lt;/strong&gt; — chat with Gemma 4 in natural language. Full LaTeX rendering for math and science. ~15 tokens/sec on a mid-range laptop GPU.&lt;/li&gt;
&lt;li&gt;🧮 &lt;strong&gt;Scientific calculator&lt;/strong&gt; that teaches — visual keypad with trig functions, exponents, roots. Gemma 4 doesn't just solve. It explains the why.&lt;/li&gt;
&lt;li&gt;🎙️ &lt;strong&gt;Voice tutoring (bidirectional)&lt;/strong&gt; — ask by speaking, listen to the response. Optional hands-free mode chains them together.&lt;/li&gt;
&lt;li&gt;🦉 &lt;strong&gt;Socratic mode&lt;/strong&gt; — Gemma 4 stops giving answers and only asks guiding questions. Pedagogy-first.&lt;/li&gt;
&lt;li&gt;🤔 &lt;strong&gt;"Explain it simpler"&lt;/strong&gt; — three escalating reformulation levels on demand.&lt;/li&gt;
&lt;li&gt;💡 &lt;strong&gt;Conceptual error detection&lt;/strong&gt; — Gemma 4 diagnoses &lt;em&gt;which&lt;/em&gt; concept the student misunderstood, not just "wrong, try again".&lt;/li&gt;
&lt;li&gt;📚 &lt;strong&gt;Persistent study sessions&lt;/strong&gt; in IndexedDB. No cloud sync ever.&lt;/li&gt;
&lt;li&gt;♿ &lt;strong&gt;Accessibility first&lt;/strong&gt; — high contrast, large text, easy reading mode (for dyslexia), auto-read responses.&lt;/li&gt;
&lt;li&gt;🌍 &lt;strong&gt;Spanish ↔ English&lt;/strong&gt; — full i18n. System prompts translate, not just the labels.&lt;/li&gt;
&lt;li&gt;🏆 &lt;strong&gt;Local gamification&lt;/strong&gt; — XP, levels, streak, achievements. All in the browser.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What AULA does — Cloud Boost (optional, Gemma 4 26B-A4B)
&lt;/h3&gt;

&lt;p&gt;For features that require strict structured output (which is beyond what a 2B-parameter model can do reliably), AULA routes through the user's own free Google AI Studio API key:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✍️ &lt;strong&gt;Handwritten whiteboard&lt;/strong&gt; — draw equations with finger or mouse, Gemma 4 reads and solves.&lt;/li&gt;
&lt;li&gt;📷 &lt;strong&gt;Photo OCR + reasoning&lt;/strong&gt; — point camera at a printed exercise, get a step-by-step solution.&lt;/li&gt;
&lt;li&gt;♾️ &lt;strong&gt;Infinite adaptive practice&lt;/strong&gt; — exercises that never repeat, with difficulty calibrated dynamically.&lt;/li&gt;
&lt;li&gt;🎯 &lt;strong&gt;Interactive student quiz&lt;/strong&gt; — self-assessment with scoring and per-error conceptual review.&lt;/li&gt;
&lt;li&gt;👩‍🏫 &lt;strong&gt;Teacher mode with PDF export&lt;/strong&gt; — generate quizzes, export student/teacher PDFs ready to print.&lt;/li&gt;
&lt;li&gt;🎨 &lt;strong&gt;SVG illustrations&lt;/strong&gt; — Gemma 4 generates educational diagrams.&lt;/li&gt;
&lt;li&gt;🗺️ &lt;strong&gt;Mermaid mind maps&lt;/strong&gt; — concept diagrams rendered interactively, downloadable as PNG/SVG.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Critical:&lt;/strong&gt; Cloud Boost is &lt;em&gt;always opt-in&lt;/em&gt;. AULA never sends data without an explicit API key configured by the user. The core educational experience never requires the internet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;🎥 &lt;strong&gt;Watch the 2-minute walkthrough:&lt;/strong&gt; &lt;a href="https://youtu.be/d0jN8Kw_Cz4" rel="noopener noreferrer"&gt;https://youtu.be/d0jN8Kw_Cz4&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Live demo:&lt;/strong&gt; &lt;a href="https://aula.run" rel="noopener noreferrer"&gt;https://aula.run&lt;/a&gt; &lt;em&gt;(or local: &lt;code&gt;pnpm dev -p 3100&lt;/code&gt; after cloning)&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Key screenshots
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Chat tutor running 100% locally with full LaTeX rendering&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3h84wj3p1tefo8or7uxk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3h84wj3p1tefo8or7uxk.png" alt="AULA chat with Gemma 4 local" width="757" height="787"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mermaid mind maps generated by Gemma 4 — click to enlarge, download as PNG&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tif53hcthb7renoxfc3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tif53hcthb7renoxfc3.png" alt="Mind map of photosynthesis" width="800" height="897"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SVG illustrations — educational diagrams generated by Gemma 4&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfujjl4gljda0x0t9mdo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfujjl4gljda0x0t9mdo.png" alt="Pythagoras illustration" width="800" height="899"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scientific calculator that explains, powered locally&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbafafksvwozkk8gpy4r0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbafafksvwozkk8gpy4r0.png" alt="Calculator solving sin(π/2) + 2^3" width="635" height="882"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Teacher mode with PDF export — ready for classroom&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5uxwzal5gj95jc36xkm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5uxwzal5gj95jc36xkm.png" alt="Teacher mode quiz" width="790" height="851"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accessibility built-in: high contrast mode&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fny207nftyvar075b5gje.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fny207nftyvar075b5gje.png" alt="High contrast mode" width="635" height="850"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;🔗 &lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/jpablortiz96/aula" rel="noopener noreferrer"&gt;https://github.com/jpablortiz96/aula&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repo includes a comprehensive README with architecture diagrams, hardware benchmarks across devices (Raspberry Pi 5 to RTX 3050 to MacBook M3), full tech stack documentation, and a roadmap for v1.1 through v3.0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;License:&lt;/strong&gt; MIT&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Used Gemma 4
&lt;/h2&gt;

&lt;p&gt;AULA uses a &lt;strong&gt;dual-engine architecture&lt;/strong&gt; with intentional model selection for each tier:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Variant&lt;/th&gt;
&lt;th&gt;Where it runs&lt;/th&gt;
&lt;th&gt;What it powers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 E2B-IT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~1.5 GB (q4f16 quantized)&lt;/td&gt;
&lt;td&gt;Browser, via MediaPipe + WebGPU&lt;/td&gt;
&lt;td&gt;All offline features&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 26B-A4B-IT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cloud (MoE)&lt;/td&gt;
&lt;td&gt;Gemini API&lt;/td&gt;
&lt;td&gt;Structured-output features&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Why Gemma 4 E2B for local
&lt;/h3&gt;

&lt;p&gt;The E2B variant is the only Gemma 4 model that fits realistically on consumer hardware while preserving the multimodal capability path. It runs at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~15 tokens/sec on an NVIDIA RTX 3050 laptop&lt;/li&gt;
&lt;li&gt;~20-25 tokens/sec on a MacBook M3&lt;/li&gt;
&lt;li&gt;~7 tokens/sec on a Raspberry Pi 5 (CPU fallback)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This range covers &lt;strong&gt;every realistic device a Latin American student or teacher might have access to&lt;/strong&gt;, from an $80 SBC to a school laptop. The 31B Dense model would never fit in a browser tab; the 26B MoE requires server-grade resources. E2B is the &lt;em&gt;only&lt;/em&gt; viable choice for the rural offline use case, and that's exactly why I picked it.&lt;/p&gt;
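
&lt;p&gt;To make those rates concrete, here is a rough back-of-the-envelope sketch. The rates are the approximate figures quoted above (using the lower bound for the M3); the 200-token response length is an assumption chosen for illustration, not a measurement from AULA.&lt;/p&gt;

```typescript
// Illustrative arithmetic only; the 200-token answer length is an assumption.
function secondsToGenerate(tokens: number, tokensPerSecond: number): number {
  if (!(tokensPerSecond > 0)) {
    throw new RangeError("tokensPerSecond must be positive");
  }
  return tokens / tokensPerSecond;
}

// Approximate decode rates quoted in the list above.
const devices: [string, number][] = [
  ["Raspberry Pi 5 (CPU fallback)", 7],
  ["RTX 3050 laptop", 15],
  ["MacBook M3", 20],
];

for (const [name, rate] of devices) {
  const seconds = secondsToGenerate(200, rate);
  console.log(`${name}: about ${seconds.toFixed(0)} s for a 200-token answer`);
}
```

&lt;p&gt;Under these assumptions a full answer takes roughly half a minute on the Pi and 10 to 13 seconds on the laptops, which is why token-by-token streaming in the local chat matters so much for perceived speed.&lt;/p&gt;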

&lt;h3&gt;
  
  
  Why Gemma 4 26B-A4B for cloud-enhanced features
&lt;/h3&gt;

&lt;p&gt;Some features in AULA require strict structured output: JSON for quiz exercises, syntactically-valid Mermaid for mind maps, coherent SVG for illustrations. &lt;strong&gt;Small models are unreliable for this&lt;/strong&gt; — they're brilliant at conversation but tend to add prose around JSON, produce malformed SVG, or break Mermaid syntax.&lt;/p&gt;
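
&lt;p&gt;One way to detect the "prose around JSON" failure is a strict parse that refuses anything except a bare JSON document. This is a minimal sketch; &lt;code&gt;parseStrictJson&lt;/code&gt; and its result type are hypothetical names, not part of AULA's source.&lt;/p&gt;

```typescript
// Hypothetical validator for illustration; not taken from AULA's source.
type ParseOutcome =
  | { ok: true; value: unknown }
  | { ok: false; reason: string };

// Accepts only output that is a bare JSON object or array.
// Leading chatter ("Sure! Here is...") or trailing prose is reported as a failure.
function parseStrictJson(modelOutput: string): ParseOutcome {
  const trimmed = modelOutput.trim();
  if (!(trimmed.startsWith("{") || trimmed.startsWith("["))) {
    return { ok: false, reason: "output does not start with a JSON object or array" };
  }
  try {
    return { ok: true, value: JSON.parse(trimmed) };
  } catch {
    return { ok: false, reason: "output is not valid JSON" };
  }
}
```

&lt;p&gt;A check like this keeps malformed output away from the quiz renderer, and a failed parse is a clear signal that the request belongs on the larger model.&lt;/p&gt;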

&lt;p&gt;Rather than fight this limitation or hide it, AULA makes the routing &lt;strong&gt;explicit and visible to the user&lt;/strong&gt;. Every screen shows which engine answered: green badge for local, blue badge for cloud. The 26B-A4B variant gives me near-31B quality at substantially lower latency thanks to its mixture-of-experts architecture — ideal for short structured outputs.&lt;/p&gt;
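
&lt;p&gt;The routing rule itself can be very small. The sketch below assumes a fixed list of features that need structured output and treats a missing API key as "unavailable" rather than silently retrying locally. The feature names and function names are hypothetical; only the green/blue badge colors and the opt-in API key come from the description above.&lt;/p&gt;

```typescript
// Hypothetical routing sketch; feature and function names are illustrative.
type Engine = "local" | "cloud";
type Feature = "chat" | "calculator" | "practice" | "illustration" | "mindmap";

// Features assumed to need strict structured output (JSON, SVG, Mermaid).
const NEEDS_STRUCTURED_OUTPUT: Feature[] = ["practice", "illustration", "mindmap"];

// Cloud is used only when the feature needs it AND the user configured a key.
function pickEngine(feature: Feature, hasApiKey: boolean): Engine | "unavailable" {
  if (!NEEDS_STRUCTURED_OUTPUT.includes(feature)) return "local";
  return hasApiKey ? "cloud" : "unavailable";
}

// Badge shown next to every answer so the user can see which engine replied.
function badgeColor(engine: Engine): "green" | "blue" {
  return engine === "local" ? "green" : "blue";
}
```

&lt;p&gt;Returning an explicit "unavailable" state lets the UI explain why a feature is disabled and invite the user to add a key, instead of producing a broken diagram from the small model.&lt;/p&gt;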

&lt;h3&gt;
  
  
  Technical challenges I solved
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. transformers.js was not viable on NVIDIA Optimus laptops.&lt;/strong&gt;&lt;br&gt;
My first prototype used &lt;code&gt;transformers.js&lt;/code&gt; + WebGPU. On an RTX 3050, I got 2 tokens/sec because dispatch was routing through the iGPU. Migrating to &lt;strong&gt;MediaPipe's WebGPU delegate&lt;/strong&gt; unlocked 14-16 tokens/sec on the same hardware, roughly a 7-8x improvement. MediaPipe is Google's official runtime for Gemma 4 on edge, and the difference is real.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Concurrency on &lt;code&gt;LlmInference&lt;/code&gt; is exclusive.&lt;/strong&gt;&lt;br&gt;
A single MediaPipe &lt;code&gt;LlmInference&lt;/code&gt; instance processes one prompt at a time. When &lt;code&gt;/chat&lt;/code&gt; and &lt;code&gt;/practice&lt;/code&gt; competed for the same singleton, the model locked with &lt;code&gt;Previous invocation or loading is still ongoing&lt;/code&gt;. I implemented a &lt;strong&gt;FIFO queue with abort propagation&lt;/strong&gt; across navigations, plus a &lt;code&gt;forceReset()&lt;/code&gt; recovery path.&lt;/p&gt;
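
&lt;p&gt;The core of that queue can be sketched without any MediaPipe calls. This is a simplified, callback-based illustration, assuming each job receives a &lt;code&gt;finished&lt;/code&gt; callback it must call exactly once. The class and field names are hypothetical, and the real implementation (including &lt;code&gt;forceReset()&lt;/code&gt;) may look quite different.&lt;/p&gt;

```typescript
// Hypothetical sketch; names are illustrative and not taken from AULA's source.
type Job = {
  label: string;
  signal: AbortSignal;
  // Starts the inference; must call `finished` exactly once when it ends.
  start: (finished: () => void) => void;
};

class FifoInferenceGate {
  private pending: Job[] = [];
  private busy = false;

  // Jobs run strictly one at a time, in submission order.
  submit(job: Job): void {
    this.pending.push(job);
    this.pump();
  }

  private pump(): void {
    if (this.busy) return;
    while (this.pending.length > 0) {
      const job = this.pending.shift()!;
      // Skip work whose page was unmounted (aborted) while it waited in line.
      if (job.signal.aborted) continue;
      this.busy = true;
      job.start(() => {
        this.busy = false;
        this.pump();
      });
      return;
    }
  }
}
```

&lt;p&gt;Each page would create an &lt;code&gt;AbortController&lt;/code&gt;, pass its signal with the job, and call &lt;code&gt;abort()&lt;/code&gt; on unmount so stale prompts never reach the model.&lt;/p&gt;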

&lt;p&gt;&lt;strong&gt;3. Gemma 4 26B does not handle structured output reliably over &lt;code&gt;streamGenerateContent&lt;/code&gt;.&lt;/strong&gt;&lt;br&gt;
This took an afternoon of DevTools debugging to identify: for structured-output prompts, calling &lt;code&gt;:streamGenerateContent&lt;/code&gt; returned 400, while &lt;code&gt;:generateContent&lt;/code&gt; (no streaming) worked with the same payload. The fix was a separate &lt;code&gt;cloudNoStream.ts&lt;/code&gt; helper for Practice, Illustrator, and Mermaid, features that don't benefit from streaming anyway since the user is waiting for one complete response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Easy Reading Mode is more than a CSS toggle.&lt;/strong&gt;&lt;br&gt;
For students with dyslexia or reading difficulties, AULA changes both the visual presentation (letter spacing, line height, max-width) &lt;em&gt;and&lt;/em&gt; the system prompt sent to Gemma 4 ("Short sentences. Simple vocabulary. One idea per line."). This is the kind of accessibility that AI uniquely enables — the model adapts its output style, not just the typography.&lt;/p&gt;
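
&lt;p&gt;As a rough sketch of that idea, a single toggle can drive both the typography and the prompt. The instruction text is the one quoted above; the CSS values and the names &lt;code&gt;readingSettings&lt;/code&gt; and &lt;code&gt;ReadingSettings&lt;/code&gt; are assumptions for illustration, not AULA's actual values.&lt;/p&gt;

```typescript
// Hypothetical sketch; CSS values and names are illustrative assumptions.
type ReadingSettings = {
  css: { letterSpacing: string; lineHeight: string; maxWidth: string };
  systemPrompt: string;
};

// Instruction quoted in the text above.
const EASY_READING_INSTRUCTION = "Short sentences. Simple vocabulary. One idea per line.";

// One switch changes both how text is displayed and how the model writes it.
function readingSettings(basePrompt: string, easyReading: boolean): ReadingSettings {
  if (!easyReading) {
    return {
      css: { letterSpacing: "normal", lineHeight: "1.5", maxWidth: "none" },
      systemPrompt: basePrompt,
    };
  }
  return {
    css: { letterSpacing: "0.08em", lineHeight: "1.9", maxWidth: "60ch" },
    systemPrompt: `${basePrompt}\n${EASY_READING_INSTRUCTION}`,
  };
}
```

&lt;p&gt;Keeping both concerns behind one function makes it hard to ship a mode that changes the spacing but forgets to change the model's writing style.&lt;/p&gt;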

&lt;h3&gt;
  
  
  What Gemma 4 unlocked that wasn't possible 18 months ago
&lt;/h3&gt;

&lt;p&gt;Browser-native inference at this quality was genuinely impossible until WebGPU stabilized. AULA is &lt;strong&gt;only buildable in 2026&lt;/strong&gt;. The combination of Gemma 4's open weights, WebGPU's GPU access, and MediaPipe's optimized runtime is what makes a Pi-friendly AI tutor a real thing, not a thought experiment.&lt;/p&gt;

&lt;p&gt;For 65 million students in Latin America who have been excluded from the AI revolution, this matters more than I can describe in this post.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tech stack:&lt;/strong&gt; Next.js 15, TypeScript strict, Tailwind v4, MediaPipe LLM Inference, WebGPU, Gemini API (REST + SSE), Zustand, IndexedDB, jsPDF, Mermaid, tesseract.js, Web Speech API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Built solo in 11 days&lt;/strong&gt; for the DEV.to Gemma 4 Challenge.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;AULA is open source under MIT. Fork it, run it in your school, contribute to it. If you're a teacher in a low-connectivity region and want help deploying AULA, open an issue on GitHub.&lt;/p&gt;

&lt;p&gt;🇨🇴 Made in LATAM, for the students the world forgot.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
    </item>
    <item>
      <title>Building AccessBridge AI: How 5 AI Agents Collaborate to Make the Web Accessible</title>
      <dc:creator>Juan Pablo Enriquez Ortiz</dc:creator>
      <pubDate>Sat, 28 Mar 2026 02:04:39 +0000</pubDate>
      <link>https://dev.to/jpablortiz96/building-accessbridge-ai-how-5-ai-agents-collaborate-to-make-the-web-accessible-24kf</link>
      <guid>https://dev.to/jpablortiz96/building-accessbridge-ai-how-5-ai-agents-collaborate-to-make-the-web-accessible-24kf</guid>
      <description>&lt;p&gt;&lt;em&gt;Built for the JS AI Build-a-thon 2026 — Agents for Impact&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem That Inspired Us
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;96.3% of the top million websites fail basic accessibility standards.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That statistic, from the 2023 WebAIM Million report, stopped me cold. We're not talking about edge cases; we're talking about the overwhelming majority of the web being effectively inaccessible to 1.3 billion people who live with some form of disability.&lt;/p&gt;

&lt;p&gt;The tools that exist today are part of the problem. Axe, WAVE, Lighthouse — these are excellent auditors. They'll tell you that you have 23 accessibility violations. What they won't do is fix a single one of them. The burden always falls back on the developer, who may not have the time, budget, or expertise to address every flag.&lt;/p&gt;

&lt;p&gt;We wanted to change the question from &lt;em&gt;"Where are the problems?"&lt;/em&gt; to &lt;em&gt;"Here's the fixed version — would you like to use it?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That's AccessBridge AI.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Built
&lt;/h2&gt;

&lt;p&gt;AccessBridge AI is a multi-agent system where 5 specialized AI agents collaborate in real time to transform any web page into universally accessible content. You paste a URL. Fifteen seconds later, you get back:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An &lt;strong&gt;accessibility score&lt;/strong&gt; (before and after) on a 0-100 scale&lt;/li&gt;
&lt;li&gt;A list of every issue found, with the agent that found it and the confidence score&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;automatically transformed HTML file&lt;/strong&gt; with fixes applied&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;full decision log&lt;/strong&gt; explaining every choice the system made&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;WCAG breakdown&lt;/strong&gt; across all four principles: Perceivable, Operable, Understandable, Robust&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you analyze a URL, here's what happens under the hood:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The &lt;strong&gt;Orchestrator&lt;/strong&gt; fetches the HTML server-side (15s timeout, custom User-Agent)&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Scanner&lt;/strong&gt;, &lt;strong&gt;Vision&lt;/strong&gt;, &lt;strong&gt;Simplifier&lt;/strong&gt;, and &lt;strong&gt;Navigator&lt;/strong&gt; agents all run in parallel&lt;/li&gt;
&lt;li&gt;The Orchestrator &lt;strong&gt;resolves conflicts&lt;/strong&gt; between agents (more on this below)&lt;/li&gt;
&lt;li&gt;High-confidence fixes are &lt;strong&gt;automatically applied&lt;/strong&gt; to the HTML&lt;/li&gt;
&lt;li&gt;Low-confidence suggestions are flagged for &lt;strong&gt;human review&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Scores are calculated and the full result is returned to the UI&lt;/li&gt;
&lt;/ol&gt;
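
&lt;p&gt;Steps 4 and 5 amount to splitting proposed fixes by confidence. Here is a minimal sketch of that split. The &lt;code&gt;ProposedFix&lt;/code&gt; type, the function name, and the 0.8 threshold are assumptions for illustration; the actual cutoff used by AccessBridge AI is not stated here.&lt;/p&gt;

```typescript
// Hypothetical sketch; the type, names, and 0.8 threshold are assumptions.
type ProposedFix = { id: string; agent: string; confidence: number };

// Fixes at or above the threshold are applied automatically;
// everything else is queued for human review.
function triageFixes(fixes: ProposedFix[], threshold = 0.8) {
  const autoApply = fixes.filter((f) => f.confidence >= threshold);
  const needsReview = fixes.filter((f) => threshold > f.confidence);
  return { autoApply, needsReview };
}

const result = triageFixes([
  { id: "missing-alt-hero", agent: "Vision", confidence: 0.93 },
  { id: "rewrite-legal-paragraph", agent: "Simplifier", confidence: 0.55 },
]);
console.log(result.autoApply.map((f) => f.id));
console.log(result.needsReview.map((f) => f.id));
```

&lt;p&gt;Keeping the threshold as a parameter makes it easy to be more conservative on pages where an automatic change could alter meaning, such as legal or medical text.&lt;/p&gt;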

&lt;p&gt;On our test runs: average score improvement of &lt;strong&gt;+31 to +42 points&lt;/strong&gt;, depending on how accessibility-challenged the original page was.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture Deep Dive
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The BaseAgent Contract
&lt;/h3&gt;

&lt;p&gt;Every agent in the system implements a single interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;BaseAgent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;AgentResult&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Every agent receives raw HTML and a URL (plus an optional &lt;code&gt;context&lt;/code&gt; object), and returns a structured &lt;code&gt;AgentResult&lt;/code&gt; containing the issues found, the fixes proposed, metadata, timing, and a confidence score. This contract is what makes the system composable: swapping the cloud Vision agent for an offline heuristic agent requires zero changes to the Orchestrator.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;AgentResult&lt;/code&gt; shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;AgentResult&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AccessibilityIssue&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="nl"&gt;fixes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;attribute&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;oldValue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;newValue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;startTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;endTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every fix carries a &lt;code&gt;selector&lt;/code&gt; (CSS selector targeting the element), the old value, the new value, and — critically — a human-readable &lt;code&gt;reason&lt;/code&gt;. This is what powers the Decision Log in the UI.&lt;/p&gt;
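&lt;p&gt;As a rough illustration of how a fix record could become a Decision Log entry, here is a minimal sketch. The &lt;code&gt;toDecisionLogLine&lt;/code&gt; helper, the wording of the output, and the example values are hypothetical; only the fix fields come from &lt;code&gt;AgentResult&lt;/code&gt;.&lt;/p&gt;

```typescript
// Same fix shape as AgentResult.fixes in the article.
interface Fix {
  selector: string;
  attribute?: string;
  oldValue: string;
  newValue: string;
  reason: string;
}

// Hypothetical helper: builds one human-readable Decision Log line per fix.
function toDecisionLogLine(agentName: string, fix: Fix): string {
  const target = fix.attribute ? `${fix.selector} [${fix.attribute}]` : fix.selector;
  const before = fix.oldValue === '' ? '(empty)' : `"${fix.oldValue}"`;
  return `${agentName} changed ${target}: ${before} to "${fix.newValue}" because ${fix.reason}`;
}

// Example values are made up for illustration.
const exampleFix: Fix = {
  selector: 'img:nth-of-type(3)',
  attribute: 'alt',
  oldValue: '',
  newValue: 'Students working together in a classroom',
  reason: 'image had no alt text (WCAG 1.1.1)',
};

const logLine = toDecisionLogLine('Vision', exampleFix);
console.log(logLine);
```

&lt;p&gt;Because &lt;code&gt;reason&lt;/code&gt; travels with the fix itself, the log line needs no lookup back into the agent that produced it.&lt;/p&gt;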

&lt;h3&gt;
  
  
  The &lt;code&gt;isEnhancement&lt;/code&gt; Flag: An Honest Score Model
&lt;/h3&gt;

&lt;p&gt;This is one of the subtler design decisions, and it took three iterations to get right.&lt;/p&gt;

&lt;p&gt;The problem: Vision and Simplifier agents find &lt;em&gt;opportunities&lt;/em&gt; — images that could have better alt text, paragraphs that could be simpler. These aren't pre-existing defects that the website owner created. They're improvements AccessBridge can make. If we counted them in the &lt;code&gt;scoreBefore&lt;/code&gt; calculation, we'd be artificially penalizing the site for things it never claimed to do.&lt;/p&gt;

&lt;p&gt;The solution: an &lt;code&gt;isEnhancement&lt;/code&gt; flag on &lt;code&gt;AccessibilityIssue&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;AccessibilityIssue&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
  &lt;span class="nl"&gt;fixApplied&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="cm"&gt;/** True for Vision / Simplifier issues that represent *improvements*
   *  AccessBridge found, not pre-existing defects. These are shown in the
   *  UI but never penalise scoreBefore, and their fixes (if applied)
   *  add to scoreAfter. */&lt;/span&gt;
  &lt;span class="nl"&gt;isEnhancement&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



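&lt;p&gt;The scoring model then becomes additive. Applied fixes only add points, so &lt;code&gt;scoreAfter&lt;/code&gt; can never fall below &lt;code&gt;scoreBefore&lt;/code&gt;:&lt;br&gt;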
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// scoreBefore: only real pre-existing defects (Scanner + Navigator)&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;calcScoreBefore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IssueLike&lt;/span&gt;&lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;baseline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isEnhancement&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SCANNER&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;NAVIGATOR&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;baseline&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;severity&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;critical&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;severity&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;major&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;                              &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// scoreAfter: scoreBefore + points earned per applied fix&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;FIX_POINTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;vision&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// contextual alt text&lt;/span&gt;
  &lt;span class="na"&gt;navigator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// structural fixes have high WCAG impact&lt;/span&gt;
  &lt;span class="na"&gt;simplifier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// readability improvements&lt;/span&gt;
  &lt;span class="na"&gt;scanner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;calcScoreAfter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IssueLike&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="nx"&gt;before&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;gain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fixApplied&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;gain&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;FIX_POINTS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;before&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;gain&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Navigator earns the most points per fix (4) because structural changes, such as adding landmark regions, fixing heading hierarchy, and inserting skip links, have the biggest real-world impact for keyboard and screen reader users.&lt;/p&gt;
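&lt;p&gt;To see the arithmetic end to end, here is a self-contained sketch that condenses the two functions above and runs them on a made-up page. The trimmed &lt;code&gt;IssueLike&lt;/code&gt; shape, the lowercase agent names, the &lt;code&gt;'minor'&lt;/code&gt; severity label, and the sample issues are assumptions for illustration.&lt;/p&gt;

```typescript
type Severity = 'critical' | 'major' | 'minor';
type Agent = 'scanner' | 'navigator' | 'vision' | 'simplifier';

// Trimmed-down issue shape, assumed for this sketch.
interface IssueLike {
  agentType: Agent;
  severity: Severity;
  fixApplied: boolean;
  isEnhancement?: boolean;
}

const PENALTY = { critical: 10, major: 5, minor: 2 };
const FIX_POINTS = { vision: 3, navigator: 4, simplifier: 2, scanner: 3 };

function clamp(score: number): number {
  return Math.max(0, Math.min(100, score));
}

// Only pre-existing defects from Scanner and Navigator lower the starting score.
function calcScoreBefore(issues: IssueLike[]): number {
  let score = 100;
  for (const issue of issues) {
    const isBaselineAgent = issue.agentType === 'scanner' || issue.agentType === 'navigator';
    if (issue.isEnhancement || !isBaselineAgent) continue; // enhancements never penalise
    score -= PENALTY[issue.severity];
  }
  return clamp(score);
}

// Every applied fix, enhancement or not, adds its agent's points.
function calcScoreAfter(issues: IssueLike[], before: number): number {
  let gain = 0;
  for (const issue of issues) {
    if (issue.fixApplied) gain += FIX_POINTS[issue.agentType];
  }
  return clamp(before + gain);
}

// Hypothetical page: four real defects and two enhancements.
const sampleIssues: IssueLike[] = [
  { agentType: 'scanner', severity: 'critical', fixApplied: true },
  { agentType: 'scanner', severity: 'major', fixApplied: true },
  { agentType: 'navigator', severity: 'major', fixApplied: true },
  { agentType: 'navigator', severity: 'minor', fixApplied: false },
  { agentType: 'vision', severity: 'minor', fixApplied: true, isEnhancement: true },
  { agentType: 'simplifier', severity: 'minor', fixApplied: true, isEnhancement: true },
];

const scoreBefore = calcScoreBefore(sampleIssues);             // 100 - 10 - 5 - 5 - 2 = 78
const scoreAfter = calcScoreAfter(sampleIssues, scoreBefore);  // 78 + 3 + 3 + 4 + 3 + 2 = 93
console.log(scoreBefore, scoreAfter);
```

&lt;p&gt;The two enhancement issues leave &lt;code&gt;scoreBefore&lt;/code&gt; untouched but still contribute 5 points to &lt;code&gt;scoreAfter&lt;/code&gt; once their fixes are applied.&lt;/p&gt;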

&lt;h3&gt;
  
  
  Parallel Execution via Promise.all
&lt;/h3&gt;

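&lt;p&gt;All four specialist agents run concurrently. The Orchestrator wraps each call in a try/catch that returns &lt;code&gt;null&lt;/code&gt; on failure, so one failing agent (e.g., an Azure timeout) never rejects the &lt;code&gt;Promise.all&lt;/code&gt; and doesn't bring down the entire analysis:&lt;br&gt;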
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;settled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emitEvent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;WORKING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; started analyzing…`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emitEvent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DONE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; found &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; issues`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;issueCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;fixCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fixes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emitEvent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ERROR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; failed: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each &lt;code&gt;emitEvent&lt;/code&gt; call feeds the real-time agent timeline in the UI via an &lt;code&gt;EventEmitter&lt;/code&gt; pattern — the Orchestrator extends Node's &lt;code&gt;EventEmitter&lt;/code&gt;, and the API route streams events back to the browser using a readable stream.&lt;/p&gt;
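&lt;p&gt;The failure isolation in the snippet above can be checked in a few lines. This is a stripped-down sketch with two stand-in agents, one of which always throws; the agent objects and the &lt;code&gt;runAll&lt;/code&gt; function are hypothetical, and event emission is left out. The built-in &lt;code&gt;Promise.allSettled&lt;/code&gt; would be another way to get the same isolation.&lt;/p&gt;

```typescript
// Two hypothetical agents: one succeeds, one simulates a cloud timeout.
const agents = [
  {
    name: 'Scanner',
    analyze: async (html: string) => (html.includes('alt=') ? [] : ['missing alt text']),
  },
  {
    name: 'Vision',
    analyze: async (_html: string) => {
      throw new Error('Azure timeout');
    },
  },
];

async function runAll(html: string) {
  const errors: string[] = [];
  const settled = await Promise.all(
    agents.map(async (agent) => {
      try {
        return { agent: agent.name, issues: await agent.analyze(html) };
      } catch (error) {
        // Record the failure and return null so Promise.all itself never rejects.
        const msg = error instanceof Error ? error.message : String(error);
        errors.push(`${agent.name} failed: ${msg}`);
        return null;
      }
    }),
  );

  // Keep only the agents that produced a result.
  const succeeded: string[] = [];
  for (const entry of settled) {
    if (entry !== null) succeeded.push(entry.agent);
  }
  return { succeeded, errors };
}

runAll('plain text page').then((out) => console.log(out));
```

&lt;p&gt;The Scanner result survives even though Vision threw, which is the behaviour the Orchestrator relies on.&lt;/p&gt;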

&lt;h3&gt;
  
  
  The Conflict Resolution Engine
&lt;/h3&gt;

&lt;p&gt;Agents running in parallel will inevitably step on each other's toes. We handle two conflict types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Type 1: Same WCAG rule, same element, different agents.&lt;/strong&gt;&lt;br&gt;
The Scanner might flag &lt;code&gt;img:nth-of-type(3)&lt;/code&gt; for missing alt text (WCAG 1.1.1), and so might the Navigator. We deduplicate by keeping the first reporter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;seenIssues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;issue&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;::&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wcagRule&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;seenIssues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Log conflict, first-reporter wins&lt;/span&gt;
      &lt;span class="nx"&gt;conflicts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;winner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;First-reporter wins&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;seenIssues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentType&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
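&lt;p&gt;The excerpt above only shows the conflict being logged. A self-contained sketch of the full first-reporter-wins pass, including dropping the duplicate, might look like this. The &lt;code&gt;dedupeIssues&lt;/code&gt; function, the &lt;code&gt;loser&lt;/code&gt; field, and the sample issues are hypothetical.&lt;/p&gt;

```typescript
interface ReportedIssue {
  agentType: string;
  selector: string;
  wcagRule: string;
}

interface Conflict {
  key: string;
  winner: string;
  loser: string;
}

// Keeps the first report for each selector + rule pair.
// Later reports from a different agent are logged as conflicts;
// repeats from the same agent are dropped silently.
function dedupeIssues(issues: ReportedIssue[]) {
  const seen: { [key: string]: ReportedIssue | undefined } = {};
  const kept: ReportedIssue[] = [];
  const conflicts: Conflict[] = [];

  for (const issue of issues) {
    const key = `${issue.selector}::${issue.wcagRule}`;
    const existing = seen[key];
    if (existing === undefined) {
      seen[key] = issue;
      kept.push(issue);
    } else if (existing.agentType !== issue.agentType) {
      conflicts.push({ key, winner: existing.agentType, loser: issue.agentType });
    }
  }
  return { kept, conflicts };
}

// Made-up input: Scanner and Navigator both flag the same image for WCAG 1.1.1.
const deduped = dedupeIssues([
  { agentType: 'scanner', selector: 'img:nth-of-type(3)', wcagRule: '1.1.1' },
  { agentType: 'navigator', selector: 'img:nth-of-type(3)', wcagRule: '1.1.1' },
  { agentType: 'navigator', selector: 'main', wcagRule: '1.3.1' },
]);
console.log(deduped.kept.length, deduped.conflicts);
```

&lt;p&gt;One detail worth keeping in mind: &lt;code&gt;Promise.all&lt;/code&gt; returns results in the order of the input array, so if &lt;code&gt;results&lt;/code&gt; comes straight from it, "first reporter" means first in the agent list rather than first to finish.&lt;/p&gt;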



&lt;p&gt;&lt;strong&gt;Type 2: Vision vs Simplifier — the context preservation conflict.&lt;/strong&gt;&lt;br&gt;
This is the interesting one. Imagine Vision generates the alt text: &lt;em&gt;"Promotes transforming your future through education and growth opportunities"&lt;/em&gt; for an image inside a paragraph. Then Simplifier comes along and rewrites that paragraph to be shorter. Now the alt text no longer makes sense in context — screen reader users would hear the simplified text followed by the original (now out-of-context) alt text.&lt;/p&gt;

&lt;p&gt;Our rule: &lt;strong&gt;Vision always wins over Simplifier when their fixes overlap.&lt;/strong&gt; If a Vision-fixed image lives inside a paragraph that Simplifier wants to rewrite, the Simplifier fix for that paragraph is blocked:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Find text blocks where Vision has fixed an img inside&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;simplFix&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;simplifierResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fixes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;$block&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;simplFix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;imgSel&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;visionImgSelectors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;$block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;imgSel&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Vision wins — block the Simplifier fix&lt;/span&gt;
      &lt;span class="nx"&gt;blockedSimplifierSelectors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;simplFix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;conflicts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;winner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;VISION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Alt text generated by Vision Agent is calibrated to the image&lt;/span&gt;&lt;span class="se"&gt;\'&lt;/span&gt;&lt;span class="s1"&gt;s &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;surrounding context. Rewriting that context could make the alt text &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;misleading for screen reader users.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This conflict — and its resolution — is recorded in the Decision Log and surfaced in the Responsible AI panel so users can understand why a particular fix wasn't applied.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Secret Sauce: Contextual Alt Text
&lt;/h2&gt;

&lt;p&gt;The Vision Agent is where Azure OpenAI earns its place in the system.&lt;/p&gt;

&lt;p&gt;Most accessibility scanners will tell you: "This image has no alt text." The best ones will say: "Add meaningful alt text." But what does "meaningful" mean for a specific image on a specific page?&lt;/p&gt;

&lt;p&gt;Before even calling the API, we extract rich context from the DOM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;ExtractedContext&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;imageUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;imageType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;decorative&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;functional&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;informative&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;heading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// nearest ancestor or preceding h1-h6&lt;/span&gt;
  &lt;span class="nl"&gt;surroundingText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// text content of parent element&lt;/span&gt;
  &lt;span class="nl"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// figcaption if present&lt;/span&gt;
  &lt;span class="nl"&gt;linkText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// text of wrapping &amp;lt;a&amp;gt; if present&lt;/span&gt;
  &lt;span class="nl"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;           &lt;span class="c1"&gt;// title attribute&lt;/span&gt;
  &lt;span class="nl"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// the raw HTML element&lt;/span&gt;
  &lt;span class="nl"&gt;currentAlt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent classifies each image into one of three roles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Decorative&lt;/strong&gt; — purely visual, no information content → &lt;code&gt;alt=""&lt;/code&gt; (handled by Scanner)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Functional&lt;/strong&gt; — inside a link or button → alt text describes the destination/action&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Informative&lt;/strong&gt; — content image → alt text describes what the image communicates&lt;/li&gt;
&lt;/ul&gt;
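&lt;p&gt;A minimal sketch of this three-way classification, assuming a hypothetical &lt;code&gt;ImageFacts&lt;/code&gt; shape and a hypothetical &lt;code&gt;classifyImageRole&lt;/code&gt; function. The decorative signal used here (&lt;code&gt;role="presentation"&lt;/code&gt; or &lt;code&gt;aria-hidden="true"&lt;/code&gt;) and the order of the checks are assumptions for illustration, not necessarily how the agent decides.&lt;/p&gt;

```typescript
type ImageRole = 'decorative' | 'functional' | 'informative';

// Hypothetical, simplified facts about one image (not the ExtractedContext interface).
interface ImageFacts {
  insideLinkOrButton: boolean; // the image is wrapped by a link or a button
  markedPresentational: boolean; // assumption: role="presentation" or aria-hidden="true"
}

function classifyImageRole(facts: ImageFacts): ImageRole {
  // Functional is checked first: an image that triggers navigation or an action
  // needs alt text that describes the destination or action.
  if (facts.insideLinkOrButton) return 'functional';
  // Assumed decorative signal; such images get an empty alt attribute.
  if (facts.markedPresentational) return 'decorative';
  // Everything else is treated as a content image.
  return 'informative';
}

console.log(classifyImageRole({ insideLinkOrButton: true, markedPresentational: false }));
```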

&lt;p&gt;This role classification shapes the system prompt sent to GPT-4o:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are an accessibility expert generating alt text for a web image.

Image role: FUNCTIONAL (this image is inside a link or button)
For functional images, describe the DESTINATION or ACTION, not just what you see.
Generate alt text that a screen reader user would find helpful.

RULES:
- Be concise (under 125 characters)
- Describe PURPOSE, not visual appearance
- If it's functional, what does it DO or WHERE does it go?
- Do NOT start with "Image of", "Picture of", "Photo of"
- Do NOT include quotes in your response

Context:
- Surrounding text: "Learn more about our engineering bootcamp programs"
- Link text: "Apply now"
- Nearest heading: "Transform Your Career in Tech"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; &lt;em&gt;"Apply for engineering bootcamp — Transform Your Career in Tech"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Without this context, a generic vision model might return: &lt;em&gt;"A button with text."&lt;/em&gt;&lt;/p&gt;
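&lt;p&gt;One way to assemble such a prompt from the extracted context is sketched below. The names &lt;code&gt;buildAltTextPrompt&lt;/code&gt;, &lt;code&gt;roleGuidance&lt;/code&gt;, and &lt;code&gt;PromptContext&lt;/code&gt; are hypothetical, the INFORMATIVE wording is an assumption (only the FUNCTIONAL wording appears above), and the RULES section is left out for brevity.&lt;/p&gt;

```typescript
// Decorative images are handled by the Scanner, so only two roles reach the prompt.
type PromptRole = 'functional' | 'informative';

// Hypothetical subset of the extracted context used in the prompt.
interface PromptContext {
  surroundingText: string;
  linkText: string;
  heading: string;
}

function roleGuidance(role: PromptRole): string {
  if (role === 'functional') {
    return 'Image role: FUNCTIONAL (this image is inside a link or button)\n' +
      'For functional images, describe the DESTINATION or ACTION, not just what you see.';
  }
  // Assumed wording for the informative case.
  return 'Image role: INFORMATIVE (this image carries content)\n' +
    'For informative images, describe what the image communicates.';
}

function buildAltTextPrompt(role: PromptRole, ctx: PromptContext): string {
  const contextLines: string[] = [];
  // Skip empty fields so the model is not given blank context entries.
  if (ctx.surroundingText) contextLines.push(`- Surrounding text: "${ctx.surroundingText}"`);
  if (ctx.linkText) contextLines.push(`- Link text: "${ctx.linkText}"`);
  if (ctx.heading) contextLines.push(`- Nearest heading: "${ctx.heading}"`);
  return [
    'You are an accessibility expert generating alt text for a web image.',
    '',
    roleGuidance(role),
    '',
    'Context:',
    ...contextLines,
  ].join('\n');
}

console.log(buildAltTextPrompt('functional', {
  surroundingText: 'Learn more about our engineering bootcamp programs',
  linkText: 'Apply now',
  heading: 'Transform Your Career in Tech',
}));
```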

&lt;p&gt;The agent also penalizes its own confidence score when context is thin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hasContext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;heading&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;surroundingText&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;caption&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;linkText&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;hasContext&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mf"&gt;0.88&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.72&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both values clear the auto-apply threshold of 0.5. Any fix whose confidence falls below 0.5 is surfaced as a suggestion and never auto-applied. This is human-in-the-loop by design: the system acknowledges its own uncertainty.&lt;/p&gt;




&lt;h2&gt;
  
  
  Going Offline: Accessibility Without Internet
&lt;/h2&gt;

&lt;p&gt;We built two fully functional modes from day one, and the offline mode is not a degraded fallback — it's a genuine capability with a specific use case.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt; Because the communities that most need accessibility tooling — nonprofits, government agencies in emerging markets, small educational institutions — often have unreliable or metered internet connectivity. A tool that stops working without cloud connectivity isn't truly accessible.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;☁️ Cloud&lt;/th&gt;
&lt;th&gt;📡 Offline&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Scanner (20+ WCAG rules)&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Full (same code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Navigator (structure)&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Full (same code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vision (alt text)&lt;/td&gt;
&lt;td&gt;AI-powered via GPT-4o&lt;/td&gt;
&lt;td&gt;5-tier heuristic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Simplifier (readability)&lt;/td&gt;
&lt;td&gt;AI rewriting&lt;/td&gt;
&lt;td&gt;Deterministic splitting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Typical analysis time&lt;/td&gt;
&lt;td&gt;~12 seconds&lt;/td&gt;
&lt;td&gt;~2 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privacy&lt;/td&gt;
&lt;td&gt;Processed via Azure&lt;/td&gt;
&lt;td&gt;Zero external requests&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Offline Vision Heuristic
&lt;/h3&gt;

&lt;p&gt;When no API key is present, the Vision agent falls back to a 5-tier priority system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tier 1: &amp;lt;img&amp;gt; inside &amp;lt;a&amp;gt;
  → "Link to {link text}" or "Link to {domain name}"
  Rationale: functional images communicate navigation intent

Tier 2: &amp;lt;figure&amp;gt; with &amp;lt;figcaption&amp;gt;
  → Use the caption verbatim (the author already wrote it)

Tier 3: Meaningful filename
  → "hero-education-program.jpg" → "Hero education program image"
  (strip extension, convert hyphens/underscores to spaces, capitalize the first word, append "image")

Tier 4: Nearest heading in the DOM
  → "Image related to: {heading text}"

Tier 5: Image URL domain
  → "Image — cdn.example.com"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All offline Vision issues are marked &lt;code&gt;isEnhancement: true&lt;/code&gt; with confidence &lt;code&gt;0.5&lt;/code&gt;, which means they're auto-applied (the threshold is &lt;code&gt;≥ 0.5&lt;/code&gt;) but don't penalize the before-score.&lt;/p&gt;
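&lt;p&gt;The five tiers can be sketched as a single fallback function. The name &lt;code&gt;generateFallbackAlt&lt;/code&gt; comes from a prompt quoted later in this post, but the body below is an illustration, not the project's code: the &lt;code&gt;FallbackContext&lt;/code&gt; shape, the &lt;code&gt;linkHref&lt;/code&gt; field, and the test for a "meaningful" filename (a run of three or more letters and no run of four or more digits) are all assumptions.&lt;/p&gt;

```typescript
// Hypothetical input shape; linkHref is an assumed field ('' when the image is not in a link).
interface FallbackContext {
  imageUrl: string;
  filename: string;
  linkHref: string;
  linkText: string;
  caption: string;
  heading: string;
}

function hostnameOf(url: string): string {
  try {
    return new URL(url).hostname;
  } catch {
    return ''; // relative or malformed URL
  }
}

function altFromFilename(filename: string): string {
  const words = filename
    .replace(/\.[a-z0-9]+$/i, '') // strip the extension
    .replace(/[-_]+/g, ' ') // hyphens and underscores become spaces
    .trim();
  // Assumed heuristic: reject camera-style names such as IMG_1234.
  const meaningful = /[a-z]{3,}/i.test(words) ? !/\d{4,}/.test(words) : false;
  if (!meaningful) return '';
  return words.charAt(0).toUpperCase() + words.slice(1) + ' image';
}

function generateFallbackAlt(ctx: FallbackContext): string {
  // Tier 1: image inside a link, described by link text or target domain.
  if (ctx.linkHref) {
    const target = ctx.linkText || hostnameOf(ctx.linkHref);
    if (target) return `Link to ${target}`;
  }
  // Tier 2: reuse the author's figcaption verbatim.
  if (ctx.caption) return ctx.caption;
  // Tier 3: meaningful filename.
  const fromFilename = altFromFilename(ctx.filename);
  if (fromFilename) return fromFilename;
  // Tier 4: nearest heading.
  if (ctx.heading) return `Image related to: ${ctx.heading}`;
  // Tier 5: image URL domain (\u2014 is the dash used in the tier list above).
  const host = hostnameOf(ctx.imageUrl);
  return host ? 'Image \u2014 ' + host : 'Image';
}

console.log(generateFallbackAlt({
  imageUrl: 'https://cdn.example.com/img/hero-education-program.jpg',
  filename: 'hero-education-program.jpg',
  linkHref: '',
  linkText: '',
  caption: '',
  heading: '',
}));
```

&lt;p&gt;Because each tier returns as soon as it produces text, the order of the checks is the priority order.&lt;/p&gt;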

&lt;h3&gt;
  
  
  The Offline Simplifier
&lt;/h3&gt;

&lt;p&gt;The offline Simplifier uses a deterministic algorithm instead of calling GPT-4o. For any &lt;code&gt;&amp;lt;p&amp;gt;&lt;/code&gt; element with a sentence over 30 words, it attempts a three-pass split:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pass 1 — Natural break (comma near midpoint):
  Find the comma closest to the sentence midpoint, within ±30% of its length.
  "The program, which was founded in 2019, has helped over 1,000 students..."
  → Split at the comma before "has"

Pass 2 — Conjunction split:
  Find the first coordinating/subordinating conjunction after the midpoint:
  (and, but, which, because, however, although, while, whereas...)
  → Split before the conjunction, add a period

Pass 3 — Hard midpoint:
  If no natural break found, split at the word nearest the midpoint.
  (Last resort — preserves meaning better than cutting arbitrarily)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
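&lt;p&gt;The three passes could look roughly like the sketch below. It is an approximation under stated assumptions: positions are measured in words rather than characters, the ±30% window is read as a distance from the midpoint, and the names &lt;code&gt;splitLongSentence&lt;/code&gt; and &lt;code&gt;joinAt&lt;/code&gt; are hypothetical.&lt;/p&gt;

```typescript
// Conjunction list taken from the Pass 2 description.
const CONJUNCTIONS = ['and', 'but', 'which', 'because', 'however', 'although', 'while', 'whereas'];

// End the first half with a period and capitalize the second half.
function joinAt(words: string[], cut: number): string[] {
  const first = words.slice(0, cut).join(' ').replace(/[,;:]$/, '');
  const rest = words.slice(cut).join(' ');
  return [
    /\.$/.test(first) ? first : first + '.',
    rest.charAt(0).toUpperCase() + rest.slice(1),
  ];
}

function splitLongSentence(sentence: string, maxWords: number = 30): string[] {
  const words = sentence.trim().split(/\s+/);
  if (!(words.length > maxWords)) return [sentence]; // short enough, leave as is

  const mid = Math.floor(words.length / 2);
  const windowSize = Math.floor(words.length * 0.3);
  let commaCut = -1; // Pass 1 candidate
  let conjunctionCut = -1; // Pass 2 candidate

  words.forEach((word, i) => {
    const cut = i + 1; // cutting right after this word
    const distance = Math.abs(cut - mid);
    const closer = commaCut === -1 || Math.abs(commaCut - mid) > distance;
    // Pass 1: keep the comma closest to the midpoint, inside the window.
    if ([/,$/.test(word), windowSize >= distance, closer].every(Boolean)) {
      commaCut = cut;
    }
    // Pass 2: remember the first conjunction at or after the midpoint.
    const bare = word.toLowerCase().replace(/[^a-z]/g, '');
    if ([conjunctionCut === -1, i >= mid, CONJUNCTIONS.indexOf(bare) !== -1].every(Boolean)) {
      conjunctionCut = i; // cut before the conjunction
    }
  });

  if (commaCut !== -1) return joinAt(words, commaCut);
  if (conjunctionCut !== -1) return joinAt(words, conjunctionCut);
  // Pass 3: hard split at the word boundary nearest the midpoint.
  return joinAt(words, mid);
}

console.log(splitLongSentence('Short sentences are returned unchanged.'));
```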



&lt;p&gt;Result on Wikipedia: Cloud mode +37 pts before the 100-point cap (+32 after it), Offline mode +31 pts. The gap is real but smaller than you'd expect.&lt;/p&gt;




&lt;h2&gt;
  
  
  Responsible AI: Not an Afterthought
&lt;/h2&gt;

&lt;p&gt;We made a deliberate decision early: transparency and human oversight are architectural requirements, not features we'd add later.&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;AgentEvent&lt;/code&gt; is timestamped and stored:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;AgentEvent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;agentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// WORKING | DONE | ERROR | CONFLICT&lt;/span&gt;
  &lt;span class="nl"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;data&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Decision Log in the UI renders every event — including conflicts — in chronological order. Conflict events are highlighted in amber. The Responsible AI panel shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Transparency&lt;/strong&gt;: total number of agent decisions logged&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-in-the-Loop&lt;/strong&gt;: count of suggestions vs auto-applied fixes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confidence Scoring&lt;/strong&gt;: breakdown of high/medium/low confidence fixes per agent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy&lt;/strong&gt;: mode used and data retention policy (none — all processing is ephemeral)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The confidence threshold for auto-apply is explicitly &lt;code&gt;≥ 0.5&lt;/code&gt;. Anything below that is shown as a suggestion with a reason: &lt;em&gt;"Confidence 0.42 — flagged for human review."&lt;/em&gt;&lt;/p&gt;
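&lt;p&gt;The threshold rule is small enough to sketch directly. The &lt;code&gt;ProposedFix&lt;/code&gt; shape and the &lt;code&gt;partitionFixes&lt;/code&gt; name are hypothetical; only the &lt;code&gt;0.5&lt;/code&gt; cutoff comes from the text above.&lt;/p&gt;

```typescript
// Hypothetical minimal shape for a fix proposed by any agent.
interface ProposedFix {
  selector: string;
  confidence: number;
}

const AUTO_APPLY_THRESHOLD = 0.5;

function partitionFixes(fixes: ProposedFix[]) {
  return {
    // At or above the threshold: applied automatically.
    autoApplied: fixes.filter((fix) => fix.confidence >= AUTO_APPLY_THRESHOLD),
    // Below the threshold: shown to a human as a suggestion.
    suggestions: fixes.filter((fix) => AUTO_APPLY_THRESHOLD > fix.confidence),
  };
}

const outcome = partitionFixes([
  { selector: '#hero', confidence: 0.88 },
  { selector: 'img.logo', confidence: 0.5 },
  { selector: 'p.intro', confidence: 0.42 },
]);
console.log(outcome.autoApplied.length, outcome.suggestions.length);
```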

&lt;p&gt;This design reflects a real belief: AI systems that affect people's lives — and accessibility directly affects how 1.3 billion people experience the web — need to be auditable, explainable, and humble about their own limitations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building with AI: Our Claude Code Workflow
&lt;/h2&gt;

&lt;p&gt;This section is the most honest part of this post.&lt;/p&gt;

&lt;p&gt;We used &lt;strong&gt;Claude Code&lt;/strong&gt; (Anthropic's CLI coding assistant) for the vast majority of this project. Here's what that actually looked like, warts included.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Worked Exceptionally Well
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Generating the type system.&lt;/strong&gt; We gave Claude Code the exact interfaces we wanted and it produced clean, idiomatic TypeScript on the first try. The &lt;code&gt;AccessibilityIssue&lt;/code&gt;, &lt;code&gt;AgentResult&lt;/code&gt;, and &lt;code&gt;AnalysisResult&lt;/code&gt; interfaces required almost no revision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Scanner Agent.&lt;/strong&gt; We asked for a WCAG 2.1 auditor covering all four principles. Claude generated 20+ detection rules using cheerio, each wrapped in its own try/catch, with proper severity and WCAG rule codes. This would have taken a week to research and write manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UI components with specific constraints.&lt;/strong&gt; When we described the exact visual behavior we wanted — "a segmented control using visually-hidden radio inputs, two options, with an offline disclaimer that animates in with aria-live='polite'" — we got exactly that. No hallucinated React libraries, no unnecessary dependencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debugging TypeScript errors across a multi-agent system.&lt;/strong&gt; When we hit a &lt;code&gt;TS2322&lt;/code&gt; error about &lt;code&gt;IssueSeverity&lt;/code&gt; string literals, we described the error and the surrounding code, and got the right fix immediately: import the enum and use &lt;code&gt;IssueSeverity.MAJOR&lt;/code&gt; instead of the string &lt;code&gt;'major'&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Didn't Work (At First)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The scoring algorithm needed three iterations to get right.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our first attempt: a single &lt;code&gt;calcScore(issues, afterFixes: boolean)&lt;/code&gt; function that counted all issues and tried to subtract fixed ones. When we tested in offline mode, scores were going &lt;em&gt;down&lt;/em&gt; — from 72 to 51 — because Vision and Simplifier were generating issues that got counted against the baseline.&lt;/p&gt;

&lt;p&gt;Second attempt: separate before/after calculations. Better, but still wrong — the "after" score was recounting all unfixed issues instead of adding earned points.&lt;/p&gt;

&lt;p&gt;Third attempt: the additive model with the &lt;code&gt;isEnhancement&lt;/code&gt; flag described above. The key insight was identifying &lt;em&gt;why&lt;/em&gt; the model was wrong, not just that it was wrong.&lt;/p&gt;

&lt;p&gt;The lesson: &lt;strong&gt;AI-assisted coding works best when you can articulate the bug precisely.&lt;/strong&gt; "The offline score goes down" didn't help. "The before-score counts Vision issues that are improvements, not defects — they shouldn't appear in the baseline" produced an exact, correct solution.&lt;/p&gt;
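&lt;p&gt;To make the fix concrete, here is a rough sketch of an additive score in which enhancements never lower the baseline. The point weights, the simplified &lt;code&gt;ScoredIssue&lt;/code&gt; shape (a string union instead of the &lt;code&gt;IssueSeverity&lt;/code&gt; enum), and the function bodies are assumptions for illustration; the project's real values are not shown in this post.&lt;/p&gt;

```typescript
// Simplified, hypothetical issue shape.
interface ScoredIssue {
  severity: 'critical' | 'major' | 'minor';
  isEnhancement: boolean;
  fixed: boolean;
}

// Hypothetical weights, chosen only to make the arithmetic visible.
const POINTS = { critical: 10, major: 5, minor: 2 };

function scoreBefore(issues: ScoredIssue[]): number {
  const penalty = issues
    .filter((issue) => !issue.isEnhancement) // enhancements are improvements, not defects
    .reduce((sum, issue) => sum + POINTS[issue.severity], 0);
  return Math.max(0, 100 - penalty);
}

function scoreAfter(issues: ScoredIssue[]): number {
  // Additive: start from the baseline and add points earned by applied fixes.
  const earned = issues
    .filter((issue) => issue.fixed)
    .reduce((sum, issue) => sum + POINTS[issue.severity], 0);
  return Math.min(100, scoreBefore(issues) + earned); // capped at 100
}

const sample: ScoredIssue[] = [
  { severity: 'major', isEnhancement: false, fixed: true },
  { severity: 'minor', isEnhancement: true, fixed: true },
];
console.log(scoreBefore(sample), scoreAfter(sample));
```

&lt;p&gt;With this shape the after-score can never fall below the before-score, which rules out the first attempt's failure mode.&lt;/p&gt;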

&lt;p&gt;&lt;strong&gt;Complex cheerio selectors were brittle.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Early versions of the agents generated selectors like &lt;code&gt;div.container &amp;gt; section:first-child &amp;gt; img:nth-child(3)&lt;/code&gt;. These worked on the test page but broke on real sites. We had to manually establish the selector priority rule (id &amp;gt; class &amp;gt; src attribute &amp;gt; nth-of-type) and explain it precisely before the generated code became stable.&lt;/p&gt;
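&lt;p&gt;The priority rule translates into a short function. This is a sketch with a hypothetical &lt;code&gt;ElementFacts&lt;/code&gt; shape and &lt;code&gt;buildSelector&lt;/code&gt; name; it assumes ids, class names, and &lt;code&gt;src&lt;/code&gt; values contain no characters that need CSS escaping.&lt;/p&gt;

```typescript
// Hypothetical description of an element, gathered while walking the DOM.
interface ElementFacts {
  tag: string;
  id: string; // '' when absent
  classes: string[];
  src: string; // '' when absent
  indexOfType: number; // 1-based position among siblings of the same tag
}

function buildSelector(el: ElementFacts): string {
  // 1. An id is the most stable anchor.
  if (el.id) return `#${el.id}`;
  // 2. Then the element's classes.
  if (el.classes.length > 0) return `${el.tag}.${el.classes.join('.')}`;
  // 3. Then the src attribute, which identifies most images on its own.
  if (el.src) return `${el.tag}[src="${el.src}"]`;
  // 4. Positional fallback, the most brittle option.
  return `${el.tag}:nth-of-type(${el.indexOfType})`;
}

console.log(buildSelector({ tag: 'img', id: '', classes: [], src: '/logo.png', indexOfType: 3 }));
```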

&lt;p&gt;&lt;strong&gt;Conflict resolution logic needed manual refinement.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The initial conflict resolution was purely deduplication. The Vision-vs-Simplifier context preservation conflict — where rewriting a paragraph could make an adjacent alt text misleading — was a design decision we arrived at ourselves, then asked Claude to implement. The "what" came from us; the "how" came from Claude.&lt;/p&gt;

&lt;h3&gt;
  
  
  Our Prompting Strategy
&lt;/h3&gt;

&lt;p&gt;The difference between a prompt that works and one that doesn't, in our experience:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Specify the interface, not just the behavior.&lt;/strong&gt;&lt;br&gt;
Instead of: &lt;em&gt;"Create a Scanner Agent that checks accessibility"&lt;/em&gt;&lt;br&gt;
We used: &lt;em&gt;"Create a Scanner Agent that implements the BaseAgent interface below. It should use cheerio to parse the HTML and detect these specific WCAG violations, returning issues with these exact fields..."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Describe bugs with reproduction steps, not symptoms.&lt;/strong&gt;&lt;br&gt;
Instead of: &lt;em&gt;"The score is wrong"&lt;/em&gt;&lt;br&gt;
We used: &lt;em&gt;"The &lt;code&gt;scoreBefore&lt;/code&gt; function at line 35 is including Vision agent issues (marked &lt;code&gt;isEnhancement: true&lt;/code&gt;) in its baseline count. These should be excluded. The fix should modify the filter condition to check &lt;code&gt;!i.isEnhancement &amp;amp;&amp;amp;&lt;/code&gt; before the agentType check."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Iterate on real output, not hypothetical code.&lt;/strong&gt;&lt;br&gt;
We ran the app, analyzed a real URL, saw the output, identified what was wrong, then described that specific wrong output and the expected correct output. Every iteration was grounded in real behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example prompts we used:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create the Vision Agent in /src/agents/vision.ts. It must implement BaseAgent.
It should:
1. Use cheerio to find all &amp;lt;img&amp;gt; elements with missing or generic alt text
2. For each image, extract this context object: [interface definition]
3. Call Azure OpenAI with this exact system prompt: [prompt text]
4. Mark each issue with isEnhancement: true and confidence: 0.85
5. Fall back to generateFallbackAlt() if Azure is not configured
The fallback function should try: linkText → figcaption → filename → heading → domain
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;There is a TypeScript error in orchestrator.ts line 482:
  Type 'string' is not assignable to type 'IssueSeverity'
The line reads: issue.severity = 'major';
Fix it by importing IssueSeverity from @/types/agents and using IssueSeverity.MAJOR.
Apply the same fix wherever 'critical' and 'minor' string literals are used.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;We tested on a range of real websites. Here are two representative runs:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test site: eduky.co (education platform)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Score before: 51 / 100&lt;/li&gt;
&lt;li&gt;Score after: 93 / 100 (&lt;strong&gt;+42 points&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Issues detected: 21 across all 4 WCAG categories&lt;/li&gt;
&lt;li&gt;Fixes auto-applied: 13 (high confidence)&lt;/li&gt;
&lt;li&gt;Suggestions for review: 8 (lower confidence)&lt;/li&gt;
&lt;li&gt;Analysis time (cloud): 14.4 seconds&lt;/li&gt;
&lt;li&gt;Analysis time (offline): 1.6 seconds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;WCAG Breakdown (before → after):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Perceivable: 48 → 89 (+41)&lt;/li&gt;
&lt;li&gt;Operable: 71 → 85 (+14)&lt;/li&gt;
&lt;li&gt;Understandable: 62 → 78 (+16)&lt;/li&gt;
&lt;li&gt;Robust: 55 → 91 (+36)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Test site: Wikipedia (English article)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Score before: 68 / 100&lt;/li&gt;
&lt;li&gt;Cloud mode score after: 105 → capped at 100 (&lt;strong&gt;+32 points&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Offline mode score after: 99 (&lt;strong&gt;+31 points&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Analysis time (cloud): 18.2 seconds (many images → many API calls)&lt;/li&gt;
&lt;li&gt;Analysis time (offline): 2.1 seconds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The near-parity between cloud and offline on Wikipedia demonstrates that the heuristic offline agents are genuinely useful — most Wikipedia images follow predictable patterns (figures with captions, file-name-described diagrams) that the heuristic system handles well.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;AccessBridge AI was built in five days for a hackathon. Here's where we'd take it with more time:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Browser Extension&lt;/strong&gt; — Run AccessBridge directly in the browser without pasting URLs. Inject the transformed HTML into the current tab so users can see the before/after in situ.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CI/CD Integration&lt;/strong&gt;: An API endpoint that returns a machine-readable WCAG report, so a pipeline step can exit non-zero when critical violations are detected. Plug it into your GitHub Actions pipeline: no PR gets merged if it regresses accessibility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Foundry Local Integration&lt;/strong&gt; — Replace the offline heuristics with actual on-device AI inference using Azure AI Foundry Local and Phi-4. True intelligence without any internet dependency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-language Support&lt;/strong&gt; — The Simplifier currently targets English readability (Flesch-Kincaid). Extending to Spanish, French, and Portuguese would dramatically expand the tool's impact in underserved markets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accessibility Score Tracking&lt;/strong&gt; — Store historical scores per domain. Show a site owner their accessibility trend over time, not just a single snapshot.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://accessbridge-ai.vercel.app/" rel="noopener noreferrer"&gt;🚀 Live Demo&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/jpablortiz96/accessbridge-ai" rel="noopener noreferrer"&gt;📦 GitHub Repository&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Drop any public URL into the analyzer and watch 5 agents work in real time. Try it on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A site you own (and care about improving)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;https://example.com&lt;/code&gt; (a minimal, intentionally bare page)&lt;/li&gt;
&lt;li&gt;A Wikipedia article (rich with images, complex structure)&lt;/li&gt;
&lt;li&gt;A government or nonprofit site (where accessibility matters most)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the analysis finds something, the fixed HTML is available for download immediately.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Accessibility is one of those problems where the technical solution is well-understood and the barrier is almost entirely friction. We know what good alt text looks like. We know what heading hierarchy should be. We know what ARIA landmarks do. The problem is that fixing 47 violations across a 200-page website is a week of tedious work.&lt;/p&gt;

&lt;p&gt;AI agents can absorb that friction. Not perfectly — our confidence scores and human-in-the-loop design reflect genuine humility about what the system can and can't do reliably. But good enough, fast enough, that the decision for a small nonprofit to have an accessible website no longer has to be "we can't afford the developer time."&lt;/p&gt;

&lt;p&gt;That's the goal. Everything else is implementation details.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with ❤️ for the JS AI Build-a-thon 2026 — because the web should work for everyone.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;— Juan Pablo Enriquez Ortiz&lt;/em&gt;&lt;/p&gt;

</description>
      <category>a11y</category>
      <category>agents</category>
      <category>ai</category>
      <category>devchallenge</category>
    </item>
  </channel>
</rss>
