<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ford Prefect</title>
    <description>The latest articles on DEV Community by Ford Prefect (@ford_prefect_b9815d51ea0d).</description>
    <link>https://dev.to/ford_prefect_b9815d51ea0d</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3941002%2Fcbff827a-35b1-4dc7-8d1b-734794f6822c.jpg</url>
      <title>DEV Community: Ford Prefect</title>
      <link>https://dev.to/ford_prefect_b9815d51ea0d</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ford_prefect_b9815d51ea0d"/>
    <language>en</language>
    <item>
      <title>An AR procedure verifier in 50 lines (with Ollama or Claude vision)</title>
      <dc:creator>Ford Prefect</dc:creator>
      <pubDate>Tue, 19 May 2026 20:16:19 +0000</pubDate>
      <link>https://dev.to/ford_prefect_b9815d51ea0d/an-ar-procedure-verifier-in-50-lines-with-ollama-or-claude-vision-3j82</link>
      <guid>https://dev.to/ford_prefect_b9815d51ea0d/an-ar-procedure-verifier-in-50-lines-with-ollama-or-claude-vision-3j82</guid>
      <description>&lt;p&gt;&lt;em&gt;A tutorial. Copy-paste runnable. Uses OpenEye + Ollama (so it's free&lt;br&gt;
and fully local) or Claude vision (so it's better and costs about $0.01 per frame).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I needed a thing that watches someone bolt together a part and tells&lt;br&gt;
them when they've done it wrong. The naive version is: throw frames at&lt;br&gt;
GPT-4o, ask "did they do step 3 correctly," log the answer. That works&lt;br&gt;
but it forgets every frame the second the request returns. No memory.&lt;br&gt;
No reward signal. No way to learn from the 200th run that the operators&lt;br&gt;
keep skipping the same step.&lt;/p&gt;

&lt;p&gt;OpenEye is the thin layer that turns that one-shot LLM call into a&lt;br&gt;
loop with memory + verdicts + exportable training data. Here's the&lt;br&gt;
50-line version.&lt;/p&gt;

&lt;h2&gt;what you need&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;node 20+&lt;/li&gt;
&lt;li&gt;python 3.10+ (for the sidecar)&lt;/li&gt;
&lt;li&gt;ollama with moondream pulled (&lt;code&gt;ollama pull moondream&lt;/code&gt;) — or an
ANTHROPIC_API_KEY if you want Claude vision instead&lt;/li&gt;
&lt;li&gt;a frame from a camera. literally a jpg of someone holding a wrench.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;install&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @dumbspacecookie/openeye
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; node_modules/@dumbspacecookie/openeye/sidecar/requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Python sidecar handles state (FTS5 over SQLite for memory search,&lt;br&gt;
visual session tracking, trajectory capture). It auto-spawns when you&lt;br&gt;
create an agent; you don't have to start it yourself.&lt;/p&gt;

&lt;h2&gt;the code&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;OpenEyeAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;setupProviders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;makeStreamFn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;ANTHROPIC_SONNET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@dumbspacecookie/openeye&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;describeFrameWithMoondream&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./ollama-vision-adapter.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;node:fs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nf"&gt;setupProviders&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;OpenEyeAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ANTHROPIC_SONNET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;streamFn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;makeStreamFn&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You verify bolt-assembly steps. Be precise. If a step is unclear, &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;use verify_step with result='uncertain' — never guess.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;shop-floor-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vsId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createVisualSession&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;deviceType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;android-tablet&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;procedureId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;m6-bolt-assembly-v1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;procedureName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;M6 bolt assembly&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;STEPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;step-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;what&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;position the bracket flush against the rail&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;step-2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;what&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;thread the M6 bolt by hand, two full turns&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;step-3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;what&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;torque to 8 Nm with a calibrated wrench&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;STEPS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;frameBytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;readFileSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`./frames/step-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.jpg`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;describeFrameWithMoondream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;frameBytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;`Operator should be: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;what&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;. Describe hand position, tool, bolt state.`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logFrame&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;visualSessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;vsId&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;sequenceNum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;sceneDescription&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;stepContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`Frame &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\n\nVerify &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;what&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;).`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endVisualSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;vsId&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;completed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;captureAndClose&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;visualSessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;vsId&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// One file. Every step's verdict, the agent's reasoning, the reward&lt;/span&gt;
&lt;span class="c1"&gt;// signal — ready for DPO training in TRL or LLaMA-Factory.&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exportTrajectories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./trajectory.jsonl&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. 50 lines, 3 verifications, one exported training trajectory.&lt;/p&gt;
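
&lt;p&gt;One loose end: the listing imports &lt;code&gt;describeFrameWithMoondream&lt;/code&gt; from a local &lt;code&gt;./ollama-vision-adapter.js&lt;/code&gt; that isn't shown above. The working adapters live in &lt;code&gt;examples/vision-adapter/&lt;/code&gt; in the repo. If you just want the shape of it, here's a minimal sketch, not the repo's version. It assumes Ollama is listening on its default port (11434) and uses its &lt;code&gt;/api/generate&lt;/code&gt; endpoint; &lt;code&gt;buildMoondreamRequest&lt;/code&gt; and &lt;code&gt;OLLAMA_URL&lt;/code&gt; are names made up for this example.&lt;/p&gt;

```typescript
import { Buffer } from "node:buffer";

// Assumption: Ollama is running locally on its default port.
const OLLAMA_URL = "http://localhost:11434/api/generate";

// Build the JSON body for Ollama's /api/generate endpoint.
// Images go in as base64 strings; stream: false asks for a single JSON reply.
export function buildMoondreamRequest(frameBytes: Uint8Array, prompt: string) {
  return {
    model: "moondream",
    prompt,
    images: [Buffer.from(frameBytes).toString("base64")],
    stream: false,
  };
}

// Hypothetical adapter with the same call shape the listing uses:
// (frame bytes, prompt) in, scene description text out.
export async function describeFrameWithMoondream(frameBytes: Uint8Array, prompt: string) {
  const res = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildMoondreamRequest(frameBytes, prompt)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const body = (await res.json()) as { response: string };
  return body.response.trim();
}
```

&lt;p&gt;Splitting the request builder from the network call keeps the base64 step easy to check without a running Ollama.&lt;/p&gt;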

&lt;h2&gt;what just happened&lt;/h2&gt;

&lt;p&gt;For each frame, the vision adapter (Moondream running locally on&lt;br&gt;
Ollama, or Claude if you swapped it) generated a scene description.&lt;br&gt;
OpenEye's agent loop received that description and called &lt;code&gt;verify_step&lt;/code&gt;&lt;br&gt;
with &lt;code&gt;pass&lt;/code&gt;, &lt;code&gt;fail&lt;/code&gt;, or &lt;code&gt;uncertain&lt;/code&gt;. Every verdict went into FTS5 memory&lt;br&gt;
— next time you boot the agent it can &lt;code&gt;search_memory&lt;/code&gt; for "M6 bolt"&lt;br&gt;
and find prior runs.&lt;/p&gt;

&lt;p&gt;When the session closed, OpenEye packaged the whole conversation as a&lt;br&gt;
ShareGPT trajectory with reward = &lt;code&gt;(passes + 0.5 * uncertain) / total&lt;/code&gt;.&lt;br&gt;
That JSONL file is now training-ready for DPO fine-tuning. Run&lt;br&gt;
&lt;code&gt;examples/fine-tune/train_dpo.py&lt;/code&gt; against it and you've got a model&lt;br&gt;
that's better at verifying YOUR procedures than the base model was.&lt;/p&gt;
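
&lt;p&gt;To make the reward formula concrete, here's a small sketch of the arithmetic. It only mirrors the formula quoted above: &lt;code&gt;sessionReward&lt;/code&gt; and the &lt;code&gt;Verdict&lt;/code&gt; type are hypothetical names, not OpenEye exports, and returning 0 for an empty session is my assumption.&lt;/p&gt;

```typescript
// The three verdicts verify_step can return, per the text above.
type Verdict = "pass" | "fail" | "uncertain";

// reward = (passes + 0.5 * uncertain) / total
function sessionReward(verdicts: Verdict[]) {
  // Assumption: an empty session scores 0 rather than dividing by zero.
  if (verdicts.length === 0) return 0;
  const passes = verdicts.filter((v) => v === "pass").length;
  const uncertain = verdicts.filter((v) => v === "uncertain").length;
  return (passes + 0.5 * uncertain) / verdicts.length;
}

console.log(sessionReward(["pass", "fail", "uncertain", "pass"])); // 0.625
```

&lt;p&gt;A fail counts for nothing and an uncertain counts for half a pass, so a session of pass, pass, uncertain scores (2 + 0.5) / 3, roughly 0.83.&lt;/p&gt;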

&lt;h2&gt;the loop that compounds&lt;/h2&gt;

&lt;p&gt;The reason this is interesting and not just "another agent wrapper":&lt;br&gt;
every session generates ground truth (the step IDs and your written&lt;br&gt;
expectations) AND a model judgment (the verdict). The deltas between&lt;br&gt;
them are training signal. The 200th run can beat the first&lt;br&gt;
because, once you fine-tune on the exports, the model has seen 199 runs' worth of your specific procedure data.&lt;/p&gt;

&lt;p&gt;You bring the vision model, your shop floor, your procedures. OpenEye&lt;br&gt;
brings the memory + verdict + trajectory loop. Get a hundred opted-in&lt;br&gt;
deployments and the next OpenEye-shipped base model is meaningfully&lt;br&gt;
better at AR procedure verification than anything off the shelf.&lt;/p&gt;

&lt;h2&gt;next steps&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;swap moondream for Claude vision: change one import, set
&lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt;. Quality jumps; cost goes from $0 to ~$0.01/frame.&lt;/li&gt;
&lt;li&gt;wire the SSE event bus to a Slack pager: every &lt;code&gt;verify_step&lt;/code&gt; with
&lt;code&gt;fail&lt;/code&gt; lands in #shop-floor-alerts. Subscribe to
&lt;code&gt;GET /sessions/{id}/events&lt;/code&gt;, filter on result, post.&lt;/li&gt;
&lt;li&gt;ship a tablet build: the same code runs in a Capacitor app on an
Android tablet. The vision adapter takes a camera frame instead of a
filesystem read.&lt;/li&gt;
&lt;/ul&gt;
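
&lt;p&gt;The Slack pager idea from the list above, sketched out. Assumptions, loudly: the event payload shape (&lt;code&gt;tool&lt;/code&gt;, &lt;code&gt;stepId&lt;/code&gt;, &lt;code&gt;result&lt;/code&gt;) is a guess at what &lt;code&gt;GET /sessions/{id}/events&lt;/code&gt; carries, so check it against the real stream; &lt;code&gt;extractSseData&lt;/code&gt;, &lt;code&gt;failedSteps&lt;/code&gt; and &lt;code&gt;postToSlack&lt;/code&gt; are made-up names. The parsing follows standard &lt;code&gt;text/event-stream&lt;/code&gt; framing (events separated by a blank line, payload on &lt;code&gt;data:&lt;/code&gt; lines), and the post goes to a Slack incoming webhook, which accepts a JSON body with a &lt;code&gt;text&lt;/code&gt; field.&lt;/p&gt;

```typescript
// Hypothetical payload shape; verify against the real event stream.
type VerifyEvent = { tool: string; stepId: string; result: string };

// Pull the data payload out of each event in a text/event-stream chunk.
function extractSseData(chunk: string): string[] {
  return chunk
    .split(/\r?\n\r?\n/) // events are separated by a blank line
    .map((frame) =>
      frame
        .split(/\r?\n/)
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // drop "data:" and one optional space
        .join("\n"),
    )
    .filter((data) => data.length > 0);
}

// Parse one payload, returning null for anything that is not JSON.
function tryParse(data: string): VerifyEvent | null {
  try {
    return JSON.parse(data) as VerifyEvent;
  } catch {
    return null;
  }
}

// Keep only verify_step events whose result is "fail".
function failedSteps(chunk: string): VerifyEvent[] {
  const failures: VerifyEvent[] = [];
  for (const data of extractSseData(chunk)) {
    const event = tryParse(data);
    if (event === null) continue;
    if (event.tool !== "verify_step") continue;
    if (event.result === "fail") failures.push(event);
  }
  return failures;
}

// Post one alert line to a Slack incoming webhook.
async function postToSlack(webhookUrl: string, event: VerifyEvent) {
  await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: `verify_step failed on ${event.stepId}` }),
  });
}
```

&lt;p&gt;Feed each chunk you read from the events endpoint through &lt;code&gt;failedSteps&lt;/code&gt; and call &lt;code&gt;postToSlack&lt;/code&gt; for whatever comes back. A real subscriber would also buffer partial events that straddle two chunks, which this sketch skips.&lt;/p&gt;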

&lt;h2&gt;links&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;repo: &lt;a href="https://github.com/dumbspacecookie/openeye" rel="noopener noreferrer"&gt;https://github.com/dumbspacecookie/openeye&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;npm: &lt;a href="https://www.npmjs.com/package/@dumbspacecookie/openeye" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@dumbspacecookie/openeye&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;vision adapters (Claude + Ollama, both working):
&lt;code&gt;examples/vision-adapter/&lt;/code&gt; in the repo&lt;/li&gt;
&lt;li&gt;fine-tune script: &lt;code&gt;examples/fine-tune/train_dpo.py&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;the loud opt-in disclosure for training data sharing:
&lt;code&gt;docs/context-data.md&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MIT licensed, alpha, bring your own vision model. Mostly harmless.&lt;/p&gt;

</description>
      <category>ar</category>
      <category>ai</category>
      <category>computervision</category>
      <category>typescript</category>
    </item>
  </channel>
</rss>
