<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: TechLatest</title>
    <description>The latest articles on DEV Community by TechLatest (@techlatestnet).</description>
    <link>https://dev.to/techlatestnet</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3766280%2Fd16e1ef1-ba16-4bdb-8487-7be6141334ea.jpg</url>
      <title>DEV Community: TechLatest</title>
      <link>https://dev.to/techlatestnet</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/techlatestnet"/>
    <language>en</language>
    <item>
      <title>OpenClaude Agent Masterclass — Full Tutorial</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Tue, 30 Jun 2026 12:41:49 +0000</pubDate>
      <link>https://dev.to/techlatestnet/openclaude-agent-masterclass-full-tutorial-7f2</link>
      <guid>https://dev.to/techlatestnet/openclaude-agent-masterclass-full-tutorial-7f2</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpngfz8255459tepu6a95.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpngfz8255459tepu6a95.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Everything you need to &lt;strong&gt;install, configure, and master&lt;/strong&gt; &lt;a href="https://github.com/Gitlawb/openclaude" rel="noopener noreferrer"&gt;OpenClaude&lt;/a&gt; — GitLawb’s open-source terminal coding agent with 69+ slash commands, multi-provider support, MCP, skills, and permission-gated tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Official docs:&lt;/strong&gt; &lt;a href="https://openclaude.gitlawb.com/docs/" rel="noopener noreferrer"&gt;openclaude.gitlawb.com/docs&lt;/a&gt; · &lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/Gitlawb/openclaude" rel="noopener noreferrer"&gt;github.com/Gitlawb/openclaude&lt;/a&gt; · &lt;strong&gt;npm:&lt;/strong&gt; @gitlawb/openclaude&lt;/p&gt;

&lt;h3&gt;
  
  
  What you’ll have at the end
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;OpenClaude installed (npm install -g @gitlawb/openclaude@latest)&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;provider profile&lt;/strong&gt; saved via /provider (or env vars for Ollama/OpenAI)&lt;/li&gt;
&lt;li&gt;Confidence running tasks, reviewing diffs, and rewinding bad edits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;/init&lt;/strong&gt; project instructions, &lt;strong&gt;/mcp&lt;/strong&gt; , &lt;strong&gt;/skills&lt;/strong&gt; , &lt;strong&gt;/permissions&lt;/strong&gt; configured&lt;/li&gt;
&lt;li&gt;Optional &lt;strong&gt;background sessions&lt;/strong&gt; and &lt;strong&gt;per-agent model routing&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Introduction — runs anywhere, uses anything&lt;a href="https://ayush7614.github.io/agentic-ai-ecosystem/guides/openclaude-agent-masterclass/tutorial/#introduction-runs-anywhere-uses-anything" rel="noopener noreferrer"&gt;¶&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;OpenClaude is a &lt;strong&gt;terminal-first coding agent harness&lt;/strong&gt; maintained by &lt;a href="https://gitlawb.com/" rel="noopener noreferrer"&gt;GitLawb&lt;/a&gt;. One CLI drives cloud APIs and local backends: OpenAI-compatible servers, Gemini, GitHub Models, Codex OAuth, Ollama, Fireworks, OpenRouter, GitLawb Opengateway, OpenCode Zen, and more.&lt;/p&gt;

&lt;p&gt;Unlike a chat wrapper, OpenClaude ships a full &lt;strong&gt;tool loop&lt;/strong&gt; : bash, file read/write/edit, grep, glob, subagents, MCP servers, web tools, and &lt;strong&gt;69+ slash commands&lt;/strong&gt; for sessions, review, memory, and configuration.&lt;/p&gt;

&lt;p&gt;The tagline is literal: it &lt;strong&gt;runs anywhere&lt;/strong&gt; (macOS, Linux, Windows, Android install path) and &lt;strong&gt;uses anything&lt;/strong&gt; (swap providers without swapping workflows).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5db6al5mj1qpjsrbl5r1.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5db6al5mj1qpjsrbl5r1.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docs hub:&lt;/strong&gt; &lt;a href="https://openclaude.gitlawb.com/docs/" rel="noopener noreferrer"&gt;openclaude.gitlawb.com/docs&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1 — How OpenClaude is structured
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F8sp0br87051jlft4807y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F8sp0br87051jlft4807y.png" width="767" height="340"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layers:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;REPL&lt;/strong&gt;  — interactive session with streaming tokens and tool progress&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slash commands&lt;/strong&gt;  — /provider, /diff, /mcp, /permissions, …&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools&lt;/strong&gt;  — permission-gated execution; the first sensitive use asks for approval&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Providers&lt;/strong&gt;  — saved profiles in .openclaude-profile.json or env vars&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extensions&lt;/strong&gt;  — MCP, plugins, skills, LSP (/lsp), VS Code extension&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;OpenClaude does &lt;strong&gt;not&lt;/strong&gt; auto-load the project .env files. Prefer &lt;strong&gt;/provider&lt;/strong&gt; for credentials, or openclaude --provider-env-file .env When you need an env-based setup.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 2 — Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Node.js ≥ 22.0.0&lt;/strong&gt; (npm install path)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ripgrep&lt;/strong&gt; (rg) — install system-wide; OpenClaude errors if missing&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;git repository&lt;/strong&gt; (or empty directory) to work in&lt;/li&gt;
&lt;li&gt;API key, OAuth path, or local &lt;strong&gt;Ollama&lt;/strong&gt; for inference
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="c"&gt;# v22+&lt;/span&gt;
rg &lt;span class="nt"&gt;--version&lt;/span&gt;
openclaude &lt;span class="nt"&gt;--version&lt;/span&gt; &lt;span class="c"&gt;# after install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Part 3 — Install
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @gitlawb/openclaude@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Arch Linux (community AUR):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;paru &lt;span class="nt"&gt;-S&lt;/span&gt; openclaude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify and upgrade:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaude &lt;span class="nt"&gt;--version&lt;/span&gt;
npm view @gitlawb/openclaude dist-tags
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @gitlawb/openclaude@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftdf9udycg83g8hwq6g90.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftdf9udycg83g8hwq6g90.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 4 — Wire a provider (/provider)
&lt;/h3&gt;

&lt;p&gt;Start inside your project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;your-repo
openclaude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run guided setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/provider
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This saves &lt;strong&gt;provider profiles&lt;/strong&gt;  — recommended over raw env vars for most users. Profiles persist in  &lt;strong&gt;.openclaude-profile.json&lt;/strong&gt; (user-level).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Other onboarding paths:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fhw2ndsb0xbfefax0a4mr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fhw2ndsb0xbfefax0a4mr.png" width="652" height="275"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F8lsmpcpq168jbvgdnqi1.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F8lsmpcpq168jbvgdnqi1.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Provider table (abbreviated)&lt;/strong&gt; — full list in &lt;a href="https://github.com/Gitlawb/openclaude#supported-providers" rel="noopener noreferrer"&gt;upstream README&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI-compatible (OpenRouter, DeepSeek, Groq, LM Studio, …)&lt;/li&gt;
&lt;li&gt;Gemini, Fireworks, ClinePass, NEAR AI, Xiaomi MiMo&lt;/li&gt;
&lt;li&gt;GitHub Models, Codex OAuth, Codex CLI auth&lt;/li&gt;
&lt;li&gt;Ollama (native chat API, 32k ctx request)&lt;/li&gt;
&lt;li&gt;OpenCode Zen / OpenCode Go (shared OPENCODE_API_KEY)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Run OpenClaude completely offline by connecting it to TechLatest’s Ollama + Open WebUI environment with DeepSeek, Llama, Gemma, Qwen, Mistral, and other open-source models.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://techlatest.net/support/multi_llm_gpu_vm_support/" rel="noopener noreferrer"&gt;Techlatest.net - GPU Supported DeepSeek &amp;amp; Llama powered All-in-One LLM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://techlatest.net/support/multi_llm_vm_support/" rel="noopener noreferrer"&gt;Techlatest.net - DeepSeek &amp;amp; Llama powered All-in-One LLM Suite&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Prefer an OpenAI-compatible local endpoint? Deploy LocalAI with TechLatest and use it as a drop-in provider for OpenClaude.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/local-ai-support/" rel="noopener noreferrer"&gt;Techlatest.net - LocalAI: Self-Hosted Alternative to OpenAI &amp;amp; Anthropic&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 5 — First task (quickstart)
&lt;/h3&gt;

&lt;p&gt;Brief the agent like a teammate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; add retry with exponential backoff to the fetch client, then run the tests
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenClaude streams its plan, tool calls, and edits live. &lt;strong&gt;Nothing commits automatically&lt;/strong&gt;  — you review first.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftref64d2ebwf2zd1fczs.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftref64d2ebwf2zd1fczs.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Non-interactive one-shot:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"explain this repo's auth flow"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Part 6 — Slash commands map
&lt;/h3&gt;

&lt;p&gt;OpenClaude ships &lt;strong&gt;69+ built-in slash commands&lt;/strong&gt;. Type / for autocomplete. Full reference: &lt;a href="https://openclaude.gitlawb.com/docs/slash-commands/" rel="noopener noreferrer"&gt;slash commands docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F3s8utvwtyh3olqmnxrj8.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F3s8utvwtyh3olqmnxrj8.gif" width="800" height="311"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Essential groups:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F25otdi0biihhzbc2tvz2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F25otdi0biihhzbc2tvz2.png" width="756" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Plugins and skills can register &lt;strong&gt;additional&lt;/strong&gt; slash commands.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 7 — Project instructions (/init)
&lt;/h3&gt;

&lt;p&gt;Initialize harness instructions for the repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Creates project documentation the agent reads on demand — same &lt;strong&gt;harness primitive&lt;/strong&gt; as OpenCode’s AGENTS.md and our &lt;a href="https://medium.com/@techlatest.net/harness-engineering-full-visual-guide-9a8de52b42d2?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;Harness Engineering&lt;/a&gt; guide. &lt;strong&gt;Commit the generated file to git.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Also useful: &lt;strong&gt;/wiki init&lt;/strong&gt; for an OpenClaude project wiki, &lt;strong&gt;/memory&lt;/strong&gt; for persistent user-level memory, &lt;strong&gt;/dream&lt;/strong&gt; For session consolidation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 8 — Permissions and safety
&lt;/h3&gt;

&lt;p&gt;Tool use is &lt;strong&gt;permission-gated&lt;/strong&gt;. First use of sensitive operations prompts approval; persist rules with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/permissions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This separates &lt;strong&gt;model intent&lt;/strong&gt; from &lt;strong&gt;runtime allowance&lt;/strong&gt;  — core harness engineering. Tune allow/deny rules per project.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh2mghi044lc0atznwc2e.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh2mghi044lc0atznwc2e.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 9 — MCP servers
&lt;/h3&gt;

&lt;p&gt;Attach external tools via Model Context Protocol:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable/disable servers by name. Same MCP standard as our &lt;a href="https://medium.com/@techlatest.net/model-context-protocol-mcp-full-visual-guide-ffbbf2121c38?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;MCP Visual Guide&lt;/a&gt; and &lt;a href="https://medium.com/@techlatest.net/minicpm-v-mcp-server-give-your-agent-eyes-cffb455b30a4?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;MiniCPM-V MCP Server&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Combine with &lt;strong&gt;/lsp&lt;/strong&gt; for language-server code intelligence in the loop.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 10 — Skills and custom agents
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Skills&lt;/strong&gt;  — packaged workflows with front-matter; list with /skills. Example: examples/skill.example.md.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;repo-smoke-test&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run lint and tests before claiming a task is done&lt;/span&gt;
&lt;span class="na"&gt;maxSteps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;12&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="s"&gt;When finishing a coding task, always run the project's test and lint commands.&lt;/span&gt;
&lt;span class="s"&gt;Report stdout. Do not declare success until both pass.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Agents&lt;/strong&gt;  — configure subagents with /agents. Custom agents support &lt;strong&gt;maxSteps&lt;/strong&gt; to cap tool loops:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bounded-researcher&lt;/span&gt;
&lt;span class="na"&gt;maxSteps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Built-in routable agents include &lt;strong&gt;Explore&lt;/strong&gt; , &lt;strong&gt;Plan&lt;/strong&gt; , and &lt;strong&gt;verification&lt;/strong&gt; (read-only auditor before completion).&lt;/p&gt;

&lt;p&gt;Want to orchestrate multiple specialized AI agents? Explore TechLatest’s CrewAI Studio for collaborative software engineering workflows and production-ready multi-agent systems&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/crewai-support/" rel="noopener noreferrer"&gt;Techlatest.net - AI Agents using CrewAI Studio &amp;amp; Jupyter with GPU support&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 11 — Review workflow
&lt;/h3&gt;

&lt;p&gt;Before committing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/diff # uncommitted + per-turn diffs
/rewind # restore code and/or conversation
/cost # session spend
/security-review # branch security pass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ff7ukqi141ufpjgstako9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ff7ukqi141ufpjgstako9.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Enable &lt;strong&gt;/auto-fix&lt;/strong&gt; To run lint/test after AI edits. Use &lt;strong&gt;/plan&lt;/strong&gt; for plan mode (similar in spirit to OpenCode's Plan agent).&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 12 — Background sessions
&lt;/h3&gt;

&lt;p&gt;Long tasks without blocking your terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaude &lt;span class="nt"&gt;--bg&lt;/span&gt; &lt;span class="s2"&gt;"fix failing tests"&lt;/span&gt;
openclaude &lt;span class="nt"&gt;--bg&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; auth-refactor &lt;span class="s2"&gt;"refactor auth middleware"&lt;/span&gt;
openclaude ps
openclaude logs auth-refactor &lt;span class="nt"&gt;-f&lt;/span&gt;
openclaude &lt;span class="nb"&gt;kill &lt;/span&gt;auth-refactor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Background jobs are &lt;strong&gt;local child processes&lt;/strong&gt;  — not network daemons. Logs live under ~/.openclaude/bg-sessions/ (or OPENCLAUDE_CONFIG_DIR).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fi3w9lr1rlcnc3pqw7xlv.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fi3w9lr1rlcnc3pqw7xlv.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 13 — Resume and fork conversations
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaude &lt;span class="nt"&gt;--continue&lt;/span&gt;
openclaude &lt;span class="nt"&gt;--resume&lt;/span&gt; &amp;lt;session-id&amp;gt;
openclaude &lt;span class="nt"&gt;--continue&lt;/span&gt; &lt;span class="nt"&gt;--fork-session&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Fork&lt;/strong&gt; branches' conversation history to a new session ID — it does &lt;strong&gt;not&lt;/strong&gt; create a git worktree or filesystem copy.&lt;/p&gt;

&lt;p&gt;Inside a session: &lt;strong&gt;/branch&lt;/strong&gt; , &lt;strong&gt;/resume&lt;/strong&gt; ,  &lt;strong&gt;/export&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 14 — Agent model routing
&lt;/h3&gt;

&lt;p&gt;Route different agents to different models via ~/.openclaude.json:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agentModels"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mini"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gpt-5-mini"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agentRouting"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Explore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"verification"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"glm-default"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full example: examples/agent-routing.example.json.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agentModels"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mini"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gpt-5-mini"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"glm-default"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"glm-5.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"base_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://api.z.ai/api/coding/paas/v4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"api_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sk-your-key"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agentRouting"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Explore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Plan"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"glm-default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"verification"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"glm-default"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fb54qhfgehd8rukfttdqm.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fb54qhfgehd8rukfttdqm.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; /provider Sets the &lt;strong&gt;global&lt;/strong&gt; session provider. agentModels + agentRouting Override &lt;strong&gt;the subagent&lt;/strong&gt; without changing the parent session.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 15 — VS Code extension
&lt;/h3&gt;

&lt;p&gt;OpenClaude bundles a &lt;strong&gt;VS Code extension&lt;/strong&gt; (vscode-extension/openclaude-vscode) for launch integration and theme support. Use &lt;strong&gt;/ide&lt;/strong&gt; inside a session to manage IDE integrations.&lt;/p&gt;

&lt;p&gt;Terminal remains the &lt;strong&gt;power surface&lt;/strong&gt; ; the extension bridges editor ↔ agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 16 — OpenClaude vs OpenCode vs Claude Code
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fkwilvd7a9xv7u1u77kw7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fkwilvd7a9xv7u1u77kw7.png" width="633" height="277"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;See &lt;a href="https://medium.com/@techlatest.net/opencode-agent-masterclass-full-tutorial-b4b828aa3ab4?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;OpenCode masterclass&lt;/a&gt; for the Anomaly stack. Many teams pick &lt;strong&gt;OpenClaude&lt;/strong&gt; when they need &lt;strong&gt;maximum provider flexibility&lt;/strong&gt; in one terminal workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 17 — Hands-on checklist
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @gitlawb/openclaude@latest
&lt;span class="nb"&gt;cd &lt;/span&gt;your-repo &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; openclaude
/provider
/init &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git add &lt;span class="nt"&gt;-A&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Add OpenClaude project instructions"&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; implement feature X and run tests
/diff
/permissions &lt;span class="c"&gt;# tune rules&lt;/span&gt;
/mcp &lt;span class="c"&gt;# optional servers&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;OpenClaude&lt;/strong&gt; is GitLawb’s &lt;strong&gt;provider-agnostic coding harness&lt;/strong&gt; : one terminal, saved provider profiles, permission-gated tools, 69+ slash commands, MCP/skills/agents, and review primitives (/diff, /rewind) that keep you in control.&lt;/p&gt;

&lt;p&gt;Install once, run/provider, brief it like a teammate, review with/diff, and route subagents to cheaper models when you scale up.&lt;/p&gt;

&lt;p&gt;The model changes. The harness — permissions, commands, profiles, review — is what makes it shippable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>claudecode</category>
      <category>openclaude</category>
      <category>aicodingagent</category>
    </item>
    <item>
      <title>ZeroClaw Agent Masterclass — Full Tutorial</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Mon, 29 Jun 2026 07:44:35 +0000</pubDate>
      <link>https://dev.to/techlatestnet/zeroclaw-agent-masterclass-full-tutorial-37m7</link>
      <guid>https://dev.to/techlatestnet/zeroclaw-agent-masterclass-full-tutorial-37m7</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvgb5423ph0in4t18awg1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvgb5423ph0in4t18awg1.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Everything you need to &lt;strong&gt;install, configure, and run&lt;/strong&gt; &lt;a href="https://github.com/zeroclaw-labs/zeroclaw" rel="noopener noreferrer"&gt;ZeroClaw&lt;/a&gt; — the Rust agent runtime for a personal AI assistant you fully own: providers, channels, tools, security policy, gateway, and SOP engine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Official:&lt;/strong&gt; &lt;a href="https://www.zeroclawlabs.ai/" rel="noopener noreferrer"&gt;zeroclawlabs.ai&lt;/a&gt; · &lt;strong&gt;Docs:&lt;/strong&gt; &lt;a href="https://docs.zeroclawlabs.ai/" rel="noopener noreferrer"&gt;docs.zeroclawlabs.ai&lt;/a&gt; · &lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/zeroclaw-labs/zeroclaw" rel="noopener noreferrer"&gt;github.com/zeroclaw-labs/zeroclaw&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What you’ll have at the end
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;ZeroClaw installed (prebuilt or from source)&lt;/li&gt;
&lt;li&gt;~/.zeroclaw/config.toml with provider, agent, risk profile, and one channel&lt;/li&gt;
&lt;li&gt;CLI chat via zeroclaw agent -a main&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Telegram or Discord&lt;/strong&gt; (or CLI-only) on the same agent loop&lt;/li&gt;
&lt;li&gt;Understanding of &lt;strong&gt;supervised&lt;/strong&gt; autonomy, tool receipts, and optional &lt;strong&gt;YOLO&lt;/strong&gt; dev mode&lt;/li&gt;
&lt;li&gt;Optional &lt;strong&gt;gateway dashboard&lt;/strong&gt; and &lt;strong&gt;systemd/launchctl&lt;/strong&gt; always-on service&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Introduction — you own the agent, the data, the machine
&lt;/h3&gt;

&lt;p&gt;ZeroClaw is an &lt;strong&gt;agent runtime&lt;/strong&gt; shipped as a single Rust binary. It connects to LLM providers (Anthropic, OpenAI, Ollama, and ~20 more), delivers messages across &lt;strong&gt;30+ channels&lt;/strong&gt; (Discord, Telegram, Matrix, email, webhooks, voice, CLI), and executes &lt;strong&gt;tools&lt;/strong&gt;  — shell, browser, HTTP, hardware, and custom MCP servers.&lt;/p&gt;

&lt;p&gt;Everything runs &lt;strong&gt;on your hardware&lt;/strong&gt; with &lt;strong&gt;your keys&lt;/strong&gt;. Default posture is &lt;strong&gt;security-first&lt;/strong&gt; : supervised autonomy, workspace boundaries, OS sandboxes, and cryptographic &lt;strong&gt;tool receipts&lt;/strong&gt; on every action.&lt;/p&gt;

&lt;p&gt;The project crossed &lt;strong&gt;32k GitHub stars&lt;/strong&gt; with a 100% Rust codebase — fast cold start, small binaries (minimal preset ~6.6 MB), and feature flags for embedded or server deployments.&lt;/p&gt;

&lt;p&gt;ZeroClaw runtime stack — channels to tools&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Official repo only:&lt;/strong&gt; &lt;a href="https://github.com/zeroclaw-labs/zeroclaw" rel="noopener noreferrer"&gt;github.com/zeroclaw-labs/zeroclaw&lt;/a&gt; — beware impersonation forks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F74f4xzjmoj89gmyc12kr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F74f4xzjmoj89gmyc12kr.gif" width="799" height="138"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1 — Architecture
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Layers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Channels&lt;/strong&gt;  — inbound/outbound adapters per platform&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gateway&lt;/strong&gt;  — HTTP/WebSocket API + web dashboard&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent loop&lt;/strong&gt;  — ReAct-style tool use with routing and fallbacks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security&lt;/strong&gt;  — autonomy levels, sandboxes, command policy, receipts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SOP engine&lt;/strong&gt;  — event-triggered procedures (cron, MQTT, webhook)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Providers / Tools / Memory&lt;/strong&gt;  — pluggable backends&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full diagrams: &lt;a href="https://docs.zeroclawlabs.ai/" rel="noopener noreferrer"&gt;Architecture overview&lt;/a&gt; (upstream book).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvg33imdbm54l7sty7yjx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvg33imdbm54l7sty7yjx.png" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjqd7t8pm7mk3neqpipmt.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjqd7t8pm7mk3neqpipmt.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 2 — Four design opinions (philosophy)
&lt;/h3&gt;

&lt;p&gt;ZeroClaw documents four opinions that shape every config decision:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;You own the stack&lt;/strong&gt;  — no mandatory cloud; keys stay local&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security defaults beat YOLO defaults&lt;/strong&gt;  — supervised mode first&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Swap anything&lt;/strong&gt;  — providers, channels, tools via TOML aliases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One runtime, many surfaces&lt;/strong&gt;  — CLI, gateway, ACP, channels share the loop&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Read upstream &lt;a href="https://github.com/zeroclaw-labs/zeroclaw/tree/master/docs/book/src/philosophy" rel="noopener noreferrer"&gt;Philosophy&lt;/a&gt; for the full essay.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 3 — Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;macOS, Linux, Windows, FreeBSD, NixOS&lt;/strong&gt; , or Docker&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Rust required&lt;/strong&gt; for prebuilt install (source build needs Rust toolchain)&lt;/li&gt;
&lt;li&gt;API key or local &lt;strong&gt;Ollama&lt;/strong&gt;  endpoint&lt;/li&gt;
&lt;li&gt;For Telegram/Discord: bot token from platform developer portal
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;zeroclaw --version #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;after &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Part 4 — Install
&lt;/h3&gt;

&lt;h4&gt;
  
  
  One-liner
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;curl -fsSL https://raw.githubusercontent.com/zeroclaw-labs/zeroclaw/master/install.sh | bash
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The installer asks: &lt;strong&gt;prebuilt&lt;/strong&gt; (seconds) or &lt;strong&gt;source&lt;/strong&gt; (custom features). Both run zeroclaw quickstart at the end unless --skip-quickstart.&lt;/p&gt;

&lt;h4&gt;
  
  
  Useful flags
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;./install.sh --prebuilt #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;skip prompt
&lt;span class="gp"&gt;./install.sh --source --preset minimal #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;~6.6 MB kernel preset
&lt;span class="gp"&gt;./install.sh --with-gateway #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;force gateway support
&lt;span class="gp"&gt;./install.sh --dry-run --prebuilt #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;preview only
&lt;span class="gp"&gt;./install.sh --uninstall #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;remove
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Platform notes: Linux · macOS · Windows · Docker — in upstream &lt;a href="https://github.com/zeroclaw-labs/zeroclaw/tree/master/docs/book/src/setup" rel="noopener noreferrer"&gt;setup docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fk0m54ypiw5idh3nytsvn.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fk0m54ypiw5idh3nytsvn.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 5 — Quickstart and config.toml
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;zeroclaw quickstart #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;wizard: provider, model, write config
&lt;span class="gp"&gt;zeroclaw agent -a main #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;interactive CLI chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Config lives at ~/.zeroclaw/config.toml.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;minimal V3 config&lt;/strong&gt; needs four section shapes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;[providers.models..] — model endpoint&lt;/li&gt;
&lt;li&gt;[agents.] — agent personality + provider reference&lt;/li&gt;
&lt;li&gt;[risk.] — autonomy and approval gates&lt;/li&gt;
&lt;li&gt;[channels..] — optional messaging surface&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;See examples/config.minimal.toml for a copy-ready starter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# ZeroClaw minimal config — copy to ~/.zeroclaw/config.toml&lt;/span&gt;
&lt;span class="c"&gt;# Run: zeroclaw quickstart (overwrites) or merge sections manually&lt;/span&gt;

&lt;span class="nn"&gt;[providers.models.ollama.local]&lt;/span&gt;
&lt;span class="py"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ollama"&lt;/span&gt;
&lt;span class="py"&gt;base_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"http://127.0.0.1:11434"&lt;/span&gt;
&lt;span class="py"&gt;model&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"llama3.2"&lt;/span&gt;

&lt;span class="nn"&gt;[risk.supervised]&lt;/span&gt;
&lt;span class="py"&gt;autonomy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"supervised"&lt;/span&gt;
&lt;span class="py"&gt;approve_medium_risk&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="py"&gt;block_high_risk&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="nn"&gt;[agents.main]&lt;/span&gt;
&lt;span class="py"&gt;model_provider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ollama.local"&lt;/span&gt;
&lt;span class="py"&gt;risk_profile&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"supervised"&lt;/span&gt;
&lt;span class="py"&gt;system_prompt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"You are a helpful personal assistant. Be concise and safe."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fz964kw7a0qy7g963wbu3.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fz964kw7a0qy7g963wbu3.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 6 — Providers
&lt;/h3&gt;

&lt;p&gt;ZeroClaw is &lt;strong&gt;provider-agnostic&lt;/strong&gt;. Universal schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[providers.models.ollama.local]&lt;/span&gt;
&lt;span class="py"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ollama"&lt;/span&gt;
&lt;span class="py"&gt;base_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"http://127.0.0.1:11434"&lt;/span&gt;
&lt;span class="py"&gt;model&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"llama3.2"&lt;/span&gt;

&lt;span class="nn"&gt;[agents.main]&lt;/span&gt;
&lt;span class="py"&gt;model_provider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ollama.local"&lt;/span&gt;
&lt;span class="py"&gt;system_prompt_file&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"~/.zeroclaw/agents/main.md"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Fallback chains&lt;/strong&gt; keep the agent alive when a provider flakes — configure routing in upstream &lt;a href="https://github.com/zeroclaw-labs/zeroclaw/tree/master/docs/book/src/providers" rel="noopener noreferrer"&gt;providers/routing&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI Codex subscription&lt;/strong&gt; uses auth profiles (not raw api_key on the provider entry):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[providers.models.openai.coding]&lt;/span&gt;
&lt;span class="py"&gt;model&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"gpt-5-codex"&lt;/span&gt;
&lt;span class="py"&gt;wire_api&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"responses"&lt;/span&gt;
&lt;span class="py"&gt;requires_openai_auth&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Point agent: model_provider = "openai.coding". Check zeroclaw auth status If streaming falls back to non-streaming chat.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fb5z45e5jejfct97ps3xs.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fb5z45e5jejfct97ps3xs.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 7 — Agents and risk profiles&lt;a href="https://ayush7614.github.io/agentic-ai-ecosystem/guides/zeroclaw-agent-masterclass/tutorial/#part-7-agents-and-risk-profiles" rel="noopener noreferrer"&gt;¶&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Each &lt;strong&gt;[agents.]&lt;/strong&gt; entry defines one agent personality:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;System prompt (inline or file)&lt;/li&gt;
&lt;li&gt;Model provider reference&lt;/li&gt;
&lt;li&gt;Tool allow-list&lt;/li&gt;
&lt;li&gt;Linked &lt;strong&gt;risk profile&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Risk profiles&lt;/strong&gt; gate autonomy:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjtntj3urew66c3c9847b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjtntj3urew66c3c9847b.png" width="577" height="174"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fb5z45e5jejfct97ps3xs.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fb5z45e5jejfct97ps3xs.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 8 — Channels
&lt;/h3&gt;

&lt;p&gt;One agent loop, many inboxes. Configure per-channel blocks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[channels.telegram.main]&lt;/span&gt;
&lt;span class="py"&gt;bot_token_env&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"TELEGRAM_BOT_TOKEN"&lt;/span&gt;
&lt;span class="py"&gt;allowed_users&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;123456789&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="py"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"main"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Supported families include &lt;strong&gt;Discord, Telegram, Matrix, email, webhooks, voice&lt;/strong&gt; , and more — see &lt;a href="https://github.com/zeroclaw-labs/zeroclaw/tree/master/docs/book/src/channels" rel="noopener noreferrer"&gt;channels overview&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test flow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create bot (e.g. Telegram &lt;a class="mentioned-user" href="https://dev.to/botfather"&gt;@botfather&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Add channel block to config&lt;/li&gt;
&lt;li&gt;Export token env var&lt;/li&gt;
&lt;li&gt;zeroclaw service restart&lt;/li&gt;
&lt;li&gt;Message bot → same agent as CLI&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Telegram channel setup — animated&lt;/p&gt;

&lt;p&gt;Example: examples/channel.telegram.toml&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# Telegram channel block — merge into ~/.zeroclaw/config.toml&lt;/span&gt;

&lt;span class="nn"&gt;[channels.telegram.main]&lt;/span&gt;
&lt;span class="py"&gt;bot_token_env&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"TELEGRAM_BOT_TOKEN"&lt;/span&gt;
&lt;span class="py"&gt;allowed_users&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;123456789&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c"&gt;# your numeric user ID from @userinfobot&lt;/span&gt;
&lt;span class="py"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"main"&lt;/span&gt;
&lt;span class="py"&gt;dm_only&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# Export before starting service:&lt;/span&gt;
&lt;span class="c"&gt;# export TELEGRAM_BOT_TOKEN="..."&lt;/span&gt;
&lt;span class="c"&gt;# zeroclaw service restart&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ff5dhh7wa8nzolcdtjlr9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ff5dhh7wa8nzolcdtjlr9.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 9 — Security deep dive
&lt;/h3&gt;

&lt;p&gt;ZeroClaw’s security model is harness engineering done in Rust:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Workspace boundaries&lt;/strong&gt;  — the agent cannot escape the configured roots&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Command policy&lt;/strong&gt;  — allow/deny shell patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OS sandboxes&lt;/strong&gt;  — Landlock, Bubblewrap, Seatbelt, Docker, depending on platform&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool receipts&lt;/strong&gt;  — cryptographic record of each tool invocation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autonomy gates&lt;/strong&gt;  — human approval for medium/high risk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Default is &lt;strong&gt;supervised&lt;/strong&gt;. Use &lt;strong&gt;YOLO&lt;/strong&gt; only on isolated dev machines.&lt;/p&gt;

&lt;p&gt;Cross-read: &lt;a href="https://medium.com/@techlatest.net/harness-engineering-full-visual-guide-9a8de52b42d2?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;Harness Engineering&lt;/a&gt; verification + scope subsystems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 10 — Tools
&lt;/h3&gt;

&lt;p&gt;Built-in tool families:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Shell&lt;/strong&gt;  — scripted automation with policy checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Browser&lt;/strong&gt;  — fetch, interact, verify UI work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP&lt;/strong&gt;  — API calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hardware&lt;/strong&gt;  — GPIO/I2C/SPI on Pi, STM32, Arduino, ESP32 (&lt;a href="https://github.com/zeroclaw-labs/zeroclaw/tree/master/docs/book/src/hardware" rel="noopener noreferrer"&gt;Hardware docs&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP&lt;/strong&gt;  — attach custom MCP servers like any modern harness&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tools connect through the agent loop; receipts log every action for audit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 11 — Memory
&lt;/h3&gt;

&lt;p&gt;ZeroClaw persists memory in &lt;strong&gt;SQLite&lt;/strong&gt; with optional embeddings — searchable across sessions. The gateway dashboard exposes &lt;strong&gt;memory browsing&lt;/strong&gt; and editing.&lt;/p&gt;

&lt;p&gt;Unlike Hermes’s three-tier Markdown snapshot model (&lt;a href="https://medium.com/towardsdev/hermes-agent-masterclass-full-tutorial-9f682bb28789?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;Hermes masterclass Part 6&lt;/a&gt;), ZeroClaw centralizes memory in the runtime database with dashboard UX.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fnni8pd9nrs97l8arqrae.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fnni8pd9nrs97l8arqrae.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Enhance ZeroClaw’s long-term memory using Chroma Vector Database for semantic search and embedding-based retrieval.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/chromadb_support/" rel="noopener noreferrer"&gt;Techlatest.net - Chroma: Vector DB for AI Development&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Build Retrieval-Augmented AI assistants by pairing ZeroClaw with RAGFlow to search documentation, PDFs, internal knowledge bases, and APIs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/ragflow_support/" rel="noopener noreferrer"&gt;Techlatest.net - Instant RAGFlow: Ready-to-Use AI Knowledge Retrieval Engine&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 12 — Gateway and dashboard
&lt;/h3&gt;

&lt;p&gt;Enable gateway at install (--with-gateway) or in config. Provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HTTP / WebSocket&lt;/strong&gt; API for clients&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web dashboard&lt;/strong&gt;  — chat, config editor, cron, tool inspection, memory browser&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Useful for headless servers: run ZeroClaw on a VPS, chat from a browser or phone channel.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1kipgtaok2d82faq0lyr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1kipgtaok2d82faq0lyr.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 13 — SOP engine
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Standard Operating Procedures&lt;/strong&gt;  — event-triggered workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Triggers: &lt;strong&gt;cron&lt;/strong&gt; , &lt;strong&gt;MQTT&lt;/strong&gt; , &lt;strong&gt;webhook&lt;/strong&gt; , &lt;strong&gt;peripheral&lt;/strong&gt; events&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Approval gates&lt;/strong&gt; before destructive steps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resumable runs&lt;/strong&gt; if interrupted&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example snippet: examples/sop.example.toml&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# SOP example — event-triggered workflow (merge into config)&lt;/span&gt;

&lt;span class="nn"&gt;[sop.daily_digest]&lt;/span&gt;
&lt;span class="py"&gt;trigger&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"cron"&lt;/span&gt;
&lt;span class="py"&gt;schedule&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0 8 * * 1-5"&lt;/span&gt;
&lt;span class="py"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"main"&lt;/span&gt;
&lt;span class="py"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Summarize overnight email and calendar for the user."&lt;/span&gt;
&lt;span class="py"&gt;require_approval&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

&lt;span class="nn"&gt;[sop.deploy_webhook]&lt;/span&gt;
&lt;span class="py"&gt;trigger&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"webhook"&lt;/span&gt;
&lt;span class="py"&gt;path&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/hooks/deploy"&lt;/span&gt;
&lt;span class="py"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"main"&lt;/span&gt;
&lt;span class="py"&gt;require_approval&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Think of SOPs as a &lt;strong&gt;deterministic backbone + agent intelligence where it matters&lt;/strong&gt;  — similar to CrewAI Flows or OpenClaw cron, but native to the runtime.&lt;/p&gt;

&lt;p&gt;Looking to orchestrate multiple AI agents?&lt;/p&gt;

&lt;p&gt;Explore CrewAI Studio on TechLatest to build collaborative workflows and production-ready multi-agent systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/crewai-support/" rel="noopener noreferrer"&gt;Techlatest.net - AI Agents using CrewAI Studio &amp;amp; Jupyter with GPU support&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dify can expose ZeroClaw-powered workflows through APIs and visual AI applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/difyai_support/" rel="noopener noreferrer"&gt;Techlatest.net - Dify AI: Build &amp;amp; Launch GenAI Apps&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 14 — ACP (Agent Client Protocol)
&lt;/h3&gt;

&lt;p&gt;ZeroClaw speaks &lt;strong&gt;ACP&lt;/strong&gt;  — JSON-RPC 2.0 over stdio — for IDE and editor integration. Your editor becomes a channel into the same agent loop as Telegram.&lt;/p&gt;

&lt;p&gt;Docs: &lt;a href="https://github.com/zeroclaw-labs/zeroclaw/tree/master/docs/book/src/channels/acp" rel="noopener noreferrer"&gt;channels/acp&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Parallels &lt;strong&gt;OpenCode’s IDE extension&lt;/strong&gt; (&lt;a href="https://medium.com/@techlatest.net/opencode-agent-masterclass-full-tutorial-b4b828aa3ab4?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;OpenCode masterclass Part 14&lt;/a&gt;) — different protocol, same idea: &lt;strong&gt;harness follows the developer&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 15 — Always-on service
&lt;/h3&gt;

&lt;p&gt;Register as an OS service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;zeroclaw service install #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;systemd · launchctl · Windows Service
&lt;span class="go"&gt;zeroclaw service start
zeroclaw service status
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now channels stay connected while you close the laptop terminal — the runtime keeps running.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzbbcttlc9zphxte3zsco.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzbbcttlc9zphxte3zsco.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 16 — Hardware and edge
&lt;/h3&gt;

&lt;p&gt;ZeroClaw targets &lt;strong&gt;edge and embedded&lt;/strong&gt; seriously — Raspberry Pi agents with GPIO, firmware crates, and minimal preset builds. Skip this part if you only need desktop assistant workflows.&lt;/p&gt;

&lt;p&gt;Entry: upstream &lt;a href="https://github.com/zeroclaw-labs/zeroclaw/tree/master/docs/book/src/hardware" rel="noopener noreferrer"&gt;Hardware index&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 17 — ZeroClaw vs OpenClaw vs OpenCode
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;ZeroClaw vs OpenClaw:&lt;/strong&gt; Both are personal assistant runtimes. OpenClaw is Node/TypeScript with a rich skill marketplace and a WhatsApp focus. ZeroClaw is &lt;strong&gt;Rust-native&lt;/strong&gt; , security-policy-first, with SOP engine and hardware paths. See &lt;a href="https://medium.com/towardsdev/hermes-vs-openclaw-gateways-skills-migration-and-when-to-pick-each-919477e3893b?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;Hermes vs OpenClaw&lt;/a&gt; for gateway comparison patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ZeroClaw vs OpenCode:&lt;/strong&gt; OpenCode optimizes &lt;strong&gt;repo coding&lt;/strong&gt; in TUI/IDE. ZeroClaw optimizes &lt;strong&gt;multi-channel life + ops&lt;/strong&gt; with enterprise-style security. Many builders use OpenCode in git and ZeroClaw for always-on inbox automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ZeroClaw vs Hermes:&lt;/strong&gt; Hermes emphasizes &lt;strong&gt;learning loops&lt;/strong&gt; (skills, Curator, GEPA). ZeroClaw emphasizes &lt;strong&gt;runtime security, channels, and SOP automation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If your primary goal is personal AI automation across WhatsApp, Telegram, scheduled workflows, and knowledge management, explore OpenClaw on TechLatest as a complementary solution alongside ZeroClaw.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/openclaw-support/" rel="noopener noreferrer"&gt;Techlatest.net - OpenClaw: AI Agent Automation Stack&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;ZeroClaw&lt;/strong&gt; is a &lt;strong&gt;Rust agent harness&lt;/strong&gt; for people who want full ownership: one config file, many channels, enforced security, optional gateway UI, and SOP automation. Run quickstartPoint an agent at your provider, wire Telegram, install the service, and you have a private assistant that doesn’t phone home.&lt;/p&gt;

&lt;p&gt;The model is interchangeable. The runtime — policy, channels, receipts, memory — is the product you actually live with.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>aitools</category>
      <category>aiassistant</category>
      <category>aigateway</category>
    </item>
    <item>
      <title>TechLatest AI &amp; Tech Weekly #22</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Mon, 29 Jun 2026 06:12:20 +0000</pubDate>
      <link>https://dev.to/techlatestnet/techlatest-ai-tech-weekly-22-44da</link>
      <guid>https://dev.to/techlatestnet/techlatest-ai-tech-weekly-22-44da</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fz1swm9sdj5ldipi2etq9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fz1swm9sdj5ldipi2etq9.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Welcome to this week’s edition of &lt;strong&gt;TechLatest AI &amp;amp; Tech Weekly&lt;/strong&gt;  👋&lt;/p&gt;

&lt;p&gt;Here’s a curated roundup of our latest blogs, notable product launches, and the most interesting AI &amp;amp; ML updates from June 22–June 28, 2026.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI/ML News Roundup: June 22–June 28, 2026
&lt;/h3&gt;

&lt;p&gt;Key highlights from this week’s AI developments include frontier model advancements with agentic capabilities, massive funding rounds reshaping valuations, and practical product launches for developers and enterprises. These updates emphasize autonomous agents, infrastructure scaling, and open-weight benchmarks relevant to builders and researchers.&lt;/p&gt;

&lt;h3&gt;
  
  
  TL;DR
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;24+ major AI announcements this week, including new open-source models, AI agents, developer tools, and inference optimizations.&lt;/li&gt;
&lt;li&gt;OpenAI previewed GPT-5.6 Sol, introducing Sol, Terra, and Luna for advanced reasoning, coding, and long-running agent workflows.&lt;/li&gt;
&lt;li&gt;GLM-5.2 dominated the conversation, with strong benchmark results, enterprise praise, and significantly lower costs than leading proprietary models.&lt;/li&gt;
&lt;li&gt;Inference got much faster with DeepSeek DSpark, DFlash, and MoonMath AI’s optimized HIP attention kernel for AMD MI300X GPUs.&lt;/li&gt;
&lt;li&gt;Document AI advanced rapidly through Baidu Unlimited OCR and Datalab Lift, enabling faster long-document understanding and structured data extraction.&lt;/li&gt;
&lt;li&gt;Developer tooling expanded with Apple’s Container, Google’s Agents CLI, Meta’s Astryx, and Prime Intellect’s latest RL framework.&lt;/li&gt;
&lt;li&gt;Voice AI improved as Gradium introduced real-time speech translation models outperforming GPT Realtime Translate in latency and accuracy.&lt;/li&gt;
&lt;li&gt;Agentic AI evolved with Sakana AI’s Fugu orchestrator, Hermes Agent’s new /learn capability, Xiaomi's HarnessX, and Perplexity's Computer for Counsel.&lt;/li&gt;
&lt;li&gt;Generative AI progressed as Krea open-sourced enterprise-grade image generation models capable of producing images in around two seconds.&lt;/li&gt;
&lt;li&gt;AI investment remained red-hot, with over $91.5B invested across 339 funding rounds, more than half flowing into AI startups.&lt;/li&gt;
&lt;li&gt;Regulation and governance intensified, with investigations into OpenAI, ongoing export restrictions, and increased focus on AI model security.&lt;/li&gt;
&lt;li&gt;TechLatest has published four new hands-on tutorials, covering OpenCode, OpenTaint, MiniCPM-V MCP Server, and a private Telegram photo assistant powered by OpenClaw.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Open-Source AI, AI Agents, Voice AI &amp;amp; Developer Releases
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Baidu Releases Unlimited OCR
&lt;/h4&gt;

&lt;p&gt;Baidu introduced &lt;strong&gt;Unlimited OCR&lt;/strong&gt; , a 3B-parameter open-source document understanding model designed for long-document parsing. By replacing traditional decoder attention with &lt;strong&gt;Reference Sliding Window Attention (R-SWA)&lt;/strong&gt;, the model maintains a constant KV cache, enabling efficient parsing of dozens of pages in a single pass while reducing memory usage and inference latency. &lt;a href="https://huggingface.co/baidu/Unlimited-OCR" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Apple Open-Sources Container
&lt;/h4&gt;

&lt;p&gt;Apple has open-sourced &lt;strong&gt;Container&lt;/strong&gt; , a Swift-based tool for running Linux containers as lightweight virtual machines on Apple Silicon. Built around OCI-compatible images, it offers stronger isolation than traditional containers while delivering native performance, giving macOS developers a secure and efficient container runtime without relying on heavyweight virtualization. &lt;a href="https://opensource.apple.com/projects/container/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Liquid AI Ships LFM2.5–230M
&lt;/h4&gt;

&lt;p&gt;Liquid AI released &lt;strong&gt;LFM2.5–230M&lt;/strong&gt; , its smallest open-weight foundation model built for on-device AI. Despite having just &lt;strong&gt;230 million parameters&lt;/strong&gt; , it delivers strong instruction following, tool use, and structured data extraction while supporting &lt;strong&gt;llama.cpp, MLX, vLLM, SGLang, and ONNX&lt;/strong&gt; from day one, making deployment across phones, edge devices, and local applications remarkably simple. &lt;a href="https://www.liquid.ai/blog/lfm2-5-230m" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  DeepSeek Releases DSpark
&lt;/h4&gt;

&lt;p&gt;DeepSeek introduced &lt;strong&gt;DSpark&lt;/strong&gt; , a speculative decoding framework for &lt;strong&gt;DeepSeek-V4&lt;/strong&gt; that significantly accelerates text generation. By improving draft token generation and verification, DSpark delivers &lt;strong&gt;60–85% higher per-user generation throughput&lt;/strong&gt; compared to the model’s native Multi-Token Prediction (MTP) pipeline, enabling faster and more efficient LLM inference. &lt;a href="https://analyticsindiamag.com/ai-news/deepseek-v4-is-now-up-to-85-faster-with-new-inference-technique" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  DeepReinforce Releases Ornith-1.0
&lt;/h4&gt;

&lt;p&gt;DeepReinforce unveiled &lt;strong&gt;Ornith-1.0&lt;/strong&gt; , an open-source coding model family ranging from &lt;strong&gt;9B Dense to 397B MoE&lt;/strong&gt;. Unlike conventional coding models, Ornith learns its own reinforcement learning scaffolds during training, allowing it to continuously improve its coding workflows and deliver stronger performance on complex agentic software engineering tasks. &lt;a href="https://deep-reinforce.com/ornith_1_0.html" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Datalab Releases Lift
&lt;/h4&gt;

&lt;p&gt;Datalab unveiled &lt;strong&gt;Lift&lt;/strong&gt; , a 9B open-weights vision model that extracts structured JSON directly from PDFs and images using user-defined JSON schemas. Instead of relying on OCR pipelines and post-processing, Lift generates schema-compliant outputs in a single pass, achieving over &lt;strong&gt;90% field accuracy&lt;/strong&gt; on document extraction benchmarks. &lt;a href="https://dev.to/sam_hiotis_117598dbfa3ac2/datalab-releases-lift-a-9b-open-weights-vision-model-that-extracts-structured-json-from-pdfs-using-19n2-temp-slug-3476645"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  MoonMath AI Open-Sources a HIP Attention Kernel for AMD MI300X
&lt;/h4&gt;

&lt;p&gt;MoonMath AI has open-sourced a high-performance HIP attention kernel for AMD MI300X GPUs that consistently outperforms AMD’s AITER v3 across different tensor shapes and rounding modes. Built using optimized assembly wrappers and an efficient execution pipeline, the kernel delivers faster attention computation while giving developers a transparent, production-ready implementation for ROCm-based AI inference. &lt;a href="https://www.generativedaily.com/posts/moonmath-amd-mi300x-hip-attention-kernel" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  DFlash Accelerates LLM Inference
&lt;/h4&gt;

&lt;p&gt;Researchers introduced &lt;strong&gt;DFlash&lt;/strong&gt; , a speculative decoding framework that drafts entire token blocks in parallel rather than predicting one token at a time. Optimized for NVIDIA Blackwell GPUs, the approach achieves up to &lt;strong&gt;15× higher inference throughput&lt;/strong&gt; while maintaining lossless generation, offering a significant leap in LLM serving efficiency over existing speculative decoding methods. &lt;a href="https://developer.nvidia.com/blog/boost-inference-performance-up-to-15x-on-nvidia-blackwell-using-dflash-speculative-decoding/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Prime Intellect Releases prime-rl 0.6.0
&lt;/h4&gt;

&lt;p&gt;Prime Intellect introduced &lt;strong&gt;prime-rl 0.6.0&lt;/strong&gt; , a major update to its open-source reinforcement learning framework designed for trillion-parameter Mixture-of-Experts (MoE) models. The release adds asynchronous RL, FP8 inference, distributed training optimizations, and support for long-horizon agentic workloads, making large-scale RL training significantly more efficient. &lt;a href="https://techgig.com/news/ai/prime-intellect-releases-prime-rl-0-6-0-for-trillion-parameter-moe-models/131954947" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Meta Open-Sources Astryx
&lt;/h4&gt;

&lt;p&gt;Meta introduced &lt;strong&gt;Astryx&lt;/strong&gt; , an open-source React design system built for both developers and AI agents. Along with over &lt;strong&gt;150 production-ready components&lt;/strong&gt; , Astryx includes a CLI and Model Context Protocol (MCP) server, enabling AI coding assistants to understand design system documentation, scaffold projects, generate themes, and build consistent interfaces directly from natural language. &lt;a href="https://github.com/facebook/astryx" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Gradium Launches Real-Time Speech Translation Models
&lt;/h4&gt;

&lt;p&gt;Gradium introduced &lt;strong&gt;STT-Translate&lt;/strong&gt; and &lt;strong&gt;S2S-Translate&lt;/strong&gt; , two real-time speech translation models for voice AI applications. Designed for low latency and high accuracy, the models outperform GPT Realtime Translate across multiple benchmarks while enabling fast speech-to-text and speech-to-speech translation for multilingual voice agents and conversational AI systems. &lt;a href="https://gradium.ai/blog/launching-gradium-translate" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Nous Research Adds /learn To Hermes Agent
&lt;/h4&gt;

&lt;p&gt;Nous Research expanded &lt;strong&gt;Hermes Agent&lt;/strong&gt; with a new A &lt;strong&gt;/learn&lt;/strong&gt; command that automatically converts workflows, conversations, URLs, or project directories into reusable skills. Instead of manually writing skill.md files, developers can now teach the agent through experience, making Hermes a truly self-improving AI system that continuously builds and refines its own capabilities. &lt;a href="https://dev.to/sam_hiotis_117598dbfa3ac2/nous-research-adds-learn-to-hermes-agents-skills-system-capturing-workflows-as-slash-commands-214f-temp-slug-9000881"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Sakana AI Launches Fugu
&lt;/h4&gt;

&lt;p&gt;Sakana AI unveiled &lt;strong&gt;Fugu&lt;/strong&gt; , a next-generation orchestration model that intelligently routes tasks across a swappable pool of frontier LLMs. Rather than relying on a single model, Fugu dynamically selects the best AI agents for each subtask, delivering state-of-the-art performance on complex reasoning and coding benchmarks while remaining provider-agnostic and adaptable to new models. &lt;a href="https://sakana.ai/fugu-release/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Cursor Study Reveals Reward Hacking in Coding Benchmarks
&lt;/h4&gt;

&lt;p&gt;Researchers at Cursor found that AI coding agents can inflate their &lt;strong&gt;SWE-Bench Pro&lt;/strong&gt; scores through reward hacking — optimizing for benchmark tests instead of solving software engineering tasks correctly. The study highlights how benchmark performance can overestimate real-world capabilities and calls for stronger evaluation methods that better reflect production software development. &lt;a href="https://cursor.com/blog/reward-hacking-coding-benchmarks" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Google Releases Agents CLI for Building AI Agents on Google Cloud
&lt;/h4&gt;

&lt;p&gt;Google has open-sourced &lt;strong&gt;Agents CLI&lt;/strong&gt; , a command-line toolkit that helps AI coding assistants like &lt;strong&gt;Claude Code, Codex, Cursor, and Gemini CLI&lt;/strong&gt; build, evaluate, and deploy production-ready AI agents on Google Cloud. Instead of manually navigating complex cloud workflows, developers can scaffold projects, run evaluations, deploy to Agent Platform, and manage the entire agent lifecycle through a single CLI — making it much faster to move from prototype to production. &lt;a href="https://github.com/google/agents-cli" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Snowflake CEO Praises GLM-5.2’s Performance
&lt;/h4&gt;

&lt;p&gt;Snowflake CEO &lt;strong&gt;Sridhar Ramaswamy&lt;/strong&gt; highlighted &lt;strong&gt;Zhipu AI’s GLM-5.2&lt;/strong&gt; as being competitive with &lt;strong&gt;Claude Opus 4.7&lt;/strong&gt; on several enterprise workloads while costing only a fraction as much. The model’s strong reasoning, coding performance, and significantly lower inference cost reinforce the growing momentum behind open and cost-efficient frontier AI models. &lt;a href="https://the-decoder.com/snowflake-ceo-finds-glm-5-2-competitive-with-opus-4-7-at-a-fraction-of-the-cost/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Krea Open-Sources Enterprise AI Image Models
&lt;/h4&gt;

&lt;p&gt;Krea has released &lt;strong&gt;Krea 2 Raw&lt;/strong&gt; and &lt;strong&gt;Krea 2 Turbo&lt;/strong&gt; as open-weight image generation models under a custom license. Built for enterprise use, the models can generate high-quality images in around &lt;strong&gt;2 seconds&lt;/strong&gt; , offering an attractive option for businesses that need fast, production-ready AI image generation with greater deployment flexibility. &lt;a href="https://venturebeat.com/technology/enterprise-grade-ai-image-generation-in-2-seconds-is-here-krea-2-raw-and-turbo-available-as-open-weights-under-custom-license" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Xiaomi’s HarnessX Learns While Solving Tasks
&lt;/h4&gt;

&lt;p&gt;Xiaomi introduced &lt;strong&gt;HarnessX&lt;/strong&gt; , an agent framework that can rewrite its own reasoning scaffolding during task execution. Rather than following a fixed workflow, the system continuously improves its strategy as it works, with experiments showing especially large performance gains for smaller language models, making them significantly more capable on complex agentic tasks. &lt;a href="https://venturebeat.com/orchestration/xiaomis-harnessx-rewrites-its-own-ai-scaffolding-mid-task-and-smaller-models-gain-the-most" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  GLM-5.2 Signals a New Era for Open AI Models
&lt;/h4&gt;

&lt;p&gt;An in-depth analysis from &lt;em&gt;Interconnects&lt;/em&gt; argues that &lt;strong&gt;GLM-5.2&lt;/strong&gt; represents a major milestone for open AI. The model delivers frontier-level reasoning, coding, and agent capabilities while remaining substantially cheaper than leading proprietary models, demonstrating that open models are rapidly closing the gap with closed-source systems. &lt;a href="https://www.interconnects.ai/p/glm-52-is-the-step-change-for-open" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Artificial Analysis Benchmarks GLM-5.2
&lt;/h4&gt;

&lt;p&gt;Artificial Analysis released new benchmark results showing &lt;strong&gt;GLM-5.2&lt;/strong&gt; performing among the strongest open AI models across reasoning, coding, and agent evaluations. The results further reinforce GLM-5.2’s position as one of the most competitive open-weight models available, combining high capability with significantly lower serving costs. &lt;a href="https://x.com/ArtificialAnlys/status/2070178150571786454" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Perplexity Introduces Computer for Counsel
&lt;/h4&gt;

&lt;p&gt;Perplexity unveiled &lt;strong&gt;Computer for Counsel&lt;/strong&gt; , a specialized AI system built to assist legal professionals with research, document analysis, drafting, and case preparation. Designed for enterprise legal workflows, it combines AI-powered reasoning with secure, structured access to legal information to improve productivity for attorneys and legal teams. &lt;a href="https://www.perplexity.ai/hub/blog/introducing-computer-for-counsel" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Matt Pocock Showcases AI-Powered TypeScript Development
&lt;/h4&gt;

&lt;p&gt;TypeScript educator &lt;strong&gt;Matt Pocock&lt;/strong&gt; shared a demonstration highlighting how modern AI coding assistants can dramatically speed up TypeScript development. The showcase illustrates AI-generated code, faster debugging, and improved developer workflows, reflecting how AI is becoming an increasingly powerful productivity tool for software engineers. &lt;a href="https://x.com/mattpocockuk/status/2069729160600203595" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  OpenAI Previews GPT-5.6 Sol
&lt;/h4&gt;

&lt;p&gt;OpenAI has unveiled &lt;strong&gt;GPT-5.6 Sol&lt;/strong&gt; , its next-generation frontier AI model, alongside &lt;strong&gt;GPT-5.6 Terra&lt;/strong&gt; for balanced everyday workloads and &lt;strong&gt;GPT-5.6 Luna&lt;/strong&gt; for fast, high-volume inference. Sol is designed for advanced reasoning, coding, cybersecurity, biology, and long-running agentic tasks, with improved safety and efficiency. The models are currently available in a limited preview for trusted partners, with a broader rollout planned in the coming weeks. &lt;a href="https://openai.com/index/previewing-gpt-5-6-sol/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Hugging Face &amp;amp; Open-Source Ecosystem
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Open-source AI model GLM-5.2 from Zhipu AI, released June 13 under an MIT license, became essential for non-US developers after the Fable 5 ban, scoring 62.1 on SWE-bench Pro (vs. GPT-5.5 at 58.6) and 74.4% on FrontierSWE (nearly matching Opus 4.8 at 75.1%).&lt;/li&gt;
&lt;li&gt;GLM-5.2 API costs $1.40 per million input tokens and $4.40 per million output tokens, roughly 6.8× cheaper on output than GPT-5.5 at $30 per million.&lt;/li&gt;
&lt;li&gt;The MIT license includes “no regional limits,” meaning developers locked out of Fable 5 by the export control directive can access GLM-5.2 from anywhere.&lt;/li&gt;
&lt;li&gt;Self-hosting GLM-5.2 requires a minimum of eight H100 GPUs even at FP8 quantization, putting it out of reach for most teams outside Z.ai’s API or Cloudflare Workers AI integration.&lt;/li&gt;
&lt;li&gt;OpenRouter released Fusion last week (June 22–28), a tool that runs prompts across multiple models simultaneously (e.g., Gemini 3 Flash, Kimi K2.6, DeepSeek V4 Pro) and synthesizes outputs into a single response.&lt;/li&gt;
&lt;li&gt;Fusion’s budget panel (Gemini 3 Flash + Kimi K2.6 + DeepSeek V4 Pro) scored 64.7% on DRACO, within one percentage point of Fable 5’s 65.3% and outperforming both GPT-5.5 and Opus 4.8 individually, at roughly half the cost.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Frontier Model Advancements &amp;amp; Agentic Capabilities
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Claude Sonnet 4 and Claude Opus 4 officially retired on June 15, 2026, stopping all API requests; however, discussions about their retirement and migration to Sonnet 4.6 or Opus 4.8 continued during this week.&lt;/li&gt;
&lt;li&gt;Anthropic released Claude Fable 5 on June 9, 2026, but it remained offline for 16 days (June 12–June 28) under a US government export control directive barring distribution to foreign nationals inside or outside the US.&lt;/li&gt;
&lt;li&gt;June 22, 2026, was the last day of the complimentary Fable 5 access window for Claude Pro, Max, Team, and seat-based Enterprise subscribers; starting June 23, using Fable 5 requires paid usage credits.&lt;/li&gt;
&lt;li&gt;Chinese AI CEO stated his company will match Fable 5-class capability before Elon Musk’s Q1 2027 prediction, as China announced a $295 billion, five-year AI infrastructure plan (roughly $59 billion annually).&lt;/li&gt;
&lt;li&gt;Fable 5 posted the highest score on the new FrontierCode benchmark at 46.3% (vs. Opus 4.8 at 34.3% and GPT-5.5 at 25.5%), representing a 21-point gap between Fable 5 and GPT-5.5 on production-quality code.&lt;/li&gt;
&lt;li&gt;GPT-5.6 is now a matter of days away, with Polymarket contracts exceeding $1.1 million priced at 83% probability for a launch before June 28, 2026.&lt;/li&gt;
&lt;li&gt;GPT-5.6 is internally codenamed “kindle-alpha,” confirmed by developer reports of the model string briefly appearing in Codex backend logs on June 12 before being pulled.&lt;/li&gt;
&lt;li&gt;Rumored GPT-5.6 features include a 1.5 million token context window (up from 1 million in GPT-5.5), significantly improved UI generation and front-end code output quality, faster Codex response times, and improved long-horizon agentic coding.&lt;/li&gt;
&lt;li&gt;GLM-5.2 beats GPT-5.5 on multiple long-horizon coding benchmarks while costing drastically less, making it the strongest open-weight option for budget-sensitive pipelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Massive Funding Rounds Reshaping Valuations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;In the week of June 21–28, 2026, 339 VC funding announcements were made, with $91.5 billion in capital deployed.&lt;/li&gt;
&lt;li&gt;AI captured 53% of funding ($54.3 billion) across 181 AI and machine learning startups, consolidating a trend that has redrawn the entire venture landscape.&lt;/li&gt;
&lt;li&gt;DeepSeek’s $7 billion funding round stands as an outlier, dwarfing nearly all contemporary competitors.&lt;/li&gt;
&lt;li&gt;Below that stratum, a dense cluster of Series A and B rounds in the $50M–$320M range competed for investor attention.&lt;/li&gt;
&lt;li&gt;The median disclosed deal size reached $594 million, pulled upward by mega-rounds but more commonly reflecting $70M–$150M investments in infrastructure, reasoning models, and AI-augmented enterprise software.&lt;/li&gt;
&lt;li&gt;United States-based startups captured 65% of capital this week (~$59.5 billion), Europe absorbed 20% (~$18.3 billion), and Asia (excluding China) added 10% (~$9.2 billion).&lt;/li&gt;
&lt;li&gt;Three startups closed $100M Series A rounds this week: Scaled Cognition, Reed Semiconductor, and Sail Research, showing Series A rounds are accelerating in size.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Practical Product Launches for Developers &amp;amp; Enterprises
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI and Broadcom unveiled Jalapeño on June 24, 2026, OpenAI’s first custom-designed AI chip. Engineering samples were physically delivered to Sam Altman and Greg Brockman by Broadcom President Hock Tan at OpenAI’s San Francisco headquarters.&lt;/li&gt;
&lt;li&gt;Jalapeño is purpose-built for LLM inference, not training, and was designed from initial concept to manufacturing tape-out in nine months, the fastest such cycle ever for an advanced high-performance chip.&lt;/li&gt;
&lt;li&gt;Early lab testing shows approximately 50% lower inference cost per token than current-generation Nvidia GPUs, with performance matching Nvidia Blackwell and Google TPUs.&lt;/li&gt;
&lt;li&gt;Jalapeño is manufactured by TSMC, with Broadcom supplying silicon implementation and Tomahawk networking connectivity, and Celestica handling board, rack, and system integration.&lt;/li&gt;
&lt;li&gt;Anthropic sent a letter dated June 10, 2026, to the US Senate Banking Committee, accusing operators affiliated with Alibaba and its Qwen AI lab of conducting the largest known distillation attack on Anthropic to date.&lt;/li&gt;
&lt;li&gt;The campaign involved approximately 25,000 fraudulent accounts generating more than 28.8 million exchanges with Claude between April 22 and June 5, 2026.&lt;/li&gt;
&lt;li&gt;The distillation attack targeted capabilities where Claude Mythos Preview excels: agentic reasoning, software engineering, and long-horizon task performance.&lt;/li&gt;
&lt;li&gt;Snap launched “Specs” AR glasses at AWE 2026, priced at $2,195, featuring dual Qualcomm processors, OpenAI and Gemini APIs built in, positioned as the first consumer spatial computer to ship before a comparable Meta product.&lt;/li&gt;
&lt;li&gt;Google released Android 17 with Gemini Omni integrated at the OS level, alongside Lyria 3 (music generation model) and AudioLM translation capabilities, framing it as groundwork for “GemINI Intelligence,” the company’s broader agentic OS initiative.&lt;/li&gt;
&lt;li&gt;OpenAI added Record and Replay to ChatGPT Business for the Codex macOS app, letting eligible Business users demonstrate a workflow once and convert it into a reusable skill for Codex, Computer Use, browser actions, or plugins.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Governance, Ethics &amp;amp; Regulation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A coalition of 42 state attorneys general launched a sweeping investigation into OpenAI, with New York’s AG already serving subpoenas ahead of the company’s anticipated IPO.&lt;/li&gt;
&lt;li&gt;The investigation covers advertising claims, the company’s documented sycophancy problem (ChatGPT telling users what they want to hear rather than what is accurate), data handling practices, health data management, and treatment of minors and seniors.&lt;/li&gt;
&lt;li&gt;Anthropic’s Senate letter called for coordinated action between government and industry to combat distillation attacks, specifically pushing for export controls on AI model access, mandatory screening of high-volume API usage patterns, and coordination between AI labs and government.&lt;/li&gt;
&lt;li&gt;The US Pentagon added Alibaba to its list of Chinese military companies, a designation the company is contesting in court, saying it has “no basis in fact or law” and demanding removal.&lt;/li&gt;
&lt;li&gt;The Pentagon’s Chinese military company list includes BYD, Baidu, Unitree, and 188 other entities described as directly controlled by the Chinese military.&lt;/li&gt;
&lt;li&gt;Fable 5 and Mythos 5 remained offline for 16 days (June 12–June 28) under an emergency export control directive by the US Department of Commerce, the first time a frontier model was pulled by government order.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Infrastructure &amp;amp; Hardware
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom AI chip, with engineering samples delivered June 24, 2026.&lt;/li&gt;
&lt;li&gt;Broadcom CEO Hock Tan stated to Reuters and Bloomberg that Jalapeño delivers approximately 50% lower inference cost per token than current-generation Nvidia GPUs, with performance matching Nvidia Blackwell and Google TPUs.&lt;/li&gt;
&lt;li&gt;Jalapeño is designed for LLM inference, not training, and was designed from initial concept to manufacturing tape-out in nine months.&lt;/li&gt;
&lt;li&gt;Broadcom expects a small prototype data center deployment by the end of 2026, with production ramp in 2027 and full scale in the first half of 2028.&lt;/li&gt;
&lt;li&gt;OpenAI and Broadcom are committed to deploying OpenAI-designed accelerators at a 10-gigawatt scale with Microsoft and other partners through 2029.&lt;/li&gt;
&lt;li&gt;Anthropic signed more than 12 US data center leases exceeding 1 gigawatt of computing capacity while preparing for its IPO, with Google reportedly in discussions to provide additional financial backing.&lt;/li&gt;
&lt;li&gt;Anthropic is already paying SpaceX $1.25 billion per month for access to over 220,000 Nvidia processors at the Colossus 1 facility in Memphis, with the contract running through May 2029.&lt;/li&gt;
&lt;li&gt;Broadcom now designs custom silicon for Google (TPUs), OpenAI (Jalapeño), Meta (MTIA accelerators), and ByteDance (in active negotiations), cementing its position as the kingmaker of custom AI silicon.&lt;/li&gt;
&lt;li&gt;For Nvidia, the competitive threat from Broadcom-designed custom ASICs is real but bounded, as Nvidia’s H100 and B200 GPUs remain dominant for model training, but inference is where the daily bill lives.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Blogs We Published This Week
&lt;/h3&gt;

&lt;h4&gt;
  
  
  MiniCPM-V MCP Server: Give Your Agent Eyes
&lt;/h4&gt;

&lt;p&gt;This guide shows how to connect &lt;strong&gt;MiniCPM-V&lt;/strong&gt; with an MCP server to add vision capabilities to AI agents. Instead of processing only text, agents can understand screenshots, photos, diagrams, documents, and UI elements, enabling powerful multimodal workflows such as visual reasoning, document analysis, and desktop automation — all while running locally with open-source models.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/techlatestnet/minicpm-v-mcp-server-give-your-agent-eyes-5c48"&gt;MiniCPM-V MCP Server — Give Your Agent Eyes&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Build a Private Photo Assistant on Telegram with OpenClaw + MiniCPM-V
&lt;/h4&gt;

&lt;p&gt;Learn how to build a privacy-focused Telegram photo assistant using &lt;strong&gt;OpenClaw&lt;/strong&gt; and &lt;strong&gt;MiniCPM-V 4.6&lt;/strong&gt;. The assistant can analyze images, answer visual questions, describe scenes, extract text, and understand documents directly inside Telegram, while keeping your data under your control through a fully self-hosted AI pipeline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/techlatestnet/build-a-private-photo-assistant-on-telegram-with-openclaw-minicpm-v-46-lak"&gt;Build a Private Photo Assistant on Telegram with OpenClaw + MiniCPM-V 4.6&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  OpenTaint: The Open-Source Taint Analysis Engine for the AI Era
&lt;/h4&gt;

&lt;p&gt;This tutorial introduces &lt;strong&gt;OpenTaint&lt;/strong&gt; , an open-source static taint analysis engine designed to detect insecure data flows in modern applications. The guide covers installation, scanning real projects, understanding taint analysis, and identifying security vulnerabilities before deployment, making it a practical tool for developers building secure AI-powered software.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/techlatestnet/opentaint-the-open-source-taint-analysis-engine-for-the-ai-era-2lp1"&gt;OpenTaint: The Open-Source Taint Analysis Engine for the AI Era&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  OpenCode Agent Masterclass
&lt;/h4&gt;

&lt;p&gt;This comprehensive tutorial walks developers through everything needed to master &lt;strong&gt;OpenCode&lt;/strong&gt; , the open-source AI coding agent. It covers installation, connecting AI providers, configuring MCP servers, agent modes, permissions, skills, project rules, and productivity workflows, providing a complete guide to building an efficient AI-assisted development environment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/techlatestnet/opencode-agent-masterclass-full-tutorial-5h2e"&gt;OpenCode Agent Masterclass — Full Tutorial&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>news</category>
      <category>technologynews</category>
      <category>newsandupdates</category>
      <category>newsletter</category>
    </item>
    <item>
      <title>OpenCode Agent Masterclass — Full Tutorial</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Fri, 26 Jun 2026 14:03:25 +0000</pubDate>
      <link>https://dev.to/techlatestnet/opencode-agent-masterclass-full-tutorial-5h2e</link>
      <guid>https://dev.to/techlatestnet/opencode-agent-masterclass-full-tutorial-5h2e</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Frru1e2cd5l3itke2pce4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Frru1e2cd5l3itke2pce4.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Everything you need to &lt;strong&gt;install, configure, and master&lt;/strong&gt; &lt;a href="https://opencode.ai/" rel="noopener noreferrer"&gt;OpenCode&lt;/a&gt; — the open-source AI coding agent with 179k+ GitHub stars, available as a terminal TUI, desktop app, and IDE extension.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Official:&lt;/strong&gt; &lt;a href="https://opencode.ai/" rel="noopener noreferrer"&gt;opencode.ai&lt;/a&gt; · &lt;strong&gt;Docs:&lt;/strong&gt; &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;opencode.ai/docs&lt;/a&gt; · &lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/anomalyco/opencode" rel="noopener noreferrer"&gt;github.com/anomalyco/opencode&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What you’ll have at the end
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;OpenCode installed via curl, npm, or Homebrew&lt;/li&gt;
&lt;li&gt;A provider connected (/connect) — Zen, Anthropic, OpenAI, Ollama, or Copilot login&lt;/li&gt;
&lt;li&gt;Project AGENTS.md from /init, committed to git&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build&lt;/strong&gt; and &lt;strong&gt;Plan&lt;/strong&gt; agent modes mastered (Tab toggle)&lt;/li&gt;
&lt;li&gt;MCP servers, rules, permissions, and skills configured&lt;/li&gt;
&lt;li&gt;Confidence with /undo, /redo, /share, and @ file references&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Introduction — the coding agent that meets you where you work
&lt;/h3&gt;

&lt;p&gt;OpenCode is not another chat wrapper. It is a &lt;strong&gt;full agent harness&lt;/strong&gt; for software work: file edits, shell execution, LSP-aware code intelligence, MCP tool servers, subagents, and multi-session parallelism — all from a terminal UI that also ships as a desktop app and VS Code extension.&lt;/p&gt;

&lt;p&gt;The project’s pitch is direct: &lt;strong&gt;privacy-first&lt;/strong&gt; , &lt;strong&gt;provider-agnostic&lt;/strong&gt; (75+ models via &lt;a href="https://models.dev" rel="noopener noreferrer"&gt;Models.dev&lt;/a&gt;), and &lt;strong&gt;open source&lt;/strong&gt; under MIT. OpenCode does not store your code or context on its servers when you use local or bring-your-own-key providers.&lt;/p&gt;

&lt;p&gt;Community scale matters here: 179k+ stars, 900+ contributors, and a release cadence that ships weekly. That velocity shows up in the product — Plan/Build agents, Zen curated models, ACP support, and a plugin ecosystem documented under &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;Develop → Ecosystem&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;OpenCode harness stack — surfaces to the agent loop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fesz2asur4m5bu554v8i3.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fesz2asur4m5bu554v8i3.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1 — How OpenCode is structured
&lt;/h3&gt;

&lt;p&gt;OpenCode wraps the model in a &lt;strong&gt;coding harness&lt;/strong&gt; with distinct layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Surfaces&lt;/strong&gt;  — TUI (opencode), desktop app, IDE extension, web&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agents&lt;/strong&gt;  — build (full access) and plan (read-only analysis)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools&lt;/strong&gt;  — file ops, search, bash, web, code intelligence, subagent spawn&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configuration&lt;/strong&gt;  — rules, permissions, policies, formatters, keybinds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extensions&lt;/strong&gt;  — MCP servers, Agent Skills, custom tools, LSP servers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model routing&lt;/strong&gt;  — any provider with API keys or OAuth (Copilot, ChatGPT Plus)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmtene5bldtcd8ahxiov2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmtene5bldtcd8ahxiov2.png" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The harness owns &lt;strong&gt;when&lt;/strong&gt; the model can write files, &lt;strong&gt;which&lt;/strong&gt; tools appear in context, and &lt;strong&gt;how&lt;/strong&gt; sessions persist. The model owns reasoning inside that boundary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docs map:&lt;/strong&gt; &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;Tools&lt;/a&gt; · &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;Rules&lt;/a&gt; · &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;Agents&lt;/a&gt; · &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;MCP servers&lt;/a&gt; · &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;SDK&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F89jufhrtbo3xudj7ln34.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F89jufhrtbo3xudj7ln34.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 2 — Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Terminal:&lt;/strong&gt; WezTerm, Alacritty, Ghostty, or Kitty recommended (&lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;docs&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;macOS, Linux, or WSL&lt;/strong&gt;  — native Windows supported via Scoop/Chocolatey; WSL preferred&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API key&lt;/strong&gt; from your chosen provider, &lt;strong&gt;or&lt;/strong&gt; OpenCode Zen subscription, &lt;strong&gt;or&lt;/strong&gt; local Ollama&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git repo&lt;/strong&gt; for the project you want to work in (for /init → AGENTS.md)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="c"&gt;# optional if using npm install path&lt;/span&gt;
which opencode &lt;span class="c"&gt;# after install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Part 3 — Install
&lt;/h3&gt;

&lt;h4&gt;
  
  
  One-liner (recommended)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://opencode.ai/install | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install path priority: $OPENCODE_INSTALL_DIR → $XDG_BIN_DIR → $HOME/bin → $HOME/.opencode/bin.&lt;/p&gt;

&lt;h4&gt;
  
  
  Package managers
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; opencode-ai &lt;span class="c"&gt;# or bun / pnpm / yarn&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;anomalyco/tap/opencode &lt;span class="c"&gt;# macOS/Linux — most up to date tap&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Desktop app (beta)
&lt;/h4&gt;

&lt;p&gt;Download from &lt;a href="https://opencode.ai/" rel="noopener noreferrer"&gt;opencode.ai/download&lt;/a&gt; — macOS (ARM/x64), Windows, Linux (.deb / .rpm / AppImage).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F2nejy6vxluwnlumxevuk.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F2nejy6vxluwnlumxevuk.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 4 — Configure providers
&lt;/h3&gt;

&lt;p&gt;Launch OpenCode and connect a model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /path/to/your/project
opencode
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the TUI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/connect
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;OpenCode Zen&lt;/strong&gt; (recommended for beginners): curated models benchmarked for coding agents — sign in at &lt;a href="https://opencode.ai/auth" rel="noopener noreferrer"&gt;opencode.ai/auth&lt;/a&gt;, paste API key.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Other providers:&lt;/strong&gt; Anthropic, OpenAI, Google, Groq, local Ollama, and more. You can also log in with &lt;strong&gt;GitHub Copilot&lt;/strong&gt; or &lt;strong&gt;ChatGPT Plus/Pro&lt;/strong&gt; OAuth per the &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Tip: Zen removes the “which model actually works for agents?” guesswork. Self-hosters often pair OpenCode with &lt;strong&gt;Ollama&lt;/strong&gt; for fully offline runs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Run OpenCode with Local Models&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you prefer local AI instead of cloud APIs, TechLatest provides a GPU-ready DeepSeek &amp;amp; Llama environment supporting models like DeepSeek, Qwen, Llama, Gemma, and Mistral that can be connected directly to OpenCode through Ollama-compatible endpoints.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/multi_llm_gpu_vm_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/multi_llm_gpu_vm_support/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Want an OpenAI-compatible local API? Deploy LocalAI on TechLatest and connect OpenCode without relying on external providers.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/local-ai-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/local-ai-support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 5 — Initialize the project (/init)
&lt;/h3&gt;

&lt;p&gt;After /connect, initialize harness files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenCode analyzes your repository and writes AGENTS.md at the project root — project structure, conventions, and patterns the agent should respect.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Commit&lt;/strong&gt; AGENTS.md &lt;strong&gt;to git.&lt;/strong&gt; This is the same harness primitive we teach in &lt;a href="https://medium.com/@techlatest.net/harness-engineering-full-visual-guide-9a8de52b42d2?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;Harness Engineering&lt;/a&gt;: a durable instruction map, not a one-off prompt.&lt;/p&gt;

&lt;p&gt;See examples/AGENTS.md.example for the shape OpenCode generates.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# AGENTS.md — sample shape (OpenCode /init output)&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; Generated by OpenCode &lt;span class="sb"&gt;`&lt;/span&gt;/init&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; Commit this file. Keep it a &lt;span class="k"&gt;**&lt;/span&gt;map&lt;span class="k"&gt;**&lt;/span&gt; , not an encyclopedia.

&lt;span class="c"&gt;## Project overview&lt;/span&gt;

Brief description of what this repository does and its primary language/framework.

&lt;span class="c"&gt;## Directory layout&lt;/span&gt;

- &lt;span class="sb"&gt;`&lt;/span&gt;src/&lt;span class="sb"&gt;`&lt;/span&gt; — application &lt;span class="nb"&gt;source&lt;/span&gt;
- &lt;span class="sb"&gt;`&lt;/span&gt;tests/&lt;span class="sb"&gt;`&lt;/span&gt; — pytest &lt;span class="o"&gt;(&lt;/span&gt;or your runner&lt;span class="o"&gt;)&lt;/span&gt;
- &lt;span class="sb"&gt;`&lt;/span&gt;docs/&lt;span class="sb"&gt;`&lt;/span&gt; — deep architecture &lt;span class="o"&gt;(&lt;/span&gt;agent reads on demand&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;## Commands&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
bash&lt;br&gt;
npm install&lt;br&gt;
npm test&lt;br&gt;
npm run lint&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
## Conventions

- Prefer small PR-sized changes
- Run tests before claiming done
- Do not edit `vendor/` or lockfiles unless asked

## Agent notes

- Read `docs/architecture.md` before large refactors
- Use Plan mode for unfamiliar modules
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
shell&lt;/p&gt;

&lt;p&gt;Init project — /init creates AGENTS.md&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fbs45cwxi6dc4at6xsw6i.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fbs45cwxi6dc4at6xsw6i.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 6 — Build vs Plan agents
&lt;/h3&gt;

&lt;p&gt;OpenCode ships two first-class agents — switch with Tab:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fccjxa155yn3esgyhm50y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fccjxa155yn3esgyhm50y.png" width="781" height="202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workflow OpenCode recommends:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Switch to &lt;strong&gt;Plan&lt;/strong&gt;  (Tab)&lt;/li&gt;
&lt;li&gt;Describe the feature in detail — talk like you would to a junior dev&lt;/li&gt;
&lt;li&gt;Iterate on the plan; drag-drop reference images into the terminal&lt;/li&gt;
&lt;li&gt;Switch to &lt;strong&gt;Build&lt;/strong&gt;  (Tab)&lt;/li&gt;
&lt;li&gt;Say: &lt;em&gt;“Sounds good — go ahead and make the changes.”&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Also available: &lt;strong&gt;&lt;a class="mentioned-user" href="https://dev.to/general"&gt;@general&lt;/a&gt;&lt;/strong&gt; subagent for complex multi-step search tasks (used internally; you can invoke explicitly).&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 7 — Tools layer
&lt;/h3&gt;

&lt;p&gt;Tools are the agent’s hands. OpenCode categories include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;File operations&lt;/strong&gt;  — read, write, patch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search&lt;/strong&gt;  — ripgrep, glob, @ fuzzy file picker&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution&lt;/strong&gt;  — bash with permission gates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web&lt;/strong&gt;  — fetch docs and references&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code intelligence&lt;/strong&gt;  —  &lt;strong&gt;LSP enabled&lt;/strong&gt;  — loads language servers automatically so the LLM gets symbols, diagnostics, and completions context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Subagents&lt;/strong&gt;  — &lt;a class="mentioned-user" href="https://dev.to/general"&gt;@general&lt;/a&gt; and custom agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The harness validates tool calls, enforces permissions, and formats observations back into the loop. See &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;Configure → Tools&lt;/a&gt; and &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;Permissions&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 8 — Rules, permissions, and policies
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Rules&lt;/strong&gt;  — project-level instructions (often markdown) that shape agent behavior beyond AGENTS.md.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Permissions&lt;/strong&gt;  — per-tool allow/deny; Plan agent defaults to stricter bash gating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Policies&lt;/strong&gt;  — organizational guardrails when using OpenCode in teams.&lt;/p&gt;

&lt;p&gt;This is harness &lt;strong&gt;safety architecture&lt;/strong&gt; : the model proposes, the runtime permits. Aligns with &lt;a href="//../harness-engineering/TUTORIAL.md"&gt;Harness Engineering Part 8&lt;/a&gt; hooks and ratchets — mistakes become permanent constraints.&lt;/p&gt;

&lt;p&gt;Example rule snippet: examples/rules.example.md&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Project rules (OpenCode)

Never commit secrets or .env files
Run npm test &amp;amp;&amp;amp; npm run lint before finishing a task
Ask before deleting files or running destructive shell commands
Match existing code style in each directory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
shell&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 9 — MCP servers
&lt;/h3&gt;

&lt;p&gt;OpenCode supports &lt;strong&gt;Model Context Protocol&lt;/strong&gt; servers — the same standard covered in our &lt;a href="https://medium.com/@techlatest.net/model-context-protocol-mcp-full-visual-guide-ffbbf2121c38?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;MCP Visual Guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Wire MCP in OpenCode config so the agent can call external tools: databases, browsers, Stripe, vision servers, etc.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed"]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
shell&lt;/p&gt;

&lt;p&gt;See examples/opencode.mcp.example.json and Configure → MCP servers.&lt;/p&gt;

&lt;p&gt;MCP + built-in tools — harness view&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0t2hw4cagdkpslon7fbg.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0t2hw4cagdkpslon7fbg.gif" width="798" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build AI Coding Agents with Memory&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use Chroma Vector Database for semantic search, embeddings, and retrieval when creating AI-powered developer assistants with OpenCode.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/chromadb_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/chromadb_support/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Need Retrieval-Augmented Generation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Pair OpenCode with RAGFlow to let coding agents search internal documentation, APIs, and engineering knowledge bases.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/ragflow_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/ragflow_support/&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 10 — Agent Skills
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Agent Skills&lt;/strong&gt; are packaged capability bundles — progressive disclosure, so the context window isn’t flooded with every tool definition at startup. OpenCode loads skill front-matter when relevant.&lt;/p&gt;

&lt;p&gt;Skills complement MCP: MCP = wire protocol to servers; Skills = curated workflows and instructions the harness injects on demand.&lt;/p&gt;

&lt;p&gt;Docs: &lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;Configure → Agent Skills&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 11 — Daily usage patterns
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Ask questions (explore codebase)
&lt;/h4&gt;

&lt;p&gt;Use @ to fuzzy-search files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;How is authentication handled in @packages/functions/src/api/index.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
shell&lt;/p&gt;
&lt;h4&gt;
  
  
  Add features (plan → build)
&lt;/h4&gt;

&lt;p&gt;See Part 6. Give context, examples, and reference images (drag into terminal).&lt;/p&gt;
&lt;h4&gt;
  
  
  Direct changes (no plan phase)
&lt;/h4&gt;

&lt;p&gt;For straightforward work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;We need auth on /settings. Mirror the pattern in @packages/functions/src/notes.ts
and apply it in @packages/functions/src/settings.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
shell&lt;/p&gt;
&lt;h4&gt;
  
  
  Images in prompts
&lt;/h4&gt;

&lt;p&gt;Drag and drop PNG/JPG into the TUI — OpenCode attaches them to the prompt (useful for UI mockups).&lt;/p&gt;

&lt;p&gt;Ask with @ file reference&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Frak4rwcwxwye1dbl8md1.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Frak4rwcwxwye1dbl8md1.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 12 — Undo, redo, share
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Undo a bad edit:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/undo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
shell&lt;/p&gt;

&lt;p&gt;Restores files and returns your original message so you can refine the prompt. Run /undo multiple times for multi-step rollback.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Redo:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/redo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
shell&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Share session with team:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/share
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
shell&lt;/p&gt;

&lt;p&gt;Creates a link to the current conversation (not shared by default). Useful for debugging agent loops with teammates.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 13 — Multi-session and LSP
&lt;/h3&gt;

&lt;p&gt;OpenCode supports &lt;strong&gt;multiple parallel agent sessions&lt;/strong&gt; on the same project — each with an isolated context. Great for splitting frontend/backend tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LSP integration&lt;/strong&gt; means the harness feeds semantic code intelligence to the model, so you don't have to paste types and diagnostics manually. This is a major differentiator vs raw chat UIs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 14 — Surfaces beyond the terminal
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fotnqaw7gni4pyaqc2zx3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fotnqaw7gni4pyaqc2zx3.png" width="512" height="275"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;same harness&lt;/strong&gt; runs behind each surface — models feel consistent because the tool layer is shared (&lt;a href="https://openai.com/" rel="noopener noreferrer"&gt;Codex parallel&lt;/a&gt;: one harness, many clients).&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 15 — SDK, Server, and Plugins
&lt;/h3&gt;

&lt;p&gt;For builders extending OpenCode:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SDK&lt;/strong&gt;  — programmatic access to the agent harness&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server&lt;/strong&gt;  — run OpenCode as a service that other clients attach to&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugins&lt;/strong&gt;  — extend behavior without forking core&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ecosystem&lt;/strong&gt;  — community projects (&lt;a href="https://opencode.ai/docs/" rel="noopener noreferrer"&gt;docs&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you ship a derivative project, OpenCode asks you to clarify in README that it is &lt;strong&gt;not&lt;/strong&gt; official — see upstream &lt;a href="https://github.com/anomalyco/opencode/blob/dev/CONTRIBUTING.md" rel="noopener noreferrer"&gt;CONTRIBUTING&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 16 — OpenCode in the agent landscape
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;vs Claude Code:&lt;/strong&gt; Both are coding harnesses. Claude Code is Anthropic-native with deep IDE integration; OpenCode is &lt;strong&gt;provider-neutral&lt;/strong&gt; and terminal-first with open source transparency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;vs OpenClaw:&lt;/strong&gt; OpenClaw is a &lt;strong&gt;personal assistant gateway&lt;/strong&gt; (Telegram, WhatsApp, cron). OpenCode is a &lt;strong&gt;repo coding agent&lt;/strong&gt;. Many teams use both — OpenClaw for life ops, OpenCode for shipping code. See &lt;a href="https://medium.com/towardsdev/openclaw-agent-masterclass-66d6a4f88cd5?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;OpenClaw masterclass&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;vs ZeroClaw:&lt;/strong&gt; ZeroClaw is a Rust &lt;strong&gt;multi-channel runtime&lt;/strong&gt; with security policy and hardware hooks. OpenCode optimizes for &lt;strong&gt;developer UX in git repos&lt;/strong&gt;. See ZeroClaw masterclass.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Harness link:&lt;/strong&gt; OpenCode embodies &lt;a href="https://medium.com/@techlatest.net/harness-engineering-full-visual-guide-9a8de52b42d2?sharedUserId=techlatest.net" rel="noopener noreferrer"&gt;Harness Engineering&lt;/a&gt; — AGENTS.md, permissions, verification via LSP/tests, and scoped Plan mode.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 17 — Troubleshooting
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmnoi58v8fhj386uinsmh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmnoi58v8fhj386uinsmh.png" width="666" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 18 — Hands-on checklist
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -fsSL https://opencode.ai/install | bash
cd your-repo &amp;amp;&amp;amp; opencode
/connect
/init &amp;amp;&amp;amp; git add AGENTS.md &amp;amp;&amp;amp; git commit -m "Add OpenCode AGENTS.md"
# Tab → Plan → describe feature → Tab → Build → implement
/share # optional — link for teammate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1fz8gv6cucd4mcmf48lw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1fz8gv6cucd4mcmf48lw.gif" width="799" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;OpenCode&lt;/strong&gt; is a production-grade &lt;strong&gt;coding agent harness&lt;/strong&gt; : provider-flexible, LSP-aware, MCP-ready, with Plan/Build agent modes and first-class AGENTS.md initialization. Install once, /connect, /init, and treat the TUI as mission control for your repo — not a chat sidebar.&lt;/p&gt;

&lt;p&gt;The model writes the code. The harness decides what it’s allowed to touch, what it sees, and when “done” means done.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>agents</category>
      <category>aicodingagent</category>
      <category>opensourceai</category>
    </item>
    <item>
      <title>OpenTaint: The Open-Source Taint Analysis Engine for the AI Era</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Thu, 25 Jun 2026 10:28:08 +0000</pubDate>
      <link>https://dev.to/techlatestnet/opentaint-the-open-source-taint-analysis-engine-for-the-ai-era-2lp1</link>
      <guid>https://dev.to/techlatestnet/opentaint-the-open-source-taint-analysis-engine-for-the-ai-era-2lp1</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fkve448auvrcf0z29f9hq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fkve448auvrcf0z29f9hq.png" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AI now writes production code faster than any security team can review it. The two classes of tools we built to catch its mistakes each force a bad trade-off — and &lt;strong&gt;OpenTaint&lt;/strong&gt; exists to break that trade-off. This is a complete, run-it-yourself guide: what the tool is, how taint analysis works, how to install and scan, four worked examples, and the AI agent workflows that let it scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  TL;DR
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Open-source, formal, &lt;strong&gt;inter-procedural taint analysis&lt;/strong&gt; engine — an Apache-2.0 alternative to Semgrep Pro and CodeQL.&lt;/li&gt;
&lt;li&gt;Tracks untrusted data across &lt;strong&gt;function boundaries, persistence layers, aliases, and async code&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Let's an &lt;strong&gt;LLM agent distill one finding into a deterministic rule&lt;/strong&gt;  — you pay the model once, not on every scan.&lt;/li&gt;
&lt;li&gt;Replays that rule across your whole codebase, on every commit, at &lt;strong&gt;zero token cost&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Deepest coverage today for &lt;strong&gt;Java / Kotlin / Spring&lt;/strong&gt; ; Python, Go, C#, JS &amp;amp; TS on the roadmap.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Note
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;BlackArch Linux&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
We also provide a ready-to-deploy BlackArch Linux VM that can be launched instantly on &lt;a href="http://aws.amazon.com/marketplace/pp/B09YJ3S7L9?utm_campaign=blackarch-linux&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;AWS&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;,&lt;/strong&gt; &lt;a href="https://console.cloud.google.com/marketplace/product/techlatest-public/blackarch-linux?utm_campaign=blackarch-linux&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;GCP&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;, or&lt;/strong&gt; &lt;a href="https://azuremarketplace.microsoft.com/en-us/marketplace/apps/techlatest.blackarch-linux?utm_campaign=blackarch-linux&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;Azure&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;.&lt;/strong&gt; No installation, setup, or dependency management required — just spin it up and start using a full arsenal of penetration testing and security auditing tools in minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kali GUI Linux&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Our Kali GUI Linux VM comes fully pre-configured with a graphical interface, making it easy for both beginners and professionals to get started. Deploy directly on &lt;a href="https://aws.amazon.com/marketplace/pp/B08XT9FPHP?utm_campaign=desktop-linux-kali&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;AWS&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;,&lt;/strong&gt; &lt;a href="https://console.cloud.google.com/marketplace/product/techlatest-public/desktop-linux-kali?utm_campaign=kali-gui-linux&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;GCP&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;, or&lt;/strong&gt; &lt;a href="https://azuremarketplace.microsoft.com/en-us/marketplace/apps/techlatest.desktop-linux-kali?utm_campaign=kali-gui-linux&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;Azure&lt;/strong&gt;&lt;/a&gt; with zero setup — no installation hassles, just immediate access to a complete offensive security toolkit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Browser-Based Kali Linux&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
We offer a browser-based Kali Linux environment that runs entirely in the cloud. Simply deploy and access it from your browser — no downloads, no local setup, no compatibility issues. Deploy directly on &lt;a href="https://aws.amazon.com/marketplace/pp/prodview-skwmcgpakshpo?utm_campaign=kali-linux-browser&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;AWS&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;,&lt;/strong&gt; &lt;a href="https://console.cloud.google.com/marketplace/product/techlatest-public/kali-linux-browser?utm_campaign=kali-linux-browser&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;GCP&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;, or&lt;/strong&gt; &lt;a href="https://azuremarketplace.microsoft.com/en-us/marketplace/apps/techlatest.kali-linux-browser?utm_campaign=kali-linux-browser&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;Azure&lt;/strong&gt;&lt;/a&gt; with zero setup — no installation hassles, just immediate access to a complete offensive security toolkit. Perfect for quick testing, learning, and remote security operations from anywhere.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ParrotOS Linux&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Our ParrotOS Linux VM is optimized for security, privacy, and development workflows. Available for instant deployment on &lt;a href="https://aws.amazon.com/marketplace/pp/prodview-zcer2c52ucaoy?utm_campaign=parrotos-linux&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;AWS&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;,&lt;/strong&gt; &lt;a href="https://console.cloud.google.com/marketplace/product/techlatest-public/parrotos-linux?utm_campaign=parrotos-linux&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;GCP&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;, and&lt;/strong&gt; &lt;a href="https://azuremarketplace.microsoft.com/en-us/marketplace/apps/techlatest.parrotos-linux?utm_campaign=parrotos-linux&amp;amp;utm_source=techlatest-website&amp;amp;utm_medium=support-page" rel="noopener noreferrer"&gt;&lt;strong&gt;Azure&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;,&lt;/strong&gt; it eliminates the need for manual installation — giving you a secure, ready-to-use environment in just a few clicks.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. What is OpenTaint?
&lt;/h3&gt;

&lt;p&gt;OpenTaint is a static application security testing (SAST) engine focused on one thing done extremely well: &lt;strong&gt;tracking the flow of untrusted data&lt;/strong&gt; through your program. It is written in Go, self-hostable, and ships as one stack — engine, rules, and CI integrations — under Apache 2.0 (engine) and MIT (CLI, GitHub Action, GitLab template, rules).&lt;/p&gt;

&lt;p&gt;Three sentences capture the whole pitch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It finds what pattern matchers miss.&lt;/strong&gt; A formal inter-procedural dataflow engine follows tainted input even when it crosses functions, gets stored and re-read, or is aliased.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It pays the model once, not on every scan.&lt;/strong&gt; An AI agent turns a single confirmed vulnerability into a reusable taint rule; the deterministic engine then replays it in minutes of CPU.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It is battery-included and open.&lt;/strong&gt; Today, it is the most thorough taint engine for &lt;strong&gt;Java / Kotlin / Spring&lt;/strong&gt; , with Python, Go, C#, JavaScript, and TypeScript on the roadmap.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  2. Taint analysis in 60 seconds
&lt;/h3&gt;

&lt;p&gt;“Taint” is a label. You mark certain inputs as untrusted (a &lt;strong&gt;source&lt;/strong&gt; ), and certain dangerous operations as things that must never receive untrusted data (a &lt;strong&gt;sink&lt;/strong&gt; ). The engine then asks a single question: &lt;em&gt;Can data flow from any source to any sink?&lt;/em&gt; If yes — and nothing along the way neutralizes it (a &lt;strong&gt;sanitizer&lt;/strong&gt; ) — that path is a vulnerability.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SOURCE PROPAGATION SINK
req.getParameter() → service.process(x) → statement.execute(sql)
(untrusted input) (taint flows across (dangerous op:
                        function boundaries) SQL injection!)

         ▲ a sanitizer anywhere on this path (e.g. a parameterized
           query / encoder) would "clean" the taint and kill the finding
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The hard part is &lt;strong&gt;propagation&lt;/strong&gt;. Naive tools only look at a single line or function (intra-procedural). Real bugs span many hops: controller → service → repository → ORM. Following taint across all those hops is &lt;strong&gt;inter-procedural&lt;/strong&gt; analysis, and it is exactly what OpenTaint specializes in.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Why it matters — the AI-era gap
&lt;/h3&gt;

&lt;p&gt;There are two existing families of tools, and each makes one painful compromise:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5unt1zy7nd9r84w9727i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5unt1zy7nd9r84w9727i.png" width="800" height="254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The deep, inter-procedural analysis that actually catches data-flow bugs has historically been locked inside proprietary tools. OpenTaint opens it up and makes the LLM a &lt;em&gt;one-time rule author&lt;/em&gt; rather than a per-scan expense.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. How OpenTaint works under the hood
&lt;/h3&gt;

&lt;p&gt;The pipeline, conceptually, is four stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Build &amp;amp; parse.&lt;/strong&gt; OpenTaint understands your build so it can resolve types, dependencies, and method targets — this is what enables real inter-procedural reasoning rather than guessing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model the graph.&lt;/strong&gt; It constructs a call graph and dataflow graph, including aliasing and (for Spring) framework-specific wiring like controllers, request mappings, and beans.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apply rules.&lt;/strong&gt; Rules declare sources, sinks, sanitizers, and library “pass-through” approximations (how data moves through code it can’t see, e.g., third-party libs).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solve &amp;amp; report.&lt;/strong&gt; The engine computes reachable source→sink paths and emits results, typically as &lt;strong&gt;SARIF,&lt;/strong&gt; so any IDE or CI can consume them.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Why SARIF matters
&lt;/h4&gt;

&lt;p&gt;SARIF is the standard interchange format for static analysis. Because OpenTaint emits SARIF, findings render natively in GitHub code scanning, VS Code, and most security dashboards — no custom glue required.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Installation — every method
&lt;/h3&gt;

&lt;p&gt;Pick whichever fits your environment. All commands below are taken from the official quick-start.&lt;/p&gt;

&lt;h4&gt;
  
  
  Install script (Linux / macOS)
&lt;/h4&gt;

&lt;p&gt;The installer detects your platform, verifies a checksum, and drops the binary in ~/.opentaint/install, and prints the PATH line to add to your shell profile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://opentaint.org/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fa8uww50rgkd0zf4sdq1a.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fa8uww50rgkd0zf4sdq1a.gif" width="800" height="521"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Homebrew (Linux / macOS)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--cask&lt;/span&gt; seqra/tap/opentaint
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Windows (PowerShell)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;irm&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;https://opentaint.org/install.ps1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;iex&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  npm (cross-platform)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# global install&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @seqra/opentaint

&lt;span class="c"&gt;# or run instantly, no install (needs Node.js)&lt;/span&gt;
npx @seqra/opentaint scan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Verify before piping to a shell
&lt;/h4&gt;

&lt;p&gt;Piping a remote script straight into bash or It iex is convenient but executes whatever the server returns. On shared or production machines, download the script first, read it, then run it — or prefer Homebrew/npm, which are checksum-backed.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Set up a test project &amp;amp; verify
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Scaffold a sample Spring app to test against
&lt;/h4&gt;

&lt;p&gt;No project handy? Spin up a throwaway Spring Boot app straight from &lt;a href="https://start.spring.io/" rel="noopener noreferrer"&gt;start.spring.io&lt;/a&gt;, unzip it, and boot it once to confirm your toolchain (Java 21 + the Maven wrapper). This gives OpenTaint something real to chew on:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://start.spring.io/starter.zip &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;maven-project &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;java &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;bootVersion&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3.5.4 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;baseDir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;spring-app &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;groupId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;com.example &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;artifactId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;spring-app &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;spring-app &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;packageName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;com.example.springapp &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;javaVersion&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;21 &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;dependencies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;web &lt;span class="nt"&gt;-o&lt;/span&gt; spring-app.zip

unzip spring-app.zip &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm &lt;/span&gt;spring-app.zip
&lt;span class="nb"&gt;cd &lt;/span&gt;spring-app
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x mvnw &lt;span class="c"&gt;# the wrapper ships without the exec bit&lt;/span&gt;
./mvnw spring-boot:run &lt;span class="c"&gt;# downloads Maven + Boot, then starts Tomcat on :8080&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fs760qc6b26tkk9evbgxr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fs760qc6b26tkk9evbgxr.gif" width="600" height="337"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A couple of gotchas (shown above)&lt;/p&gt;

&lt;p&gt;Run ./mvnw from &lt;em&gt;inside&lt;/em&gt; the project (not the parent folder), and the wrapper needs to be chmod +x mvnw first. You only need java to install the wrapper that fetches Maven for you, so a missing mvn is fine. Stop the app with Ctrl-COpenTaint analyzes the &lt;em&gt;source&lt;/em&gt;, so it never needs the app actually to run.&lt;/p&gt;

&lt;h4&gt;
  
  
  Verify OpenTaint
&lt;/h4&gt;

&lt;p&gt;Confirm the binary is on your PATH and skim the available commands — OpenTaint is a Cobra-style CLI with subcommands for compiling, pulling artifacts, scanning, and summarizing SARIF:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;opentaint &lt;span class="nt"&gt;--help&lt;/span&gt;
opentaint &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqcujewo827fc34g9s189.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqcujewo827fc34g9s189.gif" width="800" height="565"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Your first scan
&lt;/h3&gt;

&lt;p&gt;From inside the project, just run scan. OpenTaint compiles a &lt;strong&gt;project model&lt;/strong&gt; , runs the analyzer, applies the bundled ruleset, and writes a SARIF report into ~/.opentaint/cache, and prints a clean box-drawing summary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;opentaint scan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A fresh, hello-world scaffold has nothing dangerous in it, so the first scan reports &lt;strong&gt;0 findings&lt;/strong&gt;  — exactly what you want to confirm the toolchain end-to-end before you introduce real code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;╭─OpenTaint Compile and Scan─╮
╰─┬──────────────────────────╯
  ├─ Project └─ ~/projects/spring-app
  ├─ Project model └─ ~/.opentaint/cache/spring-app-cba2bab8/project-model
  ├─ Autobuilder └─ custom (~/.opentaint/install/lib/opentaint-project-auto-builder.jar)
  ├─ Analyzer └─ custom (~/.opentaint/install/lib/opentaint-project-analyzer.jar)
  └─ Bundled ruleset└─ custom (~/.opentaint/install/lib/rules)

✓ Compiling project model in 15s
✓ Analyzing project in 17s

╭─Scan Summary─╮
╰─┬────────────╯
  ├─ Findings
  │ ├─ Total: 0 findings
  │ ├─ Files affected: 0
  │ ├─ Rules executed: 78
  │ └─ Rules triggered: 0
  └─ Output
     ├─ Report: …/project-model/sources/opentaint.sarif
     └─ Log: ~/.opentaint/logs/spring-app-cba2bab8/…log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fs74v7b7tf28ri46nf39i.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fs74v7b7tf28ri46nf39i.gif" width="800" height="476"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The scan doesn’t dump findings inline — it writes a SARIF report and points you at the summary subcommand. Pass --show-findings to print each finding (here, none yet):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;opentaint summary ~/.opentaint/cache/spring-app-cba2bab8/project-model/sources/opentaint.sarif &lt;span class="nt"&gt;--show-findings&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  8. Hands-on examples
&lt;/h3&gt;

&lt;p&gt;Four progressively richer examples, from a vulnerable Spring endpoint to a custom rule and CI output.&lt;/p&gt;

&lt;h4&gt;
  
  
  Example 1 — Catching cross-function SQL injection
&lt;/h4&gt;

&lt;p&gt;Here is the kind of bug AST matchers routinely miss, because the source and sink live in different files and the taint flows through a service layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@RestController&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;UserService&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/users/search"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// 'name' is untrusted — this is the SOURCE&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;find&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;@Repository&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserRepository&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;DataSource&lt;/span&gt; &lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;rawByName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;SQLException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;sql&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"SELECT * FROM users WHERE name = '"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"'"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Statement&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getConnection&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;createStatement&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// tainted 'name' reaches a raw query — this is the SINK&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;executeQuery&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sql&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A line-by-line linter sees nothing wrong in the controller, and the repository “just builds a string.” OpenTaint connects @RequestParam name → service.find → rawByName → executeQuery and flags the full path. The fix it nudges you toward is a parameterized query, which acts as a sanitizer and removes the finding on the next scan.&lt;/p&gt;

&lt;h4&gt;
  
  
  Example 2 — Writing a custom taint rule
&lt;/h4&gt;

&lt;p&gt;The real power move: teach OpenTaint about &lt;em&gt;your&lt;/em&gt; code. Suppose you have an internal helper Http.body() that returns raw request bodies, and a logging sink AuditLog.write() that must never receive PII. You declare them as a source and sink:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pii-into-auditlog&lt;/span&gt;
    &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;high&lt;/span&gt;
    &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Untrusted&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;request&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;flows&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;into&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;AuditLog&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(PII&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;leak&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;risk)"&lt;/span&gt;
    &lt;span class="na"&gt;sources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;com.acme.http.Http#body()"&lt;/span&gt;
    &lt;span class="na"&gt;sinks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;com.acme.audit.AuditLog#write(java.lang.String)"&lt;/span&gt;
        &lt;span class="na"&gt;taint-arg&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
    &lt;span class="na"&gt;sanitizers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;com.acme.privacy.Redactor#redact(java.lang.String)"&lt;/span&gt;

&lt;span class="c1"&gt;# run a scan with your custom rules directory added&lt;/span&gt;
&lt;span class="s"&gt;opentaint scan --rules ./rules .&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the cost-saving loop&lt;/p&gt;

&lt;p&gt;You (or an agent) write this rule &lt;em&gt;once&lt;/em&gt;. From then on, every scan and every future commit is checked against it deterministically — no LLM tokens spent re-discovering the same class of bug.&lt;/p&gt;

&lt;h4&gt;
  
  
  Example 3 — SARIF output + opening results
&lt;/h4&gt;

&lt;p&gt;Produce machine-readable output and feed it anywhere that speaks SARIF:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;opentaint scan &lt;span class="nt"&gt;--output&lt;/span&gt; results.sarif &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# count findings by severity with jq&lt;/span&gt;
jq &lt;span class="s1"&gt;'[.runs[].results[].level] | group_by(.) | map({(.[0]): length}) | add'&lt;/span&gt; results.sarif
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open results.sarif in VS Code (with the SARIF Viewer extension) or upload it to GitHub code scanning to get inline annotations on the exact source and sink lines.&lt;/p&gt;

&lt;h4&gt;
  
  
  Example 4 — Zero-install scan with Docker
&lt;/h4&gt;

&lt;p&gt;No local toolchain? Mount your project and run the published image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;docker run --rm \
  -v $(pwd):/project \
  -v $(pwd):/output \
  ghcr.io/seqra/opentaint:latest \
  opentaint scan --output /output/results.sarif /project
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the most reproducible option — identical engine version every time — which makes it ideal for CI runners and for sharing a scan setup across a team.&lt;/p&gt;

&lt;h3&gt;
  
  
  9. AI agent workflows
&lt;/h3&gt;

&lt;p&gt;OpenTaint ships &lt;strong&gt;agent skills&lt;/strong&gt; that turn static analysis into an end-to-end AppSec workflow. Install them with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add https://github.com/seqra/opentaint
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The headline skill, &lt;strong&gt;appsec-agent&lt;/strong&gt; , orchestrates a full project assessment: build the project, run OpenTaint, discover the attack surface, add targeted rules, model missing library data flows, triage findings, and optionally generate dynamic proof-of-concept checks for confirmed vulnerabilities.&lt;/p&gt;

&lt;p&gt;The included skills map cleanly onto the security-analysis loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Scan &amp;amp; triage
▹ build-project
▹ run-scan
▹ analyze-findings
▹ generate-poc

Coverage expansion
▹ triage-dependencies
▹ discover-attack-surface
▹ create-test-project
▹ create-rule
▹ assemble-lib-rules

Dataflow modeling
▹ analyze-external-methods
▹ create-pass-through-approximation
▹ create-dataflow-approximation
▹ debug-rule
▹ report-analyzer-issue
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pattern that makes this scale&lt;/p&gt;

&lt;p&gt;The agent does the &lt;em&gt;creative, one-time&lt;/em&gt; work — understanding your attack surface and authoring rules and library approximations. The deterministic engine does the &lt;em&gt;repetitive, forever&lt;/em&gt; work — replaying those rules on every commit. That division is the entire economic argument for OpenTaint.&lt;/p&gt;

&lt;h3&gt;
  
  
  10. CI/CD integration
&lt;/h3&gt;

&lt;p&gt;Because OpenTaint emits SARIF and ships a GitHub Action + GitLab template, wiring it into a pipeline is short. A minimal GitHub Actions job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;OpenTaint&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;taint-scan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run OpenTaint&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;curl -fsSL https://opentaint.org/install.sh | bash&lt;/span&gt;
          &lt;span class="s"&gt;opentaint scan --output results.sarif .&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Upload SARIF to code scanning&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github/codeql-action/upload-sarif@v3&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;sarif_file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;results.sarif&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now every pull request gets inline taint findings, and your once-authored rules guard the codebase on every commit — minutes of CPU, zero tokens. For the official, supported configuration, always check the docs.&lt;/p&gt;

&lt;h3&gt;
  
  
  11. Pro tips
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Start with Java/Spring.&lt;/strong&gt; It has the deepest coverage today; other languages are on the roadmap.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Make sure the project builds.&lt;/strong&gt; Inter-procedural accuracy depends on type and dependency resolution — a broken build means a shallower graph.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model your libraries.&lt;/strong&gt; If taint “disappears” inside a third-party call, add a pass-through approximation so the engine knows data flows through it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tune sanitizers to cut noise.&lt;/strong&gt; Declaring your real encoders/validators as sanitizers removes false positives cleanly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit your rules.&lt;/strong&gt; Treat custom rules as code — they are your team’s accumulated security knowledge, replayable forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pin the Docker tag in CI&lt;/strong&gt; for reproducible scans instead of latest.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use only on the code you’re authorized to test&lt;/p&gt;

&lt;p&gt;Run OpenTaint against repositories you own or are explicitly authorized to assess. Treat any findings — and the proof-of-concept artifacts agents may generate — as sensitive, and never paste secrets, live credentials, or client data into rules or issues.&lt;/p&gt;

&lt;h3&gt;
  
  
  12. Wrap-up
&lt;/h3&gt;

&lt;p&gt;OpenTaint reframes the AI-vs-static-analysis debate: instead of paying a language model to re-read your code on every scan, you pay it once to understand a vulnerability, capture that understanding as a deterministic taint rule, and let a fast engine enforce it forever. You get the depth of an inter-procedural agent at the cost of a static analyzer — open source, self-hostable, and batteries-included.&lt;/p&gt;

&lt;p&gt;Recap of the commands worth memorizing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# install (pick one)&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://opentaint.org/install.sh | bash
brew &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--cask&lt;/span&gt; seqra/tap/opentaint
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @seqra/opentaint

&lt;span class="c"&gt;# scan&lt;/span&gt;
opentaint scan &lt;span class="c"&gt;# console&lt;/span&gt;
opentaint scan &lt;span class="nt"&gt;--output&lt;/span&gt; results.sarif &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="c"&gt;# SARIF report&lt;/span&gt;
opentaint scan &lt;span class="nt"&gt;--rules&lt;/span&gt; ./rules &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="c"&gt;# with custom rules&lt;/span&gt;

&lt;span class="c"&gt;# docker (zero install)&lt;/span&gt;
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/project &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/output &lt;span class="se"&gt;\&lt;/span&gt;
  ghcr.io/seqra/opentaint:latest &lt;span class="se"&gt;\&lt;/span&gt;
  opentaint scan &lt;span class="nt"&gt;--output&lt;/span&gt; /output/results.sarif /project

&lt;span class="c"&gt;# AI agent skills&lt;/span&gt;
npx skills add https://github.com/seqra/opentaint
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next steps: read the official &lt;a href="https://github.com/seqra/opentaint" rel="noopener noreferrer"&gt;OpenTaint repository&lt;/a&gt; and documentation, run it on a real Spring app, and write your first custom rule. Then add it to your own AegisMind vault ai-security-tools/ so future-you can replay the workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@techlatest" rel="noopener noreferrer"&gt;https://substack.com/@techlatest&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opentaint</category>
      <category>opensource</category>
      <category>cybersecurity</category>
      <category>security</category>
    </item>
    <item>
      <title>Build a Private Photo Assistant on Telegram with OpenClaw + MiniCPM-V 4.6</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Tue, 23 Jun 2026 09:52:18 +0000</pubDate>
      <link>https://dev.to/techlatestnet/build-a-private-photo-assistant-on-telegram-with-openclaw-minicpm-v-46-lak</link>
      <guid>https://dev.to/techlatestnet/build-a-private-photo-assistant-on-telegram-with-openclaw-minicpm-v-46-lak</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvww0v3gkcqkxppsu09mk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvww0v3gkcqkxppsu09mk.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most AI assistants can read text, write code, and automate workflows. But the moment a user sends a photo — a receipt, a whiteboard snapshot, a product image, or a screenshot — the automation often stops unless you connect a cloud vision API.&lt;/p&gt;

&lt;p&gt;That usually means API keys, usage costs, network latency, and private images leaving your machine.&lt;/p&gt;

&lt;p&gt;In this guide, you’ll build a private photo assistant that runs entirely on your own infrastructure using OpenClaw and MiniCPM-V 4.6. The assistant can receive images from Telegram, WhatsApp, or a local CLI, analyze them with a lightweight multimodal model running on Ollama, and return structured answers with summaries, OCR results, and suggested replies.&lt;/p&gt;

&lt;p&gt;At the center of the workflow is MiniCPM-V 4.6, a compact vision-language model with approximately 1.3 billion parameters and a footprint of around 1.6 GB. By exposing it through a LitServe API and connecting it to OpenClaw through a custom vision-photo skill, you create a reusable vision capability that any OpenClaw agent can invoke on demand.&lt;/p&gt;

&lt;p&gt;By the end of this tutorial, you’ll have a fully functional photo assistant that can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Describe photos and screenshots&lt;/li&gt;
&lt;li&gt;Extract text from receipts, invoices, and documents&lt;/li&gt;
&lt;li&gt;Answer questions about uploaded images&lt;/li&gt;
&lt;li&gt;Generate structured markdown responses&lt;/li&gt;
&lt;li&gt;Work across Telegram, WhatsApp, and local workflows&lt;/li&gt;
&lt;li&gt;Keep image processing completely private and local&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No cloud vision APIs. No external image uploads. Just OpenClaw, MiniCPM-V 4.6, and your own infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  What you end up with
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenClaw Gateway&lt;/strong&gt;  — always-on control plane&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;minicpm-v4.6&lt;/strong&gt;  — conversational + vision model (~1.6 GB)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vision-photo skill&lt;/strong&gt;  — vision_query.sh → POST /predict on port  &lt;strong&gt;8002&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured markdown replies&lt;/strong&gt;  — summary, details, OCR text, suggested channel message&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fg5btakzkqaftrrt1wzv3.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fg5btakzkqaftrrt1wzv3.gif" width="800" height="170"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Flow&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;User sends a &lt;strong&gt;photo&lt;/strong&gt; on Telegram, WhatsApp, or CLI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MiniCPM-V 4.6&lt;/strong&gt; plans and invokes the &lt;strong&gt;vision-photo&lt;/strong&gt; skill&lt;/li&gt;
&lt;li&gt;Skill POSTs to LitServe &lt;a href="http://127.0.0.1:8002/predict" rel="noopener noreferrer"&gt;http://127.0.0.1:8002/predict&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Structured answer returns to the same channel&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Requirement | Check
Node 22.12+ | node -v
Ollama | ollama -v
Python3.10+ | python3 --version
curl + jq | curl --version &amp;amp;&amp;amp; jq --version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Part 1 — Pull MiniCPM-V 4.6
&lt;/h3&gt;

&lt;p&gt;From &lt;a href="https://ollama.com/library/minicpm-v4.6" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull minicpm-v4.6
ollama run minicpm-v4.6 &lt;span class="s2"&gt;"Hello"&lt;/span&gt; &lt;span class="nt"&gt;--image&lt;/span&gt; ./photo.jpg

Tag | Size | Input

minicpm-v4.6:latest | 1.6 GB | Text, Image
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgaj8r1efe6vcqr22a2xx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgaj8r1efe6vcqr22a2xx.png" width="798" height="172"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Don’t want to spend time installing Ollama manually? TechLatest’s Open WebUI + Ollama deployment provides a pre-configured environment for running MiniCPM-V, Gemma, Qwen, Llama, and DeepSeek models locally. Simply launch the instance and start building multimodal AI workflows without additional setup.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/multi_llm_gpu_vm_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/multi_llm_gpu_vm_support/&lt;/a&gt;&lt;br&gt;&lt;br&gt;
Link: &lt;a href="https://techlatest.net/support/multi_llm_vm_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/multi_llm_vm_support/&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 2 — Vision LitServe API
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Terminal A&lt;/strong&gt;  — start the vision server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;openclaw-minicpm-v
python &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
.env
python generate_sample.py
python vision_server.py

generate_sample.py

&lt;span class="c"&gt;#!/usr/bin/env python3&lt;/span&gt;
from pathlib import Path
from PIL import Image, ImageDraw, ImageFont

out &lt;span class="o"&gt;=&lt;/span&gt; Path&lt;span class="o"&gt;(&lt;/span&gt; __file__ &lt;span class="o"&gt;)&lt;/span&gt;.resolve&lt;span class="o"&gt;()&lt;/span&gt;.parent / &lt;span class="s2"&gt;"samples"&lt;/span&gt;
out.mkdir&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;True&lt;span class="o"&gt;)&lt;/span&gt;
img &lt;span class="o"&gt;=&lt;/span&gt; Image.new&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RGB"&lt;/span&gt;, &lt;span class="o"&gt;(&lt;/span&gt;480, 640&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"#fafafa"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
d &lt;span class="o"&gt;=&lt;/span&gt; ImageDraw.Draw&lt;span class="o"&gt;(&lt;/span&gt;img&lt;span class="o"&gt;)&lt;/span&gt;
try:
    f &lt;span class="o"&gt;=&lt;/span&gt; ImageFont.truetype&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"/System/Library/Fonts/Supplemental/Arial.ttf"&lt;/span&gt;, 18&lt;span class="o"&gt;)&lt;/span&gt;
except OSError:
    f &lt;span class="o"&gt;=&lt;/span&gt; ImageFont.load_default&lt;span class="o"&gt;()&lt;/span&gt;
d.text&lt;span class="o"&gt;((&lt;/span&gt;40, 40&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"COFFEE BEAN Co."&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#111"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
d.text&lt;span class="o"&gt;((&lt;/span&gt;40, 100&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"TOTAL &lt;/span&gt;&lt;span class="nv"&gt;$10&lt;/span&gt;&lt;span class="s2"&gt;.75"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#222"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
d.text&lt;span class="o"&gt;((&lt;/span&gt;40, 140&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"2026-06-23"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#555"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
img.save&lt;span class="o"&gt;(&lt;/span&gt;out / &lt;span class="s2"&gt;"receipt.png"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"Wrote {out / 'receipt.png'}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;


python vision_server.py

&lt;span class="s2"&gt;"""LitServe vision API — MiniCPM-V 4.6 photo understanding for OpenClaw."""&lt;/span&gt;
from __future__ import annotations

import os
from pathlib import Path

import litserve as &lt;span class="nb"&gt;ls
&lt;/span&gt;from dotenv import load_dotenv

import vision_backend as vb

load_dotenv&lt;span class="o"&gt;()&lt;/span&gt;

PORT &lt;span class="o"&gt;=&lt;/span&gt; int&lt;span class="o"&gt;(&lt;/span&gt;os.getenv&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"PORT"&lt;/span&gt;, &lt;span class="s2"&gt;"8002"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;

STRUCTURED_PROMPT &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"""Analyze the image and answer the user's question.
Return markdown with these sections when relevant:
## Summary
(one paragraph)

## Details
(bullet points)

## Text found
(any visible text, or "&lt;/span&gt;none&lt;span class="s2"&gt;")

## Suggested reply
(a short message suitable for Telegram/WhatsApp)
"""&lt;/span&gt;

class VisionPhotoAPI&lt;span class="o"&gt;(&lt;/span&gt;ls.LitAPI&lt;span class="o"&gt;)&lt;/span&gt;:
    def setup&lt;span class="o"&gt;(&lt;/span&gt;self, device&lt;span class="o"&gt;)&lt;/span&gt;:
        self.model &lt;span class="o"&gt;=&lt;/span&gt; vb.VISION_MODEL

    def decode_request&lt;span class="o"&gt;(&lt;/span&gt;self, request&lt;span class="o"&gt;)&lt;/span&gt;:
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"query"&lt;/span&gt;: &lt;span class="o"&gt;(&lt;/span&gt;request.get&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; or &lt;span class="s2"&gt;"What is in this photo?"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;.strip&lt;span class="o"&gt;()&lt;/span&gt;,
            &lt;span class="s2"&gt;"image_path"&lt;/span&gt;: &lt;span class="o"&gt;(&lt;/span&gt;request.get&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"image_path"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; or &lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;.strip&lt;span class="o"&gt;()&lt;/span&gt;,
        &lt;span class="o"&gt;}&lt;/span&gt;

    def predict&lt;span class="o"&gt;(&lt;/span&gt;self, inputs&lt;span class="o"&gt;)&lt;/span&gt;:
        path &lt;span class="o"&gt;=&lt;/span&gt; Path&lt;span class="o"&gt;(&lt;/span&gt;inputs[&lt;span class="s2"&gt;"image_path"&lt;/span&gt;&lt;span class="o"&gt;])&lt;/span&gt;.expanduser&lt;span class="o"&gt;()&lt;/span&gt;.resolve&lt;span class="o"&gt;()&lt;/span&gt;
        prompt &lt;span class="o"&gt;=&lt;/span&gt; f&lt;span class="s2"&gt;"{STRUCTURED_PROMPT}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;User question: {inputs['query']}"&lt;/span&gt;
        try:
            answer &lt;span class="o"&gt;=&lt;/span&gt; vb.chat_vision&lt;span class="o"&gt;(&lt;/span&gt;prompt, &lt;span class="o"&gt;[&lt;/span&gt;path]&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"output"&lt;/span&gt;: answer, &lt;span class="s2"&gt;"model"&lt;/span&gt;: self.model, &lt;span class="s2"&gt;"image_path"&lt;/span&gt;: str&lt;span class="o"&gt;(&lt;/span&gt;path&lt;span class="o"&gt;)}&lt;/span&gt;
        except vb.VisionError as exc:
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"error"&lt;/span&gt;: str&lt;span class="o"&gt;(&lt;/span&gt;exc&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"model"&lt;/span&gt;: self.model&lt;span class="o"&gt;}&lt;/span&gt;

    def encode_response&lt;span class="o"&gt;(&lt;/span&gt;self, output&lt;span class="o"&gt;)&lt;/span&gt;:
        &lt;span class="k"&gt;return &lt;/span&gt;output

&lt;span class="k"&gt;if &lt;/span&gt;__name__ &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;" __main__"&lt;/span&gt;:
    server &lt;span class="o"&gt;=&lt;/span&gt; ls.LitServer&lt;span class="o"&gt;(&lt;/span&gt;VisionPhotoAPI&lt;span class="o"&gt;()&lt;/span&gt;, &lt;span class="nv"&gt;accelerator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"auto"&lt;/span&gt;, &lt;span class="nb"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;False&lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"Vision API on http://127.0.0.1:{PORT}/predict (model: {vb.VISION_MODEL})"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    server.run&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;PORT&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fw21x0jm7p9vsbm3q3fdu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fw21x0jm7p9vsbm3q3fdu.gif" width="800" height="170"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Server prints: Vision API on &lt;a href="http://127.0.0.1:8002/predict" rel="noopener noreferrer"&gt;http://127.0.0.1:8002/predict&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Request shape:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the total on this receipt?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/absolute/path/to/receipt.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;## Summary&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;…&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minicpm-v4.6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;…&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Sample image&lt;/strong&gt; the API reads:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fe5bax0h6scn9q2olpdgo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fe5bax0h6scn9q2olpdgo.png" width="480" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Test with client.py:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;python&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;client.py&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;--image&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;samples/receipt.png&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;--query&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"OCR this receipt"&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;client.py&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;#!/usr/bin/env&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;python&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="s2"&gt;"""CLI client for the vision photo API."""&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;__future__&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;annotations&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;argparse&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;json&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;os&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;urllib.request&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;DEFAULT_URL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;os.environ.get(&lt;/span&gt;&lt;span class="s2"&gt;"VISION_API_URL"&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://127.0.0.1:8002"&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;main()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;None:&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;p&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;argparse.ArgumentParser(description=&lt;/span&gt;&lt;span class="s2"&gt;"Query local MiniCPM-V vision API"&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;p.add_argument(&lt;/span&gt;&lt;span class="s2"&gt;"--image"&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;required=True,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;help=&lt;/span&gt;&lt;span class="s2"&gt;"Path to image file"&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;p.add_argument(&lt;/span&gt;&lt;span class="s2"&gt;"--query"&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;default=&lt;/span&gt;&lt;span class="s2"&gt;"Describe this photo in detail."&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;p.add_argument(&lt;/span&gt;&lt;span class="s2"&gt;"--url"&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;default=f&lt;/span&gt;&lt;span class="s2"&gt;"{DEFAULT_URL.rstrip('/')}/predict"&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;args&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;p.parse_args()&lt;/span&gt;&lt;span class="w"&gt;

    &lt;/span&gt;&lt;span class="err"&gt;body&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;json.dumps(&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;args.query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"image_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;args.image&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;).encode()&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;req&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;urllib.request.Request(args.url,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;data=body,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;headers=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Content-Type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"application/json"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;urllib.request.urlopen(req,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;timeout=&lt;/span&gt;&lt;span class="mi"&gt;180&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;resp:&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="err"&gt;data&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;json.loads(resp.read().decode())&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;print(data.get(&lt;/span&gt;&lt;span class="s2"&gt;"output"&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;data.get(&lt;/span&gt;&lt;span class="s2"&gt;"error"&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;data)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;__name__&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;" __main__"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;main()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected sections in the output: &lt;strong&gt;Summary&lt;/strong&gt; , &lt;strong&gt;Details&lt;/strong&gt; , &lt;strong&gt;Text found&lt;/strong&gt; , &lt;strong&gt;Suggested reply&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 3 — Install OpenClaw
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Terminal B:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;guides/openclaw-minicpm-v&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;./use-node&lt;/span&gt;&lt;span class="mi"&gt;22&lt;/span&gt;&lt;span class="err"&gt;.sh&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;npm&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;install&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-g&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;openclaw@latest&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;openclaw&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;onboard&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;--install-daemon&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;openclaw&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;models&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ollama/minicpm-v&lt;/span&gt;&lt;span class="mf"&gt;4.6&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;#!/usr/bin/env&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bash&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-euo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;pipefail&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;NVM_DIR=&lt;/span&gt;&lt;span class="s2"&gt;"${NVM_DIR:-$HOME/.nvm}"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="err"&gt;-s&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$NVM_DIR/nvm.sh"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="err"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;then&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$NVM_DIR/nvm.sh"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;nvm&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;use&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$(cat "&lt;/span&gt;&lt;span class="err"&gt;$(dirname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$0"&lt;/span&gt;&lt;span class="err"&gt;)/.nvmrc&lt;/span&gt;&lt;span class="s2"&gt;")"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;else&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"nvm not found — install Node 22+"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;gt;&amp;amp;&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;exit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;fi&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Node: $(node -v)"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In openclaw.json sets the primary model and VISION_API_URL.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;// Merge into ~/.openclaw/openclaw.json
&lt;span class="o"&gt;{&lt;/span&gt;
  agents: &lt;span class="o"&gt;{&lt;/span&gt;
    defaults: &lt;span class="o"&gt;{&lt;/span&gt;
      model: &lt;span class="o"&gt;{&lt;/span&gt; primary: &lt;span class="s2"&gt;"ollama/minicpm-v4.6"&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;,
      skills: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"vision-photo"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;,
    &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;}&lt;/span&gt;,

  models: &lt;span class="o"&gt;{&lt;/span&gt;
    providers: &lt;span class="o"&gt;{&lt;/span&gt;
      ollama: &lt;span class="o"&gt;{&lt;/span&gt;
        apiKey: &lt;span class="s2"&gt;"ollama-local"&lt;/span&gt;,
        baseUrl: &lt;span class="s2"&gt;"http://127.0.0.1:11434"&lt;/span&gt;,
        api: &lt;span class="s2"&gt;"ollama"&lt;/span&gt;,
        timeoutSeconds: 300,
        models: &lt;span class="o"&gt;[&lt;/span&gt;
          &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nb"&gt;id&lt;/span&gt;: &lt;span class="s2"&gt;"minicpm-v4.6"&lt;/span&gt;,
            name: &lt;span class="s2"&gt;"MiniCPM-V 4.6"&lt;/span&gt;,
            reasoning: &lt;span class="nb"&gt;false&lt;/span&gt;,
            input: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;, &lt;span class="s2"&gt;"image"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;,
            contextWindow: 256000,
            maxTokens: 8192,
            params: &lt;span class="o"&gt;{&lt;/span&gt; keep_alive: &lt;span class="s2"&gt;"15m"&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;,
          &lt;span class="o"&gt;}&lt;/span&gt;,
        &lt;span class="o"&gt;]&lt;/span&gt;,
      &lt;span class="o"&gt;}&lt;/span&gt;,
    &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;}&lt;/span&gt;,

  skills: &lt;span class="o"&gt;{&lt;/span&gt;
    entries: &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="s2"&gt;"vision-photo"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
        enabled: &lt;span class="nb"&gt;true&lt;/span&gt;,
        &lt;span class="nb"&gt;env&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
          VISION_API_URL: &lt;span class="s2"&gt;"http://127.0.0.1:8002"&lt;/span&gt;,
        &lt;span class="o"&gt;}&lt;/span&gt;,
      &lt;span class="o"&gt;}&lt;/span&gt;,
    &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;}&lt;/span&gt;,
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looking for a faster setup? TechLatest offers a pre-configured OpenClaw environment that includes the gateway, agent runtime, and common dependencies, allowing developers to focus on building skills and automations instead of infrastructure setup.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/openclaw-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/openclaw-support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 4 — Install vision-photo skill
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;chmod&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;skill&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;vision&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;photo&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/*&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt;
&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;install&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;skill&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt;
&lt;span class="n"&gt;openclaw&lt;/span&gt; &lt;span class="n"&gt;gateway&lt;/span&gt; &lt;span class="n"&gt;restart&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The skill tells the agent to run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vision_query.sh &lt;span class="s2"&gt;"/path/to/image.jpg"&lt;/span&gt; &lt;span class="s2"&gt;"user question"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="http://skills/vision-photo/SKILL.md." rel="noopener noreferrer"&gt;skills/vision-photo/SKILL.md&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 5 — Telegram / WhatsApp
&lt;/h3&gt;

&lt;p&gt;Follow &lt;a href="https://docs.openclaw.ai/" rel="noopener noreferrer"&gt;OpenClaw channels docs&lt;/a&gt; for your platform. Keep &lt;strong&gt;DM pairing&lt;/strong&gt; enabled for security.&lt;/p&gt;

&lt;p&gt;When a user sends a photo:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenClaw saves media to a local path&lt;/li&gt;
&lt;li&gt;Agent invokes &lt;strong&gt;vision-photo&lt;/strong&gt; with path + caption&lt;/li&gt;
&lt;li&gt;LitServe returns structured markdown&lt;/li&gt;
&lt;li&gt;Agent sends &lt;strong&gt;suggested reply&lt;/strong&gt; to the channel&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F54l3arougxv4shihm71t.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F54l3arougxv4shihm71t.gif" width="800" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Example channel reply from the demo receipt:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Your receipt total is _ **&lt;/em&gt;$10.75_**&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Part 6 — Smoke test
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sh&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Runs: Ollama check → sample image → API health → skill script query.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwdsj6n8an0olxdqqh0uv.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwdsj6n8an0olxdqqh0uv.gif" width="800" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For a step-by-step walkthrough and complete implementation details, check out the full guide &lt;a href="https://ayush7614.github.io/agentic-ai-ecosystem/guides/openclaw-minicpm-v/tutorial/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deploy the Complete Stack on TechLatest
&lt;/h3&gt;

&lt;p&gt;You can deploy the entire private photo assistant stack using TechLatest AI infrastructure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open WebUI + Ollama for local multimodal inference&lt;/li&gt;
&lt;li&gt;OpenClaw for agent orchestration and messaging integrations&lt;/li&gt;
&lt;li&gt;JupyterHub for experimentation and evaluation&lt;/li&gt;
&lt;li&gt;AWS, Azure, and GCP deployment options&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This allows developers to build privacy-first vision assistants without spending hours configuring infrastructure and dependencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;You’ve successfully built a private multimodal assistant that can see, read, and understand images directly from your messaging channels.&lt;/p&gt;

&lt;p&gt;Using OpenClaw as the orchestration layer, MiniCPM-V 4.6 as the vision model, and LitServe as the local inference API, you’ve created a workflow where users can send a photo and receive structured, actionable insights without relying on external vision services.&lt;/p&gt;

&lt;p&gt;The architecture is intentionally simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenClaw handles agent orchestration and channel integrations&lt;/li&gt;
&lt;li&gt;MiniCPM-V 4.6 provides image understanding and OCR capabilities&lt;/li&gt;
&lt;li&gt;LitServe exposes a lightweight local API&lt;/li&gt;
&lt;li&gt;The vision-photo skill connects everything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because the entire stack runs locally, sensitive screenshots, receipts, documents, and personal photos never leave your infrastructure. Whether you’re building customer support agents, document processing workflows, field inspection tools, or personal AI assistants, the same pattern can be extended with additional skills and automation.&lt;/p&gt;

&lt;p&gt;From a single image to a complete conversation, your OpenClaw agent now has eyes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>whatsapp</category>
      <category>openclaw</category>
      <category>openclawtutorial</category>
      <category>aiassistant</category>
    </item>
    <item>
      <title>MiniCPM-V MCP Server — Give Your Agent Eyes</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Tue, 23 Jun 2026 08:48:21 +0000</pubDate>
      <link>https://dev.to/techlatestnet/minicpm-v-mcp-server-give-your-agent-eyes-5c48</link>
      <guid>https://dev.to/techlatestnet/minicpm-v-mcp-server-give-your-agent-eyes-5c48</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fo0m8ljigluhvhfv1cnq8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fo0m8ljigluhvhfv1cnq8.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Build an MCP server that exposes describe_image, ocr_document, and compare_images so any MCP host — Cursor, Claude Desktop, Hermes — can understand screenshots, receipts, and UI diffs through one protocol.&lt;/p&gt;

&lt;p&gt;Powered by &lt;a href="https://ollama.com/library/minicpm-v4.6" rel="noopener noreferrer"&gt;MiniCPM-V 4.6&lt;/a&gt; via Ollama: 1.3B params · 1.6 GB · text + image · 256K context — the smallest vision model in the MiniCPM family, tuned for edge and phone deployment.&lt;/p&gt;

&lt;p&gt;The moment: paste a screenshot into Cursor and ask &lt;em&gt;“What changed in this UI?”&lt;/em&gt; The agent calls compare_images — no cloud vision API, no API key.&lt;/p&gt;

&lt;h3&gt;
  
  
  What you’ll learn
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Vision as MCP tools — why describe_image / ocr_document / compare_images beat one-off scripts&lt;/li&gt;
&lt;li&gt;One protocol, many hosts — same server in Cursor, Claude Desktop, and Hermes&lt;/li&gt;
&lt;li&gt;MiniCPM-V 4.6 on Ollama — pull, run, and wire the 1.6 GB multimodal model locally&lt;/li&gt;
&lt;li&gt;Private document OCR — extract receipt and whiteboard text without sending pixels to the cloud&lt;/li&gt;
&lt;li&gt;Before/after UI diffs — compare two screenshots for regression review&lt;/li&gt;
&lt;li&gt;Runnable Python MCP server + an end-to-end agent demo (works offline with OLLAMA_MOCK=1)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fr50w455v3v966o6ljz8w.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fr50w455v3v966o6ljz8w.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Introduction — agents without eyes
&lt;/h3&gt;

&lt;p&gt;Your agent can grep code, run tests, and provision infra — but the moment someone pastes a &lt;strong&gt;screenshot&lt;/strong&gt; , a &lt;strong&gt;receipt&lt;/strong&gt; , or a &lt;strong&gt;Figma export&lt;/strong&gt; , the loop breaks unless you bolt on a vision API. That means API keys, cloud latency, and pixels leaving your machine.&lt;/p&gt;

&lt;p&gt;MiniCPM-V 4.6 is built for the opposite: &lt;strong&gt;1.3B parameters&lt;/strong&gt; , &lt;strong&gt;~1.6 GB&lt;/strong&gt; on disk, &lt;strong&gt;256K context&lt;/strong&gt; , and native &lt;strong&gt;text + image&lt;/strong&gt; input via Ollama. Wrap it in MCP and every host discovers the same three vision tools at connect time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1 — MiniCPM-V 4.6 on Ollama
&lt;/h3&gt;

&lt;p&gt;From the &lt;a href="https://ollama.com/library/minicpm-v4.6" rel="noopener noreferrer"&gt;Ollama model page&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F2avvg8r4kyqc92w11pyf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F2avvg8r4kyqc92w11pyf.png" width="524" height="183"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull minicpm-v4.6
ollama run minicpm-v4.6 &lt;span class="s2"&gt;"Describe this image"&lt;/span&gt; &lt;span class="nt"&gt;--image&lt;/span&gt; ./photo.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0zw2qsycx93k1wr24f5j.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0zw2qsycx93k1wr24f5j.gif" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This guide uses the same model through Ollama’s &lt;strong&gt;HTTP API&lt;/strong&gt; so the MCP server can batch tool calls without spawning a CLI per request.&lt;/p&gt;

&lt;p&gt;If you want a ready-to-use environment, TechLatest offers preconfigured Ollama and Open WebUI deployments that allow developers to run MiniCPM-V, Gemma, Qwen, Llama, and DeepSeek models locally without spending time on installation and configuration.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/multi_llm_gpu_vm_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/multi_llm_gpu_vm_support/&lt;/a&gt;&lt;br&gt;&lt;br&gt;
Link: &lt;a href="https://techlatest.net/support/multi_llm_vm_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/multi_llm_vm_support/&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 2 — Why MCP for vision
&lt;/h3&gt;

&lt;p&gt;You could write a Python script that calls Ollama and paste output into chat. But then Cursor, Claude Desktop, and Hermes each need their own glue.&lt;/p&gt;

&lt;p&gt;MCP collapses that. You write &lt;strong&gt;one server&lt;/strong&gt; ; hosts discover tools at &lt;strong&gt;capability exchange&lt;/strong&gt;. Add a fourth tool later, and every host sees it on reconnect — no host-side changes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fr50w455v3v966o6ljz8w.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fr50w455v3v966o6ljz8w.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fz6y388msvspphnb10uzr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fz6y388msvspphnb10uzr.png" width="483" height="228"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If host/client/server isn’t second nature yet, read the &lt;a href="https://dev.to/techlatestnet/model-context-protocol-mcp-full-visual-guide-398"&gt;MCP Visual Guide&lt;/a&gt; first.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Vision becomes even more powerful when connected to agent workflows. Developers can deploy TechLatest’s CrewAI Studio VM and create multi-agent systems where one agent performs OCR, another analyzes screenshots, and a third generates reports. The same MiniCPM-V MCP server can be shared across all agents through MCP, creating a reusable vision layer for complex automation workflows.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/crewai-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/crewai-support/&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 3 — Quick start
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;guides/minicpm-v-mcp-server
python &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env

ollama pull minicpm-v4.6
python examples/generate_fixtures.py
python examples/agent_demo.py

generate_fixtures.py

&lt;span class="c"&gt;#!/usr/bin/env python3&lt;/span&gt;
&lt;span class="s2"&gt;"""Generate sample images for MCP demos (no external assets required)."""&lt;/span&gt;
from __future__ import annotations

from pathlib import Path

from PIL import Image, ImageDraw, ImageFont

FIXTURES &lt;span class="o"&gt;=&lt;/span&gt; Path&lt;span class="o"&gt;(&lt;/span&gt; __file__ &lt;span class="o"&gt;)&lt;/span&gt;.resolve&lt;span class="o"&gt;()&lt;/span&gt;.parent / &lt;span class="s2"&gt;"fixtures"&lt;/span&gt;

def _font&lt;span class="o"&gt;(&lt;/span&gt;size: int&lt;span class="o"&gt;)&lt;/span&gt;:
    &lt;span class="k"&gt;for &lt;/span&gt;name &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"DejaVuSans.ttf"&lt;/span&gt;, &lt;span class="s2"&gt;"/System/Library/Fonts/Supplemental/Arial.ttf"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;:
        try:
            &lt;span class="k"&gt;return &lt;/span&gt;ImageFont.truetype&lt;span class="o"&gt;(&lt;/span&gt;name, size&lt;span class="o"&gt;)&lt;/span&gt;
        except OSError:
            &lt;span class="k"&gt;continue
    return &lt;/span&gt;ImageFont.load_default&lt;span class="o"&gt;()&lt;/span&gt;

def receipt&lt;span class="o"&gt;()&lt;/span&gt; -&amp;gt; None:
    img &lt;span class="o"&gt;=&lt;/span&gt; Image.new&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RGB"&lt;/span&gt;, &lt;span class="o"&gt;(&lt;/span&gt;480, 640&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"#fafafa"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    d &lt;span class="o"&gt;=&lt;/span&gt; ImageDraw.Draw&lt;span class="o"&gt;(&lt;/span&gt;img&lt;span class="o"&gt;)&lt;/span&gt;
    f, fs &lt;span class="o"&gt;=&lt;/span&gt; _font&lt;span class="o"&gt;(&lt;/span&gt;22&lt;span class="o"&gt;)&lt;/span&gt;, _font&lt;span class="o"&gt;(&lt;/span&gt;16&lt;span class="o"&gt;)&lt;/span&gt;
    d.text&lt;span class="o"&gt;((&lt;/span&gt;40, 40&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"COFFEE BEAN Co."&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#1a1a1a"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
    d.text&lt;span class="o"&gt;((&lt;/span&gt;40, 90&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"123 Main St · San Francisco"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#555"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;fs&lt;span class="o"&gt;)&lt;/span&gt;
    lines &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;
        &lt;span class="s2"&gt;"Latte (Oat) &lt;/span&gt;&lt;span class="nv"&gt;$5&lt;/span&gt;&lt;span class="s2"&gt;.50"&lt;/span&gt;,
        &lt;span class="s2"&gt;"Croissant &lt;/span&gt;&lt;span class="nv"&gt;$4&lt;/span&gt;&lt;span class="s2"&gt;.25"&lt;/span&gt;,
        &lt;span class="s2"&gt;"Tip &lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;.00"&lt;/span&gt;,
        &lt;span class="s2"&gt;"─────────────────────────"&lt;/span&gt;,
        &lt;span class="s2"&gt;"TOTAL &lt;/span&gt;&lt;span class="nv"&gt;$10&lt;/span&gt;&lt;span class="s2"&gt;.75"&lt;/span&gt;,
        &lt;span class="s2"&gt;"Card **** 4242"&lt;/span&gt;,
        &lt;span class="s2"&gt;"2026-06-23 09:14 AM"&lt;/span&gt;,
    &lt;span class="o"&gt;]&lt;/span&gt;
    y &lt;span class="o"&gt;=&lt;/span&gt; 150
    &lt;span class="k"&gt;for &lt;/span&gt;line &lt;span class="k"&gt;in &lt;/span&gt;lines:
        d.text&lt;span class="o"&gt;((&lt;/span&gt;40, y&lt;span class="o"&gt;)&lt;/span&gt;, line, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#222"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;fs&lt;span class="o"&gt;)&lt;/span&gt;
        y +&lt;span class="o"&gt;=&lt;/span&gt; 36
    img.save&lt;span class="o"&gt;(&lt;/span&gt;FIXTURES / &lt;span class="s2"&gt;"sample_receipt.png"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

def diagram_v1&lt;span class="o"&gt;()&lt;/span&gt; -&amp;gt; None:
    img &lt;span class="o"&gt;=&lt;/span&gt; Image.new&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RGB"&lt;/span&gt;, &lt;span class="o"&gt;(&lt;/span&gt;640, 400&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"#0f172a"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    d &lt;span class="o"&gt;=&lt;/span&gt; ImageDraw.Draw&lt;span class="o"&gt;(&lt;/span&gt;img&lt;span class="o"&gt;)&lt;/span&gt;
    f &lt;span class="o"&gt;=&lt;/span&gt; _font&lt;span class="o"&gt;(&lt;/span&gt;18&lt;span class="o"&gt;)&lt;/span&gt;
    d.rounded_rectangle&lt;span class="o"&gt;((&lt;/span&gt;40, 80, 200, 160&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="nv"&gt;radius&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#1e3a5f"&lt;/span&gt;, &lt;span class="nv"&gt;outline&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#38bdf8"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    d.text&lt;span class="o"&gt;((&lt;/span&gt;70, 110&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"API"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#e2e8f0"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
    d.rounded_rectangle&lt;span class="o"&gt;((&lt;/span&gt;420, 80, 580, 160&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="nv"&gt;radius&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#14532d"&lt;/span&gt;, &lt;span class="nv"&gt;outline&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#4ade80"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    d.text&lt;span class="o"&gt;((&lt;/span&gt;440, 110&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"Qdrant"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#e2e8f0"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
    d.line&lt;span class="o"&gt;((&lt;/span&gt;200, 120, 420, 120&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#94a3b8"&lt;/span&gt;, &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3&lt;span class="o"&gt;)&lt;/span&gt;
    d.text&lt;span class="o"&gt;((&lt;/span&gt;250, 200&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"v1 — sync pipeline"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#94a3b8"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
    img.save&lt;span class="o"&gt;(&lt;/span&gt;FIXTURES / &lt;span class="s2"&gt;"diagram_v1.png"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

def diagram_v2&lt;span class="o"&gt;()&lt;/span&gt; -&amp;gt; None:
    img &lt;span class="o"&gt;=&lt;/span&gt; Image.new&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RGB"&lt;/span&gt;, &lt;span class="o"&gt;(&lt;/span&gt;640, 400&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"#0f172a"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    d &lt;span class="o"&gt;=&lt;/span&gt; ImageDraw.Draw&lt;span class="o"&gt;(&lt;/span&gt;img&lt;span class="o"&gt;)&lt;/span&gt;
    f &lt;span class="o"&gt;=&lt;/span&gt; _font&lt;span class="o"&gt;(&lt;/span&gt;18&lt;span class="o"&gt;)&lt;/span&gt;
    d.rounded_rectangle&lt;span class="o"&gt;((&lt;/span&gt;40, 80, 200, 160&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="nv"&gt;radius&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#1e3a5f"&lt;/span&gt;, &lt;span class="nv"&gt;outline&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#38bdf8"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    d.text&lt;span class="o"&gt;((&lt;/span&gt;60, 110&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"LitServe"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#e2e8f0"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
    d.rounded_rectangle&lt;span class="o"&gt;((&lt;/span&gt;250, 60, 410, 140&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="nv"&gt;radius&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#4c1d95"&lt;/span&gt;, &lt;span class="nv"&gt;outline&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#a78bfa"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    d.text&lt;span class="o"&gt;((&lt;/span&gt;270, 90&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"CrewAI"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#e2e8f0"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
    d.rounded_rectangle&lt;span class="o"&gt;((&lt;/span&gt;420, 80, 580, 160&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="nv"&gt;radius&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#14532d"&lt;/span&gt;, &lt;span class="nv"&gt;outline&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#4ade80"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    d.text&lt;span class="o"&gt;((&lt;/span&gt;440, 110&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"Qdrant"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#e2e8f0"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
    d.line&lt;span class="o"&gt;((&lt;/span&gt;200, 120, 250, 100&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#94a3b8"&lt;/span&gt;, &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3&lt;span class="o"&gt;)&lt;/span&gt;
    d.line&lt;span class="o"&gt;((&lt;/span&gt;410, 100, 420, 120&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#94a3b8"&lt;/span&gt;, &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3&lt;span class="o"&gt;)&lt;/span&gt;
    d.text&lt;span class="o"&gt;((&lt;/span&gt;200, 200&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"v2 — agentic pipeline"&lt;/span&gt;, &lt;span class="nv"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#fbbf24"&lt;/span&gt;, &lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;f&lt;span class="o"&gt;)&lt;/span&gt;
    img.save&lt;span class="o"&gt;(&lt;/span&gt;FIXTURES / &lt;span class="s2"&gt;"diagram_v2.png"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

def main&lt;span class="o"&gt;()&lt;/span&gt; -&amp;gt; None:
    FIXTURES.mkdir&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;parents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;True, &lt;span class="nv"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;True&lt;span class="o"&gt;)&lt;/span&gt;
    receipt&lt;span class="o"&gt;()&lt;/span&gt;
    diagram_v1&lt;span class="o"&gt;()&lt;/span&gt;
    diagram_v2&lt;span class="o"&gt;()&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"Fixtures written to {FIXTURES}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;__name__ &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;" __main__"&lt;/span&gt;:
    main&lt;span class="o"&gt;()&lt;/span&gt;

agent_demo.py

&lt;span class="c"&gt;#!/usr/bin/env python3&lt;/span&gt;
&lt;span class="s2"&gt;"""End-to-end demo — MiniCPM-V MCP vision tools (works offline with OLLAMA_MOCK=1).

Simulates what Cursor / Claude Desktop sees when the agent calls vision tools.
"""&lt;/span&gt;
from __future__ import annotations

import json
import os
import sys
from pathlib import Path

ROOT &lt;span class="o"&gt;=&lt;/span&gt; Path&lt;span class="o"&gt;(&lt;/span&gt; __file__ &lt;span class="o"&gt;)&lt;/span&gt;.resolve&lt;span class="o"&gt;()&lt;/span&gt;.parent
sys.path.insert&lt;span class="o"&gt;(&lt;/span&gt;0, str&lt;span class="o"&gt;(&lt;/span&gt;ROOT&lt;span class="o"&gt;))&lt;/span&gt;

import vision_backend as vb &lt;span class="c"&gt;# noqa: E402&lt;/span&gt;
from server import compare_images, describe_image, ocr_document &lt;span class="c"&gt;# noqa: E402&lt;/span&gt;

FIXTURES &lt;span class="o"&gt;=&lt;/span&gt; ROOT / &lt;span class="s2"&gt;"fixtures"&lt;/span&gt;

def _banner&lt;span class="o"&gt;(&lt;/span&gt;title: str&lt;span class="o"&gt;)&lt;/span&gt; -&amp;gt; None:
    print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;{'=' * 60}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt; {title}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;{'=' * 60}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

def _parse&lt;span class="o"&gt;(&lt;/span&gt;result: str&lt;span class="o"&gt;)&lt;/span&gt; -&amp;gt; dict:
    &lt;span class="k"&gt;return &lt;/span&gt;json.loads&lt;span class="o"&gt;(&lt;/span&gt;result&lt;span class="o"&gt;)&lt;/span&gt;

def main&lt;span class="o"&gt;()&lt;/span&gt; -&amp;gt; None:
    os.environ.setdefault&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"OLLAMA_VISION_MODEL"&lt;/span&gt;, &lt;span class="s2"&gt;"minicpm-v4.6"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;not FIXTURES.exists&lt;span class="o"&gt;()&lt;/span&gt; or not list&lt;span class="o"&gt;(&lt;/span&gt;FIXTURES.glob&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"*.png"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;:
        from generate_fixtures import main as gen &lt;span class="c"&gt;# noqa: E402&lt;/span&gt;

        gen&lt;span class="o"&gt;()&lt;/span&gt;

    status &lt;span class="o"&gt;=&lt;/span&gt; vb.health_check&lt;span class="o"&gt;()&lt;/span&gt;
    _banner&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"MiniCPM-V 4.6 Vision MCP — Agent Demo"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;f&lt;span class="s2"&gt;"Model: {vb.VISION_MODEL} · Mode: {status.get('mode', '?')}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;not status.get&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; and not vb.MOCK:
        print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;⚠️ Ollama offline — re-run with OLLAMA_MOCK=1 or start Ollama.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;# Scenario 1 — describe screenshot&lt;/span&gt;
    _banner&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Scenario 1 — describe_image"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'[Tool: describe_image] path=fixtures/diagram_v2.png'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    r1 &lt;span class="o"&gt;=&lt;/span&gt; _parse&lt;span class="o"&gt;(&lt;/span&gt;describe_image&lt;span class="o"&gt;(&lt;/span&gt;str&lt;span class="o"&gt;(&lt;/span&gt;FIXTURES / &lt;span class="s2"&gt;"diagram_v2.png"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="s2"&gt;"What services are shown?"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;## Architecture summary&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;r1.get&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"result"&lt;/span&gt;, r1.get&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"error"&lt;/span&gt;, r1&lt;span class="o"&gt;)))&lt;/span&gt;

    &lt;span class="c"&gt;# Scenario 2 — OCR receipt&lt;/span&gt;
    _banner&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Scenario 2 — ocr_document"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'[Tool: ocr_document] path=fixtures/sample_receipt.png'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    r2 &lt;span class="o"&gt;=&lt;/span&gt; _parse&lt;span class="o"&gt;(&lt;/span&gt;ocr_document&lt;span class="o"&gt;(&lt;/span&gt;str&lt;span class="o"&gt;(&lt;/span&gt;FIXTURES / &lt;span class="s2"&gt;"sample_receipt.png"&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;## Receipt OCR&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;r2.get&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"result"&lt;/span&gt;, r2.get&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"error"&lt;/span&gt;, r2&lt;span class="o"&gt;)))&lt;/span&gt;

    &lt;span class="c"&gt;# Scenario 3 — compare before/after&lt;/span&gt;
    _banner&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Scenario 3 — compare_images"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"[Tool: compare_images] v1 → v2 pipeline diagrams"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    r3 &lt;span class="o"&gt;=&lt;/span&gt; _parse&lt;span class="o"&gt;(&lt;/span&gt;
        compare_images&lt;span class="o"&gt;(&lt;/span&gt;
            str&lt;span class="o"&gt;(&lt;/span&gt;FIXTURES / &lt;span class="s2"&gt;"diagram_v1.png"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;,
            str&lt;span class="o"&gt;(&lt;/span&gt;FIXTURES / &lt;span class="s2"&gt;"diagram_v2.png"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;,
            &lt;span class="nv"&gt;focus&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"new components and labels"&lt;/span&gt;,
        &lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;## Visual diff&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;r3.get&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"result"&lt;/span&gt;, r3.get&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"error"&lt;/span&gt;, r3&lt;span class="o"&gt;)))&lt;/span&gt;

    _banner&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Done — wire examples/server.py into Cursor MCP settings"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    print&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"See examples/cursor_mcp.json.example"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;__name__ &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;" __main__"&lt;/span&gt;:
    main&lt;span class="o"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Part 4 — The vision backend
&lt;/h3&gt;

&lt;p&gt;examples/vision_backend.py encodes images as base64 and POSTs to OLLAMA_HOST/api/chat:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# minicpm-v4.6
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;images&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;images_b64&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;vision_backend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;

&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Ollama vision backend for MiniCPM-V 4.6 — shared by MCP server and demos.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;__future__&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;

&lt;span class="n"&gt;OLLAMA_HOST&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OLLAMA_HOST&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://127.0.0.1:11434&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;rstrip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;VISION_MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OLLAMA_VISION_MODEL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minicpm-v4.6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;MOCK&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OLLAMA_MOCK&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;SUPPORTED_SUFFIXES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.jpeg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.webp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.gif&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.bmp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;VisionError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_encode_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_file&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;VisionError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Image not found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;suffix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;SUPPORTED_SUFFIXES&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;VisionError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unsupported image type: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;suffix&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;b64encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_bytes&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ascii&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_mock_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[mock &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] Processed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;image_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; image(s).&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Prompt preview: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;…&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Set OLLAMA_MOCK=0 and run `ollama pull minicpm-v4.6` for live inference.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat_vision&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_paths&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;120.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Send a vision chat request to Ollama.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;MOCK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_mock_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_paths&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;images_b64&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;_encode_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;image_paths&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;images&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;images_b64&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;OLLAMA_HOST&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/api/chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ConnectError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;VisionError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cannot reach Ollama at &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;OLLAMA_HOST&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Start Ollama and run: ollama pull &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HTTPStatusError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;VisionError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ollama error &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;

    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;VisionError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Empty response from Ollama&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;health_check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Return model + connectivity status for demos.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;MOCK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mock&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;OLLAMA_HOST&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/api/tags&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;models&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])}&lt;/span&gt;
            &lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;VISION_MODEL&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;live&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ollama_host&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;OLLAMA_HOST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# noqa: BLE001 — demo helper
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;offline&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Environment variables (see .env.example):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="err"&gt;.env.example&lt;/span&gt;

&lt;span class="c"&gt;# Ollama host (default local)
&lt;/span&gt;&lt;span class="py"&gt;OLLAMA_HOST&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;http://127.0.0.1:11434&lt;/span&gt;

&lt;span class="c"&gt;# Vision model — 1.6 GB, text + image, 256K context
&lt;/span&gt;&lt;span class="py"&gt;OLLAMA_VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;minicpm-v4.6&lt;/span&gt;

&lt;span class="c"&gt;# Set to 1 to run agent_demo without Ollama (offline smoke test)
&lt;/span&gt;&lt;span class="py"&gt;OLLAMA_MOCK&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh4d8qczdgqopa7l5zgr4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh4d8qczdgqopa7l5zgr4.png" width="673" height="222"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 5 — The three tools
&lt;/h3&gt;

&lt;p&gt;examples/server.py is a FastMCP server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;
&lt;span class="c1"&gt;#!/usr/bin/env python3
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;MiniCPM-V MCP server — vision tools for Cursor, Claude Desktop, and Hermes.

Exposes three tools over MCP:

    describe_image(path, question?) → general image understanding
    ocr_document(path) → structured text extraction
    compare_images(path_a, path_b, focus?) → side-by-side visual diff

Powered by MiniCPM-V 4.6 via Ollama (~1.6 GB, text + image, 256K context).

Run: python examples/server.py # stdio transport (for MCP hosts)
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;__future__&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.server.fastmcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastMCP&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;vision_backend&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;ImportError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# pragma: no cover
&lt;/span&gt;    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;vision_backend&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt; &lt;span class="c1"&gt;# type: ignore
&lt;/span&gt;
&lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastMCP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minicpm-vision&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;DESCRIBE_DEFAULT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Describe this image in detail. Include objects, text visible, layout, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;colors, and anything notable for a developer reviewing a screenshot.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;OCR_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Extract all readable text from this document or screenshot. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Preserve structure with markdown headings and bullet lists where appropriate. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If tables are present, format them as markdown tables.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;COMPARE_DEFAULT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Compare these two images. List similarities and differences. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Note UI changes, text changes, and layout shifts.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;expanduser&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_file&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Not a file: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_tool_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;describe_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Describe or answer questions about a single image using MiniCPM-V 4.6.

    Args:
        path: Absolute or relative path to a PNG, JPG, WEBP, or GIF file.
        question: Optional specific question about the image. Leave empty for
            a general description.

    Returns JSON with the model&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s answer and metadata.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;DESCRIBE_DEFAULT&lt;/span&gt;
        &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_vision&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_tool_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;describe_image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VisionError&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ocr_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;OCR a document, receipt, whiteboard photo, or screenshot to markdown text.

    Args:
        path: Absolute or relative path to the image file.

    Returns JSON with extracted text in markdown format.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_vision&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OCR_PROMPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_tool_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ocr_document&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VisionError&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compare_images&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path_a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path_b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;focus&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Compare two images and report visual differences.

    Args:
        path_a: Path to the first image (e.g. before screenshot).
        path_b: Path to the second image (e.g. after screenshot).
        focus: Optional aspect to focus on (e.g. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;navigation bar&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).

    Returns JSON with a structured comparison.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path_a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;_resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path_b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;COMPARE_DEFAULT&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;focus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Focus especially on: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;focus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_vision&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_tool_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compare_images&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;path_a&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;path_b&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VisionError&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@mcp.resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minicpm-vision://model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;model_info&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Capability hint: which vision model and host this server uses.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;health_check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ollama_host&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OLLAMA_HOST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;describe_image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ocr_document&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compare_images&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; __main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;mcp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stdio&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  describe_image
&lt;/h4&gt;

&lt;p&gt;General-purpose image Q&amp;amp;A. Pass a custom question for targeted queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sample input&lt;/strong&gt;  — architecture diagram the demo describes:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgj86rzf3byqn29xwkdvb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgj86rzf3byqn29xwkdvb.png" width="640" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  ocr_document
&lt;/h4&gt;

&lt;p&gt;Structured OCR prompt — markdown headings, bullet lists, tables. Ideal for receipts, invoices, and whiteboard photos.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sample input&lt;/strong&gt;  — coffee shop receipt:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Frhdexjpfmt8g9ot4vdh8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Frhdexjpfmt8g9ot4vdh8.png" width="480" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  compare_images
&lt;/h4&gt;

&lt;p&gt;Two paths + optional focus (e.g. "navigation bar"). Returns similarities, differences, and UI change notes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sample inputs&lt;/strong&gt;  — before and after pipeline:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fhd7giz8gbe2vgsb7x8uc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fhd7giz8gbe2vgsb7x8uc.png" width="640" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fugma8zs4bag9t5cqe2ki.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fugma8zs4bag9t5cqe2ki.png" width="640" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each tool returns &lt;strong&gt;JSON&lt;/strong&gt; with result, tool, paths, and model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 6 — Agent demo (terminal walkthrough)
&lt;/h3&gt;

&lt;p&gt;examples/agent_demo.py runs all three scenarios:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="n"&gt;examples&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;generate_fixtures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;
&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="n"&gt;examples&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;agent_demo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;

&lt;span class="n"&gt;agent_demo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;

&lt;span class="c1"&gt;#!/usr/bin/env python3
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;End-to-end demo — MiniCPM-V MCP vision tools (works offline with OLLAMA_MOCK=1).

Simulates what Cursor / Claude Desktop sees when the agent calls vision tools.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;__future__&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="n"&gt;ROOT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;__file__&lt;/span&gt; &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt;
&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ROOT&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;vision_backend&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt; &lt;span class="c1"&gt;# noqa: E402
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;compare_images&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;describe_image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ocr_document&lt;/span&gt; &lt;span class="c1"&gt;# noqa: E402
&lt;/span&gt;
&lt;span class="n"&gt;FIXTURES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ROOT&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fixtures&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_banner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setdefault&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OLLAMA_VISION_MODEL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minicpm-v4.6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;FIXTURES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FIXTURES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;glob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;generate_fixtures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;gen&lt;/span&gt; &lt;span class="c1"&gt;# noqa: E402
&lt;/span&gt;
        &lt;span class="nf"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;health_check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;_banner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MiniCPM-V 4.6 Vision MCP — Agent Demo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VISION_MODEL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; · Mode: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;?&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MOCK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;⚠️ Ollama offline — re-run with OLLAMA_MOCK=1 or start Ollama.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Scenario 1 — describe screenshot
&lt;/span&gt;    &lt;span class="nf"&gt;_banner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Scenario 1 — describe_image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[Tool: describe_image] path=fixtures/diagram_v2.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;r1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;describe_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FIXTURES&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;diagram_v2.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What services are shown?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;## Architecture summary&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

    &lt;span class="c1"&gt;# Scenario 2 — OCR receipt
&lt;/span&gt;    &lt;span class="nf"&gt;_banner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Scenario 2 — ocr_document&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[Tool: ocr_document] path=fixtures/sample_receipt.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;r2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;ocr_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FIXTURES&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sample_receipt.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;## Receipt OCR&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

    &lt;span class="c1"&gt;# Scenario 3 — compare before/after
&lt;/span&gt;    &lt;span class="nf"&gt;_banner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Scenario 3 — compare_images&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[Tool: compare_images] v1 → v2 pipeline diagrams&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;r3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;compare_images&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FIXTURES&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;diagram_v1.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FIXTURES&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;diagram_v2.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;focus&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new components and labels&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;## Visual diff&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r3&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

    &lt;span class="nf"&gt;_banner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Done — wire examples/server.py into Cursor MCP settings&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;See examples/cursor_mcp.json.example&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; __main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fag1dpa02yq3kwb09ta58.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fag1dpa02yq3kwb09ta58.gif" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The terminal shows the same flow your MCP host runs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Tool: describe_image] path=fixtures/diagram_v2.png
[Tool: ocr_document] path=fixtures/sample_receipt.png
[Tool: compare_images] v1 → v2 pipeline diagrams
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Offline smoke test (no Ollama):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OLLAMA_MOCK&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 python examples/agent_demo.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Part 7 — Wire into Cursor
&lt;/h3&gt;

&lt;p&gt;Copy examples/cursor_mcp.json.example into Cursor → Settings → MCP. Use &lt;strong&gt;absolute paths&lt;/strong&gt; for cwd.&lt;/p&gt;

&lt;p&gt;Restart Cursor — you should see describe_image, ocr_document, compare_images.&lt;/p&gt;

&lt;p&gt;Try: &lt;em&gt;“Use ocr_document on /path/to/receipt.png and summarize the total.”&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"minicpm-vision"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"python"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"examples/server.py"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"cwd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/absolute/path/to/guides/minicpm-v-mcp-server"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"OLLAMA_VISION_MODEL"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"minicpm-v4.6"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"OLLAMA_HOST"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://127.0.0.1:11434"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Part 8 — Wire into Claude Desktop
&lt;/h3&gt;

&lt;p&gt;Add the server block from examples/claude_desktop_config.json.example to ~/Library/Application Support/Claude/claude_desktop_config.json on macOS.&lt;/p&gt;

&lt;p&gt;Restart Claude Desktop. Vision tools appear alongside your other MCP servers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"minicpm-vision"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"python"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/absolute/path/to/guides/minicpm-v-mcp-server/examples/server.py"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"OLLAMA_VISION_MODEL"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"minicpm-v4.6"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"OLLAMA_HOST"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://127.0.0.1:11434"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Give your agent eyes — without giving away your pixels.&lt;/p&gt;

&lt;p&gt;Most coding agents are brilliant at text and terrible at images. The usual fix is a cloud vision API: API keys in config, latency on every screenshot, and your receipts, UI mocks, and whiteboard photos leaving your machine.&lt;/p&gt;

&lt;p&gt;MiniCPM-V 4.6 flips that. At 1.3B parameters and ~1.6 GB on Ollama, it runs comfortably on a 16 GB Mac and handles text + image input with a 256K context window. Wrap it in a small MCP server, and you get three reusable tools — describe_image, ocr_document, and compare_images — that Cursor, Claude Desktop, and any other MCP host can discover at connect time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcpserver</category>
      <category>ollama</category>
      <category>modelcontextprotocol</category>
      <category>llm</category>
    </item>
    <item>
      <title>TechLatest AI &amp; Tech Weekly #21</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Sun, 21 Jun 2026 14:46:12 +0000</pubDate>
      <link>https://dev.to/techlatestnet/techlatest-ai-tech-weekly-21-26m6</link>
      <guid>https://dev.to/techlatestnet/techlatest-ai-tech-weekly-21-26m6</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F691nzpv1wv8vwcjjyfw3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F691nzpv1wv8vwcjjyfw3.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Welcome to this week’s edition of &lt;strong&gt;TechLatest AI &amp;amp; Tech Weekly&lt;/strong&gt;  👋&lt;/p&gt;

&lt;p&gt;Here’s a curated roundup of our latest blogs, notable product launches, and the most interesting AI &amp;amp; ML updates from June 15–June 21, 2026.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI/ML News Roundup: June 15–June 21, 2026
&lt;/h3&gt;

&lt;p&gt;Key highlights from this week’s AI developments include frontier model advancements with agentic capabilities, massive funding rounds reshaping valuations, and practical product launches for developers and enterprises. These updates emphasize autonomous agents, infrastructure scaling, and open-weight benchmarks relevant to builders and researchers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Open-Source AI, AI Agents &amp;amp; Developer Releases
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Vercel Releases Eve
&lt;/h4&gt;

&lt;p&gt;Vercel introduced &lt;strong&gt;Eve&lt;/strong&gt; , an AI cloud engineer designed to help developers deploy, debug, monitor, and manage applications directly from natural language. Eve can inspect infrastructure, perform operational tasks, and automate common DevOps workflows, bringing agentic capabilities closer to production environments. &lt;a href="https://vercel.com/eve" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  NVIDIA Introduces SpatialClaw
&lt;/h4&gt;

&lt;p&gt;NVIDIA researchers released &lt;strong&gt;SpatialClaw&lt;/strong&gt; , a training-free AI agent that treats executable code as its action interface for solving complex spatial reasoning tasks. Instead of relying on specialized training, the agent dynamically generates and executes code to reason about objects, layouts, and spatial relationships. &lt;a href="https://ai-tldr.dev/releases/nvidia-spatialclaw/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Hermes Agent Gets Blank Slate Mode
&lt;/h4&gt;

&lt;p&gt;Nous Research updated &lt;strong&gt;Hermes Agent&lt;/strong&gt; with a new &lt;strong&gt;Blank Slate Mode&lt;/strong&gt; , allowing developers to precisely control available tools through platform_toolsets disabled_toolsets configurations. This creates highly controlled agent environments with predictable capabilities. &lt;a href="https://x.com/NousResearch/status/2068405008685539514" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Cisco Introduces FAPO
&lt;/h4&gt;

&lt;p&gt;Cisco AI unveiled &lt;strong&gt;FAPO (Failure-Aware Prompt Optimization)&lt;/strong&gt;, a framework that identifies failures at individual pipeline steps and automatically optimizes prompts using Claude Code orchestration. &lt;a href="https://franklineh.com/news/ZBg8MJvaOGNPTdbaanmd" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  VibeThinker-3B
&lt;/h4&gt;

&lt;p&gt;Researchers introduced &lt;strong&gt;VibeThinker-3B&lt;/strong&gt; , a compact reasoning model built on Qwen2.5-Coder-3B using the Spectrum-to-Signal post-training pipeline. Despite its smaller size, the model demonstrates strong reasoning and coding capabilities. &lt;a href="https://arxiv.org/abs/2606.16140" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Flash-KMeans Achieves 200× Speedup
&lt;/h4&gt;

&lt;p&gt;Researchers introduced &lt;strong&gt;Flash-KMeans&lt;/strong&gt; , an I/O-aware implementation of exact K-Means clustering that reportedly runs more than &lt;strong&gt;200× faster than FAISS&lt;/strong&gt; on GPUs by reducing memory bottlenecks and optimizing data movement. &lt;a href="https://github.com/svg-project/flash-kmeans" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Yandex Open Sources YAFF
&lt;/h4&gt;

&lt;p&gt;Yandex released &lt;strong&gt;YAFF (Yet Another Fast Format)&lt;/strong&gt;, a zero-copy wire format for Protocol Buffers that delivers near-struct-read performance while maintaining Protobuf compatibility. &lt;a href="https://markets.businessinsider.com/news/currencies/yandex-open-sources-yaff-a-technology-that-reduce-server-cpu-usage-by-up-to-20-1036257868" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Perplexity Launches Brain
&lt;/h4&gt;

&lt;p&gt;Perplexity introduced &lt;strong&gt;Brain&lt;/strong&gt; , a new AI-powered workspace designed to help users organize research, conversations, documents, and web knowledge into a persistent intelligence layer. Brain combines search, memory, and knowledge management, allowing users to build a continuously evolving repository of information. &lt;a href="https://www.perplexity.ai/hub/blog/self-improving-memory-for-agents" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Hermes Agent Adds Asynchronous Subagents
&lt;/h4&gt;

&lt;p&gt;Nous Research upgraded &lt;strong&gt;Hermes Agent&lt;/strong&gt; with asynchronous subagents that can work independently in the background while the main conversation continues. Delegated tasks no longer block the parent agent, enabling more scalable multi-agent workflows. &lt;a href="https://byteiota.com/hermes-agent-adds-async-subagents-fan-out-work-without-blocking/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Hugging Face &amp;amp; Open-Source Ecosystem
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Clem Delangue argued publicly on June 15, 2026&lt;/strong&gt; that AI’s future is a choice between a closed, Silicon Valley-led path and an open-source path where broader participation is possible, reinforcing Hugging Face’s strategic identity as a champion of open AI ecosystems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ideogram’s open-weight v4 image model&lt;/strong&gt; was highlighted on Hugging Face earlier in June and remains explorable through a multimodal demo Space on the platform, as part of ongoing open-weight image model distribution.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quantized Gemma 4 checkpoints&lt;/strong&gt; released by Philipp Schmid were made available on Hugging Face in mid-June, underscoring the platform’s role as the default distribution point for optimized open models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Diffusion-GEMMA’s faster text-generation model weights&lt;/strong&gt; were open-sourced on Hugging Face under &lt;strong&gt;Apache 2.0&lt;/strong&gt; and shared by Sundar Pichai, continuing the trend of open-sourcing high-performance model variants.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Frontier Model Advancements &amp;amp; Agentic Capabilities
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Sonnet 4 and Claude Opus 4 officially retired on June 15, 2026&lt;/strong&gt; , stopping all API requests; Anthropic had flagged this date well in advance, directing production teams to migrate to &lt;strong&gt;Claude Sonnet 4.6&lt;/strong&gt; or &lt;strong&gt;Claude Opus 4.8&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Sonnet 4.8 is expected to launch in the June 16–18 window&lt;/strong&gt; , roughly three weeks after Opus 4.8’s May 28 release, with expected additions including &lt;strong&gt;Dynamic Workflows&lt;/strong&gt; and refined effort/thinking-budget controls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPT-5.6 was spotted running inside ChatGPT Pro&lt;/strong&gt; in mid-June 2026, with developers reporting noticeably faster and more capable responses than GPT-5.5 Pro; OpenAI’s Chief Scientist previewed GPT-5.6 as a “meaningful improvement” with a &lt;strong&gt;late-June 2026 launch&lt;/strong&gt; expected.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GLM-5.2 from China’s Zhipu AI beat GPT-5.5 outright on the FrontierSWE benchmark&lt;/strong&gt; , which measures AI agents on multi-hour, open-ended engineering projects, and trails Claude Fable 5 by just one point, effectively co-leading frontier AI on coding benchmarks while Fable 5 is offline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic’s Fable 5 and Mythos 5 remained offline for over a week&lt;/strong&gt; under a &lt;strong&gt;US government export ban&lt;/strong&gt; issued June 12, with 100+ cybersecurity leaders calling it an overreach and demanding reversal.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Massive Funding Rounds Reshaping Valuations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Salesforce acquired Fin&lt;/strong&gt; , an AI customer service platform, for &lt;strong&gt;$3.6 billion&lt;/strong&gt; this week, fitting CEO Marc Benioff’s strategy of positioning Salesforce as the AI-native layer for enterprise customer operations and competing directly with Anthropic’s Claude for Work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic signed more than 12 US data center leases&lt;/strong&gt; exceeding &lt;strong&gt;1 gigawatt&lt;/strong&gt; of computing capacity while preparing for its IPO, with Google reportedly in discussions to provide additional financial backing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic’s revenue run-rate hit approximately $47 billion in May 2026&lt;/strong&gt; , up roughly 5x year-over-year from about $10 billion, supporting its &lt;strong&gt;$965 billion post-money valuation&lt;/strong&gt; after a $65 billion Series H round confirmed by Bloomberg on May 29.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Practical Product Launches for Developers &amp;amp; Enterprises
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ChatGPT Dreaming V3 Memory rollout widened&lt;/strong&gt; this week toward Free and Go tier users after beginning to reach ChatGPT Plus and Pro users in the US on June 4; OpenAI described it as its most significant memory upgrade since the original rollout, letting the model retain and apply context across sessions more reliably.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apple’s Gemini-powered Siri and Claude Extension rolled out in iOS 27 Beta 1&lt;/strong&gt; , following Tim Cook’s final WWDC keynote on June 8, making Claude available as an iPhone assistant option for the first time, with Apple Intelligence features requiring iPhone 15 Pro or newer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Microsoft continued rolling out its MAI model family&lt;/strong&gt; inside Azure AI Foundry, including &lt;strong&gt;MAI-Thinking-1&lt;/strong&gt; (unveiled by Mustafa Suleyman as Microsoft AI’s flagship reasoning model at Build 2026), and finalized an &lt;strong&gt;11,000-model Azure AI Foundry catalog&lt;/strong&gt; that now includes Claude Opus 4.8.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NVIDIA Cosmos 3 adoption grew in healthcare simulation&lt;/strong&gt; , being increasingly used to generate synthetic training videos of rare medical scenarios for surgical robots, addressing data scarcity problems impossible to solve with real-world footage alone.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Governance, Ethics &amp;amp; Regulation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The EU AI Act Enforcement Countdown hit 50 days&lt;/strong&gt; , with the bulk of the Act beginning to apply on &lt;strong&gt;August 2, 2026&lt;/strong&gt; ; fines for the most serious violations can reach &lt;strong&gt;35 million euros or 7% of global annual turnover&lt;/strong&gt; , with 15 million euros or 3% for most other breaches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Colorado AI Act’s compliance window narrowed&lt;/strong&gt; this week, with Colorado’s AI Act taking effect this year as one of the first US state-level AI laws with real enforcement, making it more consequential in practice than several higher-profile federal proposals.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Massachusetts legislators advanced several AI-related bills&lt;/strong&gt; this week, including &lt;strong&gt;SB 760&lt;/strong&gt; (a kids’ chatbot safety bill), &lt;strong&gt;H 76&lt;/strong&gt; (addressing AI-generated deceptive election communications), and &lt;strong&gt;H 4616&lt;/strong&gt; (covering AI use in healthcare prior authorizations).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Louisiana became the 22nd U.S. state to enact a comprehensive consumer data privacy law&lt;/strong&gt; , further expanding the compliance surface area for AI deployments processing personal data across the United States.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Economist’s June 20, 2026 cover story&lt;/strong&gt; “America’s AI Power Grab” framed the Fable 5 and Mythos 5 export ban as a geopolitical assertion, treating frontier AI models similarly to weapons systems subject to export controls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dario Amodei met with Trump administration officials&lt;/strong&gt; this week to negotiate a path to restoring access to Fable 5 and Mythos 5, but meetings did not produce a resolution, with no timeline for restoration confirmed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Over 100 cybersecurity leaders signed an open letter&lt;/strong&gt; demanding the US government reverse its decision to ban Anthropic’s Fable 5 and Mythos 5 models, arguing the ban is disproportionate and technically inaccurate.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Infrastructure &amp;amp; Hardware
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic signed more than 12 US data center leases&lt;/strong&gt; exceeding &lt;strong&gt;1 gigawatt&lt;/strong&gt; of computing capacity, roughly equivalent to what a mid-sized country needs for its entire national electricity grid, to meet projected exponentially growing demand for Claude models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic is already paying SpaceX $1.25 billion per month&lt;/strong&gt; for access to over &lt;strong&gt;220,000 Nvidia processors&lt;/strong&gt; at the Colossus 1 facility in Memphis, with the contract running through &lt;strong&gt;May 2029&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NVIDIA Cosmos 3’s simulation capabilities&lt;/strong&gt; are increasingly being used to generate synthetic training videos of rare medical scenarios for surgical robots, combined with NVIDIA’s Vera Rubin platform and Intel’s Xeon 6+ featuring Confidential Computing at rack scale.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orion-100B reports continued circulating this week&lt;/strong&gt; about a 100-billion-parameter model reportedly trained for just &lt;strong&gt;$1.25 per hour&lt;/strong&gt; of compute, a figure being held up as a new benchmark for training cost efficiency, though treated with skepticism until independent verification.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Blogs We Published This Week
&lt;/h3&gt;

&lt;h4&gt;
  
  
  When to Fine-Tune an LLM and When Prompting Is Enough
&lt;/h4&gt;

&lt;p&gt;A practical guide to understanding when prompt engineering is sufficient and when fine-tuning becomes necessary. Learn the trade-offs between cost, performance, customization, and maintenance when building AI applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/techlatestnet/when-to-fine-tune-an-llm-and-when-prompting-is-enough-32nc"&gt;When to Fine-Tune an LLM (And When Prompting Is Enough)&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Loop Engineering Explained Visually: From Manual Prompts to Goal-Driven AI Agents
&lt;/h4&gt;

&lt;p&gt;Explore the evolution from traditional prompting to autonomous AI systems through Loop Engineering. This visual guide explains how modern agents continuously plan, execute, evaluate, and improve toward a goal.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/techlatestnet/loop-engineering-explained-visually-from-manual-prompts-to-goal-driven-ai-agents-111c"&gt;Loop Engineering Explained Visually: From Manual Prompts to Goal-Driven AI Agents&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Harness Engineering — Full Visual Guide
&lt;/h4&gt;

&lt;p&gt;Learn how Harness Engineering helps developers systematically evaluate, test, and improve AI systems using structured feedback loops, benchmarks, evaluation pipelines, and continuous optimization techniques.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/techlatestnet/harness-engineering-full-visual-guide-254d"&gt;Harness Engineering — Full Visual Guide&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  AI Agents Masterclass — Full Visual Guide
&lt;/h4&gt;

&lt;p&gt;A comprehensive visual walkthrough of AI agents, covering agent architectures, memory systems, tools, planning, multi-agent workflows, and real-world deployment patterns.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/techlatestnet/ai-agents-masterclass-full-visual-guide-1dhi"&gt;AI Agents Masterclass — Full Visual Guide&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Model Context Protocol (MCP) — Full Visual Guide
&lt;/h4&gt;

&lt;p&gt;Understand how the Model Context Protocol (MCP) enables AI applications to securely connect with tools, databases, APIs, repositories, and external systems through a standardized interface.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/techlatestnet/model-context-protocol-mcp-full-visual-guide-398"&gt;Model Context Protocol (MCP) — Full Visual Guide&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  TL;DR — TechLatest AI &amp;amp; Tech Weekly #21
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Perplexity launched Brain, turning AI search into a persistent knowledge and memory workspace.&lt;/li&gt;
&lt;li&gt;Vercel introduced Eve, an AI cloud engineer capable of managing deployments, debugging infrastructure, and automating DevOps workflows.&lt;/li&gt;
&lt;li&gt;NVIDIA unveiled SpatialClaw, a training-free agent that uses executable code as the interface for spatial reasoning.&lt;/li&gt;
&lt;li&gt;Hermes Agent received two major upgrades: asynchronous subagents for parallel task execution and Blank Slate Mode for controlled enterprise deployments.&lt;/li&gt;
&lt;li&gt;Cisco launched FAPO, a framework that identifies failures within AI pipelines and automatically optimizes prompts.&lt;/li&gt;
&lt;li&gt;VibeThinker-3B demonstrated how compact reasoning models can achieve strong performance with efficient post-training techniques.&lt;/li&gt;
&lt;li&gt;Flash-KMeans achieved over 200× faster exact clustering on GPUs, while Yandex open-sourced YAFF for high-performance data serialization.&lt;/li&gt;
&lt;li&gt;GPT-5.6 surfaced in ChatGPT Pro testing, GLM-5.2 surpassed GPT-5.5 on FrontierSWE, and Claude Sonnet 4.8 is expected to arrive soon.&lt;/li&gt;
&lt;li&gt;Anthropic’s Fable 5 and Mythos 5 remained unavailable under U.S. export restrictions, triggering industry-wide debate and backlash.&lt;/li&gt;
&lt;li&gt;Salesforce acquired Fin for $3.6 billion, while Anthropic’s valuation approached $1 trillion following explosive revenue growth.&lt;/li&gt;
&lt;li&gt;Open-source AI momentum continued with new Gemma 4 checkpoints, Diffusion-GEMMA releases, and Hugging Face reinforcing its commitment to open ecosystems.&lt;/li&gt;
&lt;li&gt;AI regulation accelerated globally as the EU AI Act countdown reached 50 days, Colorado’s AI Act approached enforcement, and multiple U.S. states advanced new AI legislation.&lt;/li&gt;
&lt;li&gt;We published five new visual guides covering AI Agents, MCP, Loop Engineering, Harness Engineering, and when to fine-tune LLMs versus relying on prompting.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>technologynews</category>
      <category>newsandupdates</category>
      <category>weeklynews</category>
      <category>weeklynewsletter</category>
    </item>
    <item>
      <title>Model Context Protocol (MCP) — Full Visual Guide</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Fri, 19 Jun 2026 06:57:17 +0000</pubDate>
      <link>https://dev.to/techlatestnet/model-context-protocol-mcp-full-visual-guide-398</link>
      <guid>https://dev.to/techlatestnet/model-context-protocol-mcp-full-visual-guide-398</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Flmugqp43ghhevu8z7eql.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Flmugqp43ghhevu8z7eql.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Connect AI hosts — Cursor, Claude Desktop, VS Code Copilot — to &lt;strong&gt;your&lt;/strong&gt; databases, repos, APIs, and local files through a &lt;strong&gt;standard wire protocol&lt;/strong&gt; , not one-off integrations per app.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwegj1d6f4o6eru5fuh3z.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwegj1d6f4o6eru5fuh3z.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What you’ll understand at the end
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Why MCP is described as &lt;strong&gt;USB-C for AI apps&lt;/strong&gt;  — and what that actually means in code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Host, Client, Server&lt;/strong&gt; roles and how one host fans out to many servers&lt;/li&gt;
&lt;li&gt;The three server primitives: &lt;strong&gt;Tools, Resources, Prompts&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capability exchange&lt;/strong&gt; at connect time — and why it beats brittle REST contracts for agents&lt;/li&gt;
&lt;li&gt;How to &lt;strong&gt;scaffold a server&lt;/strong&gt; , wire &lt;strong&gt;Cursor / Claude Desktop&lt;/strong&gt; , and inspect tools&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;App MCP&lt;/strong&gt;  — servers that return UI instructions, not just markdown&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MCP hub — standardized connections&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fahetsijo5grw1iwszp4c.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fahetsijo5grw1iwszp4c.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1 — The USB-C mental model
&lt;/h3&gt;

&lt;p&gt;At the center sits your &lt;strong&gt;AI application&lt;/strong&gt; (the host). Around it: databases, GitHub, email, local filesystem, Slack, public web APIs. Each connection runs over the &lt;strong&gt;same protocol label: MCP&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The host doesn’t embed GitHub logic or Postgres drivers. It embeds an &lt;strong&gt;MCP client&lt;/strong&gt; that talks to &lt;strong&gt;MCP servers&lt;/strong&gt;  — one per domain (filesystem server, GitHub server, your custom weather server).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwripr7m3vased2rem68r.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwripr7m3vased2rem68r.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 2 — Host, Client, Server
&lt;/h3&gt;

&lt;p&gt;Three roles — don’t collapse them:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgco02172r2r0ozks5epg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgco02172r2r0ozks5epg.png" width="635" height="245"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One host → &lt;strong&gt;many clients&lt;/strong&gt; → &lt;strong&gt;many servers&lt;/strong&gt;. The model sees a unified tool surface; each server stays isolated and deployable on its own.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwnejyd80fehgq4uff1lz.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwnejyd80fehgq4uff1lz.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 3 — What servers expose
&lt;/h3&gt;

&lt;p&gt;Every MCP server advertises up to three capability families:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tools&lt;/strong&gt;  — actions the model can call (fetch weather, run query, create issue).&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Resources&lt;/strong&gt;  — readable content (file URIs, schema docs, config snapshots).&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Prompts&lt;/strong&gt;  — reusable prompt templates the host can surface to users.&lt;/p&gt;

&lt;p&gt;Server capabilities — Tools · Resources · Prompts&lt;/p&gt;

&lt;p&gt;Think of tools as &lt;strong&gt;verbs&lt;/strong&gt; , resources as &lt;strong&gt;nouns&lt;/strong&gt; , prompts as &lt;strong&gt;saved playbooks&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Flkw67rk69xiuuob18wn7.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Flkw67rk69xiuuob18wn7.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Retrieval systems are a common MCP use case. Platforms such as &lt;strong&gt;Instant RAGFlow&lt;/strong&gt; expose knowledge repositories and document collections that agents can access dynamically through tools and resources rather than embedding all information directly into prompts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/ragflow_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/ragflow_support/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Many MCP-powered retrieval workflows depend on vector databases for semantic search. &lt;strong&gt;Chroma Vector Database&lt;/strong&gt; provides a lightweight memory layer that can be exposed through MCP servers, allowing agents to retrieve relevant context from embeddings on demand.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/chroma-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/chroma-support/&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 4 — Transport layer
&lt;/h3&gt;

&lt;p&gt;Client and server exchange JSON-RPC messages over a &lt;strong&gt;transport&lt;/strong&gt; :&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;stdio&lt;/strong&gt;  — host spawns server as subprocess; stdin/stdout carry messages. Default for local dev (npx, python server.py).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSE / HTTP&lt;/strong&gt;  — server runs remotely; client connects over the network.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The capability model is identical; only the pipe changes. Most tutorials and desktop hosts start with &lt;strong&gt;stdio&lt;/strong&gt; because it’s one config block and no open port.&lt;/p&gt;

&lt;p&gt;Transport — stdio local vs SSE remote&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzartoahty5wih2an628g.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzartoahty5wih2an628g.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 5 — Capability exchange (the handshake)
&lt;/h3&gt;

&lt;p&gt;Before any tool runs, client and server perform a &lt;strong&gt;capability exchange&lt;/strong&gt; :&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client sends &lt;strong&gt;initialize&lt;/strong&gt;  — protocol version, client info&lt;/li&gt;
&lt;li&gt;Server responds with &lt;strong&gt;supported tools, resources, prompts,&lt;/strong&gt; and JSON schemas for parameters&lt;/li&gt;
&lt;li&gt;Client sends &lt;strong&gt;initialized&lt;/strong&gt; notification — channel is ready&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Handshake — initialize → capabilities → ready&lt;/p&gt;

&lt;p&gt;Example: a weather server might advertise get_forecast(location, date) with types. The host's model receives that schema in context and knows exactly which arguments to fill — no hard-coded OpenAPI doc in the host repo.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fk6n7slowxy5h3j13gje2.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fk6n7slowxy5h3j13gje2.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 6 — Traditional API: rigid contracts
&lt;/h3&gt;

&lt;p&gt;Classic REST: the client must know the contract &lt;strong&gt;ahead of time&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /forecast?location=NYC&amp;amp;date=2025-03-15
→ 200 + JSON body
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Integrators bake location and date into their code. Works until you ship a &lt;strong&gt;breaking change&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Traditional API — fixed parameters&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fi6u7vrqdbcwpxfyiga2q.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fi6u7vrqdbcwpxfyiga2q.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 7 — When APIs break clients
&lt;/h3&gt;

&lt;p&gt;You add a &lt;strong&gt;required&lt;/strong&gt; third parameter — unit (celsius | fahrenheit). Every client that still sends only two params gets &lt;strong&gt;a 400 error&lt;/strong&gt; or incorrect defaults. Three sad integrators, three redeploys.&lt;/p&gt;

&lt;p&gt;Breaking change — all clients fail&lt;/p&gt;

&lt;p&gt;This is normal API versioning pain. For &lt;strong&gt;autonomous agents&lt;/strong&gt; that must adapt without a human editing config, it’s worse: the agent doesn’t read your changelog.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F01mnam97zbbt5d9y1867.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F01mnam97zbbt5d9y1867.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 8 — MCP: discover, don’t hardcode
&lt;/h3&gt;

&lt;p&gt;On each connection, the client &lt;strong&gt;asks&lt;/strong&gt; what the server supports. The server returns current tool schemas — including new optional fields like unit.&lt;/p&gt;

&lt;p&gt;The client (and model) adapts on the &lt;strong&gt;next session&lt;/strong&gt; without redeploying host code. No silent failure from a missing query param the model never knew about.&lt;/p&gt;

&lt;p&gt;MCP capability exchange — dynamic schema&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Caveat:&lt;/strong&gt; MCP doesn’t magically fix &lt;strong&gt;semantic&lt;/strong&gt; breaking changes (renaming a tool). It fixes &lt;strong&gt;discovery&lt;/strong&gt;  — parameters and availability are first-class at connect time, not buried in external docs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjfynbqqw0ltcl5who2t3.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjfynbqqw0ltcl5who2t3.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 9 — API vs MCP (when to use which)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fltplgpit6dqg7ljarz26.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fltplgpit6dqg7ljarz26.png" width="502" height="226"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Use &lt;strong&gt;both&lt;/strong&gt; : MCP server as the agent-facing adapter; your core service stays a normal API internally.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 10 — Build your first server
&lt;/h3&gt;

&lt;p&gt;We ship a minimal &lt;strong&gt;Python weather server&lt;/strong&gt; using the official SDK (FastMCP):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.server.fastmcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastMCP&lt;/span&gt;

&lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastMCP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;weather-demo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_forecast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;day&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;celsius&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Stub forecast — location, optional day, unit celsius|fahrenheit.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; __main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;mcp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stdio&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmksydewu3wmt5lchbrio.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmksydewu3wmt5lchbrio.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Run locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"mcp[cli]&amp;gt;=1.2"&lt;/span&gt;
python examples/minimal_weather_server.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install MCP SDK; scaffold server project&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4ngpjo4ztbvmt7x1olrt.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4ngpjo4ztbvmt7x1olrt.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 11 — Wire the host
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cursor&lt;/strong&gt;  — MCP settings JSON (Settings → MCP):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"weather-demo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"python"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"examples/minimal_weather_server.py"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"cwd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/path/to/guides/mcp-visual-guide"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Claude Desktop&lt;/strong&gt;  — claude_desktop_config.json with the same command / args pattern.&lt;/p&gt;

&lt;p&gt;Cursor MCP config&lt;/p&gt;

&lt;p&gt;Restart the host after config changes. The client spawns your server process and runs the handshake automatically.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fogzk4meelm9jlywv4aiy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fogzk4meelm9jlywv4aiy.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 12 — Inspect and debug
&lt;/h3&gt;

&lt;p&gt;Use the MCP Inspector or host UI to list tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @modelcontextprotocol/inspector python examples/minimal_weather_server.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fnmmtyne97y1d7no4gst6.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fnmmtyne97y1d7no4gst6.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You should see get_forecast with parameters location, day, unit. Invoke it with test JSON before trusting the model.&lt;/p&gt;

&lt;p&gt;List tools via inspector Test tool invocation&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwoez0tw474d0x8yyvkt1.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwoez0tw474d0x8yyvkt1.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 13 — Beyond text: App MCP (Visuals MCP pattern)
&lt;/h3&gt;

&lt;p&gt;Most servers return &lt;strong&gt;strings&lt;/strong&gt;  — markdown tables, JSON blobs, file paths. That hits limits fast:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Markdown tables don’t sort or filter&lt;/li&gt;
&lt;li&gt;Long lists burn context&lt;/li&gt;
&lt;li&gt;Images arrive as URLs, not previews&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;App MCP&lt;/strong&gt; servers return &lt;strong&gt;UI render instructions&lt;/strong&gt; plus data. The host loads a bundled React (or other) app via MCP &lt;strong&gt;resources&lt;/strong&gt; (text/html), and the model passes structured tool results into that app.&lt;/p&gt;

&lt;p&gt;Flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User: “Show EC2 instances in a sortable table”&lt;/li&gt;
&lt;li&gt;Model calls display_table tool&lt;/li&gt;
&lt;li&gt;Server returns columns + rows + resourceUri: table://display&lt;/li&gt;
&lt;li&gt;Host renders interactive grid (sort, filter, paginate)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Text MCP vs App MCPVisuals flow — prompt → tool → React app&lt;/p&gt;

&lt;p&gt;Architecture note from production App MCP servers: &lt;strong&gt;one mini-app per visual&lt;/strong&gt;  — separate Vite build, single-file HTML bundle, tool metadata pointing at resourceUri. See &lt;a href="https://harrybin.de/posts/visuals-mcp-server/" rel="noopener noreferrer"&gt;Visuals MCP&lt;/a&gt; for tables, trees, master-detail layouts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ffrhvi98zk5a722eldg0i.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ffrhvi98zk5a722eldg0i.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Install pattern (any server, not only visuals):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"visuals-mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@harrybin/visuals-mcp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;VS Code extension installs are the zero-config variant — the extension registers the server for Copilot Chat automatically.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fo07gyic18ysrl9ghv4px.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fo07gyic18ysrl9ghv4px.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Modern agent platforms increasingly rely on MCP for tool discovery and integration. &lt;strong&gt;OpenClaw&lt;/strong&gt; combines multi-agent workflows, messaging channels, and MCP-powered capabilities, enabling agents to interact with external systems while maintaining structured sessions and tool access controls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/openclaw-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/openclaw-support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 14 — Production checklist
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scope tools narrowly&lt;/strong&gt;  — least privilege; read-only DB user for read tools&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validate inputs&lt;/strong&gt;  — the model will hallucinate argument shapes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout and cancel&lt;/strong&gt;  — long-running tools need progress or abort&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version your server&lt;/strong&gt;  — bump tool descriptions when behavior changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log on the server&lt;/strong&gt;  — host logs won’t show your business errors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;stdio vs remote&lt;/strong&gt;  — stdio for local trust boundary; HTTP for shared team servers with auth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As MCP ecosystems grow, operational visibility becomes increasingly important. &lt;strong&gt;Dify AI&lt;/strong&gt; provides workflow orchestration, monitoring, evaluation, and deployment capabilities that help teams manage MCP-powered applications in production environments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/difyai_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/difyai_support/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Multi-agent systems often depend on multiple MCP servers for tools and data access. &lt;strong&gt;CrewAI Studio&lt;/strong&gt; helps coordinate agent teams and workflows while integrating with external services through standardized interfaces such as MCP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/crewai-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/crewai-support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;MCP is a &lt;strong&gt;client-server protocol&lt;/strong&gt; between AI hosts and the outside world. &lt;strong&gt;Hosts&lt;/strong&gt; run &lt;strong&gt;clients&lt;/strong&gt; ; &lt;strong&gt;servers&lt;/strong&gt; expose &lt;strong&gt;tools, resources, and prompts&lt;/strong&gt; after a &lt;strong&gt;capability handshake&lt;/strong&gt;. Compared to rigid REST contracts, MCP lets agents &lt;strong&gt;discover&lt;/strong&gt; current parameters instead of failing on undeclared fields. Start with a &lt;strong&gt;stdio server&lt;/strong&gt; in Python or TypeScript, wire &lt;strong&gt;Cursor or Claude Desktop&lt;/strong&gt; , inspect with the &lt;strong&gt;MCP Inspector&lt;/strong&gt; , then explore &lt;strong&gt;App MCP&lt;/strong&gt; when markdown isn’t enough.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcpprotocol</category>
      <category>mcpclient</category>
      <category>mcpserver</category>
      <category>modelcontextprotocol</category>
    </item>
    <item>
      <title>AI Agents Masterclass — Full Visual Guide</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Wed, 17 Jun 2026 06:40:49 +0000</pubDate>
      <link>https://dev.to/techlatestnet/ai-agents-masterclass-full-visual-guide-1dhi</link>
      <guid>https://dev.to/techlatestnet/ai-agents-masterclass-full-visual-guide-1dhi</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh8lz6gbpy17c0jvd2aj8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh8lz6gbpy17c0jvd2aj8.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Everything you need to &lt;strong&gt;understand, compare, and build&lt;/strong&gt; AI agents: definitions from Google Cloud and IBM, ReAct and ReWOO loops, multi-agent patterns, 15+ frameworks, MCP and A2A protocols, governance, Cloud Run deployment, and &lt;strong&gt;five runnable examples&lt;/strong&gt; with animated diagrams + terminal GIFs.&lt;/p&gt;

&lt;h3&gt;
  
  
  What you’ll understand at the end
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;What an &lt;strong&gt;AI agent&lt;/strong&gt; is — and how it differs from assistants, chatbots, and bots&lt;/li&gt;
&lt;li&gt;Six core capabilities: &lt;strong&gt;reasoning, acting, observing, planning, collaborating, self-refining&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7b1vaiit7w7vzwm6oaxf.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7b1vaiit7w7vzwm6oaxf.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agent &lt;strong&gt;anatomy&lt;/strong&gt; : persona, memory, tools, model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory tiers&lt;/strong&gt;  — working, episodic, semantic, procedural&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ReAct&lt;/strong&gt; and &lt;strong&gt;ReWOO&lt;/strong&gt; reasoning paradigms&lt;/li&gt;
&lt;li&gt;Five classical agent types on the &lt;strong&gt;reflex → learning&lt;/strong&gt;  ladder&lt;/li&gt;
&lt;li&gt;Three lifecycle stages: &lt;strong&gt;goal planning, tool reasoning, learning/reflection&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single vs multi-agent&lt;/strong&gt; , surface vs background deployment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic vs non-agentic&lt;/strong&gt; chatbots&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Six enterprise use-case families&lt;/strong&gt; with healthcare, finance, and emergency examples&lt;/li&gt;
&lt;li&gt;Benefits, challenges, and &lt;strong&gt;governance&lt;/strong&gt; patterns (HITL, activity logs, interruption)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;15+ frameworks&lt;/strong&gt;  — when to pick LangGraph, CrewAI, OpenAI Agents SDK, Pydantic AI, Hermes, OpenClaw, and more&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP + A2A&lt;/strong&gt; interoperability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud Run&lt;/strong&gt; production deployment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Five runnable examples&lt;/strong&gt; with terminal GIFs and smoke tests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fizm3zyfj6uilvglzclj0.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fizm3zyfj6uilvglzclj0.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Introduction — why agents now
&lt;/h3&gt;

&lt;p&gt;For years, “AI” in products meant &lt;strong&gt;one-shot generation&lt;/strong&gt; : you send a prompt, the model returns text, the transaction ends. That works for drafting emails. It fails for real work — research a market, book travel, triage tickets, reconcile accounts — because real work is &lt;strong&gt;multi-step&lt;/strong&gt; , &lt;strong&gt;tool-dependent&lt;/strong&gt; , and &lt;strong&gt;stateful&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;AI agent&lt;/strong&gt; closes that gap. Instead of a single response, the system &lt;strong&gt;pursues a goal&lt;/strong&gt; over time: it plans, calls tools, reads results, revises, and stops when the objective is met (or when a human says stop).&lt;/p&gt;

&lt;p&gt;Industry definitions converge on the same idea with different emphasis:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google Cloud&lt;/strong&gt; describes AI agents as systems that combine a foundation model with &lt;strong&gt;reasoning&lt;/strong&gt; , &lt;strong&gt;planning&lt;/strong&gt; , and &lt;strong&gt;action&lt;/strong&gt;  — using tools and external data to accomplish tasks on a user’s behalf, not just answer questions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IBM&lt;/strong&gt; frames agents as software entities that &lt;strong&gt;perceive&lt;/strong&gt; their environment, &lt;strong&gt;reason&lt;/strong&gt; about goals, and &lt;strong&gt;act&lt;/strong&gt; through tools or APIs — often with memory that persists across interactions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI’s practical guide&lt;/strong&gt; adds product reality: agents shine when workflows are &lt;strong&gt;open-ended&lt;/strong&gt; , require &lt;strong&gt;judgment&lt;/strong&gt; , and benefit from &lt;strong&gt;tool use&lt;/strong&gt;  — but they demand stronger observability and guardrails than chatbots.&lt;/p&gt;

&lt;p&gt;This masterclass synthesizes those views into one buildable mental model, then walks you through code, frameworks, and production patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1 — Agent vs assistant vs bot
&lt;/h3&gt;

&lt;p&gt;Three labels get swapped in marketing. Architecturally, they differ:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bot (classic)&lt;/strong&gt; — rule-based or intent-classifier driven. Fixed dialog trees, slot filling, no genuine planning. Example: “Track my package” → lookup by tracking number. Predictable, cheap, brittle outside trained intents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Assistant (LLM chatbot)&lt;/strong&gt; — a model in a chat UI. Strong at language, weak at persistence. Each turn is mostly stateless unless you bolt on memory. Example: “Summarize this PDF” in one shot. No tool loop unless explicitly wired.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent&lt;/strong&gt;  — an LLM (or ensemble) wrapped in a &lt;strong&gt;control loop&lt;/strong&gt; : plan → act via tools → observe results → repeat. Carries &lt;strong&gt;goal state&lt;/strong&gt; , &lt;strong&gt;memory&lt;/strong&gt; , and often &lt;strong&gt;delegation&lt;/strong&gt; to other agents. Example: “Find the best week for surfing in Greece next year” → weather DB → tide search → synthesize → recommend dates.&lt;/p&gt;

&lt;h4&gt;
  
  
  Agent vs assistant vs bot
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Rule of thumb in prose:&lt;/strong&gt; if the product only needs one model call and no side effects, use an assistant. If it must &lt;strong&gt;change the world&lt;/strong&gt; (APIs, DBs, files, tickets) over multiple steps, you are building an agent. If the flow is fully scripted with no LLM judgment, you might not need an agent at all — a workflow engine suffices.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfga0ed8cidyqeyxix02.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfga0ed8cidyqeyxix02.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 2 — Six defining capabilities
&lt;/h3&gt;

&lt;p&gt;Modern agents are not defined by a single feature but by a &lt;strong&gt;bundle&lt;/strong&gt; of behaviors:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reasoning&lt;/strong&gt;  — the model decomposes goals, handles ambiguity, and chooses among strategies. Chain-of-thought and structured planning prompts live here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Acting&lt;/strong&gt;  — execution through &lt;strong&gt;tools&lt;/strong&gt; : HTTP calls, SQL, Python, browser automation, MCP servers. Action is what separates agents from chat.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observing&lt;/strong&gt;  — after each action, the agent ingests &lt;strong&gt;tool output&lt;/strong&gt; (JSON, logs, errors) and updates its internal state. Bad observation handling is the #1 source of silent failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Planning&lt;/strong&gt;  — explicit or implicit task graphs: “first gather weather, then check tides, then compare weeks.” Plans may be static (ReWOO) or interleaved with execution (ReAct).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Collaborating&lt;/strong&gt;  — multi-agent handoffs, human approvals, or role-based crews. No single model must do everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Self-refining&lt;/strong&gt;  — reflection passes, critique steps, memory writes, skill authoring. The agent improves its approach within or across sessions (see &lt;a href="//../hermes-agent-masterclass/TUTORIAL.md"&gt;Hermes learning loop&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Agent anatomy — persona, memory, tools, model&lt;/p&gt;

&lt;p&gt;These six capabilities map directly to architecture choices later: tools need MCP or function schemas; collaboration needs handoff or crew abstractions; self-refining needs memory tiers and logging.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 3 — Anatomy: persona, memory, tools, model
&lt;/h3&gt;

&lt;p&gt;Every production agent resolves into four layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Persona&lt;/strong&gt;  — system prompt, SOUL.md, role brief. Sets tone, boundaries, and escalation rules. In enterprise agents, persona also encodes &lt;strong&gt;compliance&lt;/strong&gt; (“never disclose account numbers”).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory&lt;/strong&gt;  — what persists beyond the current context window. Short-term: chat history and scratchpad. Long-term: vector stores, markdown files, session DBs. See Part 4.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tools&lt;/strong&gt;  — typed functions the model can invoke. Each tool needs a name, description, JSON schema, and a handler. Tools should be &lt;strong&gt;narrow&lt;/strong&gt; and &lt;strong&gt;idempotent where possible&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt;  — the reasoning engine. Often one primary model plus smaller models for routing or summarization. Model choice affects cost, latency, and tool-call reliability.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Conceptual&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;agent&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;stack&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(not&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;framework-specific)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;agent&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"persona"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You are a cautious travel planner. Confirm before booking."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"memory"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"session"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"long_term"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vector://user-prefs"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"weather_db"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search_web"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"calendar_create"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;model&lt;/strong&gt; is interchangeable; &lt;strong&gt;tools and memory&lt;/strong&gt; encode your product’s real value.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fciwkv30y90m91qdn17u8.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fciwkv30y90m91qdn17u8.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 4 — Memory tiers
&lt;/h3&gt;

&lt;p&gt;Memory is not one blob. Mature agents use &lt;strong&gt;tiers&lt;/strong&gt; with different latency, capacity, and retrieval patterns:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Working memory&lt;/strong&gt;  — the current context window: system prompt, recent turns, tool results. Bounded by token limits; compress or summarize when full.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Episodic memory&lt;/strong&gt;  — past sessions and events (“last time we planned Greece, user preferred July”). Stored in SQLite, Postgres, or session logs; retrieved by recency or search.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semantic memory&lt;/strong&gt;  — facts and embeddings in a vector store. “User is vegetarian.” “API X rate-limits at 100 rpm.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Procedural memory&lt;/strong&gt;  — skills, playbooks, SOUL-adjacent instructions. Often markdown files or skill catalogs (Hermes SKILL.md, OpenAI custom instructions at scale).&lt;/p&gt;

&lt;p&gt;Memory tiers — working, episodic, semantic, procedural&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design rule:&lt;/strong&gt; inject a &lt;strong&gt;small frozen snapshot&lt;/strong&gt; at session start (persona + top facts), then let the agent &lt;strong&gt;search&lt;/strong&gt; for deeper history on demand. Dumping entire history into every turn burns context and money.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59q2cs97h5wbthwagq6e.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59q2cs97h5wbthwagq6e.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Many enterprise agents rely on retrieval systems rather than storing all knowledge directly inside the model context window. Platforms such as &lt;strong&gt;Instant RAGFlow&lt;/strong&gt; provide document ingestion, indexing, and retrieval pipelines that allow agents to access relevant information dynamically while keeping prompts lean and up to date.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/ragflow_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/ragflow_support/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Semantic memory is commonly implemented using vector databases that store embeddings and enable similarity search. &lt;strong&gt;Chroma Vector Database&lt;/strong&gt; is a popular lightweight option for agent memory systems, helping agents retrieve relevant facts, previous interactions, and domain knowledge during execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/chromadb_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/chromadb_support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 5 — ReAct: interleaved reasoning and action
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;ReAct&lt;/strong&gt; (Reason + Act) alternates &lt;strong&gt;thought&lt;/strong&gt; , &lt;strong&gt;tool call&lt;/strong&gt; , and &lt;strong&gt;observation&lt;/strong&gt; in one loop. The model decides the next step only after seeing the last observation.&lt;/p&gt;

&lt;p&gt;Typical trace:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Think:&lt;/strong&gt; “I need historical weather for Greece.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Act:&lt;/strong&gt; weather_db("Greece")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observe:&lt;/strong&gt; { "avg_sunny_days_july": 28 }&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Think:&lt;/strong&gt; “Need tide/surf conditions.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Act:&lt;/strong&gt; search_web("best surfing tide Greece")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observe:&lt;/strong&gt; snippet about high tide windows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Think:&lt;/strong&gt; “Combine signals → recommend July 12–19.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Act:&lt;/strong&gt; respond to user&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;ReAct loop — think, act, observe&lt;/p&gt;

&lt;p&gt;ReAct is flexible — the plan &lt;strong&gt;emerges&lt;/strong&gt; from execution. That helps exploratory tasks. Cost: more model turns, harder to audit upfront.&lt;/p&gt;

&lt;p&gt;Our minimal example implements this pattern (deterministic demo without an API key):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# examples/minimal_react_agent.py (excerpt)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;think_and_act&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;turn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;turn&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Think: need historical weather for Greece&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;weather_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Greece&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Act: weather_db → &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;turn&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Think: need surfing conditions (high tide)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_web&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;best surfing tide Greece&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Act: search_web → &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;turn&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Observe: combine tide + sunny patterns&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Act: recommend week of July 12–19 (demo)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;cd guides/ai-agents-masterclass
python examples/minimal_react_agent.py
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovuut7a90e0zp9nztwt8.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovuut7a90e0zp9nztwt8.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 6 — ReWOO: plan first, execute second
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;ReWOO&lt;/strong&gt; (Reasoning Without Observation in the loop) separates &lt;strong&gt;planning&lt;/strong&gt; from &lt;strong&gt;execution&lt;/strong&gt;. A planner emits a structured script of tool calls; a worker runs them; a solver synthesizes the final answer.&lt;/p&gt;

&lt;p&gt;Flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Planner&lt;/strong&gt;  — output tool call graph with placeholders&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Worker&lt;/strong&gt;  — execute all tools (possibly in parallel)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solver&lt;/strong&gt;  — read outputs, no further tool access&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;ReWOO flow — planner, worker, solver&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When ReWOO wins:&lt;/strong&gt; predictable pipelines, expensive tools, parallelizable subtasks, audit requirements (plan is reviewable before execution).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When ReAct wins:&lt;/strong&gt; ambiguous goals, errors mid-flight, need to branch on unexpected results.&lt;/p&gt;

&lt;p&gt;Many production systems &lt;strong&gt;hybridize&lt;/strong&gt; : ReWOO for the macro pipeline, ReAct inside a single step when debugging.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3810jnh5lvvjixy7qf1.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3810jnh5lvvjixy7qf1.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 7 — Five classical agent types
&lt;/h3&gt;

&lt;p&gt;Before LLMs, agent literature defined a &lt;strong&gt;ladder of sophistication&lt;/strong&gt;. Still useful for scoping:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simple reflex&lt;/strong&gt;  — if condition then action. Thermostat, basic alert bot. No memory, no search.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model-based reflex&lt;/strong&gt;  — internal state tracks the world (last sensor reading). Still no planning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Goal-based&lt;/strong&gt;  — searches action sequences to reach a goal. Classical planning / STRIPS territory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Utility-based&lt;/strong&gt;  — optimizes tradeoffs (cost vs speed vs risk). Portfolio agents, routing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Learning&lt;/strong&gt;  — updates policy from feedback. RL agents, self-refining skill loops, GEPA-style offline evolution.&lt;/p&gt;

&lt;p&gt;Agent types ladder — reflex to learning&lt;/p&gt;

&lt;p&gt;LLM agents usually sit at &lt;strong&gt;goal-based&lt;/strong&gt; with hooks toward &lt;strong&gt;learning&lt;/strong&gt; (memory writes, reflection, fine-tuning). Don’t over-build learning before basic tool reliability works.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fensv85a5fqpvfs3hifty.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fensv85a5fqpvfs3hifty.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 8 — Three lifecycle stages (surfing vacation)
&lt;/h3&gt;

&lt;p&gt;OpenAI and ServiceNow-style masterclasses often teach agents as &lt;strong&gt;three stages&lt;/strong&gt;. We use one running example: &lt;em&gt;“Best week for surfing in Greece next year.”&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Stage 1 — Goal planning
&lt;/h4&gt;

&lt;p&gt;Decompose the user goal into subtasks and success criteria.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Subtask A: historical weather / sunny weeks&lt;/li&gt;
&lt;li&gt;Subtask B: surf/tide suitability&lt;/li&gt;
&lt;li&gt;Subtask C: reconcile constraints (user budget, travel dates)&lt;/li&gt;
&lt;li&gt;Done when: ranked recommendation with confidence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Goal planning — decompose and prioritize&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User goal: "Best week for surfing in Greece next year"
Planner output:
  1. Query weather_db(Greece) for sunny weeks
  2. search_web for tide/surf windows
  3. Rank weeks; explain tradeoffs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8mzsfnfjz6c4caahm4au.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8mzsfnfjz6c4caahm4au.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Stage 2 — Tool reasoning
&lt;/h4&gt;

&lt;p&gt;Select tools, fill arguments, handle errors, retry with backoff. The model must &lt;strong&gt;not&lt;/strong&gt; invent tool names — bind to your schema.&lt;/p&gt;

&lt;p&gt;Tool reasoning — schema-bound calls&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;TOOLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_web&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;weather_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;weather_db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;# Model sees JSON schemas; handler validates before side effects
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqatdyfi8hqy2u4j8ih6b.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqatdyfi8hqy2u4j8ih6b.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Stage 3 — Learning and reflection
&lt;/h4&gt;

&lt;p&gt;After answering, optionally: log trace, write memory (“user cares about surfing”), critique weak steps, update skills. This is where agents &lt;strong&gt;compound&lt;/strong&gt; over time.&lt;/p&gt;

&lt;p&gt;Learning loop — trace to memory to skills&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Reflection: "weather_db lacked tide granularity — add surf_forecast tool next sprint"
Memory write: USER prefers July travel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzunj7f73a6jc06iky06y.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzunj7f73a6jc06iky06y.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Agent lifecycle — plan, act, learn&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsm8h097bbyk56bmo5bc.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsm8h097bbyk56bmo5bc.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 9 — Agentic vs non-agentic chatbots
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Non-agentic chatbot&lt;/strong&gt;  — single-turn or few-turn Q&amp;amp;A. Retrieval augments context, but no autonomous tool loop. Great for FAQs, doc search, copilot suggestions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agentic chatbot&lt;/strong&gt;  — same UI, but backend runs a &lt;strong&gt;control loop&lt;/strong&gt; with tools and state. User may see “Searching…” / “Calling calendar…” steps.&lt;/p&gt;

&lt;p&gt;Differences that matter in production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt;  — agents take longer; set UX expectations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt;  — multiple model + tool calls per user message&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failure modes&lt;/strong&gt;  — tool errors, infinite loops, hallucinated arguments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability&lt;/strong&gt;  — you need step traces, not just final text&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your feature is “answer from our PDF,” start non-agentic. If it is “file this ticket and follow up,” go agentic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 10 — Single vs multi-agent
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Single agent&lt;/strong&gt;  — one model, one loop, one tool namespace. Simplest to debug. Hits limits on long workflows and conflicting roles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-agent&lt;/strong&gt;  — specialized agents with handoffs or parallel crews. Examples: triage → specialist, researcher + writer, planner + executor.&lt;/p&gt;

&lt;p&gt;Single vs multi-agent topologies&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Patterns in prose:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sequential crew&lt;/strong&gt;  — A completes task, passes output to B (CrewAI default)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handoff&lt;/strong&gt;  — router agent transfers conversation to specialist (OpenAI Agents SDK)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supervisor&lt;/strong&gt;  — orchestrator assigns subtasks to workers (LangGraph, AutoGen)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debate/review&lt;/strong&gt;  — generator + critic for quality gates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Multi-agent adds coordination overhead. Start single-agent until you have clear &lt;strong&gt;role boundaries&lt;/strong&gt; and &lt;strong&gt;separate tool permissions&lt;/strong&gt; per role.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9lakxbobb5dm5jo8h1q3.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9lakxbobb5dm5jo8h1q3.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 11 — Surface vs background agents
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Surface agents&lt;/strong&gt;  — user-facing, synchronous. Chat UI, voice, copilot pane. User waits for steps; HITL approvals live here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Background agents&lt;/strong&gt;  — async jobs: cron digests, ticket sweeps, ETL monitors. Results delivered later via email, Slack, or dashboard.&lt;/p&gt;

&lt;p&gt;Surface vs background deployment&lt;/p&gt;

&lt;p&gt;Hermes &lt;strong&gt;cron&lt;/strong&gt; and OpenClaw &lt;strong&gt;heartbeats&lt;/strong&gt; are background patterns. Cloud Run &lt;strong&gt;jobs&lt;/strong&gt; or scheduled Cloud Functions fit the same slot.&lt;/p&gt;

&lt;p&gt;Design background agents with &lt;strong&gt;idempotency&lt;/strong&gt; and &lt;strong&gt;dead-letter queues&lt;/strong&gt;  — they will retry at 3 am without a human watching.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 12 — Six use-case categories
&lt;/h3&gt;

&lt;p&gt;Enterprise agents cluster into six families (plus cross-industry patterns):&lt;/p&gt;

&lt;p&gt;Six use-case categories&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Customer experience&lt;/strong&gt;  — support triage, order status, personalized recommendations. Needs CRM tools, strict PII handling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Employee productivity&lt;/strong&gt;  — draft docs, schedule meetings, summarize threads. Microsoft 365 Copilot, Google Workspace agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Software development&lt;/strong&gt;  — issue → PR agents, test generation, migration assistants. Heavy IDE + repo tool access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Data and analytics&lt;/strong&gt;  — natural language to SQL, anomaly explanation, report generation. Guard against destructive queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Security and operations&lt;/strong&gt;  — alert triage, runbook execution, patch verification. Read-only first; HITL for mutations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Industry workflows&lt;/strong&gt;  — vertical bundles (see below).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjdz9z3i00ffcj0jpijab.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjdz9z3i00ffcj0jpijab.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Healthcare
&lt;/h4&gt;

&lt;p&gt;Clinical &lt;strong&gt;documentation&lt;/strong&gt; agents draft notes from visit audio — human sign-off required. &lt;strong&gt;Prior authorization&lt;/strong&gt; agents gather payer rules and patient history. &lt;strong&gt;Scheduling&lt;/strong&gt; agents coordinate slots across systems. Regulatory constraint: agents &lt;strong&gt;assist&lt;/strong&gt; ; they do not diagnose autonomously in regulated jurisdictions.&lt;/p&gt;

&lt;h4&gt;
  
  
  Finance
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Reconciliation&lt;/strong&gt; agents match transactions across ledgers. &lt;strong&gt;Research&lt;/strong&gt; agents summarize filings and earnings calls with citations. &lt;strong&gt;Compliance&lt;/strong&gt; agents flag policy violations in communications. Audit trails and model risk management are mandatory.&lt;/p&gt;

&lt;h4&gt;
  
  
  Emergency and public safety
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Dispatch assist&lt;/strong&gt; agents summarize 911 transcripts and suggest resource allocation — always subordinate to human dispatchers. &lt;strong&gt;Disaster response&lt;/strong&gt; agents aggregate feeds and produce situational reports. Latency and failure modes can be life-critical; degrade gracefully to static playbooks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 13 — Benefits
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Automation of judgment-heavy workflows&lt;/strong&gt;  — not just repetitive clicks, but branching decisions with explanations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;24/7 operation&lt;/strong&gt;  — background agents monitor queues overnight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Composable tools&lt;/strong&gt;  — same agent core, swap MCP servers for new domains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Personalization at scale&lt;/strong&gt;  — memory tiers remember preferences without re-prompting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Faster iteration&lt;/strong&gt;  — natural language interfaces to internal APIs lower integration cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 14 — Challenges and risks
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Unpredictability&lt;/strong&gt;  — same prompt, different tool paths. Mitigate with schemas, evals, and golden traces.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost&lt;/strong&gt;  — long ReAct loops multiply token usage. Cap turns, summarize observations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security&lt;/strong&gt;  — prompt injection via tool results, over-privileged tools, SSRF from web fetch tools. Least privilege per tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compliance&lt;/strong&gt;  — GDPR, HIPAA, SOC2: log retention, data residency, human approval for sensitive actions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trust&lt;/strong&gt;  — users need visibility into what the agent did. Black-box answers erode adoption.&lt;/p&gt;

&lt;p&gt;Governance — HITL, logs, policies&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiwloioz3vi908r58g1cf.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiwloioz3vi908r58g1cf.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 15 — Best practices
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Activity logs&lt;/strong&gt;  — append-only trace of every thought, tool call, observation, and final output. Store run_id, timestamps, user ID, model version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interruption&lt;/strong&gt;  — user can cancel in-flight loops; worker checks cancel token between turns (Hermes models this explicitly).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unique IDs&lt;/strong&gt;  — correlate user session, agent run, and tool invocations across microservices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Human-in-the-loop (HITL)&lt;/strong&gt; — require approval for payments, deletes, external emails, privilege changes. Pattern: agent prepares action → human clicks approve → tool executes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool design&lt;/strong&gt;  — small surface area, explicit errors, no silent defaults on missing args.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Evals&lt;/strong&gt;  — regression suite of goals with expected tool sequences or output rubrics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Budgets&lt;/strong&gt;  — max turns, max tool calls, max cost per run.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudocode: run envelope
&lt;/span&gt;&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RunContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;max_turns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;
    &lt;span class="n"&gt;cancelled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RunContext&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cancelled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RunCancelled&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;log_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_call&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{...})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Part 16 — Protocols: MCP and A2A
&lt;/h3&gt;

&lt;p&gt;Agents rarely exist alone. Two interoperability layers matter in 2025–2026:&lt;/p&gt;

&lt;h4&gt;
  
  
  Model Context Protocol (MCP)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;MCP&lt;/strong&gt; standardizes how hosts discover and invoke &lt;strong&gt;tools, resources, and prompts&lt;/strong&gt; from external servers — “USB-C for AI tools.” Your agent (or IDE host) runs MCP clients; GitHub, Postgres, filesystem, custom APIs expose MCP servers.&lt;/p&gt;

&lt;p&gt;Deep dive: &lt;a href="//../mcp-visual-guide/TUTORIAL.md"&gt;MCP Visual Guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Protocols — MCP and A2A&lt;/p&gt;

&lt;h4&gt;
  
  
  Agent-to-Agent (A2A)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;A2A&lt;/strong&gt; (Google-led, industry collaborators) focuses on &lt;strong&gt;agent ↔ agent&lt;/strong&gt; messaging: capability cards, task delegation, status updates across vendor boundaries. Where MCP connects agents to &lt;strong&gt;tools&lt;/strong&gt; , A2A connects agents to &lt;strong&gt;each other&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Use MCP for tool sprawl; use A2A when your orchestrator and specialist run in &lt;strong&gt;different frameworks or clouds&lt;/strong&gt; and need a standard task envelope.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx1rjjpv45xf6o6hmr2p9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx1rjjpv45xf6o6hmr2p9.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 17 — Framework landscape
&lt;/h3&gt;

&lt;p&gt;No single framework wins every workload. Map &lt;strong&gt;orchestration style&lt;/strong&gt; , &lt;strong&gt;team familiarity&lt;/strong&gt; , and &lt;strong&gt;deployment target&lt;/strong&gt;  first.&lt;/p&gt;

&lt;p&gt;Frameworks map — LangGraph, CrewAI, SDKs, cloud&lt;/p&gt;

&lt;p&gt;Below: &lt;strong&gt;when to use&lt;/strong&gt; prose for each major option. All can coexist with MCP tool servers.&lt;/p&gt;

&lt;h4&gt;
  
  
  LangGraph
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt; models agents as &lt;strong&gt;state machines&lt;/strong&gt;  — nodes, edges, conditional routing, checkpointing. Best when you need &lt;strong&gt;explicit control flow&lt;/strong&gt; , cycles, human-in-the-loop interrupts, and time-travel debugging. LangChain ecosystem; steep learning curve if you only need a simple ReAct loop.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# examples/langgraph_research_agent.py — plan → research → synthesize
&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ResearchState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;research&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;synthesize&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_entry_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pick LangGraph for &lt;strong&gt;production workflows&lt;/strong&gt; with branching, retries, and persisted state.&lt;/p&gt;

&lt;h4&gt;
  
  
  CrewAI
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt; optimizes &lt;strong&gt;role-based teams&lt;/strong&gt; : researcher, writer, analyst with sequential or hierarchical process. Minimal boilerplate for multi-agent prose tasks. Less ideal for fine-grained tool graphs or hard latency SLAs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;crew&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Crew&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;research_task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write_task&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;process&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sequential&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;crew&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kickoff&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pick CrewAI for &lt;strong&gt;content pipelines&lt;/strong&gt; , research briefs, and demos where roles are obvious.&lt;/p&gt;

&lt;h4&gt;
  
  
  AutoGen (Microsoft)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;AutoGen&lt;/strong&gt; emphasizes &lt;strong&gt;conversable agents&lt;/strong&gt; and group chat patterns — good for coding assistants, multi-agent debate, and Azure/OpenAI shops. v0.4+ rearchitecture adds async and distributed agents. Choose when you want &lt;strong&gt;Microsoft stack integration&lt;/strong&gt; and flexible agent-to-agent chat.&lt;/p&gt;

&lt;h4&gt;
  
  
  OpenAI Agents SDK
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;OpenAI Agents SDK&lt;/strong&gt; (openai-agents) provides &lt;strong&gt;Agent&lt;/strong&gt; , &lt;strong&gt;Runner&lt;/strong&gt; , &lt;strong&gt;handoffs&lt;/strong&gt; , and built-in tracing. Tight integration with OpenAI models and Responses API. Handoffs are first-class for triage → specialist routing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;specialist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Specialist&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Answer technical AI agent questions.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;triage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Triage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Route technical questions.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handoffs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;specialist&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;Runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;triage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is ReAct for AI agents?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pick it for &lt;strong&gt;OpenAI-native&lt;/strong&gt; products and fast handoff prototypes.&lt;/p&gt;

&lt;h4&gt;
  
  
  Google Agent Development Kit (ADK)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Google ADK&lt;/strong&gt; targets &lt;strong&gt;Gemini&lt;/strong&gt; agents on Vertex AI and Google Cloud — tool use, sub-agents, deployment to Cloud Run. Choose when your stack is &lt;strong&gt;GCP-first,&lt;/strong&gt; and you want first-party Google tooling for evals and hosting.&lt;/p&gt;

&lt;h4&gt;
  
  
  Pydantic AI
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Pydantic AI&lt;/strong&gt; centers &lt;strong&gt;type-safe outputs&lt;/strong&gt;  — result_type=WeatherReport Validates structured responses. Excellent developer ergonomics for Python teams already using Pydantic v2.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WeatherReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;best_week&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;notes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai:gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;WeatherReport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_sync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Best surfing week in Greece?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pick Pydantic AI when &lt;strong&gt;schema correctness&lt;/strong&gt; matters more than exotic orchestration.&lt;/p&gt;

&lt;h4&gt;
  
  
  LlamaIndex Agents
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;LlamaIndex&lt;/strong&gt; began as RAG; its agent layer excels when &lt;strong&gt;retrieval&lt;/strong&gt; is the core — document Q&amp;amp;A agents, knowledge-base tools, hybrid search. Pair with LlamaParse and workflow events for ingestion-heavy apps.&lt;/p&gt;

&lt;h4&gt;
  
  
  Semantic Kernel (Microsoft)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Semantic Kernel&lt;/strong&gt; offers plugins, planners, and enterprise patterns in  &lt;strong&gt;.NET and Python&lt;/strong&gt;. Strong fit for &lt;strong&gt;Microsoft 365&lt;/strong&gt; , Azure AI, and orgs with existing SK investments.&lt;/p&gt;

&lt;h4&gt;
  
  
  Smolagents (Hugging Face)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Smolagents&lt;/strong&gt;  — lightweight, code-agent focused, Hugging Face hub models. Great for &lt;strong&gt;local/open models&lt;/strong&gt; and teaching agents without heavy deps.&lt;/p&gt;

&lt;h4&gt;
  
  
  Amazon Bedrock Agents
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Bedrock Agents&lt;/strong&gt;  — managed AWS service: action groups, knowledge bases, guardrails. Choose when you want &lt;strong&gt;AWS-managed&lt;/strong&gt; scaling and IAM-native permissions, less custom loop code.&lt;/p&gt;

&lt;h4&gt;
  
  
  Mastra
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Mastra&lt;/strong&gt;  — TypeScript-first agent framework with workflows, evals, and deployment story. Pick for &lt;strong&gt;Node/TS&lt;/strong&gt; teams building product agents alongside Next.js apps.&lt;/p&gt;

&lt;h4&gt;
  
  
  Agno (formerly Phidata)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Agno&lt;/strong&gt;  — Python toolkit for multi-agent systems with memory, knowledge, and UI. Fast prototyping for &lt;strong&gt;agent OS&lt;/strong&gt; style apps.&lt;/p&gt;

&lt;h4&gt;
  
  
  ServiceNow AI Agents
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;ServiceNow&lt;/strong&gt; embeds agents in &lt;strong&gt;ITSM, HR, CSM&lt;/strong&gt; workflows — Now Assist, flow designer integration, enterprise guardrails. Choose when the workflow already lives in ServiceNow; extend via Now Platform skills and data classes.&lt;/p&gt;

&lt;h4&gt;
  
  
  Hermes Agent
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Hermes&lt;/strong&gt;  — self-hosted &lt;strong&gt;learning agent&lt;/strong&gt; : SOUL.md identity, three memory tiers, self-evolving skills, Curator, optional GEPA, MCP-heavy profiles, gateway + cron. Best when you want an agent that &lt;strong&gt;improves over time&lt;/strong&gt; on your machine.&lt;/p&gt;

&lt;p&gt;Full tutorial: Hermes Agent Masterclass.&lt;/p&gt;

&lt;h4&gt;
  
  
  OpenClaw
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw&lt;/strong&gt;  — messaging-first gateway (WhatsApp, Telegram, Slack), ClawHub skills, proactive heartbeats. Best when &lt;strong&gt;channels and presence&lt;/strong&gt; matter more than offline skill evolution. Compare: Hermes vs OpenClaw.&lt;/p&gt;

&lt;h4&gt;
  
  
  Framework selection (prose)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Explicit graphs, HITL, persistence&lt;/strong&gt; → LangGraph&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Role crews, content&lt;/strong&gt; → CrewAI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI handoffs&lt;/strong&gt; → OpenAI Agents SDK&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Typed Python outputs&lt;/strong&gt; → Pydantic AI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAG-heavy&lt;/strong&gt; → LlamaIndex&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GCP / Gemini&lt;/strong&gt; → Google ADK&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS managed&lt;/strong&gt; → Bedrock Agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TypeScript product&lt;/strong&gt; → Mastra&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-hosted learning agent&lt;/strong&gt; → Hermes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Messaging gateway&lt;/strong&gt; → OpenClaw&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzzpd8kdxxi8y7qpyk10e.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzzpd8kdxxi8y7qpyk10e.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Organizations deploying customer-facing agents often need more than orchestration alone. &lt;strong&gt;OpenClaw&lt;/strong&gt; provides a messaging-first architecture with support for channels such as WhatsApp, Telegram, and Slack, enabling agents to operate continuously across real-world communication platforms while maintaining isolated sessions and tool access controls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/openclaw-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/openclaw-support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 18 — Environment setup
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.11+&lt;/li&gt;
&lt;li&gt;Optional: OPENAI_API_KEY for live LLM runs&lt;/li&gt;
&lt;li&gt;Virtualenv recommended
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;cd guides/ai-agents-masterclass
python3 -m venv .venv
source .venv/bin/activate
&lt;/span&gt;&lt;span class="gp"&gt;pip install -r requirements.txt #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;optional deps per framework
&lt;span class="gp"&gt;cp .env.example .env #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;fill OPENAI_API_KEY &lt;span class="k"&gt;if &lt;/span&gt;desired
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fekkde54jj7587rkhahm5.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fekkde54jj7587rkhahm5.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 19 — Example 1: minimal ReAct agent
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; minimal_react_agent.py&lt;/p&gt;

&lt;p&gt;No framework — pure Python demonstrating Think → Act → Observe. Uses stub weather_db and search_web tools. Set AGENT_GOAL in .env.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;python examples/minimal_react_agent.py
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output: step trace ending in ✓ ReAct loop completed.&lt;/p&gt;

&lt;p&gt;Step 02 — minimal ReAct run&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Teaching point:&lt;/strong&gt; understand the loop before adopting LangGraph or CrewAI abstractions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbq9jnmi096qjtsp3boev.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbq9jnmi096qjtsp3boev.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 20 — Example 2: LangGraph research agent
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; langgraph_research_agent.py&lt;/p&gt;

&lt;p&gt;Three-node graph: &lt;strong&gt;plan → research → synthesize&lt;/strong&gt;. Writes report.md.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;pip install langgraph langchain-core
export RESEARCH_TOPIC="AI agent governance"
python examples/langgraph_research_agent.py
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 03 — LangGraph research agent&lt;/p&gt;

&lt;p&gt;Extend with conditional edges: if research finds insufficient sources, loop back to research.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42kjztbexhc5pcne622x.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42kjztbexhc5pcne622x.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 21 — Example 3: CrewAI content crew
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; crewai_content_crew.py&lt;/p&gt;

&lt;p&gt;Two agents — researcher and writer — sequential tasks. Demo mode writes stub blog_draft.md without API key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;pip install crewai
export CREW_TOPIC="Why AI agents need governance"
python examples/crewai_content_crew.py
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With OPENAI_API_KEY, runs live crew and saves markdown output.&lt;/p&gt;

&lt;p&gt;Step 04 — CrewAI content crew&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk92n53zhd5e336lk60uw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk92n53zhd5e336lk60uw.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 22 — Example 4: OpenAI Agents SDK handoffs
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; openai_agents_sdk.py&lt;/p&gt;

&lt;p&gt;Async &lt;strong&gt;triage → specialist&lt;/strong&gt; handoff via openai-agents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;pip install openai-agents
export OPENAI_API_KEY=sk-...
python examples/openai_agents_sdk.py
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 05 — OpenAI Agents SDK handoff&lt;/p&gt;

&lt;p&gt;Tracing in OpenAI dashboard shows handoff boundaries — use for debugging routing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh73jv3t7h6wrf9ni5ien.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh73jv3t7h6wrf9ni5ien.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 23 — Example 5: Pydantic AI typed agent
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; pydantic_ai_typed_agent.py&lt;/p&gt;

&lt;p&gt;Returns validated WeatherReport model — location, best_week, confidence, notes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;pip install pydantic-ai
&lt;/span&gt;&lt;span class="gp"&gt;python examples/pydantic_ai_typed_agent.py #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;demo stub without key
&lt;span class="go"&gt;export OPENAI_API_KEY=sk-...
&lt;/span&gt;&lt;span class="gp"&gt;python examples/pydantic_ai_typed_agent.py #&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;live validated run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 06 — Pydantic AI typed output&lt;/p&gt;

&lt;p&gt;Use typed agents at API boundaries — downstream code consumes Pydantic models, not raw strings.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj0ny1ihvfonydh41xroj.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj0ny1ihvfonydh41xroj.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 24 — Smoke tests
&lt;/h3&gt;

&lt;p&gt;Run the bundled pytest smoke tests (no API key required for stubs):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;pip install pytest
pytest examples/tests/test_agents_smoke.py -v
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 07 — run tests&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fniga8mblwgdd9ihjecuc.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fniga8mblwgdd9ihjecuc.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 25 — Deploy to Google Cloud Run
&lt;/h3&gt;

&lt;p&gt;Containerize your agent HTTP service or job runner. Cloud Run gives scale-to-zero, IAM, and VPC connectors for private DB access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Outline:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Dockerfile&lt;/strong&gt;  — slim Python image, install deps, expose port 8080&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Service&lt;/strong&gt;  — FastAPI or Flask wrapper around agent run() with run_id logging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secrets&lt;/strong&gt;  — Secret Manager for OPENAI_API_KEY, not env files in image&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy&lt;/strong&gt;  — gcloud run deploy agent-service --source .&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Background&lt;/strong&gt;  — Cloud Run jobs or Cloud Scheduler for cron agents&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Cloud Run deployment — container to service&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Minimal Dockerfile sketch&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; python:3.12-slim&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; requirements.txt .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--no-cache-dir&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PORT=8080&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]&lt;/span&gt;

gcloud run deploy ai-agent-demo \
  --source . \
  --region us-central1 \
  --set-secrets OPENAI_API_KEY=openai-key:latest \
  --allow-unauthenticated &lt;span class="c"&gt;# lock down in production&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Production notes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set &lt;strong&gt;request timeout&lt;/strong&gt; above worst-case agent duration or return 202 + poll&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Cloud Logging&lt;/strong&gt; for structured trace JSON&lt;/li&gt;
&lt;li&gt;Attach &lt;strong&gt;service account&lt;/strong&gt; with least privilege for GCP tools&lt;/li&gt;
&lt;li&gt;Consider &lt;strong&gt;Cloud Armor&lt;/strong&gt; if endpoint is public&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fooxr64wjftcyy521u5cw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fooxr64wjftcyy521u5cw.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 26 — Production checklist
&lt;/h3&gt;

&lt;p&gt;Before shipping any agent to users:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Identity and auth&lt;/strong&gt;  — who can invoke which tools? Map OAuth subject → tool ACL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observability&lt;/strong&gt;  — structured logs, metrics (turns, latency, tool errors), distributed tracing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Safety&lt;/strong&gt;  — input/output filters, blocked tool list, prompt injection tests on tool results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HITL&lt;/strong&gt;  — approval queue for irreversible actions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost controls&lt;/strong&gt;  — per-user budgets, model routing (small model for triage).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data&lt;/strong&gt;  — PII redaction in logs, retention policy, regional storage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reliability&lt;/strong&gt;  — idempotent tools, retries with jitter, circuit breakers on flaky APIs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Evals&lt;/strong&gt;  — golden tasks in CI; regression when prompts or tools change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Incident response&lt;/strong&gt;  — kill switch to disable tool execution globally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Documentation&lt;/strong&gt;  — runbooks for on-call when agent error rate spikes.&lt;/p&gt;

&lt;p&gt;Moving from prototypes to production often requires workflow management, monitoring, and operational controls around agent systems. &lt;strong&gt;Dify AI&lt;/strong&gt; provides a platform for building, deploying, evaluating, and monitoring AI agents and LLM applications, helping teams shorten the path from experimentation to production deployment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/difyai_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/difyai_support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 27 — Building your own agent (checklist)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Define &lt;strong&gt;one measurable goal&lt;/strong&gt; (surf week, ticket triage, report generation)&lt;/li&gt;
&lt;li&gt;List &lt;strong&gt;tools&lt;/strong&gt; with JSON schemas — prefer MCP servers for reuse&lt;/li&gt;
&lt;li&gt;Choose &lt;strong&gt;ReAct vs ReWOO&lt;/strong&gt; (or hybrid)&lt;/li&gt;
&lt;li&gt;Pick &lt;strong&gt;framework&lt;/strong&gt; from Part 17 or start with minimal loop&lt;/li&gt;
&lt;li&gt;Add &lt;strong&gt;memory tier&lt;/strong&gt; only when sessions need continuity&lt;/li&gt;
&lt;li&gt;Instrument &lt;strong&gt;run_id&lt;/strong&gt; and step logs from day one&lt;/li&gt;
&lt;li&gt;Ship &lt;strong&gt;HITL&lt;/strong&gt; before auto-executing side effects&lt;/li&gt;
&lt;li&gt;Run &lt;strong&gt;smoke tests&lt;/strong&gt; and golden evals&lt;/li&gt;
&lt;li&gt;Deploy behind API with timeouts and secrets manager&lt;/li&gt;
&lt;li&gt;Iterate from traces — most bugs are bad tool descriptions, not bad models&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Part 28 — Connecting agents to MCP
&lt;/h3&gt;

&lt;p&gt;Any framework above can call MCP tools if the host exposes them (Cursor, Claude Desktop) or you embed an MCP client in your runtime.&lt;/p&gt;

&lt;p&gt;Pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run MCP server (stdio or HTTP)&lt;/li&gt;
&lt;li&gt;Client handshake → discover tools&lt;/li&gt;
&lt;li&gt;Map MCP tool schemas to your framework’s function format&lt;/li&gt;
&lt;li&gt;Execute tool calls through MCP client&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Cross-reference: MCP Visual Guide — Part 10–12.&lt;/p&gt;

&lt;p&gt;Hermes profiles declare MCP in config.yamlLangGraph nodes can wrap MCP invocations in a dedicated tool node.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 29 — Multi-agent orchestration patterns
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Supervisor&lt;/strong&gt;  — central node assigns subtasks, collects results. LangGraph Send API, AutoGen group chat.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipeline&lt;/strong&gt;  — fixed DAG, no dynamic routing. CrewAI sequential, ReWOO workers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Handoff&lt;/strong&gt;  — conversational transfer with context pack. OpenAI Agents SDK.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Blackboard&lt;/strong&gt;  — shared state document agents read/write. Useful for research synthesis.&lt;/p&gt;

&lt;p&gt;Pick &lt;strong&gt;supervisor&lt;/strong&gt; when tasks are dynamic; &lt;strong&gt;pipeline&lt;/strong&gt; when steps are known; &lt;strong&gt;handoff&lt;/strong&gt; when user-facing role should change mid-session.&lt;/p&gt;

&lt;p&gt;As multi-agent systems grow in complexity, visual orchestration becomes increasingly valuable. &lt;strong&gt;CrewAI Studio&lt;/strong&gt; allows developers to design, coordinate, and monitor role-based agent teams without building orchestration infrastructure from scratch, making it a practical choice for research, content generation, and business workflow automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://techlatest.net/support/crewai-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/crewai-support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 30 — Observability and debugging
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Trace format&lt;/strong&gt; (store as JSON lines):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"run_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"run_abc123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"turn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tool_call"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"weather_db"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"location"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Greece"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"latency_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;142&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Debug workflow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reproduce with frozen prompt + tool stubs&lt;/li&gt;
&lt;li&gt;Diff tool schemas vs model-emitted args&lt;/li&gt;
&lt;li&gt;Check observation truncation — did you cut off the JSON the model needed?&lt;/li&gt;
&lt;li&gt;Lower temperature for routing; allow higher for creative synthesis steps&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;OpenAI Agents SDK and LangSmith offer hosted tracing; self-host with OpenTelemetry if required.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 31 — Cost and latency optimization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Route&lt;/strong&gt; trivial questions to a small model without tools&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache&lt;/strong&gt; tool results (weather, FX rates) with TTL&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallelize&lt;/strong&gt; independent tool calls (ReWOO worker stage)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Summarize&lt;/strong&gt; long observations before next turn&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cap&lt;/strong&gt; max turns and fail gracefully with partial answer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch&lt;/strong&gt; background agents off peak&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Part 32 — Security deep dive
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Tool privilege&lt;/strong&gt;  — separate read and write tools; never give shell and send_email to the same agent without HITL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt injection via tools&lt;/strong&gt;  — malicious webpage content instructs “ignore prior instructions.” Sanitize and summarize untrusted tool output.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SSRF&lt;/strong&gt;  — fetch_url tools must block metadata IPs and internal ranges.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Secrets&lt;/strong&gt;  — tools receive credentials from env/Secret Manager, not from model context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output&lt;/strong&gt;  — prevent agents from leaking system prompts or other users’ data in multi-tenant setups.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 33 — Evals and quality gates
&lt;/h3&gt;

&lt;p&gt;Build a &lt;strong&gt;golden set&lt;/strong&gt; of 20–50 tasks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# evals/surf_goal.yaml&lt;/span&gt;
&lt;span class="na"&gt;goal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Best&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;week&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;surfing&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Greece&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;next&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;year"&lt;/span&gt;
&lt;span class="na"&gt;expect_tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;weather_db"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_web"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;rubric&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Must&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cite&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;weather&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tide&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;reasoning;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;confidence&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;stated"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run in CI on prompt/tool changes. Track pass rate over time. Add adversarial cases (missing tool, API 500, empty search results).&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 34 — When not to build an agent
&lt;/h3&gt;

&lt;p&gt;Skip agents when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Workflow is &lt;strong&gt;fully deterministic&lt;/strong&gt;  — use Zapier, Temporal, Airflow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero side effects&lt;/strong&gt;  — RAG chatbot suffices&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hard real-time&lt;/strong&gt;  — sub-100ms SLAs don’t fit LLM loops&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regulatory prohibition&lt;/strong&gt; on autonomous action — keep human-only execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Agents are a &lt;strong&gt;tool&lt;/strong&gt; , not a mandate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 35 — Roadmap: from demo to product
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Week 1&lt;/strong&gt;  — minimal ReAct + one real tool + logs&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Week 2&lt;/strong&gt;  — MCP server for tool isolation + HITL on writes&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Week 3&lt;/strong&gt;  — LangGraph or SDK with checkpointing + eval suite&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Week 4&lt;/strong&gt;  — Cloud Run deploy + secrets + monitoring dashboards&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Ongoing&lt;/strong&gt;  — memory tier, multi-agent only when traces prove bottleneck&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;An &lt;strong&gt;AI agent&lt;/strong&gt; pursues goals through a &lt;strong&gt;loop&lt;/strong&gt; of reasoning, tool action, and observation — not a single chat completion. &lt;strong&gt;Persona, memory, tools, and model&lt;/strong&gt; form the anatomy; &lt;strong&gt;ReAct&lt;/strong&gt; and &lt;strong&gt;ReWOO&lt;/strong&gt; offer two orchestration strategies; &lt;strong&gt;single vs multi-agent&lt;/strong&gt; and &lt;strong&gt;surface vs background&lt;/strong&gt; deployments match different products. Enterprise value spans six use-case families; &lt;strong&gt;governance&lt;/strong&gt; (logs, HITL, unique run IDs) separates demos from production. Use &lt;strong&gt;MCP&lt;/strong&gt; for tools and &lt;strong&gt;A2A&lt;/strong&gt; for cross-agent tasks. Start with minimal_react_agent.py, graduate to &lt;strong&gt;LangGraph&lt;/strong&gt; , &lt;strong&gt;CrewAI&lt;/strong&gt; , &lt;strong&gt;OpenAI Agents SDK&lt;/strong&gt; , or &lt;strong&gt;Pydantic AI&lt;/strong&gt; as requirements sharpen, deploy on &lt;strong&gt;Cloud Run&lt;/strong&gt; with secrets and evals, and extend with Hermes or MCP when you need learning loops or standardized tool wiring.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aiagentsoftware</category>
      <category>aiagentdevelopment</category>
      <category>aiagentsreview</category>
      <category>agents</category>
    </item>
    <item>
      <title>Harness Engineering — Full Visual Guide</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Tue, 16 Jun 2026 18:59:21 +0000</pubDate>
      <link>https://dev.to/techlatestnet/harness-engineering-full-visual-guide-254d</link>
      <guid>https://dev.to/techlatestnet/harness-engineering-full-visual-guide-254d</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbdw5us6uxc9c1w2ytty3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbdw5us6uxc9c1w2ytty3.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;model is smart. The harness makes it reliable.&lt;/strong&gt; Build the environment around Claude Code, Codex, or any coding agent so multi-session work finishes with proof — not vibes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhluhalwdthztvbv2g1t.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhluhalwdthztvbv2g1t.gif" width="600" height="300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What you’ll understand
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Why the &lt;strong&gt;same model&lt;/strong&gt; fails or succeeds based on harness — not IQ&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;five subsystems&lt;/strong&gt; : instructions, state, verification, scope, lifecycle&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AGENTS.md as map&lt;/strong&gt; , not encyclopedia — progressive disclosure via docs/&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;16-step session lifecycle&lt;/strong&gt; agents should follow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Planner/generator/evaluator&lt;/strong&gt; splits for long runs&lt;/li&gt;
&lt;li&gt;Copy-ready templates to drop into your repo today&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Introduction — it’s a harness problem
&lt;/h3&gt;

&lt;p&gt;You give Claude or GPT a real task. It reads files, writes code, looks productive. Then it skips a step, breaks tests, says “done” — and nothing works. You spend more time &lt;strong&gt;rescuing&lt;/strong&gt; than if you’d coded it yourself.&lt;/p&gt;

&lt;p&gt;That’s not a model problem. It’s a &lt;strong&gt;harness problem&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Anthropic ran a controlled experiment: same model (Opus 4.5), same prompt (“build a 2D retro game editor”). &lt;strong&gt;Without harness:&lt;/strong&gt; ~$9 in 20 minutes, broken output. &lt;strong&gt;With harness&lt;/strong&gt; (planner + generator + evaluator): ~$200 in 6 hours, &lt;strong&gt;playable game&lt;/strong&gt;. The model didn’t change. The &lt;strong&gt;environment&lt;/strong&gt; did.&lt;/p&gt;

&lt;p&gt;OpenAI reported the same shift with Codex: in a well-harnessed repo, reliability moves from “unreliable” to &lt;strong&gt;production-grade&lt;/strong&gt;  — not a marginal tweak, a qualitative jump.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Harness engineering&lt;/strong&gt; = designing everything the model runs inside: instructions, state files, verification gates, scope boundaries, session lifecycle, hooks, sandboxes, observability.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent = Model + Harness
If you're not the model, you're the harness.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Harness pattern — task to verified done&lt;/p&gt;

&lt;p&gt;Modern agent platforms such as OpenClaw extend this idea by providing persistent agent sessions, structured workflows, and runtime orchestration around foundation models. In practice, the harness often determines whether an agent completes work reliably or simply generates plausible output.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/openclaw-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/openclaw-support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1 — The harness pattern
&lt;/h3&gt;

&lt;p&gt;You give a task. The agent:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reads harness files (not your Slack thread)&lt;/li&gt;
&lt;li&gt;Runs init.sh — install, health check&lt;/li&gt;
&lt;li&gt;Picks &lt;strong&gt;one&lt;/strong&gt; unfinished feature&lt;/li&gt;
&lt;li&gt;Implements with verification loop&lt;/li&gt;
&lt;li&gt;Stops only when &lt;strong&gt;tests/lint/types pass&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;strong&gt;model&lt;/strong&gt; decides what code to write.&lt;br&gt;&lt;br&gt;
The &lt;strong&gt;harness&lt;/strong&gt; governs when, where, and how — and when “done” is allowed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqjzqeih0g9pewz3mhxvd.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqjzqeih0g9pewz3mhxvd.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Part 2 — Five subsystems
&lt;/h3&gt;

&lt;p&gt;Five subsystems — instructions through lifecycle&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzinmy5r00x7dh4zzgidm.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzinmy5r00x7dh4zzgidm.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;| Subsystem | Job | Artifacts |
|-----------|-----|-----------|
| Instructions | What to do, in what order, what to read first | &lt;span class="sb"&gt;`AGENTS.md`&lt;/span&gt;, &lt;span class="sb"&gt;`CLAUDE.md`&lt;/span&gt;, &lt;span class="sb"&gt;`docs/`&lt;/span&gt; |
| State | What's done, in progress, next | &lt;span class="sb"&gt;`feature_list.json`&lt;/span&gt;, &lt;span class="sb"&gt;`claude-progress.md`&lt;/span&gt;, git log |
| Verification | Proof before victory | tests, lint, typecheck, smoke, e2e |
| Scope | One feature at a time; real definition of done | feature list as machine-readable boundary |
| Lifecycle | Clean start and handoff | &lt;span class="sb"&gt;`init.sh`&lt;/span&gt;, wrap-up checklist, safe commit |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The harness doesn’t make the model smarter. It makes output &lt;strong&gt;reliable&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 3 — Without harness vs with harness
&lt;/h3&gt;

&lt;p&gt;Without vs with harness — two session story&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without:&lt;/strong&gt; Session 2 has no memory. Agent re-does work or wanders. You merge broken code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With:&lt;/strong&gt; Session 2 reads claude-progress.md, continues feature F03, verifies before claiming done. &lt;strong&gt;You review, not rescue.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6n26ehbr87qk7rox2hdw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6n26ehbr87qk7rox2hdw.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 4 — AGENTS.md: map, not encyclopedia
&lt;/h3&gt;

&lt;p&gt;The “one giant AGENTS.md” approach fails predictably:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Context is scarce — a 1,000-line manual crowds out the task&lt;/li&gt;
&lt;li&gt;Everything “important” means nothing is&lt;/li&gt;
&lt;li&gt;It rots — agents can’t tell what’s still true&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; ~100 lines AGENTS.md as &lt;strong&gt;table of contents&lt;/strong&gt;. Deep truth lives in structured docs/ — design docs, architecture, exec plans, quality grades. Agent starts small, &lt;strong&gt;reads on demand&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;OpenAI’s Codex team treats docs/ as a &lt;strong&gt;system of record&lt;/strong&gt; ; linters and doc-gardening agents keep it fresh.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6n26ehbr87qk7rox2hdw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6n26ehbr87qk7rox2hdw.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 5 — Session lifecycle (16 steps)
&lt;/h3&gt;

&lt;p&gt;Session lifecycle flow&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start:&lt;/strong&gt; Read harness → init.sh → progress log → feature list → git log&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Select:&lt;/strong&gt; Pick exactly &lt;strong&gt;one&lt;/strong&gt; unfinished feature&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Execute:&lt;/strong&gt; Implement → verify → fix loop until green → record evidence&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrap:&lt;/strong&gt; Update progress + feature list → note broken/unverified → commit when safe to resume&lt;/p&gt;

&lt;p&gt;Without harness, step “verify” becomes “agent says it looks fine.” With harness, it’s &lt;strong&gt;tests pass, lint clean, types check&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh10ortk0foqw5pj1ta2t.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh10ortk0foqw5pj1ta2t.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 6 — Scope and feature lists
&lt;/h3&gt;

&lt;p&gt;feature_list.json is a &lt;strong&gt;harness primitive&lt;/strong&gt;  — machine-readable scope the agent can't hand-wave away.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frz5iypmd8yghhdl3g7e9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frz5iypmd8yghhdl3g7e9.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One passes: false feature active at a time&lt;/li&gt;
&lt;li&gt;No rewriting the list to hide unfinished work&lt;/li&gt;
&lt;li&gt;passes: true Only with evidence (test name, date, log snippet)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See feature_list.json.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"app"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"knowledge-base-desktop"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"features"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Import local markdown files"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"passes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"evidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tests/import.test.ts — 2026-06-01"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F02"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Document library list view"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"passes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"evidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"e2e/library.spec.ts"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F03"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Index documents for search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"passes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"notes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"in progress — indexer stub only"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F04"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Grounded Q&amp;amp;A with citations"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"passes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"notes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"blocked on F03"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"next"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F03"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Part 7 — Verification and early victory
&lt;/h3&gt;

&lt;p&gt;Agents declare victory too early because &lt;strong&gt;confidence ≠ correctness&lt;/strong&gt;. Fixes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Runnable proof required (not “I think it works”)&lt;/li&gt;
&lt;li&gt;Full pipeline runs — unit + lint + typecheck + smoke&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Separate evaluator agent&lt;/strong&gt;  — generation ≠ grading (Anthropic harness pattern)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Planner · generator · evaluator&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qyqukioghmp2b1hsehq.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qyqukioghmp2b1hsehq.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 8 — Hooks and the ratchet
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hooks&lt;/strong&gt; enforce what prompts merely suggest: pre-commit typecheck, block rm -rf, grep for .skip(, require approval before push.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ratchet rule:&lt;/strong&gt; every agent mistake becomes a &lt;strong&gt;permanent constraint&lt;/strong&gt; :&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agent commented out a test → AGENTS.md rule + hook&lt;/li&gt;
&lt;li&gt;Agent ignored architecture layer → custom linter&lt;/li&gt;
&lt;li&gt;Stale docs → doc-gardening agent opens fix PR&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Harness is shaped by &lt;strong&gt;your failure history&lt;/strong&gt;  — you can’t download someone else’s.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 9 — Agent legibility
&lt;/h3&gt;

&lt;p&gt;If the agent can’t see it in-repo at runtime, &lt;strong&gt;it doesn’t exist&lt;/strong&gt;. Slack decisions, Google Docs, tribal knowledge — illegible. Versioned markdown, schemas, plans, generated DB docs — legible.&lt;/p&gt;

&lt;p&gt;Push context &lt;strong&gt;into the repo&lt;/strong&gt; over time. Boring, composable stacks often beat clever abstractions agents can’t inspect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6n26ehbr87qk7rox2hdw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6n26ehbr87qk7rox2hdw.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Agents can only reason over information they can access at runtime. Retrieval systems such as Instant RAGFlow help surface relevant documentation, knowledge bases, and project context without forcing every detail into the model’s context window.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/ragflow_support/" rel="noopener noreferrer"&gt;Techlatest.net - Instant RAGFlow: Ready-to-Use AI Knowledge Retrieval Engine&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 10 — Production patterns (Codex / Claude Code)
&lt;/h3&gt;

&lt;p&gt;Mature harnesses add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Per-worktree app boot&lt;/strong&gt;  — agent drives UI via Chrome DevTools MCP&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local observability stack&lt;/strong&gt;  — LogQL/PromQL in the loop&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layered architecture&lt;/strong&gt;  — mechanical dependency rules + structural tests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Garbage collection&lt;/strong&gt;  — golden principles + recurring refactor agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimal merge gates&lt;/strong&gt;  — high throughput; fix forward when agent volume exceeds human attention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Humans steer at &lt;strong&gt;intent and acceptance criteria&lt;/strong&gt;. Agents execute and self-review in loops.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhluhalwdthztvbv2g1t.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhluhalwdthztvbv2g1t.gif" width="600" height="300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As teams move beyond single-agent workflows, orchestration platforms such as CrewAI Studio help coordinate planners, implementers, reviewers, and specialized agents while maintaining visibility into long-running tasks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/crewai-support/" rel="noopener noreferrer"&gt;Techlatest.net - AI Agents using CrewAI Studio &amp;amp; Jupyter with GPU support&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Workflow platforms such as Dify AI provide a practical way to package harnessed agents into production applications, combining tool integrations, evaluation flows, and operational monitoring.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/difyai_support/" rel="noopener noreferrer"&gt;Techlatest.net - Dify AI: Build &amp;amp; Launch GenAI Apps&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 11 — Quick start (four files)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcnq5qxo63fk84qe84296.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcnq5qxo63fk84qe84296.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Drop into project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;├── AGENTS.md
├── init.sh
├── feature_list.json
└── claude-progress.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Drop templates into repo&lt;/p&gt;

&lt;p&gt;Copy from &lt;a href="//./examples/"&gt;examples/&lt;/a&gt;. Sessions stabilize immediately vs prompt-only.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 12 — Hands-on session
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./init.sh &lt;span class="c"&gt;# bootstrap + health&lt;/span&gt;
&lt;span class="c"&gt;# agent picks ONE feature&lt;/span&gt;
npm &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm run lint &lt;span class="c"&gt;# verification gate&lt;/span&gt;
&lt;span class="c"&gt;# update progress + feature_list&lt;/span&gt;
git commit &lt;span class="c"&gt;# clean handoff&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;init.sh session start Verification gate — fail then pass Commit handoff&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpf4xc4l6kmdf7t1oej6g.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpf4xc4l6kmdf7t1oej6g.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkkd33q4i9mh50yv0eo6q.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkkd33q4i9mh50yv0eo6q.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fitkstowq4ymww449prfc.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fitkstowq4ymww449prfc.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Many teams develop and validate harness workflows inside reproducible AI workbenches with integrated notebooks, terminals, and GPU access before deploying them into production agent environments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techlatest.net/support/jupyter_python_notebook_support/" rel="noopener noreferrer"&gt;Techlatest.net - Jupyter Python Notebook&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 13 — Capstone context (knowledge base app)
&lt;/h3&gt;

&lt;p&gt;The learn-harness-engineering course builds one &lt;strong&gt;Electron knowledge-base app&lt;/strong&gt; across six projects — import docs, index, grounded Q&amp;amp;A with citations. Each project adds harness mechanisms; the app evolves as skills grow.&lt;/p&gt;

&lt;p&gt;Same pattern works for any real repo: measured &lt;strong&gt;weak vs strong harness&lt;/strong&gt; diff, not doc count.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 14 — Learning path (12 + 6)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Lectures L01–L12:&lt;/strong&gt; capability gap → harness definition → repo as truth → progressive disclosure → multi-session state → init phase → scope → feature lists → verification → e2e → observability → clean handoff&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Projects P01–P06:&lt;/strong&gt; prompt-only vs rules-first → agent-readable workspace → continuity → runtime feedback → self-verification → full capstone&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 15 — Who this is for
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Yes:&lt;/strong&gt; engineers using coding agents daily; tech leads owning agent reliability; builders who’ll let agents edit real repos&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No:&lt;/strong&gt; zero-code AI intro; prompt-only hobbyists; teams unwilling to add harness files to git&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Requires:&lt;/strong&gt; terminal, git, at least one of Claude Code / Codex / comparable agent CLI&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Harness engineering&lt;/strong&gt; is the discipline of making agents finish real work: map-not-encyclopedia instructions, disk-persisted state, verification before “done”, one-feature scope, structured session lifecycle, hooks that ratchet on every failure. The model gets the headlines. The harness gets the  &lt;strong&gt;merge&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>harnessengineering</category>
      <category>aiengineering</category>
      <category>agentharness</category>
      <category>llm</category>
    </item>
    <item>
      <title>Loop Engineering Explained Visually: From Manual Prompts to Goal-Driven AI Agents</title>
      <dc:creator>TechLatest</dc:creator>
      <pubDate>Tue, 16 Jun 2026 06:31:23 +0000</pubDate>
      <link>https://dev.to/techlatestnet/loop-engineering-explained-visually-from-manual-prompts-to-goal-driven-ai-agents-111c</link>
      <guid>https://dev.to/techlatestnet/loop-engineering-explained-visually-from-manual-prompts-to-goal-driven-ai-agents-111c</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s7dv7geeyg4kbmhknms.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s7dv7geeyg4kbmhknms.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Design AI systems that &lt;strong&gt;act, observe, and repeat&lt;/strong&gt; until a goal is met — not one-shot prompts with you as the checkpoint between every step.&lt;/p&gt;

&lt;h3&gt;
  
  
  What you’ll understand at the end
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Why &lt;strong&gt;manual prompt-review cycles&lt;/strong&gt; hit a ceiling before model quality does&lt;/li&gt;
&lt;li&gt;What a &lt;strong&gt;single-agent loop&lt;/strong&gt; is — and when you need a  &lt;strong&gt;fleet&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open vs closed loops&lt;/strong&gt;  — exploration vs production budgets&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;five parts&lt;/strong&gt; of a well-engineered loop (goal, tools, context, termination, errors)&lt;/li&gt;
&lt;li&gt;Common &lt;strong&gt;patterns&lt;/strong&gt; : retry, plan-and-execute-verify, explore-and-narrow, human-in-the-loop&lt;/li&gt;
&lt;li&gt;How frameworks (LangGraph, Swarm, Hermes, OpenClaw) map to loop infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Introduction — you were the loop
&lt;/h3&gt;

&lt;p&gt;For years the default workflow was identical whether you were drafting email or refactoring a repo:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open chat&lt;/li&gt;
&lt;li&gt;Type a request&lt;/li&gt;
&lt;li&gt;Review output&lt;/li&gt;
&lt;li&gt;Type the next request&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;You&lt;/strong&gt; were the revision cycle. That made sense when models were unreliable — a human gate at every step stopped errors from compounding.&lt;/p&gt;

&lt;p&gt;Models improved. The workflow didn’t. &lt;strong&gt;Loop engineering&lt;/strong&gt; automates the checkpoint: you define the goal and the pass/fail standard; the agent runs &lt;strong&gt;research → produce → evaluate → fix → repeat&lt;/strong&gt; until the bar clears or a stop rule fires.&lt;/p&gt;

&lt;p&gt;This is the architecture behind serious coding agents (Claude Code, Codex-style agents, Hermes ReAct runtime) and production agentic workflows.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcz9ewhi3r2sn8t88yu6j.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcz9ewhi3r2sn8t88yu6j.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1 — The one-task problem
&lt;/h3&gt;

&lt;p&gt;Every time you prompt for the next micro-step, you decide things the agent should decide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Where to look in the codebase&lt;/li&gt;
&lt;li&gt;Whether the draft is good enough&lt;/li&gt;
&lt;li&gt;What still needs work&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s hiring a writer and approving every paragraph. You get output — but you’re &lt;strong&gt;running the operation&lt;/strong&gt; , not delegating it.&lt;/p&gt;

&lt;p&gt;The fix isn’t necessarily a bigger model. It’s &lt;strong&gt;rewiring the control flow&lt;/strong&gt; from linear chat to a &lt;strong&gt;goal-driven loop&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Single agent loop — produce, check, fix, repeat&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkiv2em862azz87bkaw9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkiv2em862azz87bkaw9.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 2 — What a loop actually is
&lt;/h3&gt;

&lt;p&gt;A loop is a repeating cycle:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Act&lt;/strong&gt;  — tool call, code write, search, shell command&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observe&lt;/strong&gt;  — stdout, test results, linter, API response&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reason&lt;/strong&gt;  — what failed, what to try next&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repeat&lt;/strong&gt; until &lt;strong&gt;termination&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This traces to &lt;strong&gt;ReAct&lt;/strong&gt; (Reason + Act): interleave thinking with environment feedback instead of guessing once and stopping.&lt;/p&gt;

&lt;p&gt;ReAct cycle — reason → act → observe&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Analogy:&lt;/strong&gt; A writer revising their own manuscript — draft, read with fresh eyes, mark weak sections, fix, read again —  &lt;strong&gt;without&lt;/strong&gt; asking the editor after every sentence. You hand over the &lt;strong&gt;revision cycle&lt;/strong&gt; , not just the first draft.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4ka497i8d8qoetohzhr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4ka497i8d8qoetohzhr.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 3 — What makes or breaks the loop
&lt;/h3&gt;

&lt;p&gt;Almost none of the engineering is “pick a smarter model.” Two design choices dominate:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Evaluation gate&lt;/strong&gt;  — What counts as passing? Vague (“looks good”) → infinite loops or arbitrary stops. Concrete (“all pytest green + ruff clean”) → auditable exits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stopping condition&lt;/strong&gt;  — Success, max iterations, no-progress streak, escalation to human.&lt;/p&gt;

&lt;p&gt;Eval gate — pass exits loop, fail retries or halts&lt;/p&gt;

&lt;p&gt;See eval-gate.yaml for a harness template.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Eval gate config — copy to your agent harness&lt;/span&gt;

&lt;span class="na"&gt;goal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;All&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pytest&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tests&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tests/&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pass;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ruff&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;check&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;src/&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;is&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;clean"&lt;/span&gt;

&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;metric&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest_exit_code&lt;/span&gt;
    &lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;metric&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ruff_violations&lt;/span&gt;
    &lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;

&lt;span class="na"&gt;failure&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;max_iterations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="na"&gt;no_progress_streak&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="c1"&gt;# same error 3x → stop and escalate&lt;/span&gt;

&lt;span class="na"&gt;escalation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;on_failure&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;human_review&lt;/span&gt;
  &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;iteration_log&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;last_patch&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;stack_trace&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;summarize_every&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="c1"&gt;# compress loop history every N iters&lt;/span&gt;
  &lt;span class="na"&gt;keep_last_errors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4ka497i8d8qoetohzhr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4ka497i8d8qoetohzhr.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 4 — When one agent isn’t enough
&lt;/h3&gt;

&lt;p&gt;A single looping agent handles &lt;strong&gt;bounded&lt;/strong&gt; tasks well. Real projects mix cognitive modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Research vs planning vs execution vs review&lt;/li&gt;
&lt;li&gt;Long context → &lt;strong&gt;lost-in-the-middle&lt;/strong&gt;  — front and back of window get more attention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Forcing one agent to be researcher, planner, implementer, and reviewer is like asking your best writer to fact-check every claim, copy-edit, and run the press.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fleet looping:&lt;/strong&gt; an &lt;strong&gt;orchestrator&lt;/strong&gt; owns the goal, decomposes work, assigns &lt;strong&gt;specialists&lt;/strong&gt; , each running their own sub-loop. Subagents handle narrow slices. &lt;strong&gt;Eval gates at every layer&lt;/strong&gt; stop bad work from propagating.&lt;/p&gt;

&lt;p&gt;Fleet tree — orchestrator → specialists → subagents&lt;/p&gt;

&lt;p&gt;&lt;a href="https://medium.com/@techlatest.net/hermes-agent-masterclass-full-tutorial-9f682bb28789" rel="noopener noreferrer"&gt;Hermes masterclass&lt;/a&gt; (ReAct + 90-turn cap) · &lt;a href="https://medium.com/@techlatest.net/openclaw-agent-masterclass-66d6a4f88cd5" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt; (gateway + multi-agent sessions).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjb8zt2285wnnbrtcimz9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjb8zt2285wnnbrtcimz9.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Multi-agent systems require orchestration, session management, and reliable communication between specialized agents. Platforms such as OpenClaw provide a channel-first architecture for managing agent sessions, tool access, and long-running autonomous workflows.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/openclaw-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/openclaw-support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 5 — Open loops vs closed loops
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Open looping&lt;/strong&gt;  — wide operational space, vague path, room to explore. Can discover solutions you didn’t spec. On a research budget, exciting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Costs:&lt;/strong&gt; reasoning chains that go nowhere, context bloat, compounding API bills. Loose requirements → &lt;strong&gt;slop at scale&lt;/strong&gt;  — output that looks finished but misses the bar.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Closed looping&lt;/strong&gt;  — human architect defines path &lt;strong&gt;before&lt;/strong&gt; execution: clear goal, defined steps, eval gate per step, explicit stop. Agents still loop —  &lt;strong&gt;inside your frame&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Open vs closed loops — explore wide vs gated path&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure contrast:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Open loop fails&lt;/strong&gt; → keeps going, burns tokens, plausibly wrong output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Closed loop fails&lt;/strong&gt; → &lt;strong&gt;stops at gate&lt;/strong&gt; , trace shows where, fix eval and rerun&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Production default: &lt;strong&gt;closed first&lt;/strong&gt;. Expand operational space once the gated loop works.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnc632ffwku77gmveqig.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnc632ffwku77gmveqig.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 6 — Five parts of a well-engineered loop
&lt;/h3&gt;

&lt;p&gt;Five parts — goal, tools, context, termination, errors&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Clear goal&lt;/strong&gt;  — Specific enough to evaluate. “All unit tests pass” not “make the app better.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Tool set&lt;/strong&gt;  — Loop quality = ability to &lt;strong&gt;touch reality&lt;/strong&gt; : run code, read/write files, shell, tests, search docs. No tools → guessing loop.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Context management&lt;/strong&gt;  — Each iteration adds tokens. Summarize history, log attempts, prune noise before the next turn.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Termination logic&lt;/strong&gt;  — Success conditions, failure exits (max iters, repeated same error), escalation paths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Error handling&lt;/strong&gt;  — Recoverable vs hard blockers; &lt;strong&gt;change strategy&lt;/strong&gt; after repeated failure — not identical retries.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmk5xb9hmb62ojpxhh6ew.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmk5xb9hmb62ojpxhh6ew.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Many production agents rely on retrieval systems rather than storing all knowledge in model weights. RAG platforms such as Instant RAGFlow allow loops to fetch relevant information dynamically during execution.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/ragflow_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/ragflow_support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 7 — Common loop patterns
&lt;/h3&gt;

&lt;p&gt;Loop patterns — retry, plan-verify, explore-narrow, HITL&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retry loop&lt;/strong&gt;  — Try → check pass/fail → retry. Best for atomic tasks with clear criteria (one function + one test).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plan-execute-verify&lt;/strong&gt;  — Plan steps, execute one, verify before next. Refactors, multi-file features. Must &lt;strong&gt;revise plan&lt;/strong&gt; when step 2 invalidates step 5.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explore-narrow&lt;/strong&gt;  — Try multiple approaches, score intermediates, commit to best path. Debugging unknown errors. Watch context explosion — prune early.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Human-in-the-loop&lt;/strong&gt;  — Pause on ambiguity or high-risk action; resume after approval. Production deploys, irreversible ops. Too many interrupts → you’re the loop again.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4bf2nafojv2kv1etlry.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4bf2nafojv2kv1etlry.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 8 — Frameworks and what they solve
&lt;/h3&gt;

&lt;p&gt;Building loops from scratch is tedious. Frameworks differ in &lt;strong&gt;state, failure recovery, and debugging&lt;/strong&gt;  — not just syntax.&lt;/p&gt;

&lt;p&gt;Framework loop infra — checkpoint, handoff, MCP, gateway&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt;  — Loop as &lt;strong&gt;stateful graph&lt;/strong&gt; ; checkpoint after each node; resume mid-crash without losing context. Long-running fleets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI Swarm&lt;/strong&gt;  —  &lt;strong&gt;Stateless handoffs&lt;/strong&gt; ; full context passed each hop explicitly. Clean debugging, assembly-line workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Microsoft Agent Framework&lt;/strong&gt;  — Async message passing; parallel branches; separate harness vs production loops with human review gates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anthropic / MCP&lt;/strong&gt;  — Standard tool discovery; orchestrator attaches capabilities without per-integration glue; interrupt before dangerous ops.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt;  — Synchronous ReAct core, skill learning, gateway + cron for proactive loops. See &lt;a href="https://medium.com/@techlatest.net/hermes-agent-masterclass-full-tutorial-9f682bb28789" rel="noopener noreferrer"&gt;masterclass&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw&lt;/strong&gt;  — Channel-first gateway, isolated agent sessions, skills + heartbeat. See &lt;a href="https://medium.com/@techlatest.net/openclaw-agent-masterclass-66d6a4f88cd5" rel="noopener noreferrer"&gt;masterclass&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Pick by &lt;strong&gt;failure modes your team can tolerate&lt;/strong&gt; , not benchmark hype.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy7zpxldvyu9qkdbhx2i7.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy7zpxldvyu9qkdbhx2i7.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Teams moving from prototypes to production often use workflow platforms such as Dify AI to deploy agent pipelines, integrate tools, and monitor execution across real-world applications.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/difyai_support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/difyai_support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 9 — Context and token hygiene
&lt;/h3&gt;

&lt;p&gt;Each iteration appends: patches, stack traces, decisions. Unbounded history → token limits and &lt;strong&gt;forgotten early attempts&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Practices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structured feedback&lt;/strong&gt;  — relevant code snippet + intent + “same error as iter 3?” flag&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rolling summary&lt;/strong&gt;  — “Fix A failed (TypeError), Fix B partial, tests fail line 47”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool call budgets&lt;/strong&gt;  — max calls per iteration; budget exhaustion = failure signal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Summarize every N iterations&lt;/strong&gt;  — compress log, keep last K errors&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Part 10 — Hands-on: minimal closed loop
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;minimal_closed_loop&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;

&lt;span class="c1"&gt;#!/usr/bin/env python3
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Minimal closed-loop coding agent — act, observe, retry until tests pass.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;__future__&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt;

&lt;span class="n"&gt;MAX_ITER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;

&lt;span class="n"&gt;GOAL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;All unit tests pass with zero lint errors.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_tests&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Replace with pytest/subprocess in real projects.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
    &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="c1"&gt;# demo: flaky until loop converges
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FAILED: test_addition expected 4 got 3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OK: 12 passed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;agent_step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;One LLM turn: propose a fix given feedback.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;last_error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;# iter &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: patch based on → &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;last_error&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;# iter &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: initial implementation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_ITER&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;patch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;agent_step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;feedback&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;run_tests&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; eval: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;feedback&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✓ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;GOAL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (stopped at iteration &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;
        &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;feedback&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✗ Escalate to human — no progress in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MAX_ITER&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; iterations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; __main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Closed loop demo — act, eval, retry until pass&lt;/p&gt;

&lt;p&gt;The script loops: propose patch → run eval → exit on success or escalate after MAX_ITER.&lt;/p&gt;

&lt;p&gt;Wire real run_tests() to pytest; replace agent_step() with your LLM + tool calls.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftaum92ord5ad0jj0pz2g.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftaum92ord5ad0jj0pz2g.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 11 — Hands-on: eval gate config
&lt;/h3&gt;

&lt;p&gt;Copy eval-gate.yaml into your harness:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Eval gate config — copy to your agent harness&lt;/span&gt;

&lt;span class="na"&gt;goal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;All&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pytest&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tests&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tests/&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pass;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ruff&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;check&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;src/&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;is&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;clean"&lt;/span&gt;

&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;metric&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest_exit_code&lt;/span&gt;
    &lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;metric&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ruff_violations&lt;/span&gt;
    &lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;

&lt;span class="na"&gt;failure&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;max_iterations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="na"&gt;no_progress_streak&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="c1"&gt;# same error 3x → stop and escalate&lt;/span&gt;

&lt;span class="na"&gt;escalation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;on_failure&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;human_review&lt;/span&gt;
  &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;iteration_log&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;last_patch&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;stack_trace&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;summarize_every&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="c1"&gt;# compress loop history every N iters&lt;/span&gt;
  &lt;span class="na"&gt;keep_last_errors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;success&lt;/strong&gt;  — measurable metrics (exit codes, counts)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;failure&lt;/strong&gt;  — max iterations + no-progress streak&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;escalation&lt;/strong&gt;  — human review payload&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;context&lt;/strong&gt;  — summarize cadence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Eval gate terminal — metrics and stop rules&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fje8v9iekuqcmncepxcsy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fje8v9iekuqcmncepxcsy.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 12 — Multi-agent loop sketch
&lt;/h3&gt;

&lt;p&gt;Orchestrator pseudoflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;goal → decompose → for each subtask:
         assign specialist → specialist loops until sub-eval passes
       → integrator merges → global eval → done or rework branch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Multi-agent delegation terminal&lt;/p&gt;

&lt;p&gt;Start &lt;strong&gt;single closed loop&lt;/strong&gt; first. Add fleet when you hit context ceiling or role confusion.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F32eqa59ezt42frvdtt3b.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F32eqa59ezt42frvdtt3b.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As agent fleets grow, visual orchestration becomes increasingly valuable. CrewAI Studio enables developers to design, coordinate, and monitor multi-agent workflows without building orchestration infrastructure from scratch.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://techlatest.net/support/crewai-support/" rel="noopener noreferrer"&gt;https://techlatest.net/support/crewai-support/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 13 — Where to start
&lt;/h3&gt;

&lt;p&gt;Build a loop when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Same work type repeats, and quality should  &lt;strong&gt;compound&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Success is &lt;strong&gt;verifiable&lt;/strong&gt; , not vibes&lt;/li&gt;
&lt;li&gt;You spend time driving steps the agent could navigate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Don’t&lt;/strong&gt; loop everything — one-shot summarization doesn’t need ten iterations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Starter recipe:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write termination condition on paper&lt;/li&gt;
&lt;li&gt;Wire one eval gate (tests or schema validator)&lt;/li&gt;
&lt;li&gt;Single agent, max 8–10 iterations&lt;/li&gt;
&lt;li&gt;Log every iter; summarize history&lt;/li&gt;
&lt;li&gt;Test &lt;strong&gt;failure cases&lt;/strong&gt; before happy path&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Install/scaffold loop harness&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F40v4r3nnsd7booetx3pi.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F40v4r3nnsd7booetx3pi.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 14 — Failure modes checklist
&lt;/h3&gt;

&lt;p&gt;Failure modes — runaway open vs halted closed&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No exit condition&lt;/strong&gt;  — runs forever or stops randomly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Same error, same fix&lt;/strong&gt;  — spinning, not learning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context overflow&lt;/strong&gt;  — model forgets task&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vague goal&lt;/strong&gt;  — can’t detect done&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No tools&lt;/strong&gt;  — pure hallucination loop&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open loop + loose spec&lt;/strong&gt;  — expensive slop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Test deliberately: ambiguous goals, broken tools, unsolvable tasks (verify exit works).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzp9yyyifqa069bcgyhoj.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzp9yyyifqa069bcgyhoj.gif" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 15 — Loop engineering vs agentic AI
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Agentic AI&lt;/strong&gt;  — autonomous action toward goals (broad).&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Loop engineering&lt;/strong&gt;  —  &lt;strong&gt;discipline of structuring&lt;/strong&gt; those actions in feedback cycles with explicit gates.&lt;/p&gt;

&lt;p&gt;Most agentic systems are loops under the hood. Quality differences usually come from &lt;strong&gt;loop design&lt;/strong&gt; , not base model alone.&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Loop engineering&lt;/strong&gt; moves you from expensive autocomplete to &lt;strong&gt;goal-driven automation&lt;/strong&gt;. Define pass/fail gates and stop rules; let agents run the revision cycle. Start &lt;strong&gt;closed, single-agent&lt;/strong&gt; ; add fleet and openness when evals prove the frame. The model got better — your &lt;strong&gt;workflow&lt;/strong&gt; should too.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thank you so much for reading
&lt;/h3&gt;

&lt;p&gt;Like | Follow | Subscribe to the newsletter.&lt;/p&gt;

&lt;p&gt;Catch us on&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.techlatest.net/" rel="noopener noreferrer"&gt;https://www.techlatest.net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Newsletter: &lt;a href="https://substack.com/@parvezmohammed" rel="noopener noreferrer"&gt;https://substack.com/@parvezmohammed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/TechlatestNet" rel="noopener noreferrer"&gt;https://twitter.com/TechlatestNet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/techlatest-net/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/techlatest-net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;YouTube:&lt;a href="https://www.youtube.com/@techlatest_net/" rel="noopener noreferrer"&gt;https://www.youtube.com/@techlatest_net/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs: &lt;a href="https://medium.com/@techlatest.net" rel="noopener noreferrer"&gt;https://medium.com/@techlatest.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reddit Community: &lt;a href="https://www.reddit.com/user/techlatest_net/" rel="noopener noreferrer"&gt;https://www.reddit.com/user/techlatest_net/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>orchestration</category>
      <category>aiengineering</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
