<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: rokoss21</title>
    <description>The latest articles on DEV Community by rokoss21 (@rokoss21).</description>
    <link>https://dev.to/rokoss21</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2535718%2F19462206-7c85-48c9-83e7-be552633b923.jpeg</url>
      <title>DEV Community: rokoss21</title>
      <link>https://dev.to/rokoss21</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rokoss21"/>
    <language>en</language>
    <item>
      <title>IOSM CLI: AI Engineering Runtime. Not Another Chat Wrapper.</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Wed, 11 Mar 2026 09:52:54 +0000</pubDate>
      <link>https://dev.to/rokoss21/iosm-cli-ai-engineering-runtime-not-another-chat-wrapper-44km</link>
      <guid>https://dev.to/rokoss21/iosm-cli-ai-engineering-runtime-not-another-chat-wrapper-44km</guid>
      <description>&lt;p&gt;Chat agents hit a ceiling.&lt;/p&gt;

&lt;p&gt;You feel it around week three of using any of them — Claude Code, Gemini CLI, Cursor, Aider. The first sessions are impressive. Then you hit a real task: a cross-cutting refactor, a parallel migration, a codebase you've touched across a hundred sessions. And the tool starts to break at the seams. Re-explain the context. Manually merge competing edits. Hope the rollback works.&lt;/p&gt;

&lt;p&gt;Engineering requires structure. Chat doesn't have it.&lt;/p&gt;

&lt;p&gt;IOSM CLI is the runtime for that structure.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"AI without a methodology is just faster improvisation."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the sentence that drives every design decision in &lt;code&gt;iosm-cli&lt;/code&gt;. Not a tagline — an architectural constraint. A coding agent that can't measure its own outcomes, can't coordinate parallel work, and can't remember decisions across sessions is not an engineering tool. It's a search engine that writes code.&lt;/p&gt;

&lt;p&gt;AI adoption is no longer the advantage. &lt;strong&gt;Systematic AI engineering is.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🛠️ Why This Exists
&lt;/h2&gt;

&lt;p&gt;I built agent systems for production codebases for years. The pattern was always the same: impressive demos, fragile execution at scale. When the task grew beyond a single context window — spanning multiple modules, multiple agents, multiple sessions — chat-based tools collapsed into manual coordination overhead.&lt;/p&gt;

&lt;p&gt;The missing piece wasn't a better model. It was a missing runtime layer: something that enforces methodology, tracks outcomes, coordinates agents, and survives session boundaries. That's what IOSM CLI is.&lt;/p&gt;




&lt;h2&gt;
  
  
  👥 Who Is This For
&lt;/h2&gt;

&lt;p&gt;Three types of engineers use IOSM CLI, and they come for different reasons.&lt;/p&gt;

&lt;h3&gt;
  
  
  The solo developer who wants a real coding agent
&lt;/h3&gt;

&lt;p&gt;You've tried the other tools. You're tired of re-explaining your project every session. You want something that already knows your architectural decisions, your banned dependencies, your team conventions — and actually executes tasks autonomously.&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;iosm-cli&lt;/code&gt;, you run &lt;code&gt;iosm&lt;/code&gt;, type your task, and the agent works. It reads your files, runs your tests, handles rollbacks. Persistent memory means session 10 knows everything session 1 learned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Time to productive first result: under 5 minutes.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The senior engineer running complex refactors
&lt;/h3&gt;

&lt;p&gt;180K-line monolith. Extract the payment service, migrate auth to OAuth2, keep CI green throughout. One sequential agent on a 15-hour task is not a plan.&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;/orchestrate&lt;/code&gt;, you spin up parallel agents with dependency ordering, file lock guarantees, and git worktree isolation. You get a coordinated team in one command, with a consolidated result you can actually review.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The kind of work you previously couldn't safely delegate to AI.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The team lead operationalizing AI coding
&lt;/h3&gt;

&lt;p&gt;You need engineering workflows that are auditable, reproducible, safe for shared codebases. Every cycle should leave traces: what changed, why, what the metrics were before and after.&lt;/p&gt;

&lt;p&gt;With IOSM cycles, every run captures baseline metrics, hypothesis cards, and outcome deltas in &lt;code&gt;.iosm/cycles/&lt;/code&gt;. The next engineer resumes from the same artifact state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI coding as a team engineering system, not a solo productivity hack.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚡ Barrier to Entry: Minimal
&lt;/h2&gt;

&lt;p&gt;The tool is layered. You start at the bottom and unlock depth only when you need it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 1 — Three commands to a working agent
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; iosm-cli
&lt;span class="nb"&gt;cd &lt;/span&gt;your-project
iosm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/login     → guided API key setup (30 seconds)
/model     → pick provider + model
your task  → start immediately
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No YAML. No config files. No methodology training. The default &lt;code&gt;full&lt;/code&gt; profile gives you a capable coding agent with full filesystem access, shell tooling, and semantic search — ready out of the box.&lt;/p&gt;

&lt;h3&gt;
  
  
  Week 1 — Unlock depth when you need it
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Shift+Tab           → switch to iosm profile
/init               → bootstrap IOSM workspace
/iosm 0.95          → run your first structured cycle
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Entirely optional. Stay in &lt;code&gt;full&lt;/code&gt; profile forever if it works. The IOSM layer appears when you need measurable, auditable improvement cycles — not before.&lt;/p&gt;

&lt;h3&gt;
  
  
  No provider lock-in
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;      &lt;span class="c"&gt;# Claude models&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;         &lt;span class="c"&gt;# OpenAI GPT models&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;         &lt;span class="c"&gt;# Google Gemini models&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENROUTER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;     &lt;span class="c"&gt;# 100+ models via OpenRouter&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Node.js &lt;code&gt;&amp;gt;=20.6.0&lt;/code&gt; is the only hard requirement.&lt;/strong&gt; Everything else is optional.&lt;/p&gt;




&lt;h2&gt;
  
  
  🆚 Honest Positioning vs Other Tools
&lt;/h2&gt;

&lt;p&gt;This isn't a "we win every cell" table. It's a map so you can pick the right tool.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Claude Code&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Gemini CLI&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;OpenCode&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;IOSM CLI&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Provider&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude-native&lt;/td&gt;
&lt;td&gt;Gemini-native&lt;/td&gt;
&lt;td&gt;Any (IDE)&lt;/td&gt;
&lt;td&gt;Any (75+ providers)&lt;/td&gt;
&lt;td&gt;Any&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Terminal&lt;/td&gt;
&lt;td&gt;CLI&lt;/td&gt;
&lt;td&gt;IDE&lt;/td&gt;
&lt;td&gt;Terminal&lt;/td&gt;
&lt;td&gt;Terminal runtime&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IOSM methodology&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Parallel agent orchestration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Partial ¹&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Structured checkpoint / rollback&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Partial ²&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Partial ³&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Persistent cross-session memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Via CLAUDE.md&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Via Automations&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Semantic / vector code search&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agentic ⁴&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅ IDE-native&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅ terminal-native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SDK / JSON-RPC mode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Free tier&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ paid only ⁵&lt;/td&gt;
&lt;td&gt;✅ 1000 req/day&lt;/td&gt;
&lt;td&gt;✅ Hobby (limited)&lt;/td&gt;
&lt;td&gt;✅ open-source&lt;/td&gt;
&lt;td&gt;✅ open-source&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Notes (to keep this honest):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;¹ Claude Code supports parallel subagents via &lt;code&gt;/batch&lt;/code&gt; and skills, but without dependency DAGs, file locks, or worktree isolation.&lt;br&gt;&lt;br&gt;
² Claude Code introduced checkpoints for exploration in 2025, but without structured rollback to named states.&lt;br&gt;&lt;br&gt;
³ Cursor has a "Restore Checkpoint" UI button within a session, but not an explicit CLI-level &lt;code&gt;/checkpoint&lt;/code&gt; + &lt;code&gt;/rollback&lt;/code&gt; workflow.&lt;br&gt;&lt;br&gt;
⁴ Claude Code does deep agentic codebase search (reads and understands files in-context with a 200K token window) — not vector embeddings, but highly capable for many use cases.&lt;br&gt;&lt;br&gt;
⁵ Claude Code requires a paid Pro ($20/mo) or Max ($100+/mo) plan. The free claude.ai web interface handles general coding questions but is not the Claude Code CLI agent.&lt;/p&gt;

&lt;p&gt;The pattern: &lt;strong&gt;Claude Code&lt;/strong&gt; and &lt;strong&gt;Gemini CLI&lt;/strong&gt; are go-to choices for their respective native models. &lt;strong&gt;Cursor&lt;/strong&gt; excels at IDE-integrated flows. &lt;strong&gt;OpenCode&lt;/strong&gt; is the best fully open-source lightweight option. &lt;strong&gt;IOSM CLI&lt;/strong&gt; is the only terminal runtime that combines structured methodology, coordinated parallel execution with dependency ordering, and a full platform layer — across any provider.&lt;/p&gt;

&lt;p&gt;Different tools for different jobs. IOSM CLI is not a "better chat" — it is a &lt;strong&gt;different category&lt;/strong&gt;: an engineering runtime.&lt;/p&gt;


&lt;h2&gt;
  
  
  🏗️ Three Architectural Layers
&lt;/h2&gt;

&lt;p&gt;The runtime is layered. Each layer adds capabilities. You get value at any level.&lt;/p&gt;


&lt;h3&gt;
  
  
  Layer 1 — Runtime: Agents, Orchestration, Worktrees
&lt;/h3&gt;

&lt;p&gt;The base is a full coding agent with direct filesystem and shell access. Real file reads, real diffs, real test runs — no hallucinated paths.&lt;/p&gt;

&lt;p&gt;For complex work, &lt;code&gt;/orchestrate&lt;/code&gt; turns one agent into a coordinated team:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/orchestrate &lt;span class="nt"&gt;--parallel&lt;/span&gt; &lt;span class="nt"&gt;--agents&lt;/span&gt; 4 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--profiles&lt;/span&gt; iosm_analyst,explore,iosm_verifier,full &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--depends&lt;/span&gt; 3&amp;gt;1,4&amp;gt;2 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--locks&lt;/span&gt; schema,config &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--worktree&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  Refactor auth module, verify invariants, document changes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dependency DAG&lt;/strong&gt; (&lt;code&gt;--depends 3&amp;gt;1,4&amp;gt;2&lt;/code&gt;): agent 3 waits for 1, agent 4 waits for 2&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File locks&lt;/strong&gt; (&lt;code&gt;--locks schema,config&lt;/code&gt;): zero parallel write collisions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git worktrees&lt;/strong&gt; (&lt;code&gt;--worktree&lt;/code&gt;): main branch untouched until merge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is continuous dispatch — tasks launch the moment their dependencies are satisfied, not when an arbitrary wave completes. 3–5× reduction in wall-clock time for parallelizable work.&lt;/p&gt;




&lt;h3&gt;
  
  
  Layer 2 — Methodology: IOSM Cycles, Metrics, Artifacts
&lt;/h3&gt;

&lt;p&gt;IOSM is &lt;strong&gt;Improve → Optimize → Shrink → Modularize&lt;/strong&gt; — a four-phase iterative loop that turns vague "make this better" requests into measurable engineering decisions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Shift+Tab              &lt;span class="c"&gt;# switch to iosm profile&lt;/span&gt;
/init                  &lt;span class="c"&gt;# bootstrap workspace&lt;/span&gt;
/iosm 0.95 &lt;span class="nt"&gt;--max-iterations&lt;/span&gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;/init&lt;/code&gt; generates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;iosm.yaml&lt;/code&gt; — thresholds, weights, governance policies&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;IOSM.md&lt;/code&gt; — operator + agent playbook&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.iosm/cycles/&lt;/code&gt; — artifact workspace for cycle history&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every cycle run captures: baseline metrics → hypothesis cards → improve/verify/optimize iterations → outcome deltas → artifact write.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Baseline captured
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Planned cycle from team artifacts: simplify auth module
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Running improve -&amp;gt; verify -&amp;gt; optimize loop
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Result: simplicity +18%, modularity +11%, performance +6%
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Artifacts written to .iosm/cycles/2026-03-10-001/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These numbers are real and reproducible. Not vibes. Not impressions. Measurable deltas with full decision log.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Also in this layer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/memory&lt;/code&gt; — persistent project facts across sessions. Active decisions, anti-patterns, architectural constraints. The agent loads them at startup.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/contract&lt;/code&gt; — hard engineering constraints the agent enforces. "No new dependencies without approval." "Test coverage must stay above 80%."&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/semantic&lt;/code&gt; — intent-based code search. Query by meaning, not tokens. "Find all places handling token expiry" — across renamed variables and different module boundaries.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/singular&lt;/code&gt; — before implementing anything complex, run feasibility analysis across 3 variants. Choose before you build.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Layer 3 — Platform: SDK, JSON-RPC, MCP
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;iosm-cli&lt;/code&gt; is a foundation you build on, not a closed product.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SDK&lt;/strong&gt; — embed the runtime in your own tooling:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createAgent&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iosm-cli&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sonnet&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iosm&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;read&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;write&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bash&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Analyze auth module security posture&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JSON-RPC&lt;/strong&gt; — wire into CI pipelines and custom dashboards:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;iosm &lt;span class="nt"&gt;--json-rpc&lt;/span&gt; &lt;span class="nt"&gt;--port&lt;/span&gt; 3042
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Print mode&lt;/strong&gt; — pipe to other tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;iosm &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"Audit src/ for dead code"&lt;/span&gt; &lt;span class="nt"&gt;--output-format&lt;/span&gt; json | jq &lt;span class="s1"&gt;'.findings'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;MCP&lt;/strong&gt; — connect any external tool ecosystem:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/mcp    &lt;span class="c"&gt;# interactive MCP server manager&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🔄 A Full Production Workflow
&lt;/h2&gt;

&lt;p&gt;No demo. A real scenario: refactor an authentication module safely, with verification, measurable outcomes, under 3 hours.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;iosm
&lt;span class="go"&gt;IOSM CLI v0.1.3 [full]

&lt;/span&gt;&lt;span class="gp"&gt;you&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;/orchestrate &lt;span class="nt"&gt;--parallel&lt;/span&gt; &lt;span class="nt"&gt;--agents&lt;/span&gt; 4 &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="go"&gt;     --profiles iosm_analyst,explore,iosm_verifier,full \
&lt;/span&gt;&lt;span class="gp"&gt;     --depends 3&amp;gt;&lt;/span&gt;1,4&amp;gt;2 &lt;span class="nt"&gt;--locks&lt;/span&gt; schema,config &lt;span class="nt"&gt;--worktree&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="go"&gt;     Refactor auth module, verify security invariants, document changes

&lt;/span&gt;&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Team run started: &lt;span class="c"&gt;#77&lt;/span&gt;
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;agent[1] architecture map &lt;span class="nb"&gt;complete&lt;/span&gt;
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;agent[2] implementation patch &lt;span class="nb"&gt;set &lt;/span&gt;prepared
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;agent[3] verification suite and rollback checks ready
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;agent[4] integration validation passed
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Consolidated patch plan generated
&lt;span class="go"&gt;
→ Shift+Tab (switch to iosm profile)
→ /init
→ /iosm 0.95 --max-iterations 5

&lt;/span&gt;&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Baseline captured
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Planned cycle from team artifacts: simplify auth module
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Running improve -&amp;gt; verify -&amp;gt; optimize loop
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Result: simplicity +18%, modularity +11%, performance +6%
&lt;span class="gp"&gt;iosm&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Artifacts written to .iosm/cycles/2026-03-10-001/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Outcome: completed in ~2.5 hours. Measurable deltas. Full audit trail. Safe to present to the team and repeat next week.&lt;/p&gt;




&lt;h2&gt;
  
  
  📦 Install
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; iosm-cli
&lt;span class="nb"&gt;cd &lt;/span&gt;your-project
iosm

&lt;span class="c"&gt;# In session:&lt;/span&gt;
/doctor    &lt;span class="c"&gt;# verify model + auth + tools are healthy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For maximum performance on large codebases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# macOS&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;ripgrep fd ast-grep comby jq yq semgrep

&lt;span class="c"&gt;# Ubuntu/Debian&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; ripgrep fd-find jq yq &lt;span class="nb"&gt;sed&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🌐 Open Spec, Open Runtime
&lt;/h2&gt;

&lt;p&gt;The methodology is a separate, versioned specification: &lt;a href="https://github.com/rokoss21/IOSM" rel="noopener noreferrer"&gt;github.com/rokoss21/IOSM&lt;/a&gt; — formal definitions, schemas, artifact templates, quality gate validators.&lt;/p&gt;

&lt;p&gt;The spec is the contract. The CLI is one implementation. Nothing stops you from running IOSM cycles in your CI, your own orchestrator, your custom tooling. The spec is the invariant.&lt;/p&gt;




&lt;h2&gt;
  
  
  One Last Thing
&lt;/h2&gt;

&lt;p&gt;Most teams have already adopted some AI coding tool. Most have hit the ceiling: autocomplete works, quick boilerplate works, but anything requiring real coordination across sessions, files, or agents — breaks down.&lt;/p&gt;

&lt;p&gt;The next gap in engineering teams won't be "do you use AI?" Everyone will. The gap will be &lt;strong&gt;how systematically&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Stop prompting. Start executing.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/rokoss21/iosm-cli" rel="noopener noreferrer"&gt;github.com/rokoss21/iosm-cli&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;npm&lt;/strong&gt;: &lt;code&gt;npm install -g iosm-cli&lt;/code&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Docs&lt;/strong&gt;: &lt;a href="https://github.com/rokoss21/iosm-cli/tree/main/docs" rel="noopener noreferrer"&gt;github.com/rokoss21/iosm-cli/docs&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;IOSM Spec&lt;/strong&gt;: &lt;a href="https://github.com/rokoss21/IOSM" rel="noopener noreferrer"&gt;github.com/rokoss21/IOSM&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>Swarm-IOSM: Orchestrating Parallel AI Agents with Quality Gates</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Mon, 19 Jan 2026 08:46:24 +0000</pubDate>
      <link>https://dev.to/rokoss21/swarm-iosm-orchestrating-parallel-ai-agents-with-quality-gates-8fk</link>
      <guid>https://dev.to/rokoss21/swarm-iosm-orchestrating-parallel-ai-agents-with-quality-gates-8fk</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Swarm-IOSM is an orchestration engine for &lt;a href="https://claude.ai/code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; that transforms complex development tasks into coordinated parallel work streams. It implements continuous dispatch scheduling (no wave barriers), hierarchical file lock management, and enforces IOSM quality gates before merge. Real-world speedup: &lt;strong&gt;commonly 3-8x faster&lt;/strong&gt; than sequential execution.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Parallel Agent Problem
&lt;/h2&gt;

&lt;p&gt;You're working on a complex feature. It needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Codebase analysis to understand existing patterns&lt;/li&gt;
&lt;li&gt;Architecture design for the new system&lt;/li&gt;
&lt;li&gt;Implementation across 3 modules (independent)&lt;/li&gt;
&lt;li&gt;Integration tests&lt;/li&gt;
&lt;li&gt;Security audit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Traditional approach:&lt;/strong&gt; One agent does everything sequentially. &lt;strong&gt;15 hours of wall-clock time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What if&lt;/strong&gt; you could run analysis, design, and implementation in parallel? &lt;strong&gt;4-6 hours.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;But here's the catch: parallel AI agents need coordination. They can't all edit the same file. They need to share knowledge. And you need quality guarantees before merging their work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That's what Swarm-IOSM solves.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Swarm-IOSM?
&lt;/h2&gt;

&lt;p&gt;Swarm-IOSM is a &lt;a href="https://docs.anthropic.com/claude/docs/skills" rel="noopener noreferrer"&gt;Claude Code Skill&lt;/a&gt; that orchestrates parallel AI agent execution with built-in quality enforcement. It combines:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Continuous Dispatch Loop&lt;/strong&gt; — Tasks launch immediately when dependencies are met (no artificial wave barriers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File Lock Management&lt;/strong&gt; — Hierarchical conflict detection prevents parallel write chaos&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PRD-Driven Planning&lt;/strong&gt; — Structured requirements → decomposition → execution&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IOSM Quality Gates&lt;/strong&gt; — Automated code quality, performance, and modularity checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-Spawn Protocol&lt;/strong&gt; — Agents discover new work during execution&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Core Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Touches → Locks → Gates → Done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A correctness model for parallel agent work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Declare&lt;/strong&gt; what files you touch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Acquire&lt;/strong&gt; locks to prevent conflicts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pass&lt;/strong&gt; quality gates&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ship&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Key Innovation: Continuous Dispatch
&lt;/h2&gt;

&lt;p&gt;Traditional orchestration waits for entire "waves" to complete:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Wave 1: [T01, T02, T03] → Wait for ALL to finish
Wave 2: [T04, T05]      → Can't start until Wave 1 done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Swarm-IOSM uses continuous scheduling:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;T01 done → T04 starts IMMEDIATELY (even if T02, T03 still running)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This eliminates idle time and maximizes parallelism. Here's the dispatch algorithm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;gates_met&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Collect ready tasks (deps satisfied, no conflicts)
&lt;/span&gt;    &lt;span class="n"&gt;ready&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;backlog&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;deps_satisfied&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;conflicts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Classify by mode (background vs foreground)
&lt;/span&gt;    &lt;span class="n"&gt;bg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ready&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;can_auto_background&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;fg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ready&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;needs_user_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Dispatch batch (max 3-6 tasks)
&lt;/span&gt;    &lt;span class="nf"&gt;launch_parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bg&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;background&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;launch_parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fg&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;foreground&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 4. Monitor &amp;amp; spawn
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;report&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;collect_completed&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;spawn_candidates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parse_spawn_candidates&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;backlog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;deduplicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spawn_candidates&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# 5. Check gates
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;all_gates_pass&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Tasks launch as soon as they're ready, not when an arbitrary wave completes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Live Example: Adding Redis Caching
&lt;/h2&gt;

&lt;p&gt;Let's walk through a real track from &lt;a href="https://github.com/rokoss21/swarm-iosm/tree/main/examples/demo-track" rel="noopener noreferrer"&gt;&lt;code&gt;examples/demo-track/&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem
&lt;/h3&gt;

&lt;p&gt;API endpoint &lt;code&gt;/api/natal/chart&lt;/code&gt; has 450ms P95 latency. Database CPU at 75% during peak hours.&lt;/p&gt;

&lt;h3&gt;
  
  
  Goal
&lt;/h3&gt;

&lt;p&gt;Add Redis caching to reduce latency to &amp;lt;200ms and achieve 80%+ cache hit rate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Create Track
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/swarm-iosm new-track &lt;span class="s2"&gt;"Add Redis caching to API endpoints"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Claude generates:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PRD.md&lt;/code&gt; — 10 sections (Problem, Goals, Requirements, Risks, IOSM Targets)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;spec.md&lt;/code&gt; — Technical design with acceptance tests&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;plan.md&lt;/code&gt; — Task breakdown with dependencies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Generated plan (7 tasks):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;T01: Analyze current performance (Explorer, 1h, read-only)
T02: Design caching strategy (Architect, 2h, foreground)
T03: Implement cache service (Implementer-A, 3h, background)
T04: Add caching to /natal endpoint (Implementer-B, 2h, background, after T03)
T05: Add caching to /transits endpoint (Implementer-C, 2h, background, after T03)
T06: Integration testing (TestRunner, 2h, background, after T04+T05)
T07: Security audit + merge (Integrator, 1h, foreground, after T06)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Execute Plan
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/swarm-iosm implement
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Orchestrator creates &lt;code&gt;continuous_dispatch_plan.md&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Initial Ready Set&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; T01 (Explorer, background)

&lt;span class="gu"&gt;## Expected Timeline&lt;/span&gt;
Batch 1: T01 → completes in 1h
Batch 2: T02 → completes in 2h (total: 3h)
Batch 3: T03 → completes in 3h (total: 6h)
Batch 4: T04, T05 (PARALLEL) → completes in 2h (total: 8h)
Batch 5: T06 → completes in 2h (total: 10h)
Batch 6: T07 → completes in 1h (total: 11h)

Serial estimate: 13h
Parallel estimate: 11h
Speedup: ~1.2x
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;But wait — T01 discovers an N+1 query issue:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## SpawnCandidates (from T01 report)&lt;/span&gt;

| ID | Subtask | Touches | Effort | Severity |
|----|---------|---------|--------|----------|
| SC-01 | Optimize calculate_aspects N+1 query | &lt;span class="sb"&gt;`backend/core/astro/natal.py`&lt;/span&gt; | M | medium |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Orchestrator auto-spawns SC-01 and adjusts timeline.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Integration &amp;amp; Quality Gates
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/swarm-iosm integrate demo-add-caching
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Generated &lt;code&gt;iosm_report.md&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Gate Evaluation Summary&lt;/span&gt;

| Gate | Target | Final | Status |
|------|--------|-------|--------|
| Gate-I (Code Quality) | ≥0.75 | 0.89 | ✅ PASS |
| Gate-O (Performance) | Tests pass | All pass | ✅ PASS |
| Gate-M (Modularity) | No circular deps | Pass | ✅ PASS |
| Gate-S (Simplicity) | API stable | N/A | ⚪ SKIP |

IOSM-Index: 0.85 ✅ (threshold: 0.80)

&lt;span class="gs"&gt;**Result:**&lt;/span&gt; APPROVED FOR PRODUCTION MERGE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;⚡ &lt;strong&gt;P95 latency:&lt;/strong&gt; 450ms → 180ms (60% improvement)&lt;/li&gt;
&lt;li&gt;🎯 &lt;strong&gt;Cache hit rate:&lt;/strong&gt; 82%&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;All tests passing&lt;/strong&gt; (24 unit + 6 integration)&lt;/li&gt;
&lt;li&gt;🔒 &lt;strong&gt;Zero production errors&lt;/strong&gt; during rollout&lt;/li&gt;
&lt;li&gt;⏱️ &lt;strong&gt;Total time:&lt;/strong&gt; 9.25h parallel vs 16h+ sequential (~1.7x faster)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Technical Deep Dive
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. File Lock Management
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Challenge:&lt;/strong&gt; How do you prevent two agents from editing the same file simultaneously?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Hierarchical lock manager with folder/file awareness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lock rules:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;conflicts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lock_a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lock_b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lock_a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lock_b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Exact match
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="c1"&gt;# Folder contains file
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Lock Plan&lt;/span&gt;

Tasks with overlapping touches (sequential only):
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`backend/core/__init__.py`&lt;/span&gt;: T03, T04 → ❌ Cannot run parallel
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`backend/api/`&lt;/span&gt;: T05, T06 → ❌ Folder conflict

Safe parallel execution:
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`backend/auth.py`&lt;/span&gt; (T02) + &lt;span class="sb"&gt;`backend/payments.py`&lt;/span&gt; (T07) → ✅ No overlap
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Read-only tasks:&lt;/strong&gt; Always parallel (no locks needed).&lt;/p&gt;




&lt;h3&gt;
  
  
  2. IOSM Quality Gates
&lt;/h3&gt;

&lt;p&gt;Four gates enforce production-grade quality:&lt;/p&gt;

&lt;h4&gt;
  
  
  Gate-I: Improve (Code Quality)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;semantic_coherence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;≥0.95&lt;/span&gt;  &lt;span class="c1"&gt;# Clear naming, no magic numbers&lt;/span&gt;
&lt;span class="na"&gt;duplication_max&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;≤0.05&lt;/span&gt;     &lt;span class="c1"&gt;# Max 5% duplicate code&lt;/span&gt;
&lt;span class="na"&gt;invariants_documented&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="c1"&gt;# Pre/post-conditions&lt;/span&gt;
&lt;span class="na"&gt;todos_tracked&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;        &lt;span class="c1"&gt;# All TODOs in issue tracker&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Measured by:&lt;/strong&gt; AST analysis, clone detection, docstring coverage.&lt;/p&gt;

&lt;h4&gt;
  
  
  Gate-O: Optimize (Performance &amp;amp; Resilience)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;latency_ms&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;p50&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;≤100&lt;/span&gt;
  &lt;span class="na"&gt;p95&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;≤200&lt;/span&gt;
  &lt;span class="na"&gt;p99&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;≤500&lt;/span&gt;
&lt;span class="na"&gt;error_budget_respected&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;chaos_tests_pass&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;no_obvious_inefficiencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;  &lt;span class="c1"&gt;# N+1 queries, memory leaks&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Measured by:&lt;/strong&gt; Load testing (locust, k6), chaos engineering, profiling.&lt;/p&gt;

&lt;h4&gt;
  
  
  Gate-M: Modularize (Clean Boundaries)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;contracts_defined&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1.0&lt;/span&gt;       &lt;span class="c1"&gt;# 100% of modules&lt;/span&gt;
&lt;span class="na"&gt;change_surface_max&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.20&lt;/span&gt;     &lt;span class="c1"&gt;# ≤20% of codebase touched&lt;/span&gt;
&lt;span class="na"&gt;no_circular_deps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;coupling_acceptable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Measured by:&lt;/strong&gt; Dependency graph analysis, interface stability.&lt;/p&gt;

&lt;h4&gt;
  
  
  Gate-S: Shrink (Minimal Complexity)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;api_surface_reduction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;≥0.20&lt;/span&gt;  &lt;span class="c1"&gt;# Or justified growth&lt;/span&gt;
&lt;span class="na"&gt;dependency_count_stable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;onboarding_time_minutes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;≤15&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Measured by:&lt;/strong&gt; Public API count, &lt;code&gt;requirements.txt&lt;/code&gt; diff, README clarity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IOSM-Index Calculation:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;IOSM-Index = (Gate-I + Gate-O + Gate-M + Gate-S) / 4
Production Threshold: ≥ 0.80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Auto-spawn rules:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gate-I &amp;lt; 0.75 → Spawn clarity/duplication fixes&lt;/li&gt;
&lt;li&gt;Gate-O fails → Spawn test/performance fixes&lt;/li&gt;
&lt;li&gt;Gate-M fails → Spawn boundary clarification tasks&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  3. Auto-Spawn Protocol
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Agents discover issues during execution (e.g., N+1 queries, missing tests).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Structured &lt;code&gt;SpawnCandidates&lt;/code&gt; section in reports.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Format:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## SpawnCandidates&lt;/span&gt;

| ID | Subtask | Touches | Effort | User Input | Severity | Dedup Key | Accept Criteria |
|----|---------|---------|--------|------------|----------|-----------|-----------------|
| SC-01 | Fix missing type annotation | &lt;span class="sb"&gt;`backend/auth.py`&lt;/span&gt; | S | false | medium | auth.py&lt;span class="se"&gt;\|&lt;/span&gt;type-annot | mypy passes |
| SC-02 | Clarify API contract | &lt;span class="sb"&gt;`docs/api_spec.yaml`&lt;/span&gt; | M | true | high | api_spec&lt;span class="se"&gt;\|&lt;/span&gt;contract | Contract approved |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Orchestrator actions:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Parse &lt;code&gt;SpawnCandidates&lt;/code&gt; from completed task reports&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deduplicate&lt;/strong&gt; by &lt;code&gt;dedup_key&lt;/code&gt; (prevents duplicate work)&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;needs_user_input=false&lt;/code&gt; and &lt;code&gt;severity != critical&lt;/code&gt; → &lt;strong&gt;auto-spawn&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;needs_user_input=true&lt;/code&gt; → Add to blocked queue&lt;/li&gt;
&lt;li&gt;Run new tasks through planner and dispatch&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Spawn protection:&lt;/strong&gt; Budget limits (default: 20 auto-spawns per track) prevent infinite loops.&lt;/p&gt;




&lt;h3&gt;
  
  
  4. Cost Tracking &amp;amp; Model Selection
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Model selection rules:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Cost (per 1M tokens)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Haiku&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Read-only analysis&lt;/td&gt;
&lt;td&gt;$0.25 / $1.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sonnet&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Standard implementation&lt;/td&gt;
&lt;td&gt;$3.00 / $15.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Opus&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Architecture, security&lt;/td&gt;
&lt;td&gt;$15.00 / $75.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Budget controls:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Default limit: &lt;strong&gt;$10.00 per track&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;⚠️ &lt;strong&gt;80% usage&lt;/strong&gt; → Warning&lt;/li&gt;
&lt;li&gt;🛑 &lt;strong&gt;100% usage&lt;/strong&gt; → Pause execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Check current spend:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Cost Tracking (from iosm_state.md)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; budget_total: $10.00
&lt;span class="p"&gt;-&lt;/span&gt; spent_so_far: $6.50
&lt;span class="p"&gt;-&lt;/span&gt; remaining: $3.50
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Real-World Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Greenfield Feature (Email Notifications)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Task:&lt;/strong&gt; Add complete email notification system to SaaS app&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plan:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;T01: Design email templates (Architect, foreground)&lt;/li&gt;
&lt;li&gt;T02: Implement SMTP service (Implementer-A, background)&lt;/li&gt;
&lt;li&gt;T03: Add queue system (Implementer-B, background, &lt;strong&gt;parallel with T02&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;T04: Write integration tests (TestRunner, background, after T02+T03)&lt;/li&gt;
&lt;li&gt;T05: Add API endpoints (Implementer-C, background, after T02)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⚡ &lt;strong&gt;~3x faster&lt;/strong&gt; (4-6h parallel vs 12-15h sequential)&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;100% test coverage&lt;/strong&gt; (Gate-O enforcement)&lt;/li&gt;
&lt;li&gt;📉 &lt;strong&gt;Minimal technical debt&lt;/strong&gt; (Gate-I: 0.92)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  2. Brownfield Refactoring (Payment Module)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Task:&lt;/strong&gt; Refactor legacy payment processing (5000+ LOC, 3 years old)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workflow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Plan mode:&lt;/strong&gt; Explorer analyzes codebase (read-only, safe)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PRD with rollback strategy&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comprehensive regression tests&lt;/strong&gt; (before touching code)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel implementation&lt;/strong&gt; (2 modules refactored simultaneously)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gate-M fails:&lt;/strong&gt; Circular dependency detected&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-spawn:&lt;/strong&gt; "Break circular import between Payment and Invoice"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Re-check Gate-M:&lt;/strong&gt; Pass ✅&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🎯 &lt;strong&gt;Gate-driven quality&lt;/strong&gt; — Forced resolution of hidden issues&lt;/li&gt;
&lt;li&gt;🔒 &lt;strong&gt;Safe refactor&lt;/strong&gt; — All tests passing before merge&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;Measured improvement&lt;/strong&gt; — 40% reduction in module coupling&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  3. Multi-Module Feature (Multi-Tenant Architecture)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Task:&lt;/strong&gt; Add multi-tenancy (affects 8 modules)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plan:&lt;/strong&gt; 20+ tasks across 5 waves&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wave 1: T01 Design schema (Architect, critical path)&lt;/li&gt;
&lt;li&gt;Wave 2: T02-T04 Database migrations (&lt;strong&gt;3 parallel implementers&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Wave 3: T05-T10 Update 6 modules (&lt;strong&gt;6 parallel implementers&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Wave 4: T11-T15 Tests (&lt;strong&gt;5 parallel test runners&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Wave 5: T16 Integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Auto-spawn:&lt;/strong&gt; 3 critical tasks discovered during execution&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📈 &lt;strong&gt;High parallelism&lt;/strong&gt; — 6 modules updated simultaneously&lt;/li&gt;
&lt;li&gt;💰 &lt;strong&gt;Budget control&lt;/strong&gt; — $6.50 spent (within $10 limit)&lt;/li&gt;
&lt;li&gt;⏱️ &lt;strong&gt;Time savings&lt;/strong&gt; — ~18h parallel vs 60h+ sequential&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Getting Started (5 Minutes)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone into Claude Code skills directory&lt;/span&gt;
git clone https://github.com/rokoss21/swarm-iosm.git .claude/skills/swarm-iosm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify: type &lt;code&gt;/swarm-iosm&lt;/code&gt; in Claude Code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create Your First Track
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/swarm-iosm new-track &lt;span class="s2"&gt;"Add user authentication with JWT"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Ask questions (mode: greenfield/brownfield, priorities, constraints)&lt;/li&gt;
&lt;li&gt;Generate PRD (10 sections)&lt;/li&gt;
&lt;li&gt;Create &lt;code&gt;plan.md&lt;/code&gt; with task breakdown&lt;/li&gt;
&lt;li&gt;Show orchestration plan&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Execute
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/swarm-iosm implement
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Watch the magic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parallel agents launch automatically&lt;/li&gt;
&lt;li&gt;Progress tracked in &lt;code&gt;iosm_state.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Reports appear in &lt;code&gt;reports/&lt;/code&gt; directory&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Integrate
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/swarm-iosm integrate &amp;lt;track-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Quality gates run automatically. You get &lt;code&gt;iosm_report.md&lt;/code&gt; with pass/fail.&lt;/p&gt;




&lt;h2&gt;
  
  
  Commands Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/swarm-iosm setup&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Initialize project context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/swarm-iosm new-track "&amp;lt;desc&amp;gt;"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create feature track&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/swarm-iosm implement&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Execute plan (auto mode)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/swarm-iosm status&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Check progress&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/swarm-iosm watch&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Live monitoring (v1.3)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/swarm-iosm simulate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Dry-run with timeline (v1.3)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/swarm-iosm resume&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resume after crash (v1.3)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/swarm-iosm retry &amp;lt;task-id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Retry failed task (v1.2)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/swarm-iosm integrate &amp;lt;id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Merge and run gates&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What Swarm-IOSM is NOT
&lt;/h2&gt;

&lt;p&gt;To set clear expectations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ &lt;strong&gt;Not a general-purpose workflow engine&lt;/strong&gt; — Designed specifically for Claude Code agent orchestration&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;Not a replacement for CI/CD&lt;/strong&gt; — Complements your pipeline, doesn't replace it&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;Not a code generator "autopilot"&lt;/strong&gt; — Requires human oversight and decision-making&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;Not safe to run unattended on production repos&lt;/strong&gt; — Always review changes before merge&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────────────────────┐
│                    ORCHESTRATOR (Main Claude Agent)                  │
│  ┌─────────────────────────────────────────────────────────────────┐ │
│  │              Continuous Dispatch Loop (v1.1+)                   │ │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────────┐ │ │
│  │  │ Collect  │→ │ Classify │→ │ Conflict │→ │ Dispatch Batch   │ │ │
│  │  │  Ready   │  │  Modes   │  │  Check   │  │ (max 3-6 tasks)  │ │ │
│  │  └──────────┘  └──────────┘  └──────────┘  └──────────────────┘ │ │
│  │       ↑                                           │             │ │
│  │       │        ┌──────────┐  ┌──────────┐         ↓             │ │
│  │       └────────│  IOSM    │←─│ Auto-    │←────────┘             │ │
│  │                │  Gates   │  │ Spawn    │                       │ │
│  │                └──────────┘  └──────────┘                       │ │
│  └─────────────────────────────────────────────────────────────────┘ │
│                                   │                                  │
│               ┌───────────────────┼───────────────────┐              │
│               ↓                   ↓                   ↓              │
│  ┌────────────────────┐ ┌────────────────────┐ ┌─────────────────┐   │
│  │   Subagent (BG)    │ │   Subagent (BG)    │ │  Subagent (FG)  │   │
│  │   Explorer         │ │   Implementer-A    │ │  Architect      │   │
│  │   read-only        │ │   write-local      │ │  needs_user     │   │
│  └────────────────────┘ └────────────────────┘ └─────────────────┘   │
│               │                   │                   │              │
│               ↓                   ↓                   ↓              │
│         reports/T01.md      reports/T02.md      reports/T03.md       │
│         + SpawnCandidates   + SpawnCandidates   + Escalations        │
└──────────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  IOSM Framework Integration
&lt;/h2&gt;

&lt;p&gt;Swarm-IOSM implements the &lt;a href="https://github.com/rokoss21/IOSM" rel="noopener noreferrer"&gt;IOSM methodology&lt;/a&gt; (Improve → Optimize → Shrink → Modularize) as an executable system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────────────────────────────────────────────────┐
│                           IOSM FRAMEWORK                                   │
│                   https://github.com/rokoss21/IOSM                         │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                            │
│    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────────────┐    │
│    │ IMPROVE  │ →  │ OPTIMIZE │ →  │  SHRINK  │ →  │   MODULARIZE     │    │
│    │          │    │          │    │          │    │                  │    │
│    │ Clarity  │    │ Speed    │    │ Simplify │    │ Decompose        │    │
│    │ No dups  │    │ Resil.   │    │ Surface  │    │ Contracts        │    │
│    │ Invars   │    │ Chaos    │    │ Deps     │    │ Coupling         │    │
│    └────┬─────┘    └────┬─────┘    └────┬─────┘    └────────┬─────────┘    │
│         │               │               │                   │              │
│    ┌────▼─────┐    ┌────▼─────┐    ┌────▼─────┐    ┌────────▼─────────┐    │
│    │ Gate-I   │    │ Gate-O   │    │ Gate-S   │    │     Gate-M       │    │
│    │ ≥0.85    │    │ ≥0.75    │    │ ≥0.80    │    │     ≥0.80        │    │
│    └──────────┘    └──────────┘    └──────────┘    └──────────────────┘    │
│                                                                            │
│    IOSM-Index = (Gate-I + Gate-O + Gate-S + Gate-M) / 4                    │
│    Production threshold: ≥ 0.80                                            │
└────────────────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Version History
&lt;/h2&gt;

&lt;h3&gt;
  
  
  v2.1 (2026-01-19) — Current
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Automated State Management (&lt;code&gt;iosm_state.md&lt;/code&gt; auto-generated)&lt;/li&gt;
&lt;li&gt;Status Sync CLI (&lt;code&gt;--update-task&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Improved Report Conflict Detection&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  v2.0 (2026-01-18)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Inter-Agent Communication (&lt;code&gt;shared_context.md&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Task Dependency Visualization (&lt;code&gt;--graph&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Anti-Pattern Detection&lt;/li&gt;
&lt;li&gt;Template Customization&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  v1.3 (2026-01-17)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Simulation Mode (&lt;code&gt;/swarm-iosm simulate&lt;/code&gt;) with ASCII Timeline&lt;/li&gt;
&lt;li&gt;Live Monitoring (&lt;code&gt;/swarm-iosm watch&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Checkpointing &amp;amp; Resume (&lt;code&gt;/swarm-iosm resume&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  v1.2 (2026-01-16)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Concurrency Limits (Resource Budgets)&lt;/li&gt;
&lt;li&gt;Cost Tracking &amp;amp; Model Selection (Haiku/Sonnet/Opus)&lt;/li&gt;
&lt;li&gt;Intelligent Error Diagnosis &amp;amp; Retry (&lt;code&gt;/swarm-iosm retry&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  v1.1 (2026-01-15)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Continuous Dispatch Loop&lt;/strong&gt; (no wave barriers)&lt;/li&gt;
&lt;li&gt;Gate-Driven Continuation&lt;/li&gt;
&lt;li&gt;Auto-Spawn from SpawnCandidates&lt;/li&gt;
&lt;li&gt;Touches Lock Manager&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Contributing
&lt;/h2&gt;

&lt;p&gt;We welcome contributions! Key areas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gate Automation Scripts&lt;/strong&gt; — Measure IOSM criteria automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI/CD Integration&lt;/strong&gt; — GitHub Actions, GitLab CI examples&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Language-Specific Checkers&lt;/strong&gt; — Python, TypeScript, Rust evaluators&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More Examples&lt;/strong&gt; — Real-world track demonstrations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IDE Integration&lt;/strong&gt; — VS Code extension&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See &lt;a href="https://github.com/rokoss21/swarm-iosm/blob/main/CONTRIBUTING.md" rel="noopener noreferrer"&gt;CONTRIBUTING.md&lt;/a&gt; for guidelines.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Swarm-IOSM proves that AI agent orchestration can be both &lt;strong&gt;fast&lt;/strong&gt; (3-8x speedup through parallelism) and &lt;strong&gt;safe&lt;/strong&gt; (quality gates before merge).&lt;/p&gt;

&lt;p&gt;The continuous dispatch model eliminates artificial wave barriers, file lock management prevents conflicts, and IOSM gates enforce production-grade standards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key takeaway:&lt;/strong&gt; Don't choose between speed and quality. With proper orchestration, you get both.&lt;/p&gt;

&lt;p&gt;Try it today:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/rokoss21/swarm-iosm.git .claude/skills/swarm-iosm
/swarm-iosm new-track &lt;span class="s2"&gt;"Your next feature"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/rokoss21/swarm-iosm" rel="noopener noreferrer"&gt;github.com/rokoss21/swarm-iosm&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IOSM Methodology:&lt;/strong&gt; &lt;a href="https://github.com/rokoss21/IOSM" rel="noopener noreferrer"&gt;github.com/rokoss21/IOSM&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Author:&lt;/strong&gt; &lt;a href="https://github.com/rokoss21" rel="noopener noreferrer"&gt;Emil Rokossovskiy&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;Email: &lt;a href="mailto:ecsiar@gmail.com"&gt;ecsiar@gmail.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Web: &lt;a href="https://rokoss21.tech" rel="noopener noreferrer"&gt;rokoss21.tech&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Related Projects:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/rokoss21/facet-standard" rel="noopener noreferrer"&gt;FACET Standard&lt;/a&gt; — Deterministic Contract Layer for AI&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/rokoss21/facet-compiler" rel="noopener noreferrer"&gt;FACET Compiler&lt;/a&gt; — Reference Implementation (Rust)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/rokoss21/IOSM#real-world-application-astrovisorio" rel="noopener noreferrer"&gt;AstroVisor.io&lt;/a&gt; — Production IOSM Case Study&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Questions? Ideas? Issues?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/rokoss21/swarm-iosm/discussions" rel="noopener noreferrer"&gt;GitHub Discussions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/rokoss21/swarm-iosm/issues" rel="noopener noreferrer"&gt;GitHub Issues&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Built with ⚡ by &lt;a href="https://github.com/rokoss21" rel="noopener noreferrer"&gt;@rokoss21&lt;/a&gt; | IOSM: Improve → Optimize → Shrink → Modularize&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>automation</category>
      <category>engineering</category>
    </item>
    <item>
      <title>FACET: Contracts + Gates for LLM Systems</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Mon, 19 Jan 2026 08:28:53 +0000</pubDate>
      <link>https://dev.to/rokoss21/facet-contracts-gates-for-llm-systems-pok</link>
      <guid>https://dev.to/rokoss21/facet-contracts-gates-for-llm-systems-pok</guid>
      <description>&lt;p&gt;&lt;em&gt;Stop doing improv theatre in production. Ship agents like software.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Agentic tooling is moving fast: CLIs that edit repositories, frameworks that orchestrate swarms, tool-calling APIs everywhere. And still, most teams that try to run “agents” in production hit the same wall:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;outputs drift between runs&lt;/li&gt;
&lt;li&gt;“structured output” breaks at the worst moment&lt;/li&gt;
&lt;li&gt;tool calls happen at the wrong time, with the wrong shape&lt;/li&gt;
&lt;li&gt;debugging turns into story-time (“it worked yesterday…”)&lt;/li&gt;
&lt;li&gt;trust collapses exactly when you need it most&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The root cause isn’t that models aren’t smart enough.&lt;br&gt;
It’s that we keep shipping &lt;strong&gt;non-contractual behavior&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This post argues a simple thesis:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Reliability in LLM systems doesn’t come from better prompts.&lt;br&gt;
It comes from &lt;strong&gt;contracts&lt;/strong&gt; and &lt;strong&gt;gates&lt;/strong&gt; — with the system holding veto power.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;FACET v2.0 is a compiler-grade, deterministic agent configuration language designed around that thesis: strict AST → type checking (FTS) → reactive compute (R-DAG) → deterministic context packing (Token Box Model) → canonical JSON render.&lt;/p&gt;


&lt;h2&gt;
  
  
  A short failure story: “theatre in production”
&lt;/h2&gt;

&lt;p&gt;A team ships an “agentic PR bot”. It edits code, runs tests, and posts a confident summary.&lt;/p&gt;

&lt;p&gt;One day the bot “fixes” an issue by adding a dependency. Tests pass locally. The PR merges.&lt;br&gt;
In production, a transitive change triggers a locale/timezone edge case. A downstream service fails for a subset of users. Rollback takes hours because nobody can answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Was the agent allowed to introduce new dependencies?&lt;/li&gt;
&lt;li&gt;Which tool calls did it run, with what arguments, in what order?&lt;/li&gt;
&lt;li&gt;Can we replay the run?&lt;/li&gt;
&lt;li&gt;What evidence exists beyond “agent said it’s fine”?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The bot didn’t “misbehave”. It acted exactly as designed: &lt;strong&gt;it operated without enforceable boundaries&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That’s the pattern: not “bad model”, but &lt;strong&gt;missing veto power&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  Contracts and gates: the difference between a demo and a pipeline
&lt;/h2&gt;

&lt;p&gt;Most agent stacks look like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt + JSON hope → model writes → parse fails → retry culture → merge anyway&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A contractual pipeline looks like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contract → validate inputs + permissions → generate artifact → validate artifact → gates → commit (or reject)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two key primitives make this real:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Contracts&lt;/strong&gt;: define what’s allowed and what “valid” means&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gates&lt;/strong&gt;: run reality checks (tests, security, perf) and block state changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;FACET makes both primitives first-class — not conventions, not best-effort prompts.&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 1 — Contracts in FACET (real examples)
&lt;/h2&gt;

&lt;p&gt;FACET v2.0 treats agent behavior as a compiled spec. That starts with strict structure and typing.&lt;/p&gt;
&lt;h3&gt;
  
  
  1) Tool contracts with &lt;code&gt;@interface&lt;/code&gt; (typed tools, not “tool descriptions”)
&lt;/h3&gt;

&lt;p&gt;In FACET, tools aren’t loose JSON blobs. They are typed interfaces that compile into provider tool schemas.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@interface WeatherAPI
  fn get_current(city: string) -&amp;gt; struct {
    temp: float
    condition: string
  }

@system
  tools: [$WeatherAPI]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a contract:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the tool name exists&lt;/li&gt;
&lt;li&gt;args are typed (&lt;code&gt;city: string&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;return shape is typed (&lt;code&gt;struct { temp: float, condition: string }&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;the compiler can emit canonical provider schemas during render&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, this eliminates a whole class of runtime failures: wrong arg names, wrong types, ambiguous “tool results”.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) Inputs are explicit with &lt;code&gt;@input&lt;/code&gt; (no hidden dependencies)
&lt;/h3&gt;

&lt;p&gt;FACET forces you to declare runtime inputs in &lt;code&gt;@vars&lt;/code&gt; via &lt;code&gt;@input(...)&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@vars
  user_query: @input(type="string")
  user_photo: @input(type="image", max_dim=1024)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This matters because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;missing input is not “guess it” — it’s an error&lt;/li&gt;
&lt;li&gt;constraints (like image size) are enforced at runtime&lt;/li&gt;
&lt;li&gt;inputs become leaf nodes in the R-DAG (deterministic dependency graph)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is fail-closed engineering: if data isn’t provided, the system does &lt;strong&gt;not&lt;/strong&gt; hallucinate a substitute.&lt;/p&gt;

&lt;h3&gt;
  
  
  3) Variables are reactive, deterministic, and immutable after compute (R-DAG)
&lt;/h3&gt;

&lt;p&gt;FACET variables can depend on other variables. Evaluation happens via R-DAG in topological order; cycles and invalid orders are errors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@vars
  raw_query: $user_query |&amp;gt; trim()
  query_lang: $raw_query |&amp;gt; detect_lang()
  normalized: $raw_query |&amp;gt; normalize(lang=$query_lang)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key point: once computed, the variable map becomes immutable.&lt;br&gt;
This makes runs reproducible and debuggable: the same inputs produce the same computed state (in Pure Mode).&lt;/p&gt;
&lt;h3&gt;
  
  
  4) Lenses have trust levels (Pure / Bounded / Volatile)
&lt;/h3&gt;

&lt;p&gt;FACET introduces trust levels for transformations (lenses):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Level 0 — Pure&lt;/strong&gt;: deterministic, no I/O&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level 1 — Bounded external&lt;/strong&gt;: allowed only with deterministic params, cacheable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level 2 — Volatile&lt;/strong&gt;: nondeterministic, only in Execution Mode&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A pipeline makes the contract explicit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@vars
  summary: $normalized
    |&amp;gt; summarize(model="gpt-5.2", temperature=0)   # Level 1 (bounded)
    |&amp;gt; to_markdown()                               # Level 0 (pure)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where “determinism is a property of the system” becomes concrete.&lt;br&gt;
If you’re in Pure Mode: you simply cannot smuggle volatility in “because it felt right”.&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 2 — Gates in FACET (not vibes, executable checks)
&lt;/h2&gt;

&lt;p&gt;A contract without gates is still fragile. Gates give the system the right to say: &lt;strong&gt;no&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;FACET v2.0 includes a first-class testing system via &lt;code&gt;@test&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  5) Tests as executable gates with mocks and assertions (&lt;code&gt;@test&lt;/code&gt;)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@test "basic greeting"
  vars:
    username: "TestUser"

  mock:
    WeatherAPI.get_current: { temp: 10, condition: "Rain" }

  assert:
    - output contains "umbrella"
    - cost &amp;lt; 0.01
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This is CI thinking applied to agent specs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tests execute the full 5-phase pipeline&lt;/li&gt;
&lt;li&gt;tools can be mocked (deterministic runs)&lt;/li&gt;
&lt;li&gt;assertions can check output and telemetry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words: “agent done” is not a feeling — it’s &lt;strong&gt;passing checks&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 3 — Deterministic context packing (Token Box Model) is a gate too
&lt;/h2&gt;

&lt;p&gt;Even when contracts and tests exist, real systems fail because context is managed ad hoc. Prompts overflow, critical instructions get truncated, and the model “drifts” because the context layout changed.&lt;/p&gt;

&lt;p&gt;FACET treats context like layout, not like concatenated strings.&lt;/p&gt;
&lt;h3&gt;
  
  
  6) Token Box Model: deterministic allocation + critical overflow as a hard failure
&lt;/h3&gt;

&lt;p&gt;The model is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;your prompt is a set of &lt;strong&gt;sections&lt;/strong&gt; (&lt;code&gt;@system&lt;/code&gt;, &lt;code&gt;@user&lt;/code&gt;, history, docs, etc.)&lt;/li&gt;
&lt;li&gt;each section has min/grow/shrink/priority&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;critical sections&lt;/strong&gt; are those with &lt;code&gt;shrink == 0&lt;/code&gt; and must never be dropped or compressed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If critical sections can’t fit, FACET raises a hard error (critical overflow).&lt;br&gt;
This is a &lt;em&gt;gate&lt;/em&gt;: the system refuses to ship an invalid prompt.&lt;/p&gt;

&lt;p&gt;That single decision kills an entire class of “mysterious agent regressions” caused by silent truncation.&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 4 — What “enforced before generation” actually means (no magic)
&lt;/h2&gt;

&lt;p&gt;This phrase can sound controversial, so here’s the precise version:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FACET enforces a double barrier:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Before action&lt;/strong&gt; (pre-check):&lt;br&gt;
validate inputs, tool interfaces, allowed operations, budgets, deterministic mode constraints&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Before state change&lt;/strong&gt; (post-check):&lt;br&gt;
validate produced artifacts, run gates, reject if any invariant breaks&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So the flow is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;validate → generate → validate → gate → commit&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is how compilers and CI pipelines behave.&lt;br&gt;
Production agent systems should do the same.&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 5 — A small, concrete canonical output artifact
&lt;/h2&gt;

&lt;p&gt;FACET’s final output is a canonical JSON structure (before provider-specific transformations). Here’s a simplified “what your orchestration layer can log and replay” shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"profile"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hypervisor"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pure"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"WeatherAPI.get_current"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"input_schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sections_order"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tools"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"history"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"what to wear today in Berlin?"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"gates"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"gate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tests_green"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pass"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"gate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"critical_overflow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pass"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the difference vs typical systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;there is an explicit mode&lt;/li&gt;
&lt;li&gt;tools are typed&lt;/li&gt;
&lt;li&gt;section order is deterministic&lt;/li&gt;
&lt;li&gt;gates and outcomes are visible&lt;/li&gt;
&lt;li&gt;this is loggable and replayable&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Part 6 — Tooling matters: the reference CLI (&lt;code&gt;fct&lt;/code&gt;) makes this operational
&lt;/h2&gt;

&lt;p&gt;FACET isn’t only a philosophy; it specifies tooling expectations. A reference CLI (&lt;code&gt;fct&lt;/code&gt;) is part of the standard:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;fct build file.facet&lt;/code&gt; — resolution + type checking&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fct run file.facet --input input.json&lt;/code&gt; — full 5-phase pipeline → canonical JSON&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fct test file.facet&lt;/code&gt; — execute &lt;code&gt;@test&lt;/code&gt; blocks, report failures + telemetry&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fct inspect ...&lt;/code&gt; — introspect AST/R-DAG/context allocation (debuggability)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the language includes these operations, teams stop inventing bespoke glue.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing: stop shipping theatre — ship standards
&lt;/h2&gt;

&lt;p&gt;LLMs are powerful components — but without enforceable boundaries they introduce entropy at the exact moment correctness, security, and reliability matter most.&lt;/p&gt;

&lt;p&gt;Contracts + gates aren’t bureaucracy.&lt;br&gt;
They’re the difference between a cool demo and a shippable system.&lt;/p&gt;

&lt;p&gt;FACET’s core bet is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Treat agent behavior like compiled software:&lt;br&gt;
parse, type-check, compute deterministically, pack context deterministically, render canonical JSON — and never commit state unless gates pass.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Repositories
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;FACET Compiler: &lt;a href="https://github.com/rokoss21/facet-compiler" rel="noopener noreferrer"&gt;https://github.com/rokoss21/facet-compiler&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;FACET Standard: &lt;a href="https://github.com/rokoss21/facet-standard" rel="noopener noreferrer"&gt;https://github.com/rokoss21/facet-standard&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Parallel Agents Are Easy. Shipping Without Chaos Isn’t.</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Sun, 18 Jan 2026 09:14:24 +0000</pubDate>
      <link>https://dev.to/rokoss21/parallel-agents-are-easy-shipping-without-chaos-isnt-1kek</link>
      <guid>https://dev.to/rokoss21/parallel-agents-are-easy-shipping-without-chaos-isnt-1kek</guid>
      <description>&lt;h2&gt;
  
  
  Introducing Swarm-IOSM — a Parallel Subagent Orchestration Engine for Claude Code
&lt;/h2&gt;

&lt;p&gt;Everyone is building multi-agent workflows now.&lt;/p&gt;

&lt;p&gt;Swarm prompts. Agent teams. Tool calling. “Auto-developers”.&lt;/p&gt;

&lt;p&gt;And yet… most of them collapse the moment you try to use them on real codebases.&lt;/p&gt;

&lt;p&gt;Not because the models can’t code.&lt;/p&gt;

&lt;p&gt;Because &lt;strong&gt;parallel development has two hard problems&lt;/strong&gt; that prompt-chains don’t solve:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Safe concurrency&lt;/strong&gt; (two agents writing into the same file is not “parallelism”, it’s a race condition)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stop conditions&lt;/strong&gt; (how do you know the result is shippable, not just “it ran”)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I built &lt;strong&gt;Swarm-IOSM&lt;/strong&gt; to turn agent orchestration into an engineering discipline:&lt;br&gt;
locks, dispatch scheduling, gates, and anti-chaos rules — executable, repeatable, and production-oriented.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/rokoss21/swarm-iosm" rel="noopener noreferrer"&gt;https://github.com/rokoss21/swarm-iosm&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  The Hidden Failure Mode of “Agent Swarms”
&lt;/h2&gt;

&lt;p&gt;Here’s the truth nobody wants to say out loud:&lt;/p&gt;

&lt;p&gt;Most “agent swarms” are just &lt;strong&gt;concurrency without a correctness model&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;They don’t fail spectacularly. They fail &lt;em&gt;quietly&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agent A fixes a bug and touches &lt;code&gt;auth.py&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Agent B adds a feature and also touches &lt;code&gt;auth.py&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;You merge both and discover behavior drift&lt;/li&gt;
&lt;li&gt;The PR looks large, architecture degrades, confidence drops&lt;/li&gt;
&lt;li&gt;Then the swarm spawns more tasks to “fix” the mess&lt;/li&gt;
&lt;li&gt;Congratulations, you built &lt;strong&gt;a self-replicating backlog generator&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The root cause is simple:&lt;/p&gt;
&lt;h3&gt;
  
  
  “Parallel agents” ≠ Parallel development
&lt;/h3&gt;

&lt;p&gt;Parallel development requires &lt;strong&gt;conflict prevention&lt;/strong&gt;, not conflict resolution.&lt;/p&gt;


&lt;h2&gt;
  
  
  Swarm-IOSM: IOSM Methodology + Execution Engine
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;IOSM&lt;/strong&gt; is the methodology:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Improve → Optimize → Shrink → Modularize&lt;br&gt;
A disciplined loop that forces engineering quality to remain measurable, not performative.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Swarm-IOSM&lt;/strong&gt; is the execution engine:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PRD-driven decomposition&lt;/li&gt;
&lt;li&gt;Continuous dispatch scheduling&lt;/li&gt;
&lt;li&gt;File-conflict prevention via lock discipline&lt;/li&gt;
&lt;li&gt;Auto-spawn protocol for discoveries&lt;/li&gt;
&lt;li&gt;Quality gates as stop conditions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s not “a prompt”.&lt;/p&gt;

&lt;p&gt;It’s a &lt;strong&gt;workflow runtime for parallel software development&lt;/strong&gt; inside Claude Code.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Architecture: An Orchestrator That Does Not Implement
&lt;/h2&gt;

&lt;p&gt;Swarm-IOSM is intentionally designed around one rule:&lt;/p&gt;
&lt;h3&gt;
  
  
  The Orchestrator does NOT implement.
&lt;/h3&gt;

&lt;p&gt;The main agent coordinates only.&lt;/p&gt;

&lt;p&gt;All implementation work happens in subagents, each producing a report.&lt;/p&gt;

&lt;p&gt;This is not a style preference — it’s a safety boundary.&lt;/p&gt;

&lt;p&gt;When the orchestrator writes code, it stops being a scheduler and becomes “yet another contributor”, losing global coordination ability.&lt;/p&gt;

&lt;p&gt;So Swarm-IOSM splits responsibilities cleanly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Orchestrator&lt;/strong&gt; = scheduling + gates + conflict check + state tracking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Subagents&lt;/strong&gt; = execution + reports + spawn candidates&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  The Core Engine: Continuous Dispatch (No Wave Barriers)
&lt;/h2&gt;

&lt;p&gt;Most orchestration frameworks work like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Prepare plan → run wave 1 → wait → run wave 2 → wait → merge&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s not how software work actually flows.&lt;/p&gt;

&lt;p&gt;Reality is continuous: tasks unblock tasks every minute.&lt;/p&gt;

&lt;p&gt;Swarm-IOSM implements &lt;strong&gt;continuous dispatch scheduling&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tasks move through states: &lt;code&gt;backlog → ready → running → done&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;as soon as dependencies are satisfied, tasks are eligible to run&lt;/li&gt;
&lt;li&gt;you dispatch ready tasks immediately (no waiting for a “wave boundary”)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is what makes it feel &lt;em&gt;fast&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It maximizes parallelism without turning the repo into a battlefield.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Missing Primitive: “Touches” Lock Manager
&lt;/h2&gt;

&lt;p&gt;This is the centerpiece.&lt;/p&gt;

&lt;p&gt;Swarm-IOSM treats a codebase like a shared memory system.&lt;/p&gt;

&lt;p&gt;If agents are threads, then files are memory regions.&lt;/p&gt;

&lt;p&gt;So Swarm introduces a primitive that classic “agent swarms” ignore:&lt;/p&gt;
&lt;h3&gt;
  
  
  Touches = the set of files/folders a task may modify.
&lt;/h3&gt;

&lt;p&gt;Each task declares:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Touches: auth.py, services/auth/&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Concurrency class:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;read-only&lt;/code&gt; (no locks, always safe)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;write-local&lt;/code&gt; (lock only touches)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;write-shared&lt;/code&gt; (exclusive, sequential)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then Swarm enforces locks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;folder lock blocks everything inside it&lt;/li&gt;
&lt;li&gt;file lock blocks only that file&lt;/li&gt;
&lt;li&gt;read-only tasks remain parallel always&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result:&lt;/p&gt;

&lt;p&gt;✅ real parallelism&lt;br&gt;
✅ predictable merges&lt;br&gt;
✅ no random collisions “because agent decided to edit config too”&lt;/p&gt;


&lt;h2&gt;
  
  
  Auto-Spawn… Without Infinite Task Proliferation
&lt;/h2&gt;

&lt;p&gt;Auto-spawn sounds cool until you actually run it.&lt;/p&gt;

&lt;p&gt;A naive swarm will spawn tasks forever.&lt;/p&gt;

&lt;p&gt;Swarm-IOSM forces auto-spawn to be &lt;strong&gt;bounded and deduplicated&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;spawn budget total&lt;/li&gt;
&lt;li&gt;per-gate budgets&lt;/li&gt;
&lt;li&gt;dedup key: &lt;code&gt;&amp;lt;primary_touch&amp;gt;|&amp;lt;intent_category&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;severity thresholds&lt;/li&gt;
&lt;li&gt;anti-loop counters (max iterations without progress)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is what transforms “agent creativity” into something you can safely run in an engineering process.&lt;/p&gt;


&lt;h2&gt;
  
  
  IOSM Gates: Stop Conditions That Mean Something
&lt;/h2&gt;

&lt;p&gt;Most systems “stop” when tasks finish.&lt;/p&gt;

&lt;p&gt;Swarm-IOSM stops when &lt;strong&gt;quality is achieved&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It tracks four gate families:&lt;/p&gt;
&lt;h3&gt;
  
  
  Gate-I (Improve)
&lt;/h3&gt;

&lt;p&gt;Clarity, invariants, low duplication.&lt;/p&gt;
&lt;h3&gt;
  
  
  Gate-O (Optimize)
&lt;/h3&gt;

&lt;p&gt;Latency budget, error budget, chaos checks, no obvious inefficiencies.&lt;/p&gt;
&lt;h3&gt;
  
  
  Gate-S (Shrink)
&lt;/h3&gt;

&lt;p&gt;Surface area reduction, dependency stability, onboarding time.&lt;/p&gt;
&lt;h3&gt;
  
  
  Gate-M (Modularize)
&lt;/h3&gt;

&lt;p&gt;Contracts, coupling limits, no circular dependencies. &lt;/p&gt;

&lt;p&gt;Swarm is not just “agents executing tasks”.&lt;/p&gt;

&lt;p&gt;It’s agents executing tasks until the system crosses a production threshold.&lt;/p&gt;


&lt;h2&gt;
  
  
  Quick Start (The Happy Path)
&lt;/h2&gt;

&lt;p&gt;Swarm-IOSM lives here:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/rokoss21/swarm-iosm" rel="noopener noreferrer"&gt;https://github.com/rokoss21/swarm-iosm&lt;/a&gt;&lt;/strong&gt; &lt;/p&gt;
&lt;h3&gt;
  
  
  1) Install as a Claude Code skill
&lt;/h3&gt;

&lt;p&gt;Project-level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/rokoss21/swarm-iosm.git .claude/skills/swarm-iosm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;User-level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/rokoss21/swarm-iosm.git ~/.claude/skills/swarm-iosm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2) Initialize project context
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/swarm-iosm setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3) Create a feature track
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/swarm-iosm new-track "Add user authentication with JWT"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Swarm generates PRD + plan and returns a track id like:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;2026-01-17-001&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  4) Validate &amp;amp; generate a continuous dispatch plan
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python .claude/skills/swarm-iosm/scripts/orchestration_planner.py &lt;span class="se"&gt;\&lt;/span&gt;
  swarm/tracks/&amp;lt;track-id&amp;gt;/plan.md &lt;span class="nt"&gt;--validate&lt;/span&gt;

python .claude/skills/swarm-iosm/scripts/orchestration_planner.py &lt;span class="se"&gt;\&lt;/span&gt;
  swarm/tracks/&amp;lt;track-id&amp;gt;/plan.md &lt;span class="nt"&gt;--continuous&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5) Execute
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/swarm-iosm implement
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6) Integrate
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/swarm-iosm integrate &amp;lt;track-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This produces integration artifacts and quality gate reporting.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Is Different From “Yet Another Agent Framework”
&lt;/h2&gt;

&lt;p&gt;This part matters.&lt;/p&gt;

&lt;p&gt;Swarm-IOSM doesn’t compete with “prompt frameworks” by being smarter.&lt;/p&gt;

&lt;p&gt;It wins by being stricter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Swarm-IOSM treats a repo as a concurrency system.
&lt;/h3&gt;

&lt;p&gt;Locks are not optional.&lt;/p&gt;

&lt;h3&gt;
  
  
  Swarm-IOSM treats quality as a stop condition.
&lt;/h3&gt;

&lt;p&gt;No gates = no ship.&lt;/p&gt;

&lt;h3&gt;
  
  
  Swarm-IOSM treats spawn as a budgeted resource.
&lt;/h3&gt;

&lt;p&gt;Infinite loops are a design bug, not “agent autonomy”.&lt;/p&gt;

&lt;p&gt;You can replace models, providers, or toolchains.&lt;/p&gt;

&lt;p&gt;But you can’t replace engineering discipline with vibes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real-World Fit: Where Swarm-IOSM Shines
&lt;/h2&gt;

&lt;p&gt;Use Swarm-IOSM when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multi-file features require coordination&lt;/li&gt;
&lt;li&gt;brownfield refactoring needs guardrails&lt;/li&gt;
&lt;li&gt;parallel implementation streams are valuable&lt;/li&gt;
&lt;li&gt;acceptance criteria must exist (not “it compiles”)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Avoid Swarm-IOSM when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it’s a single-file change&lt;/li&gt;
&lt;li&gt;you want quick fixes without planning&lt;/li&gt;
&lt;li&gt;you’re doing purely exploratory research&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A hammer is not a screwdriver.&lt;/p&gt;

&lt;p&gt;A swarm is not a substitute for architecture.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Meta-Point: This Is Part of a Bigger Stack
&lt;/h2&gt;

&lt;p&gt;I’m building a full deterministic engineering ecosystem around AI systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;IOSM&lt;/strong&gt; = methodology layer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Swarm-IOSM&lt;/strong&gt; = execution/orchestration layer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FACET&lt;/strong&gt; = deterministic contract layer for AI behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’ve read my FACET articles, you already know the thesis:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;We don’t need “more prompting”.&lt;br&gt;
We need engineering primitives: contracts, determinism, orchestration rules, replayable artifacts.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Swarm-IOSM is exactly that philosophy applied to parallel agent development.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Swarm-IOSM (GitHub):&lt;/strong&gt; &lt;a href="https://github.com/rokoss21/swarm-iosm" rel="noopener noreferrer"&gt;https://github.com/rokoss21/swarm-iosm&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IOSM Methodology:&lt;/strong&gt; &lt;a href="https://github.com/rokoss21/IOSM" rel="noopener noreferrer"&gt;https://github.com/rokoss21/IOSM&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FACET v2.0 intro:&lt;/strong&gt; &lt;a href="https://dev.to/rokoss21/llms-need-a-contract-layer-introducing-facet-v20-4n1n"&gt;https://dev.to/rokoss21/llms-need-a-contract-layer-introducing-facet-v20-4n1n&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token Box Model:&lt;/strong&gt; &lt;a href="https://dev.to/rokoss21/token-box-model-1kkb"&gt;https://dev.to/rokoss21/token-box-model-1kkb&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Canonical JSON Model:&lt;/strong&gt; &lt;a href="https://dev.to/rokoss21/canonical-json-model-2p8o"&gt;https://dev.to/rokoss21/canonical-json-model-2p8o&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;Parallel agents are not the hard part.&lt;/p&gt;

&lt;p&gt;The hard part is &lt;strong&gt;shipping&lt;/strong&gt; without chaos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no file conflicts&lt;/li&gt;
&lt;li&gt;no accidental coupling&lt;/li&gt;
&lt;li&gt;no architecture collapse&lt;/li&gt;
&lt;li&gt;no infinite spawn loops&lt;/li&gt;
&lt;li&gt;gates that enforce engineering quality&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Swarm-IOSM is my answer to that.&lt;/p&gt;

&lt;p&gt;If you’re using Claude Code and you’ve ever tried to scale beyond a single agent — try it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/rokoss21/swarm-iosm" rel="noopener noreferrer"&gt;https://github.com/rokoss21/swarm-iosm&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And if you want the next deep dive, I can write a follow-up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the touches lock hierarchy rules&lt;/li&gt;
&lt;li&gt;a demo track walkthrough&lt;/li&gt;
&lt;li&gt;and how IOSM gates can be automated for CI.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>opensource</category>
    </item>
    <item>
      <title>History and Rationale of FACET</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Wed, 17 Dec 2025 00:01:11 +0000</pubDate>
      <link>https://dev.to/rokoss21/history-and-rationale-of-facet-3cf8</link>
      <guid>https://dev.to/rokoss21/history-and-rationale-of-facet-3cf8</guid>
      <description>&lt;h2&gt;
  
  
  Purpose of This Document
&lt;/h2&gt;

&lt;p&gt;This document records the &lt;strong&gt;historical context, architectural motivations, and rationale&lt;/strong&gt; behind the design decisions of FACET.&lt;/p&gt;

&lt;p&gt;It exists to answer a recurring future question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Why was FACET designed this way, and not differently?&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is &lt;strong&gt;not&lt;/strong&gt; a changelog and &lt;strong&gt;not&lt;/strong&gt; a roadmap.&lt;br&gt;
It is a rationale document intended for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;future maintainers&lt;/li&gt;
&lt;li&gt;standard reviewers&lt;/li&gt;
&lt;li&gt;enterprise architects&lt;/li&gt;
&lt;li&gt;historians of AI infrastructure&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  1. Pre-FACET Era (≈ 2018–2022)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.1 Prompt Engineering as an Anti-Pattern
&lt;/h3&gt;

&lt;p&gt;Early LLM systems treated prompts as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;opaque strings&lt;/li&gt;
&lt;li&gt;mutable runtime artifacts&lt;/li&gt;
&lt;li&gt;informal contracts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As systems grew, prompt engineering evolved into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;copy-paste templates&lt;/li&gt;
&lt;li&gt;ad-hoc retries&lt;/li&gt;
&lt;li&gt;regex-based JSON extraction&lt;/li&gt;
&lt;li&gt;post-hoc validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Failures were handled &lt;strong&gt;after generation&lt;/strong&gt;, not prevented.&lt;/p&gt;

&lt;p&gt;This era established a false assumption:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;LLM unreliability is inherent and unavoidable.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  1.2 Structured Output Did Not Solve the Core Problem
&lt;/h3&gt;

&lt;p&gt;Later approaches introduced:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JSON schemas in prompts&lt;/li&gt;
&lt;li&gt;function / tool calling APIs&lt;/li&gt;
&lt;li&gt;Pydantic-style validators&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;schemas were advisory, not enforced&lt;/li&gt;
&lt;li&gt;providers interpreted constraints differently&lt;/li&gt;
&lt;li&gt;invalid states were still produced&lt;/li&gt;
&lt;li&gt;validation happened &lt;strong&gt;after&lt;/strong&gt; the model responded&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system still allowed invalid intermediate states.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. FACET v1.x (2022–2024): Lessons Learned
&lt;/h2&gt;

&lt;p&gt;FACET v1.x originated as a &lt;strong&gt;deterministic prompt templating system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It introduced:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;structured blocks&lt;/li&gt;
&lt;li&gt;conditional logic&lt;/li&gt;
&lt;li&gt;early lens pipelines&lt;/li&gt;
&lt;li&gt;canonical JSON output&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2.1 What v1.x Got Right
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;determinism mattered&lt;/li&gt;
&lt;li&gt;canonical JSON enabled caching and diffing&lt;/li&gt;
&lt;li&gt;composition beat monolithic prompts&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2.2 What v1.x Could Not Solve
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;no type system&lt;/li&gt;
&lt;li&gt;no execution model&lt;/li&gt;
&lt;li&gt;no formal notion of invalid state&lt;/li&gt;
&lt;li&gt;no prevention of tool-call failures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;FACET v1.x reduced chaos, but did not eliminate it.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. The Breaking Point (2024–2025)
&lt;/h2&gt;

&lt;p&gt;By 2024, several systemic failures became unavoidable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multi-tool agents failing nondeterministically&lt;/li&gt;
&lt;li&gt;provider-specific tool-call rules causing silent breakage&lt;/li&gt;
&lt;li&gt;streaming vs non-streaming divergence&lt;/li&gt;
&lt;li&gt;context truncation corrupting logic&lt;/li&gt;
&lt;li&gt;retries masking correctness bugs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At scale, these failures were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;expensive&lt;/li&gt;
&lt;li&gt;non-reproducible&lt;/li&gt;
&lt;li&gt;impossible to audit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The industry response remained reactive:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Add retries. Add validators. Add guardrails.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This approach did not converge.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. The Core Insight
&lt;/h2&gt;

&lt;p&gt;FACET v2.0 is built on a single foundational realization:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;You cannot build reliable systems on top of nondeterministic contracts.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The problem was not LLMs.&lt;br&gt;
The problem was &lt;strong&gt;lack of a contract layer&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. FACET v2.0 (2025): A Structural Reset
&lt;/h2&gt;

&lt;p&gt;FACET v2.0 was intentionally designed as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a compiler, not a template engine&lt;/li&gt;
&lt;li&gt;a contract system, not a helper library&lt;/li&gt;
&lt;li&gt;an execution model, not a runtime patch&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5.1 Determinism as a System Property
&lt;/h3&gt;

&lt;p&gt;FACET does not attempt to make models deterministic.&lt;/p&gt;

&lt;p&gt;Instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;invalid states are prevented upstream&lt;/li&gt;
&lt;li&gt;contracts are enforced before execution&lt;/li&gt;
&lt;li&gt;outputs are canonicalized&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Determinism is achieved &lt;strong&gt;by architecture&lt;/strong&gt;, not by probability control.&lt;/p&gt;




&lt;h3&gt;
  
  
  5.2 Canonical JSON as Intermediate Representation
&lt;/h3&gt;

&lt;p&gt;FACET introduced &lt;strong&gt;Canonical JSON&lt;/strong&gt; as its IR:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;provider-neutral&lt;/li&gt;
&lt;li&gt;hash-stable&lt;/li&gt;
&lt;li&gt;diff-friendly&lt;/li&gt;
&lt;li&gt;replayable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This decouples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;authoring&lt;/li&gt;
&lt;li&gt;execution&lt;/li&gt;
&lt;li&gt;provider rendering&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;and prevents vendor lock-in.&lt;/p&gt;




&lt;h3&gt;
  
  
  5.3 Execution Phases and R-DAG
&lt;/h3&gt;

&lt;p&gt;FACET formalized execution into five phases:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Resolution&lt;/li&gt;
&lt;li&gt;Type Checking&lt;/li&gt;
&lt;li&gt;Reactive Compute (R-DAG)&lt;/li&gt;
&lt;li&gt;Layout (Token Box Model)&lt;/li&gt;
&lt;li&gt;Render&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This eliminated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;implicit execution order&lt;/li&gt;
&lt;li&gt;hidden side effects&lt;/li&gt;
&lt;li&gt;runtime guesswork&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  5.4 Token Box Model
&lt;/h3&gt;

&lt;p&gt;Context handling was redefined as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a resource allocation problem&lt;/li&gt;
&lt;li&gt;with explicit priorities&lt;/li&gt;
&lt;li&gt;deterministic compression rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This replaced:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;truncation heuristics&lt;/li&gt;
&lt;li&gt;"best effort" packing&lt;/li&gt;
&lt;li&gt;silent loss of critical data&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  5.5 Adapters as Pure Translators
&lt;/h3&gt;

&lt;p&gt;Adapters were intentionally constrained:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no logic&lt;/li&gt;
&lt;li&gt;no inference&lt;/li&gt;
&lt;li&gt;no recovery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This preserves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;auditability&lt;/li&gt;
&lt;li&gt;replayability&lt;/li&gt;
&lt;li&gt;long-term stability&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  6. Rejected Alternatives (By Design)
&lt;/h2&gt;

&lt;p&gt;FACET explicitly rejected:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;probabilistic retries&lt;/li&gt;
&lt;li&gt;self-healing prompts&lt;/li&gt;
&lt;li&gt;adaptive prompt rewriting&lt;/li&gt;
&lt;li&gt;runtime schema repair&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These techniques obscure failure rather than eliminate it.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Long-Term Positioning
&lt;/h2&gt;

&lt;p&gt;FACET is designed to age like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LLVM&lt;/li&gt;
&lt;li&gt;SQL&lt;/li&gt;
&lt;li&gt;JSON Schema&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an agent framework&lt;/li&gt;
&lt;li&gt;a vendor SDK&lt;/li&gt;
&lt;li&gt;a prompt toolkit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is intended to remain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;boring&lt;/li&gt;
&lt;li&gt;strict&lt;/li&gt;
&lt;li&gt;predictable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;for decades.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Historical Attribution
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;FACET — Deterministic Contract Layer (since 2025)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Author: Emil Rokossovskiy (rokoss21)&lt;/p&gt;

&lt;p&gt;The central idea predates industry consensus.&lt;/p&gt;

&lt;p&gt;When determinism became urgent, the architecture already existed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Status
&lt;/h2&gt;

&lt;p&gt;This document is &lt;strong&gt;informative&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It does not define new requirements, but explains why the requirements exist.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;End of document.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>llm</category>
    </item>
    <item>
      <title>FACET Glossary</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Tue, 16 Dec 2025 23:50:17 +0000</pubDate>
      <link>https://dev.to/rokoss21/facet-glossary-16g8</link>
      <guid>https://dev.to/rokoss21/facet-glossary-16g8</guid>
      <description>&lt;p&gt;This glossary defines &lt;strong&gt;normative terminology&lt;/strong&gt; used across the FACET standard, specification, and ecosystem documents.&lt;/p&gt;

&lt;p&gt;All terms listed here are intended to be interpreted consistently across implementations, adapters, documentation, and discussions.&lt;/p&gt;




&lt;h2&gt;
  
  
  FACET
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;FACET&lt;/strong&gt; — A deterministic contract layer and language (NADL) for defining, validating, and executing AI system behavior.&lt;/p&gt;

&lt;p&gt;FACET treats AI behavior as &lt;strong&gt;compiled software&lt;/strong&gt;, not probabilistic improvisation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Determinism
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Determinism&lt;/strong&gt; — The property that identical inputs produce identical outputs.&lt;/p&gt;

&lt;p&gt;In FACET, determinism is defined at the &lt;strong&gt;system level&lt;/strong&gt;, not at the model level.&lt;/p&gt;

&lt;p&gt;Determinism applies to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;execution order&lt;/li&gt;
&lt;li&gt;context layout&lt;/li&gt;
&lt;li&gt;tool-calling semantics&lt;/li&gt;
&lt;li&gt;canonical JSON output&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Contract
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Contract&lt;/strong&gt; — A formally defined, enforced agreement describing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;valid inputs&lt;/li&gt;
&lt;li&gt;valid outputs&lt;/li&gt;
&lt;li&gt;execution constraints&lt;/li&gt;
&lt;li&gt;resource bounds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A contract differs from a schema in that it is &lt;strong&gt;enforced before execution&lt;/strong&gt;, not merely validated after generation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Contract Layer
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Contract Layer&lt;/strong&gt; — The architectural boundary where AI behavior is constrained, validated, and canonicalized.&lt;/p&gt;

&lt;p&gt;The contract layer prevents invalid states from entering execution.&lt;/p&gt;

&lt;p&gt;FACET implements a contract layer via:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;types (FTS)&lt;/li&gt;
&lt;li&gt;interfaces&lt;/li&gt;
&lt;li&gt;execution phases&lt;/li&gt;
&lt;li&gt;Canonical JSON&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  NADL (Neural Architecture Description Language)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;NADL&lt;/strong&gt; — A declarative language used to describe AI system architecture, behavior, and constraints.&lt;/p&gt;

&lt;p&gt;FACET v2.0 is a NADL.&lt;/p&gt;




&lt;h2&gt;
  
  
  Canonical JSON
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Canonical JSON&lt;/strong&gt; — A deterministic, normalized intermediate representation (IR) produced by FACET.&lt;/p&gt;

&lt;p&gt;Canonical JSON is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;provider-agnostic&lt;/li&gt;
&lt;li&gt;structurally stable&lt;/li&gt;
&lt;li&gt;hashable&lt;/li&gt;
&lt;li&gt;replayable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is the &lt;strong&gt;single source of truth&lt;/strong&gt; for execution, caching, testing, and auditing.&lt;/p&gt;




&lt;h2&gt;
  
  
  IR (Intermediate Representation)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Intermediate Representation (IR)&lt;/strong&gt; — A normalized internal form used between compilation stages.&lt;/p&gt;

&lt;p&gt;Canonical JSON serves as FACET’s IR, analogous to LLVM IR.&lt;/p&gt;




&lt;h2&gt;
  
  
  AST (Abstract Syntax Tree)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AST&lt;/strong&gt; — A structured representation of a parsed FACET document.&lt;/p&gt;

&lt;p&gt;The AST is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;immutable after type checking&lt;/li&gt;
&lt;li&gt;the input to R-DAG construction&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  FTS (Facet Type System)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Facet Type System (FTS)&lt;/strong&gt; — A strict, language-neutral type system used by FACET.&lt;/p&gt;

&lt;p&gt;FTS governs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;variable types&lt;/li&gt;
&lt;li&gt;tool interfaces&lt;/li&gt;
&lt;li&gt;lens signatures&lt;/li&gt;
&lt;li&gt;multimodal values&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Interface (&lt;code&gt;@interface&lt;/code&gt;)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Interface&lt;/strong&gt; — A typed contract defining a callable tool.&lt;/p&gt;

&lt;p&gt;Interfaces specify:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;function name&lt;/li&gt;
&lt;li&gt;parameters&lt;/li&gt;
&lt;li&gt;return type&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Interfaces compile into provider-specific tool schemas.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lens
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Lens&lt;/strong&gt; — A transformation function applied to values within FACET.&lt;/p&gt;

&lt;p&gt;Lenses are categorized by trust level:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Level 0 — Pure (fully deterministic)&lt;/li&gt;
&lt;li&gt;Level 1 — Bounded external (deterministic under constraints)&lt;/li&gt;
&lt;li&gt;Level 2 — Volatile (non-deterministic)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  R-DAG (Reactive Dependency Graph)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;R-DAG&lt;/strong&gt; — A directed acyclic graph representing variable dependencies.&lt;/p&gt;

&lt;p&gt;R-DAG guarantees:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no cycles&lt;/li&gt;
&lt;li&gt;deterministic evaluation order&lt;/li&gt;
&lt;li&gt;single execution per node&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Token Box Model
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Token Box Model&lt;/strong&gt; — A deterministic algorithm for context allocation under token budgets.&lt;/p&gt;

&lt;p&gt;It defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;critical vs flexible sections&lt;/li&gt;
&lt;li&gt;compression rules&lt;/li&gt;
&lt;li&gt;drop order&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Adapter
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Adapter&lt;/strong&gt; — A pure translation layer that maps Canonical JSON to provider-specific payloads.&lt;/p&gt;

&lt;p&gt;Adapters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MUST be deterministic&lt;/li&gt;
&lt;li&gt;MUST NOT add logic&lt;/li&gt;
&lt;li&gt;MUST NOT mutate semantics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adapters are translators, not collaborators.&lt;/p&gt;




&lt;h2&gt;
  
  
  Provider Payload
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Provider Payload&lt;/strong&gt; — The final request format required by a specific AI provider API.&lt;/p&gt;

&lt;p&gt;Provider payloads are derived views of Canonical JSON.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pure Mode
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pure Mode&lt;/strong&gt; — An execution mode in which all behavior is fully deterministic.&lt;/p&gt;

&lt;p&gt;Pure Mode forbids:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;randomness&lt;/li&gt;
&lt;li&gt;unrestricted I/O&lt;/li&gt;
&lt;li&gt;volatile lenses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pure Mode outputs are canonical.&lt;/p&gt;




&lt;h2&gt;
  
  
  Execution Mode
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Execution Mode&lt;/strong&gt; — A permissive mode allowing volatile lenses and external side effects.&lt;/p&gt;

&lt;p&gt;Execution Mode outputs are not canonical.&lt;/p&gt;




&lt;h2&gt;
  
  
  Snapshot Testing (Golden Tests)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Snapshot Testing&lt;/strong&gt; — A testing method where output is compared against a known-good snapshot.&lt;/p&gt;

&lt;p&gt;FACET uses snapshot testing for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Canonical JSON&lt;/li&gt;
&lt;li&gt;adapter outputs&lt;/li&gt;
&lt;li&gt;regression detection&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Vendor Lock-in
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Vendor Lock-in&lt;/strong&gt; — Dependency on a specific provider’s undocumented or unstable behavior.&lt;/p&gt;

&lt;p&gt;FACET mitigates vendor lock-in by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;enforcing provider-agnostic Canonical JSON&lt;/li&gt;
&lt;li&gt;isolating provider logic in adapters&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Reproducibility
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Reproducibility&lt;/strong&gt; — The ability to replay executions and obtain identical results.&lt;/p&gt;

&lt;p&gt;Reproducibility in FACET is defined by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FACET document&lt;/li&gt;
&lt;li&gt;inputs&lt;/li&gt;
&lt;li&gt;execution mode&lt;/li&gt;
&lt;li&gt;Canonical JSON&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Invalid State
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Invalid State&lt;/strong&gt; — Any state that violates a contract, type, constraint, or execution rule.&lt;/p&gt;

&lt;p&gt;FACET prevents invalid states &lt;strong&gt;before execution&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;This glossary defines the shared language of the FACET ecosystem.&lt;/p&gt;

&lt;p&gt;Correct use of these terms is required for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;specification compliance&lt;/li&gt;
&lt;li&gt;adapter implementation&lt;/li&gt;
&lt;li&gt;meaningful technical discussion&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Status:&lt;/strong&gt; Normative reference document&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>ai</category>
      <category>architecture</category>
    </item>
    <item>
      <title>FACET vs Existing Approaches</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Tue, 16 Dec 2025 23:43:51 +0000</pubDate>
      <link>https://dev.to/rokoss21/facet-vs-existing-approaches-7e0</link>
      <guid>https://dev.to/rokoss21/facet-vs-existing-approaches-7e0</guid>
      <description>&lt;p&gt;&lt;strong&gt;Status:&lt;/strong&gt; Informative (but engineering-focused)&lt;/p&gt;

&lt;p&gt;This document positions &lt;strong&gt;FACET — Deterministic Contract Layer (since 2025)&lt;/strong&gt; against common industry approaches for structured outputs, tool-calling, and agent orchestration.&lt;/p&gt;

&lt;p&gt;FACET’s core thesis is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Reliability is not a prompt property. It’s a system property.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Most stacks attempt to “coerce” reliability &lt;em&gt;after&lt;/em&gt; the model produces an invalid state (validators, retries, repair prompts). FACET enforces validity &lt;em&gt;before&lt;/em&gt; generation through &lt;strong&gt;compilation, typing, canonicalization, deterministic layout, and replayable artifacts&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Executive Map
&lt;/h2&gt;

&lt;p&gt;FACET is best understood as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A standard&lt;/strong&gt; (spec + conformance levels)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A compiler&lt;/strong&gt; (AST → type-check → R-DAG → Token Box → Canonical JSON)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A contract boundary&lt;/strong&gt; (tool schema + deterministic context + replay)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A provider-decoupling layer&lt;/strong&gt; (Canonical JSON as IR; adapters as views)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you already have an agent stack, FACET is not competing with your &lt;em&gt;business logic&lt;/em&gt;.&lt;br&gt;
It competes with the fragile parts: &lt;strong&gt;prompt glue, schema drift, provider quirks, ad hoc truncation, and non-replayable runs&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Comparison Table (High Signal)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;What it actually is&lt;/th&gt;
&lt;th&gt;Strengths&lt;/th&gt;
&lt;th&gt;Failure mode&lt;/th&gt;
&lt;th&gt;What FACET adds&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;“JSON schema in prompt”&lt;/td&gt;
&lt;td&gt;Best-effort instruction&lt;/td&gt;
&lt;td&gt;Simple, low overhead&lt;/td&gt;
&lt;td&gt;Model deviates; post-hoc repair&lt;/td&gt;
&lt;td&gt;Compile-time contracts + deterministic rejection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SDK tool/function calling&lt;/td&gt;
&lt;td&gt;Vendor tool schema + runtime loop&lt;/td&gt;
&lt;td&gt;Good DX, integrations&lt;/td&gt;
&lt;td&gt;Provider quirks; invalid arguments; streaming drift&lt;/td&gt;
&lt;td&gt;Canonical contracts + deterministic sequencing constraints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pydantic validation (post-hoc)&lt;/td&gt;
&lt;td&gt;Runtime validation of model output&lt;/td&gt;
&lt;td&gt;Strong typing; great errors&lt;/td&gt;
&lt;td&gt;You already paid for a bad sample; retries/repair loops&lt;/td&gt;
&lt;td&gt;Prevent invalid states &lt;em&gt;upstream&lt;/em&gt;; replayable artifacts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Instructor / Guardrails / Output Fixers&lt;/td&gt;
&lt;td&gt;Validators + repair prompting&lt;/td&gt;
&lt;td&gt;Practical mitigation&lt;/td&gt;
&lt;td&gt;“Fixing” can mutate semantics; non-deterministic&lt;/td&gt;
&lt;td&gt;Deterministic compilation + stable IR for audits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent frameworks (LangChain, etc.)&lt;/td&gt;
&lt;td&gt;Orchestration + memory + tools&lt;/td&gt;
&lt;td&gt;Fast iteration&lt;/td&gt;
&lt;td&gt;Hidden heuristics; brittle prompt stacks&lt;/td&gt;
&lt;td&gt;Standard contracts + canonical execution model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;“We just retry”&lt;/td&gt;
&lt;td&gt;Operational band-aid&lt;/td&gt;
&lt;td&gt;Sometimes works&lt;/td&gt;
&lt;td&gt;Cost blowups; latency; silent drift&lt;/td&gt;
&lt;td&gt;Deterministic success criteria; lower ops burden&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Note:&lt;/em&gt; FACET does &lt;strong&gt;not&lt;/strong&gt; replace providers, SDKs, or orchestration frameworks. It standardizes the &lt;em&gt;contract boundary&lt;/em&gt; they all currently treat as “best-effort”.&lt;/p&gt;




&lt;h2&gt;
  
  
  1) FACET vs “JSON Schema in the Prompt”
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What the industry does
&lt;/h3&gt;

&lt;p&gt;Teams paste JSON Schema (or a shape description) into system instructions and hope the model follows it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why it breaks
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A schema in a prompt is &lt;strong&gt;advisory&lt;/strong&gt;, not enforceable.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The model can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;omit required fields&lt;/li&gt;
&lt;li&gt;output wrong types&lt;/li&gt;
&lt;li&gt;hallucinate keys&lt;/li&gt;
&lt;li&gt;emit extra commentary&lt;/li&gt;
&lt;li&gt;violate nested constraints&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;When it fails, systems react with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Try again”&lt;/li&gt;
&lt;li&gt;repair prompts&lt;/li&gt;
&lt;li&gt;regex hacks&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  What FACET does differently
&lt;/h3&gt;

&lt;p&gt;FACET turns schemas into &lt;strong&gt;typed contracts&lt;/strong&gt; enforced by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FTS&lt;/strong&gt; (Facet Type System)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase ordering&lt;/strong&gt; (compile-time checks before render)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Canonical JSON&lt;/strong&gt; (stable structure, explicit nulls)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adapter boundary&lt;/strong&gt; (provider view derived from the same IR)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result: a run either produces a &lt;strong&gt;valid canonical state&lt;/strong&gt; or fails &lt;em&gt;before&lt;/em&gt; it pollutes downstream execution.&lt;/p&gt;




&lt;h2&gt;
  
  
  2) FACET vs Vendor SDK Tool Calling
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What the industry does
&lt;/h3&gt;

&lt;p&gt;Use tool/function calling via OpenAI / Anthropic / Gemini SDKs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why it still breaks in production
&lt;/h3&gt;

&lt;p&gt;Even when using “structured tools”, you face:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;provider-specific sequencing constraints&lt;/li&gt;
&lt;li&gt;tool name casing or normalization differences&lt;/li&gt;
&lt;li&gt;streaming vs non-streaming inconsistencies&lt;/li&gt;
&lt;li&gt;serialization and “invisible dict args” class bugs&lt;/li&gt;
&lt;li&gt;subtle incompatibilities between SDK helpers&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What FACET adds
&lt;/h3&gt;

&lt;p&gt;FACET treats provider constraints as &lt;strong&gt;first-class compile-time inputs&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Interfaces (&lt;code&gt;@interface&lt;/code&gt;) are typed and validated.&lt;/li&gt;
&lt;li&gt;Provider constraints are captured during compilation (targeting profiles).&lt;/li&gt;
&lt;li&gt;Adapters are &lt;strong&gt;passive translators&lt;/strong&gt; (no repair).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You still use the provider SDK. FACET makes the tool-calling boundary deterministic and replayable.&lt;/p&gt;




&lt;h2&gt;
  
  
  3) FACET vs Pydantic (Post-hoc Validation)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Pydantic gives you
&lt;/h3&gt;

&lt;p&gt;Pydantic is excellent at validating Python values against types/models.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where it cannot help alone
&lt;/h3&gt;

&lt;p&gt;Validation after the model output is already generated means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you still pay latency/cost for invalid samples&lt;/li&gt;
&lt;li&gt;you need retry/repair loops&lt;/li&gt;
&lt;li&gt;behavior diverges across providers/modes&lt;/li&gt;
&lt;li&gt;multi-tool chains fail in the middle&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  FACET’s shift in order-of-operations
&lt;/h3&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;generate&lt;/li&gt;
&lt;li&gt;validate&lt;/li&gt;
&lt;li&gt;retry&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;FACET pushes reliability &lt;em&gt;upstream&lt;/em&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;compile contracts&lt;/li&gt;
&lt;li&gt;constrain generation&lt;/li&gt;
&lt;li&gt;reject invalid states before they execute&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Pydantic remains useful &lt;em&gt;inside&lt;/em&gt; the host application — FACET complements it by preventing invalid tool states and making runs replayable.&lt;/p&gt;




&lt;h2&gt;
  
  
  4) FACET vs Guardrails / “Output Fixers” / Repair Prompts
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What these tools do well
&lt;/h3&gt;

&lt;p&gt;They reduce pain quickly by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;validating outputs&lt;/li&gt;
&lt;li&gt;re-asking the model to correct mistakes&lt;/li&gt;
&lt;li&gt;forcing JSON-only responses&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The hidden risk
&lt;/h3&gt;

&lt;p&gt;Repair systems can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;mutate meaning while “fixing” structure&lt;/li&gt;
&lt;li&gt;introduce non-determinism (different fix attempts)&lt;/li&gt;
&lt;li&gt;hide root-cause: your contract boundary is porous&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  FACET’s principle
&lt;/h3&gt;

&lt;p&gt;A contract layer should not &lt;em&gt;patch&lt;/em&gt;.&lt;br&gt;
It should &lt;strong&gt;prevent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;FACET is compatible with guardrails, but flips the default: &lt;strong&gt;deterministic compilation and canonicalization first&lt;/strong&gt;; “repair” becomes optional and explicitly non-canonical.&lt;/p&gt;




&lt;h2&gt;
  
  
  5) FACET vs Agent Frameworks
&lt;/h2&gt;

&lt;p&gt;Agent frameworks are essential for orchestration.&lt;br&gt;
FACET does not compete with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;routing&lt;/li&gt;
&lt;li&gt;memory strategies&lt;/li&gt;
&lt;li&gt;tool registries&lt;/li&gt;
&lt;li&gt;business workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;FACET competes with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompt sprawl&lt;/li&gt;
&lt;li&gt;undocumented heuristics&lt;/li&gt;
&lt;li&gt;non-replayable execution&lt;/li&gt;
&lt;li&gt;vendor lock-in at the message/tool boundary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;FACET can be used as a contract boundary inside any framework.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  6) FACET vs “Just Use Retries”
&lt;/h2&gt;

&lt;p&gt;Retries are the industry’s default reliability strategy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why retries are a tax
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;latency increases non-linearly&lt;/li&gt;
&lt;li&gt;cost becomes unpredictable&lt;/li&gt;
&lt;li&gt;partial failures pollute state&lt;/li&gt;
&lt;li&gt;error handling grows faster than features&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  FACET’s alternative
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;deterministic failure boundaries&lt;/li&gt;
&lt;li&gt;canonical replay&lt;/li&gt;
&lt;li&gt;stable hashing and caching&lt;/li&gt;
&lt;li&gt;snapshot-based regression tests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Operationally, this reduces the “unknown unknowns” that appear at scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  Canonical JSON as IR
&lt;/h2&gt;

&lt;p&gt;FACET’s Canonical JSON is the IR that makes everything else possible:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Diffability:&lt;/strong&gt; stable diffs between runs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hashing:&lt;/strong&gt; stable cache keys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replays:&lt;/strong&gt; deterministic reproduction of incidents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audits:&lt;/strong&gt; exact historical payload reconstruction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor switching:&lt;/strong&gt; adapters render provider payloads as views&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In compiler terms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.facet&lt;/code&gt; = source&lt;/li&gt;
&lt;li&gt;Canonical JSON = IR&lt;/li&gt;
&lt;li&gt;Provider payloads = target-specific codegen&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  LLVM Analogy (For Future Readers)
&lt;/h2&gt;

&lt;p&gt;FACET deliberately follows a compiler architecture familiar to systems engineers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Source language:&lt;/strong&gt; &lt;code&gt;.facet&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Front-end:&lt;/strong&gt; parse → AST → type-check (FTS)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mid-end:&lt;/strong&gt; deterministic evaluation graph (R-DAG)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource allocator:&lt;/strong&gt; Token Box Model (context algebra)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IR:&lt;/strong&gt; Canonical JSON (stable, hashable)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Back-ends:&lt;/strong&gt; provider adapters (OpenAI/Anthropic/Gemini/etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Just as LLVM enabled multiple backends from one IR, FACET enables multiple providers from one canonical contract.&lt;/p&gt;




&lt;h2&gt;
  
  
  When to Use FACET
&lt;/h2&gt;

&lt;p&gt;FACET is strongest when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tool chains are multi-step&lt;/li&gt;
&lt;li&gt;failures are expensive&lt;/li&gt;
&lt;li&gt;you need deterministic replay&lt;/li&gt;
&lt;li&gt;you must support multiple providers&lt;/li&gt;
&lt;li&gt;you need formal governance (tests, audits, compliance)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your use case is a single prompt in a toy script, FACET may be overkill.&lt;br&gt;
If your use case is production agents, it becomes a reliability layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Practical Adoption Paths
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Path A — Contract Boundary Only
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;keep your existing framework&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;introduce FACET only for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;interface contracts&lt;/li&gt;
&lt;li&gt;canonical JSON&lt;/li&gt;
&lt;li&gt;snapshot tests&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Path B — Deterministic Runs for CI
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;use &lt;code&gt;@test&lt;/code&gt; + canonical snapshots&lt;/li&gt;
&lt;li&gt;regress tool schemas and prompts without hitting production&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Path C — Full Hypervisor Profile
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;R-DAG variables&lt;/li&gt;
&lt;li&gt;Token Box Model&lt;/li&gt;
&lt;li&gt;deterministic caching and replay&lt;/li&gt;
&lt;li&gt;adapters per provider&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;FACET is not trying to be “one more wrapper.”&lt;br&gt;
It is a standard and compiler that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;replaces best-effort schemas with enforceable contracts&lt;/li&gt;
&lt;li&gt;makes context layout deterministic&lt;/li&gt;
&lt;li&gt;makes runs replayable and auditable&lt;/li&gt;
&lt;li&gt;isolates vendor churn behind adapters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When reliability becomes urgent, the solution should already be written.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FACET — Deterministic Contract Layer (since 2025)&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>ai</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Compliance Levels</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Tue, 16 Dec 2025 23:35:38 +0000</pubDate>
      <link>https://dev.to/rokoss21/compliance-levels-4phl</link>
      <guid>https://dev.to/rokoss21/compliance-levels-4phl</guid>
      <description>&lt;h2&gt;
  
  
  Purpose
&lt;/h2&gt;

&lt;p&gt;This document defines &lt;strong&gt;compliance levels&lt;/strong&gt; for FACET-related implementations.&lt;/p&gt;

&lt;p&gt;While the FACET v2.0 specification defines what is &lt;em&gt;correct&lt;/em&gt;, compliance levels define &lt;strong&gt;how completely&lt;/strong&gt; a given component (compiler, adapter, runtime, SDK integration) adheres to the FACET contract model.&lt;/p&gt;

&lt;p&gt;This allows the ecosystem to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;distinguish partial integrations from full implementations&lt;/li&gt;
&lt;li&gt;avoid false claims of determinism&lt;/li&gt;
&lt;li&gt;set clear expectations for enterprise use&lt;/li&gt;
&lt;li&gt;evolve the standard without breaking attribution or trust&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Compliance levels are &lt;strong&gt;declarative&lt;/strong&gt; and &lt;strong&gt;auditable&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core Principle
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Not all FACET integrations are equal — and that must be explicit.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A component MUST declare its compliance level.&lt;/p&gt;

&lt;p&gt;Silently claiming "FACET-compatible" without meeting the requirements of a level is considered &lt;strong&gt;non-compliant&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Compliance Levels Overview
&lt;/h2&gt;

&lt;p&gt;FACET defines four compliance levels:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;L0&lt;/td&gt;
&lt;td&gt;Conceptual&lt;/td&gt;
&lt;td&gt;Documentation / ideas only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L1&lt;/td&gt;
&lt;td&gt;Structural&lt;/td&gt;
&lt;td&gt;Canonical JSON &amp;amp; schema adherence&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L2&lt;/td&gt;
&lt;td&gt;Deterministic&lt;/td&gt;
&lt;td&gt;Full determinism &amp;amp; reproducibility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L3&lt;/td&gt;
&lt;td&gt;Reference&lt;/td&gt;
&lt;td&gt;Spec-complete, reference-grade&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Level 0 — Conceptual Compliance (L0)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Audience:&lt;/strong&gt; blog posts, design docs, experimental prototypes&lt;/p&gt;

&lt;h3&gt;
  
  
  Definition
&lt;/h3&gt;

&lt;p&gt;The implementation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;references FACET concepts (contracts, determinism, Canonical JSON)&lt;/li&gt;
&lt;li&gt;does NOT implement formal compilation or guarantees&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Allowed Claims
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;"FACET-inspired"&lt;/li&gt;
&lt;li&gt;"FACET concepts applied"&lt;/li&gt;
&lt;li&gt;"Contract-based approach"&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Forbidden Claims
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;deterministic execution&lt;/li&gt;
&lt;li&gt;reproducibility guarantees&lt;/li&gt;
&lt;li&gt;FACET-compatible&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Notes
&lt;/h3&gt;

&lt;p&gt;L0 is &lt;strong&gt;not an implementation level&lt;/strong&gt;.&lt;br&gt;
It exists to allow discussion without misleading users.&lt;/p&gt;




&lt;h2&gt;
  
  
  Level 1 — Structural Compliance (L1)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Audience:&lt;/strong&gt; SDK extensions, tooling, lightweight integrations&lt;/p&gt;

&lt;h3&gt;
  
  
  Definition
&lt;/h3&gt;

&lt;p&gt;The implementation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;produces or consumes &lt;strong&gt;Canonical JSON&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;follows canonical ordering and explicit null rules&lt;/li&gt;
&lt;li&gt;enforces schema shape stability&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Required Properties
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;stable key ordering&lt;/li&gt;
&lt;li&gt;explicit &lt;code&gt;null&lt;/code&gt; for missing optional fields&lt;/li&gt;
&lt;li&gt;deterministic serialization&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Non-Requirements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;full R-DAG execution&lt;/li&gt;
&lt;li&gt;Token Box Model&lt;/li&gt;
&lt;li&gt;strict determinism across runs&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Allowed Claims
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;"FACET-compatible (structural)"&lt;/li&gt;
&lt;li&gt;"Canonical JSON compliant"&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;logging / auditing tools&lt;/li&gt;
&lt;li&gt;snapshot testing harnesses&lt;/li&gt;
&lt;li&gt;visualization layers&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Level 2 — Deterministic Compliance (L2)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Audience:&lt;/strong&gt; production agent systems, enterprise deployments&lt;/p&gt;

&lt;h3&gt;
  
  
  Definition
&lt;/h3&gt;

&lt;p&gt;The implementation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fully enforces &lt;strong&gt;deterministic execution&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;produces identical Canonical JSON for identical inputs&lt;/li&gt;
&lt;li&gt;rejects invalid states &lt;em&gt;before&lt;/em&gt; provider execution&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Required Properties
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;strict Facet Type System (FTS)&lt;/li&gt;
&lt;li&gt;deterministic R-DAG execution&lt;/li&gt;
&lt;li&gt;deterministic Token Box Model layout&lt;/li&gt;
&lt;li&gt;canonical JSON as the single source of truth&lt;/li&gt;
&lt;li&gt;no retries as a correctness mechanism&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Guarantees
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;reproducible outputs&lt;/li&gt;
&lt;li&gt;stable hashing&lt;/li&gt;
&lt;li&gt;replayable executions&lt;/li&gt;
&lt;li&gt;deterministic failure modes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Allowed Claims
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;"Deterministic"&lt;/li&gt;
&lt;li&gt;"FACET-compliant"&lt;/li&gt;
&lt;li&gt;"Reproducible agent execution"&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Level 3 — Reference Compliance (L3)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Audience:&lt;/strong&gt; standards bodies, auditors, long-term infrastructure&lt;/p&gt;

&lt;h3&gt;
  
  
  Definition
&lt;/h3&gt;

&lt;p&gt;The implementation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;satisfies &lt;strong&gt;all&lt;/strong&gt; FACET v2.0 normative requirements&lt;/li&gt;
&lt;li&gt;passes the official FACET golden test suite&lt;/li&gt;
&lt;li&gt;is suitable as a &lt;strong&gt;reference implementation&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Required Properties
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;full spec coverage (all execution phases)&lt;/li&gt;
&lt;li&gt;golden tests with published fixtures&lt;/li&gt;
&lt;li&gt;strict adapter requirements&lt;/li&gt;
&lt;li&gt;hermetic execution guarantees&lt;/li&gt;
&lt;li&gt;documented versioning and change history&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Privileges
&lt;/h3&gt;

&lt;p&gt;Only L3 implementations may claim:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"FACET Reference Implementation"&lt;/li&gt;
&lt;li&gt;"Spec-complete"&lt;/li&gt;
&lt;li&gt;"FACET Standard"&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Adapters and Compliance
&lt;/h2&gt;

&lt;p&gt;Provider adapters have &lt;strong&gt;their own compliance axis&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;An adapter may be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;L1 compliant (structural mapping only)&lt;/li&gt;
&lt;li&gt;L2 compliant (deterministic mapping + golden tests)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adapters can &lt;strong&gt;never&lt;/strong&gt; be L3 on their own.&lt;br&gt;
They inherit system-level compliance.&lt;/p&gt;




&lt;h2&gt;
  
  
  Misrepresentation Clause
&lt;/h2&gt;

&lt;p&gt;Claiming a higher compliance level than implemented is a &lt;strong&gt;spec violation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Non-compliant claims:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Deterministic" without reproducibility&lt;/li&gt;
&lt;li&gt;"FACET-compatible" without Canonical JSON&lt;/li&gt;
&lt;li&gt;"Standard" without spec coverage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Such claims invalidate trust and interoperability.&lt;/p&gt;




&lt;h2&gt;
  
  
  Rationale
&lt;/h2&gt;

&lt;p&gt;Compliance levels exist to prevent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;marketing-driven overclaims&lt;/li&gt;
&lt;li&gt;partial integrations masquerading as standards&lt;/li&gt;
&lt;li&gt;ecosystem fragmentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A deterministic contract layer only works if trust is explicit.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;FACET compliance is not binary.&lt;/p&gt;

&lt;p&gt;It is &lt;strong&gt;tiered, explicit, and enforceable&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If a system does not declare its compliance level, it has none.&lt;/p&gt;




&lt;h2&gt;
  
  
  Status
&lt;/h2&gt;

&lt;p&gt;This document defines &lt;strong&gt;normative compliance levels&lt;/strong&gt; for the FACET ecosystem.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>ai</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Adapter Requirements</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Tue, 16 Dec 2025 23:31:04 +0000</pubDate>
      <link>https://dev.to/rokoss21/adapter-requirements-5de8</link>
      <guid>https://dev.to/rokoss21/adapter-requirements-5de8</guid>
      <description>&lt;h2&gt;
  
  
  Purpose
&lt;/h2&gt;

&lt;p&gt;This document defines &lt;strong&gt;normative requirements&lt;/strong&gt; for FACET-compatible provider adapters.&lt;/p&gt;

&lt;p&gt;Adapters are the only layer allowed to translate &lt;strong&gt;Canonical JSON&lt;/strong&gt; into provider-specific payloads (OpenAI, Anthropic, Gemini, local runtimes, etc.).&lt;/p&gt;

&lt;p&gt;They exist to &lt;strong&gt;map&lt;/strong&gt;, not to interpret, fix, enrich, or re‑decide behavior.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Adapters are translators, not collaborators.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Core Principle
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Adapters MUST be behaviorally passive.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;They MUST NOT:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;introduce new logic&lt;/li&gt;
&lt;li&gt;infer missing data&lt;/li&gt;
&lt;li&gt;reorder execution semantics&lt;/li&gt;
&lt;li&gt;apply provider-specific heuristics&lt;/li&gt;
&lt;li&gt;silently recover from invalid states&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All intelligence, validation, and determinism belong &lt;strong&gt;above&lt;/strong&gt; the adapter boundary.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architectural Position
&lt;/h2&gt;

&lt;p&gt;FACET enforces a strict layered architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.facet document
      ↓
Typed AST
      ↓
R-DAG execution
      ↓
Token Box Model
      ↓
Canonical JSON   ← SINGLE SOURCE OF TRUTH
      ↓
Provider Adapter
      ↓
Provider Payload
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adapters operate &lt;strong&gt;only&lt;/strong&gt; on Canonical JSON.&lt;br&gt;
They MUST NOT accept partially-compiled or provider-shaped inputs.&lt;/p&gt;


&lt;h2&gt;
  
  
  Mandatory Adapter Properties
&lt;/h2&gt;

&lt;p&gt;A compliant adapter MUST satisfy &lt;strong&gt;all&lt;/strong&gt; of the following.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Deterministic Mapping
&lt;/h3&gt;

&lt;p&gt;Given identical Canonical JSON input:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;adapter output MUST be byte-for-byte identical&lt;/li&gt;
&lt;li&gt;mapping MUST be pure and stateless&lt;/li&gt;
&lt;li&gt;no randomness, clocks, environment state, or I/O allowed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adapters MUST be referentially transparent functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;output = adapter(canonical_json)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  2. No Semantic Repair
&lt;/h3&gt;

&lt;p&gt;Adapters MUST NOT attempt to "fix" provider constraints by modifying semantics.&lt;/p&gt;

&lt;p&gt;Forbidden behaviors include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;renaming tools to match provider casing quirks&lt;/li&gt;
&lt;li&gt;injecting missing fields&lt;/li&gt;
&lt;li&gt;reordering messages to satisfy undocumented rules&lt;/li&gt;
&lt;li&gt;splitting or merging tool calls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If Canonical JSON violates a provider constraint, the adapter MUST fail loudly.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Silent recovery is corruption.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  3. Provider Constraints Are Declarative Inputs
&lt;/h3&gt;

&lt;p&gt;All provider-specific constraints MUST be declared &lt;strong&gt;upstream&lt;/strong&gt;, during compilation.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;required tool-call turn ordering&lt;/li&gt;
&lt;li&gt;serialization restrictions&lt;/li&gt;
&lt;li&gt;streaming limitations&lt;/li&gt;
&lt;li&gt;tool name casing rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adapters may only &lt;strong&gt;apply&lt;/strong&gt; constraints that were already resolved into Canonical JSON.&lt;/p&gt;

&lt;p&gt;They MUST NOT discover or infer constraints dynamically.&lt;/p&gt;

&lt;p&gt;This requirement implies that provider targeting is an explicit compilation choice&lt;br&gt;
(e.g. &lt;code&gt;target = "gemini"&lt;/code&gt;, &lt;code&gt;profile = "strict_chat"&lt;/code&gt;), not a runtime adaptation.&lt;/p&gt;

&lt;p&gt;Adapters MUST NOT compensate for missing or incorrect target selection.&lt;/p&gt;




&lt;h3&gt;
  
  
  4. One-to-One Structural Mapping
&lt;/h3&gt;

&lt;p&gt;Adapters MUST preserve structure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one canonical tool → one provider tool definition&lt;/li&gt;
&lt;li&gt;one canonical message → one provider message&lt;/li&gt;
&lt;li&gt;explicit &lt;code&gt;null&lt;/code&gt; fields MUST remain explicit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adapters MUST NOT:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;collapse multiple messages&lt;/li&gt;
&lt;li&gt;expand single messages&lt;/li&gt;
&lt;li&gt;drop empty or null fields&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  5. Failure Containment
&lt;/h3&gt;

&lt;p&gt;Adapters MUST be a &lt;strong&gt;failure boundary&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If a provider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;rejects a payload&lt;/li&gt;
&lt;li&gt;changes undocumented behavior&lt;/li&gt;
&lt;li&gt;introduces breaking changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The failure MUST:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;surface as an adapter error&lt;/li&gt;
&lt;li&gt;NOT mutate Canonical JSON&lt;/li&gt;
&lt;li&gt;NOT poison caches or history&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Canonical JSON remains valid and replayable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Explicit Prohibitions
&lt;/h2&gt;

&lt;p&gt;Adapters MUST be safe to execute in zero-trust environments.&lt;/p&gt;

&lt;p&gt;Adapters MUST NOT:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;perform validation (already done by compiler)&lt;/li&gt;
&lt;li&gt;run type checks&lt;/li&gt;
&lt;li&gt;execute lenses&lt;/li&gt;
&lt;li&gt;call LLMs&lt;/li&gt;
&lt;li&gt;fetch external resources&lt;/li&gt;
&lt;li&gt;access filesystem or network&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adapters are not execution engines.&lt;/p&gt;




&lt;h2&gt;
  
  
  Versioning Requirements
&lt;/h2&gt;

&lt;p&gt;Adapters MUST:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;declare supported Canonical JSON version(s)&lt;/li&gt;
&lt;li&gt;fail on incompatible versions&lt;/li&gt;
&lt;li&gt;be forward-incompatible by default&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This prevents silent misinterpretation of newer contracts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Testing Requirements
&lt;/h2&gt;

&lt;p&gt;Every adapter implementation MUST include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Golden tests&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Canonical JSON input → exact provider payload snapshot&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Negative tests&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;invalid Canonical JSON → deterministic failure&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Round-trip safety&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;adapter output MUST NOT affect canonical replay hashes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Snapshot tests MUST be stable across environments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Relationship to Reproducibility
&lt;/h2&gt;

&lt;p&gt;Adapters MUST NOT compromise reproducibility guarantees.&lt;/p&gt;

&lt;p&gt;Reproducibility is defined entirely by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FACET document&lt;/li&gt;
&lt;li&gt;inputs&lt;/li&gt;
&lt;li&gt;execution mode&lt;/li&gt;
&lt;li&gt;Canonical JSON&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adapters are excluded from the reproducibility contract.&lt;/p&gt;

&lt;p&gt;They are replaceable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design Rationale
&lt;/h2&gt;

&lt;p&gt;Why adapters are intentionally constrained:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;to prevent vendor lock-in&lt;/li&gt;
&lt;li&gt;to localize API churn&lt;/li&gt;
&lt;li&gt;to enable long-term replay and auditing&lt;/li&gt;
&lt;li&gt;to keep the compiler authoritative&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once adapters are allowed to "help", determinism collapses.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Adapters exist to answer one question only:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How does this Canonical JSON look in &lt;em&gt;this&lt;/em&gt; provider’s dialect?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Anything beyond that violates the contract.&lt;/p&gt;




&lt;h2&gt;
  
  
  Status
&lt;/h2&gt;

&lt;p&gt;This document defines &lt;strong&gt;normative requirements&lt;/strong&gt; for FACET-compatible adapters.&lt;/p&gt;

&lt;p&gt;Any adapter violating these rules is &lt;strong&gt;non-compliant by design&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>ai</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Tool-Calling Failure Modes</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Tue, 16 Dec 2025 23:23:00 +0000</pubDate>
      <link>https://dev.to/rokoss21/tool-calling-failure-modes-1ofm</link>
      <guid>https://dev.to/rokoss21/tool-calling-failure-modes-1ofm</guid>
      <description>&lt;h2&gt;
  
  
  Purpose
&lt;/h2&gt;

&lt;p&gt;This document catalogs &lt;strong&gt;real, recurring failure modes&lt;/strong&gt; observed in LLM tool-calling systems across major providers and agent frameworks.&lt;/p&gt;

&lt;p&gt;Its goal is to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;make failures &lt;em&gt;explicit and enumerable&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;demonstrate that these failures are &lt;strong&gt;systemic&lt;/strong&gt;, not user error&lt;/li&gt;
&lt;li&gt;show why post-hoc validation and retries are structurally insufficient&lt;/li&gt;
&lt;li&gt;define the problem space a deterministic contract layer must solve&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a critique of any single provider.&lt;br&gt;
It is a taxonomy of failure modes that emerge when &lt;strong&gt;probabilistic generation is asked to satisfy implicit contracts&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core Observation
&lt;/h2&gt;

&lt;p&gt;Tool calling today fails not because models are weak, but because:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tool contracts are implicit, informal, and enforced only after generation.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;LLMs are expected to infer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;schema shape&lt;/li&gt;
&lt;li&gt;parameter types&lt;/li&gt;
&lt;li&gt;tool names&lt;/li&gt;
&lt;li&gt;sequencing rules&lt;/li&gt;
&lt;li&gt;provider-specific constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…without those constraints being part of the execution model.&lt;/p&gt;

&lt;p&gt;The result is a predictable set of failure classes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Failure Class 1: Schema Shape Violations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;The model produces a tool call whose JSON structure does not match the declared schema.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;missing required fields&lt;/li&gt;
&lt;li&gt;extra unexpected fields&lt;/li&gt;
&lt;li&gt;wrong nesting depth&lt;/li&gt;
&lt;li&gt;arrays where objects are expected&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Real-World Symptoms
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Pydantic validation errors&lt;/li&gt;
&lt;li&gt;silent field dropping&lt;/li&gt;
&lt;li&gt;runtime exceptions after generation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why Retries Fail
&lt;/h3&gt;

&lt;p&gt;Retries re-sample from the same unconstrained distribution.&lt;br&gt;
They reduce probability but &lt;strong&gt;do not eliminate invalid states&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Failure Class 2: Type Mismatches
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;The model emits values of the wrong type for otherwise valid fields.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;numbers as strings (&lt;code&gt;"42"&lt;/code&gt; instead of &lt;code&gt;42&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;booleans as text (&lt;code&gt;"true"&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;objects serialized as strings&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Real-World Symptoms
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;deserialization failures&lt;/li&gt;
&lt;li&gt;silent coercion bugs&lt;/li&gt;
&lt;li&gt;inconsistent behavior across SDKs&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Root Cause
&lt;/h3&gt;

&lt;p&gt;Schemas exist only as &lt;em&gt;instructions&lt;/em&gt;, not as &lt;strong&gt;constraints on generation&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Failure Class 3: Tool Name Drift
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;The model references a tool name that does not exactly match the declared identifier.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;casing drift (&lt;code&gt;process_payment&lt;/code&gt; → &lt;code&gt;Process_Payment&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;partial names (&lt;code&gt;search&lt;/code&gt; → &lt;code&gt;search_docs&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;hallucinated tool names&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Impact
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;downstream dispatch failure&lt;/li&gt;
&lt;li&gt;silent no-op behavior&lt;/li&gt;
&lt;li&gt;hard-to-debug agent stalls&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Failure Class 4: Parameter Visibility Loss
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;Certain parameter shapes are ignored or dropped by provider APIs or SDK layers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;dict&lt;/code&gt; arguments not visible to OpenAI-powered agents&lt;/li&gt;
&lt;li&gt;binary payloads failing serialization&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Impact
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;tools invoked with incomplete inputs&lt;/li&gt;
&lt;li&gt;agents behaving inconsistently between sync and stream modes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Root Cause
&lt;/h3&gt;

&lt;p&gt;Mismatch between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;declared tool schema&lt;/li&gt;
&lt;li&gt;provider transport format&lt;/li&gt;
&lt;li&gt;SDK serialization logic&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Failure Class 5: Sequencing Violations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;The model produces a &lt;em&gt;valid-looking&lt;/em&gt; tool call at an &lt;strong&gt;invalid point in the conversation&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Gemini requiring tool calls immediately after user or tool response turns&lt;/li&gt;
&lt;li&gt;tool calls emitted after assistant messages&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Symptoms
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;provider-side &lt;code&gt;INVALID_ARGUMENT&lt;/code&gt; errors&lt;/li&gt;
&lt;li&gt;conversation reset or termination&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why This Is Fundamental
&lt;/h3&gt;

&lt;p&gt;Sequencing rules are &lt;strong&gt;provider-specific&lt;/strong&gt; and &lt;strong&gt;not visible to the model&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Failure Class 6: Streaming vs Non-Streaming Drift
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;The same agent behaves differently in streaming and non-streaming modes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;tool calls appearing only in one mode&lt;/li&gt;
&lt;li&gt;different output shapes&lt;/li&gt;
&lt;li&gt;missing final tool invocation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Impact
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;non-reproducible behavior&lt;/li&gt;
&lt;li&gt;broken production parity&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Failure Class 7: Multi-Tool Chain Collapse
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;Agents fail when chaining multiple tools in a single reasoning flow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Symptoms
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;early termination&lt;/li&gt;
&lt;li&gt;partial execution&lt;/li&gt;
&lt;li&gt;invalid intermediate state&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Root Cause
&lt;/h3&gt;

&lt;p&gt;Each tool call compounds uncertainty.&lt;br&gt;
Without contracts, error probability grows multiplicatively.&lt;/p&gt;




&lt;h2&gt;
  
  
  Failure Class 8: Context-Induced Tool Corruption
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;Tool calls degrade as context grows or is truncated heuristically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;truncated tool schema&lt;/li&gt;
&lt;li&gt;partial parameter emission&lt;/li&gt;
&lt;li&gt;hallucinated defaults&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Root Cause
&lt;/h3&gt;

&lt;p&gt;Context overflow handled by truncation, not allocation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Validation and Retries Cannot Fix This
&lt;/h2&gt;

&lt;p&gt;Post-generation validation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;detects invalid states &lt;strong&gt;after they exist&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;cannot prevent invalid intermediate steps&lt;/li&gt;
&lt;li&gt;cannot guarantee convergence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Retries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;reduce probability&lt;/li&gt;
&lt;li&gt;increase cost&lt;/li&gt;
&lt;li&gt;do not change the state space&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is equivalent to catching compiler errors at runtime.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Missing Layer
&lt;/h2&gt;

&lt;p&gt;All listed failures share one property:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;They occur because tool contracts are not part of the execution model.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A deterministic system must:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;encode schema, types, and sequencing &lt;em&gt;before generation&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;reject invalid states &lt;em&gt;before emission&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;treat provider constraints as first-class&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the problem space FACET addresses at the contract layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Status
&lt;/h2&gt;

&lt;p&gt;This document defines an &lt;strong&gt;informative but implementation-grounded taxonomy&lt;/strong&gt; of tool-calling failures.&lt;/p&gt;

&lt;p&gt;It is intended to support:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;adapter design&lt;/li&gt;
&lt;li&gt;contract systems&lt;/li&gt;
&lt;li&gt;future standardization efforts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The failures described here are not hypothetical.&lt;br&gt;
They are observed, reproducible, and systemic.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Token Box Model</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Tue, 16 Dec 2025 23:17:42 +0000</pubDate>
      <link>https://dev.to/rokoss21/token-box-model-1kkb</link>
      <guid>https://dev.to/rokoss21/token-box-model-1kkb</guid>
      <description>&lt;h2&gt;
  
  
  Purpose
&lt;/h2&gt;

&lt;p&gt;The Token Box Model defines a &lt;strong&gt;deterministic context allocation algorithm&lt;/strong&gt; for LLM execution.&lt;/p&gt;

&lt;p&gt;Its purpose is to replace ad-hoc truncation, heuristic compression, and retry-based prompt handling with a &lt;strong&gt;formal, reproducible layout model&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In FACET, context is not a side-effect of string concatenation — it is a &lt;strong&gt;compiled artifact&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Problem Statement
&lt;/h2&gt;

&lt;p&gt;Modern LLM systems fail under context pressure because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;token limits are enforced late (after prompt assembly)&lt;/li&gt;
&lt;li&gt;truncation is implicit and non-deterministic&lt;/li&gt;
&lt;li&gt;critical instructions may be silently dropped&lt;/li&gt;
&lt;li&gt;different runs drop different parts of context&lt;/li&gt;
&lt;li&gt;provider tokenizers behave differently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This leads to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;non-reproducible agent behavior&lt;/li&gt;
&lt;li&gt;debugging instability&lt;/li&gt;
&lt;li&gt;production-only failures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Token Box Model addresses this by making &lt;strong&gt;context layout explicit, typed, and deterministic&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core Concept
&lt;/h2&gt;

&lt;p&gt;The context is treated as a &lt;strong&gt;finite-capacity container&lt;/strong&gt; with a fixed token budget.&lt;/p&gt;

&lt;p&gt;Each logical block of prompt data is represented as a &lt;strong&gt;Section&lt;/strong&gt; with explicit layout constraints.&lt;/p&gt;

&lt;p&gt;The compiler is responsible for fitting all sections into the available budget &lt;strong&gt;without violating invariants&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Section Definition
&lt;/h2&gt;

&lt;p&gt;Each Section has the following properties:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;priority&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;int&lt;/td&gt;
&lt;td&gt;Removal order (lower = dropped earlier)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;base_size&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;int&lt;/td&gt;
&lt;td&gt;Token count after render&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;min&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;int&lt;/td&gt;
&lt;td&gt;Minimum guaranteed size&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;grow&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;float&lt;/td&gt;
&lt;td&gt;Weight for expansion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;shrink&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;float&lt;/td&gt;
&lt;td&gt;Weight for compression&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;strategy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;LensPipeline&lt;/td&gt;
&lt;td&gt;Compression strategy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Critical Sections
&lt;/h3&gt;

&lt;p&gt;A Section is &lt;strong&gt;Critical&lt;/strong&gt; if:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;shrink == 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Critical sections:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MUST NOT be compressed&lt;/li&gt;
&lt;li&gt;MUST NOT be truncated&lt;/li&gt;
&lt;li&gt;MUST NOT be dropped&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If all critical sections do not fit, execution &lt;strong&gt;MUST fail&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deterministic Algorithm
&lt;/h2&gt;

&lt;p&gt;Let:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;S&lt;/code&gt; = all sections&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;B&lt;/code&gt; = token budget&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;size[i]&lt;/code&gt; = base_size of section &lt;code&gt;i&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 1 — Fixed Load
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Critical = { i | shrink[i] == 0 }
FixedLoad = sum(size[i] for i in Critical)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FixedLoad &amp;gt; B
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;→ &lt;strong&gt;FAIL&lt;/strong&gt; with &lt;code&gt;ContextCriticalOverflow&lt;/code&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 2 — Free Space
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FreeSpace = B - FixedLoad
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 3 — Expansion (Optional)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Expandable = { i | grow[i] &amp;gt; 0 }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FreeSpace MAY be distributed proportionally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;extra[i] = FreeSpace * (grow[i] / sum(grow))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 4 — Compression
&lt;/h3&gt;

&lt;p&gt;If total size exceeds budget:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Deficit = total_size - B
Flexible = { i | shrink[i] &amp;gt; 0 }
Sort Flexible by (priority ASC, shrink DESC)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For each section:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Apply compression strategy&lt;/li&gt;
&lt;li&gt;Recompute size&lt;/li&gt;
&lt;li&gt;Truncate to &lt;code&gt;min&lt;/code&gt; if needed&lt;/li&gt;
&lt;li&gt;Drop section if still oversized&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Stop when &lt;code&gt;Deficit &amp;lt;= 0&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Determinism Guarantees
&lt;/h2&gt;

&lt;p&gt;Given:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;identical sections&lt;/li&gt;
&lt;li&gt;identical priorities&lt;/li&gt;
&lt;li&gt;identical token budget&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The resulting context layout is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;byte-for-byte identical&lt;/li&gt;
&lt;li&gt;order-stable&lt;/li&gt;
&lt;li&gt;provider-independent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes context &lt;strong&gt;cacheable, diffable, and replayable&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;Without a formal layout model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;retries hide bugs&lt;/li&gt;
&lt;li&gt;prompt behavior drifts&lt;/li&gt;
&lt;li&gt;context loss is invisible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With the Token Box Model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;failures are explicit&lt;/li&gt;
&lt;li&gt;critical instructions are protected&lt;/li&gt;
&lt;li&gt;behavior is reproducible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This turns context handling from a heuristic into an &lt;strong&gt;engineering discipline&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Relationship to FACET Execution
&lt;/h2&gt;

&lt;p&gt;The Token Box Model is executed in:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 4 — Layout&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Inputs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;computed variable values&lt;/li&gt;
&lt;li&gt;rendered sections&lt;/li&gt;
&lt;li&gt;token budget&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Output:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;finalized ordered context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Any violation aborts execution before provider interaction.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design Principle
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Context is not text.&lt;br&gt;
Context is a resource.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Token Box Model makes that resource explicit, bounded, and deterministic.&lt;/p&gt;




&lt;h2&gt;
  
  
  Status
&lt;/h2&gt;

&lt;p&gt;This document defines the &lt;strong&gt;normative Token Box Model&lt;/strong&gt; for FACET v2.0 and later.&lt;/p&gt;

&lt;p&gt;All compliant implementations MUST follow this algorithm when performing context layout.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Adapter Philosophy</title>
      <dc:creator>rokoss21</dc:creator>
      <pubDate>Tue, 16 Dec 2025 23:12:05 +0000</pubDate>
      <link>https://dev.to/rokoss21/adapter-philosophy-65h</link>
      <guid>https://dev.to/rokoss21/adapter-philosophy-65h</guid>
      <description>&lt;h2&gt;
  
  
  Purpose
&lt;/h2&gt;

&lt;p&gt;This document defines the &lt;strong&gt;architectural role and strict limitations&lt;/strong&gt; of provider adapters in the FACET ecosystem.&lt;/p&gt;

&lt;p&gt;Adapters exist to translate &lt;strong&gt;Canonical JSON&lt;/strong&gt; into provider-specific payloads. They are not execution engines, not sources of truth, and not places where logic is allowed to accumulate.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Adapters are translators, not decision-makers.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This principle is foundational to FACET’s long-term correctness, reproducibility, and vendor independence.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core Rule
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;All semantic decisions MUST be completed before an adapter is invoked.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once Canonical JSON exists:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No logic may be added&lt;/li&gt;
&lt;li&gt;No structure may be inferred&lt;/li&gt;
&lt;li&gt;No defaults may be applied&lt;/li&gt;
&lt;li&gt;No recovery heuristics may run&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adapters perform &lt;strong&gt;mechanical transformation only&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Adapters Must Be Dumb
&lt;/h2&gt;

&lt;p&gt;Modern LLM stacks routinely collapse because adapters grow "helpful" behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;filling in missing fields&lt;/li&gt;
&lt;li&gt;renaming tools dynamically&lt;/li&gt;
&lt;li&gt;reordering messages to satisfy undocumented rules&lt;/li&gt;
&lt;li&gt;retrying failed calls with mutated payloads&lt;/li&gt;
&lt;li&gt;patching provider quirks ad hoc&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This creates systems where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;behavior differs per provider&lt;/li&gt;
&lt;li&gt;bugs cannot be reproduced&lt;/li&gt;
&lt;li&gt;audits become impossible&lt;/li&gt;
&lt;li&gt;fixes introduce new regressions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;FACET treats this as an &lt;strong&gt;architectural failure&lt;/strong&gt;, not an implementation detail.&lt;/p&gt;




&lt;h2&gt;
  
  
  Canonical JSON as the Contract Boundary
&lt;/h2&gt;

&lt;p&gt;Adapters consume Canonical JSON and emit provider payloads.&lt;/p&gt;

&lt;p&gt;They MUST treat Canonical JSON as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;immutable&lt;/li&gt;
&lt;li&gt;complete&lt;/li&gt;
&lt;li&gt;authoritative&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If a provider rejects a payload derived from valid Canonical JSON, the adapter MUST fail loudly.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Adapters are not allowed to “make it work”.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Failure is information. Mutation is corruption.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Adapters Are Allowed To Do
&lt;/h2&gt;

&lt;p&gt;Adapters MAY:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;rename fields to match provider APIs&lt;/li&gt;
&lt;li&gt;transform message layouts (e.g. roles → blocks)&lt;/li&gt;
&lt;li&gt;map FACET interfaces to provider tool schemas&lt;/li&gt;
&lt;li&gt;attach provider-required metadata&lt;/li&gt;
&lt;li&gt;split or merge fields when explicitly specified&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All transformations MUST be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deterministic&lt;/li&gt;
&lt;li&gt;stateless&lt;/li&gt;
&lt;li&gt;reversible in principle&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Adapters Are Forbidden To Do
&lt;/h2&gt;

&lt;p&gt;Adapters MUST NOT:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;infer missing values&lt;/li&gt;
&lt;li&gt;change execution order&lt;/li&gt;
&lt;li&gt;modify tool arguments&lt;/li&gt;
&lt;li&gt;drop or add messages&lt;/li&gt;
&lt;li&gt;reinterpret context priorities&lt;/li&gt;
&lt;li&gt;retry with modified payloads&lt;/li&gt;
&lt;li&gt;apply provider-specific heuristics silently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Any of the above breaks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;determinism&lt;/li&gt;
&lt;li&gt;reproducibility&lt;/li&gt;
&lt;li&gt;trust&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Failure Containment Model
&lt;/h2&gt;

&lt;p&gt;FACET intentionally localizes all provider-specific failures to the adapter layer.&lt;/p&gt;

&lt;p&gt;If a provider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;changes an API&lt;/li&gt;
&lt;li&gt;introduces undocumented constraints&lt;/li&gt;
&lt;li&gt;breaks streaming semantics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Canonical JSON remains valid&lt;/li&gt;
&lt;li&gt;stored executions remain replayable&lt;/li&gt;
&lt;li&gt;only the adapter needs updating&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This sharply bounds blast radius and prevents systemic corruption.&lt;/p&gt;




&lt;h2&gt;
  
  
  Adapters vs Frameworks
&lt;/h2&gt;

&lt;p&gt;Most agent frameworks embed logic &lt;em&gt;inside&lt;/em&gt; provider integrations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent Logic
  ↓
Provider Wrapper
  ↓
Model
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FACET inverts this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FACET Compiler
  ↓
Canonical JSON (IR)
  ↓
Adapter (View)
  ↓
Provider
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This inversion is what makes determinism possible.&lt;/p&gt;




&lt;h2&gt;
  
  
  Adapters Are Replaceable
&lt;/h2&gt;

&lt;p&gt;Because adapters are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stateless&lt;/li&gt;
&lt;li&gt;mechanical&lt;/li&gt;
&lt;li&gt;non-authoritative&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They can be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;swapped&lt;/li&gt;
&lt;li&gt;versioned independently&lt;/li&gt;
&lt;li&gt;rewritten without touching agent logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Vendor lock-in becomes structurally impossible.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design Principle
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;If fixing a bug requires changing adapter logic, the bug was upstream.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Adapters reveal incompatibilities; they do not hide them.&lt;/p&gt;

&lt;p&gt;This is the only sustainable way to build systems that survive provider churn.&lt;/p&gt;




&lt;h2&gt;
  
  
  Status
&lt;/h2&gt;

&lt;p&gt;This document defines the &lt;strong&gt;normative adapter philosophy&lt;/strong&gt; for FACET-based systems.&lt;/p&gt;

&lt;p&gt;Any implementation that embeds decision-making logic inside adapters is &lt;strong&gt;non-compliant by design&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>design</category>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
