<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Valentyn Solomko</title>
    <description>The latest articles on DEV Community by Valentyn Solomko (@valpere).</description>
    <link>https://dev.to/valpere</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3672579%2F95144aa7-82c1-4468-a137-0a5f885c5604.jpg</url>
      <title>DEV Community: Valentyn Solomko</title>
      <link>https://dev.to/valpere</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/valpere"/>
    <language>en</language>
    <item>
      <title>AI Agent Cost Optimization — Sonnet Hybrid (Haiku + OpenRouter)</title>
      <dc:creator>Valentyn Solomko</dc:creator>
      <pubDate>Mon, 27 Apr 2026 10:56:33 +0000</pubDate>
      <link>https://dev.to/valpere/ai-agent-cost-optimization-sonnet-hybrid-haiku-openrouter-3aa3</link>
      <guid>https://dev.to/valpere/ai-agent-cost-optimization-sonnet-hybrid-haiku-openrouter-3aa3</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9qi6b37gi7v0hhrxly6l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9qi6b37gi7v0hhrxly6l.png" alt="Visualizes the transition from an expensive scheme to a new hybrid architecture and shows the result" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Executive summary
&lt;/h2&gt;

&lt;p&gt;Our subagents (&lt;code&gt;.claude/agents/*.md&lt;/code&gt;) all run on &lt;strong&gt;Sonnet 4.6&lt;/strong&gt; by default at $3 / $15 per 1M (input/output). That's the single biggest line on our AI spend. This doc lays out a hybrid architecture where &lt;strong&gt;most subagents run on Haiku 4.5 ($1 / $5)&lt;/strong&gt; and offload heavy generation/analysis to &lt;strong&gt;OpenRouter&lt;/strong&gt; (paid key, already in &lt;code&gt;.env.local&lt;/code&gt;) via the existing &lt;code&gt;/fix-review&lt;/code&gt; REST plumbing. A handful of high-risk agents stay on Sonnet.&lt;/p&gt;

&lt;p&gt;Expected savings at full rollout: &lt;strong&gt;≈ 70–85 %&lt;/strong&gt; of the current subagent spend, with eq­ui­valent quality for analytical tasks and gated quality for generational tasks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;What exists today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;11 subagents&lt;/strong&gt; declared in &lt;code&gt;.claude/agents/*.md&lt;/code&gt;, all on Sonnet by default.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/fix-review&lt;/code&gt; skill&lt;/strong&gt; already delegates 3 model rounds to OpenRouter (DeepSeek V3.2, Qwen3 Coder Next, Grok 4.1 Fast) and only calls Sonnet for the Arbiter round. That's the template.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;.claude/skills/lib/rest.sh&lt;/code&gt;&lt;/strong&gt; exposes &lt;code&gt;chat_payload&lt;/code&gt; + &lt;code&gt;rest_post&lt;/code&gt; — provider-agnostic REST helpers, already wired for OpenRouter.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLI agents installed&lt;/strong&gt;: &lt;code&gt;opencode&lt;/code&gt;, &lt;code&gt;cursor-agent&lt;/code&gt;, &lt;code&gt;kilo&lt;/code&gt;, &lt;code&gt;codex&lt;/code&gt;. Each supports headless invocation and &lt;code&gt;--model&lt;/code&gt; selection. We used &lt;code&gt;codex exec&lt;/code&gt; for the 5 Tenancy v3 ship PRs while Claude's sub-agent quota was exhausted — that confirmed the hybrid pattern works.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;OPENROUTER_API_KEY&lt;/code&gt; is &lt;strong&gt;paid&lt;/strong&gt; — no free-tier rate-limit anxiety.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What's wasteful today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Subagents like &lt;code&gt;code-simplifier&lt;/code&gt;, &lt;code&gt;docs-maintainer&lt;/code&gt;, &lt;code&gt;pm-issue-writer&lt;/code&gt; do small, structured prose tasks on Sonnet. A Gemini Flash Lite at $0.037 / $0.15 would be ~100× cheaper and equally good for that workload.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;code-generator&lt;/code&gt; spends Sonnet tokens on file IO loops that an OpenRouter-backed CLI (Codex / Opencode) can handle for ~10× less.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Provider/model inventory (verified 2026-04-22)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  OpenRouter (paid, direct REST via &lt;code&gt;lib/rest.sh&lt;/code&gt;)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;$/1M in&lt;/th&gt;
&lt;th&gt;$/1M out&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;deepseek/deepseek-v3.2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0.26&lt;/td&gt;
&lt;td&gt;0.38&lt;/td&gt;
&lt;td&gt;Code generation, refactor, security review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;qwen/qwen3-coder-next&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0.15&lt;/td&gt;
&lt;td&gt;0.80&lt;/td&gt;
&lt;td&gt;TS-heavy code, test generation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;x-ai/grok-4.1-fast&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0.20&lt;/td&gt;
&lt;td&gt;0.50&lt;/td&gt;
&lt;td&gt;Fast parallel review vote&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;google/gemini-2.5-flash&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0.075&lt;/td&gt;
&lt;td&gt;0.30&lt;/td&gt;
&lt;td&gt;Docs, ADRs, prose&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;google/gemini-2.5-flash-lite&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0.037&lt;/td&gt;
&lt;td&gt;0.15&lt;/td&gt;
&lt;td&gt;Lint classification, small structured calls&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Baseline comparison:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude &lt;strong&gt;Sonnet 4.6&lt;/strong&gt;: $3 / $15&lt;/li&gt;
&lt;li&gt;Claude &lt;strong&gt;Haiku 4.5&lt;/strong&gt;: $1 / $5&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  CLIs (useful when we need a built-in tool-use loop with file edits)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CLI&lt;/th&gt;
&lt;th&gt;Headless command&lt;/th&gt;
&lt;th&gt;Cheap / free options&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;opencode&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;opencode run --model X "prompt"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;opencode/gpt-5-nano&lt;/code&gt;, &lt;code&gt;opencode/*-free&lt;/code&gt;, or any &lt;code&gt;google/*&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Good for multi-file tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;kilo&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;kilo run --model X "message"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;kilo/kilo-auto/free&lt;/code&gt;, &lt;code&gt;kilo/openrouter/free&lt;/code&gt;, &lt;code&gt;kilo/x-ai/grok-code-fast-1:optimized:free&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Thin over OpenRouter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cursor-agent&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cursor-agent --model X -p "prompt"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;composer-2-fast&lt;/code&gt; (Cursor Pro), &lt;code&gt;gpt-5.3-codex-*&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Good for large-scale rewrites&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;codex&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;codex exec -C &amp;lt;dir&amp;gt; -s danger-full-access "prompt"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OpenAI &lt;code&gt;gpt-5.4&lt;/code&gt; default&lt;/td&gt;
&lt;td&gt;Needs &lt;code&gt;-s danger-full-access&lt;/code&gt; — bubblewrap sandbox unreliable on this host&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Route CLIs through &lt;code&gt;--model openrouter/&amp;lt;id&amp;gt;&lt;/code&gt; when we want them to hit the same OR quota.&lt;/p&gt;

&lt;h2&gt;
  
  
  Target architecture
&lt;/h2&gt;

&lt;p&gt;Three subagent classes, three patterns:&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern A — Skill replaces subagent (direct OR REST)
&lt;/h3&gt;

&lt;p&gt;For &lt;strong&gt;pure analytical&lt;/strong&gt; subagents whose input is a small artifact (diff, lint output, hook source) and whose output is a structured text blob (report, suggestion list, simplified file).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;parent Claude → Skill &lt;span class="o"&gt;(&lt;/span&gt;thin bash&lt;span class="o"&gt;)&lt;/span&gt; → rest_post to OpenRouter → deterministic output
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No agentic loop, no Sonnet at all. Example: &lt;code&gt;code-simplifier&lt;/code&gt; → &lt;code&gt;/simplify&lt;/code&gt; skill.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern B — Haiku orchestrator + direct OR REST
&lt;/h3&gt;

&lt;p&gt;For subagents that need a &lt;strong&gt;validation gate&lt;/strong&gt;: generate output with a cheap model, then have a small, cheap reasoner verify the shape, run tests / decide the next step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Haiku orchestrator subagent
  → OR REST (e.g. qwen3-coder-next) to generate candidate
  → Haiku validates shape + runs verifier
  → success: commit; failure: retry with better prompt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Applies to &lt;code&gt;test-generator&lt;/code&gt;, &lt;code&gt;security-reviewer&lt;/code&gt; (Haiku + &lt;code&gt;chorus review&lt;/code&gt;), &lt;code&gt;static-analysis&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern C — Haiku orchestrator + CLI worker (tool-use loop)
&lt;/h3&gt;

&lt;p&gt;When the task requires &lt;strong&gt;real file IO and iterative tool use&lt;/strong&gt; (multi-file edits, builds, tests in a loop), delegate to a CLI that has a built-in tool-use loop.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Haiku orchestrator subagent
  → codex exec / opencode run (routed through openrouter/deepseek-v3.2)
  → Haiku reviews diff + lint + test output before commit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Applies to &lt;code&gt;code-generator&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern D — Keep Sonnet
&lt;/h3&gt;

&lt;p&gt;For subagents where project-specific conventions in &lt;code&gt;CLAUDE.md&lt;/code&gt; matter more than cost, and where mistakes are high-blast-radius.&lt;/p&gt;

&lt;p&gt;Applies to &lt;code&gt;bug-fixer&lt;/code&gt;, &lt;code&gt;migration-generator&lt;/code&gt;, &lt;code&gt;project-devops&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Per-agent decision matrix
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Primary model&lt;/th&gt;
&lt;th&gt;Validation gate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;code-generator&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;openrouter/deepseek/deepseek-v3.2&lt;/code&gt; via &lt;code&gt;codex exec&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Haiku 4.5 reviews diff; runs lint + tests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;test-generator&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;&lt;code&gt;openrouter/qwen/qwen3-coder-next&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Haiku 4.5 checks shape; runs &lt;code&gt;npm run test&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;code-simplifier&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A (skill)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;openrouter/deepseek/deepseek-v3.2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;None — output applied directly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;security-reviewer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;chorus review&lt;/code&gt; (3 OR models parallel)&lt;/td&gt;
&lt;td&gt;Haiku 4.5 synthesizes findings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;static-analysis&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;&lt;code&gt;openrouter/google/gemini-2.5-flash-lite&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Haiku 4.5 classifies safe vs unsafe fixes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docs-maintainer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;&lt;code&gt;openrouter/google/gemini-2.5-flash&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Haiku 4.5 lints prose against CLAUDE.md style&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;pm-issue-writer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;&lt;code&gt;openrouter/google/gemini-2.5-flash&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Haiku 4.5 checks OB Base compliance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ci-build-agent&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;B (Haiku-only, no OR)&lt;/td&gt;
&lt;td&gt;Haiku 4.5&lt;/td&gt;
&lt;td&gt;Small context; no delegation needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bug-fixer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;D&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Sonnet 4.6&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Iterative debugging needs full context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;migration-generator&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;D&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Sonnet 4.6&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;RLS / immutability conventions are subtle&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;project-devops&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;D&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Sonnet 4.6&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SSH / TLS / DB — prod-blast-radius&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Migration order
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;code-simplifier&lt;/code&gt;&lt;/strong&gt; → Pattern A skill. Smallest scope. Baseline measurement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;static-analysis&lt;/code&gt;&lt;/strong&gt; → Pattern B. Similar surface.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;docs-maintainer&lt;/code&gt;&lt;/strong&gt; → Pattern B. Prose on Gemini Flash.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;pm-issue-writer&lt;/code&gt;&lt;/strong&gt; → Pattern B. Template work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;security-reviewer&lt;/code&gt;&lt;/strong&gt; → Pattern B wrapped around existing &lt;code&gt;chorus review&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;test-generator&lt;/code&gt;&lt;/strong&gt; → Pattern B with test execution loop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;code-generator&lt;/code&gt;&lt;/strong&gt; → Pattern C with Codex backend. Largest, highest-risk.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;ci-build-agent&lt;/code&gt;&lt;/strong&gt; → Pattern B (Haiku-only).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;At each step:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ship one agent transition as its own PR.&lt;/li&gt;
&lt;li&gt;Measure 3–5 real-task invocations before rolling to the next agent.&lt;/li&gt;
&lt;li&gt;Metrics: &lt;strong&gt;token cost per invocation&lt;/strong&gt;, &lt;strong&gt;success rate&lt;/strong&gt; (did the output commit cleanly?), &lt;strong&gt;human-correction rate&lt;/strong&gt; (did we have to fix the output manually?).&lt;/li&gt;
&lt;li&gt;If the success rate drops &amp;gt; 10 % vs the Sonnet baseline, stop and reassess.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Skills with the same treatment
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/fix-review&lt;/code&gt; — already done.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/task&lt;/code&gt;, &lt;code&gt;/backlog&lt;/code&gt;, &lt;code&gt;/pm-issue-writer&lt;/code&gt; — Haiku is enough; no OR delegation needed.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/ship&lt;/code&gt; — add &lt;code&gt;--cheap&lt;/code&gt; flag that swaps its code-generator step to Pattern C.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/find-bugs&lt;/code&gt; — delegate to &lt;code&gt;chorus debug&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/review&lt;/code&gt; — already uses parallel agents.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Risks
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Haiku context awareness is lower than Sonnet.&lt;/strong&gt; Prompts must inline the most relevant &lt;code&gt;CLAUDE.md&lt;/code&gt; / &lt;code&gt;docs/*&lt;/code&gt; section rather than assume the agent "knows the codebase". Mitigation: for each agent that moves, expand the system prompt to include the exact conventions it must adhere to (e.g., stale-time rules, RLS patterns).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenRouter model drift.&lt;/strong&gt; A model string that works today may change default behavior next week (e.g., &lt;code&gt;deepseek-v3.2&lt;/code&gt; swaps back-end). Pin models explicitly and record the version in &lt;code&gt;config.yaml&lt;/code&gt; / telemetry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLI sandboxing quirks.&lt;/strong&gt; &lt;code&gt;codex exec&lt;/code&gt; needed &lt;code&gt;-s danger-full-access&lt;/code&gt; this session (bubblewrap failure on this host). Tests in CI may hit similar issues. Each CLI migration needs a &lt;code&gt;--dry-run&lt;/code&gt; (smoke) test.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Free vs paid confusion.&lt;/strong&gt; Free &lt;code&gt;:free&lt;/code&gt; tiers have rate limits that silently 429. Migration MUST use paid &lt;code&gt;openrouter/*&lt;/code&gt; model IDs and must NOT mix with CLIs in their free default mode (&lt;code&gt;kilo/kilo-auto/free&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rollback coupling.&lt;/strong&gt; If a migrated agent regresses and we revert it to Sonnet, we need to keep the Pattern-A skill / Pattern-B wrapper in place for downstream callers. So: &lt;strong&gt;don't delete Sonnet-agent definitions&lt;/strong&gt; until the hybrid has run cleanly for at least a week.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log attribution.&lt;/strong&gt; When OR models do the work, our existing &lt;code&gt;.claude/skills/fix-review/telemetry.jsonl&lt;/code&gt; records cost per-PR. We need an equivalent file per migrated subagent, or at least a shared one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output parsing tolerance.&lt;/strong&gt; Cheap models return messy output (no JSON mode guarantees across providers). Orchestrators must tolerate trailing prose, markdown fences around JSON, etc.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Non-goals / out of scope
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Rewriting &lt;code&gt;/fix-review&lt;/code&gt; — it already does this pattern.&lt;/li&gt;
&lt;li&gt;Moving &lt;code&gt;bug-fixer&lt;/code&gt;, &lt;code&gt;migration-generator&lt;/code&gt;, &lt;code&gt;project-devops&lt;/code&gt; to any cheaper model. Their blast radius dominates their spend.&lt;/li&gt;
&lt;li&gt;Replacing the orchestrator Claude (top-level) with Haiku. The top-level agent needs to read this doc, understand the whole session, and make decisions — Sonnet stays there.&lt;/li&gt;
&lt;li&gt;Switching CLI worker choice &lt;em&gt;inside&lt;/em&gt; a single PR / session. Pick one backend per migration and stick with it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Verification/acceptance
&lt;/h2&gt;

&lt;p&gt;The overhaul is successful when:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;8 of 11 subagents run on Haiku + OR delegate by default.&lt;/li&gt;
&lt;li&gt;Telemetry shows ≥ 50 % cost reduction on subagent spend over a rolling 7-day window.&lt;/li&gt;
&lt;li&gt;Human-correction rate (PRs where a sub-agent-produced change had to be manually fixed before merge) stays within 15 % of the Sonnet baseline.&lt;/li&gt;
&lt;li&gt;No regressions in security reviews (security-reviewer catches the same class of findings it did on Sonnet — validated by a manual replay of 3 recent PRs).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/ai-agent-cost-optimization.md&lt;/code&gt; and each migrated agent's frontmatter reflect the new model + pattern.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.claude/skills/fix-review/config.yaml&lt;/code&gt; — provider/model routing template.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.claude/skills/lib/rest.sh&lt;/code&gt; — REST helpers.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.claude/agents/*.md&lt;/code&gt; — current subagent definitions.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;~/wrk/projects/chorus/chorus/plugins/chorus/scripts/companion.mjs&lt;/code&gt; — multi-CLI review pipeline.&lt;/li&gt;
&lt;li&gt;OpenRouter pricing: &lt;a href="https://openrouter.ai/models" rel="noopener noreferrer"&gt;https://openrouter.ai/models&lt;/a&gt; (snapshot 2026-04-22).&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>architecture</category>
      <category>claude</category>
    </item>
    <item>
      <title>Chorus: letting AI coding CLIs review each other</title>
      <dc:creator>Valentyn Solomko</dc:creator>
      <pubDate>Fri, 17 Apr 2026 12:41:02 +0000</pubDate>
      <link>https://dev.to/valpere/chorus-letting-ai-coding-clis-review-each-other-503g</link>
      <guid>https://dev.to/valpere/chorus-letting-ai-coding-clis-review-each-other-503g</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18z636at6ldo0gtns6t6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18z636at6ldo0gtns6t6.png" alt="Agent delegation mesh diagram" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I use several AI coding CLIs depending on the task.&lt;/p&gt;

&lt;p&gt;Claude Code is good at one kind of workflow. OpenCode has its own shape. Gemini CLI is useful when I want another model family in the loop. Codex is often strong when I need a second implementation or review pass.&lt;/p&gt;

&lt;p&gt;The annoying part is not the models. The annoying part is switching tools.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;chorus&lt;/code&gt; is my attempt to remove that friction.&lt;/p&gt;

&lt;p&gt;It is an open-source cross-agent plugin collection for four AI coding CLIs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code&lt;/li&gt;
&lt;li&gt;OpenCode&lt;/li&gt;
&lt;li&gt;Gemini CLI&lt;/li&gt;
&lt;li&gt;Codex&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The idea is simple: from the tool I am already using, I should be able to delegate a task to the other agents.&lt;/p&gt;

&lt;p&gt;That creates a &lt;strong&gt;4×3 mesh&lt;/strong&gt;. Each agent can call the other three.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it looks like in practice
&lt;/h2&gt;

&lt;p&gt;From Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/gemini:review Review this diff for hidden edge cases and missing tests.
/codex:run Add regression tests for the parser bug we just fixed.
/opencode:run Try a smaller refactor of the auth middleware without changing behavior.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From OpenCode, the same idea is exposed through MCP tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;delegate_claude
delegate_gemini
delegate_codex
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gemini CLI and Codex get skills installed so they can delegate in any direction too.&lt;/p&gt;

&lt;h2&gt;
  
  
  The main use case: parallel review
&lt;/h2&gt;

&lt;p&gt;Instead of asking one agent "is this fine?", ask three different agents to review the same change independently.&lt;/p&gt;

&lt;p&gt;Different agents have different failure modes. One will over-focus on architecture. Another will catch a small test gap. Another will suggest a simpler implementation. Often one of them is wrong. That is fine. The value is in having multiple independent passes without leaving the terminal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/gemini:review Check correctness and missed edge cases.
/codex:run Review test coverage and suggest missing cases.
/opencode:run Look for simplifications and risky abstractions.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not about pretending agents are teammates. It is about using model disagreement as a tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Design philosophy
&lt;/h2&gt;

&lt;p&gt;The important design constraint for &lt;code&gt;chorus&lt;/code&gt; is that it does not try to become a new AI IDE or orchestration platform. It is glue.&lt;/p&gt;

&lt;p&gt;One install gives you access to the other agents from your preferred tool. Claude Code gets slash commands. OpenCode gets MCP tools. Gemini CLI and Codex get skills.&lt;/p&gt;

&lt;p&gt;Keep using the interface you already like, but stop treating each CLI as an isolated island.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Claude Code&lt;/span&gt;
claude plugin &lt;span class="nb"&gt;install &lt;/span&gt;https://github.com/valpere/chorus

&lt;span class="c"&gt;# OpenCode&lt;/span&gt;
opencode plugin @valpere/chorus-opencode

&lt;span class="c"&gt;# Gemini CLI&lt;/span&gt;
gemini skills &lt;span class="nb"&gt;install &lt;/span&gt;https://github.com/valpere/chorus &lt;span class="nt"&gt;--path&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="nt"&gt;-gemini&lt;/span&gt;/claude
gemini skills &lt;span class="nb"&gt;install &lt;/span&gt;https://github.com/valpere/chorus &lt;span class="nt"&gt;--path&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="nt"&gt;-gemini&lt;/span&gt;/opencode
gemini skills &lt;span class="nb"&gt;install &lt;/span&gt;https://github.com/valpere/chorus &lt;span class="nt"&gt;--path&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="nt"&gt;-gemini&lt;/span&gt;/codex
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I built this because my own workflow had become repetitive — make a change in one CLI, copy context into another, ask for a review, manually bring the useful parts back. It worked, but it was clumsy.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;chorus&lt;/code&gt; turns that into a normal command.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;strong&gt;&lt;a href="https://github.com/valpere/chorus" rel="noopener noreferrer"&gt;https://github.com/valpere/chorus&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you already use more than one AI coding CLI, this may fit your workflow without asking you to change it. If you only use one, multi-agent review may still be worth trying on risky changes. A second opinion from a different agent is often cheaper than debugging the same blind spot later.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Valentyn Solomko — Ukrainian software engineer&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devtools</category>
      <category>opensource</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>Unwanted gifts.

When you are asked to complete a test task with an unknown repo using VSCode, look in `.vscode/`. You may see `tasks.json` there. Note that there is no comma after `"command":`.</title>
      <dc:creator>Valentyn Solomko</dc:creator>
      <pubDate>Thu, 08 Jan 2026 15:34:01 +0000</pubDate>
      <link>https://dev.to/valpere/unwanted-gifts-when-you-are-asked-to-complete-a-test-task-with-an-unknown-repo-using-vscode-1i5c</link>
      <guid>https://dev.to/valpere/unwanted-gifts-when-you-are-asked-to-complete-a-test-task-with-an-unknown-repo-using-vscode-1i5c</guid>
      <description></description>
    </item>
    <item>
      <title>AI as a Development Tool</title>
      <dc:creator>Valentyn Solomko</dc:creator>
      <pubDate>Wed, 24 Dec 2025 14:42:15 +0000</pubDate>
      <link>https://dev.to/valpere/ai-as-a-development-tool-4h9a</link>
      <guid>https://dev.to/valpere/ai-as-a-development-tool-4h9a</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"...the greatest part of the questions and controversies that perplex mankind depending on the doubtful and uncertain use of words, or (which is the same) indetermined ideas..."&lt;/em&gt;&lt;br&gt;
-- John Locke, &lt;em&gt;An Essay Concerning Human Understanding&lt;/em&gt;, The epistle to the reader (1690)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  💡 What Is This About?
&lt;/h2&gt;

&lt;p&gt;This year, many terms have emerged describing a new development tool -- AI.&lt;/p&gt;

&lt;p&gt;Immediately, as always, discussions began about the taste of different-colored pencils, because some believe they can derive pleasure from it, while others think it's only for production, and even then, only after the church's blessing.&lt;/p&gt;

&lt;p&gt;In principle, you can take any rules or best practices of development or engineering and replace the name of any tool with "AI," and you can smack your opponents right on the head with them.&lt;/p&gt;

&lt;p&gt;While willing experts are feeling up this elephant and arguing about what they've touched or felt, let's try to establish some terminology.&lt;/p&gt;

&lt;p&gt;We need something simple and understandable, so we can immediately point to where we are on the map.&lt;/p&gt;

&lt;p&gt;I, too, can use smart words, so let's try a paradigmatic approach.&lt;/p&gt;

&lt;p&gt;From here on -- no irony. Let's try to establish working definitions.&lt;/p&gt;

&lt;p&gt;The goal of this document is not to evaluate AI as "good/bad," but to provide a common language for discussion: what exactly we're doing now, what level of risk we're accepting, and which practices are appropriate for the chosen approach.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧩 Paradigms: Brief Descriptions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🔍 Vibe Coding
&lt;/h3&gt;

&lt;p&gt;Vibe Coding is an approach where a developer describes desired functionality in human-understandable language and generates code using AI without detailed review or editing of the result. The emphasis is on experimentation, rapid prototyping, and trust in AI rather than on manually writing or checking each line of code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Intent → AI → Code&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI writes code based on high-level intent&lt;/li&gt;
&lt;li&gt;Minimal review, minimal structure&lt;/li&gt;
&lt;li&gt;Maximum speed, maximum risk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use when&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prototypes&lt;/li&gt;
&lt;li&gt;Demos&lt;/li&gt;
&lt;li&gt;Spikes&lt;/li&gt;
&lt;li&gt;One-off code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Never use for&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Production systems&lt;/li&gt;
&lt;li&gt;Core logic&lt;/li&gt;
&lt;li&gt;Security-sensitive paths&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 AI-Assisted Development
&lt;/h3&gt;

&lt;p&gt;AI-assisted development (AIAD) is the use of artificial intelligence tools to support the developer at various stages of development: from writing code, testing, and debugging to optimization and automation of repetitive tasks. The developer remains an active participant in the process, and AI acts as an assistant that offers ideas, automates routine tasks, and improves code quality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Human-led development using AI as a tool&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Human owns architecture and decisions&lt;/li&gt;
&lt;li&gt;AI accelerates implementation&lt;/li&gt;
&lt;li&gt;Standard reviews and testing apply&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use when&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Writing production features&lt;/li&gt;
&lt;li&gt;Maintaining long-lived codebases&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 AI-Powered Pair Programming
&lt;/h3&gt;

&lt;p&gt;AI assistants work alongside the developer in real time, suggesting alternatives, detecting errors, optimizing code, and even generating new features based on context. This approach reduces debugging time and improves project architecture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Human ⇄ AI in real time&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Continuous dialogue&lt;/li&gt;
&lt;li&gt;AI suggests, human decides&lt;/li&gt;
&lt;li&gt;Comparable to pair programming with a strong junior/mid developer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use when&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Day-to-day development&lt;/li&gt;
&lt;li&gt;Learning unfamiliar codebases or languages&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 Generative AI for Specification and Design
&lt;/h3&gt;

&lt;p&gt;AI is used for automatic generation of architectural decisions, diagrams, documentation, and even test scenarios based on requirements. This allows for quickly obtaining a system prototype and verifying its compliance with business requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Requirements → AI → Specifications / Diagrams&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI helps &lt;em&gt;before&lt;/em&gt; coding&lt;/li&gt;
&lt;li&gt;Generates draft specifications, architecture diagrams, and test plans&lt;/li&gt;
&lt;li&gt;Human reviews and refines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use when&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;System design&lt;/li&gt;
&lt;li&gt;Architecture exploration&lt;/li&gt;
&lt;li&gt;Early planning stages&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 AI-Augmented Spec-Driven Development
&lt;/h3&gt;

&lt;p&gt;AI-Augmented Spec-Driven Development is a paradigm where structured specifications (requirements, architectural constraints, acceptance criteria, etc.) are the primary source for AI-powered code generation. These specifications become the single source of truth that AI transforms into implementation, tests, and documentation. This approach ensures high alignment between requirements and implementation, reduces error risks, and increases development speed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Specification → AI → Code + Tests&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Specification is the source of truth&lt;/li&gt;
&lt;li&gt;AI generates code and tests based on specifications&lt;/li&gt;
&lt;li&gt;Highest predictability and maintainability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use when&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Core business logic&lt;/li&gt;
&lt;li&gt;Financial, security, or regulated systems&lt;/li&gt;
&lt;li&gt;Large codebases with long lifespans&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 AI-Driven Development (AIDD)
&lt;/h3&gt;

&lt;p&gt;This is a paradigm where AI is deeply integrated into all stages of development -- from design and code writing to testing, optimization, and even deployment. The developer and AI work as partners, significantly increasing productivity and code quality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Human + AI jointly drive the process&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI participates in design, coding, testing, and optimization&lt;/li&gt;
&lt;li&gt;Human remains responsible for strategy and constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use when&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams are intentionally adopting AI into SDLC&lt;/li&gt;
&lt;li&gt;Clear constraints and evaluation processes exist&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 Autonomous Refactoring and Adaptive Coding
&lt;/h3&gt;

&lt;p&gt;AI automatically analyzes and refactors code, identifying potential problems, suggesting optimizations, and adapting to changes in the project. Such systems can learn from large amounts of code to better understand the project's context and requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Existing code → AI → Improved code&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI refactors, optimizes, and reduces technical debt&lt;/li&gt;
&lt;li&gt;Operates under strict rules and is subject to evaluation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use when&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Controlled refactoring&lt;/li&gt;
&lt;li&gt;Performance optimization&lt;/li&gt;
&lt;li&gt;Style and consistency improvements&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 Pipeline Synthesis and Security Scanning Orchestration
&lt;/h3&gt;

&lt;p&gt;AI can automatically generate CI/CD configurations, analyze code security, identify technical debt, and suggest ways to eliminate it. This reduces risks and accelerates the deployment process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Policy → AI → CI/CD and security automation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI generates and maintains pipelines&lt;/li&gt;
&lt;li&gt;Automates security scanning and policy enforcement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use when&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DevSecOps automation&lt;/li&gt;
&lt;li&gt;Reducing operational risk&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 AI-Agent Driven Development
&lt;/h3&gt;

&lt;p&gt;AI agents can independently perform certain tasks -- from generating code according to specifications to automatically creating tests, monitoring, and even fixing bugs in production. This allows creating self-protecting systems that minimize human intervention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Goal → AI Agents → Execution&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Autonomous agents decompose and execute tasks&lt;/li&gt;
&lt;li&gt;Minimal human intervention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use with extreme caution&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Internal tools&lt;/li&gt;
&lt;li&gt;Clearly bounded automation tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 Self-Healing Systems
&lt;/h3&gt;

&lt;p&gt;AI agents monitor the system in production, detect problems, generate patches, and automatically apply them, ensuring high reliability and minimizing downtime.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Runtime signals → AI → Fix → Deploy&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI monitors production and automatically applies fixes&lt;/li&gt;
&lt;li&gt;Highest autonomy, highest risk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Not recommended by default&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only with strict safeguards&lt;/li&gt;
&lt;li&gt;Only for non-critical systems&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📊 Comparison Table
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Paradigm&lt;/th&gt;
&lt;th&gt;Who Leads&lt;/th&gt;
&lt;th&gt;Source of Truth&lt;/th&gt;
&lt;th&gt;Typical Use&lt;/th&gt;
&lt;th&gt;Production Readiness&lt;/th&gt;
&lt;th&gt;Risk&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Vibe Coding&lt;/td&gt;
&lt;td&gt;AI&lt;/td&gt;
&lt;td&gt;Prompt&lt;/td&gt;
&lt;td&gt;Prototypes, demo&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;🔥🔥🔥&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI-assisted Dev&lt;/td&gt;
&lt;td&gt;Human&lt;/td&gt;
&lt;td&gt;Code&lt;/td&gt;
&lt;td&gt;Production features&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;🟡&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI Pair Programming&lt;/td&gt;
&lt;td&gt;Human + AI&lt;/td&gt;
&lt;td&gt;Code&lt;/td&gt;
&lt;td&gt;Daily development&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;🟡&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Generative Spec &amp;amp; Design&lt;/td&gt;
&lt;td&gt;Human&lt;/td&gt;
&lt;td&gt;Spec&lt;/td&gt;
&lt;td&gt;Architecture, planning&lt;/td&gt;
&lt;td&gt;⚠️&lt;/td&gt;
&lt;td&gt;🟡&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spec-Driven + AI&lt;/td&gt;
&lt;td&gt;Specification&lt;/td&gt;
&lt;td&gt;Spec&lt;/td&gt;
&lt;td&gt;Core systems&lt;/td&gt;
&lt;td&gt;✅✅&lt;/td&gt;
&lt;td&gt;🟢&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI-Driven Dev&lt;/td&gt;
&lt;td&gt;Human + AI&lt;/td&gt;
&lt;td&gt;Mixed&lt;/td&gt;
&lt;td&gt;End-to-end dev&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;🟡&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Autonomous Refactoring&lt;/td&gt;
&lt;td&gt;AI&lt;/td&gt;
&lt;td&gt;Code + Rules&lt;/td&gt;
&lt;td&gt;Tech debt, cleanup&lt;/td&gt;
&lt;td&gt;⚠️&lt;/td&gt;
&lt;td&gt;🟡&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pipeline &amp;amp; Security AI&lt;/td&gt;
&lt;td&gt;Policy&lt;/td&gt;
&lt;td&gt;Policy&lt;/td&gt;
&lt;td&gt;CI/CD, Security&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;🟡&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI-Agent Driven Dev&lt;/td&gt;
&lt;td&gt;AI Agents&lt;/td&gt;
&lt;td&gt;Goal / Policy&lt;/td&gt;
&lt;td&gt;Automation&lt;/td&gt;
&lt;td&gt;⚠️&lt;/td&gt;
&lt;td&gt;🔥🔥&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Self-Healing Systems&lt;/td&gt;
&lt;td&gt;AI&lt;/td&gt;
&lt;td&gt;Runtime signals&lt;/td&gt;
&lt;td&gt;Ops / reliability&lt;/td&gt;
&lt;td&gt;⚠️⚠️&lt;/td&gt;
&lt;td&gt;🔥🔥🔥&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🌳 AI Development Paradigms -- Decision Tree
&lt;/h2&gt;

&lt;p&gt;A decision tree for determining what we're discussing at any given moment.&lt;br&gt;
This tree defines the development mode, not a specific tool.&lt;/p&gt;

&lt;h3&gt;
  
  
  🏢 Policies
&lt;/h3&gt;

&lt;h4&gt;
  
  
  🟢 &lt;strong&gt;Allowed by Default&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;AI-assisted development&lt;/li&gt;
&lt;li&gt;AI pair programming&lt;/li&gt;
&lt;li&gt;Refactoring &lt;em&gt;with review&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  🟡 &lt;strong&gt;Allowed with Explicit Approval&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;AI-augmented spec-driven development&lt;/li&gt;
&lt;li&gt;AI-generated code in core logic&lt;/li&gt;
&lt;li&gt;Any AI in regulated domains&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  🔵 &lt;strong&gt;Operations Only&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;CI/CD generation&lt;/li&gt;
&lt;li&gt;Security scanning&lt;/li&gt;
&lt;li&gt;Dependency and policy enforcement&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  🔴 &lt;strong&gt;Explicitly Prohibited&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;AI-agent-driven development in production&lt;/li&gt;
&lt;li&gt;Self-healing systems&lt;/li&gt;
&lt;li&gt;Autonomous production changes without human approval&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2jd78vudi2nue1avn0i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2jd78vudi2nue1avn0i.png" alt=" " width="784" height="1499"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Overall
&lt;/h2&gt;

&lt;p&gt;AI doesn't change engineering. It just makes the mistakes faster -- or more manageable.&lt;/p&gt;

</description>
      <category>vibecoding</category>
      <category>ai</category>
      <category>aiops</category>
      <category>aiassisteddevelopment</category>
    </item>
  </channel>
</rss>
