<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vuong Ngo</title>
    <description>The latest articles on DEV Community by Vuong Ngo (@vuong_ngo).</description>
    <link>https://dev.to/vuong_ngo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3547478%2Fecc8e128-4c3c-41e8-90c2-70ab046071c5.png</url>
      <title>DEV Community: Vuong Ngo</title>
      <link>https://dev.to/vuong_ngo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vuong_ngo"/>
    <language>en</language>
    <item>
      <title>Operating Model for AI Coding Agents: Delegate, Review, Own</title>
      <dc:creator>Vuong Ngo</dc:creator>
      <pubDate>Sun, 14 Jun 2026 11:08:56 +0000</pubDate>
      <link>https://dev.to/vuong_ngo/operating-model-for-ai-coding-agents-delegate-review-own-3g4m</link>
      <guid>https://dev.to/vuong_ngo/operating-model-for-ai-coding-agents-delegate-review-own-3g4m</guid>
      <description>&lt;p&gt;An operating model for AI coding agents isn't optional. As of mid-2026, it is the gap between teams that scale AI assistance and teams that drown in AI-generated review queues.&lt;/p&gt;

&lt;p&gt;The pattern is predictable. You add an AI coding assistant, the team ships PRs faster, and within a few weeks the review queue is longer than it has ever been.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://opsera.ai/resources/report/ai-coding-impact-2026-benchmark-report/" rel="noopener noreferrer"&gt;Opsera's 2026 AI Coding Impact Benchmark&lt;/a&gt;, drawn from 250,000 developers across 60-plus enterprises, puts numbers to it: AI reduces time-to-PR by up to 58%, but AI-generated pull requests wait 4.6 times longer in review than human-authored ones. The agent didn't slow you down. The process around the agent did.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://resources.anthropic.com/2026-agentic-coding-trends-report" rel="noopener noreferrer"&gt;Anthropic 2026 Agentic Coding Trends report&lt;/a&gt; names this directly: verification and coordination are the new bottleneck, not writing code. The &lt;a href="https://dora.dev/dora-report-2025/" rel="noopener noreferrer"&gt;DORA 2025 research on AI-assisted software development&lt;/a&gt; adds an uncomfortable corollary: higher AI adoption correlates with both higher delivery throughput and higher delivery instability. Agents amplify what's already in place, good or bad.&lt;/p&gt;

&lt;p&gt;If you're an engineering lead who already has agents running and is watching the review queue grow, you probably don't need more agent capability. You need a process that tells the agent exactly what to do, tells the reviewer exactly what to check, and tells the team who owns the result.&lt;/p&gt;

&lt;p&gt;That's the framework I'll walk through here: Delegate, Review, Own.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frkg5hwp0fjrprl928o2l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frkg5hwp0fjrprl928o2l.png" alt="Flow diagram showing Intent feeding into Delegate with scope boundaries, the agent executing within scope, Review checking against the original criteria, Own recording the decision and deferred items, and the output feeding back to Intent." width="800" height="212"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 1: The Delegate-Review-Own loop. Scope boundaries are fixed before the agent runs; the decision log feeds the next iteration.&lt;/em&gt;&lt;/p&gt;


&lt;h2&gt;The Problem with Fuzzy Mandates&lt;/h2&gt;

&lt;p&gt;Before the mechanics: the coordination failure usually starts before the agent runs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.augmentcode.com/guides/agentic-engineering-operating-model" rel="noopener noreferrer"&gt;Augment Code's agentic engineering operating model guide&lt;/a&gt; states it plainly: "fuzzy team boundaries produce fuzzy agent scopes, with the same downstream coordination costs." Their framing is a useful starting point: a three-tier decision model that maps to what most teams are already running informally.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Who acts&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Human-only&lt;/td&gt;
&lt;td&gt;Human, no agent involvement&lt;/td&gt;
&lt;td&gt;Architecture decisions, security calls, release approvals, defining agent scope itself&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent-assisted&lt;/td&gt;
&lt;td&gt;Agent generates; human approves before the effect is applied&lt;/td&gt;
&lt;td&gt;PR authoring, test writing, refactor passes, documentation drafts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fully autonomous&lt;/td&gt;
&lt;td&gt;Agent executes within a pre-approved, policy-bounded scope&lt;/td&gt;
&lt;td&gt;Lint fixes, dependency patch PRs within a constraint, scheduled changelog updates&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The row most teams skip is the first one. You cannot delegate well if you haven't decided what is not delegatable. Once that boundary is explicit, the other two tiers become manageable.&lt;/p&gt;

&lt;p&gt;The framework below assumes you've done that work. If you haven't, start there.&lt;/p&gt;


&lt;h2&gt;Delegate: Scope First, Task Second&lt;/h2&gt;

&lt;p&gt;The most common delegation mistake is handing the agent a task description and letting it decide the scope. The agent will interpret scope generously, because nothing in its prompt told it not to.&lt;/p&gt;

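&lt;p&gt;A well-formed MCP task delegation includes the scope boundary directly in the call. In the example below, &lt;code&gt;tools/call&lt;/code&gt; is the standard MCP method, while &lt;code&gt;run_task&lt;/code&gt; and its argument names are an illustrative tool of your own, not fields MCP defines (the JSON-RPC envelope is omitted for brevity):&lt;/p&gt;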

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tools/call"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"run_task"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"task_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"T-204"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"intent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Refactor the user authentication module to use the new token-validation library"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"allowed_paths"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"src/auth/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"tests/auth/"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"definition_of_done"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"All existing auth tests pass"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"No changes outside allowed_paths"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"No new dependencies added without explicit approval"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"out_of_scope"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"Do not modify session management"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"Do not touch src/middleware/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"Do not update package.json"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;out_of_scope&lt;/code&gt; list is what people omit. It takes two minutes to write and prevents the agent from helpfully refactoring things adjacent to the task because they "looked related."&lt;/p&gt;

&lt;p&gt;For longer-running work where the agent reads a context file at the start of its session, the same contract translates to YAML:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# task-brief.yaml&lt;/span&gt;
&lt;span class="na"&gt;task_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;T-204&lt;/span&gt;
&lt;span class="na"&gt;intent&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Refactor&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;auth&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;module&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;use&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;new&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;token-validation&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;library"&lt;/span&gt;
&lt;span class="na"&gt;owner&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@vuong"&lt;/span&gt;

&lt;span class="na"&gt;allowed_paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;src/auth/&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;tests/auth/&lt;/span&gt;

&lt;span class="na"&gt;definition_of_done&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;All&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;existing&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;auth&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tests&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pass&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(run:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pnpm&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;src/auth)"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;changes&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;outside&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;allowed_paths"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;new&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;dependencies&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;without&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;explicit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;approval"&lt;/span&gt;

&lt;span class="na"&gt;out_of_scope&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;session management&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;src/middleware/&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;package.json modifications&lt;/span&gt;

&lt;span class="na"&gt;stop_and_ask_on_uncertainty&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;stop_and_ask_on_uncertainty&lt;/code&gt; flag is a convention, not a standard MCP field. Add it to your agent config as a rule: when the agent hits a decision it wasn't scoped for, it surfaces the question rather than resolving it silently. That one convention heads off much of the scope drift that produces bloated PRs and ambiguous review requests.&lt;/p&gt;
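
&lt;p&gt;The "No changes outside allowed_paths" criterion can also be checked mechanically before a human opens the PR. The Python sketch below is one hedged way to do that. It assumes you already have the list of changed files (for example from &lt;code&gt;git diff --name-only&lt;/code&gt;) and that the &lt;code&gt;out_of_scope&lt;/code&gt; entries that are paths have been separated from free-text ones such as "session management". The names &lt;code&gt;is_under&lt;/code&gt; and &lt;code&gt;scope_violations&lt;/code&gt; are hypothetical helpers, not part of MCP or any agent tool.&lt;/p&gt;

```python
from pathlib import PurePosixPath


def is_under(path, prefix):
    """Return True when `path` equals `prefix` or sits beneath it."""
    path_parts = PurePosixPath(path).parts
    prefix_parts = PurePosixPath(prefix).parts
    # Compare whole path components so "src/authz" never matches "src/auth".
    return path_parts[:len(prefix_parts)] == prefix_parts


def scope_violations(changed_files, allowed_paths, forbidden_paths):
    """List (file, reason) pairs for every changed file that breaks the brief."""
    violations = []
    for changed in changed_files:
        if any(is_under(changed, p) for p in forbidden_paths):
            violations.append((changed, "explicitly out of scope"))
        elif not any(is_under(changed, p) for p in allowed_paths):
            violations.append((changed, "outside allowed_paths"))
    return violations


if __name__ == "__main__":
    # Hypothetical file list for task T-204.
    changed = [
        "src/auth/token.ts",
        "tests/auth/token.test.ts",
        "src/middleware/session.ts",
        "package.json",
        "src/authz/roles.ts",
    ]
    for path, reason in scope_violations(
        changed,
        allowed_paths=["src/auth/", "tests/auth/"],
        forbidden_paths=["src/middleware/", "package.json"],
    ):
        print(f"{path}: {reason}")
```

&lt;p&gt;Comparing path components rather than raw string prefixes keeps a file under &lt;code&gt;src/authz/&lt;/code&gt; from slipping through as a match for &lt;code&gt;src/auth/&lt;/code&gt;.&lt;/p&gt;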




&lt;h2&gt;Review: Deliberate, Not Accidental&lt;/h2&gt;

&lt;p&gt;If AI-generated PRs already wait 4.6 times longer for review, the answer is not to skip review. It's to make the wait intentional rather than incidental.&lt;/p&gt;

&lt;p&gt;The difference is specificity. "Needs review" tells the reviewer nothing. An agent-generated PR should carry a checklist that maps to the actual failure modes of agent-produced code: logic drift from the acceptance criteria, scope overrun, and security exposure in the changed path.&lt;/p&gt;

&lt;p&gt;The Opsera report is specific on the security point: AI-generated code carries 15 to 18 percent more security vulnerabilities than human-authored code. A review gate that ignores that is not a real gate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpr1srura5q9vyojue449.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpr1srura5q9vyojue449.png" alt="Bar chart showing AI reduces time-to-PR by 58 percent on the left, and AI-generated PRs face a 4.6 times longer review wait than human PRs on the right. Source: Opsera 2026 Benchmark." width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 2: AI cuts time-to-PR by 58% but AI-generated PRs wait 4.6x longer in review. Source: &lt;a href="https://opsera.ai/resources/report/ai-coding-impact-2026-benchmark-report/" rel="noopener noreferrer"&gt;Opsera AI Coding Impact 2026 Benchmark&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here's a GitHub Actions gate that blocks agent-generated PRs until a reviewer confirms the right things:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/agent-pr-review.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Agent PR Review Gate&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;types&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;opened&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;synchronize&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;check-agent-pr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;contains(github.event.pull_request.labels.*.name, 'agent-generated')&lt;/span&gt;

    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Require human review checklist&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/github-script@v7&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;const body = context.payload.pull_request.body || '';&lt;/span&gt;
            &lt;span class="s"&gt;const required = [&lt;/span&gt;
              &lt;span class="s"&gt;'- [x] Scope: changes are within the declared allowed_paths',&lt;/span&gt;
              &lt;span class="s"&gt;'- [x] Intent: output matches the task definition-of-done',&lt;/span&gt;
              &lt;span class="s"&gt;'- [x] Security: no new auth, session, or credential handling introduced',&lt;/span&gt;
            &lt;span class="s"&gt;];&lt;/span&gt;
            &lt;span class="s"&gt;const allChecked = required.every(item =&amp;gt; body.includes(item));&lt;/span&gt;
            &lt;span class="s"&gt;if (!allChecked) {&lt;/span&gt;
              &lt;span class="s"&gt;core.setFailed(&lt;/span&gt;
                &lt;span class="s"&gt;'Agent-generated PR is missing the required human review checklist. ' +&lt;/span&gt;
                &lt;span class="s"&gt;'Add and complete each item in the PR description before merging.'&lt;/span&gt;
              &lt;span class="s"&gt;);&lt;/span&gt;
            &lt;span class="s"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Label agent-generated PRs with &lt;code&gt;agent-generated&lt;/code&gt; as part of your delegation step, and the gate activates on its own. To make a failure actually block merging, mark the job as a required status check in your branch protection rules. Reviewers then know what they're looking at and what they're responsible for confirming.&lt;/p&gt;

&lt;p&gt;The checklist items aren't arbitrary. "Scope" and "Intent" address the two most common agent failure modes. "Security" is there because the data says it should be.&lt;/p&gt;
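
&lt;p&gt;The gate above matches each checklist line as an exact string, so a box ticked with an uppercase X, or a bullet written with &lt;code&gt;*&lt;/code&gt;, would fail without telling the reviewer which item is missing. A minimal sketch of a more tolerant matcher follows, written in Python for local experimentation rather than as a drop-in for the workflow. &lt;code&gt;unchecked_items&lt;/code&gt; is a hypothetical helper; the item texts are the three from the workflow.&lt;/p&gt;

```python
import re

REQUIRED_ITEMS = [
    "Scope: changes are within the declared allowed_paths",
    "Intent: output matches the task definition-of-done",
    "Security: no new auth, session, or credential handling introduced",
]


def unchecked_items(pr_body, required=REQUIRED_ITEMS):
    """Return the required checklist items that are not ticked in the PR body."""
    missing = []
    for item in required:
        # Accept "- [x] item" or "* [X] item" on a line of its own.
        pattern = r"^\s*[-*]\s*\[[xX]\]\s*" + re.escape(item) + r"\s*$"
        if re.search(pattern, pr_body, flags=re.MULTILINE) is None:
            missing.append(item)
    return missing


if __name__ == "__main__":
    # Hypothetical PR description: one item ticked, one unticked, one absent.
    body = "\n".join([
        "## Review checklist",
        "- [X] Scope: changes are within the declared allowed_paths",
        "- [ ] Intent: output matches the task definition-of-done",
    ])
    for item in unchecked_items(body):
        print("Not confirmed:", item)
```

&lt;p&gt;Returning the missing items instead of a single boolean lets the failure message name exactly what the reviewer still has to confirm.&lt;/p&gt;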




&lt;h2&gt;Own: Log the Decision So the Next Session Has Evidence&lt;/h2&gt;

&lt;p&gt;The Delegate and Review steps protect you during the run. Own protects you after it.&lt;/p&gt;

&lt;p&gt;Once a reviewer approves an agent-executed task, record why. Compliance is a side benefit; the real reason is that the next agent session, or a developer who joins the team next quarter, has no context for a decision made in a prior conversation that no longer exists.&lt;/p&gt;

&lt;p&gt;A minimal decision log entry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# decision-log/T-204.yaml&lt;/span&gt;
&lt;span class="na"&gt;task_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;T-204&lt;/span&gt;
&lt;span class="na"&gt;approved_by&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@vuong"&lt;/span&gt;
&lt;span class="na"&gt;approved_at&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-06-14T09:32:00+10:00"&lt;/span&gt;
&lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Auth&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;module&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;refactored&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;token-validation&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;v2.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;All&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;existing&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tests&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pass."&lt;/span&gt;

&lt;span class="na"&gt;acceptance_check&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pnpm&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;src/auth&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(47&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tests,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;failures)"&lt;/span&gt;
&lt;span class="na"&gt;scope_confirmed&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;out_of_scope_violations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;none&lt;/span&gt;

&lt;span class="na"&gt;rollback_plan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Revert&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;commit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;abc123f&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;if&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;auth&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;failure&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;rate&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;exceeds&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;0.5%&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;first&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;24h"&lt;/span&gt;

&lt;span class="na"&gt;deferred&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;T-205&lt;/span&gt;
    &lt;span class="na"&gt;note&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Agent&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;proposed&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;removing&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;legacy&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;token&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cache&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;during&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;execution.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Deferred&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;security&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;review."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;deferred&lt;/code&gt; block is the part that compounds over time. When the agent proposes something outside scope, that proposal shouldn't vanish into a dismissed PR comment. Log it as a deferred item with an ID. The next session has a starting point rather than a blank slate.&lt;/p&gt;
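
&lt;p&gt;As a rough illustration of how deferred items can seed the next session, the Python sketch below gathers them from several decision logs and renders a short brief. It assumes each log has already been parsed into a dictionary (for instance with a YAML parser) and uses the field names from the entry above. &lt;code&gt;collect_deferred&lt;/code&gt; and &lt;code&gt;render_brief&lt;/code&gt; are hypothetical names, and the second log entry is made up for the example.&lt;/p&gt;

```python
def collect_deferred(decision_logs):
    """Flatten the `deferred` blocks of parsed decision logs into one list."""
    items = []
    for log in decision_logs:
        # A log may omit `deferred` or leave it empty.
        for entry in log.get("deferred") or []:
            items.append({
                "id": entry["id"],
                "from_task": log["task_id"],
                "note": entry["note"],
            })
    return items


def render_brief(items):
    """Render deferred items as plain text to place at the top of a task brief."""
    lines = ["Deferred items from earlier sessions:"]
    for item in items:
        lines.append(f"- {item['id']} (raised in {item['from_task']}): {item['note']}")
    return "\n".join(lines)


if __name__ == "__main__":
    logs = [
        {
            "task_id": "T-204",
            "deferred": [
                {
                    "id": "T-205",
                    "note": "Agent proposed removing legacy token cache during "
                            "execution. Deferred pending security review.",
                }
            ],
        },
        {"task_id": "T-198"},  # hypothetical log with nothing deferred
    ]
    print(render_brief(collect_deferred(logs)))
```

&lt;p&gt;Keeping the originating task ID next to each deferred item means the next session can trace a proposal back to the decision log that recorded it.&lt;/p&gt;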

&lt;p&gt;If your team uses a project board to manage agent work, these decision records belong there rather than scattered across PR threads. &lt;a href="https://agiflow.io/docs/features/workflows?utm_source=devto&amp;amp;utm_medium=referral" rel="noopener noreferrer"&gt;Agiflow models work units with status tracking, artifact storage, and workflow locks&lt;/a&gt; so decision logs have a stable, addressable home the agent can reference in subsequent runs. That's a useful pattern regardless of which board you use; the critical thing is that the record lives somewhere durable and findable, not in a conversation that expires.&lt;/p&gt;




&lt;h2&gt;What Changes When This Operating Model Runs at Scale&lt;/h2&gt;

&lt;p&gt;At one agent, one task, these three steps are easy to follow manually. They become more important, not less, when you're running multiple agents across multiple work units simultaneously.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.cio.com/article/4134741/how-agentic-ai-will-reshape-engineering-workflows-in-2026.html" rel="noopener noreferrer"&gt;CIO.com coverage of McKinsey's agentic AI research&lt;/a&gt; notes that organizations achieving 20 to 40 percent operating cost reductions from AI share one attribute: a deliberate orchestration layer with audit trails built in from the start. The article frames this as a correlation rather than proven causation, which is honest. But the direction is clear: coordination discipline is what makes the gains stick.&lt;/p&gt;

&lt;p&gt;The DORA finding I cited earlier is the plain version of the same point. AI amplifies what's already there. Strong teams with clear ownership and tight feedback loops get better. Teams with fuzzy handoffs and unclear mandates find those problems more expensive, not cheaper, to untangle.&lt;/p&gt;

&lt;p&gt;Delegate with an explicit scope. Review against that scope. Own a record of what changed and why, and hand that record to the next session.&lt;/p&gt;

&lt;p&gt;The loop is short. The discipline is the work.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>vibecoding</category>
      <category>management</category>
      <category>scrum</category>
    </item>
    <item>
      <title>Three Kinds of AI Context: Most Tools Only Solve One</title>
      <dc:creator>Vuong Ngo</dc:creator>
      <pubDate>Tue, 09 Jun 2026 13:02:08 +0000</pubDate>
      <link>https://dev.to/vuong_ngo/three-kinds-of-ai-context-most-tools-only-solve-one-3c7l</link>
      <guid>https://dev.to/vuong_ngo/three-kinds-of-ai-context-most-tools-only-solve-one-3c7l</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;AI context failure bundles three distinct problems: personal context (who you are), product-decision context (what the product should do), and local task persistence (what work is queued). Two new tools and one Anthropic feature each solve one layer. But the fourth layer — a shared, writable contract of the current open work item — is what none of them address, and it's why developers who've installed all three still feel stuck.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;You set up a &lt;code&gt;CLAUDE.md&lt;/code&gt;. Maybe you wrote memory files. In the week of 8 June 2026, two tools hit Product Hunt — one for personal context, one for product decisions — and you installed those too. You are still re-explaining the project at the start of every session.&lt;/p&gt;

&lt;p&gt;The problem is the diagnosis. "AI starts from scratch" treats one frustration as one cause. It isn't. It's at least three separate context failures that happen to produce the same symptom, and most tools solve exactly one of them.&lt;/p&gt;

&lt;p&gt;Here's the model I've landed on after watching this category for the past six months.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 1 — Who you are
&lt;/h2&gt;

&lt;p&gt;This is the most static layer. Your stack, your role, your preferences, how you like commit messages formatted, what framework you avoid. It changes roughly as often as your LinkedIn headline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.producthunt.com/products/unabyss" rel="noopener noreferrer"&gt;Unabyss&lt;/a&gt; hit #1 on Product Hunt on launch day with 755 upvotes for solving exactly this. The tagline is unambiguous: "Set it up once and never re-explain yourself to AI again." It pulls structured context from LinkedIn, Notion, and Gmail, then exposes it to any tool that speaks &lt;a href="https://modelcontextprotocol.io" rel="noopener noreferrer"&gt;MCP&lt;/a&gt;, with per-tool visibility controls.&lt;/p&gt;

&lt;p&gt;If this is your gap, you can fix it manually today. A &lt;code&gt;CLAUDE.md&lt;/code&gt; covering who you are is a perfectly working solution for a single assistant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## About me&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Stack: TypeScript, Node, PostgreSQL, React
&lt;span class="p"&gt;-&lt;/span&gt; Prefer functional React; no class components
&lt;span class="p"&gt;-&lt;/span&gt; Testing: Vitest + Testing Library; mock at service boundaries only
&lt;span class="p"&gt;-&lt;/span&gt; Commit style: conventional commits, imperative mood, no period
&lt;span class="p"&gt;-&lt;/span&gt; Time zone: AEST
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The limit is portability. If you run multiple agents, switch machines, or want consistent preferences across tools, file-per-tool doesn't hold. That's the gap Unabyss fills: one writable context store, readable by anything that asks.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 2 — What your product should do
&lt;/h2&gt;

&lt;p&gt;This layer moves slower than your task queue but faster than your identity. It's the architectural decisions already made, the approaches that were ruled out and why, the constraints that aren't visible in the code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.producthunt.com/products/brief-10" rel="noopener noreferrer"&gt;Brief&lt;/a&gt; launched this week and reached #5 with 253 upvotes. The problem statement is sharp: "AI agents can ship quickly, but without the right product context, they're often flying blind." Brief stores those decisions and serves relevant context to agents through chat, Slack, CLI, and MCP.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;CLAUDE.md&lt;/code&gt; can carry this layer too, but it gets unwieldy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Architecture decisions&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Auth: custom JWT + refresh token. Rejected Clerk (vendor lock-in concern, 2024-11).
  See: docs/ADRs/0003-auth-approach.md
&lt;span class="p"&gt;-&lt;/span&gt; DB: Postgres. MongoDB ruled out early — our query patterns are relational.
&lt;span class="p"&gt;-&lt;/span&gt; Background jobs: BullMQ. No migration to new runners without a spike.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At a certain point, keeping that file accurate is its own maintenance job. Tools like Brief try to automate the curation. Whether you use a tool or a disciplined ADR directory, the important thing is that this layer exists and stays current — because an assistant that doesn't know why the auth system looks the way it does will confidently propose changes you ruled out eight months ago.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 3 — What work is left (on this machine)
&lt;/h2&gt;

&lt;p&gt;On January 22, 2026, Anthropic shipped Claude Code Tasks — a &lt;a href="https://www.dplooy.com/blog/claude-code-tasks-complete-guide-to-ai-agent-workflow" rel="noopener noreferrer"&gt;persistent task system&lt;/a&gt; that survives session termination. Tasks live in &lt;code&gt;~/.claude/tasks/&lt;/code&gt; as JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"01JJ3QZWZ4R2XM6GBTF9V7Y8KP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Implement rate limiting on /api/v1/completions"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"in_progress"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"01JJ3QY..."&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"owner"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-01-24T08:12:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before Tasks, Claude Code stored todos in session memory. They disappeared when the terminal closed. Tasks fix this: create them once, and they persist across restarts, terminal crashes, and session resets. That's a genuine improvement over the status quo.&lt;/p&gt;

&lt;p&gt;The constraint is scope. Tasks are local: they live on one machine and don't synchronise across machines or agents. They also store only orchestration metadata (status, dependencies, owner), not the &lt;em&gt;content of the work&lt;/em&gt;: what "done" means, what the acceptance criteria are, and what artifacts prove the task is complete.&lt;/p&gt;
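
&lt;p&gt;As a rough illustration of what Layer 3 gives you, the sketch below scans a directory of task files and lists the ones still in progress. It assumes one JSON object per file with the fields from the example above; the real on-disk layout may differ, and &lt;code&gt;load_tasks&lt;/code&gt; and &lt;code&gt;in_progress_titles&lt;/code&gt; are hypothetical helper names, not part of Claude Code.&lt;/p&gt;

```python
import json
from pathlib import Path


def load_tasks(task_dir):
    """Load every *.json file in task_dir as one task record.

    Assumption: one JSON object per file, shaped like the example above.
    The real on-disk layout may differ.
    """
    tasks = []
    for path in sorted(Path(task_dir).expanduser().glob("*.json")):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed files
        if isinstance(record, dict):
            tasks.append(record)
    return tasks


def in_progress_titles(tasks):
    """Return the titles of tasks whose status is "in_progress"."""
    return [
        task.get("title", "(untitled)")
        for task in tasks
        if task.get("status") == "in_progress"
    ]
```

&lt;p&gt;Something like &lt;code&gt;in_progress_titles(load_tasks("~/.claude/tasks"))&lt;/code&gt; would then list the open work on this machine. Nothing in those records says what done means for any of it.&lt;/p&gt;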




&lt;h2&gt;
  
  
  The four layers, together
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What it answers&lt;/th&gt;
&lt;th&gt;Change frequency&lt;/th&gt;
&lt;th&gt;Example tools&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1 — Personal context&lt;/td&gt;
&lt;td&gt;Who you are, preferences, stack&lt;/td&gt;
&lt;td&gt;Rarely (months)&lt;/td&gt;
&lt;td&gt;Unabyss, CLAUDE.md&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2 — Product-decision context&lt;/td&gt;
&lt;td&gt;What should be built and why&lt;/td&gt;
&lt;td&gt;Occasionally (weeks)&lt;/td&gt;
&lt;td&gt;Brief, ADRs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3 — Local task persistence&lt;/td&gt;
&lt;td&gt;What work is queued on this machine&lt;/td&gt;
&lt;td&gt;Constantly (sessions)&lt;/td&gt;
&lt;td&gt;Claude Code Tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4 — Structured current-work context&lt;/td&gt;
&lt;td&gt;What is open, what done means, what proves it&lt;/td&gt;
&lt;td&gt;Constantly, shared&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The dash in that last column is where most developers who've installed layers 1–3 are still stuck.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb6y72bsjbcab6qzw1wtl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb6y72bsjbcab6qzw1wtl.png" alt="Quadrant diagram mapping the four layers of AI context by change frequency and scope. Layer 4 — structured current-work context — occupies the high-frequency, shared quadrant and is highlighted in orange as the gap most tools leave unfilled." width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The four layers plotted by how often they change and who can see them. Layers 1 and 2 sit in the slow-change rows; layers 3 and 4 are in constant flux. Most tools cover the left column. The top-right cell is the gap. (Author's model.)&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The AI Context Gap Nobody Names
&lt;/h2&gt;

&lt;p&gt;Walk through a real session. You open a new Claude Code instance. Layer 1 tells it you prefer TypeScript and conventional commits. Layer 2 tells it why the auth system looks the way it does. Layer 3 tells it there's a task called "Implement rate limiting" in progress.&lt;/p&gt;

&lt;p&gt;What it doesn't know: what done means for that task. What the acceptance criteria are. Whether there's a failing test waiting. Whether another agent already started the same work in a different worktree. Whether the spec changed since you queued the task.&lt;/p&gt;

&lt;p&gt;That information isn't in your &lt;code&gt;CLAUDE.md&lt;/code&gt;. It's not in your decisions log. It's not in the Tasks JSON. It's the &lt;em&gt;contract of the work&lt;/em&gt; — and it needs to live somewhere shared, writable, and structured. Not a file you write once and hope stays accurate.&lt;/p&gt;

&lt;p&gt;This is also what distinguishes Layer 4 from the others in a practical sense: the contract changes as the work progresses. An assistant needs to be able to &lt;em&gt;read&lt;/em&gt; it at session start and &lt;em&gt;write&lt;/em&gt; to it as evidence accumulates. Static files can't do that.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why a longer prompt doesn't close this gap
&lt;/h2&gt;

&lt;p&gt;The instinct is to paste more context into the system prompt or CLAUDE.md. It rarely helps, and there's a mechanical reason.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frqruetzmeql1u2sa5l9l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frqruetzmeql1u2sa5l9l.png" alt="Qualitative U-shaped curve showing model recall accuracy by document position in a long context: highest at start and end, lowest in the middle, with the middle position highlighted as the worst retrieval point." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Model recall by position in a long context window. Acceptance criteria buried in paragraph 12 of a CLAUDE.md face the worst retrieval odds. Based on &lt;a href="https://arxiv.org/abs/2307.03172" rel="noopener noreferrer"&gt;Liu et al. 2023&lt;/a&gt;. Y-axis values are qualitative only.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arxiv.org/abs/2307.03172" rel="noopener noreferrer"&gt;A 2023 study on how language models use long contexts&lt;/a&gt;, "Lost in the Middle", showed that models retrieve information reliably from the start and end of long inputs but degrade badly for content in the middle. The longer the input grows, the more of your carefully written CLAUDE.md sits in that low-recall middle.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents" rel="noopener noreferrer"&gt;Anthropic's context engineering guide&lt;/a&gt; for agents says it directly: "context is a critical but finite resource." The guidance is to treat it as something you curate and structure, not something you dump in bulk.&lt;/p&gt;

&lt;p&gt;For Layer 4, the implication is concrete. If the acceptance criteria for a task are buried in paragraph 12 of a 600-line memory file, the assistant is not reliably reading them. They need to be in a distinct, retrievable record — something the assistant fetches on demand rather than scans.&lt;/p&gt;




&lt;h2&gt;
  
  
  What structured current-work context actually looks like
&lt;/h2&gt;

&lt;p&gt;Here's the shape of the missing piece. This isn't a vendor-specific format — it's what a work item record needs to carry to be genuinely useful to an AI assistant at session start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"wu_01JJ3R"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Rate limiting on /api/v1/completions"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"active"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"acceptanceCriteria"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Returns 429 with Retry-After header when limit exceeded"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Limit is configurable per API key, not global"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Integration test covers the 429 path with a real Redis instance"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"artifacts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"spec"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"label"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Rate limit spec"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"linkedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-06-08T09:00:00Z"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test-result"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"label"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Failing test run (pre-fix)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"linkedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-06-09T11:43:00Z"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compare that to the Layer 3 task record. Layer 3 tells the agent that a task exists, who owns it, and whether it's in progress. Layer 4 tells it what the task &lt;em&gt;means&lt;/em&gt; — the criteria that will constitute evidence of completion, and the evidence that already exists.&lt;/p&gt;
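
&lt;p&gt;One way to use such a record at session start is to render it into a short brief that puts the criteria first, rather than leaving them buried in a long file. The sketch below assumes the field names from the example record above; &lt;code&gt;render_brief&lt;/code&gt; is a hypothetical helper, not part of any tool mentioned here.&lt;/p&gt;

```python
def render_brief(work_unit):
    """Turn a work-unit record into a short session-start brief.

    Assumes the field names from the example record above
    (title, status, acceptanceCriteria, artifacts).
    """
    lines = [
        f"Task: {work_unit['title']} [{work_unit['status']}]",
        "Done means:",
    ]
    for criterion in work_unit.get("acceptanceCriteria", []):
        lines.append(f"  - {criterion}")
    artifacts = work_unit.get("artifacts", [])
    if artifacts:
        lines.append("Evidence so far:")
        for artifact in artifacts:
            lines.append(f"  - {artifact['type']}: {artifact['label']}")
    return "\n".join(lines)
```

&lt;p&gt;The brief stays a few lines long however large the rest of the context grows, which keeps the criteria near the start of the prompt.&lt;/p&gt;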

&lt;p&gt;Wiring this up over MCP looks like any other context server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"project"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@your-tool/project-mcp@latest"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"PROJECT_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-key"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this configured, the assistant can call a tool such as &lt;code&gt;get_work_unit&lt;/code&gt; (the exact name depends on the server you wire in) at the start of every session and receive the full record, fetched fresh: criteria, artifacts, and status. That beats reading a static file that may have drifted.&lt;/p&gt;

&lt;p&gt;For a detailed breakdown of how this plays out across multiple tasks and agents, &lt;a href="https://agiflow.io/blog/coordinating-multi-task-ai-workflows-with-work-units?utm_source=devto&amp;amp;utm_medium=referral" rel="noopener noreferrer"&gt;Agiflow's write-up on coordinating multi-task workflows with work units&lt;/a&gt; covers the model in practice.&lt;/p&gt;




&lt;h2&gt;
  
  
  Which layer is actually your problem
&lt;/h2&gt;

&lt;p&gt;You probably don't need all four fixed today.&lt;/p&gt;

&lt;p&gt;If the assistant keeps asking about your stack or commit style: Layer 1. A good &lt;code&gt;CLAUDE.md&lt;/code&gt; or Unabyss solves it in an afternoon.&lt;/p&gt;

&lt;p&gt;If it makes decisions that contradict past architecture choices: Layer 2. Start writing ADRs, or try Brief.&lt;/p&gt;

&lt;p&gt;If it loses track of what it was doing when the session ends: Layer 3. Claude Code Tasks has already shipped, and it's free and local.&lt;/p&gt;

&lt;p&gt;If it knows what work is queued but not what done means, drifts off the spec mid-session, or can't pick up where another agent left off: that's Layer 4. No static file solution handles it cleanly. You need a writable, shared, structured source of truth for the current work contract.&lt;/p&gt;

&lt;p&gt;Most developers I've seen hit Layer 4 and diagnose it as Layer 3. They add more to CLAUDE.md, the agent still drifts, and the conclusion is "AI just isn't reliable enough yet." Sometimes that's true. More often, the right AI context structure was never there to begin with.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>MCP Server for Task Tracking: What the MCP Tasks Extension Specifies in 2026</title>
      <dc:creator>Vuong Ngo</dc:creator>
      <pubDate>Tue, 02 Jun 2026 15:06:01 +0000</pubDate>
      <link>https://dev.to/vuong_ngo/mcp-server-for-task-tracking-what-the-mcp-tasks-extension-specifies-in-2026-1f8j</link>
      <guid>https://dev.to/vuong_ngo/mcp-server-for-task-tracking-what-the-mcp-tasks-extension-specifies-in-2026-1f8j</guid>
      <description>&lt;p&gt;If you are building an MCP server for task tracking, you eventually hit the same wall: the work outlives the connection. As of June 2026, that is exactly the gap MCP Tasks is trying to close. The important part is not that it exists — it is how far the protocol has actually stabilized, what still looks experimental, and where application-layer task boards already solved the same shape in a different way.&lt;/p&gt;

&lt;p&gt;This post is a skeptical read for developers and technical architects. We will walk through the current MCP Tasks semantics, look at the state machine, compare the extension to A2A, and end with a minimal application-layer example of durable, agent-readable task state. Agiflow appears only as one concrete example of a board that exposes scoped task state to agents, not as a recommendation or a proof of spec maturity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this exists at all
&lt;/h2&gt;

&lt;p&gt;The core problem is simple: long-running work does not fit cleanly into a request/response cycle. If the model needs to wait for a batch job, a human approval, or a slow external API, blocking a connection is fragile and hard to resume.&lt;/p&gt;

&lt;p&gt;The current MCP roadmap makes that explicit. The &lt;a href="https://blog.modelcontextprotocol.io/posts/2026-mcp-roadmap/" rel="noopener noreferrer"&gt;2026 MCP roadmap&lt;/a&gt; (captured 2026-06-02) still treats task lifecycle edge cases as active protocol work, and the &lt;a href="https://modelcontextprotocol.io/extensions/tasks/overview" rel="noopener noreferrer"&gt;Tasks overview&lt;/a&gt; (captured 2026-06-02) describes a durable handle, not a streaming socket replacement.&lt;/p&gt;

&lt;p&gt;That distinction matters. A task ID is a state handle. It is not a chat transcript. It is not a job queue. It is not a promise that every client and server will agree on the same retry behavior next quarter.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shape of MCP Tasks
&lt;/h2&gt;

&lt;p&gt;The protocol model is a durable task with a small state machine. The useful bit for builders is that the shape is explicit:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Surface&lt;/th&gt;
&lt;th&gt;What looks stable&lt;/th&gt;
&lt;th&gt;What still moves&lt;/th&gt;
&lt;th&gt;Practical takeaway&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Task identity&lt;/td&gt;
&lt;td&gt;Durable task ID&lt;/td&gt;
&lt;td&gt;Retention policies&lt;/td&gt;
&lt;td&gt;Persist the ID and expect resume/poll flows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;State machine&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;working&lt;/code&gt;, &lt;code&gt;input_required&lt;/code&gt;, &lt;code&gt;completed&lt;/code&gt;, &lt;code&gt;failed&lt;/code&gt;, &lt;code&gt;cancelled&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Transition edge cases&lt;/td&gt;
&lt;td&gt;Design for clear terminal states&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mid-flight input&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;tasks/update&lt;/code&gt; for outstanding input requests&lt;/td&gt;
&lt;td&gt;Client UX details&lt;/td&gt;
&lt;td&gt;Treat human approval as a first-class path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transport behavior&lt;/td&gt;
&lt;td&gt;Polling works everywhere&lt;/td&gt;
&lt;td&gt;Push support varies&lt;/td&gt;
&lt;td&gt;Poll first, subscribe second&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lifecycle policy&lt;/td&gt;
&lt;td&gt;Terminal states are terminal&lt;/td&gt;
&lt;td&gt;Retry and expiry semantics are still evolving&lt;/td&gt;
&lt;td&gt;Keep your implementation conservative&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnm0c51a57ibk875aqrxj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnm0c51a57ibk875aqrxj.png" alt="Diagram of MCP task lifecycle showing working and input_required as non-terminal states connected by bidirectional arrows, with three terminal states — completed, failed, and cancelled — fanning out to the right." width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;MCP Tasks state machine: &lt;code&gt;working&lt;/code&gt; and &lt;code&gt;input_required&lt;/code&gt; are non-terminal; &lt;code&gt;completed&lt;/code&gt;, &lt;code&gt;failed&lt;/code&gt;, and &lt;code&gt;cancelled&lt;/code&gt; are terminal. Source: &lt;a href="https://modelcontextprotocol.io/extensions/tasks/overview" rel="noopener noreferrer"&gt;MCP Tasks extension overview&lt;/a&gt; (captured 2026-06-02).&lt;/em&gt;&lt;/p&gt;
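
&lt;p&gt;The state machine is small enough to encode as a guard in your own server. This is a minimal sketch, not spec text: it assumes that either non-terminal state can move to the other non-terminal state or to any terminal state, and that terminal states never change. &lt;code&gt;can_transition&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
NON_TERMINAL = {"working", "input_required"}
TERMINAL = {"completed", "failed", "cancelled"}


def can_transition(current, target):
    """Return True if moving from current to target is allowed.

    Assumption: a non-terminal state may move to the other non-terminal
    state or to any terminal state; terminal states never change.
    """
    if current in TERMINAL:
        return False
    if current not in NON_TERMINAL:
        raise ValueError(f"unknown state: {current}")
    return target != current and target in (NON_TERMINAL | TERMINAL)
```

&lt;p&gt;Rejecting every transition out of a terminal state in one place is the conservative choice while retry and expiry semantics are still moving.&lt;/p&gt;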

&lt;p&gt;The easiest way to see the negotiation is to look at the client and server capabilities side by side.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"_meta"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"io.modelcontextprotocol/clientCapabilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"extensions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"io.modelcontextprotocol/tasks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"capabilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"extensions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"io.modelcontextprotocol/tasks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If both sides opt in, the server can return a &lt;code&gt;CreateTaskResult&lt;/code&gt; instead of a synchronous result.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"resultType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"task"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"task"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"taskId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"task_01JQ2Z8XKQ7G6Q5X5ZK1N7T9A2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"working"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ttlMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;600000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pollIntervalMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ttlMs&lt;/code&gt; and &lt;code&gt;pollIntervalMs&lt;/code&gt; fields are not cosmetic. They tell the client how long the task can reasonably be resumed and how often to ask for an update.&lt;/p&gt;

&lt;h2&gt;
  
  
  Polling, completion, and input_required
&lt;/h2&gt;

&lt;p&gt;Once a task exists, the client follows a straightforward loop: poll until terminal, or respond if the server pauses for input.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"taskId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"task_01JQ2Z8XKQ7G6Q5X5ZK1N7T9A2"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"taskId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"task_01JQ2Z8XKQ7G6Q5X5ZK1N7T9A2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"completed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Build finished successfully."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"artifacts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"dist/app.js"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"dist/app.js.map"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
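
&lt;p&gt;The loop described above can be sketched client-side as follows. It treats &lt;code&gt;ttlMs&lt;/code&gt; as a polling deadline and &lt;code&gt;pollIntervalMs&lt;/code&gt; as the wait between polls. &lt;code&gt;get_task&lt;/code&gt; is a hypothetical callable standing in for whatever request your MCP client actually sends, and the timeout behaviour is an assumption rather than something the extension prescribes.&lt;/p&gt;

```python
import time

TERMINAL = {"completed", "failed", "cancelled"}


def poll_until_settled(get_task, task_id, ttl_ms, poll_interval_ms,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the task is terminal or needs input, or until ttl_ms elapses.

    get_task is a hypothetical callable (task_id -> dict with a "status" key)
    standing in for the real request your client sends.
    """
    deadline = clock() + ttl_ms / 1000
    while True:
        task = get_task(task_id)
        status = task.get("status")
        if status in TERMINAL or status == "input_required":
            return task  # caller handles the result or the input request
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {ttl_ms} ms")
        sleep(poll_interval_ms / 1000)
```

&lt;p&gt;Returning on &lt;code&gt;input_required&lt;/code&gt; instead of continuing to poll hands the decision back to the caller, which can collect the answer and resume.&lt;/p&gt;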



&lt;p&gt;If the work needs a decision, the protocol explicitly stops pretending it can continue alone.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"taskId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"task_01JQ2Z8XKQ7G6Q5X5ZK1N7T9A2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"input_required"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"inputRequests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"approve_release"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"confirmation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Approve deployment to staging?"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The client then resumes the task by answering under the same request key:&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"taskId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"task_01JQ2Z8XKQ7G6Q5X5ZK1N7T9A2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"inputResponses"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"approve_release"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confirmed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the most useful mental model here: task state is not an implementation detail. It is a protocol surface for deferred work.&lt;/p&gt;

&lt;p&gt;The repository itself reinforces the caution. The &lt;a href="https://github.com/modelcontextprotocol/experimental-ext-tasks" rel="noopener noreferrer"&gt;experimental Tasks spec repo&lt;/a&gt; (captured 2026-06-02) labels the extension experimental and warns that it may change or disappear. That does not make it unusable, but it does mean you should treat it as an evolving contract.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is stable for MCP task tracking in production
&lt;/h2&gt;

&lt;p&gt;My read is conservative:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Safe to build on: the existence of a durable task handle, explicit task states, polling, and cooperative cancellation.&lt;/li&gt;
&lt;li&gt;Use with caution: exact retry behavior, retention/expiry policy, and any client-specific push notification behavior.&lt;/li&gt;
&lt;li&gt;Do not over-interpret: the fact that a task exists does not mean every agent workflow should become a task.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The roadmap backs that up. In the &lt;a href="https://blog.modelcontextprotocol.io/posts/2026-mcp-roadmap/" rel="noopener noreferrer"&gt;2026 MCP roadmap&lt;/a&gt; (captured 2026-06-02), retry semantics and result-retention policy are still called out as open gaps. That is a clue that the shape is real, but the surrounding policy is still settling.&lt;/p&gt;

&lt;p&gt;There is also market signal, but it is still only market signal. &lt;a href="https://siliconangle.com/2026/02/12/manufact-raises-6-3m-help-developers-connect-ai-agents-model-context-protocol/" rel="noopener noreferrer"&gt;SiliconANGLE's February 12, 2026 report on Manufact&lt;/a&gt; (captured 2026-06-02) shows money flowing into MCP infrastructure. That says the category is getting real attention. It does not prove the protocol has finished evolving.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP Tasks is not the only answer
&lt;/h2&gt;

&lt;p&gt;If your use case is agent-to-agent collaboration rather than deferred execution, you should also look at A2A. The &lt;a href="https://a2a-protocol.org/latest/specification/" rel="noopener noreferrer"&gt;A2A specification&lt;/a&gt; (captured 2026-06-02) focuses on inter-agent communication, capability discovery, and collaborative tasks with its own task lifecycle and message model.&lt;/p&gt;

&lt;p&gt;That makes the comparison easier:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Topic&lt;/th&gt;
&lt;th&gt;MCP Tasks&lt;/th&gt;
&lt;th&gt;A2A&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Primary problem&lt;/td&gt;
&lt;td&gt;Deferred execution inside MCP requests&lt;/td&gt;
&lt;td&gt;Coordination between independent agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Core primitive&lt;/td&gt;
&lt;td&gt;Durable task handle&lt;/td&gt;
&lt;td&gt;Agent-to-agent session/task exchange&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best fit&lt;/td&gt;
&lt;td&gt;Slow tools, approvals, batch work, resumable jobs&lt;/td&gt;
&lt;td&gt;Multi-agent workflows and interoperability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Failure mode&lt;/td&gt;
&lt;td&gt;Overfitting every long job into one protocol&lt;/td&gt;
&lt;td&gt;Using a coordination protocol when you only need deferred tool execution&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In other words, use the simplest layer that matches the problem. MCP Tasks is a good fit when the work is still fundamentally one request that needs to finish later. A2A is a better fit when the work is really a conversation between agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  A minimal application-layer analogue
&lt;/h2&gt;

&lt;p&gt;This is the part that often gets conflated with the protocol extension. A board can expose the same shape without implementing MCP Tasks itself: stable ID, readable status, and a write path that updates state as work progresses.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3wc0hhl27b7632itvur.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3wc0hhl27b7632itvur.png" alt="Three-column diagram comparing MCP Tasks protocol layer on the left and an application-layer board task on the right, with arrows pointing toward a centre box labelled Shared Structural Pattern listing stable durable ID, explicit status state machine, pollable readable state, and deferred async execution model." width="799" height="329"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Two independent primitives that converge on the same structural shape: stable durable ID, explicit status state machine, pollable/readable state, and a deferred execution model.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That is close to what Agiflow's &lt;a href="https://agiflow.io/docs/connecting-ai-tools?utm_source=devto&amp;amp;utm_medium=referral" rel="noopener noreferrer"&gt;connection docs&lt;/a&gt; show at the application layer: a scoped task endpoint that lets an assistant work against durable board state. The important distinction is still the same one from above. That is an application-level model, not proof that the product implements the MCP Tasks extension.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;TaskStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;working&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;input_required&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;completed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;failed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;TaskRecord&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;updatedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;notes&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;TaskRecord&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;task_123&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;task_123&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Review deployment checklist&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;working&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;updatedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;readTaskState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;TaskRecord&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;writeTaskState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Partial&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Pick&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;TaskRecord&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;status&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;notes&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;TaskRecord&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Unknown task: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;TaskRecord&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;updatedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;before&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;readTaskState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;task_123&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;before&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;working&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;writeTaskState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;task_123&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;input_required&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;notes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Need approval for the release window.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the same shape MCP Tasks is formalizing, minus the protocol mechanics. You can swap the in-memory &lt;code&gt;Map&lt;/code&gt; for a database row, a Durable Object, a project board record, or an external job handle. The pattern stays the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I would do in practice
&lt;/h2&gt;

&lt;p&gt;My rule of thumb is boring on purpose:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use core MCP for synchronous tool calls.&lt;/li&gt;
&lt;li&gt;Use MCP Tasks when the work is truly deferred and the client can resume later.&lt;/li&gt;
&lt;li&gt;Keep the state machine simple and terminal-state driven.&lt;/li&gt;
&lt;li&gt;Treat push notifications as an optimization, not a requirement.&lt;/li&gt;
&lt;li&gt;Reach for A2A when the problem is really agent coordination, not deferred execution.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That gets you most of the value without pretending the extension is more mature than it is.&lt;/p&gt;
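
&lt;p&gt;One way to read "simple and terminal-state driven" is a transition guard that refuses any change once a task has reached a terminal state. The sketch below is an assumption about how you might enforce that in application code. The status names mirror the &lt;code&gt;TaskStatus&lt;/code&gt; union used earlier in this article; &lt;code&gt;isTerminal&lt;/code&gt; and &lt;code&gt;canTransition&lt;/code&gt; are hypothetical helpers, not part of the MCP Tasks extension.&lt;/p&gt;

```typescript
// Status names follow the TaskStatus union used earlier in this article.
type TaskStatus = "working" | "input_required" | "completed" | "failed" | "cancelled";

// Assumption: these three states are final and never change again.
const TERMINAL_STATES: readonly TaskStatus[] = ["completed", "failed", "cancelled"];

// Hypothetical helper: true when the task can no longer make progress.
function isTerminal(status: TaskStatus): boolean {
  return TERMINAL_STATES.indexOf(status) !== -1;
}

// Hypothetical helper: a live task may move to any other state,
// a terminal task may not move at all.
function canTransition(from: TaskStatus, to: TaskStatus): boolean {
  if (isTerminal(from)) return false;
  return from !== to;
}

const resumable = canTransition("input_required", "working");
const reopened = canTransition("completed", "working");
```

&lt;p&gt;A write path could call &lt;code&gt;canTransition&lt;/code&gt; before persisting a status change, so that a late or duplicate update cannot reopen a finished task.&lt;/p&gt;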

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;The useful headline is not "MCP Tasks is done." The useful headline is that MCP server task tracking now has a credible protocol shape: durable, resumable, long-running work that survives disconnects. That is a real step forward, but it is still an evolving surface area, so the safest implementation stance is conservative.&lt;/p&gt;

&lt;p&gt;If you want to see how a real product scopes assistants to board state and task context, start with &lt;a href="https://agiflow.io/docs/connecting-ai-tools?utm_source=devto&amp;amp;utm_medium=referral" rel="noopener noreferrer"&gt;Agiflow's connection docs&lt;/a&gt;. It is a good application-layer contrast to the protocol story here, and it makes the boundary between "board state" and "protocol task" much easier to see.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>MCP, Code, or Commands? A Decision Framework for AI Tool Integration</title>
      <dc:creator>Vuong Ngo</dc:creator>
      <pubDate>Sun, 07 Dec 2025 05:16:36 +0000</pubDate>
      <link>https://dev.to/vuong_ngo/mcp-code-or-commands-a-decision-framework-for-ai-tool-integration-2f30</link>
      <guid>https://dev.to/vuong_ngo/mcp-code-or-commands-a-decision-framework-for-ai-tool-integration-2f30</guid>
      <description>&lt;p&gt;When you build AI-assisted development workflows, the documentation explains &lt;em&gt;what&lt;/em&gt; each approach does, but not the real cost implications or when to use which.&lt;/p&gt;

&lt;p&gt;I instrumented network traffic and ran controlled experiments across five approaches using identical tasks: same 500-row dataset, same analysis requirements, same model (Claude Sonnet). The results revealed that &lt;strong&gt;architecture matters more than protocol choice&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;MCP Optimized consumed 60,420 tokens. MCP Vanilla consumed 309,053 tokens. Same protocol. Same task. &lt;strong&gt;5x difference&lt;/strong&gt;—driven entirely by one decision: file-path references vs. data-array parameters.&lt;/p&gt;

&lt;p&gt;This article provides a decision framework based on measured data, not marketing claims.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Decision Framework
&lt;/h2&gt;

&lt;p&gt;Before diving into data, here's the framework I developed from these experiments:&lt;/p&gt;

&lt;h3&gt;
  
  
  Quick Decision Guide
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;If your situation is...&lt;/th&gt;
&lt;th&gt;Use this approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Repeating task (&amp;gt;20 executions), large datasets, need predictable costs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MCP Optimized&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;One-off exploration, evolving requirements, prototyping&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Code-Driven (Skills)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User must control when it runs, deterministic behavior needed&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Slash Commands&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production system with security requirements&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;MCP Optimized&lt;/strong&gt; (never Skills)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Decision Flowchart
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Q1: One-off task (&amp;lt; 5 executions)?
    YES → Code-Driven or direct prompting
    NO  → Continue

Q2: Dataset &amp;gt; 100 rows AND need &amp;lt; 5% cost variance?
    YES → MCP Optimized
    NO  → Continue

Q3: User needs explicit control over invocation?
    YES → Slash Commands
    NO  → Continue

Q4: Execution count &amp;gt; 20 AND requirements stable?
    YES → MCP Optimized
    NO  → Code-Driven (prototype, then migrate)

NEVER:
  - MCP Vanilla for production (always suboptimal)
  - Skills for multi-user or sensitive systems
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
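
&lt;p&gt;The flowchart can also be read as a small function. This is only a sketch of the four questions above: the &lt;code&gt;Situation&lt;/code&gt; field names and &lt;code&gt;chooseApproach&lt;/code&gt; are illustrative assumptions, and the thresholds are copied from the flowchart rather than derived independently.&lt;/p&gt;

```typescript
type Approach = "Code-Driven" | "MCP Optimized" | "Slash Commands";

// Field names are illustrative assumptions that map onto Q1-Q4 of the flowchart.
type Situation = {
  executions: number;              // expected number of runs
  datasetRows: number;             // size of the dataset being processed
  needsLowCostVariance: boolean;   // the "under 5% cost variance" requirement in Q2
  userControlsInvocation: boolean; // Q3: the user decides when it runs
  requirementsStable: boolean;     // Q4: requirements are no longer changing
};

function chooseApproach(s: Situation): Approach {
  // Q1: one-off task (fewer than 5 executions). The flowchart also allows
  // direct prompting here; this sketch returns Code-Driven for simplicity.
  if (5 > s.executions) return "Code-Driven";

  // Q2: more than 100 rows and a tight cost-variance requirement.
  if (s.datasetRows > 100) {
    if (s.needsLowCostVariance) return "MCP Optimized";
  }

  // Q3: the user needs explicit control over invocation.
  if (s.userControlsInvocation) return "Slash Commands";

  // Q4: more than 20 executions and stable requirements.
  if (s.executions > 20) {
    if (s.requirementsStable) return "MCP Optimized";
  }

  // Otherwise prototype with code first, then migrate.
  return "Code-Driven";
}

const nightlyReport = chooseApproach({
  executions: 200,
  datasetRows: 500,
  needsLowCostVariance: true,
  userControlsInvocation: false,
  requirementsStable: true,
});
```

&lt;p&gt;The order of the checks matters: as in the flowchart, the first question that answers YES wins.&lt;/p&gt;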






&lt;h2&gt;
  
  
  The Three Approaches Explained
&lt;/h2&gt;

&lt;h3&gt;
  
  
  MCP (Model Context Protocol)
&lt;/h3&gt;

&lt;p&gt;A structured protocol for AI-tool communication. The model calls tools with JSON parameters; the server executes them and returns structured results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// MCP tool call - structured, typed, validated&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;analyze_csv_file&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/data/employees.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;analysis_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;salary_by_department&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Characteristics:&lt;/strong&gt; Structured I/O, access-controlled, model-decided invocation, reusable across applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Critical distinction:&lt;/strong&gt; There's a 5x token difference between &lt;em&gt;vanilla&lt;/em&gt; MCP (passing data directly) and &lt;em&gt;optimized&lt;/em&gt; MCP (passing file references). Same protocol, vastly different economics.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code-Driven (Skills &amp;amp; Code Generation)
&lt;/h3&gt;

&lt;p&gt;The model writes and executes code to accomplish tasks. Claude Code's "skills" feature lets the model invoke capabilities based on semantic matching.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Claude writes this, executes it, iterates
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/data/employees.csv&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;department&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;salary&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Characteristics:&lt;/strong&gt; Maximum flexibility, unstructured I/O, higher variance between runs, requires sandboxing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Slash Commands
&lt;/h3&gt;

&lt;p&gt;Pure string substitution. You type &lt;code&gt;/review @file.js&lt;/code&gt;, the command template expands, and the result is injected into your message.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- .claude/commands/review.md --&amp;gt;&lt;/span&gt;
Review the following file for security vulnerabilities,
performance issues, and code quality:

{file_content}

Focus on: authentication, input validation, error handling.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Characteristics:&lt;/strong&gt; User-explicit, deterministic, single-turn, zero tool-call overhead.&lt;/p&gt;
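
&lt;p&gt;To make "pure string substitution" concrete, here is a minimal sketch of template expansion. It is not how any particular tool implements slash commands: &lt;code&gt;expandTemplate&lt;/code&gt; is a hypothetical helper, and the &lt;code&gt;{file_content}&lt;/code&gt; placeholder simply follows the template shown above.&lt;/p&gt;

```typescript
// Hypothetical helper: replace each {name} placeholder with a supplied value.
// Placeholders without a matching value are left untouched.
function expandTemplate(template: string, values: { [name: string]: string }): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(values, name) ? values[name] : match,
  );
}

// A shortened version of the review template, for illustration only.
const template = "Review the following file:\n\n{file_content}\n\nFocus on: error handling.";

// The expanded text is what would be sent to the model as a single message.
const prompt = expandTemplate(template, { file_content: "export const x = 1;" });
```

&lt;p&gt;No model call is involved in the expansion itself, which is why the behavior is deterministic and adds no tool-call overhead.&lt;/p&gt;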




&lt;h2&gt;
  
  
  Measured Data: What the Numbers Show
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Methodology
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Same workload: load 500-row CSV, perform grouping, summary stats, two plots&lt;/li&gt;
&lt;li&gt;Same model: Claude Sonnet, default settings&lt;/li&gt;
&lt;li&gt;3-4 runs per approach with logged request/response payloads&lt;/li&gt;
&lt;li&gt;Costs calculated at current Claude Sonnet pricing&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Token Consumption
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpuh6gfa2xmfhmb8c2ee.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpuh6gfa2xmfhmb8c2ee.png" alt=" " width="800" height="690"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Token consumption per API request. MCP Optimized achieves consistently low usage through file-path architecture.&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Avg tokens/run&lt;/th&gt;
&lt;th&gt;vs Baseline&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP Optimized&lt;/td&gt;
&lt;td&gt;60,420&lt;/td&gt;
&lt;td&gt;-55%&lt;/td&gt;
&lt;td&gt;File-path parameters; zero data duplication&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Proxy (warm)&lt;/td&gt;
&lt;td&gt;81,415&lt;/td&gt;
&lt;td&gt;-39%&lt;/td&gt;
&lt;td&gt;Shared context + warm cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code-Skill (baseline)&lt;/td&gt;
&lt;td&gt;133,006&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Model-written Python; nothing cached&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UTCP Code-Mode&lt;/td&gt;
&lt;td&gt;204,011&lt;/td&gt;
&lt;td&gt;+53%&lt;/td&gt;
&lt;td&gt;Extra prompt framing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Vanilla&lt;/td&gt;
&lt;td&gt;309,053&lt;/td&gt;
&lt;td&gt;+132%&lt;/td&gt;
&lt;td&gt;JSON-serialized data in every call&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Cost at Scale
&lt;/h3&gt;

&lt;p&gt;At 1,000 monthly executions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Per Execution&lt;/th&gt;
&lt;th&gt;Monthly&lt;/th&gt;
&lt;th&gt;Annual&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP Optimized&lt;/td&gt;
&lt;td&gt;$0.21&lt;/td&gt;
&lt;td&gt;$210&lt;/td&gt;
&lt;td&gt;$2,520&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code-Skill&lt;/td&gt;
&lt;td&gt;$0.44&lt;/td&gt;
&lt;td&gt;$440&lt;/td&gt;
&lt;td&gt;$5,280&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Vanilla&lt;/td&gt;
&lt;td&gt;$0.99&lt;/td&gt;
&lt;td&gt;$990&lt;/td&gt;
&lt;td&gt;$11,880&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;$9,360 annual difference&lt;/strong&gt; between optimized and vanilla MCP for a single workflow.&lt;/p&gt;
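
&lt;p&gt;The arithmetic behind that figure is easy to reproduce. The sketch below takes the per-execution costs from the table as given and derives the annual totals at 1,000 executions per month; &lt;code&gt;annualCostUsd&lt;/code&gt; is a hypothetical helper written for this example.&lt;/p&gt;

```typescript
// Per-execution costs in USD, copied from the table above.
const perExecutionUsd = {
  mcpOptimized: 0.21,
  codeSkill: 0.44,
  mcpVanilla: 0.99,
};

// Hypothetical helper: annual cost for a fixed number of monthly executions.
// Works in integer cents to avoid floating-point drift.
function annualCostUsd(perExecution: number, monthlyExecutions: number): number {
  const cents = Math.round(perExecution * 100);
  return (cents * monthlyExecutions * 12) / 100;
}

const optimizedAnnual = annualCostUsd(perExecutionUsd.mcpOptimized, 1000); // 2520
const vanillaAnnual = annualCostUsd(perExecutionUsd.mcpVanilla, 1000);     // 11880
const annualDifference = vanillaAnnual - optimizedAnnual;                  // 9360
```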

&lt;h3&gt;
  
  
  Scalability
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl22iecr7grovi6buwngb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl22iecr7grovi6buwngb.png" alt=" " width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Cumulative token consumption. MCP Optimized maintains low growth; vanilla approaches accumulate steeply.&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Scaling Factor&lt;/th&gt;
&lt;th&gt;10K Row Projection&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP Optimized&lt;/td&gt;
&lt;td&gt;1.5x&lt;/td&gt;
&lt;td&gt;~65K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code-Skill&lt;/td&gt;
&lt;td&gt;1.1-1.6x&lt;/td&gt;
&lt;td&gt;~150-220K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Vanilla&lt;/td&gt;
&lt;td&gt;2.0-2.9x&lt;/td&gt;
&lt;td&gt;~500-800K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;MCP Optimized exhibits &lt;strong&gt;sub-linear scaling&lt;/strong&gt; because a file path costs the same tokens regardless of file size. MCP Vanilla's token count &lt;strong&gt;grows with the dataset&lt;/strong&gt; because every additional row has to be JSON-serialized into the call.&lt;/p&gt;

&lt;h3&gt;
  
  
  Variance
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Coefficient of Variation&lt;/th&gt;
&lt;th&gt;Consistency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP Optimized&lt;/td&gt;
&lt;td&gt;0.6%&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Proxy (warm)&lt;/td&gt;
&lt;td&gt;0.5%&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code-Skill&lt;/td&gt;
&lt;td&gt;18.7%&lt;/td&gt;
&lt;td&gt;Poor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Vanilla&lt;/td&gt;
&lt;td&gt;21.2%&lt;/td&gt;
&lt;td&gt;Poor&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;MCP Optimized hit 60,307, 60,144, and 60,808 tokens across three runs. Code-Skill ranged from 108K to 158K. High variance breaks capacity planning and makes cost prediction unreliable.&lt;/p&gt;
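
&lt;p&gt;For reference, here is a minimal sketch of how a coefficient of variation can be computed from run totals. It assumes the sample standard deviation (n - 1 denominator); the article does not say which estimator was used, but this one is consistent with the 0.6% figure for the three MCP Optimized runs.&lt;/p&gt;

```typescript
// Sample coefficient of variation: standard deviation (n - 1 denominator) divided by the mean.
function coefficientOfVariation(samples: number[]): number {
  const mean = samples.reduce((sum, x) => sum + x, 0) / samples.length;
  const squaredDeviations = samples.map((x) => (x - mean) * (x - mean));
  const variance = squaredDeviations.reduce((sum, d) => sum + d, 0) / (samples.length - 1);
  return Math.sqrt(variance) / mean;
}

// Token counts from the three MCP Optimized runs reported above.
const optimizedRuns = [60307, 60144, 60808];
const cvPercent = coefficientOfVariation(optimizedRuns) * 100;

console.log(`CV: ${cvPercent.toFixed(1)}%`); // prints "CV: 0.6%"
```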

&lt;h3&gt;
  
  
  Latency
&lt;/h3&gt;

&lt;p&gt;Skills and sub-agents use tool-calling, which means two LLM invocations instead of one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User message → Model decides → Tool call → Tool result → Final response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Slash commands avoid this: they expand into a prompt, and the model responds directly in a single invocation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Lessons
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Architecture Trumps Protocol
&lt;/h3&gt;

&lt;p&gt;MCP Optimized and MCP Vanilla use the same protocol, yet their token consumption differs by 5x. The difference is entirely architectural: passing file paths versus passing data arrays. Focus on data flow design, not protocol debates.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The File-Path Pattern
&lt;/h3&gt;

&lt;p&gt;The single biggest efficiency gain: eliminate data duplication.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Anti-pattern: 10,000 tokens just for data&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;analyze_data&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="cm"&gt;/* 500 rows serialized */&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Pattern: 50 tokens for the same operation&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;analyze_csv_file&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/data/employees.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The MCP server handles file I/O internally. Data never enters the context window.&lt;/p&gt;
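
&lt;p&gt;A minimal sketch of what the server side of this pattern might look like. Everything here is hypothetical: &lt;code&gt;serverFiles&lt;/code&gt; is an in-memory stand-in for the server's disk, and &lt;code&gt;analyzeCsvFile&lt;/code&gt; is an illustrative handler, not the API of any MCP SDK. The point is only that the tool receives a path and returns a compact summary.&lt;/p&gt;

```typescript
// Hypothetical in-memory stand-in for the server's disk; a real server would read the file itself.
const serverFiles: { [path: string]: string } = {
  "/data/employees.csv": "employee,dept,salary\nAna,eng,120\nBo,eng,100\nCy,ops,80",
};

type CsvSummary = { rows: number; columns: string[]; meanSalary: number };

// Hypothetical tool handler: receives only a path, returns only a compact summary.
function analyzeCsvFile(toolArgs: { file_path: string }): CsvSummary {
  const text = serverFiles[toolArgs.file_path];
  if (text === undefined) throw new Error(`File not found: ${toolArgs.file_path}`);

  const [header, ...lines] = text.trim().split("\n");
  const columns = header.split(",");
  const salaryIndex = columns.indexOf("salary");
  if (salaryIndex === -1) throw new Error("Expected a salary column");

  // All row-level work happens here, on the server, outside the model's context.
  const salaries = lines.map((line) => Number(line.split(",")[salaryIndex]));
  const meanSalary = salaries.reduce((sum, s) => sum + s, 0) / salaries.length;
  return { rows: lines.length, columns, meanSalary };
}

const csvSummary = analyzeCsvFile({ file_path: "/data/employees.csv" });
console.log(csvSummary);
```

&lt;p&gt;The request is a short path and the response is three fields, so what crosses the context window stays roughly the same size however large the file grows.&lt;/p&gt;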

&lt;h3&gt;
  
  
  3. Prototype with Skills, Ship with MCP
&lt;/h3&gt;

&lt;p&gt;Skills execute arbitrary code—bash commands, file system access, network calls. They're excellent for figuring out what tools you need. They're inappropriate for production systems where security matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Slash Commands Are Underrated
&lt;/h3&gt;

&lt;p&gt;When you need deterministic, user-controlled workflows, slash commands win. No tool-call overhead, no model surprises, no latency penalty. Use them for repeatable tasks like code review checklists or deployment procedures.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Sub-Agent Context Isolation
&lt;/h3&gt;

&lt;p&gt;Sub-agents can't see your main conversation history. This is by design, since it keeps delegation clean, but it means any context they need must be passed explicitly in the delegation prompt.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. CLAUDE.md Costs Compound
&lt;/h3&gt;

&lt;p&gt;CLAUDE.md content is injected into every message, including sub-agent conversations, so its token cost compounds. Keep it concise. Use file references to pull in additional docs only when needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- CLAUDE.md --&amp;gt;&lt;/span&gt;
&lt;span class="gh"&gt;# Project Standards&lt;/span&gt;
See @docs/CODING_STANDARDS.md for detailed guidelines.

Key rules:
&lt;span class="p"&gt;-&lt;/span&gt; Use TypeScript strict mode
&lt;span class="p"&gt;-&lt;/span&gt; No any types
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Measure Before Optimizing
&lt;/h3&gt;

&lt;p&gt;Instrument your network traffic. The Anthropic API returns token usage in every response—log it. You might be surprised where tokens are actually going.&lt;/p&gt;
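
&lt;p&gt;As a rough sketch, usage logging can be as small as the accumulator below. The Anthropic Messages API reports &lt;code&gt;usage.input_tokens&lt;/code&gt; and &lt;code&gt;usage.output_tokens&lt;/code&gt; on each response; the helper &lt;code&gt;recordUsage&lt;/code&gt;, the labels, and the sample numbers are made up for illustration.&lt;/p&gt;

```typescript
// The two core fields of the `usage` object on a Messages API response.
type Usage = { input_tokens: number; output_tokens: number };

// Running totals per label (for example, one label per workflow or tool).
const usageTotals: { [label: string]: Usage } = {};

// Hypothetical helper: call this with every API response you receive.
function recordUsage(label: string, response: { usage: Usage }): void {
  const current = usageTotals[label] ?? { input_tokens: 0, output_tokens: 0 };
  usageTotals[label] = {
    input_tokens: current.input_tokens + response.usage.input_tokens,
    output_tokens: current.output_tokens + response.usage.output_tokens,
  };
}

// Made-up responses for illustration only; real values come from the API.
recordUsage("analyze", { usage: { input_tokens: 1200, output_tokens: 300 } });
recordUsage("analyze", { usage: { input_tokens: 1500, output_tokens: 250 } });
recordUsage("visualize", { usage: { input_tokens: 400, output_tokens: 120 } });

console.log(usageTotals);
```

&lt;p&gt;Grouping totals by label makes it easier to see which workflow is consuming the budget.&lt;/p&gt;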




&lt;h2&gt;
  
  
  Implementation Patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Parallel Tool Execution
&lt;/h3&gt;

&lt;p&gt;File-path architecture enables parallel calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Four visualizations, one API call, ~400 tokens total&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
  &lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;create_viz&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/data/emp.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bar&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dept&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;y&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;salary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;create_viz&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/data/emp.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scatter&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;exp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;y&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;salary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;create_viz&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/data/emp.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pie&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;col&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;department&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;create_viz&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/data/emp.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bar&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;location&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;y&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;salary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Progressive Tool Discovery
&lt;/h3&gt;

&lt;p&gt;For large tool catalogs (20+ tools), use meta-tools for on-demand discovery instead of loading all tools upfront:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Initial context: 2 tools, ~400 tokens&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;meta_tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;describe_tools&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Discover available tools&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;use_tool&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Execute a specific tool&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="c1"&gt;// Instead of: 50 tools, ~50,000 tokens upfront&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
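
&lt;p&gt;One way the two meta-tools could be backed on the server is sketched below. The registry, its entries, and the function names &lt;code&gt;describeTools&lt;/code&gt; and &lt;code&gt;useTool&lt;/code&gt; are assumptions for illustration, not part of the MCP specification or any SDK.&lt;/p&gt;

```typescript
// Hypothetical tool registry kept on the server; names and handlers are illustrative.
type ToolArgs = { [key: string]: string };
type ToolEntry = { summary: string; run: (toolArgs: ToolArgs) => string };

const registry: { [toolName: string]: ToolEntry } = {
  analyze_csv_file: {
    summary: "Summarize a CSV file given its path",
    run: (toolArgs) => `analyzed ${toolArgs.file_path}`,
  },
  create_viz: {
    summary: "Render a chart from a CSV file",
    run: (toolArgs) => `created ${toolArgs.type} chart from ${toolArgs.file}`,
  },
};

// Meta-tool 1: list tool names with one-line summaries, optionally filtered by keyword.
function describeTools(keyword = ""): string[] {
  return Object.keys(registry)
    .filter((toolName) => toolName.includes(keyword))
    .map((toolName) => `${toolName}: ${registry[toolName].summary}`);
}

// Meta-tool 2: dispatch to a concrete tool by name.
function useTool(toolName: string, toolArgs: ToolArgs): string {
  const entry = registry[toolName];
  if (entry === undefined) throw new Error(`Unknown tool: ${toolName}`);
  return entry.run(toolArgs);
}

console.log(describeTools("viz"));
console.log(useTool("create_viz", { file: "/data/emp.csv", type: "bar" }));
```

&lt;p&gt;Only the two meta-tool definitions sit in the initial context; full descriptions are fetched when the model asks for them.&lt;/p&gt;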



&lt;h3&gt;
  
  
  Phased Migration Strategy
&lt;/h3&gt;

&lt;p&gt;When you don't yet know whether a task will repeat:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Phase 1:&lt;/strong&gt; Use a code-driven approach to validate the task. Accept higher per-execution cost for flexibility.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 2:&lt;/strong&gt; If the task stabilizes and will repeat, invest in MCP Optimized.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 3:&lt;/strong&gt; Track actual execution count and token consumption. Migrate when patterns are clear.&lt;/li&gt;
&lt;/ol&gt;
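
&lt;p&gt;The migration decision in the phases above comes down to a break-even count, which can be sketched like this. The function name and all dollar figures are hypothetical; substitute the execution counts and per-run costs you actually measure.&lt;/p&gt;

```typescript
// Executions needed before a one-time build cost is repaid by per-execution savings.
function breakEvenExecutions(buildCost: number, costBefore: number, costAfter: number): number {
  const savingsPerExecution = costBefore - costAfter;
  if (!(savingsPerExecution > 0)) return Infinity; // the migration never pays back
  return Math.ceil(buildCost / savingsPerExecution);
}

// Hypothetical figures: $400 of engineering time, $0.50 per run before, $0.25 per run after.
const executionsToBreakEven = breakEvenExecutions(400, 0.5, 0.25);

console.log(executionsToBreakEven); // prints 1600
```

&lt;p&gt;If the task is unlikely to reach that many executions, staying with the flexible approach is the cheaper choice.&lt;/p&gt;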




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Avoid When&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP Optimized&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Production workloads, large datasets, predictable costs, security requirements&lt;/td&gt;
&lt;td&gt;One-off tasks, evolving requirements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code-Driven&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Prototyping, novel requirements, maximum flexibility&lt;/td&gt;
&lt;td&gt;Production systems, multi-user environments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Slash Commands&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;User-controlled workflows, deterministic behavior, zero overhead&lt;/td&gt;
&lt;td&gt;Automation, context-dependent decisions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The core insight: &lt;strong&gt;how you architect data flow matters more than which protocol you choose&lt;/strong&gt;. The 5x token difference between optimized and vanilla MCP—for the same task—demonstrates this clearly.&lt;/p&gt;

&lt;p&gt;Match the tool to your constraints. Measure the results.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://agiflow.io/blog/token-efficiency-in-ai-assisted-development" rel="noopener noreferrer"&gt;Token Efficiency in AI-Assisted Development&lt;/a&gt; - Full analysis of token consumption across approaches&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://agiflow.io/blog/claude-code-internals-reverse-engineering-prompt-augmentation" rel="noopener noreferrer"&gt;Claude Code Internals: Reverse Engineering Prompt Augmentation&lt;/a&gt; - Deep dive into how Claude Code's prompt mechanisms work&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelcontextprotocol.io" rel="noopener noreferrer"&gt;MCP Specification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/AgiFlow/aicode-toolkit" rel="noopener noreferrer"&gt;AICode Toolkit (GitHub)&lt;/a&gt; - MCP servers and tools for AI-assisted development&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/AgiFlow/token-usage-metrics" rel="noopener noreferrer"&gt;Token efficiency experiments (GitHub)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/AgiFlow/claude-code-prompt-analysis" rel="noopener noreferrer"&gt;Prompt augmentation analysis (GitHub)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;All claims are reproducible using the open-source data and tooling in the referenced repositories.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>vibecoding</category>
      <category>ai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>AI Keeps Reinventing Your Components. Here's How to Stop It.</title>
      <dc:creator>Vuong Ngo</dc:creator>
      <pubDate>Sun, 30 Nov 2025 10:43:40 +0000</pubDate>
      <link>https://dev.to/vuong_ngo/ai-keeps-reinventing-your-components-heres-how-to-stop-it-580a</link>
      <guid>https://dev.to/vuong_ngo/ai-keeps-reinventing-your-components-heres-how-to-stop-it-580a</guid>
      <description>&lt;p&gt;Three days before a customer pilot, our PM pinged me: "Can we ship that analytics dashboard?" The design had been sitting in Figma for weeks. I promised I'd have it in production by Friday with an AI co-pilot.&lt;/p&gt;

&lt;p&gt;By Wednesday morning, the PR was still in draft. Not because the UI was hard—it looked exactly like the mock—but because the AI kept inventing work.&lt;/p&gt;

&lt;p&gt;Here's what a typical week produced:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Monday - inline styles&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;RevenueCard&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{{&lt;/span&gt;
      &lt;span class="na"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;white&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;borderRadius&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;12px&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;24px&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;boxShadow&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0 1px 3px rgba(0,0,0,0.1)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;span&lt;/span&gt; &lt;span class="nx"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{{&lt;/span&gt; &lt;span class="na"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#6B7280&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;fontSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;14px&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Total&lt;/span&gt; &lt;span class="nx"&gt;Revenue&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/span&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{{&lt;/span&gt; &lt;span class="na"&gt;fontSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;32px&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;fontWeight&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt; &lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;$124&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// Tuesday - MUI (we use Tailwind)&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DataGrid&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@mui/x-data-grid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Wednesday - CSS modules (since when?)&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;styles&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./FilterPanel.module.css&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Thursday - styled-components (not even installed)&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;styled&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;styled-components&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four days. Four completely different approaches. The code worked, technically. But maintaining it? Good luck.&lt;/p&gt;

&lt;p&gt;The root cause became obvious: &lt;strong&gt;AI doesn't read documentation the way humans do.&lt;/strong&gt; It pattern-matches. And if your codebase doesn't have clear patterns to match, AI will invent its own—differently every time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Lesson
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AI reflects your architecture.&lt;/strong&gt; Chaotic codebase, chaotic output. Structured codebase, structured output.&lt;/p&gt;

&lt;p&gt;After months of trial and error, here's what actually works.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Separate State From Representation (Smart vs Dumb Components)
&lt;/h2&gt;

&lt;p&gt;AI writes strange things when fetch logic, loading UI, and display live in the same file. Split them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Container: owns data fetching&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;RevenueCardContainer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useRevenue&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;RevenueCardView&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;loading&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;RevenueCardView&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Revenue unavailable&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;RevenueCardView&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;empty&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;No revenue yet&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;RevenueCardView&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ready&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Presentational: pure UI, tokens only&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;RevenueCardView&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;Props&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;loading&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;MetricCard&lt;/span&gt; &lt;span class="nx"&gt;loading&lt;/span&gt; &lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Revenue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;MetricCard&lt;/span&gt; &lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Revenue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;;
&lt;/span&gt;  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;empty&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;MetricCard&lt;/span&gt; &lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Revenue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;empty&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;;
&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;MetricCard&lt;/span&gt;
      &lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Revenue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
      &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;currency&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
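
&lt;p&gt;The &lt;code&gt;Props&lt;/code&gt; type is referenced above but not shown. A plausible shape, together with the container's branching pulled out as a plain function, might look like the sketch below. Both the type and the helper &lt;code&gt;toViewProps&lt;/code&gt; are assumptions, not the original code.&lt;/p&gt;

```typescript
// Assumed shape for the `Props` type referenced above; the article does not show it.
type ViewState = "loading" | "empty" | "error" | "ready";

type Props = {
  state: ViewState;
  value?: number;
  previousValue?: number;
  message?: string;
};

// Hypothetical helper: maps a fetch result to view props, mirroring the container's branches.
function toViewProps(query: {
  isLoading: boolean;
  error?: unknown;
  data?: { value: number; previousValue: number };
}): Props {
  if (query.isLoading) return { state: "loading" };
  if (query.error) return { state: "error", message: "Revenue unavailable" };
  if (!query.data) return { state: "empty", message: "No revenue yet" };
  return { state: "ready", value: query.data.value, previousValue: query.data.previousValue };
}

console.log(toViewProps({ isLoading: false, data: { value: 124500, previousValue: 110600 } }));
```

&lt;p&gt;Because the mapping contains no UI code, it can be unit-tested without rendering, and the closed set of states gives the AI nothing to improvise on.&lt;/p&gt;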



&lt;p&gt;Storybook becomes the contract AI must honor. Capture the four canonical states—&lt;code&gt;loading&lt;/code&gt;, &lt;code&gt;empty&lt;/code&gt;, &lt;code&gt;error&lt;/code&gt;, &lt;code&gt;ready&lt;/code&gt;—so the bot can't invent new ones:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// RevenueCardView.stories.tsx&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Loading&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;loading&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Empty&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;empty&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No revenue yet&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Revenue unavailable&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Ready&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ready&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;124500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;previousValue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;110600&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AI stops rebuilding components that already exist because the stories show the "golden" versions.&lt;/p&gt;
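&lt;p&gt;The same four states can also be pinned down at the type level. This is a minimal sketch, assuming the view takes a &lt;code&gt;state&lt;/code&gt; discriminator shaped like the story args above; &lt;code&gt;RevenueCardViewProps&lt;/code&gt; and &lt;code&gt;describeState&lt;/code&gt; are hypothetical names, not part of the original component:&lt;/p&gt;

```typescript
// Hypothetical props type: one variant per canonical story state.
type RevenueCardViewProps =
  | { state: 'loading' }
  | { state: 'empty'; message: string }
  | { state: 'error'; message: string }
  | { state: 'ready'; value: number; previousValue?: number };

// Exhaustive switch: a fifth state that is not handled here fails type-checking.
function describeState(props: RevenueCardViewProps): string {
  switch (props.state) {
    case 'loading':
      return 'Loading revenue';
    case 'empty':
      return props.message;
    case 'error':
      return `Error: ${props.message}`;
    case 'ready':
      return `Revenue: ${props.value}`;
    default: {
      // Only reachable if a new variant is added without a matching case.
      const unreachable: never = props;
      return unreachable;
    }
  }
}
```

&lt;p&gt;With a union like this, an invented state is a compile error rather than something a reviewer has to spot.&lt;/p&gt;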




&lt;h2&gt;
  
  
  2. Adopt Atomic Design
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://atomicdesign.bradfrost.com/chapter-2/" rel="noopener noreferrer"&gt;Atomic Design&lt;/a&gt; by Brad Frost turns out to be exactly what you need when AI is generating your code.&lt;/p&gt;

&lt;p&gt;The core insight: hierarchical composition. Atoms form molecules. Molecules form organisms. Each level has a single responsibility.&lt;/p&gt;

&lt;p&gt;Why does this matter for AI? Because &lt;strong&gt;AI excels at composition when given well-defined pieces&lt;/strong&gt; and falls apart when rules are ambiguous.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Level 1: Atoms - indivisible primitives&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;children&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;TextProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;span&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;cn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;textVariants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nx"&gt;textColors&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;color&lt;/span&gt;&lt;span class="p"&gt;])}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;children&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/span&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Skeleton&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;variant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;SkeletonProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;cn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;animate-pulse bg-muted rounded&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;skeletonVariants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Level 2: Molecules - atoms with a purpose&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MetricValue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;loading&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;Props&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;space-y-2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Skeleton&lt;/span&gt; &lt;span class="nx"&gt;variant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;heading&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Skeleton&lt;/span&gt; &lt;span class="nx"&gt;variant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;w-16&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;space-y-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Text&lt;/span&gt; &lt;span class="nx"&gt;variant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;display&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;primary&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;formatters&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/Text&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;previousValue&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;TrendIndicator&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;calculateChange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="sr"&gt;/&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="err"&gt;}
&lt;/span&gt;    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// Level 3: Organisms - complete UI sections&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MetricCard&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;loading&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;Props&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Card&lt;/span&gt; &lt;span class="nx"&gt;variant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;elevated&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;lg&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;MetricLabel&lt;/span&gt; &lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;MetricValue&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;previousValue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;format&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/Card&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now when I ask for "a revenue metric card," AI composes: &lt;code&gt;&amp;lt;MetricCard label="Revenue" value={revenue} format="currency" /&amp;gt;&lt;/code&gt;. Consistent every time.&lt;/p&gt;
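&lt;p&gt;The molecule above leans on two helpers the snippet never defines, &lt;code&gt;formatters&lt;/code&gt; and &lt;code&gt;calculateChange&lt;/code&gt;. One possible shape for them, as a sketch: the &lt;code&gt;MetricFormat&lt;/code&gt; values, the &lt;code&gt;en-US&lt;/code&gt; locale, the USD currency, and the zero-baseline behavior are all assumptions for illustration:&lt;/p&gt;

```typescript
// Hypothetical format names; only "currency" appears in the article.
type MetricFormat = 'currency' | 'percent' | 'number';

// One formatter per format, so components never format numbers inline.
const formatters: { [K in MetricFormat]: (value: number) => string } = {
  currency: (value) =>
    new Intl.NumberFormat('en-US', {
      style: 'currency',
      currency: 'USD',
      minimumFractionDigits: 0,
      maximumFractionDigits: 0,
    }).format(value),
  percent: (value) => `${value.toFixed(1)}%`,
  number: (value) => new Intl.NumberFormat('en-US').format(value),
};

// Percentage change from previousValue to value.
// Assumption: with no baseline (previousValue of 0) the change is reported as 0.
function calculateChange(value: number, previousValue: number): number {
  if (previousValue === 0) return 0;
  return ((value - previousValue) / Math.abs(previousValue)) * 100;
}
```

&lt;p&gt;Keeping these in one shared module gives AI a single place to look up formatting, the same way atoms give it a single place to look up markup.&lt;/p&gt;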




&lt;h2&gt;
  
  
  3. Design Tokens as Vocabulary
&lt;/h2&gt;

&lt;p&gt;Components solve structural consistency. Design tokens solve visual consistency.&lt;/p&gt;

&lt;p&gt;Named constants for every visual decision—not "blue" but &lt;code&gt;action-primary&lt;/code&gt;, not "16px" but &lt;code&gt;spacing-4&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;colors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;primary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#6366F1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;primaryHover&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#4F46E5&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;surface&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;page&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#F9FAFB&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;card&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#FFFFFF&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;primary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#111827&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;secondary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#4B5563&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;muted&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#9CA3AF&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wire them through Tailwind config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// tailwind.config.js&lt;/span&gt;
&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;theme&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;colors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;primary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;var(--color-action-primary)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;surface&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;card&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;var(--color-surface-card)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;primary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;var(--color-content-primary)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now when AI writes &lt;code&gt;bg-surface-card&lt;/code&gt; or &lt;code&gt;text-content-primary&lt;/code&gt;, it's speaking your design language. No hex codes that drift.&lt;/p&gt;
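&lt;p&gt;The Tailwind config reads &lt;code&gt;var(--color-...)&lt;/code&gt; custom properties, while the token object holds hex values, so something has to connect the two. A minimal sketch of that bridge, assuming the variables follow a &lt;code&gt;--color-group-name&lt;/code&gt; pattern; &lt;code&gt;toCssVariables&lt;/code&gt; and &lt;code&gt;toKebab&lt;/code&gt; are hypothetical helpers:&lt;/p&gt;

```typescript
// Nested token object: leaves are CSS values, branches are groups.
type TokenTree = { [key: string]: string | TokenTree };

// primaryHover -> primary-hover
function toKebab(name: string): string {
  return name.replace(/[A-Z]/g, (ch) => `-${ch.toLowerCase()}`);
}

// Flattens the tree into "--prefix-group-name: value;" declarations.
function toCssVariables(tokens: TokenTree, prefix: string): string[] {
  const lines: string[] = [];
  for (const key of Object.keys(tokens)) {
    const value = tokens[key];
    const name = `${prefix}-${toKebab(key)}`;
    if (typeof value === 'string') {
      lines.push(`${name}: ${value};`);
    } else if (typeof value === 'object') {
      lines.push(...toCssVariables(value, name));
    }
  }
  return lines;
}

// A subset of the tokens shown earlier.
const colorTokens = {
  action: { primary: '#6366F1', primaryHover: '#4F46E5' },
  surface: { page: '#F9FAFB', card: '#FFFFFF' },
};

// A :root block that a build step could write into a global stylesheet.
const rootCss = `:root {\n  ${toCssVariables(colorTokens, '--color').join('\n  ')}\n}`;
```

&lt;p&gt;Generating the stylesheet from the token object keeps a single source of truth: change a hex value once and both the CSS variables and the Tailwind utilities follow.&lt;/p&gt;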




&lt;h2&gt;
  
  
  4. Scaffold Before You Generate
&lt;/h2&gt;

&lt;p&gt;AI behaves best when you give it a guardrailed sandbox instead of a blank file.&lt;/p&gt;

&lt;p&gt;A command like &lt;code&gt;pnpm ui:generate metric-card&lt;/code&gt; should create:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MetricCard/
├── MetricCard.tsx        # Container
├── MetricCardView.tsx    # Presentational
├── MetricCard.stories.tsx
├── MetricCard.test.tsx
└── index.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The generated files include TODOs and comments telling AI where to edit and what to leave alone. AI fills in the blanks instead of rewriting the world. You can also use &lt;a href="https://github.com/AgiFlow/aicode-toolkit/tree/main/packages/scaffold-mcp" rel="noopener noreferrer"&gt;this MCP server&lt;/a&gt; to scaffold components with the folder structure you prefer.&lt;/p&gt;
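&lt;p&gt;What such a generator produces can be sketched as a pure function from a component name to file contents. The TODO wording and starter contents below are illustrative assumptions, and &lt;code&gt;scaffoldComponent&lt;/code&gt; is a hypothetical name rather than the API of any real tool:&lt;/p&gt;

```typescript
// metric-card -> MetricCard
function toPascalCase(kebabName: string): string {
  return kebabName
    .split('-')
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join('');
}

// Maps each relative file path to its starter contents.
// Sketch only: a real script would write these to disk.
function scaffoldComponent(kebabName: string): { [path: string]: string } {
  const name = toPascalCase(kebabName);
  const dir = `${name}/`;
  return {
    [`${dir}${name}.tsx`]:
      `// Container: data fetching only.\n// TODO(ai): call the data hook and pass props to ${name}View.\n`,
    [`${dir}${name}View.tsx`]:
      `// Presentational: props in, markup out.\n// TODO(ai): compose existing atoms and molecules. Do not fetch data here.\n`,
    [`${dir}${name}.stories.tsx`]:
      `// TODO(ai): export Loading, Empty, Error, and Ready stories.\n`,
    [`${dir}${name}.test.tsx`]: `// TODO(ai): cover each story state.\n`,
    [`${dir}index.ts`]: `// Public API. Do not edit.\nexport * from './${name}';\n`,
  };
}
```

&lt;p&gt;Returning a path-to-contents map keeps the template logic testable; the script that backs the command only has to loop over the map and write files.&lt;/p&gt;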




&lt;h2&gt;
  
  
  5. Enforce Contracts with Lint and Stories
&lt;/h2&gt;

&lt;p&gt;Static rules catch mistakes before they ship.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
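&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// eslint.config.mjs&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;tailwindcss&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;eslint-plugin-tailwindcss&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Register the plugin so the tailwindcss/* rule below resolves&lt;/span&gt;
    &lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;tailwindcss&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;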
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;no-restricted-imports&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;styled-components&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@mui/material&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
          &lt;span class="na"&gt;patterns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;**/../*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Import UI from @/components/ui&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tailwindcss/no-custom-classname&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;callees&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cn&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./tailwind.config.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;ESLint bans off-piste imports&lt;/li&gt;
&lt;li&gt;The Tailwind ESLint plugin rejects class names that aren't in your config, which forces token utilities&lt;/li&gt;
&lt;li&gt;CI fails if stories miss the four states&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mistakes die in CI, not in code review.&lt;/p&gt;
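&lt;p&gt;The four-state rule needs its own check, since the ESLint config above doesn't cover it. A minimal sketch, assuming stories are declared as top-level &lt;code&gt;export const&lt;/code&gt; statements as in the earlier example; &lt;code&gt;missingStates&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```typescript
// The four canonical story exports every component must provide.
const REQUIRED_STATES = ['Loading', 'Empty', 'Error', 'Ready'];

// Returns the required story names that a stories file does not export.
function missingStates(storiesSource: string): string[] {
  const matches: string[] = storiesSource.match(/^export const \w+/gm) || [];
  const exported = matches.map((text) => text.slice('export const '.length));
  return REQUIRED_STATES.filter((state) => exported.indexOf(state) === -1);
}

// Example input: a stories file that only covers two states.
const incompleteStories = [
  "export const Loading = { args: { state: 'loading' } };",
  "export const Ready = { args: { state: 'ready', value: 124500 } };",
].join('\n');

const missing = missingStates(incompleteStories);
```

&lt;p&gt;A CI script would read each &lt;code&gt;*.stories.tsx&lt;/code&gt; file, call a check like this, and exit non-zero when the result is not empty.&lt;/p&gt;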




&lt;h2&gt;
  
  
  Bonus: Composition Over God Components
&lt;/h2&gt;

&lt;p&gt;Don't build this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ 60+ props nobody can reason about&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;DataTable&lt;/span&gt;
  &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;pagination&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;paginationPosition&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;bottom&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="nx"&gt;sortable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;filterable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;selectable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;// ... 55 more props&lt;/span&gt;
&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Build this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Composition: each piece does one thing well&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;DataTable&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;DataTableToolbar&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;DataTableFilter&lt;/span&gt; &lt;span class="nx"&gt;column&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;status&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;statusOptions&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;DataTableSearch&lt;/span&gt; &lt;span class="nx"&gt;placeholder&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Search...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/DataTableToolbar&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;DataTableBody&lt;/span&gt; &lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;emptyState&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Empty&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;DataTableFooter&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;DataTablePagination&lt;/span&gt; &lt;span class="nx"&gt;pageSize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/DataTableFooter&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/DataTable&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same capabilities, different mental model. When requirements change, you reorganize JSX rather than hunting through props.&lt;/p&gt;




&lt;h2&gt;
  
  
  Extra Tips
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Embed source locations in the DOM:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt;
  &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;component&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;MetricCard&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;src/components/MetricCard/MetricCard.tsx&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AI can inspect the DOM and jump straight to the file. No guessing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sub-agents to save context:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your main conversation doesn't need the entire component library in memory. Spin up focused agents for specific tasks (UI fixes, story writing, a11y audits)—they load only what they need and return.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reusable commands:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Build &lt;code&gt;/add-story&lt;/code&gt;, &lt;code&gt;/review-component&lt;/code&gt;, &lt;code&gt;/fix-ui&lt;/code&gt; commands that encode your conventions. AI follows them without you repeating yourself.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Path Forward
&lt;/h2&gt;

&lt;p&gt;The fix for inconsistent AI output isn't better prompting. It's tighter architecture.&lt;/p&gt;

&lt;p&gt;Every atom you add to your library is an atom AI never reinvents. Every design token is a decision that never drifts. Every composition pattern is a template for variations you haven't thought of yet.&lt;/p&gt;

&lt;p&gt;Build the rails, the bot stays on track.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://agiflow.io/blog/roadmap-to-build-scalable-frontend-application-with-ai/" rel="noopener noreferrer"&gt;agiflow.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>react</category>
      <category>webdev</category>
      <category>ai</category>
      <category>vibecoding</category>
    </item>
    <item>
      <title>AI Keeps Breaking Your Architectural Patterns. Documentation Won't Fix It.</title>
      <dc:creator>Vuong Ngo</dc:creator>
      <pubDate>Sun, 12 Oct 2025 07:40:32 +0000</pubDate>
      <link>https://dev.to/vuong_ngo/ai-keeps-breaking-your-architectural-patterns-documentation-wont-fix-it-4dgj</link>
      <guid>https://dev.to/vuong_ngo/ai-keeps-breaking-your-architectural-patterns-documentation-wont-fix-it-4dgj</guid>
      <description>&lt;p&gt;I've been using AI coding assistants across our engineering team for over a year. Working in a data department, we had the freedom to experiment with Claude, Roo-Code, and in-house agents in our daily workflow.&lt;/p&gt;

&lt;p&gt;The pattern emerged slowly. Junior developers shipping features faster than before, which was great. Code reviews taking longer, which wasn't. The code functionally worked, tests passed, but something was consistently off. Direct database imports in service layers. Default exports scattered across a codebase that had standardized on named exports years ago. Repository pattern bypassed in favor of inline SQL.&lt;/p&gt;

&lt;p&gt;These weren't bugs. The code ran fine in production. They were architectural drift—the slow erosion of patterns we'd spent years establishing. What made it frustrating was the inconsistency. A junior developer would correctly implement dependency injection in one file, then bypass it completely in the next. Same developer, same day, same codebase. The knowledge was there, but it wasn't being applied consistently.&lt;/p&gt;

&lt;p&gt;The obvious answer was "better code review." But that doesn't scale. When you're reviewing 20+ PRs a day across a 50-package monorepo, you can't catch every architectural violation. And the ones you miss compound.&lt;/p&gt;

&lt;p&gt;Here's what we figured out: this isn't an AI problem or a developer problem. It's a feedback timing problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;AI-generated code violates architectural patterns because of timing and context, not capability&lt;/li&gt;
&lt;li&gt;Static documentation creates a validation gap that AI can't bridge&lt;/li&gt;
&lt;li&gt;Effective architecture enforcement requires runtime feedback loops, not upfront documentation&lt;/li&gt;
&lt;li&gt;Path-based pattern matching provides file-specific architectural context&lt;/li&gt;
&lt;li&gt;We built Architect MCP to close the feedback loop at code generation time&lt;/li&gt;
&lt;li&gt;Results: 80% pattern compliance vs 30-40% with documentation alone&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Real Problem: Temporal and Spatial Context Loss
&lt;/h2&gt;

&lt;p&gt;Let's be precise about what's happening here. AI coding assistants operate with ephemeral context windows. Even with project-specific documentation (CLAUDE.md, system prompts, etc.), there's a fundamental mismatch between when architectural constraints are communicated and when they need to be applied.&lt;/p&gt;

&lt;p&gt;Consider a typical session:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Claude reads your architectural guidelines at initialization (t=0)&lt;/li&gt;
&lt;li&gt;You discuss requirements, explore the codebase, iterate on design (t=0 to t=20min)&lt;/li&gt;
&lt;li&gt;Claude generates code implementing the agreed-upon logic (t=20min)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By step 3, the architectural constraints from step 1 are 20 minutes and dozens of messages removed from the working context. The AI is optimizing for correctness against the immediate requirements, not consistency against architectural patterns defined at session start.&lt;/p&gt;

&lt;p&gt;This isn't a memory problem—it's a &lt;strong&gt;priority and relevance problem&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What AI Optimizes For
&lt;/h3&gt;

&lt;p&gt;When generating code, LLMs are fundamentally pattern-matching against their training data. Your specific architectural conventions represent a tiny signal compared to the millions of codebases in the training set. Without active feedback, the model defaults to the strongest statistical patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Common &amp;gt; Custom&lt;/strong&gt;: Express.js patterns over your Hono.js conventions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simple &amp;gt; Structured&lt;/strong&gt;: Direct database calls over repository pattern&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Familiar &amp;gt; Framework-specific&lt;/strong&gt;: Default exports because they're ubiquitous in the training data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why you see the same violations repeatedly, even with extensive documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Documentation Fails (And What That Tells Us)
&lt;/h2&gt;

&lt;p&gt;Our first attempt was documentation. We already had a substantial CLAUDE.md, but we expanded it. Detailed sections on dependency injection patterns, repository layer requirements, export conventions, framework-specific architectural rules. We made it comprehensive—over 3,000 lines.&lt;/p&gt;

&lt;p&gt;Junior developers referenced it. AI assistants had access to it. Compliance rate stayed around 40%. The failure modes are instructive:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The Relevance Gap
&lt;/h3&gt;

&lt;p&gt;A 3,000-line document applies to every file equally, which means it applies to no file specifically. A repository needs repository-specific guidance. A React component needs component-specific rules. Serving generic "follow clean architecture" advice to both is essentially noise.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The Retrieval Problem
&lt;/h3&gt;

&lt;p&gt;Even with RAG systems, retrieving the right architectural context at code generation time is non-trivial. You need to know what patterns apply before you can retrieve them. If Claude is generating a new file type, there's no obvious query to pull the relevant constraints.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. The Validation Gap
&lt;/h3&gt;

&lt;p&gt;This is the critical one. Documentation describes correct patterns but provides no mechanism to verify compliance. It's teaching without testing. The feedback loop is broken.&lt;/p&gt;

&lt;h2&gt;
  
  
  Rethinking the Problem: Feedback Over Front-loading
&lt;/h2&gt;

&lt;p&gt;Here's the architectural insight: &lt;strong&gt;you can't front-load all context, but you can close the feedback loop.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of trying to make AI remember everything upfront, we need to provide architectural feedback at two critical moments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Before code generation&lt;/strong&gt;: "What patterns apply to this specific file?"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After code generation&lt;/strong&gt;: "Does this implementation comply with those patterns?"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This shifts from a memory problem to a validation problem. And validation can be automated.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Feedback Loop Architecture
&lt;/h3&gt;

&lt;p&gt;The system needs three components:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Pattern Database&lt;/strong&gt;&lt;br&gt;
Organized by file path patterns with specific architectural requirements:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;src/repositories/**/*.ts → Repository pattern rules&lt;/span&gt;
&lt;span class="s"&gt;src/services/**/*.ts → Service layer rules&lt;/span&gt;
&lt;span class="s"&gt;src/components/**/*.tsx → Component architecture rules&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Pre-generation Context Injection&lt;/strong&gt;&lt;br&gt;
Before generating code, query the pattern database with the target file path. Inject specific, relevant architectural constraints into the immediate context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Post-generation Validation&lt;/strong&gt;&lt;br&gt;
After code generation, validate against the same patterns. Use severity ratings to determine action (submit, flag, auto-fix).&lt;/p&gt;

&lt;p&gt;The key insight: &lt;strong&gt;specificity matters more than comprehensiveness&lt;/strong&gt;. Better to provide five highly relevant rules for a specific file than 50 generic rules that might apply.&lt;/p&gt;
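&lt;p&gt;As a rough illustration of the path-based lookup described above, here is a minimal TypeScript sketch. The &lt;code&gt;patternTable&lt;/code&gt; contents and the &lt;code&gt;globToRegExp&lt;/code&gt; and &lt;code&gt;rulesForFile&lt;/code&gt; helpers are hypothetical names for this example, not Architect MCP's actual API, and the glob handling only covers &lt;code&gt;**/&lt;/code&gt; and &lt;code&gt;*&lt;/code&gt;.&lt;/p&gt;

```typescript
// Hypothetical rule table: glob pattern -> rules for files matching it.
type PatternTable = { [glob: string]: string[] };

const patternTable: PatternTable = {
  "src/repositories/**/*.ts": [
    "Implement the repository interface",
    "Use constructor-injected database connection",
  ],
  "src/services/**/*.ts": ["No direct database access", "Use repository layer"],
  "src/components/**/*.tsx": ["Named exports only"],
};

// Convert a small glob subset ("**/" and "*") into an anchored RegExp.
// Assumption: no other glob syntax (braces, "?", negation) is needed.
function globToRegExp(glob: string): RegExp {
  const source = glob
    .split("**/")
    .map((part) =>
      part
        .split("*")
        // Escape regex metacharacters in the literal pieces.
        .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, (ch) => "\\" + ch))
        // "*" matches any run of characters within one path segment.
        .join("[^/]*"),
    )
    // "**/" matches zero or more whole directories.
    .join("(?:.*/)?");
  return new RegExp("^" + source + "$");
}

// Collect every rule whose glob matches the target file path.
function rulesForFile(filePath: string, table: PatternTable): string[] {
  const rules: string[] = [];
  for (const glob of Object.keys(table)) {
    if (globToRegExp(glob).test(filePath)) {
      rules.push(...table[glob]);
    }
  }
  return rules;
}
```

&lt;p&gt;With this table, &lt;code&gt;rulesForFile("src/repositories/user/userRepository.ts", patternTable)&lt;/code&gt; returns only the two repository rules, while a path that matches no glob returns an empty list. That keeps the injected context short and specific to the file.&lt;/p&gt;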
&lt;h2&gt;
  
  
  Implementation: Architect MCP
&lt;/h2&gt;

&lt;p&gt;We implemented this as an MCP (Model Context Protocol) server with two primary tools:&lt;/p&gt;
&lt;h3&gt;
  
  
  get-file-design-pattern
&lt;/h3&gt;

&lt;p&gt;Provides file-specific architectural context before code generation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Input: File path&lt;/span&gt;
&lt;span class="kd"&gt;get&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;design&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nf"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;src/repositories/userRepository.ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Output: Specific patterns for this file type&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;template&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;backend/hono-api&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;patterns&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Implement IRepository&amp;lt;T&amp;gt; interface&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Use constructor-injected database connection&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Named exports only (export class RepositoryName)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;No direct database imports (import from '../db' is violation)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;reference&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;src/repositories/baseRepository.ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs &lt;strong&gt;before&lt;/strong&gt; Claude generates code, injecting precise architectural requirements into the active context.&lt;/p&gt;

&lt;h3&gt;
  
  
  review-code-change
&lt;/h3&gt;

&lt;p&gt;Validates generated code against architectural patterns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Input: File path and generated code&lt;/span&gt;
&lt;span class="nx"&gt;review&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nf"&gt;change&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;src/repositories/userRepository.ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;generatedCode&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Output: Structured validation results&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;severity&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;LOW&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;MEDIUM&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;HIGH&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;violations&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...],&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;compliance&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;92%&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;patterns_followed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;✅ Implements IRepository&amp;lt;User&amp;gt;&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...],&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;recommendations&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs &lt;strong&gt;after&lt;/strong&gt; code generation, providing structured feedback that can drive automation (auto-submit on LOW, flag on MEDIUM, auto-fix on HIGH).&lt;/p&gt;
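&lt;p&gt;One way to wire the severity rating to an action, following the mapping in the sentence above, is a small dispatch function. This is a sketch only: the &lt;code&gt;ReviewResult&lt;/code&gt; shape is trimmed down from the output shown earlier, and &lt;code&gt;actionFor&lt;/code&gt; is a hypothetical helper, not part of the tool.&lt;/p&gt;

```typescript
type Severity = "LOW" | "MEDIUM" | "HIGH";
type Action = "auto-submit" | "flag" | "auto-fix";

// Trimmed-down version of the review output; only the fields used here.
interface ReviewResult {
  severity: Severity;
  violations: string[];
}

// Map a review result to the automated follow-up action.
function actionFor(result: ReviewResult): Action {
  switch (result.severity) {
    case "LOW":
      return "auto-submit"; // patterns followed, nothing to do
    case "MEDIUM":
      return "flag"; // minor violations, surface them to the developer
    case "HIGH":
      return "auto-fix"; // critical violations, send back for correction
  }
}
```

&lt;p&gt;Because &lt;code&gt;Severity&lt;/code&gt; is a closed union, the compiler can check that every rating has an action, so adding a new level later forces the mapping to be updated.&lt;/p&gt;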

&lt;h2&gt;
  
  
  Path-Based Pattern Matching: The Critical Detail
&lt;/h2&gt;

&lt;p&gt;The pattern database uses path-based matching to provide file-specific guidance. This deserves deeper explanation because it's where the system gains leverage.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern Hierarchy
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Global patterns (apply to all projects)&lt;/span&gt;
&lt;span class="na"&gt;**/*.ts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;No 'any' types without justification&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Use named exports&lt;/span&gt;

&lt;span class="c1"&gt;# Template patterns (apply to projects using this template)&lt;/span&gt;
&lt;span class="na"&gt;backend/hono-api&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;src/repositories/**/*.ts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Implement IRepository&amp;lt;T&amp;gt;&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Use dependency injection&lt;/span&gt;

  &lt;span class="na"&gt;src/services/**/*.ts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;No direct database access&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Use repository layer&lt;/span&gt;

&lt;span class="c1"&gt;# Project patterns (apply to specific project)&lt;/span&gt;
&lt;span class="na"&gt;user-management-api&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;src/services/authService.ts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Must use AuthProvider interface&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Specific to auth domain&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system applies patterns from most general to most specific, with later patterns overriding earlier ones. This provides both consistency (global rules) and flexibility (project-specific exceptions).&lt;/p&gt;
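&lt;p&gt;The general-to-specific override can be sketched as a simple ordered merge. This example assumes each rule has an id so a later layer can replace an earlier rule. The ids, the &lt;code&gt;mergeLayers&lt;/code&gt; helper, and the project-level override text are hypothetical, chosen only to show the mechanics.&lt;/p&gt;

```typescript
// Each layer maps a hypothetical rule id to its rule text.
type RuleLayer = { [ruleId: string]: string };

// Merge layers in order; a later layer overrides an earlier rule with the same id.
function mergeLayers(layers: RuleLayer[]): RuleLayer {
  const merged: RuleLayer = {};
  for (const layer of layers) {
    for (const ruleId of Object.keys(layer)) {
      merged[ruleId] = layer[ruleId];
    }
  }
  return merged;
}

const globalRules: RuleLayer = {
  "exports": "Use named exports",
  "any-type": "No 'any' types without justification",
};

const templateRules: RuleLayer = {
  "data-access": "Use repository layer",
};

// Hypothetical project-level exception that tightens the template rule.
const projectRules: RuleLayer = {
  "data-access": "Use repository layer through the AuthProvider interface",
  "auth-provider": "Must use AuthProvider interface",
};

// Order matters: global first, then template, then project.
const effectiveRules = mergeLayers([globalRules, templateRules, projectRules]);
```

&lt;p&gt;Here &lt;code&gt;effectiveRules&lt;/code&gt; keeps both global rules, takes the project's version of &lt;code&gt;data-access&lt;/code&gt;, and adds &lt;code&gt;auth-provider&lt;/code&gt;. Keying rules by id is a design choice of this sketch. A plain list of strings would need some other way to decide which rule replaces which.&lt;/p&gt;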

&lt;h3&gt;
  
  
  Why This Scales
&lt;/h3&gt;

&lt;p&gt;New projects inherit template patterns automatically. No need to reconfigure architectural rules for every new service—just specify the template in &lt;code&gt;project.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"new-api-service"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sourceTemplate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"backend/hono-api"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The service immediately inherits 50+ architectural patterns specific to Hono.js APIs.&lt;/p&gt;

&lt;h2&gt;
  
  
  LLM-Powered Validation: Using AI to Check AI
&lt;/h2&gt;

&lt;p&gt;Here's a non-obvious design choice: we use Claude to validate Claude-generated code.&lt;/p&gt;

&lt;p&gt;Why? Because architectural compliance isn't mechanical pattern matching. Consider:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mechanical linter approach:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Regex: /export\s+default/&lt;/span&gt;
&lt;span class="c1"&gt;// Violation: Uses default export&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;LLM validation approach:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Understands context and intent&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// Violation: Uses default export when named export required per repository pattern&lt;/span&gt;
&lt;span class="c1"&gt;// Recommendation: Change to 'export class UserService' for consistency with repository pattern established in architect.yaml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM-based validation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understands architectural intent, not just syntax&lt;/li&gt;
&lt;li&gt;Provides contextual explanations&lt;/li&gt;
&lt;li&gt;Can reason about related patterns (if you're violating DI, you're probably also missing interface implementation)&lt;/li&gt;
&lt;li&gt;Generates actionable recommendations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is more expensive than static linting, but the cost is justified because it runs only on changed files and provides significantly higher signal.&lt;/p&gt;
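&lt;p&gt;To make the idea concrete, here is a hedged sketch of how a validation prompt might be assembled from the file's rules and the changed code. The prompt wording, the &lt;code&gt;ReviewRequest&lt;/code&gt; type, and &lt;code&gt;buildReviewPrompt&lt;/code&gt; are assumptions for illustration. They are not the prompt Architect MCP actually sends, and the model call itself is left out.&lt;/p&gt;

```typescript
// Hypothetical input for one validation pass over a single changed file.
interface ReviewRequest {
  filePath: string;
  rules: string[];
  code: string;
}

// Build the text that would be sent to the reviewing model.
function buildReviewPrompt(request: ReviewRequest): string {
  // Number the rules so the model can refer to them in its findings.
  const ruleList = request.rules
    .map((rule, index) => `${index + 1}. ${rule}`)
    .join("\n");

  return [
    `You are reviewing ${request.filePath} for architectural compliance.`,
    "Rules that apply to this file:",
    ruleList,
    "Report each violation with a short explanation and a recommended fix.",
    "Code under review:",
    request.code,
  ].join("\n");
}
```

&lt;p&gt;Only the rules matched for the changed file go into the prompt, which is what keeps each call small enough to run on every change.&lt;/p&gt;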

&lt;h2&gt;
  
  
  Layered Validation: Defense in Depth
&lt;/h2&gt;

&lt;p&gt;Architect MCP isn't a replacement for existing validation layers—it's complementary. The full validation stack:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: TypeScript Compiler&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Catches: Type errors, syntax violations&lt;/li&gt;
&lt;li&gt;Speed: &amp;lt; 1s&lt;/li&gt;
&lt;li&gt;Coverage: Type safety&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: Biome/ESLint&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Catches: Code style, simple rules&lt;/li&gt;
&lt;li&gt;Speed: &amp;lt; 5s&lt;/li&gt;
&lt;li&gt;Coverage: Style consistency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Layer 3: Architect MCP&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Catches: Architectural pattern violations&lt;/li&gt;
&lt;li&gt;Speed: 5-10s (LLM call)&lt;/li&gt;
&lt;li&gt;Coverage: Framework-specific architecture&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Layer 4: Code Review (Human/AI)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Catches: Business logic, complex issues&lt;/li&gt;
&lt;li&gt;Speed: Minutes to hours&lt;/li&gt;
&lt;li&gt;Coverage: Domain-specific concerns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each layer has different trade-offs. TypeScript is fast but can't enforce architectural patterns. Linting handles style but not domain architecture. Architect MCP fills the gap between syntax/style and human review.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Changed
&lt;/h2&gt;

&lt;p&gt;After 3 months in production across our 50+ project monorepo with a team of 8 developers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The obvious improvement:&lt;/strong&gt; Architectural violations became rare instead of common. Not eliminated—there are still legitimate cases where you need to break a pattern—but the unconscious drift stopped. Junior developers stopped ping-ponging between following patterns correctly and breaking them in the next file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The unexpected improvement:&lt;/strong&gt; Code review shifted. We thought we'd just catch violations faster. What actually happened was we stopped spending review cycles on architectural corrections. Comments like "this should use dependency injection" or "use named exports" basically disappeared. Reviews focused on design decisions, edge cases, business logic—things that actually need human judgment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The subtle improvement:&lt;/strong&gt; Context-switching overhead decreased. When you're working across multiple projects with different architectural patterns (Next.js app vs Hono API vs TypeScript library), you're constantly reloading mental context. Having the validation layer means you find out immediately when you've applied the wrong pattern to the wrong project, not three reviews later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What didn't improve:&lt;/strong&gt; We still see legitimate architectural violations. Sometimes you need to bypass a pattern for a specific reason. The difference is those are now conscious decisions documented in the PR, not unconscious mistakes that slip through review.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Reveals About AI-Assisted Development
&lt;/h2&gt;

&lt;p&gt;The broader lesson: &lt;strong&gt;AI coding assistants need tight feedback loops, not extensive documentation.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This mirrors how junior developers actually learn a codebase. They don't absorb architectural patterns by reading documentation upfront. They learn by:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Getting specific guidance for the task at hand&lt;/li&gt;
&lt;li&gt;Making changes&lt;/li&gt;
&lt;li&gt;Getting feedback on what they did wrong&lt;/li&gt;
&lt;li&gt;Iterating&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When junior developers pair with AI, both need the same learning structure. The difference is speed. Human code review happens in hours or days. Automated feedback happens in seconds. That speed difference is what makes the approach viable.&lt;/p&gt;

&lt;p&gt;The unexpected insight: this doesn't just help junior developers. Senior developers using AI make the same architectural mistakes—they just catch them earlier in their own review. Automated validation helps everyone maintain consistency when context-switching between projects with different architectural patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation Notes
&lt;/h2&gt;

&lt;p&gt;If you're considering building something similar, a few non-obvious lessons:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Pattern Granularity Matters&lt;/strong&gt;&lt;br&gt;
Too broad (e.g., "follow clean architecture") and AI can't apply it. Too narrow (e.g., "line 47 must use Promise.all") and you've essentially hardcoded the implementation. The right level is "file-type specific patterns" (repository pattern for repositories, component pattern for components).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Severity Ratings Enable Automation&lt;/strong&gt;&lt;br&gt;
Without severity ratings, you can't automate responses. With them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LOW → Auto-submit (pattern followed)&lt;/li&gt;
&lt;li&gt;MEDIUM → Flag for attention (minor violations)&lt;/li&gt;
&lt;li&gt;HIGH → Block submission (critical violations)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Template Inheritance Is Critical for Scale&lt;/strong&gt;&lt;br&gt;
Defining patterns per-project doesn't scale past ~10 projects. Template-based inheritance means you define patterns once per framework/architecture, then all projects using that template inherit them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. LLM Validation Is Worth the Cost&lt;/strong&gt;&lt;br&gt;
We initially tried regex-based pattern matching. It caught obvious violations—literal regex matches like &lt;code&gt;export default&lt;/code&gt;—but missed anything requiring context. Why is this a default export? Is it actually violating the pattern or is this one of the legitimate exceptions? Regex can't answer that. LLM validation understands intent and context. Yes, it costs money per validation. But the alternative is human code review catching these issues, which is orders of magnitude more expensive in terms of developer time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Architect MCP is open source: &lt;a href="https://github.com/AgiFlow/aicode-toolkit" rel="noopener noreferrer"&gt;github.com/AgiFlow/aicode-toolkit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The implementation is straightforward—it's an MCP server that reads YAML pattern definitions and uses Claude to validate code against them. The hard part isn't the code—it's defining your architectural patterns clearly enough to encode them. We spent more time debating what our patterns actually were than building the validation system.&lt;/p&gt;

&lt;p&gt;If you're building something similar, start with:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Identify your top 5 most-violated architectural patterns&lt;/li&gt;
&lt;li&gt;Define them as path-based rules in YAML&lt;/li&gt;
&lt;li&gt;Build the pre-generation context injection first (higher ROI than validation)&lt;/li&gt;
&lt;li&gt;Add validation once you've proven the concept&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Open Questions
&lt;/h2&gt;

&lt;p&gt;We're still figuring out:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Pattern Evolution&lt;/strong&gt;&lt;br&gt;
How do you version architectural patterns? When you update a pattern, do you auto-update all projects or let them opt-in?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Cross-File Patterns&lt;/strong&gt;&lt;br&gt;
The current implementation handles single-file patterns well. Cross-file architectural concerns (e.g., "services should only call repositories, never directly call other services") are harder to encode and validate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Performance at Scale&lt;/strong&gt;&lt;br&gt;
LLM-based validation works well at our scale (50 projects, ~10 changes/day). What happens at 500 projects or 1000 changes/day? Do you need caching, batching, or a hybrid approach?&lt;/p&gt;

&lt;p&gt;If you've solved these problems, I'd love to hear about it.&lt;/p&gt;




&lt;p&gt;If you're dealing with similar problems—AI generating code that works but breaks your architectural patterns—I'd be curious to hear how you're handling it. Drop a comment or reach out.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Resources:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/AgiFlow/aicode-toolkit" rel="noopener noreferrer"&gt;Architect MCP GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Previous post about &lt;a href="https://dev.to/vuong_ngo/scaling-ai-assisted-development-how-scaffolding-solved-my-monorepo-chaos-1g1k"&gt;Scaffolding technique&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>development</category>
      <category>vibecoding</category>
    </item>
    <item>
      <title>Scaling AI-Assisted Development: How Scaffolding Solved My Monorepo Chaos</title>
      <dc:creator>Vuong Ngo</dc:creator>
      <pubDate>Sun, 05 Oct 2025 23:40:29 +0000</pubDate>
      <link>https://dev.to/vuong_ngo/scaling-ai-assisted-development-how-scaffolding-solved-my-monorepo-chaos-1g1k</link>
      <guid>https://dev.to/vuong_ngo/scaling-ai-assisted-development-how-scaffolding-solved-my-monorepo-chaos-1g1k</guid>
      <description>&lt;p&gt;The Moment I Realized AI Coding Was Broken.&lt;/p&gt;

&lt;p&gt;It was 10 PM. I'd just asked Claude to add a navigation component. Thirty seconds later, I was staring at this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// What the AI generated (again)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Navigation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;NavigationProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;open&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setOpen&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;nav&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;navigation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/nav&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;Navigation&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing wrong with it, technically. Except I don't use default exports. I use named exports. And useState should come from our custom hooks. And we use 'isOpen', not 'open'. And the TypeScript interface should be exported separately like every other component in our codebase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I'd explained this exact pattern so many times I'd lost count.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Same pattern. Different day. Different component. Different wrong implementation.&lt;/p&gt;

&lt;p&gt;This wasn't a one-off. My monorepo had become a Frankenstein's monster of inconsistent patterns—each one technically correct, all of them a maintenance nightmare.&lt;/p&gt;

&lt;p&gt;The promise was simple: &lt;em&gt;AI would code faster than humans.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The reality? I was spending more time fixing AI-generated code than I would've spent just writing it myself.&lt;/p&gt;

&lt;h2&gt;
  
  
  How AI-Assisted Development Actually Breaks
&lt;/h2&gt;

&lt;p&gt;Here's what nobody tells you about scaling with AI coding assistants:&lt;/p&gt;

&lt;h3&gt;
  
  
  Week 1: The Honeymoon Phase
&lt;/h3&gt;

&lt;p&gt;You: "Build me a login page"&lt;br&gt;
AI: ✨ &lt;em&gt;generates perfect login page&lt;/em&gt; ✨&lt;br&gt;
You: "Holy shit, this is the future"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Everything works.&lt;/strong&gt; You're shipping features at 10x speed. Your manager thinks you're a wizard. You're thinking about that promotion.&lt;/p&gt;
&lt;h3&gt;
  
  
  Month 1: The Cracks Appear
&lt;/h3&gt;

&lt;p&gt;You're reviewing frontend components and notice something odd:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// TaskBadge.tsx (written 2 weeks ago)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TaskBadge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;TaskBadgeProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;span&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;`badge &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;getColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/span&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;;
&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// PriorityBadge.tsx (written yesterday)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;PriorityBadge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PriorityProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;color&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getPriorityColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;color&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// StatusLabel.tsx (written today)&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;StatusLabel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;StatusProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Badge&lt;/span&gt; &lt;span class="nx"&gt;variant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;getVariant&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/Badge&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three badge components. Three different patterns. All from the same AI. All from the same human (you).&lt;/p&gt;

&lt;p&gt;And on the backend, it's the same story:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// userService.ts (2 weeks ago)&lt;/span&gt;
&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;injectable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(@&lt;/span&gt;&lt;span class="nd"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;TYPES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Database&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Database&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// authService.ts (yesterday)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AuthService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Database&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Database&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// paymentService.ts (today)&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PaymentService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Database&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three services. One uses dependency injection properly. One doesn't. One is halfway there.&lt;/p&gt;

&lt;p&gt;"Okay, I need better instructions," you think.&lt;/p&gt;

&lt;h3&gt;
  
  
  Month 2: The Documentation Death Spiral
&lt;/h3&gt;

&lt;p&gt;Your &lt;code&gt;CLAUDE.md&lt;/code&gt; file has grown massively. You've documented:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Component patterns ✓&lt;/li&gt;
&lt;li&gt;Import styles ✓&lt;/li&gt;
&lt;li&gt;File naming ✓&lt;/li&gt;
&lt;li&gt;Prop validation ✓&lt;/li&gt;
&lt;li&gt;Error handling ✓&lt;/li&gt;
&lt;li&gt;State management ✓&lt;/li&gt;
&lt;li&gt;API patterns ✓&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You've told the AI &lt;strong&gt;everything&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Then you ask it to create a settings page, and it &lt;em&gt;still&lt;/em&gt; uses a different button component than the rest of your app.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"But I literally documented this!"&lt;/strong&gt; you scream at your screen at 3 AM.&lt;/p&gt;

&lt;p&gt;The AI apologizes (one time, I said "You're f*king right", which is hilarious). Generates a new version. Wrong again, but differently wrong.&lt;/p&gt;

&lt;h3&gt;
  
  
  Month 3: The Breaking Point
&lt;/h3&gt;

&lt;p&gt;You're now maintaining:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dozens of CLAUDE.md files scattered everywhere&lt;/li&gt;
&lt;li&gt;Multiple variations of what should be the same pattern&lt;/li&gt;
&lt;li&gt;A massive style guide that the AI follows inconsistently&lt;/li&gt;
&lt;li&gt;Code reviews that are mostly style debates instead of logic discussions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The math breaks:&lt;/strong&gt; You're spending hours fixing what should've taken minutes to write.&lt;/p&gt;

&lt;p&gt;This was me. And this was my monorepo:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frontend apps built with Next.js and TanStack Start&lt;/li&gt;
&lt;li&gt;Backend APIs using Hono.js, FastAPI and Lambda&lt;/li&gt;
&lt;li&gt;Shared packages for everything reusable&lt;/li&gt;
&lt;li&gt;Microservices, edge functions, and infrastructure all in one repo&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The bigger it grew, the worse it got. And I wasn't alone.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Failed Experiments
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Attempt 1: The Mega CLAUDE.md
&lt;/h3&gt;

&lt;p&gt;I created comprehensive documentation files referencing everything:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Project Structure&lt;/li&gt;
&lt;li&gt;Coding Standards&lt;/li&gt;
&lt;li&gt;Technology Stack&lt;/li&gt;
&lt;li&gt;Conventions&lt;/li&gt;
&lt;li&gt;Style System&lt;/li&gt;
&lt;li&gt;Development Process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Even with token-efficient docs, I couldn't cover all design patterns across multiple languages and frameworks. AI still made mistakes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Attempt 2: CLAUDE.md Everywhere
&lt;/h3&gt;

&lt;p&gt;"Maybe collocated instructions work better?" I created CLAUDE.md files everywhere for different apps, APIs, and packages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Slightly better when loaded in context (which didn't always happen). But the real issue: I only had a handful of distinct patterns. Maintaining dozens of instruction files for those same patterns? Nightmare fuel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Attempt 3: Autonomous Workflows
&lt;/h3&gt;

&lt;p&gt;I set up autonomous loops: PRD → code → lint/test → fix → repeat.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; I spent more time removing code and fixing bugs than if I'd just coded it myself. The AI would hallucinate solutions, ignore patterns, and create technical debt faster than I could clean it up.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Core Problems
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Inconsistency Across Codebase&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Frontend:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// AgentStatus.tsx - uses our design system&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;AgentStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;Props&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Badge&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;getStatusColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/Badge&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;;
&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// TaskStatus.tsx - reinvents the wheel&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;TaskProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;status-badge&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// SessionStatus.tsx - different again&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;SessionStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SessionProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;span&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;styles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/span&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Backend:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// taskRepo.ts - proper DI&lt;/span&gt;
&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;injectable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TaskRepository&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(@&lt;/span&gt;&lt;span class="nd"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;TYPES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Database&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IDatabaseService&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// projectRepo.ts - missing decorator&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProjectRepository&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IDatabaseService&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// memberRepo.ts - no DI at all&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MemberRepository&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getDatabaseClient&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same concept, different implementations. All technically correct. All maintenance nightmares.&lt;/p&gt;
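&lt;p&gt;This kind of drift can be detected mechanically. Here is a minimal sketch, assuming the three repository files above are available as plain strings. The &lt;code&gt;checkDiPattern&lt;/code&gt; function and its messages are hypothetical, not part of any real tool, and simple text matching like this is only a rough stand-in for a proper lint rule.&lt;/p&gt;

```typescript
// Hypothetical conformance check: flags class sources that drift from the DI pattern.
interface PatternReport {
  file: string;
  problems: string[];
}

function checkDiPattern(file: string, source: string): PatternReport {
  const problems: string[] = [];
  if (!source.includes("@injectable()")) {
    problems.push("missing @injectable() decorator");
  }
  if (!/@inject\(TYPES\.\w+\)/.test(source)) {
    problems.push("no @inject(TYPES.*) dependency found");
  }
  if (source.includes("getDatabaseClient()")) {
    problems.push("creates its own database client instead of receiving one");
  }
  return { file, problems };
}

// The three repositories from the example above, reduced to source strings.
const sources = [
  {
    file: "taskRepo.ts",
    source:
      "@injectable()\nexport class TaskRepository {\n  constructor(@inject(TYPES.Database) private db: IDatabaseService) {}\n}",
  },
  {
    file: "projectRepo.ts",
    source: "export class ProjectRepository {\n  constructor(private db: IDatabaseService) {}\n}",
  },
  {
    file: "memberRepo.ts",
    source: "export class MemberRepository {\n  private db = getDatabaseClient();\n}",
  },
];

const reports = sources.map((entry) => checkDiPattern(entry.file, entry.source));
for (const report of reports) {
  const status = report.problems.length === 0 ? "ok" : report.problems.join("; ");
  console.log(`${report.file}: ${status}`);
}
```

&lt;p&gt;Only &lt;code&gt;taskRepo.ts&lt;/code&gt; passes all three checks. A check like this tells you that the pattern drifted, but it does nothing to stop the drift from happening in the first place.&lt;/p&gt;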

&lt;h3&gt;
  
  
  2. &lt;strong&gt;Context Window Overload&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Your documentation grows from this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Conventions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Use functional components
&lt;span class="p"&gt;-&lt;/span&gt; Use TypeScript
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To this monstrosity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Conventions&lt;/span&gt;
&lt;span class="gu"&gt;## Components&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Use functional components
&lt;span class="p"&gt;-&lt;/span&gt; Props interface must be exported
&lt;span class="p"&gt;-&lt;/span&gt; Use PascalCase for component names
...(10+ reference docs)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Eventually, even AI can't keep up.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. &lt;strong&gt;Pattern Recreation Waste&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;How many times have you watched AI recreate the same pattern?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Authenticated API routes with similar structure&lt;/li&gt;
&lt;li&gt;Badge components that look identical but use different approaches&lt;/li&gt;
&lt;li&gt;Repository classes with the same DI pattern but inconsistent implementation&lt;/li&gt;
&lt;li&gt;Service classes that all need the same base configuration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each time slightly different. &lt;strong&gt;Hours wasted on work already done.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Intelligent Scaffolding
&lt;/h2&gt;

&lt;p&gt;Instead of fighting these problems with longer instructions, I needed a fundamental shift: &lt;strong&gt;teach AI to use templates, not just write code&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How It Works:&lt;/strong&gt; The scaffolding approach uses &lt;strong&gt;MCP (Model Context Protocol)&lt;/strong&gt; to expose template generation as a tool that AI agents can call. It uses &lt;strong&gt;structured output&lt;/strong&gt; (JSON Schema validation) for the initial code generation, so the template variables are properly typed and validated before any file is written. The generated code then serves as &lt;strong&gt;guided generation for the LLM&lt;/strong&gt;: a solid foundation that follows your patterns, which the AI can then enhance with context-specific logic. Think of it as "fill-in-the-blanks" coding: the structure is guaranteed consistent, while the AI adds intelligence where it matters.&lt;/p&gt;
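&lt;p&gt;To make the validate-then-render idea concrete, here is a minimal sketch. It is not the scaffold-mcp implementation: &lt;code&gt;validateVariables&lt;/code&gt; and &lt;code&gt;renderTemplate&lt;/code&gt; are hypothetical names, and the validator covers only a small subset of JSON Schema (required keys and a string pattern).&lt;/p&gt;

```typescript
// Hypothetical sketch: validate template variables, then fill a skeleton.
interface VariableSpec {
  pattern?: RegExp;
  description: string;
}

interface TemplateSchema {
  properties: { [name: string]: VariableSpec };
  required: string[];
}

type Variables = { [name: string]: string };

// Returns a list of human-readable errors; an empty list means the input is valid.
function validateVariables(schema: TemplateSchema, vars: Variables): string[] {
  const errors: string[] = [];
  for (const name of schema.required) {
    if (vars[name] === undefined) {
      errors.push(`missing required variable: ${name}`);
    }
  }
  for (const name of Object.keys(vars)) {
    const spec = schema.properties[name];
    const value = vars[name];
    if (spec === undefined) {
      errors.push(`unknown variable: ${name}`);
      continue;
    }
    if (spec.pattern === undefined || value === undefined) {
      continue;
    }
    if (!spec.pattern.test(value)) {
      errors.push(`${name} does not match ${spec.pattern}`);
    }
  }
  return errors;
}

// Replaces {{ name }} placeholders; unknown placeholders are left untouched.
function renderTemplate(template: string, vars: Variables): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match: string, name: string) => {
    const value = vars[name];
    return value === undefined ? match : value;
  });
}

const schema: TemplateSchema = {
  properties: {
    serviceName: { pattern: /^[A-Z][a-zA-Z0-9]*$/, description: "Service name in PascalCase" },
  },
  required: ["serviceName"],
};

const skeleton = "export class {{ serviceName }}Service {\n  // AI fills in methods\n}";
const vars: Variables = { serviceName: "Notification" };
const errors = validateVariables(schema, vars);
const output = errors.length === 0 ? renderTemplate(skeleton, vars) : "";
console.log(errors.length === 0 ? output : errors.join("\n"));
```

&lt;p&gt;The point of validating first is that a bad variable (say, &lt;code&gt;notification&lt;/code&gt; in lowercase) is rejected before any file is written, so the AI gets a clear error instead of a subtly wrong file.&lt;/p&gt;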
&lt;h3&gt;
  
  
  The Key Insight
&lt;/h3&gt;

&lt;p&gt;Traditional scaffolding requires complete, rigid templates. But with AI coding assistants, you only need:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;A skeleton with minimal code&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A header comment declaring the pattern and rules&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Let the AI fill in the blanks contextually&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example from our actual codebase:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="cm"&gt;/**
 * PATTERN: Injectable Service with Dependency Injection
 * - MUST use @injectable() decorator
 * - MUST inject dependencies with @inject(TYPES.*)
 * - MUST define constructor parameters as private/public based on usage
 * - MUST include JSDoc with design principles
 */&lt;/span&gt;
&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;injectable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="err"&gt;{{ &lt;/span&gt;&lt;span class="nc"&gt;ServiceName&lt;/span&gt; &lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="nx"&gt;Service&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;TYPES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Database&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IDatabaseService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;TYPES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// AI fills in initialization logic&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// AI generates methods following the established pattern&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI now knows the rules and generates code that follows them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enter: Scaffold MCP
&lt;/h2&gt;

&lt;p&gt;I built &lt;a href="https://github.com/AgiFlow/aicode-toolkit" rel="noopener noreferrer"&gt;&lt;code&gt;@agiflowai/scaffold-mcp&lt;/code&gt;&lt;/a&gt; to implement this approach. It's an MCP (Model Context Protocol) server that provides:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Boilerplate templates&lt;/strong&gt; for new projects&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature scaffolds&lt;/strong&gt; for adding to existing projects&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI-friendly&lt;/strong&gt; minimal templates with clear patterns&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Why MCP?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ Works with Claude Code, Cursor, or any MCP-compatible tool&lt;/li&gt;
&lt;li&gt;✅ Tech stack agnostic (Next.js, React, Hono.js, your custom setup)&lt;/li&gt;
&lt;li&gt;✅ Multiple modes: MCP server or standalone CLI&lt;/li&gt;
&lt;li&gt;✅ Always available to AI like any other tool&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Real-World Workflow Transformation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Before Scaffolding: Starting a New API
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: "Create a new Hono API with authentication"
AI: *generates files with different patterns*
You: "Wait, where's the dependency injection?"
You: "Can you use our standard middleware setup?"
You: "Actually, use Zod for validation like our other APIs..."
*Back-and-forth debugging*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  After Scaffolding: Starting a New API
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Using CLI:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# See available templates&lt;/span&gt;
scaffold-mcp boilerplate list

&lt;span class="c"&gt;# Create API with exact conventions&lt;/span&gt;
scaffold-mcp boilerplate create hono-api-boilerplate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--vars&lt;/span&gt; &lt;span class="s1"&gt;'{"apiName":"notification-api","port":"3002"}'&lt;/span&gt;

&lt;span class="c"&gt;# ✓ Complete API structure created&lt;/span&gt;
&lt;span class="c"&gt;# ✓ Dependency injection configured&lt;/span&gt;
&lt;span class="c"&gt;# ✓ All following your team's conventions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Using Claude Code:&lt;/strong&gt;&lt;br&gt;
Simply say: "Create a new notification API"&lt;/p&gt;

&lt;p&gt;Claude automatically uses the scaffold-mcp MCP server and creates your API with proper DI, middleware, and validation.&lt;/p&gt;
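&lt;p&gt;Under the hood, an MCP client invokes a server tool with a JSON-RPC &lt;code&gt;tools/call&lt;/code&gt; request. The sketch below builds such a request. The tool name &lt;code&gt;use-boilerplate&lt;/code&gt; and the argument keys are assumptions made for illustration; check the server's tool listing for the real names.&lt;/p&gt;

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: { [key: string]: unknown };
  };
}

function buildToolCall(
  id: number,
  name: string,
  args: { [key: string]: unknown },
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical tool name and argument keys, mirroring the CLI example above.
const request = buildToolCall(1, "use-boilerplate", {
  boilerplateName: "hono-api-boilerplate",
  variables: { apiName: "notification-api", port: "3002" },
});

console.log(JSON.stringify(request, null, 2));
```

&lt;p&gt;You never write this by hand. The agent produces it when it decides to call the tool, which is why the variables have to be validated on the server side.&lt;/p&gt;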
&lt;h3&gt;
  
  
  Before Scaffolding: Adding Features
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: "Add a new repository class for comments"
AI: *creates class without DI decorator*
You: "No, use dependency injection like the other repos"
AI: *adds DI but forgets the @injectable decorator*
You: "Look at TaskRepository as an example"
*More back-and-forth*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  After Scaffolding: Adding Features
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Using CLI:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# What can I add?&lt;/span&gt;
scaffold-mcp scaffold list ./backend/apis/my-api

&lt;span class="c"&gt;# Add matching feature&lt;/span&gt;
scaffold-mcp scaffold add scaffold-repository &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--project&lt;/span&gt; ./backend/apis/my-api &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--vars&lt;/span&gt; &lt;span class="s1"&gt;'{"entityName":"Comment","tableName":"comments"}'&lt;/span&gt;

&lt;span class="c"&gt;# ✓ Perfect pattern match with proper DI&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Using Claude Code:&lt;/strong&gt;&lt;br&gt;
"Add a repository for comments to my API"&lt;/p&gt;

&lt;p&gt;Claude uses scaffold-mcp to ensure the new repository matches your DI patterns, uses the correct decorators, and follows your coding standards.&lt;/p&gt;
&lt;h2&gt;
  
  
  Creating Your Own Templates
&lt;/h2&gt;

&lt;p&gt;The real power comes from encoding your team's patterns.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Installation
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @agiflowai/scaffold-mcp

&lt;span class="c"&gt;# Initialize templates&lt;/span&gt;
scaffold-mcp init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Step 2: Enable Admin Tools (Claude Code)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"scaffold-mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@agiflowai/scaffold-mcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp-serve"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"--admin-enable"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Step 3: Create Template with AI
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Tell Claude:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Create a boilerplate template called 'injectable-service' in 'backend-templates'
that creates backend services with dependency injection and proper structure"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Claude uses the admin tool&lt;/strong&gt; to generate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# templates/backend-templates/scaffold.yaml&lt;/span&gt;
&lt;span class="na"&gt;boilerplate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;injectable-service&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Backend service with DI, config, and proper structure&lt;/span&gt;
  &lt;span class="na"&gt;targetFolder&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backend/apis/my-api/src/services&lt;/span&gt;

  &lt;span class="na"&gt;variables_schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
    &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;serviceName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
        &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^[A-Z][a-zA-Z0-9]*$"&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service name in PascalCase&lt;/span&gt;
    &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;serviceName&lt;/span&gt;

  &lt;span class="na"&gt;includes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="pi"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;serviceName | camelCase&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;&lt;span class="s"&gt;Service.ts&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="pi"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;serviceName | camelCase&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;&lt;span class="s"&gt;Service.test.ts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
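
&lt;p&gt;To make the &lt;code&gt;includes&lt;/code&gt; entries concrete, here is a minimal TypeScript sketch of how a placeholder such as &lt;code&gt;{{ serviceName | camelCase }}&lt;/code&gt; could resolve to a file path. It is not scaffold-mcp's implementation and not a real Liquid engine: &lt;code&gt;renderPath&lt;/code&gt;, the &lt;code&gt;filters&lt;/code&gt; table, and the simplified &lt;code&gt;camelCase&lt;/code&gt; are hypothetical names introduced for illustration, and the sketch assumes the PascalCase input that the schema pattern enforces.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical mini-renderer for "{{ variable | filter }}" placeholders.
// It only handles the single form used in the `includes` list above.
type Filter = (value: string) =&gt; string;
type Variables = { [name: string]: string };

const filters: { [name: string]: Filter } = {
  // Simplified: lowercases the first letter, assuming PascalCase input.
  camelCase: (value) =&gt; value.charAt(0).toLowerCase() + value.slice(1),
};

function renderPath(template: string, vars: Variables): string {
  const placeholder = /\{\{\s*(\w+)\s*(?:\|\s*(\w+)\s*)?\}\}/g;
  return template.replace(
    placeholder,
    (_match: string, name: string, filterName?: string) =&gt; {
      const value = vars[name];
      if (value === undefined) {
        throw new Error(`Missing variable: ${name}`);
      }
      if (filterName === undefined) {
        return value; // no filter: insert the raw value
      }
      const filter = filters[filterName];
      if (filter === undefined) {
        throw new Error(`Unknown filter: ${filterName}`);
      }
      return filter(value);
    },
  );
}

// Values copied from the scaffold.yaml above.
const targetFolder = "backend/apis/my-api/src/services";
const includes = [
  "{{ serviceName | camelCase }}Service.ts",
  "{{ serviceName | camelCase }}Service.test.ts",
];

const resolved = includes.map(
  (entry) =&gt; `${targetFolder}/${renderPath(entry, { serviceName: "Email" })}`,
);
// resolved[0]: backend/apis/my-api/src/services/emailService.ts
// resolved[1]: backend/apis/my-api/src/services/emailService.test.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Throwing on a missing variable or unknown filter is a design choice in this sketch: a typo in a template then fails loudly instead of producing a file with a half-rendered name.&lt;/p&gt;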



&lt;h3&gt;
  
  
  Step 4: Create Template Files
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Tell Claude:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Create the service template file with dependency injection, config support, and JSDoc"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Claude generates:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// templates/backend-templates/{{ serviceName | camelCase }}Service.ts.liquid&lt;/span&gt;
&lt;span class="cm"&gt;/**
 * {{ serviceName }} Service
 *
 * DESIGN PRINCIPLES:
 * - Dependency Injection: Use Inversify for loose coupling
 * - Type Safety: Full TypeScript support
 * - Configuration: Use Config service for environment variables
 *
 * PATTERN: Injectable Service
 * - MUST use @injectable() decorator
 * - MUST inject dependencies with @inject(TYPES.*)
 * - MUST document design principles in JSDoc
 */&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;injectable&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;inversify&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;IDatabaseService&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@agimonai/backend-database&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Config&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@/services&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;TYPES&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@/types&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;injectable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="err"&gt;{{ &lt;/span&gt;&lt;span class="nc"&gt;serviceName&lt;/span&gt; &lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="nx"&gt;Service&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;TYPES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Database&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IDatabaseService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;TYPES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Initialization logic&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="cm"&gt;/**
   * Add your service methods here
   */&lt;/span&gt;
  &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Implementation&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Use Your Template
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;scaffold-mcp boilerplate create injectable-service &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--vars&lt;/span&gt; &lt;span class="s1"&gt;'{"serviceName":"Email"}'&lt;/span&gt;

&lt;span class="c"&gt;# ✓ Created backend/apis/my-api/src/services/emailService.ts&lt;/span&gt;
&lt;span class="c"&gt;# ✓ Created backend/apis/my-api/src/services/emailService.test.ts&lt;/span&gt;
&lt;span class="c"&gt;# ✓ All with proper DI, JSDoc, and patterns&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or with Claude Code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Create a new Email service using our injectable service template"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Results
&lt;/h2&gt;

&lt;p&gt;Here is what changed after switching to scaffolding.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Setup time&lt;/strong&gt;: Hours of back-and-forth per project&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code consistency&lt;/strong&gt;: Inconsistent across the codebase&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review time&lt;/strong&gt;: Mostly spent on style debates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding&lt;/strong&gt;: Weeks to learn all the conventions&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  After
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Setup time&lt;/strong&gt;: Minutes per project&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code consistency&lt;/strong&gt;: Enforced by templates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review time&lt;/strong&gt;: Focused on logic, not style&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding&lt;/strong&gt;: Days instead of weeks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Net result&lt;/strong&gt;: Setup takes minutes instead of hours, reviews focus on logic rather than style, and conventions are applied consistently across the entire monorepo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Start Simple, Evolve Gradually
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Week 1: Use community templates&lt;/span&gt;
scaffold-mcp add &lt;span class="nt"&gt;--name&lt;/span&gt; nextjs-15 &lt;span class="nt"&gt;--url&lt;/span&gt; https://github.com/AgiFlow/aicode-toolkit

&lt;span class="c"&gt;# Weeks 2-4: Observe what you change repeatedly&lt;/span&gt;

&lt;span class="c"&gt;# Week 5+: Create custom templates for your patterns&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Use Liquid Filters for Consistency
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight liquid"&gt;&lt;code&gt;&lt;span class="cp"&gt;{%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;comment&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="c"&gt;
✅ Good: Ensure consistent casing with filters
Available filters: pascalCase, camelCase, kebabCase, snakeCase, upperCase
&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;endcomment&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;%}&lt;/span&gt;
@injectable()
export class &lt;span class="cp"&gt;{{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;serviceName&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;pascalCase&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;}}&lt;/span&gt;Service {
  private readonly logger = createLogger('&lt;span class="cp"&gt;{{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;serviceName&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;kebabCase&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;}}&lt;/span&gt;');
  private readonly TABLE_NAME = '&lt;span class="cp"&gt;{{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;tableName&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;snakeCase&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;}}&lt;/span&gt;';
}

&lt;span class="cp"&gt;{%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;comment&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="c"&gt;
❌ Bad: Rely on user input casing
&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;endcomment&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;%}&lt;/span&gt;
export class &lt;span class="cp"&gt;{{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;serviceName&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;}}&lt;/span&gt;Service {
  private logger = createLogger('&lt;span class="cp"&gt;{{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;serviceName&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;}}&lt;/span&gt;');
  private TABLE = '&lt;span class="cp"&gt;{{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;tableName&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cp"&gt;}}&lt;/span&gt;';
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
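
&lt;p&gt;For readers unsure what each casing filter produces, the following TypeScript sketch re-implements four of them in a few lines. It is a hypothetical illustration, not scaffold-mcp's code: &lt;code&gt;splitWords&lt;/code&gt; and &lt;code&gt;capitalize&lt;/code&gt; are made-up helpers, and the real filters may split words differently (this version, for instance, treats an acronym run such as &lt;code&gt;HTTP&lt;/code&gt; as part of the following word).&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical re-implementation of the casing filters, for illustration only.

// Split "EmailDigest", "email_digest" or "email-digest" into lowercase words.
function splitWords(input: string): string[] {
  return input
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // break at lower-to-upper boundaries
    .split(/[^a-zA-Z0-9]+/) // break at separators such as "-", "_" or spaces
    .filter((word) =&gt; word.length &gt; 0)
    .map((word) =&gt; word.toLowerCase());
}

function capitalize(word: string): string {
  return word.charAt(0).toUpperCase() + word.slice(1);
}

function pascalCase(input: string): string {
  return splitWords(input).map(capitalize).join("");
}

function camelCase(input: string): string {
  const pascal = pascalCase(input);
  return pascal.charAt(0).toLowerCase() + pascal.slice(1);
}

function kebabCase(input: string): string {
  return splitWords(input).join("-");
}

function snakeCase(input: string): string {
  return splitWords(input).join("_");
}

console.log(pascalCase("email-digest")); // EmailDigest
console.log(camelCase("EmailDigest")); // emailDigest
console.log(kebabCase("EmailDigest")); // email-digest
console.log(snakeCase("UserAccounts")); // user_accounts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because every filter normalizes through the same word list, the class name, logger name, and table name stay in agreement no matter how the user typed the input.&lt;/p&gt;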



&lt;h3&gt;
  
  
  3. Validate with JSON Schema
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ✅ Good: Enforce format and patterns&lt;/span&gt;
&lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;serviceName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
    &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^[A-Z][a-zA-Z0-9]*$"&lt;/span&gt;  &lt;span class="c1"&gt;# Must be PascalCase&lt;/span&gt;
    &lt;span class="na"&gt;examples&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Email"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;number&lt;/span&gt;
    &lt;span class="na"&gt;minimum&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3000&lt;/span&gt;
    &lt;span class="na"&gt;maximum&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;9999&lt;/span&gt;

&lt;span class="c1"&gt;# ❌ Bad: Accept anything&lt;/span&gt;
&lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;serviceName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;number&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
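
&lt;p&gt;The effect of the stricter schema can be sketched in TypeScript with hand-written checks that mirror its &lt;code&gt;pattern&lt;/code&gt;, &lt;code&gt;minimum&lt;/code&gt;, and &lt;code&gt;maximum&lt;/code&gt; rules. This is an assumption-laden illustration rather than how scaffold-mcp validates input: &lt;code&gt;ServiceVars&lt;/code&gt; and &lt;code&gt;validateVars&lt;/code&gt; are hypothetical names, and a real setup would more likely hand the schema to a JSON Schema validator library.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hand-rolled checks mirroring the "Good" schema above (illustration only).
interface ServiceVars {
  serviceName: string;
  port: number;
}

const SERVICE_NAME_PATTERN = /^[A-Z][a-zA-Z0-9]*$/; // same pattern as the schema
const MIN_PORT = 3000;
const MAX_PORT = 9999;

// Returns a list of human-readable problems; an empty list means valid.
function validateVars(vars: ServiceVars): string[] {
  const errors: string[] = [];

  if (!SERVICE_NAME_PATTERN.test(vars.serviceName)) {
    errors.push(`serviceName "${vars.serviceName}" must be PascalCase`);
  }

  if (
    !Number.isFinite(vars.port) ||
    MIN_PORT &gt; vars.port ||
    vars.port &gt; MAX_PORT
  ) {
    errors.push(`port ${vars.port} must be between ${MIN_PORT} and ${MAX_PORT}`);
  }

  return errors;
}

console.log(validateVars({ serviceName: "Email", port: 3000 }).length); // 0
console.log(validateVars({ serviceName: "email", port: 80 }).length); // 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under the permissive schema both calls would pass, and the bad values would only surface later as a lowercase class name or a service bound to an unexpected port.&lt;/p&gt;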



&lt;h3&gt;
  
  
  4. Document in Templates
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;instruction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
  &lt;span class="s"&gt;Service created successfully!&lt;/span&gt;

  &lt;span class="s"&gt;Files created:&lt;/span&gt;
  &lt;span class="s"&gt;- {{ serviceName | camelCase }}Service.ts: Main service with DI&lt;/span&gt;
  &lt;span class="s"&gt;- {{ serviceName | camelCase }}Service.test.ts: Test suite&lt;/span&gt;

  &lt;span class="s"&gt;Next steps:&lt;/span&gt;
  &lt;span class="s"&gt;1. Register in TYPES: add {{ serviceName }}Service to the dependency container&lt;/span&gt;
  &lt;span class="s"&gt;2. Run `pnpm test` to verify tests pass&lt;/span&gt;
  &lt;span class="s"&gt;3. Inject: @inject(TYPES.{{ serviceName }}Service)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Getting Started Today
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Quick Start (5 minutes)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Install&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @agiflowai/scaffold-mcp

&lt;span class="c"&gt;# 2. Initialize&lt;/span&gt;
scaffold-mcp init

&lt;span class="c"&gt;# 3. List templates&lt;/span&gt;
scaffold-mcp boilerplate list

&lt;span class="c"&gt;# 4. Create project&lt;/span&gt;
scaffold-mcp boilerplate create &amp;lt;name&amp;gt; &lt;span class="nt"&gt;--vars&lt;/span&gt; &lt;span class="s1"&gt;'{"projectName":"my-app"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Claude Code Setup (2 minutes)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"scaffold-mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@agiflowai/scaffold-mcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp-serve"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restart Claude Code and ask: "What scaffolding templates are available?"&lt;/p&gt;

&lt;h2&gt;
  
  
  The Path Forward
&lt;/h2&gt;

&lt;p&gt;The future of AI-assisted development isn't about AI writing more code—it's about AI writing the &lt;strong&gt;right&lt;/strong&gt; code, &lt;strong&gt;consistently&lt;/strong&gt;, following &lt;strong&gt;your&lt;/strong&gt; conventions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Three Levels of Adoption
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Level 1: User&lt;/strong&gt; (Start here)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use existing templates&lt;/li&gt;
&lt;li&gt;Setup in minutes instead of hours&lt;/li&gt;
&lt;li&gt;Guaranteed consistency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Level 2: Customizer&lt;/strong&gt; (Next step)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adapt templates to your team&lt;/li&gt;
&lt;li&gt;Encode patterns once, reuse forever&lt;/li&gt;
&lt;li&gt;Zero convention debates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Level 3: Creator&lt;/strong&gt; (Advanced)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Build custom templates for your stack&lt;/li&gt;
&lt;li&gt;Advanced generators for complex workflows&lt;/li&gt;
&lt;li&gt;Share across your organization&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Stop fighting with AI over conventions. Stop reviewing the same style issues. Stop recreating the same patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start with templates. Scale with scaffolding.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/AgiFlow/aicode-toolkit" rel="noopener noreferrer"&gt;github.com/AgiFlow/aicode-toolkit&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NPM&lt;/strong&gt;: &lt;a href="https://www.npmjs.com/package/@agiflowai/scaffold-mcp" rel="noopener noreferrer"&gt;@agiflowai/scaffold-mcp&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;"The best code is the code you don't have to write. But when you do write it, scaffolding ensures you write it right the first time—every time."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;This is Part 1 of my series on making AI coding assistants work on complex projects. Stay tuned for Part 2!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Questions? I'm happy to discuss architecture patterns, scaffolding strategies, or share more implementation details in the comments.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>vibecoding</category>
      <category>webdev</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
