<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aming</title>
    <description>The latest articles on DEV Community by Aming (@amingin_ai).</description>
    <link>https://dev.to/amingin_ai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3935251%2Fe38632bc-051d-45df-8c07-d7232255591c.jpg</url>
      <title>DEV Community: Aming</title>
      <link>https://dev.to/amingin_ai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/amingin_ai"/>
    <language>en</language>
    <item>
      <title>Route Context: How I Built Right Context for Agents</title>
      <dc:creator>Aming</dc:creator>
      <pubDate>Sat, 30 May 2026 12:28:20 +0000</pubDate>
      <link>https://dev.to/amingin_ai/route-context-how-i-built-right-context-for-agents-3i1</link>
      <guid>https://dev.to/amingin_ai/route-context-how-i-built-right-context-for-agents-3i1</guid>
      <description>&lt;p&gt;&lt;em&gt;We already knew agents needed the right context. Dogfood taught us the harder problem: making that context live, visible, auditable, and enforceable inside a multi-agent workflow.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The failure looked like success.&lt;/p&gt;

&lt;p&gt;A worker changed one file. The focused test passed. The patch was small, readable, and locally correct. If you only looked at the worker's report, the job was done.&lt;/p&gt;

&lt;p&gt;But the overall job was still wrong.&lt;/p&gt;

&lt;p&gt;The change touched a permission-sensitive path. It needed evidence that the right role had acted, that the project map still matched the code, that the audit trail could explain why the edit was allowed, and that the final close check had not been skipped. Instead, the system had solved the nearest local problem. A lane that should have preserved the shape of the work had drifted into implementation. The worker had completed its tiny job, while the whole route had lost its meaning.&lt;/p&gt;

&lt;p&gt;Nobody had to make a dramatic mistake for this to happen. That was the unsettling part. The worker was not lazy. The test was not fake. The prompt was not empty. The system had plenty of context. It just did not have route context.&lt;/p&gt;

&lt;p&gt;Route context is the per-task, per-lane runtime packet that tells an agent what job it is in, what role it has, what it saw, what it cannot do, and what evidence must pass before the work is done.&lt;/p&gt;

&lt;p&gt;That is the scope of this article. I am not trying to convince technical readers that right context matters. Most people building with agents have already learned that, usually the hard way. The point is how we made right context operational for agent development through route context, a live artifact that turns the principle into prompt-visible, hashable, auditable, gate-checkable workflow state.&lt;/p&gt;

&lt;p&gt;We started with a different hope. We wanted agent development to feel zero-orchestration-ish: the user says what they want, the system understands the work, the right agents handle the right pieces, and nobody has to manually conduct a meeting of subagents. The observer would keep things coherent. Workers would implement. Review lanes would check evidence. Validation would catch bad finishes.&lt;/p&gt;

&lt;p&gt;Then we used it on ourselves.&lt;/p&gt;

&lt;p&gt;Again and again, the same failure appeared in different clothes. The observer would begin correctly, then collapse into a nearby code edit. A worker would pass a test, but not satisfy the larger obligation. A reviewer would evaluate plausible reasoning, but not the same evidence the worker actually produced. A prompt would contain lots of useful system knowledge, but not the specific promises this worker had to satisfy.&lt;/p&gt;

&lt;p&gt;That is when the thesis became operational: AI agent development needs right context, not more context, and route context is how we made that sentence executable.&lt;/p&gt;

&lt;h2&gt;The Words We Needed&lt;/h2&gt;

&lt;p&gt;We had to translate our internal language into something simple enough to survive a live task.&lt;/p&gt;

&lt;p&gt;Topology means what kind of job this really is. Is this a tiny deterministic bug fix, a permission change, a runtime change, a graph update, a public decision, or a mixed task that needs independent lanes?&lt;/p&gt;

&lt;p&gt;A contract is the set of promises a worker must satisfy: target files, acceptance criteria, out-of-scope areas, allowed actions, blocked actions, and evidence to return.&lt;/p&gt;

&lt;p&gt;A gate is a check that prevents a false finish. A test can be a gate, but it is not the only one. A close gate may reject a change because the audit evidence is missing, the runtime was not redeployed, the project map is stale, or the wrong role acted.&lt;/p&gt;

&lt;p&gt;Graph, backlog, timeline, and contract are the governance layers that preserve project state, user intent, execution evidence, and allowed action. The graph is the project map. The backlog records the work and acceptance criteria. The timeline records what happened. The contract says what this lane may do.&lt;/p&gt;

&lt;p&gt;Once those words were clear, the bug was easier to see. Our static skill text explained the system, but the live task needed route context: the current path through the work, with the role, lane, injected context, and checks bound to the action about to happen.&lt;/p&gt;

&lt;p&gt;Static skill text is only a bootloader. It can teach the agent that observers, workers, review lanes, graphs, and gates exist. It cannot reliably decide what matters for this task, this lane, this file, this permission boundary, and this moment. Route context is the runtime packet assembled for that decision.&lt;/p&gt;

&lt;h2&gt;The Route Context Artifact&lt;/h2&gt;

&lt;p&gt;The useful fix was not a bigger prompt. It was a route context alert: a small implementation artifact generated before lane dispatch, during topology classification and prompt-contract assembly. It describes what the lane is allowed to do, what was injected into its prompt, and what evidence will be checked later.&lt;/p&gt;

&lt;p&gt;A simplified route context alert looks like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;route_context_alert&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;task_intent&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;permission&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;handling&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;without&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;changing&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;audit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;policy"&lt;/span&gt;
  &lt;span class="na"&gt;role_boundary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;implementation&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;worker;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;no&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;merge,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;close,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;or&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;graph&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;mutation"&lt;/span&gt;
  &lt;span class="na"&gt;topology&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;permission-sensitive&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;bug;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;requires&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;independent&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;review"&lt;/span&gt;
  &lt;span class="na"&gt;contract&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;edit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;accepted&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;target&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;file;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;run&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;focused&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;test;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;report&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;evidence"&lt;/span&gt;
  &lt;span class="na"&gt;blocked_actions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;change&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;route&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;policy"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;close&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;backlog"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redeploy&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;runtime"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;visible_injection_manifest&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;route_doc@sha256:..."&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contract@sha256:..."&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;evidence_gates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;focused_test"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;independent_validation"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;close_gate"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not magic model control. The model can still misunderstand things. The point is that the important context is visible in the prompt, listed in an audit manifest, and tied to workflow checks that can reject a false finish.&lt;/p&gt;

&lt;p&gt;The visible injection manifest was especially important. If a document, decision summary, expert note, or implementation contract influences a lane, it should appear in the manifest with an id, kind, source reference, and hash. The hash proves the identity of the injected artifact. Gates and evidence decide whether the lane satisfied the contract. Without the manifest, nobody can reconstruct what the agent actually saw.&lt;/p&gt;
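
&lt;p&gt;As a minimal sketch of what a manifest entry could look like, assuming SHA-256 over the artifact text: the function names, the &lt;code&gt;kind&lt;/code&gt; values, and the file paths below are hypothetical illustrations, not Aming Claw's actual API.&lt;/p&gt;

```python
import hashlib
import json


def manifest_entry(artifact_id, kind, source_ref, content):
    """Describe one injected artifact by id, kind, source reference, and hash."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return {
        "id": artifact_id,
        "kind": kind,
        "source_ref": source_ref,
        "hash": "sha256:" + digest,
    }


def manifest_hash(entries):
    """Hash the whole manifest so a lane can carry one value for everything it saw."""
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable.
    canonical = json.dumps(entries, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical artifacts; the paths and contents are placeholders.
entries = [
    manifest_entry("route_doc", "document", "docs/route.md", "route text"),
    manifest_entry("contract", "implementation_contract", "contracts/worker.json", "{}"),
]
print(json.dumps(entries, indent=2))
print(manifest_hash(entries))
```

&lt;p&gt;The hash only identifies what was injected. Whether the lane honored it is still a question for the gates.&lt;/p&gt;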

&lt;p&gt;Route context gave us a way to keep orchestration minimal without making it invisible. The system still feels close to zero-orchestration from the user's side, but the route carries enough explicit structure that lanes do not silently blend together.&lt;/p&gt;

&lt;h2&gt;The Implementation Pattern In Aming Claw&lt;/h2&gt;

&lt;p&gt;This is not only language we use around the system. Aming Claw implements the pattern as code-level contracts, local gates, audited queries, and append-only evidence.&lt;/p&gt;

&lt;p&gt;The route prompt contract is source-controlled in the &lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/agent/governance/contract_templates/mf_workflow_runtime.v1.json#L12-L55" rel="noopener noreferrer"&gt;&lt;code&gt;mf_workflow_runtime.v1.json&lt;/code&gt; template&lt;/a&gt;. That contract makes route-owned prompt context explicit: the injected artifacts are listed in a visible manifest, observer and review lanes are blocked from drifting into implementation, and the worker has to carry matching &lt;code&gt;route_context_hash&lt;/code&gt;, &lt;code&gt;prompt_contract_id&lt;/code&gt;, and &lt;code&gt;prompt_contract_hash&lt;/code&gt; values. In other words, the prompt is no longer just a bag of helpful text. It has identity.&lt;/p&gt;

&lt;p&gt;Before a bounded worker is handed the job, Aming Claw runs a local dispatch gate in &lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/agent/governance/mf_subagent_contract.py#L859-L1031" rel="noopener noreferrer"&gt;&lt;code&gt;mf_subagent_contract.py&lt;/code&gt;&lt;/a&gt;. The gate checks the worker's branch, worktree, base commit, target head, merge queue, fence token, route hash, prompt hash, and owned files. Same-worktree dispatch is blocked by default because "please stay in this directory" is not a boundary. The boundary has to be represented in durable facts the system can re-check.&lt;/p&gt;
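
&lt;p&gt;To make the shape of such a gate concrete, here is a simplified sketch covering only a few of the checks listed above (worktree, fence token, route hash, prompt hash, owned files). The &lt;code&gt;Handoff&lt;/code&gt; type and &lt;code&gt;dispatch_blockers&lt;/code&gt; function are assumptions for illustration and do not mirror the code in &lt;code&gt;mf_subagent_contract.py&lt;/code&gt;.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handoff:
    """Hypothetical record of the durable facts a worker is dispatched with."""
    worktree: str
    fence_token: str
    route_context_hash: str
    prompt_contract_hash: str
    owned_files: tuple


def dispatch_blockers(handoff, observer_worktree, expected_route_hash,
                      expected_prompt_hash, target_files):
    """Return the reasons dispatch must be refused; an empty list means go."""
    blockers = []
    if handoff.worktree == observer_worktree:
        blockers.append("same-worktree dispatch is blocked by default")
    if not handoff.fence_token:
        blockers.append("missing fence token")
    if handoff.route_context_hash != expected_route_hash:
        blockers.append("route_context_hash mismatch")
    if handoff.prompt_contract_hash != expected_prompt_hash:
        blockers.append("prompt_contract_hash mismatch")
    unowned = sorted(set(target_files) - set(handoff.owned_files))
    if unowned:
        blockers.append("target files outside the file fence: " + ", ".join(unowned))
    return blockers


# Placeholder values for illustration only.
handoff = Handoff(
    worktree="/tmp/worktrees/worker-1",
    fence_token="fence-001",
    route_context_hash="abc",
    prompt_contract_hash="def",
    owned_files=("auth/permissions.py",),
)
clean = dispatch_blockers(handoff, "/repo", "abc", "def", ["auth/permissions.py"])
refused = dispatch_blockers(handoff, "/tmp/worktrees/worker-1", "abc", "zzz", ["auth/audit.py"])
print(clean)
print(refused)
```

&lt;p&gt;Returning a list of reasons instead of a boolean keeps the refusal auditable: the route can record exactly why a handoff did not happen.&lt;/p&gt;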

&lt;p&gt;When the worker returns, the finish gate in the same module treats the response as a claim, not as truth. The &lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/agent/governance/mf_subagent_contract.py#L1191-L1249" rel="noopener noreferrer"&gt;&lt;code&gt;finish validation&lt;/code&gt;&lt;/a&gt; requires the fence token to match, tests to pass, blockers to be absent, a checkpoint id to exist, and the worker identity to still match the handoff. This is the difference between "the agent says it is done" and "the route can safely advance."&lt;/p&gt;
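
&lt;p&gt;A rough sketch of that claim-versus-truth check, under the assumption that the worker's response and the handoff are plain dictionaries. The field names are hypothetical and chosen only to match the five conditions named above.&lt;/p&gt;

```python
def finish_gate(claim, handoff):
    """Treat a worker's response as a claim and list why the route cannot advance."""
    failures = []
    if claim.get("fence_token") != handoff["fence_token"]:
        failures.append("fence token does not match the handoff")
    if claim.get("tests_passed") is not True:
        failures.append("tests did not pass")
    if claim.get("blockers"):
        failures.append("worker reported blockers")
    if not claim.get("checkpoint_id"):
        failures.append("no checkpoint id")
    if claim.get("worker_id") != handoff["worker_id"]:
        failures.append("worker identity changed since handoff")
    return failures


# Placeholder handoff and claims for illustration only.
handoff = {"fence_token": "fence-001", "worker_id": "worker-1"}
good_claim = {
    "fence_token": "fence-001",
    "worker_id": "worker-1",
    "tests_passed": True,
    "blockers": [],
    "checkpoint_id": "ckpt-7",
}
# Tests are green here, but the fence token is wrong and no checkpoint exists.
bad_claim = {"fence_token": "fence-999", "worker_id": "worker-1", "tests_passed": True}

accepted = finish_gate(good_claim, handoff)
rejected = finish_gate(bad_claim, handoff)
print(accepted)
print(rejected)
```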

&lt;p&gt;The audit trail follows the same idea. The &lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/agent/governance/task_timeline.py#L1-L7" rel="noopener noreferrer"&gt;&lt;code&gt;task_timeline.py&lt;/code&gt; module&lt;/a&gt; is append-only execution evidence. Its close gate expects the route to have the right event kinds: &lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/agent/governance/task_timeline.py#L93-L97" rel="noopener noreferrer"&gt;&lt;code&gt;implementation&lt;/code&gt;, &lt;code&gt;verification&lt;/code&gt;, and &lt;code&gt;close_ready&lt;/code&gt;&lt;/a&gt;. The later &lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/agent/governance/task_timeline.py#L773-L818" rel="noopener noreferrer"&gt;&lt;code&gt;close verification&lt;/code&gt;&lt;/a&gt; checks those facts before a backlog item can be honestly closed.&lt;/p&gt;
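
&lt;p&gt;The idea can be sketched in a few lines. Only the three event kinds come from the text above. The &lt;code&gt;Timeline&lt;/code&gt; class and &lt;code&gt;missing_for_close&lt;/code&gt; helper are hypothetical and far simpler than the real &lt;code&gt;task_timeline.py&lt;/code&gt; module.&lt;/p&gt;

```python
REQUIRED_EVENT_KINDS = ("implementation", "verification", "close_ready")


class Timeline:
    """Append-only event log: there is deliberately no update or delete method."""

    def __init__(self):
        self._events = []

    def append(self, kind, detail):
        self._events.append({"seq": len(self._events) + 1, "kind": kind, "detail": detail})

    def events(self):
        # Hand out copies so callers cannot rewrite recorded history.
        return tuple(dict(event) for event in self._events)


def missing_for_close(timeline):
    """List the required event kinds that the timeline does not contain yet."""
    seen = {event["kind"] for event in timeline.events()}
    return [kind for kind in REQUIRED_EVENT_KINDS if kind not in seen]


timeline = Timeline()
timeline.append("implementation", "patched target file")
timeline.append("verification", "focused test passed")
before_close_ready = missing_for_close(timeline)

timeline.append("close_ready", "independent validation accepted the evidence")
after_close_ready = missing_for_close(timeline)

print(before_close_ready)
print(after_close_ready)
```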

&lt;p&gt;Even project knowledge is handled this way. Graph context is not dumped wholesale into the prompt. It is queried through an &lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/agent/governance/graph_query_trace.py#L1-L6" rel="noopener noreferrer"&gt;&lt;code&gt;audited graph-query trace&lt;/code&gt;&lt;/a&gt; and exposed through the MCP &lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/agent/mcp/tools.py#L565-L624" rel="noopener noreferrer"&gt;&lt;code&gt;graph_query&lt;/code&gt; surface&lt;/a&gt;, so later review can ask what the agent looked up instead of guessing. The public manual-fix SOP names the same workflow: route, contract, timeline, and close gates are required evidence, and dispatch has to prove the worker's fenced identity before handoff (&lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/docs/governance/manual-fix-sop.md#L206-L210" rel="noopener noreferrer"&gt;timeline and contract gates&lt;/a&gt;, &lt;a href="https://github.com/amingclawdev/aming-claw/blob/32f47d11f81d85b1fd4d50dddaa5389990ca4513/docs/governance/manual-fix-sop.md#L214-L227" rel="noopener noreferrer"&gt;dispatch requirements&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;That is the implementation pattern: right context becomes a chain of small, checkable facts. The agent can reason with them, but it does not get to be the only witness.&lt;/p&gt;

&lt;h2&gt;Before And After&lt;/h2&gt;

&lt;p&gt;Before, the agent confidently finished the wrong local job.&lt;/p&gt;

&lt;p&gt;It saw a failing test and fixed the code. It saw a nearby file and edited it. It saw a plausible completion story and wrote one. The observer forgot what kind of job this really was. The worker optimized inside its local patch. The review or validation step, if present, evaluated the local result instead of the global obligation.&lt;/p&gt;

&lt;p&gt;After, route context requires the observer to start by preserving global state. It names the topology: tiny fix, permission-sensitive change, runtime change, graph-impacting change, major decision, or something else. It dispatches lanes accordingly. The worker acts inside a contract. The architecture review lane checks whether the route makes sense. Validation checks evidence, not just confidence. Later validation and close gates check the route context alert, manifest, and returned evidence before accepting the result.&lt;/p&gt;

&lt;p&gt;The ability to reject matters. A good system must be allowed to say, "The test passed, but the work is not done."&lt;/p&gt;

&lt;p&gt;For example: the focused test passed, but the route context hash did not match the worker contract. Or the code changed the right file, but the audit evidence did not prove the right role acted. Or the patch was correct, but the graph (the project map) was now stale. Or the runtime needed a redeploy before anyone could claim the fix was live. Route context makes those checks explicit instead of hoping a model remembers them.&lt;/p&gt;

&lt;h2&gt;What The Observer Is For&lt;/h2&gt;

&lt;p&gt;The observer's advantage is not that it manages subagents. That framing makes it sound like a tiny project manager.&lt;/p&gt;

&lt;p&gt;The observer's real advantage is global-state custody. Route context gives it something concrete to preserve: route identity, dirty scope, graph/current state, runtime status, backlog state, tests, close gate requirements, and follow-up work.&lt;/p&gt;

&lt;p&gt;Workers should be local. That is their strength. A bounded worker should know its target files, acceptance criteria, blocked actions, focused tests, and required evidence. It should not need to carry the whole project in its head. When every worker receives the whole world, prompts get heavier and guarantees get weaker.&lt;/p&gt;

&lt;p&gt;The observer keeps the larger surfaces connected through route context. Did this lane have permission to edit? Did it stay inside its file fence? Did the test prove the actual promise or just a nearby behavior? Did an independent reviewer inspect the same packet the worker produced? Did the runtime or graph need an update? Is there follow-up work outside the worker's scope?&lt;/p&gt;

&lt;p&gt;These checks are not glamorous. They are what stop a green test from becoming a false finish.&lt;/p&gt;

&lt;h2&gt;Efficiency Without The Theater&lt;/h2&gt;

&lt;p&gt;The biggest improvement was not raw wall-clock speed.&lt;/p&gt;

&lt;p&gt;Parallel lanes can help. Architecture review, implementation, and validation lanes expose different gaps earlier than one linear worker. But the real win was quality and less wasted effort: fewer locally correct patches that could not be honestly closed, less rework after review, and earlier discovery of missing evidence.&lt;/p&gt;

&lt;p&gt;The system became calmer because it stopped treating "done locally" as "done globally." Some gates remain serial on purpose. Commit, runtime redeploy, graph reconcile, and backlog close mutate shared state or claim shared state is current. Those steps should not be casually parallelized just because multiple agents are available.&lt;/p&gt;

&lt;p&gt;This is the correction to zero-orchestration-ish design. The goal is not to hide all orchestration. Hidden orchestration is unreliable because nobody can audit what happened. Route context keeps orchestration visible and minimal at the points where role, evidence, and shared state matter.&lt;/p&gt;

&lt;h2&gt;The Rule We Use Now&lt;/h2&gt;

&lt;p&gt;A tiny deterministic edit can be one agent plus a focused test. If route context says the blast radius is clear, the file ownership is obvious, and there is no permission, audit, graph, or runtime implication, keep it simple. Give the worker a tight contract and verify the behavior.&lt;/p&gt;

&lt;p&gt;P1 and P0 work is different. So are routing, permission, audit, graph, and runtime tasks. Those need an observer, an architecture review lane, an implementation worker, and independent validation. Major decisions need adversarial lanes: separate expert packets, an independent review lane comparing evidence, and a final decision by the observer. Not because important work deserves ceremony, but because important work has more ways to be locally successful and globally wrong.&lt;/p&gt;
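
&lt;p&gt;The rule can be written down as a small lookup. This is only a sketch of the decision described above: the lane names, the &lt;code&gt;lanes_for&lt;/code&gt; function, and the choice of two expert packets are illustrative assumptions, not identifiers from Aming Claw.&lt;/p&gt;

```python
# Task implications that rule out the single-agent path (taken from the prose above).
SENSITIVE = {"routing", "permission", "audit", "graph", "runtime"}


def lanes_for(priority, implications, major_decision=False):
    """Pick the lanes a task needs from its priority and implications."""
    if major_decision:
        # Adversarial setup: separate expert packets, then review, then the observer decides.
        return ["expert_packet_a", "expert_packet_b", "independent_review", "observer_decision"]
    if priority in ("P0", "P1") or SENSITIVE.intersection(implications):
        return ["observer", "architecture_review", "implementation_worker", "independent_validation"]
    # Tiny deterministic edit: one agent plus a focused test.
    return ["implementation_worker", "focused_test"]


tiny = lanes_for("P3", [])
sensitive = lanes_for("P3", ["permission"])
urgent = lanes_for("P1", [])
decision = lanes_for("P2", [], major_decision=True)
print(tiny)
print(sensitive)
print(decision)
```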

&lt;p&gt;The checklist is compact:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Name the topology: what kind of job this really is.&lt;/li&gt;
&lt;li&gt;Bind the worker contract: target files, acceptance criteria, and evidence.&lt;/li&gt;
&lt;li&gt;List blocked actions: what this lane must not do.&lt;/li&gt;
&lt;li&gt;Expose injected context: manifest the artifacts and hashes the lane saw.&lt;/li&gt;
&lt;li&gt;Make gates reject false finishes: tests, validation, and close checks must be able to say no.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the practical lesson we got from dogfood. More context made agents sound more informed. Route context made right context safer to act on.&lt;/p&gt;

&lt;p&gt;For us, right context stopped being a prompt-writing aspiration when it became route context: a route that knows what matters now, a contract that bounds the worker, a manifest that shows what was injected, and gates that refuse to confuse a passing test with a finished job.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>agents</category>
      <category>development</category>
    </item>
    <item>
      <title>AI's tech debt is invisible — even to AI. I solved it at the architecture layer.</title>
      <dc:creator>Aming</dc:creator>
      <pubDate>Sat, 23 May 2026 03:58:23 +0000</pubDate>
      <link>https://dev.to/amingin_ai/ais-tech-debt-is-invisible-even-to-ai-i-solved-it-at-the-architecture-layer-1nh1</link>
      <guid>https://dev.to/amingin_ai/ais-tech-debt-is-invisible-even-to-ai-i-solved-it-at-the-architecture-layer-1nh1</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — AI repeats your patterns badly, ignores existing services, and forgets every cross-session lesson you taught it. This isn't laziness — it's a new kind of tech debt: &lt;strong&gt;invisible&lt;/strong&gt;, &lt;strong&gt;systemic&lt;/strong&gt;, and &lt;strong&gt;architectural&lt;/strong&gt;. Project memory hints don't scale. Bigger context windows don't help. The fix is structural: pin a graph projection of your codebase to every commit, let AI read it before writing, surface "graph stale" prompts when source drifts. Real commit receipts from my own OSS project &lt;a href="https://github.com/amingclawdev/aming-claw" rel="noopener noreferrer"&gt;aming-claw&lt;/a&gt; inline. Architects, change my mind in the comments.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;What is AI tech debt?&lt;/h2&gt;

&lt;p&gt;Let me define this precisely, because it's a different beast from the tech debt you already know.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Traditional tech debt&lt;/th&gt;
&lt;th&gt;AI tech debt&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Who creates it&lt;/td&gt;
&lt;td&gt;Engineers (knowingly)&lt;/td&gt;
&lt;td&gt;AI (unknowingly)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Awareness&lt;/td&gt;
&lt;td&gt;Conscious tradeoff&lt;/td&gt;
&lt;td&gt;AI doesn't know it's accruing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fix lifecycle&lt;/td&gt;
&lt;td&gt;Fix once, done&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Every new session repeats it&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Visibility&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;git log&lt;/code&gt; shows it&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Invisible across sessions&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scale&lt;/td&gt;
&lt;td&gt;Team-bounded&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Systemic, AI-generated&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The core asymmetry: &lt;strong&gt;the more your team uses AI for coding, the more invisible debt accrues — and you have no tool that sees it.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;5 symptoms (diagnose yourself)&lt;/h2&gt;

&lt;p&gt;Run this checklist against your team:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ AI re-implemented a service that &lt;strong&gt;already exists&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;❌ AI shipped code using a &lt;strong&gt;pattern completely inconsistent&lt;/strong&gt; with everything around it&lt;/li&gt;
&lt;li&gt;❌ AI &lt;strong&gt;didn't see&lt;/strong&gt; the implementation sitting in the next file over&lt;/li&gt;
&lt;li&gt;❌ Every new session &lt;strong&gt;repeats the same mistakes&lt;/strong&gt; you corrected last time&lt;/li&gt;
&lt;li&gt;❌ AI treats a &lt;strong&gt;familiar codebase as if it were brand new&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Three or more? You're accruing AI tech debt. The bigger your team and the more AI you use, the faster it compounds.&lt;/p&gt;




&lt;h2&gt;A real case study: my toolboxclient stateService&lt;/h2&gt;

&lt;p&gt;I'm the maintainer of &lt;a href="https://github.com/amingclawdev/toolBoxClient" rel="noopener noreferrer"&gt;toolboxclient&lt;/a&gt; (open-source cross-platform AI agent runtime, 274+ stars). I asked AI to add a &lt;code&gt;stateService&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The directory &lt;code&gt;server/services/&lt;/code&gt; already contained, in plain sight:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TOOLBOXCLIENT/server/services/
├── fingerPrintService.js
├── memoryService.js
├── providerModelService.js
├── proxyService.js
├── taskService.js
├── toolServiceManager.js
├── walletService.js
└── webSocketService.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Roughly a dozen services, all sharing the same HTTP pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What AI shipped&lt;/strong&gt; (commit &lt;a href="https://github.com/amingclawdev/toolBoxClient/commit/68487cc" rel="noopener noreferrer"&gt;&lt;code&gt;68487cc&lt;/code&gt;&lt;/a&gt;, 2026-03-19):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// AI's version: WebSocket-based StateClient with Proxy&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;StateClient&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;agentName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 🚨 WebSocket, not HTTP — inconsistent with every other service in the folder&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ws&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;WebSocket&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_createProxy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;_createProxy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Proxy traps to broadcast via WebSocket&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Proxy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It used WebSocket instead of HTTP. It used a Proxy-based intercept-and-broadcast pattern unlike anything else in the codebase. It built a parallel architecture next to an established one.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;This wasn't a code bug. It was a pattern bug. AI literally couldn't see the existing services.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;The first fix: project memory&lt;/h2&gt;

&lt;p&gt;My first instinct: add a hint to project memory.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;use existing HTTP services, don't add WebSocket
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AI refactored cleanly (commit &lt;a href="https://github.com/amingclawdev/toolBoxClient/commit/bbdf82c" rel="noopener noreferrer"&gt;&lt;code&gt;bbdf82c&lt;/code&gt;&lt;/a&gt;, 2026-03-21):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;feat: stateService Phase A+B — HTTP CRUD + SSE broadcast

Phase A: /api/state/* routes (read, write, session CRUD, language pref)
Phase B: SSE subscribe endpoint with topic filtering + EventBus broadcast

74/74 tests pass. No breaking changes — additive only.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;WebSocket gone. HTTP CRUD + SSE matching the existing pattern. Clean fix.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For about ten seconds, I thought I'd solved it.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;Why project memory hints don't scale&lt;/h2&gt;

&lt;p&gt;Then I realized something uncomfortable:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This catch only worked &lt;strong&gt;because I noticed&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The next AI session would start with zero memory of this lesson.&lt;br&gt;
Every context window starts as a blank slate.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the &lt;strong&gt;systemic&lt;/strong&gt; nature of AI tech debt:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI can't see existing patterns when it writes&lt;/li&gt;
&lt;li&gt;I see it → I fix it once → the fix doesn't propagate to future sessions&lt;/li&gt;
&lt;li&gt;Manual &lt;code&gt;project memory&lt;/code&gt; maintenance puts the work back on me, not AI&lt;/li&gt;
&lt;li&gt;This doesn't scale — and the failure mode is silent&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The first insight
&lt;/h2&gt;

&lt;p&gt;I stopped trying to fix prompts and started looking at the structural problem:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;AI agents don't need bigger context windows.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They need a persistent structural record of the project that survives across sessions.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Context windows are short-term memory. What's missing is &lt;strong&gt;long-term, project-level memory&lt;/strong&gt; — something any AI session can read before writing.&lt;/p&gt;

&lt;p&gt;This is the insight that turned into &lt;a href="https://github.com/amingclawdev/aming-claw" rel="noopener noreferrer"&gt;aming-claw&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building aming-claw (and falling into the next trap)
&lt;/h2&gt;

&lt;p&gt;The idea: give every AI session a queryable graph of the project. Files, modules, functions, patterns — all of it, machine-readable, persistent.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scan the codebase → build a &lt;strong&gt;graph&lt;/strong&gt; of all entities and relations&lt;/li&gt;
&lt;li&gt;Expose it through an &lt;strong&gt;MCP server&lt;/strong&gt; that any agent can query&lt;/li&gt;
&lt;li&gt;AI &lt;strong&gt;reads the graph before writing&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Graph &lt;strong&gt;persists across sessions&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I built it. It worked. Then it broke — at a higher layer.&lt;/p&gt;

&lt;p&gt;I had implemented the graph with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mutable nodes&lt;/strong&gt; — agents could edit graph state directly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A patch pipeline&lt;/strong&gt; — 5-stage mutation flow (propose → validate → review → apply → snapshot)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A graph editor UI&lt;/strong&gt; — humans could also edit the graph&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Within a few weeks, &lt;strong&gt;the graph drifted from the actual code&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why? Because I had created a &lt;strong&gt;second source of truth&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The real source of truth was source code&lt;/li&gt;
&lt;li&gt;But I also let the graph be directly mutated&lt;/li&gt;
&lt;li&gt;The two sources inevitably diverged&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Same trap. Higher layer.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The real architectural insight
&lt;/h2&gt;

&lt;p&gt;After hitting the same trap twice, the answer crystallized:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;del&gt;The graph is something you edit.&lt;/del&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The graph is a projection of the commit.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In concrete terms:&lt;/p&gt;

&lt;h3&gt;
  
  
  Every commit can correspond to one graph
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git commit (modifies source / hints / config)
     ↓
system detects: HEAD ≠ graph's bound commit
     ↓ ⚠️ "graph stale" prompt
user decides when to reconcile
     ↓ user-triggered
fixed_algorithm(source + hints + config)
     ↓
new graph snapshot ←→ new commit hash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4 key invariants
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Invariant&lt;/th&gt;
&lt;th&gt;What it guarantees&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Fixed algorithm&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Same input → same graph (deterministic, no randomness)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1:1 binding&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Every reconciled commit hash maps to exactly one graph snapshot, and every snapshot is bound to exactly one commit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;User-triggered&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reconciliation is explicit, not a background git hook&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Stale prompt&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;System surfaces drift in dashboard / CLI; user triggers when ready&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
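&lt;p&gt;Invariants 1 and 2 can be sketched in a few lines of Python. This is an illustration under assumptions, not aming-claw's actual code: &lt;code&gt;build_graph&lt;/code&gt;, &lt;code&gt;reconcile&lt;/code&gt;, the snapshot layout, and the commit hash are all hypothetical, and the "fixed algorithm" is reduced to one node per file.&lt;/p&gt;

```python
import hashlib
import json


def build_graph(files):
    """Hypothetical stand-in for the fixed algorithm: one node per file.

    Sorting the paths removes any dependence on dict insertion order,
    so the same inputs always produce the same graph.
    """
    return {"nodes": [{"path": p, "size": len(files[p])} for p in sorted(files)]}


def reconcile(commit_hash, files, snapshots):
    """Bind exactly one snapshot to a commit hash (invariant 2)."""
    graph = build_graph(files)
    digest = hashlib.sha256(json.dumps(graph, sort_keys=True).encode()).hexdigest()
    existing = snapshots.get(commit_hash)
    if existing is not None and existing["digest"] != digest:
        raise ValueError("same commit produced a different graph")
    snapshots[commit_hash] = {"graph": graph, "digest": digest}
    return snapshots[commit_hash]


snapshots = {}
first = reconcile("abc123", {"b.py": "print(1)", "a.py": "x = 1"}, snapshots)
# Replaying the same commit with the inputs listed in a different order
# yields the same digest, which is what makes the projection replayable.
second = reconcile("abc123", {"a.py": "x = 1", "b.py": "print(1)"}, snapshots)
```

&lt;p&gt;Comparing digests is one cheap way to check "same commit, same graph" without diffing whole snapshots.&lt;/p&gt;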

&lt;h3&gt;
  
  
  Why not a git hook?
&lt;/h3&gt;

&lt;p&gt;A reasonable question: why not auto-rebuild the graph on every commit via a git hook?&lt;/p&gt;

&lt;p&gt;Three reasons I deliberately didn't:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reconciliation is expensive&lt;/strong&gt; (full codebase scan + algorithm)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Surprise auto-builds destabilize state&lt;/strong&gt; — user should control when state changes&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Batching commits before a single reconcile is often what users want&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The system shows a &lt;code&gt;graph stale&lt;/code&gt; indicator in dashboard and CLI. Users reconcile when they're ready. This is a deliberate design choice, not a limitation.&lt;/p&gt;
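&lt;p&gt;A rough Python sketch of the stale check and the batching it allows. The &lt;code&gt;GraphState&lt;/code&gt; class and the commit names are hypothetical, and a real reconcile would rebuild the graph instead of only moving the binding.&lt;/p&gt;

```python
class GraphState:
    """Hypothetical holder for the commit the current graph is bound to."""

    def __init__(self, bound_commit):
        self.bound_commit = bound_commit

    def is_stale(self, head_commit):
        # Stale means HEAD has moved away from the commit the graph was built from.
        return head_commit != self.bound_commit

    def reconcile(self, head_commit):
        # Only called when the user asks for it, never from a git hook.
        self.bound_commit = head_commit


state = GraphState(bound_commit="c1")
stale_flags = []
for head in ["c2", "c3", "c4"]:  # three commits land, no rebuild happens
    stale_flags.append(state.is_stale(head))

state.reconcile("c4")  # one explicit reconcile covers the whole batch
```

&lt;p&gt;Three commits cost zero rebuilds, and the indicator stays on until the user reconciles once at the end.&lt;/p&gt;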

&lt;h3&gt;
  
  
  How modification and rollback work
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Modify the graph&lt;/td&gt;
&lt;td&gt;Modify source / hints / config → trigger reconcile&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Roll back the graph&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;git revert&lt;/code&gt; → trigger reconcile&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verify consistency&lt;/td&gt;
&lt;td&gt;Same commit → same graph (replayable)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Logic lives in code. The graph is a read-only projection.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How this solves AI tech debt
&lt;/h2&gt;

&lt;p&gt;Returning to the original problem: &lt;strong&gt;AI repeats patterns badly because it can't see the codebase&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The architectural fix:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Every AI session starts by &lt;strong&gt;querying the graph&lt;/strong&gt; (via MCP)&lt;/li&gt;
&lt;li&gt;The graph records the full structure — files, functions, modules, patterns&lt;/li&gt;
&lt;li&gt;AI sees, for example, &lt;code&gt;existing HTTP service pattern in server/services/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;AI &lt;strong&gt;reuses the pattern&lt;/strong&gt; instead of shipping a parallel WebSocket implementation&lt;/li&gt;
&lt;li&gt;After AI makes changes → user commits → system flags graph as stale → user reconciles → next session sees updated patterns&lt;/li&gt;
&lt;/ol&gt;
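&lt;p&gt;Steps 1 to 4 might look roughly like this from the agent's side. The graph shape, &lt;code&gt;find_patterns&lt;/code&gt;, and &lt;code&gt;choose_transport&lt;/code&gt; are assumptions made for illustration. A real session would query the MCP server instead of an in-memory dict.&lt;/p&gt;

```python
# Hypothetical graph content; a real one would come from a reconcile run.
GRAPH = {
    "patterns": [
        {"name": "http-service", "path": "server/services/", "transport": "http"},
    ],
}


def find_patterns(graph, transport):
    """Return the recorded patterns that use the given transport."""
    return [p for p in graph["patterns"] if p["transport"] == transport]


def choose_transport(graph, proposed):
    """Prefer an existing pattern over introducing a parallel one."""
    if find_patterns(graph, proposed):
        return proposed
    known = sorted({p["transport"] for p in graph["patterns"]})
    return known[0] if known else proposed


# The agent wanted WebSocket, but the graph only records an HTTP pattern.
decision = choose_transport(GRAPH, "websocket")
```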

&lt;p&gt;&lt;strong&gt;Cross-session knowledge transfer happens through the graph, not the prompt.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is what "solved at the architecture layer" means: it's not a smarter prompt, it's a different topology of state.&lt;/p&gt;




&lt;h2&gt;
  
  
  Coming up: the algorithm itself
&lt;/h2&gt;

&lt;p&gt;This post covered &lt;strong&gt;why&lt;/strong&gt; the projection model works. The next post covers &lt;strong&gt;how&lt;/strong&gt; the algorithm builds the graph:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;in-degree=0 entry detection&lt;/li&gt;
&lt;li&gt;DFS 3-color marking&lt;/li&gt;
&lt;li&gt;Tarjan SCC for cyclic clusters&lt;/li&gt;
&lt;li&gt;6-signal layer scoring&lt;/li&gt;
&lt;li&gt;Cross-language fact pipeline (Python + TypeScript)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Follow me here to catch the next one.&lt;/p&gt;




&lt;h2&gt;
  
  
  Change my mind
&lt;/h2&gt;

&lt;p&gt;I claim this architectural pattern solves AI tech debt: &lt;strong&gt;every commit corresponds to one graph + user-triggered reconcile + stale-state prompt&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Your turn. Two architectural choices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Treat project state as a &lt;strong&gt;single source of truth, commit-bound&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Or maintain a &lt;strong&gt;separate memory store&lt;/strong&gt; that AI writes to&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Which is more robust? Which scales better? Where would you attack my approach?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Calibrated invitation: I want senior engineers and AI infra people to push back with specifics. "What about X?" or "Have you considered Y?" lands better than "this won't work." If you've shipped something adjacent, tell me — I want to compare designs.&lt;/p&gt;
&lt;/blockquote&gt;




</description>
    </item>
    <item>
      <title>AI proposed 5 components for my parallel system. After walking one scenario, only 3 were real.</title>
      <dc:creator>Aming</dc:creator>
      <pubDate>Mon, 18 May 2026 04:18:53 +0000</pubDate>
      <link>https://dev.to/amingin_ai/ai-proposed-5-components-for-my-parallel-system-after-walking-one-scenario-only-3-were-real-12nd</link>
      <guid>https://dev.to/amingin_ai/ai-proposed-5-components-for-my-parallel-system-after-walking-one-scenario-only-3-were-real-12nd</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: AI loves to design "enterprise-grade" systems for you: message queue, distributed lock, state machine service, scheduler, monitoring bus. Half of them aren't real. The cheapest filter I know: before letting AI design anything, walk one concrete scenario through the system. Whatever shows up in the scenario is real. Whatever doesn't, delete. This week it took me from a 5-component design down to 3 real components, one of which AI had missed entirely.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What I was building
&lt;/h2&gt;

&lt;p&gt;This week I was extending &lt;a href="https://github.com/amingclawdev/aming-claw" rel="noopener noreferrer"&gt;aming-claw&lt;/a&gt; (an open-source AI code governance tool I'm building) to support &lt;strong&gt;parallel multi-agent development&lt;/strong&gt;: multiple AI agents working on the same project simultaneously, each on its own branch, all of it merging back into trunk.&lt;/p&gt;

&lt;p&gt;I asked AI to help me design it.&lt;/p&gt;

&lt;p&gt;It came back fast. Confident. Five components:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Message queue        (so tasks can line up)
- Distributed lock     (so agents don't step on each other)
- State machine service (so we track progress)
- Task scheduler       (so we know what runs when)
- Monitoring bus       (so we see what's happening)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each component had a paragraph of justification. The diagram looked impressive. The names sounded right.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I almost just said "ok, build it."&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I didn't
&lt;/h2&gt;

&lt;p&gt;A thing I've learned working with AI on architecture: AI doesn't filter for &lt;em&gt;necessity&lt;/em&gt;. It filters for &lt;em&gt;plausibility&lt;/em&gt;. The components it lists are real things real systems have — they're just not necessarily things &lt;strong&gt;your&lt;/strong&gt; system needs.&lt;/p&gt;

&lt;p&gt;So instead of letting it design the system, I did one thing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;I walked a concrete scenario through the system before agreeing to anything.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Here's an honest framing: &lt;strong&gt;nobody&lt;/strong&gt; can look at a 5-component design and immediately tell you which 2 are load-bearing. AI can't. Most engineers reading this can't, not on inspection.&lt;/p&gt;

&lt;p&gt;The good news:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;You don't need to know what to design. You just need to walk one scenario.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The scenario does the filtering for you.&lt;/p&gt;




&lt;h2&gt;
  
  
  Scenario 1: five tasks with dependencies
&lt;/h2&gt;

&lt;p&gt;I started with the most boring scenario I could think of:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Five AI agents working in parallel. Each one on its own branch. The tasks have a dependency chain: &lt;code&gt;1 → 2 → 3 → 4 → 5&lt;/code&gt;. Task 2 needs what task 1 built. Task 5 needs everything before it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I walked through what the system has to do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Five tasks running in parallel — they need to &lt;strong&gt;queue&lt;/strong&gt; for merging. OK, "message queue" was real.&lt;/li&gt;
&lt;li&gt;BUT — they have to merge &lt;strong&gt;in dependency order&lt;/strong&gt;. Not first-come-first-served. So a plain FIFO message queue isn't enough. &lt;strong&gt;It has to be an ordered queue.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
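&lt;p&gt;One way to sketch an ordered merge queue in Python is with the standard library's &lt;code&gt;graphlib.TopologicalSorter&lt;/code&gt; (Python 3.9+). The function name and the finish order below are made up for illustration. The point is that tasks may finish in any order, but a task merges only after everything upstream of it has merged.&lt;/p&gt;

```python
from graphlib import TopologicalSorter

# Each task maps to the tasks it depends on: the chain 1, 2, 3, 4, 5.
DEPENDS_ON = {1: set(), 2: {1}, 3: {2}, 4: {3}, 5: {4}}


def merge_in_dependency_order(depends_on, finish_order):
    """Merge finished tasks only once everything upstream has merged."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    ready = set(sorter.get_ready())  # tasks whose dependencies have all merged
    finished, merged = set(), []
    for task in finish_order:  # agents finish in arbitrary order
        finished.add(task)
        while True:
            mergeable = sorted(ready.intersection(finished))
            if not mergeable:
                break
            for t in mergeable:
                merged.append(t)
                ready.discard(t)
                sorter.done(t)  # unlocks the tasks downstream of t
            ready.update(sorter.get_ready())
    return merged


# Task 3 and task 5 finish first, yet nothing merges until task 1 does.
merged = merge_in_dependency_order(DEPENDS_ON, [3, 5, 1, 2, 4])
```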

&lt;p&gt;Already, one component refined. "Message queue" → "ordered merge queue."&lt;/p&gt;

&lt;p&gt;Nothing has been deleted yet. Keep going.&lt;/p&gt;




&lt;h2&gt;
  
  
  Scenario 2: the machine reboots mid-batch
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Now the machine reboots. When it comes back up: task 1 already merged. Task 2 tried to merge and failed. Task 3 hadn't started yet. Task 4 was waiting in queue. Task 5 was halfway through executing when the power cut.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I walked it again:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For the system to even know what state each task is in after a reboot, &lt;strong&gt;task state has to be on disk, not just in memory&lt;/strong&gt;. Not a "state machine service" with its own server, just durable per-task state (&lt;code&gt;task_id → status → checkpoint&lt;/code&gt;). That's a table in a database, not a service.&lt;/li&gt;
&lt;li&gt;Task 2 failed, but tasks 3-5 are downstream of it. The system has to &lt;strong&gt;recognize "upstream failed, downstream blocked"&lt;/strong&gt; automatically. That's not a separate component — it's a query against the durable state.&lt;/li&gt;
&lt;li&gt;Task 5 was mid-execution when the power cut. When the machine restarts, what stops a second copy from picking it up and racing the half-finished one? Each execution attempt needs a &lt;strong&gt;unique token&lt;/strong&gt; — whoever has the newest token is the live runner, everyone else gets fenced off.&lt;/li&gt;
&lt;/ul&gt;
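&lt;p&gt;The first two points can be sketched with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The table layout, status names, and checkpoint values are assumptions for illustration, and the single &lt;code&gt;depends_on&lt;/code&gt; column only works because this scenario is a plain chain.&lt;/p&gt;

```python
import sqlite3

# ":memory:" keeps the sketch self-contained; a file path is what
# actually survives a reboot.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE tasks (task_id INTEGER PRIMARY KEY, status TEXT, "
    "checkpoint TEXT, depends_on INTEGER)"
)
db.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?, ?)",
    [
        (1, "merged", "done", None),
        (2, "failed", "merge-conflict", 1),
        (3, "pending", None, 2),
        (4, "queued", None, 3),
        (5, "running", "step-2-of-4", 4),
    ],
)

# "Upstream failed, downstream blocked" as a recursive query:
# start from tasks whose direct dependency failed, then follow
# depends_on edges further downstream.
BLOCKED_SQL = """
WITH RECURSIVE blocked(task_id) AS (
    SELECT t.task_id FROM tasks t
    JOIN tasks up ON t.depends_on = up.task_id
    WHERE up.status = 'failed'
    UNION
    SELECT t.task_id FROM tasks t
    JOIN blocked b ON t.depends_on = b.task_id
)
SELECT task_id FROM blocked ORDER BY task_id
"""
blocked = [row[0] for row in db.execute(BLOCKED_SQL)]
```

&lt;p&gt;No extra component is involved: the blocked set is derived from the table on demand.&lt;/p&gt;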

&lt;p&gt;Now two more things have surfaced:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Durable per-task state (which AI called "state machine service" — but it's not a service, it's a table)&lt;/li&gt;
&lt;li&gt;Fence tokens to prevent zombie reruns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And here's the first thing that got &lt;strong&gt;deleted&lt;/strong&gt;: &lt;strong&gt;distributed lock&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A distributed lock is "this resource is held by exactly one agent right now." Fence tokens solve the same problem in a much weaker, much cheaper way: "the latest token wins, all stale tokens are ignored." For agent merge work, that's sufficient. Distributed locks would be massive overkill for the actual scenario.&lt;/p&gt;
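&lt;p&gt;A minimal fence-token sketch in Python, assuming one counter per task. &lt;code&gt;FencedTask&lt;/code&gt; is a hypothetical name, and in a real system the latest token would live in the durable task table, not in memory.&lt;/p&gt;

```python
import itertools


class FencedTask:
    """Hypothetical per-task fence: only the newest token may write."""

    def __init__(self):
        self._tokens = itertools.count(1)
        self.latest = 0
        self.result = None

    def start_attempt(self):
        # Every execution attempt gets a strictly larger token.
        self.latest = next(self._tokens)
        return self.latest

    def commit(self, token, result):
        # A stale token means a newer attempt exists: fence the write off.
        if token != self.latest:
            return False
        self.result = result
        return True


task5 = FencedTask()
before_reboot = task5.start_attempt()  # attempt interrupted by the power cut
after_reboot = task5.start_attempt()   # fresh attempt after the restart

zombie_ok = task5.commit(before_reboot, "half-finished output")
live_ok = task5.commit(after_reboot, "complete output")
```

&lt;p&gt;The zombie run is never stopped or locked out. Its write is simply rejected when it arrives with an old token.&lt;/p&gt;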

&lt;p&gt;&lt;strong&gt;1 component deleted, 0 lines of code written.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Scenario 3: the ordering itself was wrong
&lt;/h2&gt;

&lt;p&gt;This one wasn't in my original head-list. It only surfaced when I kept walking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Five tasks ran. Three merged. Then it turns out the &lt;strong&gt;dependency order I gave the system was wrong&lt;/strong&gt; — it should have been &lt;code&gt;1 → 3 → 2 → 4 → 5&lt;/code&gt;, not &lt;code&gt;1 → 2 → 3 → 4 → 5&lt;/code&gt;. The three already-merged tasks need to be &lt;strong&gt;rolled back as a batch and replayed in the correct order.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is a scenario most systems never plan for. Per-task rollback is common — undo one merge. &lt;strong&gt;Batch rollback with replay&lt;/strong&gt; is rarer.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Plain per-task &lt;code&gt;revert&lt;/code&gt; doesn't work: you can't revert task 2 while leaving task 3 intact, because task 3 was merged on top of task 2 under the wrong order.&lt;/li&gt;
&lt;li&gt;The whole batch has to roll back atomically.&lt;/li&gt;
&lt;li&gt;Then the system has to &lt;strong&gt;replay them in the new order&lt;/strong&gt;, with all the graph artifacts (snapshots, indices, semantic projection, test results) re-derived per merge.&lt;/li&gt;
&lt;/ul&gt;
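&lt;p&gt;A toy Python model of batch rollback with replay, assuming trunk is just a list of merged task ids. &lt;code&gt;rollback_and_replay&lt;/code&gt; and the &lt;code&gt;derive&lt;/code&gt; callback are hypothetical names, and a real runtime would operate on git history and re-run the actual artifact pipeline.&lt;/p&gt;

```python
def rollback_and_replay(trunk, batch, new_order, derive):
    """Roll a merged batch back as one unit, then replay it in a new order.

    trunk is a list of merged task ids; derive is a hypothetical callback
    that rebuilds per-merge artifacts (snapshots, indices, test results).
    """
    if sorted(batch) != sorted(new_order):
        raise ValueError("new_order must contain exactly the batch tasks")
    if trunk[len(trunk) - len(batch):] != list(batch):
        raise ValueError("batch must be the most recent merges on trunk")

    # Atomic rollback: build the new trunk on the side, so the original
    # is untouched if any replay step raises.
    replayed = trunk[: len(trunk) - len(batch)]
    artifacts = []
    for task in new_order:
        replayed.append(task)
        artifacts.append(derive(task, tuple(replayed)))
    return replayed, artifacts


trunk = ["base", 1, 2, 3]
new_trunk, artifacts = rollback_and_replay(
    trunk, batch=[1, 2, 3], new_order=[1, 3, 2],
    derive=lambda task, state: f"snapshot-after-{task}",
)
```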

&lt;p&gt;This is the component &lt;strong&gt;AI had not mentioned at all&lt;/strong&gt;. It only surfaced because I walked a scenario nobody told me to walk.&lt;/p&gt;

&lt;p&gt;Call it &lt;code&gt;BatchMergeRuntime&lt;/code&gt;. It's the rarest kind of architectural decision: not "should we have it" but &lt;strong&gt;"do we even know we need it?"&lt;/strong&gt; — and the answer, for most teams, is &lt;em&gt;not until production&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the architecture actually became
&lt;/h2&gt;

&lt;p&gt;After walking three scenarios:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;What it surfaced&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;5 tasks with dependencies&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Ordered merge queue&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Machine reboots mid-batch&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Durable task state + fence tokens&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dependency order was wrong&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Batch rollback + replay runtime&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All of the above untested&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Test scenario matrix as P0.0 (highest priority)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Three real components. The fourth — the &lt;strong&gt;test scenario matrix itself&lt;/strong&gt; — is a meta-component: the dry-run scenarios I just walked became the &lt;strong&gt;first acceptance bar&lt;/strong&gt; for every subsequent PR. Anything that ships has to survive these scenarios before merge.&lt;/p&gt;




&lt;h2&gt;
  
  
  AI's first design vs what scenarios required
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;AI's first list&lt;/th&gt;
&lt;th&gt;Reality after scenario walk&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Message queue&lt;/td&gt;
&lt;td&gt;✅ Needed — but &lt;strong&gt;ordered&lt;/strong&gt;, not FIFO&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distributed lock&lt;/td&gt;
&lt;td&gt;❌ Deleted — fence tokens are sufficient&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;State machine service&lt;/td&gt;
&lt;td&gt;✅ Needed — but as a &lt;strong&gt;table&lt;/strong&gt;, not a service&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Task scheduler&lt;/td&gt;
&lt;td&gt;❌ Deleted — the ordered queue &lt;em&gt;is&lt;/em&gt; the scheduler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monitoring bus&lt;/td&gt;
&lt;td&gt;❌ Deleted — each component emits its own events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(AI did not propose)&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Batch rollback runtime&lt;/strong&gt; — surfaced only by scenario 3&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Net: of AI's 5 proposed components, 2 survived (both reshaped) and 3 were deleted. Add the one critical piece AI had missed entirely, and the design has 3 real components.&lt;/p&gt;

&lt;p&gt;The win is not "I deleted 3 components." The win is &lt;strong&gt;I now know why each remaining component exists&lt;/strong&gt;, which means I can explain it, scope it, and reject scope creep on it. That's the difference between a system you &lt;em&gt;built&lt;/em&gt; and a system you &lt;em&gt;understand&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The method, in 3 steps
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;❌ Don't:   "Hey AI, design me a system that does X."
           → AI returns a plausible-looking inventory of components.
           → Half of them aren't real for your specific case.

✅ Do:      Step 1.  Write one concrete scenario yourself.
                    (Or: have AI write the scenario, you evaluate it.
                     Real numbers, real steps, with crashes,
                     failures, and orderings going wrong.)

           Step 2.  Walk the scenario through your design.
                    At each step, ask: "What does the system need here?"

           Step 3.  Aggregate "what's needed."
                    That's your minimal architecture.
                    Anything not in that list — delete.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
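&lt;p&gt;Step 3 is plain set arithmetic. Here is a small Python sketch using this post's own walk as data. The "need" labels such as &lt;code&gt;merge ordering&lt;/code&gt; are my shorthand for illustration, not terms from any tool.&lt;/p&gt;

```python
# What AI proposed, keyed by the need each component claims to serve.
proposed = {
    "merge ordering": "message queue",
    "mutual exclusion": "distributed lock",
    "task state": "state machine service",
    "scheduling": "task scheduler",
    "observability": "monitoring bus",
}

# Step 2 output: needs that actually showed up while walking scenarios.
needs_by_scenario = {
    "five tasks with dependencies": {"merge ordering"},
    "machine reboots mid-batch": {"task state", "stale-run fencing"},
    "dependency order was wrong": {"batch rollback and replay"},
}

# Step 3: aggregate, then compare against the proposal.
needed = set().union(*needs_by_scenario.values())
keep = sorted(n for n in proposed if n in needed)
delete = sorted(proposed[n] for n in proposed if n not in needed)
missing = sorted(needed.difference(proposed))
```

&lt;p&gt;&lt;code&gt;keep&lt;/code&gt; is what survives, &lt;code&gt;delete&lt;/code&gt; is what no scenario asked for, and &lt;code&gt;missing&lt;/code&gt; is what the proposal never mentioned.&lt;/p&gt;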



&lt;p&gt;That's it. Three steps. No architecture-pattern library required. The scenario does the work for you.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this works (and why it's hard to skip)
&lt;/h2&gt;

&lt;p&gt;Three reasons:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. AI optimizes for plausibility, not necessity.&lt;/strong&gt; It lists components that &lt;em&gt;sound right for this kind of system&lt;/em&gt;, drawing from its training data. It can't know which components are necessary for &lt;em&gt;your&lt;/em&gt; specific scenario, because it doesn't see your scenario unless you walk it through.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Scenarios surface the negative space.&lt;/strong&gt; A happy-path design is the union of every component someone &lt;em&gt;might&lt;/em&gt; need. A scenario walk is the intersection of components someone &lt;em&gt;definitely&lt;/em&gt; needs &lt;em&gt;for that scenario&lt;/em&gt;. The intersection is always smaller — and more honest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Scenarios surface what AI missed.&lt;/strong&gt; The batch-rollback runtime wasn't on AI's list. It surfaced because scenario 3 put the system in a state that AI's training data rarely emphasizes. Whatever your system's weird state is, only your scenarios will find it.&lt;/p&gt;

&lt;p&gt;The reason this method is hard to skip is that the &lt;strong&gt;pressure to just accept AI's design is enormous&lt;/strong&gt;. The design looks complete. It uses real words. You feel productive saying "yes, build it." Walking a scenario feels like &lt;em&gt;slowing down&lt;/em&gt;. It is. That's the whole point.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next in this series
&lt;/h2&gt;

&lt;p&gt;This is &lt;strong&gt;part 2&lt;/strong&gt; of the AI Collaboration Survival Guide. The previous post was about &lt;a href="https://dev.to/amingin_ai/i-told-my-ai-to-build-a-feature-did-it-i-had-no-idea-1f1"&gt;making AI's claims about completed work auditable via a backlog database&lt;/a&gt;. The next ones, lining up:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pain&lt;/th&gt;
&lt;th&gt;Coming up&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AI edits one function, breaks 10 callers&lt;/td&gt;
&lt;td&gt;Code graph + impact analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI modifies code it shouldn't touch&lt;/td&gt;
&lt;td&gt;Governance hints as the only authoring surface&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What did AI even change this week?&lt;/td&gt;
&lt;td&gt;Event ledger&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every session starts from zero&lt;/td&gt;
&lt;td&gt;Project memory layer&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;One pain per article. All built around the same open-source project, &lt;a href="https://github.com/amingclawdev/aming-claw" rel="noopener noreferrer"&gt;aming-claw&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  About aming-claw
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/amingclawdev/aming-claw" rel="noopener noreferrer"&gt;amingclawdev/aming-claw&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What it is:&lt;/strong&gt; A shared workspace where you and your AI agent see the same dashboard. Backlog database, code graph, event ledger, governance hints — all queryable by AI through MCP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why I'm writing this series:&lt;/strong&gt; I keep running into the same kind of AI-collaboration pain. Each post fixes one of them. The fixes generalize beyond aming-claw — the scenario-walk method in this post is a 5-minute habit you can adopt in any project.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the parallel-agent scenario sounded familiar, &lt;strong&gt;drop a comment with the architecture decision AI most recently tried to oversell you on&lt;/strong&gt; — I'll work through it the same way in the comments. Free architectural review, basically. The repo also takes stars and they're free for you to give. 🌟&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 2 of "AI Collaboration Survival Guide" — practical patterns for the messy reality of shipping with AI agents.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>vibecoding</category>
      <category>devtool</category>
    </item>
    <item>
      <title>I told my AI to build a feature. Did it? I had no idea.</title>
      <dc:creator>Aming</dc:creator>
      <pubDate>Sat, 16 May 2026 18:43:34 +0000</pubDate>
      <link>https://dev.to/amingin_ai/i-told-my-ai-to-build-a-feature-did-it-i-had-no-idea-1f1</link>
      <guid>https://dev.to/amingin_ai/i-told-my-ai-to-build-a-feature-did-it-i-had-no-idea-1f1</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — I tried to "manage" AI by having it write decisions, todos, and constraints into markdown docs. After 56 files, I realized AI doesn't maintain document state. So I built aming-claw — a backlog database AI can actually read and write through MCP.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  A bug I kept running into
&lt;/h2&gt;

&lt;p&gt;I thought I was doing AI collaboration the right way.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9toneqw4cnyxjolse20j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9toneqw4cnyxjolse20j.png" alt="Screenshot of docs/dev folder with 56 markdown files using proposal-, review-, and handoff- naming patterns" width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is the &lt;code&gt;docs/dev/&lt;/code&gt; folder of my aming-claw project — 56 markdown files, all produced through AI collaboration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;proposal-*&lt;/code&gt; — new feature specs&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;review-*&lt;/code&gt; — design review records&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;handoff-*&lt;/code&gt; — state passed between sessions&lt;/li&gt;
&lt;li&gt;Plus &lt;code&gt;plan-&lt;/code&gt;, &lt;code&gt;optimization-&lt;/code&gt;, &lt;code&gt;interface-&lt;/code&gt;, &lt;code&gt;manual-fix-&lt;/code&gt;...&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every file dated. Two months in, over a thousand pages of markdown. I figured the next AI session would read these. I figured I'd be able to search them too.&lt;/p&gt;

&lt;p&gt;But there's one problem I can't engineer my way out of:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI doesn't maintain document state.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;proposal-graph-state-reconcile-and-chain-governance-modes.md&lt;/code&gt; — did this proposal ship? Which commit? Is it still valid?&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;handoff-2026-05-10-dashboard-semantic-hash-queue.md&lt;/code&gt; — did the next session actually pick up where this left off?&lt;/li&gt;
&lt;li&gt;18 proposals on file. Which are done, which got rejected, which are still alive? Grep through git log line by line?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;I don't manually maintain the docs, so the docs rot.&lt;/strong&gt; AI doesn't maintain them either: its context window only sees a tiny slice of the workspace. The rest of the 56 files are invisible.&lt;/p&gt;

&lt;p&gt;The more we talk, the more we write — and the further docs drift from code. Eventually you don't trust the docs, and you don't have time to read the code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this happens
&lt;/h2&gt;

&lt;p&gt;This isn't AI being lazy. It's a &lt;strong&gt;structural problem&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Markdown is dead text.&lt;/strong&gt; No state machine. "TODO" doesn't become "DONE" on its own. "Decision: use Redis" doesn't auto-expire when you flip back to in-memory three weeks later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI context has a boundary.&lt;/strong&gt; Each session sees ~200 lines of working code. Old docs never enter the window. Not in the window → can't be maintained.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No traceable link between docs and code.&lt;/strong&gt; Which TODO maps to which function? Once it's done, which commit landed it? Humans can't remember. AI doesn't look it up.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;GitHub Issues, Notion, Linear — none of these help. AI can't see them, so they don't exist.&lt;/p&gt;

&lt;p&gt;The core mismatch is this: &lt;strong&gt;humans want global state. AI sees only local present.&lt;/strong&gt; Between them you need a living, traceable, AI-readable/writable state layer. Markdown isn't that layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  How aming-claw solves it
&lt;/h2&gt;

&lt;p&gt;I gave aming-claw a &lt;strong&gt;dedicated backlog database&lt;/strong&gt; — a peer-level system to the code graph and event ledger, with its own schema, state machine, and query interface. Not stored in markdown. Not buried in code comments. Not dependent on an external issue tracker.&lt;/p&gt;

&lt;p&gt;Each backlog entry is a structured record (todo / decision / constraint) with status, priority, source session, and a code reference (function name or file path). AI reads and writes it through MCP.&lt;/p&gt;

&lt;p&gt;The flow:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. You speak → it goes to the database, not a dead doc
&lt;/h3&gt;

&lt;p&gt;In chat:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Add a retry-after to the rate limiter on UserService.login"&lt;/p&gt;

&lt;p&gt;Or: "Decision — use Redis instead of in-memory for caching"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;aming-claw's MCP server intercepts those statements and writes directly into the backlog:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;    &lt;span class="s"&gt;UserService.login&lt;/span&gt;   &lt;span class="c1"&gt;# function or file path&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;      &lt;span class="s"&gt;todo | decision | constraint&lt;/span&gt;
&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;    &lt;span class="s"&gt;proposed&lt;/span&gt;
&lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  &lt;span class="s"&gt;P1&lt;/span&gt;
&lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;    &lt;span class="s"&gt;session-id-xyz&lt;/span&gt;
&lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2026-05-16T10:23:45Z&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
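
&lt;p&gt;To make that record concrete, here is a minimal Python sketch of how one entry could be modeled in memory. The field names mirror the YAML above, but &lt;code&gt;BacklogEntry&lt;/code&gt; and &lt;code&gt;EntryType&lt;/code&gt; are hypothetical names for illustration, not aming-claw's actual classes.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class EntryType(Enum):
    """The three record kinds named in the article."""
    TODO = "todo"
    DECISION = "decision"
    CONSTRAINT = "constraint"


def utc_now():
    """Current UTC time as an ISO 8601 string with second precision."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


@dataclass
class BacklogEntry:
    """Hypothetical in-memory shape of one backlog record."""
    target: str                 # function name or file path
    type: EntryType
    priority: str               # for example "P1"
    source: str                 # session that proposed the entry
    status: str = "proposed"    # every new entry starts here
    timestamp: str = field(default_factory=utc_now)


entry = BacklogEntry(
    target="UserService.login",
    type=EntryType.TODO,
    priority="P1",
    source="session-id-xyz",
)
print(entry)
```

&lt;p&gt;Typing the &lt;code&gt;type&lt;/code&gt; field as an enum rejects anything outside todo, decision, and constraint when the record is built, which is one way a schema differs from free-form markdown.&lt;/p&gt;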



&lt;p&gt;Markdown is dead text. The backlog database is live state: schema-backed, indexed, state-machined, AI-accessible. That's the difference.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Dashboard shows it instantly
&lt;/h3&gt;

&lt;p&gt;Open the aming-claw dashboard — the left panel shows the new backlog entry. Click it — the right panel jumps to the function via the &lt;code&gt;vscode://&lt;/code&gt; protocol. Status chips are editable inline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1z428hnem7uh0tlsz5z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1z428hnem7uh0tlsz5z.png" alt="aming-claw dashboard backlog view showing multiple entries with priority, status, code references, and update timestamps" width="800" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The backlog view: every entry has priority, status, code reference, and update timestamp. You and the AI query the same source of truth.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. State machine, automatic
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;proposed → in_progress → done(commit hash) → verified
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;in_progress&lt;/code&gt; — AI started working on it&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;done&lt;/code&gt; — commit landed, &lt;strong&gt;hash automatically bound&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;verified&lt;/code&gt; — you reviewed it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every state change is appended to an event ledger: &lt;strong&gt;which day, which session proposed it, which commit implemented it, who verified it&lt;/strong&gt; — all queryable, all replayable.&lt;/p&gt;
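
&lt;p&gt;The state flow and the ledger append can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: the &lt;code&gt;advance&lt;/code&gt; helper, the dict-based entry, the in-memory &lt;code&gt;ledger&lt;/code&gt; list, and the commit hash &lt;code&gt;abc1234&lt;/code&gt; are all hypothetical, and the real implementation persists its events rather than keeping them in a list.&lt;/p&gt;

```python
from datetime import datetime, timezone

# Allowed moves, in the order the article gives:
# proposed, in_progress, done, verified.
TRANSITIONS = {
    "proposed": "in_progress",
    "in_progress": "done",
    "done": "verified",
}

ledger = []  # append-only event list; a real system would persist this


def advance(entry, actor, commit=None):
    """Move an entry to its next state and record the change in the ledger."""
    current = entry["status"]
    if current not in TRANSITIONS:
        raise ValueError(f"no transition out of {current!r}")
    nxt = TRANSITIONS[current]
    if nxt == "done" and not commit:
        raise ValueError("moving to 'done' requires a commit hash")
    if commit:
        entry["commit"] = commit  # bind the hash to the entry
    entry["status"] = nxt
    ledger.append({
        "target": entry["target"],
        "from": current,
        "to": nxt,
        "actor": actor,
        "commit": entry.get("commit"),
        "at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    })
    return entry


entry = {"target": "UserService.login", "status": "proposed"}
advance(entry, actor="session-id-xyz")                    # AI starts work
advance(entry, actor="session-id-xyz", commit="abc1234")  # commit lands
advance(entry, actor="reviewer")                          # human verifies

for event in ledger:
    print(event["from"], "to", event["to"], "by", event["actor"])
```

&lt;p&gt;Because events are only ever appended, replaying the list from the start reconstructs how an entry reached its current status.&lt;/p&gt;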

&lt;h3&gt;
  
  
  4. AI reads the backlog itself, next time
&lt;/h3&gt;

&lt;p&gt;Days later, in chat:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Did we ever fix that Codex plugin Windows install bug?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;AI queries the backlog through MCP and returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="s"&gt;FIXED, P0&lt;/span&gt;
&lt;span class="na"&gt;commit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="s"&gt;0ad8c7e&lt;/span&gt;
&lt;span class="na"&gt;fixed at&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;   &lt;span class="s"&gt;2 days ago&lt;/span&gt;
&lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;       &lt;span class="s"&gt;agent/plugin_installer.py (line 455)&lt;/span&gt;
&lt;span class="na"&gt;change&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="s"&gt;replaced regex pattern with callable replacement&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No grepping git log. No asking a teammate. No "I think we did?"&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwfyd6yvkpktfdsscq9qg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwfyd6yvkpktfdsscq9qg.png" alt="aming-claw dashboard backlog view showing multiple entries with priority, status, code references, and update timestamps" width="800" height="655"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The key thing to notice:&lt;/strong&gt; AI didn't "remember" this from conversation history. It queried the backlog database &lt;strong&gt;in real time&lt;/strong&gt; through MCP. Even if this bug was raised three months ago, in a session that's long gone — AI still gets the &lt;strong&gt;current status + full commit trace&lt;/strong&gt;.&lt;/p&gt;
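
&lt;p&gt;As a rough sketch of what such a lookup might do behind the MCP tool, here is the same idea with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The table layout, the column names, and the &lt;code&gt;find_entries&lt;/code&gt; helper are assumptions made for illustration; aming-claw's actual storage engine and schema may differ.&lt;/p&gt;

```python
import sqlite3

# Hypothetical schema; the real aming-claw tables may look different.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    """
    CREATE TABLE backlog (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        status TEXT NOT NULL,
        priority TEXT NOT NULL,
        commit_hash TEXT,
        file TEXT
    )
    """
)
conn.execute(
    "INSERT INTO backlog (title, status, priority, commit_hash, file) "
    "VALUES (?, ?, ?, ?, ?)",
    (
        "Codex plugin Windows install bug",
        "FIXED",
        "P0",
        "0ad8c7e",
        "agent/plugin_installer.py",
    ),
)


def find_entries(keyword):
    """Return current backlog rows whose title mentions the keyword."""
    rows = conn.execute(
        "SELECT title, status, priority, commit_hash, file "
        "FROM backlog WHERE title LIKE ?",
        (f"%{keyword}%",),
    )
    return [dict(row) for row in rows]


result = find_entries("Windows install")
print(result)
```

&lt;p&gt;The answer comes from whatever the row says at query time, so it stays correct no matter how old the original session is.&lt;/p&gt;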

&lt;p&gt;That's the difference between dead markdown and a live state layer: &lt;strong&gt;the database is the memory, not the conversation.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  This is just the start
&lt;/h2&gt;

&lt;p&gt;Look back at the &lt;code&gt;docs/dev/&lt;/code&gt; screenshot — 56 markdown files, nobody knows which are alive.&lt;br&gt;
Look at the dashboard screenshot — every backlog entry has status, commit, location.&lt;/p&gt;

&lt;p&gt;The difference isn't the tool. &lt;strong&gt;It's whether information has state.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The backlog solves "did the AI build the feature I asked for?" — but AI collaboration has plenty of other holes I'm planning to fill in this series:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pain&lt;/th&gt;
&lt;th&gt;Next article&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AI edits one function, breaks 10 callers&lt;/td&gt;
&lt;td&gt;Code graph + impact analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI modifies code it shouldn't touch&lt;/td&gt;
&lt;td&gt;Governance hints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What did AI even change this week?&lt;/td&gt;
&lt;td&gt;Event ledger&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every session starts from zero&lt;/td&gt;
&lt;td&gt;Project memory layer&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;One article per pain point.&lt;/p&gt;




&lt;h2&gt;
  
  
  About aming-claw
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/amingclawdev/aming-claw" rel="noopener noreferrer"&gt;amingclawdev/aming-claw&lt;/a&gt; — open source&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Next post:&lt;/strong&gt; "AI breaks 10 callers when it edits one function" — coming this week&lt;/li&gt;
&lt;li&gt;Hit me with issues if you've felt this pain&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If &lt;em&gt;"did the AI actually do that thing I asked?"&lt;/em&gt; sounds familiar, give the repo a star — it costs you nothing and tells me I'm not the only one.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is part 1 of an "AI Collaboration Survival Guide" series — practical tools for the messy reality of building with AI agents.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>devtools</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
