<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: holger leichsenring</title>
    <description>The latest articles on DEV Community by holger leichsenring (@holgerleichsenring).</description>
    <link>https://dev.to/holgerleichsenring</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3875448%2F75bc803a-7f4d-4f33-8e46-b91040ca2a78.png</url>
      <title>DEV Community: holger leichsenring</title>
      <link>https://dev.to/holgerleichsenring</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/holgerleichsenring"/>
    <language>en</language>
    <item>
      <title>Specification-First Agentic Development: A Methodology for Structured, Traceable AI-Assisted Development</title>
      <dc:creator>holger leichsenring</dc:creator>
      <pubDate>Sun, 12 Apr 2026 20:04:59 +0000</pubDate>
      <link>https://dev.to/holgerleichsenring/specification-first-agentic-development-a-methodology-for-structured-traceable-ai-assisted-la</link>
      <guid>https://dev.to/holgerleichsenring/specification-first-agentic-development-a-methodology-for-structured-traceable-ai-assisted-la</guid>
      <description>&lt;p&gt;I like clean code. Most of the programs I've written over the years are reasonably well structured. Sure, there's always that moment every two years where I look at old code and think — &lt;em&gt;evolved, great&lt;/em&gt;. Stagnation is dead. But the code follows the "right" rules, whatever that means at the time. Most importantly, it follows a common thread.&lt;/p&gt;

&lt;p&gt;As a freelancer switching between IIoT applications, web apps, message-based backends, infrastructure automation, and pipelines, I'm usually in the luxurious position of being able to forget what I did and understand it again quickly, just by reading the code and the folder structure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But what I very rarely do is document the &lt;em&gt;why&lt;/em&gt;.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Religious Wars of Documentation
&lt;/h2&gt;

&lt;p&gt;You know the debate. IDE features auto-generate doc comments for classes and methods, and those comments add exactly zero value:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;/// &amp;lt;summary&amp;gt;&lt;/span&gt;
&lt;span class="c1"&gt;/// Gets a blue collar worker &lt;/span&gt;
&lt;span class="c1"&gt;/// &amp;lt;/summary&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GetBlueCollarWorkerRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ILogger&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;GetBlueCollarWorkerRequestHandler&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IBlueCollarWorkerAdapter&lt;/span&gt; &lt;span class="n"&gt;blueCollarWorkerAdapter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;IRequestHandler&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;GetBlueCollarWorkerRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;GetBlueCollarWorkerResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the method is called &lt;code&gt;Handle&lt;/code&gt; and the class is called &lt;code&gt;GetBlueCollarWorkerRequestHandler&lt;/code&gt;, I really don't need &lt;em&gt;"Gets a blue collar worker"&lt;/em&gt; written above it.&lt;/p&gt;

&lt;p&gt;So personally I only document two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Official interfaces&lt;/strong&gt; — Swagger/REST APIs, NuGet/npm packages, libs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Problematic areas&lt;/strong&gt; — when I write something genuinely non-obvious, with links to sources&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Sometimes that's enough. But it doesn't give you an overview of &lt;em&gt;why&lt;/em&gt; the program works the way it does. Documenting the "why" takes time. I use arc42 with Architecture Decision Records, but those live in a separate repository and only cover architectural decisions, not the implementation decisions made feature by feature.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result: six months after shipping, nobody remembers why anything was built the way it was.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How Developer Work Has Changed
&lt;/h2&gt;

&lt;p&gt;We've all been through the evolution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stack Overflow copy-paste &lt;em&gt;(okay, I did it)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;ChatGPT copy-paste &lt;em&gt;(I lied, I did both)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Claude/Codex in the IDE writing the code for me&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When I started with AI-assisted coding, I ran into the usual problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude just &lt;em&gt;implements things&lt;/em&gt;. Lots of code. Am I still going to read all of it?&lt;/li&gt;
&lt;li&gt;Even with &lt;code&gt;claude.md&lt;/code&gt; and written coding principles in place, Claude sometimes just ignores them&lt;/li&gt;
&lt;li&gt;Well-structured code → Claude produces good results. Bad code → Claude makes it worse&lt;/li&gt;
&lt;li&gt;Beyond a certain complexity, Claude starts doing weird things&lt;/li&gt;
&lt;li&gt;"Just let it run" is a terrible idea, even when you're working on three things in parallel&lt;/li&gt;
&lt;li&gt;Context-switching between parallel topics is hard for humans, harder for AI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And then there were the recurring annoyances with IDE-based Claude:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;After context is lost, I waste a lot of tokens just getting back to where I was&lt;/li&gt;
&lt;li&gt;Long chat threads create cluttered history — and once it's gone, re-explaining everything is exhausting&lt;/li&gt;
&lt;li&gt;Every restart, I'm explaining the entire project from scratch again&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Specification-First Agentic Development
&lt;/h2&gt;

&lt;p&gt;I felt my approach wasn't good enough. It wasn't leveraging what AI &lt;em&gt;could&lt;/em&gt; do — specifically: &lt;strong&gt;document whatever you want, without complaining&lt;/strong&gt;, which is something I as a developer was never willing to do consistently.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Core Idea
&lt;/h3&gt;

&lt;p&gt;Instead of staying in the IDE and trying to keep track, I needed a more structured approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Everything gets written down&lt;/li&gt;
&lt;li&gt;The AI needs to know where it was and what to do next&lt;/li&gt;
&lt;li&gt;I need to track what's changed, what's planned, and what's done&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Phase Workflow
&lt;/h3&gt;

&lt;p&gt;Here's how it looks in practice:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Discuss&lt;/strong&gt; new things in Claude Web (not the IDE)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build a rough plan&lt;/strong&gt; and create a &lt;code&gt;.md&lt;/code&gt; document from the conversation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Move the file to the IDE&lt;/strong&gt; — let Claude double-check the document, ask questions, resolve ambiguities&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Move to &lt;code&gt;planned/&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2njxdjpunnbhahn480yj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2njxdjpunnbhahn480yj.png" alt="phases" width="740" height="1352"&gt;&lt;/a&gt;&lt;br&gt;
When it's time to implement, Claude always knows exactly which phases exist and how to handle them.&lt;/p&gt;
&lt;h3&gt;
  
  
  The Folder Structure
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.agentsmith/phases/
├── done/       # completed phases (historical reference)
├── active/     # phase currently being worked on (max 1)
└── planned/    # upcoming phases with requirements
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  The &lt;code&gt;claude.md&lt;/code&gt; Instructions
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Implementation Workflow (follow this order for every phase)&lt;/span&gt;
&lt;span class="p"&gt;
1.&lt;/span&gt; Write phase prompt first — create planned/p{NN}-slug.md BEFORE writing any code
&lt;span class="p"&gt;2.&lt;/span&gt; Move to active — when starting work
&lt;span class="p"&gt;3.&lt;/span&gt; Enter plan mode — explore codebase, design approach, get approval before coding
&lt;span class="p"&gt;4.&lt;/span&gt; Implement step by step — contracts first, then implementation, then DI, then tests
&lt;span class="p"&gt;5.&lt;/span&gt; Build after each step — fix errors immediately
&lt;span class="p"&gt;6.&lt;/span&gt; Run ALL tests — 0 failures before moving on
&lt;span class="p"&gt;7.&lt;/span&gt; Log decisions — append to decisions.md (what, alternatives, why) — MANDATORY
&lt;span class="p"&gt;8.&lt;/span&gt; Update context.yaml — move phase from active to done
&lt;span class="p"&gt;9.&lt;/span&gt; Move to done
&lt;span class="p"&gt;10.&lt;/span&gt; Commit — one commit per phase
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
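
&lt;p&gt;Steps 2 and 9 are plain file moves, so the bookkeeping can be sketched in a few lines of Python. This is a minimal illustration, not code from the template: &lt;code&gt;move_phase&lt;/code&gt; and its parameters are hypothetical names, and only the three folder names and the limit of one active phase are taken from the structure above.&lt;/p&gt;

```python
import shutil
from pathlib import Path

# Hypothetical helper: only the folder names and the "max 1 active" rule
# come from the article, everything else is an assumption.
STAGES = ("planned", "active", "done")


def move_phase(phases_root: Path, phase_file: str, target: str):
    """Move a phase document to another stage folder and return its new path."""
    if target not in STAGES:
        raise ValueError(f"unknown stage: {target}")

    # Locate the document in whichever stage folder currently holds it.
    source = next(
        (phases_root / stage / phase_file
         for stage in STAGES
         if (phases_root / stage / phase_file).is_file()),
        None,
    )
    if source is None:
        raise FileNotFoundError(phase_file)

    target_dir = phases_root / target
    target_dir.mkdir(parents=True, exist_ok=True)

    # Enforce the one-active-phase rule before anything is moved.
    if target == "active" and any(p.is_file() for p in target_dir.iterdir()):
        raise RuntimeError("another phase is already active")

    destination = target_dir / phase_file
    shutil.move(str(source), str(destination))
    return destination
```

&lt;p&gt;Raising an error when &lt;code&gt;active/&lt;/code&gt; is already occupied, instead of silently queuing the second phase, keeps the rule visible to whoever (or whatever) is driving the workflow.&lt;/p&gt;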



&lt;h2&gt;
  
  
  The Decision Log
&lt;/h2&gt;

&lt;p&gt;After every phase, Claude appends to &lt;code&gt;decisions.md&lt;/code&gt; in the repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## p66: Docs Enhancement — Self-Documentation &amp;amp; Multi-Agent Orchestration&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; [Architecture] DESIGN.md placed in docs/ not project root — it is a docs-site
  concern, not product code
&lt;span class="p"&gt;-&lt;/span&gt; [Tooling] CSS-only theme overrides via extra_css, no custom MkDocs templates —
  keeps MkDocs upgrades safe
&lt;span class="p"&gt;-&lt;/span&gt; [TradeOff] Content first, styling second — missing content is a blocker,
  imperfect styling is not

&lt;span class="gu"&gt;## p67: API Scan Compression &amp;amp; ZAP Fix&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; [Architecture] Category slicing (auth/design/runtime) instead of finding
  compression — findings are already compact at ~90 chars/piece, compression
  would lose information
&lt;span class="p"&gt;-&lt;/span&gt; [Implementation] Skip DAST skills on ZAP failure via ZapFailed flag — avoids
  wasting 2 LLM calls on empty input
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
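
&lt;p&gt;The append step itself is mechanical. Here is a minimal Python sketch, assuming each entry is a (category, what, why) triple rendered in the format of the excerpt above. The function name &lt;code&gt;append_decisions&lt;/code&gt; and its signature are hypothetical and not part of the template; rejected alternatives are assumed to be folded into the "what" text, as in the excerpt.&lt;/p&gt;

```python
from pathlib import Path


def append_decisions(log_path: Path, phase_id: str, title: str, decisions):
    """Append one phase section to the decision log.

    decisions is a list of (category, what, why) tuples.
    """
    lines = [f"## {phase_id}: {title}", ""]
    for category, what, why in decisions:
        # "\u2014" is the dash separator used in the log excerpt.
        lines.append(f"- [{category}] {what} \u2014 {why}")
    section = "\n".join(lines) + "\n"

    # Append only: earlier sections stay untouched, so the log keeps its history.
    previous = log_path.read_text(encoding="utf-8") if log_path.exists() else ""
    separator = "\n" if previous and not previous.endswith("\n\n") else ""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(separator + section)
```

&lt;p&gt;Because the file is only ever appended to, the log doubles as a timeline: the order of the sections is the order in which the phases were finished.&lt;/p&gt;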



&lt;p&gt;Six months later, when nobody remembers why anything was done the way it was — it's all right there.&lt;/p&gt;




&lt;h2&gt;
  
  
  Saving Tokens with &lt;code&gt;context.yaml&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Having all these documents in the repo, I don't want Claude to re-read everything from scratch every time. That's what &lt;code&gt;context.yaml&lt;/code&gt; is for.&lt;/p&gt;

&lt;p&gt;It describes the architecture, stack, integrations, quality rules, and — critically — a compressed summary of every phase:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;meta&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;project&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agent-smith&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1.0.0&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;agent&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;pipeline&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;purpose&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Self-hosted&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;AI&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;orchestration&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;framework:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;code,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;legal,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;security,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;workflows."&lt;/span&gt;

&lt;span class="na"&gt;stack&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;runtime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.NET &lt;/span&gt;&lt;span class="m"&gt;8&lt;/span&gt;
  &lt;span class="na"&gt;lang&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;C#&lt;/span&gt;
  &lt;span class="na"&gt;infra&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Docker&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;K8s&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Redis&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;testing&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;xUnit&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Moq&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;FluentAssertions&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;sdks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Anthropic&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;OpenAI&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Google-Gemini&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Octokit&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;LibGit2Sharp&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;done&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;p01&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Solution&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;structure,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;domain&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;entities,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;contracts,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;YAML&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;config&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;loader"&lt;/span&gt;
    &lt;span class="na"&gt;p02&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Command/Handler&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pattern:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;9&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;records,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;9&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;handler&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;stubs,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;CommandExecutor"&lt;/span&gt;
    &lt;span class="na"&gt;p03&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Providers:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;AzureDevOps+GitHub&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tickets,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Local+GitHub&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;source,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Claude&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;agentic&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;loop"&lt;/span&gt;
    &lt;span class="na"&gt;p04&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pipeline&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;execution:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;IntentParser,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;PipelineExecutor,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ProcessTicketUseCase,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;DI&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;wiring"&lt;/span&gt;
    &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude can tell immediately which features have been implemented just by reading this file, and it knows which phase document to open for the details.&lt;/p&gt;
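
&lt;p&gt;The lookup from a phase id in &lt;code&gt;context.yaml&lt;/code&gt; to its detailed document can be sketched as follows. The assumption here is that phase files keep the &lt;code&gt;p{NN}-slug.md&lt;/code&gt; naming from the workflow rules and live in one of the three phase folders; &lt;code&gt;find_phase_document&lt;/code&gt; is a hypothetical helper, not something the template ships.&lt;/p&gt;

```python
from pathlib import Path


def find_phase_document(phases_root: Path, phase_id: str):
    """Return the path of the phase document for an id such as "p04", or None."""
    for stage in ("active", "planned", "done"):
        stage_dir = phases_root / stage
        if not stage_dir.is_dir():
            continue
        # File names are assumed to follow the p{NN}-slug.md pattern.
        matches = sorted(stage_dir.glob(f"{phase_id}-*.md"))
        if matches:
            return matches[0]
    return None
```

&lt;p&gt;With this kind of lookup, only the one-line summary has to sit in the context file; the full phase document is read on demand, which is where the token savings come from.&lt;/p&gt;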




&lt;h2&gt;
  
  
  An Interesting Parallel: Karpathy's Knowledge Base
&lt;/h2&gt;

&lt;p&gt;Andrej Karpathy recently wrote about using LLMs to build personal knowledge bases: collecting external material into a &lt;code&gt;raw/&lt;/code&gt; directory, letting the LLM compile it into a linked markdown wiki, then running Q&amp;amp;A against it.&lt;/p&gt;

&lt;p&gt;The structural parallel is obvious. But there's a key difference:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Karpathy&lt;/strong&gt; collects &lt;em&gt;external&lt;/em&gt; knowledge — papers, articles, datasets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specification-First Agentic Development&lt;/strong&gt; persists &lt;em&gt;internal&lt;/em&gt; knowledge &lt;em&gt;while building the product&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The documentation isn't a separate artifact you create after the fact. It's generated as a side effect of the development process itself.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Real Example: The Documentation Site
&lt;/h2&gt;

&lt;p&gt;Because all features, bugfixes, ideas, and decisions are already documented in the repo, it's not surprising that Claude can generate full technical documentation quickly and accurately.&lt;/p&gt;

&lt;p&gt;Phase 53 of my project was exactly this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Phase 53: Documentation Site&lt;/span&gt;

&lt;span class="gu"&gt;## Goal: Technical documentation at docs.agent-smith.org&lt;/span&gt;

Complete file structure, MkDocs Material setup, GitHub Actions
deployment, README reduction to essentials...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;How long did it take? About 15 minutes. Because all the information was already there — in the phase files, the decision log, the context.yaml. Claude just had to synthesize it.&lt;/p&gt;

&lt;p&gt;I really celebrated that one.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Specification-First Agentic Development is simply a way of structuring the work. It defines phases as files in the repository, right next to the code, producing a consistent development pattern that includes the plan, the decisions, and the reasoning.&lt;/p&gt;

&lt;p&gt;The benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fewer wasted tokens&lt;/strong&gt; — context.yaml gives Claude what it needs without re-reading everything&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallelism&lt;/strong&gt; — multiple phases can be planned and tracked simultaneously&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Traceability&lt;/strong&gt; — every decision is logged with alternatives and rationale&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Restartability&lt;/strong&gt; — restart your machine, restart Claude, pick up exactly where you left off&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-documentation&lt;/strong&gt; — the docs practically write themselves&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's not rocket science. It's just discipline — finally enforced by a patient AI that never complains about writing things down.&lt;/p&gt;




&lt;p&gt;The template and a how-to are in this &lt;a href="https://github.com/holgerleichsenring/specification-first-agentic-development" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;.&lt;br&gt;
The idea was born in the Agent Smith implementation, which has its own &lt;a href="https://github.com/holgerleichsenring/agent-smith" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;.&lt;br&gt;
Originally posted on &lt;a href="https://codingsoul.org/2026/04/11/the-why-never-gets-written-down/" rel="noopener noreferrer"&gt;CodingSoul&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>agile</category>
      <category>claudeai</category>
    </item>
  </channel>
</rss>
