<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Truong Phung</title>
    <description>The latest articles on DEV Community by Truong Phung (@truongpx396).</description>
    <link>https://dev.to/truongpx396</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg</url>
      <title>DEV Community: Truong Phung</title>
      <link>https://dev.to/truongpx396</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/truongpx396"/>
    <language>en</language>
    <item>
      <title>📘 Spec Kit vs. Superpowers ⚡ — A Comprehensive Comparison &amp; Practical Guide to Combining Both 🚀</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Sat, 25 Apr 2026 10:22:45 +0000</pubDate>
      <link>https://dev.to/truongpx396/spec-kit-vs-superpowers-a-comprehensive-comparison-practical-guide-to-combining-both-52jj</link>
      <guid>https://dev.to/truongpx396/spec-kit-vs-superpowers-a-comprehensive-comparison-practical-guide-to-combining-both-52jj</guid>
      <description>&lt;p&gt;A side-by-side look at two of the most influential frameworks for structured, agentic AI coding — plus a step-by-step playbook for using them together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/github/spec-kit" rel="noopener noreferrer"&gt;github/spec-kit&lt;/a&gt;&lt;/strong&gt; — GitHub's toolkit for &lt;strong&gt;Spec-Driven Development (SDD)&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/obra/superpowers" rel="noopener noreferrer"&gt;obra/superpowers&lt;/a&gt;&lt;/strong&gt; — Jesse Vincent's &lt;strong&gt;agentic skills framework&lt;/strong&gt; for disciplined agent-driven development.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both projects address the same underlying problem — &lt;em&gt;AI coding agents are powerful but unstructured&lt;/em&gt; — but they solve it from very different angles. Spec Kit treats &lt;strong&gt;the specification as the source of truth&lt;/strong&gt;; Superpowers treats &lt;strong&gt;the development workflow as the source of truth&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  📑 Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;⚡ TL;DR&lt;/li&gt;
&lt;li&gt;1. 🧠 Philosophy&lt;/li&gt;
&lt;li&gt;2. 🔄 Workflow &amp;amp; Mental Model&lt;/li&gt;
&lt;li&gt;3. 🏗️ Architecture &amp;amp; Primary Unit&lt;/li&gt;
&lt;li&gt;4. 🤖 Agent / Tool Compatibility&lt;/li&gt;
&lt;li&gt;5. 📦 Installation &amp;amp; Distribution&lt;/li&gt;
&lt;li&gt;6. 🧩 Customization &amp;amp; Extensibility&lt;/li&gt;
&lt;li&gt;7. 🌟 What Each Does Especially Well&lt;/li&gt;
&lt;li&gt;8. ⚖️ Tradeoffs &amp;amp; Limitations&lt;/li&gt;
&lt;li&gt;9. 🤝 How They Could Coexist&lt;/li&gt;
&lt;li&gt;
9a. 🛠️ The Best Way to Combine Both — A Practical Guide

&lt;ul&gt;
&lt;li&gt;⚙️ One-time setup&lt;/li&gt;
&lt;li&gt;🔁 The per-feature loop&lt;/li&gt;
&lt;li&gt;📜 Two non-obvious rules&lt;/li&gt;
&lt;li&gt;📋 Handoff prompt template&lt;/li&gt;
&lt;li&gt;📏 When to scale down&lt;/li&gt;
&lt;li&gt;🚫 Anti-patterns to avoid&lt;/li&gt;
&lt;li&gt;✅ 60-second checklist&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;10. 🧭 Quick Decision Guide&lt;/li&gt;

&lt;li&gt;11. 📊 At-a-Glance Summary&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  ⚡ TL;DR
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Spec Kit&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Superpowers&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Author / Owner&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GitHub (org-backed)&lt;/td&gt;
&lt;td&gt;Jesse Vincent + Prime Radiant team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Core idea&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Specs are executable; code is generated &lt;em&gt;from&lt;/em&gt; specs&lt;/td&gt;
&lt;td&gt;Skills enforce a disciplined dev workflow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Primary artifact&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The specification document&lt;/td&gt;
&lt;td&gt;The skill (a triggered procedure)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Trigger model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;User-invoked slash commands (&lt;code&gt;/speckit.*&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Auto-triggered skills based on context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Methodology&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Spec-Driven Development (SDD)&lt;/td&gt;
&lt;td&gt;Agentic SDLC (brainstorm → design → plan → TDD → review → ship)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Greenfield features, brownfield enhancements, spec-to-code traceability&lt;/td&gt;
&lt;td&gt;Multi-hour autonomous work, parallel subagents, TDD discipline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Distribution&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Python CLI (&lt;code&gt;uv tool install specify-cli&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Plugin marketplaces (Claude, Codex, Cursor, etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;License&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maturity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;90.8k stars, 136 releases, 100+ community extensions&lt;/td&gt;
&lt;td&gt;167k stars, active releases, Discord community&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  1. 🧠 Philosophy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  📘 Spec Kit — "The spec is the source of truth"
&lt;/h3&gt;

&lt;p&gt;Spec Kit flips the traditional flow: instead of writing code that loosely tracks a spec, the spec &lt;strong&gt;directly generates the implementation&lt;/strong&gt;. Changes happen at the spec layer first; code is regenerated to match. Quoting the README: &lt;em&gt;"specifications become executable, directly generating working implementations rather than just guiding them."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Foundational principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Intent first&lt;/strong&gt; — what &amp;amp; why before how&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rich, guard-railed specifications&lt;/strong&gt; with organizational principles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-step refinement&lt;/strong&gt; instead of one-shot generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI-native workflows&lt;/strong&gt; that lean on advanced model capabilities&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🦸 Superpowers — "The workflow is the source of truth"
&lt;/h3&gt;

&lt;p&gt;Superpowers prevents agents from immediately jumping into code. It enforces a disciplined sequence: discovery → design validation → planning → implementation → review → completion. From the README: &lt;em&gt;"As soon as it sees that you're building something, it doesn't just jump into trying to write code."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Four core principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Test-Driven Development&lt;/strong&gt; — tests precede all code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Systematic over ad-hoc&lt;/strong&gt; — process replaces guessing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity reduction&lt;/strong&gt; — simplicity is the primary goal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evidence over claims&lt;/strong&gt; — verify before declaring success&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔀 The philosophical split
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spec Kit&lt;/strong&gt; is &lt;em&gt;artifact-centric&lt;/em&gt;. The spec persists, evolves, and is the contract.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Superpowers&lt;/strong&gt; is &lt;em&gt;process-centric&lt;/em&gt;. The procedure persists; the artifact is whatever the procedure produces.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  2. 🔄 Workflow &amp;amp; Mental Model
&lt;/h2&gt;

&lt;h3&gt;
  
  
  📘 Spec Kit's 7-step workflow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/speckit.constitution&lt;/code&gt;&lt;/strong&gt; — Establish project governing principles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/speckit.specify&lt;/code&gt;&lt;/strong&gt; — Define requirements and user stories&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/speckit.clarify&lt;/code&gt;&lt;/strong&gt; — Clarify underspecified requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/speckit.plan&lt;/code&gt;&lt;/strong&gt; — Create technical implementation plans&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Validate the plan for completeness&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/speckit.tasks&lt;/code&gt;&lt;/strong&gt; — Generate actionable task breakdowns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/speckit.implement&lt;/code&gt;&lt;/strong&gt; — Execute tasks to build features&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Three modes: &lt;strong&gt;0-to-1&lt;/strong&gt; (greenfield), &lt;strong&gt;Creative Exploration&lt;/strong&gt; (parallel implementations across stacks), &lt;strong&gt;Iterative Enhancement&lt;/strong&gt; (brownfield).&lt;/p&gt;

&lt;h3&gt;
  
  
  🦸 Superpowers' 7 workflow stages
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Brainstorming&lt;/strong&gt; — Socratic questioning to refine ideas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Using Git Worktrees&lt;/strong&gt; — Isolated branches with verified test baselines&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Writing Plans&lt;/strong&gt; — Break work into 2–5 minute tasks with exact specs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Subagent-Driven Development&lt;/strong&gt; — Fresh subagent per task, two-stage review&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test-Driven Development&lt;/strong&gt; — Strict RED-GREEN-REFACTOR&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Requesting Code Review&lt;/strong&gt; — Pre-review checklists, severity-based tracking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Finishing Development Branches&lt;/strong&gt; — Merge/PR decision + cleanup&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  🎯 Key contrast
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Spec Kit's workflow is &lt;strong&gt;linear and document-producing&lt;/strong&gt;: each command emits an artifact (constitution, spec, plan, tasks).&lt;/li&gt;
&lt;li&gt;Superpowers' workflow is &lt;strong&gt;stateful and execution-producing&lt;/strong&gt;: each stage manipulates code, tests, branches, and subagent state.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3. 🏗️ Architecture &amp;amp; Primary Unit
&lt;/h2&gt;

&lt;h3&gt;
  
  
  📘 Spec Kit — Slash commands + templates
&lt;/h3&gt;

&lt;p&gt;Six explicit, user-invoked slash commands (&lt;code&gt;/speckit.constitution&lt;/code&gt;, &lt;code&gt;/speckit.specify&lt;/code&gt;, &lt;code&gt;/speckit.clarify&lt;/code&gt;, &lt;code&gt;/speckit.plan&lt;/code&gt;, &lt;code&gt;/speckit.tasks&lt;/code&gt;, &lt;code&gt;/speckit.implement&lt;/code&gt;). Each is a template that produces a structured artifact stored under &lt;code&gt;.specify/&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.specify/
├── memory/         # Constitution and governance
├── specs/          # Feature specifications by ID
├── scripts/        # Helper automation scripts
├── extensions/     # Custom extensions
├── presets/        # Workflow customizations
└── templates/      # Command templates
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🦸 Superpowers — Skills + agents + plugins
&lt;/h3&gt;

&lt;p&gt;14+ composable skills organized by category:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Testing&lt;/strong&gt;: &lt;code&gt;test-driven-development&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging&lt;/strong&gt;: &lt;code&gt;systematic-debugging&lt;/code&gt;, &lt;code&gt;verification-before-completion&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collaboration&lt;/strong&gt;: &lt;code&gt;brainstorming&lt;/code&gt;, &lt;code&gt;writing-plans&lt;/code&gt;, &lt;code&gt;executing-plans&lt;/code&gt;, &lt;code&gt;dispatching-parallel-agents&lt;/code&gt;, &lt;code&gt;requesting-code-review&lt;/code&gt;, &lt;code&gt;receiving-code-review&lt;/code&gt;, &lt;code&gt;using-git-worktrees&lt;/code&gt;, &lt;code&gt;finishing-a-development-branch&lt;/code&gt;, &lt;code&gt;subagent-driven-development&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Meta&lt;/strong&gt;: &lt;code&gt;writing-skills&lt;/code&gt;, &lt;code&gt;using-superpowers&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agents/        # Agent definitions
skills/        # Skill implementations (auto-triggered)
commands/      # CLI command definitions
.claude-plugin/, .codex-plugin/, .cursor-plugin/   # Per-host configs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🎬 Trigger model — the deepest difference
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spec Kit&lt;/strong&gt;: human types &lt;code&gt;/speckit.plan&lt;/code&gt;. Explicit, deterministic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Superpowers&lt;/strong&gt;: skill auto-fires when its description matches the situation. The agent doesn't &lt;em&gt;decide&lt;/em&gt; to brainstorm; the brainstorming skill triggers because the user mentioned a vague idea.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes Spec Kit feel like a &lt;strong&gt;CLI you drive&lt;/strong&gt;, and Superpowers feel like an &lt;strong&gt;operating system the agent inhabits&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. 🤖 Agent / Tool Compatibility
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent / Tool&lt;/th&gt;
&lt;th&gt;Spec Kit&lt;/th&gt;
&lt;th&gt;Superpowers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ (official + Superpowers marketplace)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Copilot CLI&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini CLI&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cursor (CLI / IDE)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ (plugin marketplace)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI Codex CLI / Codex App&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenCode&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen / Mistral / others&lt;/td&gt;
&lt;td&gt;✅ (30+ agents total)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Spec Kit casts a wider net&lt;/strong&gt; (30+ agents), selected at install time via &lt;code&gt;--integration&lt;/code&gt;. &lt;strong&gt;Superpowers goes deeper per host&lt;/strong&gt;, with first-class plugin packages tailored to each ecosystem.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. 📦 Installation &amp;amp; Distribution
&lt;/h2&gt;

&lt;h3&gt;
  
  
  📘 Spec Kit
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv tool &lt;span class="nb"&gt;install &lt;/span&gt;specify-cli &lt;span class="nt"&gt;--from&lt;/span&gt; git+https://github.com/github/spec-kit.git@vX.Y.Z
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Python 3.11+, Git, &lt;code&gt;uv&lt;/code&gt; or &lt;code&gt;pipx&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Cross-platform (Linux/macOS/Windows)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Distributed only from GitHub&lt;/strong&gt; — PyPI packages with the same name are not official&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🦸 Superpowers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Claude plugin marketplace: &lt;code&gt;/plugin install superpowers@claude-plugins-official&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Superpowers marketplace registration&lt;/li&gt;
&lt;li&gt;Per-agent installation flows for Codex, Cursor, OpenCode, Copilot CLI, Gemini CLI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Spec Kit&lt;/strong&gt; is a single CLI you install once and configure per project. &lt;strong&gt;Superpowers&lt;/strong&gt; is a plugin you install per agent host, with the host's plugin system managing updates.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. 🧩 Customization &amp;amp; Extensibility
&lt;/h2&gt;

&lt;h3&gt;
  
  
  📘 Spec Kit
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extensions&lt;/strong&gt; — add new capabilities (Jira sync, post-implementation review, …)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Presets&lt;/strong&gt; — customize existing workflows (compliance formats, terminology localization)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;100+ community-contributed extensions&lt;/strong&gt; across &lt;code&gt;docs&lt;/code&gt;, &lt;code&gt;code&lt;/code&gt;, &lt;code&gt;process&lt;/code&gt;, &lt;code&gt;integration&lt;/code&gt;, &lt;code&gt;visibility&lt;/code&gt; categories&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🦸 Superpowers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Skills are the extension primitive&lt;/strong&gt; — write your own &lt;code&gt;SKILL.md&lt;/code&gt; with a description that triggers in your situation&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;writing-skills&lt;/code&gt; meta-skill teaches the agent how to author new skills&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;using-superpowers&lt;/code&gt; documents how skills compose&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔍 Comparison
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Spec Kit's extension model is &lt;strong&gt;catalog-driven&lt;/strong&gt; — you browse and adopt prebuilt pieces.&lt;/li&gt;
&lt;li&gt;Superpowers' extension model is &lt;strong&gt;author-driven&lt;/strong&gt; — the framework actively supports you writing the next skill.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  7. 🌟 What Each Does Especially Well
&lt;/h2&gt;

&lt;h3&gt;
  
  
  📘 Spec Kit shines when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;You need &lt;strong&gt;traceability from requirement to code&lt;/strong&gt; (audits, compliance, regulated industries)&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;product manager / non-engineer&lt;/strong&gt; owns the spec and engineers consume it&lt;/li&gt;
&lt;li&gt;You want to &lt;strong&gt;swap stacks&lt;/strong&gt;: regenerate the same spec into Go, Python, TypeScript&lt;/li&gt;
&lt;li&gt;Your org already thinks in terms of &lt;strong&gt;PRDs, RFCs, and design docs&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You need &lt;strong&gt;enterprise-style governance&lt;/strong&gt; with constitution-level constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🦸 Superpowers shines when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;You want the agent to run &lt;strong&gt;autonomously for hours&lt;/strong&gt; without going off-rails&lt;/li&gt;
&lt;li&gt;You want &lt;strong&gt;strict TDD&lt;/strong&gt; baked into the agent's behavior, not just hoped for&lt;/li&gt;
&lt;li&gt;You're orchestrating &lt;strong&gt;parallel subagents&lt;/strong&gt; and need built-in review patterns&lt;/li&gt;
&lt;li&gt;You need &lt;strong&gt;evidence-based completion&lt;/strong&gt; — agent must prove it worked, not claim it&lt;/li&gt;
&lt;li&gt;You're operating at the &lt;strong&gt;frontier of agent autonomy&lt;/strong&gt; and want guardrails by default&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  8. ⚖️ Tradeoffs &amp;amp; Limitations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  📘 Spec Kit
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Heavier upfront cost&lt;/strong&gt; — writing a constitution and spec before any code feels slow for small tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Less suited for exploratory hacking&lt;/strong&gt; — the workflow assumes you know roughly what you want&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spec drift risk&lt;/strong&gt; — if the team edits code without updating specs, the "single source of truth" erodes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document-heavy&lt;/strong&gt; — generates many markdown artifacts that need maintenance&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🦸 Superpowers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Opinionated&lt;/strong&gt; — the workflow assumes you want TDD, worktrees, subagent orchestration; if you don't, friction is high&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity floor&lt;/strong&gt; — even small tasks pay some procedural overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Learning curve&lt;/strong&gt; — 14+ skills and a meta-vocabulary (subagent-driven-development, verification-before-completion) take time to internalize&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-triggering can surprise&lt;/strong&gt; — a skill firing unexpectedly can derail a session if descriptions are loose&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  9. 🤝 How They Could Coexist
&lt;/h2&gt;

&lt;p&gt;These are &lt;strong&gt;not mutually exclusive&lt;/strong&gt;. A team could realistically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;Spec Kit&lt;/strong&gt; for the &lt;em&gt;what&lt;/em&gt; — constitution, spec, plan, tasks committed to the repo as durable artifacts&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Superpowers&lt;/strong&gt; for the &lt;em&gt;how&lt;/em&gt; — once tasks exist, Superpowers' TDD, worktree, subagent, and review skills execute them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The artifacts Spec Kit produces (&lt;code&gt;.specify/specs/&amp;lt;id&amp;gt;/tasks.md&lt;/code&gt;) are exactly the kind of plan Superpowers' &lt;code&gt;executing-plans&lt;/code&gt; skill is designed to consume. The two systems target different layers of the same problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  9a. 🛠️ The Best Way to Combine Both — A Practical Guide
&lt;/h2&gt;

&lt;p&gt;The mental model in one sentence:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Spec Kit plans WHAT to build. Superpowers controls HOW it gets built.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Spec Kit gives you durable, human-readable artifacts (constitution → spec → plan → tasks). Superpowers takes those tasks and executes them with TDD, worktrees, subagents, and review baked in. You hand off at the &lt;strong&gt;task list&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  ⚙️ One-time setup (do this once per machine + once per repo)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;On your machine:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install Spec Kit:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   uv tool &lt;span class="nb"&gt;install &lt;/span&gt;specify-cli &lt;span class="nt"&gt;--from&lt;/span&gt; git+https://github.com/github/spec-kit.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Install Superpowers in your agent host. For Claude Code:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;   /plugin install superpowers@claude-plugins-official
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;In your repo (once):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Initialize Spec Kit with your agent: &lt;code&gt;specify init --integration claude-code&lt;/code&gt; (or whichever agent you use).&lt;/li&gt;
&lt;li&gt;Run &lt;strong&gt;&lt;code&gt;/speckit.constitution&lt;/code&gt;&lt;/strong&gt; once to set project-wide rules. Add a single line that bridges the two systems:
&amp;gt; &lt;em&gt;"Implementation of any task list MUST follow the Superpowers workflow: worktree → TDD (red-green-refactor) → subagent-driven execution → code review → finish-branch."&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Commit &lt;code&gt;.specify/&lt;/code&gt; to the repo. Add &lt;code&gt;.claude/&lt;/code&gt; (or your host's plugin dir) per your team's policy.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's the entire setup. From here on, every feature follows the same loop.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔁 The per-feature loop (the one you actually use)
&lt;/h3&gt;

&lt;p&gt;Run these in order. Each step is a single command or short prompt.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Command / Prompt&lt;/th&gt;
&lt;th&gt;What you get&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Superpowers&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;"Let's brainstorm: I want to add X."&lt;/em&gt; (triggers &lt;code&gt;brainstorming&lt;/code&gt; skill)&lt;/td&gt;
&lt;td&gt;Clarified idea, alternatives considered&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Spec Kit&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/speckit.specify&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;specs/&amp;lt;id&amp;gt;/spec.md&lt;/code&gt; — the requirements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Spec Kit&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/speckit.clarify&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open questions resolved&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Spec Kit&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/speckit.plan&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;specs/&amp;lt;id&amp;gt;/plan.md&lt;/code&gt; — technical approach&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Spec Kit&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/speckit.tasks&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;specs/&amp;lt;id&amp;gt;/tasks.md&lt;/code&gt; — ordered, small tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Superpowers&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;"Use git worktree for this feature."&lt;/em&gt; (triggers &lt;code&gt;using-git-worktrees&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Isolated branch with green test baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Superpowers&lt;/td&gt;
&lt;td&gt;&lt;em&gt;"Execute &lt;code&gt;specs/&amp;lt;id&amp;gt;/tasks.md&lt;/code&gt; using subagent-driven development with TDD."&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Code, written test-first, one subagent per task&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Superpowers&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;"Request code review."&lt;/em&gt; (triggers &lt;code&gt;requesting-code-review&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Severity-tagged punch list&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Superpowers&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;"Finish the development branch."&lt;/em&gt; (triggers &lt;code&gt;finishing-a-development-branch&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;PR opened or merged + cleanup&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's it. Spec Kit owns steps 2–5. Superpowers owns steps 1, 6–9. The handoff happens at &lt;code&gt;tasks.md&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  📜 The two non-obvious rules that make this combo work
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Rule 1 — Don't skip &lt;code&gt;/speckit.tasks&lt;/code&gt;, even when you're tempted.&lt;/strong&gt;&lt;br&gt;
Superpowers' &lt;code&gt;executing-plans&lt;/code&gt; skill is designed to consume small (2–5 minute) tasks. Spec Kit's &lt;code&gt;/speckit.tasks&lt;/code&gt; produces exactly that shape. Skipping it forces Superpowers to break the work down at execution time, which is slower and lower quality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule 2 — Don't let Superpowers re-plan what Spec Kit already planned.&lt;/strong&gt;&lt;br&gt;
When you start step 7, explicitly say: &lt;em&gt;"The plan is already in &lt;code&gt;specs/&amp;lt;id&amp;gt;/tasks.md&lt;/code&gt;. Don't re-plan — execute."&lt;/em&gt; Otherwise Superpowers' &lt;code&gt;writing-plans&lt;/code&gt; skill may auto-fire and duplicate work.&lt;/p&gt;
&lt;h3&gt;
  
  
  📋 One-line prompt template for the execution handoff
&lt;/h3&gt;

&lt;p&gt;Paste this when you're ready to switch from Spec Kit (planning) to Superpowers (execution):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Execute specs/&amp;lt;feature-id&amp;gt;/tasks.md using the Superpowers workflow:
create a worktree, follow strict TDD per task, dispatch one subagent per
task, run code review at the end, then finish the branch. Do not re-plan —
the task list is the contract.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  📏 When to scale down (don't over-engineer small work)
&lt;/h3&gt;

&lt;p&gt;For a one-line bug fix or a typo, both frameworks are overkill. A reasonable size cutoff:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task size&lt;/th&gt;
&lt;th&gt;Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&amp;lt; 30 minutes, &amp;lt; 3 files&lt;/td&gt;
&lt;td&gt;Just prompt directly. Skip both.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30 min – 2 hours, single concern&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Superpowers only&lt;/strong&gt; — brainstorm + TDD + finish-branch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;gt; 2 hours, multi-component, or shipped to users&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Both&lt;/strong&gt; — full Spec Kit planning, then Superpowers execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anything regulated / audited&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Both, mandatory&lt;/strong&gt; — the spec trail is part of compliance&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  🚫 Anti-patterns to avoid
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;❌ &lt;strong&gt;Running &lt;code&gt;/speckit.implement&lt;/code&gt; AND Superpowers.&lt;/strong&gt; Pick one for execution. &lt;code&gt;/speckit.implement&lt;/code&gt; is Spec Kit's own executor; Superpowers replaces it for this combo.&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;Editing code without updating the spec.&lt;/strong&gt; If reality diverges from &lt;code&gt;spec.md&lt;/code&gt;, your audit trail dies. Re-run &lt;code&gt;/speckit.specify&lt;/code&gt; for the changed area.&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;Letting subagents read the whole &lt;code&gt;.specify/&lt;/code&gt; tree.&lt;/strong&gt; Pass them only the specific task they're executing — context discipline still matters.&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;Skipping the constitution.&lt;/strong&gt; Without it, Superpowers and Spec Kit each impose their own defaults and you'll feel the friction.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ✅ A 60-second mental checklist before starting any feature
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Is there a spec? If no → &lt;code&gt;/speckit.specify&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Are tasks small and ordered? If no → &lt;code&gt;/speckit.tasks&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Am I on a worktree with green tests? If no → trigger &lt;code&gt;using-git-worktrees&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Did I tell the agent "don't re-plan, execute"? If no → say it now.&lt;/li&gt;
&lt;li&gt;Will I review the PR diff myself before merging? If no → stop.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If all five are yes, you're using the combo correctly.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. 🧭 Quick Decision Guide
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;📘 Pick Spec Kit if you…&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Want specs as durable, reviewable artifacts&lt;/li&gt;
&lt;li&gt;Need cross-stack portability (regenerate same spec → different language)&lt;/li&gt;
&lt;li&gt;Work in an environment where PRDs/RFCs are already a norm&lt;/li&gt;
&lt;li&gt;Value broad agent compatibility (30+ tools)&lt;/li&gt;
&lt;li&gt;Want a GitHub-backed, enterprise-friendly project&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;🦸 Pick Superpowers if you…&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Want the agent itself to behave more like a senior engineer&lt;/li&gt;
&lt;li&gt;Need strict TDD, worktree isolation, subagent orchestration out of the box&lt;/li&gt;
&lt;li&gt;Run long, autonomous sessions and need guardrails&lt;/li&gt;
&lt;li&gt;Prefer auto-triggered skills over user-invoked commands&lt;/li&gt;
&lt;li&gt;Want a writable, composable skill system you can extend yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;🤝 Pick both if you…&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Want artifact-driven &lt;em&gt;planning&lt;/em&gt; + workflow-driven &lt;em&gt;execution&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Are willing to invest in setup for a more rigorous overall pipeline&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  11. 📊 At-a-Glance Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Spec Kit&lt;/th&gt;
&lt;th&gt;Superpowers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Owner&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GitHub&lt;/td&gt;
&lt;td&gt;Jesse Vincent / Prime Radiant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Methodology&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Spec-Driven Development&lt;/td&gt;
&lt;td&gt;Agentic SDLC w/ enforced workflow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Primary unit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Slash command + spec template&lt;/td&gt;
&lt;td&gt;Auto-triggered skill&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Trigger model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;User-invoked&lt;/td&gt;
&lt;td&gt;Context-matched&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Spec → plan → tasks → code&lt;/td&gt;
&lt;td&gt;Branch + tests + code + review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TDD enforcement&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Optional&lt;/td&gt;
&lt;td&gt;Mandatory (built-in skill)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Subagent orchestration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not core&lt;/td&gt;
&lt;td&gt;First-class&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Worktree management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not core&lt;/td&gt;
&lt;td&gt;First-class&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Constitution / governance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in (&lt;code&gt;/speckit.constitution&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Not core&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Stack swapping&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Strong (regen from spec)&lt;/td&gt;
&lt;td&gt;Weak (workflow is stack-agnostic but no regen)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent reach&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;30+ agents&lt;/td&gt;
&lt;td&gt;~6 first-class hosts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Install&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;uv tool install&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Plugin marketplace per host&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Extensibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Extensions + presets (catalog)&lt;/td&gt;
&lt;td&gt;Skills (author-it-yourself)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best fit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Greenfield, brownfield, regulated work&lt;/td&gt;
&lt;td&gt;Long autonomous sessions, parallel agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;License&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;Generated 2026-04-25. Both projects are evolving rapidly — verify version-specific details against their READMEs before adopting.&lt;/em&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;If you found this helpful, let me know by leaving a 👍 or a comment!, or if you think this post could help someone, feel free to share it! Thank you very much! 😃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>programming</category>
      <category>webdev</category>
    </item>
    <item>
      <title>🚀 Claude Code: From Zero to Hero 🤖</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Sat, 25 Apr 2026 09:26:55 +0000</pubDate>
      <link>https://dev.to/truongpx396/claude-code-from-zero-to-hero-1c4o</link>
      <guid>https://dev.to/truongpx396/claude-code-from-zero-to-hero-1c4o</guid>
      <description>&lt;p&gt;A practical, progressive guide to getting serious value out of Claude Code. Each section is short, actionable, and builds on the one before. If you already know the basics, skim the early parts and jump to where it gets interesting — the later sections are where most users leave value on the table.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;How to read this guide&lt;/strong&gt;&lt;br&gt;
Every section ends with a &lt;strong&gt;Try this now&lt;/strong&gt; box. Do the exercise before moving on. Reading alone will not make you faster; typing will.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Table of contents
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;🌱 Part 1 — Zero&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What Claude Code actually is&lt;/li&gt;
&lt;li&gt;Install and first run&lt;/li&gt;
&lt;li&gt;Your first real task&lt;/li&gt;
&lt;li&gt;The interface: commands and shortcuts you need on day one&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;🔧 Part 2 — Apprentice&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;CLAUDE.md — teaching Claude about your project&lt;/li&gt;
&lt;li&gt;Permissions — stop approving the same commands over and over&lt;/li&gt;
&lt;li&gt;Plan mode and review discipline&lt;/li&gt;
&lt;li&gt;Context hygiene — the skill nobody teaches&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;⚡ Part 3 — Journeyman&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Slash commands — your personal shortcuts&lt;/li&gt;
&lt;li&gt;Subagents — delegation and isolation&lt;/li&gt;
&lt;li&gt;Skills — reusable procedures&lt;/li&gt;
&lt;li&gt;Hooks — deterministic automation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;🏆 Part 4 — Hero&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;MCP servers — connecting external systems&lt;/li&gt;
&lt;li&gt;Prompting for real work&lt;/li&gt;
&lt;li&gt;Team setup — making Claude Code a shared asset&lt;/li&gt;
&lt;li&gt;Debugging Claude when it goes sideways&lt;/li&gt;
&lt;li&gt;The habits that separate heroes from tourists&lt;/li&gt;
&lt;/ol&gt;




&lt;h1&gt;
  
  
  🌱 Part 1 — Zero
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1. 🤖 What Claude Code actually is
&lt;/h2&gt;

&lt;p&gt;Claude Code is an AI coding agent that runs in your terminal (and inside VS Code, JetBrains, etc.). It reads your codebase, writes and edits files, runs shell commands, and iterates on tasks — all with your permission.&lt;/p&gt;

&lt;p&gt;Three mental models that will save you hours of confusion:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;It's a collaborator, not a compiler.&lt;/strong&gt; You describe intent; it proposes actions and waits for approval on anything risky. Give it context the way you'd brief a new teammate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It is stateless between sessions, but not between turns.&lt;/strong&gt; Close the terminal and it forgets the conversation. That's why project memory (&lt;code&gt;CLAUDE.md&lt;/code&gt;) exists — to carry the important stuff forward.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Every tool call is explicit.&lt;/strong&gt; Reading a file, editing it, running a shell command — these are all visible tool calls. If it's making changes you don't want, you can see exactly where.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What it is &lt;strong&gt;not&lt;/strong&gt;: a chat wrapper. Claude Code has a rich configuration surface — permissions, slash commands, subagents, skills, hooks, MCP servers — and the difference between a tourist and a power user is almost entirely in how those are set up.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. 📦 Install and first run
&lt;/h2&gt;

&lt;p&gt;Install the CLI (one command, no account setup beyond API access):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @anthropic-ai/claude-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start it inside a project directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/projects/my-repo
claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On first run it will walk you through authentication. After that, you're looking at an interactive prompt waiting for input.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Install Claude Code, &lt;code&gt;cd&lt;/code&gt; into any project you have, and run &lt;code&gt;claude&lt;/code&gt;. Type &lt;code&gt;/help&lt;/code&gt; and read the entire help output. Don't skip this — 30% of the things people ask me could be answered by reading &lt;code&gt;/help&lt;/code&gt; once.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  3. 🎯 Your first real task
&lt;/h2&gt;

&lt;p&gt;Don't start with "hello world." Start with something you'd actually ask a teammate.&lt;/p&gt;

&lt;p&gt;A good first prompt:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Read the &lt;code&gt;README.md&lt;/code&gt; and the top-level directory, then give me a two-paragraph summary of what this project does and how the pieces fit together.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Why this works: it's bounded, read-only, and forces Claude to explore the codebase — which is how you'll learn what tools it has.&lt;/p&gt;

&lt;p&gt;Watch what happens:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It calls &lt;code&gt;Read&lt;/code&gt; on &lt;code&gt;README.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;It runs &lt;code&gt;ls&lt;/code&gt; via &lt;code&gt;Bash&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Maybe it peeks at a few key files&lt;/li&gt;
&lt;li&gt;It answers in prose&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You'll see tool calls scroll by. Each one is a concrete action. This transparency is the whole point — you're not trusting a black box.&lt;/p&gt;

&lt;p&gt;Now try something active:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Add a &lt;code&gt;CHANGELOG.md&lt;/code&gt; at the repo root with a "Keep a Changelog" template and an &lt;code&gt;Unreleased&lt;/code&gt; section.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It will ask permission to write the file. Say yes. Look at the diff it produced. Ask it to change something trivial ("add a link to keepachangelog.com at the top"). Notice how it edits in place rather than re-writing the file.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Pick one real, small task in a real repo — not a toy — and ask Claude Code to do it end to end. Something like "write me a script that renames all &lt;code&gt;.jpg&lt;/code&gt; to &lt;code&gt;.png&lt;/code&gt;" or "refactor this function to use async/await." Review every tool call. Reject one thing you don't like and watch it adjust.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  4. ⌨️ The interface
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Key commands (type these at the prompt)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/help&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full command reference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/clear&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Wipe the conversation — start fresh without quitting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/model&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Switch between models (Opus / Sonnet / Haiku)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/init&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Bootstrap a &lt;code&gt;CLAUDE.md&lt;/code&gt; for the current project&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/status&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Show what mode you're in, token usage, etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/review&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Review the current branch's changes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/compact&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Manually compact conversation context to free room&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Esc&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Interrupt Claude mid-action&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Shift+Tab&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cycle through permission modes (normal → auto-accept → plan)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Permission modes (the single most useful thing to understand)
&lt;/h3&gt;

&lt;p&gt;Claude Code has three main modes you'll toggle between:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Normal&lt;/strong&gt; — Claude asks before every tool call that isn't pre-approved. Safe default.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-accept&lt;/strong&gt; (a.k.a. "YOLO mode" or "accept edits") — Claude runs without asking. Use when you trust the task and want speed. &lt;strong&gt;Never&lt;/strong&gt; use for destructive or networked operations on a repo you care about.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan mode&lt;/strong&gt; — Claude explores and reasons but cannot modify anything until you approve a plan. Use for anything non-trivial. This is the single biggest time-saver most people ignore.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Cycle them with &lt;code&gt;Shift+Tab&lt;/code&gt;. The current mode is shown in the status area.&lt;/p&gt;

&lt;h3&gt;
  
  
  Interrupting
&lt;/h3&gt;

&lt;p&gt;If Claude is going down the wrong path, hit &lt;code&gt;Esc&lt;/code&gt;. Don't wait for it to finish and then complain — redirect mid-flight. It handles interruption well.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Start a session, press &lt;code&gt;Shift+Tab&lt;/code&gt; three times and observe how the mode indicator changes. Then try the same trivial task ("add a blank line to the README") in each mode and feel the difference. Also press &lt;code&gt;Esc&lt;/code&gt; while it's mid-action at least once. Getting comfortable with these three keys is worth more than any config.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h1&gt;
  
  
  🔧 Part 2 — Apprentice
&lt;/h1&gt;

&lt;h2&gt;
  
  
  5. 📄 CLAUDE.md
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;CLAUDE.md&lt;/code&gt; is the single most valuable file in a Claude-Code-enabled repo. It's plain markdown placed at the repo root (and optionally in subdirectories) that Claude reads at the start of every session.&lt;/p&gt;

&lt;p&gt;Think of it as the onboarding doc for a new engineer, except the engineer reads it every single day and has perfect recall.&lt;/p&gt;

&lt;h3&gt;
  
  
  What to put in it
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Architecture&lt;/strong&gt; — the service boundaries, the data flow, who calls whom&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How to run things&lt;/strong&gt; — exact commands, copy-paste runnable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conventions that the code alone can't tell you&lt;/strong&gt; — "we prefer functional components," "all errors must be wrapped with &lt;code&gt;fmt.Errorf&lt;/code&gt;," etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pitfalls&lt;/strong&gt; — things that have burned the team before ("never edit an applied migration")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;End-to-end checklists&lt;/strong&gt; for common tasks ("adding a feature touches these 9 layers, in this order")&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What to leave out
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Anything &lt;code&gt;ls&lt;/code&gt; or &lt;code&gt;git log&lt;/code&gt; can answer&lt;/li&gt;
&lt;li&gt;Narrative prose, history, aspirations&lt;/li&gt;
&lt;li&gt;Secrets&lt;/li&gt;
&lt;li&gt;Step-by-step tutorials (link to them instead)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Layering
&lt;/h3&gt;

&lt;p&gt;You can have nested &lt;code&gt;CLAUDE.md&lt;/code&gt; files. When Claude works inside a subdirectory, the nested file is loaded too. Use this for service-specific conventions in a monorepo.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;repo/
  CLAUDE.md                  # root — cross-cutting context
  backend-go/CLAUDE.md       # Go-specific conventions
  backend-python/CLAUDE.md   # Python-specific conventions
  frontend/CLAUDE.md         # React/TS-specific conventions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There's also &lt;code&gt;~/.claude/CLAUDE.md&lt;/code&gt; — personal preferences applied in every project on your machine (e.g., "I prefer tabs over spaces in my personal scripts").&lt;/p&gt;

&lt;h3&gt;
  
  
  Rules of thumb
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Every line competes for context budget.&lt;/strong&gt; Treat it like a tweet — if removing a line wouldn't hurt, remove it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prefer commands over instructions.&lt;/strong&gt; &lt;code&gt;make test-integration&lt;/code&gt; beats a paragraph about integration tests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State constraints explicitly.&lt;/strong&gt; "Integration tests require Docker." not "We have integration tests."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Link, don't duplicate.&lt;/strong&gt; If your style guide lives elsewhere, link it.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
In your project, run &lt;code&gt;/init&lt;/code&gt;. Claude will scan the codebase and draft a &lt;code&gt;CLAUDE.md&lt;/code&gt; for you. Then read it critically and delete half the content. The short version is usually better. Commit it.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  6. 🔐 Permissions
&lt;/h2&gt;

&lt;p&gt;By default, Claude asks before running any shell command or editing any file. That's safe but interrupt-heavy.&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;.claude/settings.json&lt;/code&gt; in your repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"permissions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"allow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(go test:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(pnpm test:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(npm run build:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git status:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git diff:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git log:*)"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"deny"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(sudo:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(rm -rf /:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git push --force:*)"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pattern rules
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Bash(cmd:*)&lt;/code&gt; — allow &lt;code&gt;cmd&lt;/code&gt; followed by any arguments&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Bash(cmd arg1:*)&lt;/code&gt; — allow &lt;code&gt;cmd arg1&lt;/code&gt; with anything after&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Bash(cmd)&lt;/code&gt; — exact match only&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;deny&lt;/code&gt; always wins over &lt;code&gt;allow&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Personal vs team
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.claude/settings.json&lt;/code&gt; — committed, shared with the team&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.claude/settings.local.json&lt;/code&gt; — git-ignored, personal tweaks (your favorite CLIs, local env)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;~/.claude/settings.json&lt;/code&gt; — global, applies to every project on your machine&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The golden rule
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Never blanket-allow &lt;code&gt;Bash(*)&lt;/code&gt;.&lt;/strong&gt; The permission prompt is cheap; a destructive command you didn't mean to run is expensive. Keep the deny list tight and aggressive.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Look back at your last Claude Code session. Which commands did you approve three or more times? Those are your candidates for the allow list. Add them to &lt;code&gt;.claude/settings.json&lt;/code&gt; and commit. Do NOT add things you only approved once — premature allow-listing is how people accidentally yield too much.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  7. 🗺️ Plan mode
&lt;/h2&gt;

&lt;p&gt;For any task that is more than "rename this variable," press &lt;code&gt;Shift+Tab&lt;/code&gt; until you're in &lt;strong&gt;Plan mode&lt;/strong&gt; before you start. Then prompt:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Plan out what it would take to add rate limiting to the &lt;code&gt;/api/login&lt;/code&gt; endpoint. Don't make any changes — just propose the plan.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Claude will explore, read code, think, and produce a concrete plan (files to touch, order of operations, tradeoffs). You review the plan, ask follow-up questions, adjust the approach. When you're happy, you exit plan mode and tell it to execute.&lt;/p&gt;

&lt;p&gt;This pattern catches an enormous class of "Claude went off and did the wrong thing" problems. Five minutes of planning saves an hour of re-doing.&lt;/p&gt;

&lt;h3&gt;
  
  
  When to skip plan mode
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Trivial edits (typo fix, one-line change)&lt;/li&gt;
&lt;li&gt;Tasks where "wrong direction" has no cost (exploratory prototyping)&lt;/li&gt;
&lt;li&gt;When you're already in a detailed prompt-response loop and have alignment&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to always use plan mode
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Any change touching more than 3 files&lt;/li&gt;
&lt;li&gt;Any migration or refactor&lt;/li&gt;
&lt;li&gt;Any task involving external systems (APIs, databases, deploys)&lt;/li&gt;
&lt;li&gt;Anything where you'd review a PR carefully — if it needs review, it needs a plan first&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Take a real task on your backlog that you've been putting off. Enter plan mode. Ask for a plan. Read it critically. You'll find that either (a) the plan reveals the task is bigger than you thought — which is useful information — or (b) you're now ready to execute with confidence.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  8. 🧹 Context hygiene
&lt;/h2&gt;

&lt;p&gt;The single most underrated skill. Long conversations accumulate irrelevant context, which degrades Claude's attention and eats your token budget.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rules
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start a new session for an unrelated task.&lt;/strong&gt; Don't use the same conversation to fix a bug, add a feature, and then write release notes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;/clear&lt;/code&gt; aggressively.&lt;/strong&gt; When you shift focus, clear the conversation. The cost of re-explaining is usually less than the cost of carrying accumulated noise.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't paste huge logs.&lt;/strong&gt; If the full stack trace is 2000 lines, paste the top 20 and let Claude ask for more if needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delegate searches.&lt;/strong&gt; "Find every place X is used" is a job for a subagent (covered in Part 3), not your main conversation. A subagent can ingest a mountain of search output and return a short summary, keeping your main context clean.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;/compact&lt;/code&gt; when context fills up.&lt;/strong&gt; It summarizes history and continues with less bloat. Better than letting auto-compaction kick in mid-task.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Signs your context is polluted
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Claude makes the same small mistake it already corrected earlier&lt;/li&gt;
&lt;li&gt;It references files that have nothing to do with the current task&lt;/li&gt;
&lt;li&gt;It proposes solutions that contradict constraints you already set&lt;/li&gt;
&lt;li&gt;Responses feel generic and hedge-y&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you see these, &lt;code&gt;/clear&lt;/code&gt; is almost always the right move.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Next time you finish a task, before starting a new one, run &lt;code&gt;/clear&lt;/code&gt;. Build the muscle. Most people keep the conversation running forever and wonder why Claude gets worse over time.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h1&gt;
  
  
  ⚡ Part 3 — Journeyman
&lt;/h1&gt;

&lt;h2&gt;
  
  
  9. ⚡ Slash commands
&lt;/h2&gt;

&lt;p&gt;A slash command is a markdown file in &lt;code&gt;.claude/commands/&lt;/code&gt; that becomes an invocable shortcut. The file body becomes the prompt; &lt;code&gt;$ARGUMENTS&lt;/code&gt; is substituted with whatever you type after the command name.&lt;/p&gt;

&lt;h3&gt;
  
  
  Minimal example
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;.claude/commands/review.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Review the uncommitted changes on this branch.

Focus on:
&lt;span class="p"&gt;-&lt;/span&gt; Security issues (injection, XSS, unsafe deserialization)
&lt;span class="p"&gt;-&lt;/span&gt; API contract changes that would break callers
&lt;span class="p"&gt;-&lt;/span&gt; Missing tests for new code paths
&lt;span class="p"&gt;-&lt;/span&gt; Unused imports, dead code

Don't modify files — review only. Output a punch list.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Invoke with &lt;code&gt;/review&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  With arguments
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;.claude/commands/test.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Run the test suite for the specified service. Argument: $ARGUMENTS

Rules:
&lt;span class="p"&gt;-&lt;/span&gt; If argument is "go": &lt;span class="sb"&gt;`cd backend-go &amp;amp;&amp;amp; go test ./... -race -count=1`&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; If argument is "python": &lt;span class="sb"&gt;`cd backend-python &amp;amp;&amp;amp; uv run pytest -v`&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; If argument is "frontend": &lt;span class="sb"&gt;`cd frontend &amp;amp;&amp;amp; pnpm test`&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; If argument is "all" or empty: run all three, stop on first failure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Invoke with &lt;code&gt;/test go&lt;/code&gt; or &lt;code&gt;/test all&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Frontmatter (optional)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Open a PR for the current branch with auto-generated title and body&lt;/span&gt;
&lt;span class="na"&gt;argument-hint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[issue-number]"&lt;/span&gt;
&lt;span class="na"&gt;allowed-tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bash"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Read"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;opus&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

Steps:
&lt;span class="p"&gt;1.&lt;/span&gt; Ensure branch is pushed
&lt;span class="p"&gt;2.&lt;/span&gt; Read commits since diverging from main
&lt;span class="p"&gt;3.&lt;/span&gt; Draft a PR title and body based on the commits
&lt;span class="p"&gt;4.&lt;/span&gt; Run &lt;span class="sb"&gt;`gh pr create`&lt;/span&gt; with the drafted content
&lt;span class="p"&gt;5.&lt;/span&gt; If $ARGUMENTS is present, add "Closes #$ARGUMENTS" to the body
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Gotchas
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flat namespace.&lt;/strong&gt; &lt;code&gt;.claude/commands/git/commit.md&lt;/code&gt; becomes &lt;code&gt;/commit&lt;/code&gt;, not &lt;code&gt;/git:commit&lt;/code&gt;. Name files carefully.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No code execution in the markdown file itself&lt;/strong&gt; — the body is sent as a prompt. Claude then decides what to run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep them focused.&lt;/strong&gt; If a command tries to do too much, split it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Slash command vs skill vs subagent
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Use when&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Slash command&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;User explicitly triggers a workflow (&lt;code&gt;/test&lt;/code&gt;, &lt;code&gt;/lint&lt;/code&gt;, &lt;code&gt;/pr&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Skill&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reusable procedure Claude should apply when relevant (user doesn't type it)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Subagent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;You want an isolated context / fresh conversation for a big sub-task&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Identify the three prompts you most commonly type. Turn each into a slash command. You'll feel the productivity gain immediately.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  10. 🤝 Subagents
&lt;/h2&gt;

&lt;p&gt;Subagents are specialized Claude instances that run in a &lt;strong&gt;separate conversation&lt;/strong&gt; with their own system prompt and tool permissions. The main agent delegates work to them.&lt;/p&gt;

&lt;p&gt;Why this matters:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Context isolation.&lt;/strong&gt; The subagent doesn't see your main conversation. Great for independent reviews — it can't be biased by your prior reasoning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallelism.&lt;/strong&gt; The main agent can run multiple subagents concurrently. Perfect for "check A, B, and C independently."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool scoping.&lt;/strong&gt; A "reviewer" subagent can be given read-only tools, so it literally can't write files even if asked.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Defining a subagent
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;.claude/agents/migration-reviewer.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Review SQL migrations for safety. Use proactively whenever a migration file is added or modified.&lt;/span&gt;
&lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Read"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bash"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Grep"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;opus&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

You are a database migration safety reviewer. For any migration under &lt;span class="sb"&gt;`backend-go/migrations/`&lt;/span&gt;, check:
&lt;span class="p"&gt;
1.&lt;/span&gt; Does it have both Up and Down blocks?
&lt;span class="p"&gt;2.&lt;/span&gt; Are NOT NULL columns added with a default, or is there a backfill step?
&lt;span class="p"&gt;3.&lt;/span&gt; Are indexes created CONCURRENTLY on large tables?
&lt;span class="p"&gt;4.&lt;/span&gt; Is there anything irreversible (DROP COLUMN, DROP TABLE) that needs a rollback plan?

Report findings as a punch list. Do not modify files.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The main agent will now auto-invoke this whenever a migration changes (because of the "Use proactively" phrasing).&lt;/p&gt;

&lt;h3&gt;
  
  
  Built-in agent types
&lt;/h3&gt;

&lt;p&gt;Claude Code ships with a few built-in agent types you can use without defining your own:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;Explore&lt;/code&gt;&lt;/strong&gt; — fast codebase search, great for "find every place X is used"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;general-purpose&lt;/code&gt;&lt;/strong&gt; — the default, for any research task&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to use a subagent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Research/exploration&lt;/strong&gt; that will generate lots of output you don't want in your main context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Independent review&lt;/strong&gt; where you want an unbiased second opinion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallelizable work&lt;/strong&gt; — three questions that don't depend on each other&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When NOT to use a subagent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Small targeted tasks&lt;/strong&gt; — spawning a subagent has overhead; for a 30-second task, just do it yourself&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tasks where you want to stay in conversation&lt;/strong&gt; — the subagent returns one message and is gone; you can't iterate with it&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Next time you need to understand an unfamiliar area of your codebase, prompt the main agent: "Spawn an Explore subagent to find every place &lt;code&gt;authenticate()&lt;/code&gt; is called, what it returns, and who handles its errors. Give me a short report." Notice how your main context stays clean.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  11. 💡 Skills
&lt;/h2&gt;

&lt;p&gt;Skills are &lt;strong&gt;reusable procedures&lt;/strong&gt; that Claude loads when relevant. Unlike slash commands (user-triggered) or subagents (fresh-context delegation), skills layer procedural knowledge onto your current conversation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Directory layout
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.claude/skills/
  release-notes/
    SKILL.md          ← required
    template.md       ← optional supporting files
    scripts/
      extract.sh      ← optional helper scripts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  SKILL.md format
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Generate release notes from git log since the last tag. Invoke when the user asks for a changelog, release notes, or summary of recent changes.&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Release Notes Skill&lt;/span&gt;

When invoked:
&lt;span class="p"&gt;
1.&lt;/span&gt; Find the last tag: &lt;span class="sb"&gt;`git describe --tags --abbrev=0`&lt;/span&gt;
&lt;span class="p"&gt;2.&lt;/span&gt; Run &lt;span class="sb"&gt;`git log &amp;lt;tag&amp;gt;..HEAD --pretty=format:'%h %s (%an)' --no-merges`&lt;/span&gt;
&lt;span class="p"&gt;3.&lt;/span&gt; Group commits by conventional prefix (feat/fix/chore/docs/refactor)
&lt;span class="p"&gt;4.&lt;/span&gt; Format using &lt;span class="sb"&gt;`template.md`&lt;/span&gt; in this skill directory
&lt;span class="p"&gt;5.&lt;/span&gt; Output the final markdown — do not write to a file unless asked
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Auto-discovery
&lt;/h3&gt;

&lt;p&gt;Claude reads the &lt;code&gt;description&lt;/code&gt; of every available skill at session start. When your prompt matches a skill's description, it's invoked automatically.&lt;/p&gt;

&lt;p&gt;This is why the description matters more than anything else in the skill. Write it like a trigger: when should this fire?&lt;/p&gt;

&lt;p&gt;Bad: "Release notes generator."&lt;br&gt;
Good: "Generate release notes from commits since the last git tag. Invoke when the user asks for a changelog, release notes, or summary of changes since vX.Y.Z."&lt;/p&gt;
&lt;h3&gt;
  
  
  Skill vs slash command — the quick heuristic
&lt;/h3&gt;

&lt;p&gt;Ask: "Will the user explicitly type &lt;code&gt;/skill-name&lt;/code&gt;?"&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Yes → make it a slash command&lt;/li&gt;
&lt;li&gt;No, but Claude should apply it when a situation arises → make it a skill&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Built-in skills
&lt;/h3&gt;

&lt;p&gt;Claude Code ships with several built-in skills (update-config, simplify, pr, lint, test, gen, build, migrate, review, security-review, etc.). Run &lt;code&gt;/help&lt;/code&gt; to see what's currently registered.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Pick a procedure your team does routinely — releasing a version, writing a migration, auditing a feature — and turn it into a skill. Craft the description carefully so Claude auto-invokes it.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  12. 🪝 Hooks
&lt;/h2&gt;

&lt;p&gt;Hooks are shell commands the &lt;strong&gt;harness&lt;/strong&gt; (not Claude) runs in response to events. They run outside Claude's decision loop, which makes them the right tool for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Enforcing&lt;/strong&gt; something deterministically (auto-format after edit)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Injecting&lt;/strong&gt; context (show branch name on session start)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Blocking&lt;/strong&gt; unsafe actions (refuse writes to &lt;code&gt;/etc/&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A memory instruction like "always format Go files" is advisory — Claude might forget. A hook is a hard guarantee.&lt;/p&gt;
&lt;h3&gt;
  
  
  Shape
&lt;/h3&gt;

&lt;p&gt;Hooks live in &lt;code&gt;.claude/settings.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"&amp;lt;EventName&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;pattern&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;shell&amp;gt;"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Events you'll actually use
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Fires when&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PreToolUse&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Before a tool runs. Exit 2 blocks it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PostToolUse&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;After a tool succeeds. Good for auto-format.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;UserPromptSubmit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;After user sends a prompt. Can inject context.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SessionStart&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Session boot.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;End of Claude's turn.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Exit codes and I/O
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;0&lt;/code&gt; — success; stdout becomes context for Claude&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;2&lt;/code&gt; — &lt;strong&gt;blocking&lt;/strong&gt;; stderr is shown to Claude as the reason to retry/adjust&lt;/li&gt;
&lt;li&gt;other — non-blocking error; logged and ignored&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hooks receive the event payload as JSON on stdin. Parse with &lt;code&gt;jq&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.tool_input.file_path // empty'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example 1 — auto-format after edit
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PostToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Write|Edit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"file=$(jq -r '.tool_input.file_path // empty'); [[ &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;$file&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; == *.go ]] &amp;amp;&amp;amp; gofmt -w &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;$file&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;; [[ &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;$file&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; == *.py ]] &amp;amp;&amp;amp; ruff format &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;$file&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;; true"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example 2 — block writes to generated files
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PreToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Write|Edit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"file=$(jq -r '.tool_input.file_path // empty'); if [[ &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;$file&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; == *generated/* ]]; then echo 'Do not edit generated files — re-run code generation instead.' &amp;gt;&amp;amp;2; exit 2; fi"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example 3 — inject branch info at session start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"SessionStart"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"echo &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;Branch: $(git branch --show-current)&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  When hooks are the wrong tool
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;"Remind Claude to write tests" → put it in &lt;code&gt;CLAUDE.md&lt;/code&gt;, not a hook&lt;/li&gt;
&lt;li&gt;"Run tests only when I say" → slash command, not a hook&lt;/li&gt;
&lt;li&gt;"Apply procedure X in situation Y" → skill, not a hook&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hooks are for the things the &lt;strong&gt;harness must enforce regardless of what Claude decides&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔥 Try this now&lt;/strong&gt;&lt;br&gt;
Add a &lt;code&gt;PostToolUse&lt;/code&gt; hook that auto-formats any file Claude edits in your project's language. One concrete step that will improve every future session.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h1&gt;
  
  
  🏆 Part 4 — Hero
&lt;/h1&gt;

&lt;h2&gt;
  
  
  13. 🔌 MCP servers
&lt;/h2&gt;

&lt;p&gt;Model Context Protocol (MCP) servers extend Claude with external tools — Slack, Linear, Sentry, Postgres, your internal APIs. Configure them in &lt;code&gt;.claude/settings.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"postgres"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-postgres"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"postgres://dev:dev@localhost:5432/app"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Popular servers: &lt;code&gt;postgres&lt;/code&gt;, &lt;code&gt;github&lt;/code&gt;, &lt;code&gt;slack&lt;/code&gt;, &lt;code&gt;linear&lt;/code&gt;, &lt;code&gt;sentry&lt;/code&gt;, &lt;code&gt;puppeteer&lt;/code&gt;, &lt;code&gt;fetch&lt;/code&gt;. See the &lt;a href="https://github.com/modelcontextprotocol/servers" rel="noopener noreferrer"&gt;MCP server directory&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before adding an MCP server, ask:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;"Would a shell command work?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;psql&lt;/code&gt; invocation in your permission allowlist is simpler than a Postgres MCP server 90% of the time. MCP is right when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The tool requires stateful auth (OAuth flows, API keys that rotate)&lt;/li&gt;
&lt;li&gt;The output is richly structured (JSON APIs where Claude needs field-level access)&lt;/li&gt;
&lt;li&gt;You want Claude to call the tool across many turns without re-establishing connection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're adding an MCP server "because I saw one for X," check whether &lt;code&gt;curl&lt;/code&gt;, &lt;code&gt;gh&lt;/code&gt;, &lt;code&gt;psql&lt;/code&gt;, or &lt;code&gt;aws&lt;/code&gt; already solves the problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  14. ✍️ Prompting for real work
&lt;/h2&gt;

&lt;p&gt;Prompting advice is often generic. Here's what actually matters for Claude Code specifically:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Scope bounded, context explicit
&lt;/h3&gt;

&lt;p&gt;Bad: "Fix the tests."&lt;br&gt;
Good: "Fix the flaky test in &lt;code&gt;backend-go/internal/service/user_test.go&lt;/code&gt;. It fails about one run in ten. The last failure log is in &lt;code&gt;/tmp/test-output.log&lt;/code&gt;."&lt;/p&gt;

&lt;p&gt;The more of the context you provide up front, the less Claude has to spend tokens discovering.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. State constraints before action
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;"Don't change the public API."&lt;/li&gt;
&lt;li&gt;"Only modify files under &lt;code&gt;frontend/src/components/&lt;/code&gt;."&lt;/li&gt;
&lt;li&gt;"If the fix requires more than 50 lines of changes, stop and tell me — don't barrel through."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These constraints prevent whole classes of "this isn't what I wanted" outcomes.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Ask for a plan first on anything non-trivial
&lt;/h3&gt;

&lt;p&gt;Covered in section 7. Plan mode is free; re-doing work is expensive.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Be honest about your knowledge
&lt;/h3&gt;

&lt;p&gt;"I don't know this codebase well; help me understand X before we change it" produces better results than pretending to know.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Push back on mediocre output
&lt;/h3&gt;

&lt;p&gt;If Claude produces something that works but isn't good, say so: "That works, but it has three issues: (a) ..., (b) ..., (c) .... Please revise." Don't accept first-draft output on things that matter.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Use "one-shot" framing for tightly-scoped tasks
&lt;/h3&gt;

&lt;p&gt;"Write a &lt;code&gt;bash&lt;/code&gt; script that does X. One shot — no back-and-forth. Output the whole script." forces a more thorough first response than open-ended prompting.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. For debugging, state the hypothesis
&lt;/h3&gt;

&lt;p&gt;Bad: "Why is this broken?"&lt;br&gt;
Good: "I think this is broken because the deploy step doesn't wait for the DB migration to finish, but I'm not sure. Verify or refute that hypothesis first."&lt;/p&gt;

&lt;p&gt;Hypothesis-first debugging is 3× faster than open-ended investigation.&lt;/p&gt;




&lt;h2&gt;
  
  
  15. 👥 Team setup
&lt;/h2&gt;

&lt;p&gt;Taking Claude Code from "a tool each person uses" to "a shared team asset" is high-leverage. Here's the checklist:&lt;/p&gt;

&lt;h3&gt;
  
  
  Committed to the repo
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CLAUDE.md&lt;/code&gt; at root, per-service where useful&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.claude/settings.json&lt;/code&gt; — shared permissions and hooks&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.claude/commands/&lt;/code&gt; — team slash commands (&lt;code&gt;/test&lt;/code&gt;, &lt;code&gt;/lint&lt;/code&gt;, &lt;code&gt;/pr&lt;/code&gt;, &lt;code&gt;/review&lt;/code&gt;, &lt;code&gt;/deploy-staging&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.claude/agents/&lt;/code&gt; — team subagents (reviewers, explorers)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.claude/skills/&lt;/code&gt; — team procedures (release notes, migration review, on-call runbook)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Gitignored
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.claude/settings.local.json&lt;/code&gt; — personal overrides&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.env&lt;/code&gt; — secrets&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Onboarding new engineers
&lt;/h3&gt;

&lt;p&gt;Every new hire spends the first 30 minutes of day one:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reading &lt;code&gt;CLAUDE.md&lt;/code&gt; and every nested &lt;code&gt;CLAUDE.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Skimming &lt;code&gt;.claude/commands/&lt;/code&gt; — what shortcuts exist&lt;/li&gt;
&lt;li&gt;Copying &lt;code&gt;.env.example&lt;/code&gt; → &lt;code&gt;.env&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Running &lt;code&gt;make dev&lt;/code&gt; (or equivalent) to verify the stack boots&lt;/li&gt;
&lt;li&gt;Adding personal approvals to &lt;code&gt;.claude/settings.local.json&lt;/code&gt; (not the shared file)&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Keeping it fresh
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Review &lt;code&gt;CLAUDE.md&lt;/code&gt; quarterly.&lt;/strong&gt; Stale guidance is worse than no guidance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add a pitfall every time someone gets burned.&lt;/strong&gt; Post-incident: "update &lt;code&gt;CLAUDE.md&lt;/code&gt;?" should be on the checklist.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prune.&lt;/strong&gt; If a slash command isn't used in a month, delete it. If &lt;code&gt;CLAUDE.md&lt;/code&gt; has grown past ~300 lines, split it.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  16. 🐛 Debugging Claude
&lt;/h2&gt;

&lt;p&gt;When things go wrong, the problem is almost always one of these:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Symptom&lt;/th&gt;
&lt;th&gt;Likely cause&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude keeps asking to run a command I want pre-approved&lt;/td&gt;
&lt;td&gt;Not in &lt;code&gt;permissions.allow&lt;/code&gt;, or pattern too narrow&lt;/td&gt;
&lt;td&gt;Add &lt;code&gt;Bash(cmd:*)&lt;/code&gt; to &lt;code&gt;.claude/settings.json&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;My hook isn't firing&lt;/td&gt;
&lt;td&gt;Wrong event name, wrong matcher, or bad JSON&lt;/td&gt;
&lt;td&gt;Validate JSON; re-check event name; test the command in isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude edits files but skips formatting&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;PostToolUse&lt;/code&gt; hook crashed silently&lt;/td&gt;
&lt;td&gt;End hook command with `&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Slash command not found&lt;/td&gt;
&lt;td&gt;File naming collision (flat namespace)&lt;/td&gt;
&lt;td&gt;Rename file; put it directly in {% raw %}&lt;code&gt;.claude/commands/&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;~/.claude/CLAUDE.md&lt;/code&gt; isn't being read&lt;/td&gt;
&lt;td&gt;It is — but per-project &lt;code&gt;CLAUDE.md&lt;/code&gt; takes precedence for overlapping topics&lt;/td&gt;
&lt;td&gt;Split clearly: global in &lt;code&gt;~/.claude&lt;/code&gt;, project in repo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Subagent gives a different answer than main agent&lt;/td&gt;
&lt;td&gt;Expected — it has isolated context by design&lt;/td&gt;
&lt;td&gt;If you want shared context, don't use a subagent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude forgets between sessions&lt;/td&gt;
&lt;td&gt;Facts were only in chat, not in &lt;code&gt;CLAUDE.md&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Move enduring facts to &lt;code&gt;CLAUDE.md&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hook blocks every command unexpectedly&lt;/td&gt;
&lt;td&gt;Exit code 2 is a hard block&lt;/td&gt;
&lt;td&gt;Reserve exit 2 for real blocks; use 0 for non-blocking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude keeps making the same mistake&lt;/td&gt;
&lt;td&gt;Context is polluted with contradictory info&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;/clear&lt;/code&gt; and re-prompt with explicit constraints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Response is generic and useless&lt;/td&gt;
&lt;td&gt;Claude lacks context to answer well&lt;/td&gt;
&lt;td&gt;Give it a file to read, a command to run, or narrow the scope&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Using &lt;code&gt;--debug&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Run Claude Code with &lt;code&gt;--debug&lt;/code&gt; to see detailed logs — hook invocations, tool calls, timing. Essential when something silent is misbehaving.&lt;/p&gt;

&lt;h3&gt;
  
  
  The "read it back" trick
&lt;/h3&gt;

&lt;p&gt;When Claude produces output you disagree with, prompt: "Read back to me your understanding of what I asked for." Usually reveals the specific misinterpretation in seconds. Cheaper than re-arguing.&lt;/p&gt;




&lt;h2&gt;
  
  
  17. 💪 Habits
&lt;/h2&gt;

&lt;p&gt;A list of habits that separate the heroes from the tourists. None of them are secret. All of them are practiced, not learned.&lt;/p&gt;

&lt;h3&gt;
  
  
  ✅ Do
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Plan before you build.&lt;/strong&gt; For any non-trivial task, use plan mode or ask for a plan.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review every diff before accepting.&lt;/strong&gt; Claude is fast; a fast wrong change is still wrong.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clear context aggressively.&lt;/strong&gt; Start new sessions for unrelated work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update &lt;code&gt;CLAUDE.md&lt;/code&gt; when you get burned.&lt;/strong&gt; The next person (or your future self) shouldn't repeat the mistake.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use subagents for exploration.&lt;/strong&gt; Keep your main context clean.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Push back on mediocre output.&lt;/strong&gt; Say "that's not quite right because X; revise."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit your &lt;code&gt;.claude/&lt;/code&gt; configuration.&lt;/strong&gt; It's team infrastructure, not personal config.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep denies aggressive, allows minimal.&lt;/strong&gt; &lt;code&gt;sudo&lt;/code&gt;, &lt;code&gt;rm -rf /&lt;/code&gt;, &lt;code&gt;git push --force&lt;/code&gt; should always be blocked.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ❌ Don't
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Don't blanket-allow &lt;code&gt;Bash(*)&lt;/code&gt;.&lt;/strong&gt; The cost of permission prompts is much less than the cost of one accidental destructive command.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't paste 2000-line logs.&lt;/strong&gt; Paste the relevant 20 lines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't accept first-draft output on things that matter.&lt;/strong&gt; Iterate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't keep one session running forever.&lt;/strong&gt; Context rot is real.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't put secrets in &lt;code&gt;CLAUDE.md&lt;/code&gt; or the chat.&lt;/strong&gt; Ever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't skip hooks (&lt;code&gt;--no-verify&lt;/code&gt;) without a specific reason.&lt;/strong&gt; Fix the real issue.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't build elaborate configuration before you've used the defaults for a week.&lt;/strong&gt; Start simple. Add surfaces when you feel the pain.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📈 The progression
&lt;/h3&gt;

&lt;p&gt;You're a tourist if you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Type freeform prompts, approve everything one by one, never configure anything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You're an apprentice if you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Have a &lt;code&gt;CLAUDE.md&lt;/code&gt;, permission allow list, and use plan mode regularly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You're a journeyman if you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Have slash commands for your common workflows, use subagents for exploration, and have at least one hook enforcing a guarantee&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You're a hero if you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Treat &lt;code&gt;.claude/&lt;/code&gt; as team infrastructure, prune it regularly, and notice when you're using the wrong surface (skill when it should be a command, hook when it should be memory)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📚 Further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Official docs&lt;/strong&gt; — &lt;a href="https://docs.claude.com/en/docs/claude-code" rel="noopener noreferrer"&gt;https://docs.claude.com/en/docs/claude-code&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP server directory&lt;/strong&gt; — &lt;a href="https://github.com/modelcontextprotocol/servers" rel="noopener noreferrer"&gt;https://github.com/modelcontextprotocol/servers&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📋 One-page summary
&lt;/h2&gt;

&lt;p&gt;If you remember nothing else:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Put enduring facts in &lt;code&gt;CLAUDE.md&lt;/code&gt;. Keep it tight.&lt;/li&gt;
&lt;li&gt;Pre-approve your common commands in &lt;code&gt;.claude/settings.json&lt;/code&gt;. Never &lt;code&gt;Bash(*)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;plan mode&lt;/strong&gt; (&lt;code&gt;Shift+Tab&lt;/code&gt;) before any non-trivial task.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clear context&lt;/strong&gt; between unrelated tasks (&lt;code&gt;/clear&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Turn your repeat prompts into &lt;strong&gt;slash commands&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;subagents&lt;/strong&gt; for big searches and independent reviews.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;skills&lt;/strong&gt; for procedures Claude should apply automatically.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;hooks&lt;/strong&gt; when you need a hard guarantee, not a polite suggestion.&lt;/li&gt;
&lt;li&gt;Review every diff. Push back on mediocre output.&lt;/li&gt;
&lt;li&gt;Update configuration when you get burned — once, not twice.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;🚢 Happy shipping.&lt;/em&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;If you found this helpful, let me know by leaving a 👍 or a comment!, or if you think this post could help someone, feel free to share it! Thank you very much! 😃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>llm</category>
    </item>
    <item>
      <title>🤖 Learn Harness Engineering by Building a Mini Openclaw 🦞</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Sat, 25 Apr 2026 08:34:29 +0000</pubDate>
      <link>https://dev.to/truongpx396/learn-harness-engineering-by-building-a-mini-openclaw-bdm</link>
      <guid>https://dev.to/truongpx396/learn-harness-engineering-by-building-a-mini-openclaw-bdm</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🔀 Origin &amp;amp; Modifications&lt;/li&gt;
&lt;li&gt;🤔 What is this?&lt;/li&gt;
&lt;li&gt;🏗️ Architecture&lt;/li&gt;
&lt;li&gt;🔗 Section Dependencies&lt;/li&gt;
&lt;li&gt;⚡ Quick Start&lt;/li&gt;
&lt;li&gt;🗺️ Learning Path&lt;/li&gt;
&lt;li&gt;📋 Section Details&lt;/li&gt;
&lt;li&gt;📁 Repository Structure&lt;/li&gt;
&lt;li&gt;📦 Prerequisites&lt;/li&gt;
&lt;li&gt;🧩 Dependencies&lt;/li&gt;
&lt;li&gt;🔗 Related Projects&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Git repo: &lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-openclaw" rel="noopener noreferrer"&gt;truongpx396/learn-harness-engineering-by-building-mini-openclaw&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🔀 Origin &amp;amp; Modifications
&lt;/h2&gt;

&lt;p&gt;This repository is a fork of &lt;a href="https://github.com/shareAI-lab/claw0" rel="noopener noreferrer"&gt;shareAI-lab/claw0&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Changes made in this fork:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔄 &lt;strong&gt;SDK migration&lt;/strong&gt;: Migrated from the Anthropic SDK to the &lt;a href="https://github.com/openai/openai-python" rel="noopener noreferrer"&gt;OpenAI SDK&lt;/a&gt;, making all sections compatible with any OpenAI-compatible endpoint.&lt;/li&gt;
&lt;li&gt;🖥️ &lt;strong&gt;Local model support&lt;/strong&gt;: Added setup guides for running fully offline with &lt;a href="https://lmstudio.ai" rel="noopener noreferrer"&gt;LM Studio&lt;/a&gt;, &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt;, and &lt;a href="https://www.nomic.ai/gpt4all" rel="noopener noreferrer"&gt;GPT4All&lt;/a&gt; — no cloud API required.&lt;/li&gt;
&lt;li&gt;⚙️ &lt;strong&gt;&lt;code&gt;.env&lt;/code&gt;-based configuration&lt;/strong&gt;: Introduced &lt;code&gt;OPENAI_BASE_URL&lt;/code&gt; and &lt;code&gt;MODEL_ID&lt;/code&gt; environment variables so you can point any section at a different provider or local server without touching the code.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All credit for the original curriculum, architecture, and teaching approach goes to &lt;a href="https://github.com/shareAI-lab" rel="noopener noreferrer"&gt;shareAI-lab&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;🚀 &lt;strong&gt;From Zero to One: Build an AI Agent Gateway&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;10 progressive sections -- every section is a single, runnable Python file.&lt;br&gt;
code + docs co-located.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🤔 What is this?
&lt;/h2&gt;

&lt;p&gt;Most agent tutorials stop at "call an API once." This repository starts from that while loop and takes you all the way to a production-grade gateway.&lt;/p&gt;

&lt;p&gt;Build a minimal AI agent gateway from scratch, section by section. 10 sections, 10 core concepts, ~7,000 lines of Python. Each section introduces exactly one new idea while keeping all prior code intact. After all 10, you can read OpenClaw's production codebase with confidence.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;s01: Agent Loop           &lt;span class="nt"&gt;--&lt;/span&gt; The foundation: &lt;span class="k"&gt;while&lt;/span&gt; + finish_reason
s02: Tool Use             &lt;span class="nt"&gt;--&lt;/span&gt; Let the model call tools: dispatch table
s03: Sessions &amp;amp; Context   &lt;span class="nt"&gt;--&lt;/span&gt; Persist conversations, handle overflow
s04: Channels             &lt;span class="nt"&gt;--&lt;/span&gt; Telegram + Feishu: real channel pipelines
s05: Gateway &amp;amp; Routing    &lt;span class="nt"&gt;--&lt;/span&gt; 5-tier binding, session isolation
s06: Intelligence         &lt;span class="nt"&gt;--&lt;/span&gt; Soul, memory, skills, prompt assembly
s07: Heartbeat &amp;amp; Cron     &lt;span class="nt"&gt;--&lt;/span&gt; Proactive agent + scheduled tasks
s08: Delivery             &lt;span class="nt"&gt;--&lt;/span&gt; Reliable message queue with backoff
s09: Resilience           &lt;span class="nt"&gt;--&lt;/span&gt; 3-layer retry onion + auth profile rotation
s10: Concurrency          &lt;span class="nt"&gt;--&lt;/span&gt; Named lanes serialize the chaos
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🏗️ Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+--- agent layers ---+
|                                                     |
|  s10: Concurrency  (named lanes, generation track)  |
|  s09: Resilience   (auth rotation, overflow compact)|
|  s08: Delivery     (write-ahead queue, backoff)     |
|  s07: Heartbeat    (lane lock, cron scheduler)      |
|  s06: Intelligence (8-layer prompt, hybrid memory)  |
|  s05: Gateway      (WebSocket, 5-tier routing)      |
|  s04: Channels     (Telegram pipeline, Feishu hook) |
|  s03: Sessions     (JSONL persistence, 3-stage retry)|
|  s02: Tools        (dispatch table, 4 tools)        |
|  s01: Agent Loop   (while True + finish_reason)     |
|                                                     |
+-----------------------------------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🔗 Section Dependencies
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;s01 &lt;span class="nt"&gt;--&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; s02 &lt;span class="nt"&gt;--&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; s03 &lt;span class="nt"&gt;--&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; s04 &lt;span class="nt"&gt;--&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; s05
                 |               |
                 v               v
                s06 &lt;span class="nt"&gt;----------&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; s07 &lt;span class="nt"&gt;--&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; s08
                 |               |
                 v               v
                s09 &lt;span class="nt"&gt;----------&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; s10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;s01-s02: Foundation (no dependencies)&lt;/li&gt;
&lt;li&gt;s03: Builds on s02 (adds persistence to the tool loop)&lt;/li&gt;
&lt;li&gt;s04: Builds on s03 (channels produce InboundMessages for sessions)&lt;/li&gt;
&lt;li&gt;s05: Builds on s04 (routes channel messages to agents)&lt;/li&gt;
&lt;li&gt;s06: Builds on s03 (uses sessions for context, adds prompt layers)&lt;/li&gt;
&lt;li&gt;s07: Builds on s06 (heartbeat uses soul/memory for prompt)&lt;/li&gt;
&lt;li&gt;s08: Builds on s07 (heartbeat output flows through delivery queue)&lt;/li&gt;
&lt;li&gt;s09: Builds on s03+s06 (reuses ContextGuard for overflow, model config)&lt;/li&gt;
&lt;li&gt;s10: Builds on s07 (replaces single Lock with named lane system)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  ⚡ Quick Start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Clone and enter&lt;/span&gt;
git clone https://github.com/truongpx396/learn-harness-engineering-by-building-mini-openclaw &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;learn-harness-engineering-by-building-mini-openclaw

&lt;span class="c"&gt;# 2. Create and activate a virtual environment&lt;/span&gt;
python3 &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate        &lt;span class="c"&gt;# macOS / Linux&lt;/span&gt;
&lt;span class="c"&gt;# .venv\Scripts\activate         # Windows&lt;/span&gt;

&lt;span class="c"&gt;# 3. Install dependencies&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# 4. Configure&lt;/span&gt;
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
&lt;span class="c"&gt;# Edit .env: set OPENAI_API_KEY, MODEL_ID, and OPENAI_BASE_URL&lt;/span&gt;

&lt;span class="c"&gt;# 5. Run any section&lt;/span&gt;
python sessions/en/s01_agent_loop.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🗺️ Learning Path
&lt;/h2&gt;

&lt;p&gt;Each section adds exactly one new concept. All prior code stays intact:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Phase 1: FOUNDATION     Phase 2: CONNECTIVITY     Phase 3: BRAIN        Phase 4: AUTONOMY       Phase 5: PRODUCTION
+----------------+      +-------------------+     +-----------------+   +-----------------+   +-----------------+
| s01: Loop      |      | s03: Sessions     |     | s06: Intelligence|  | s07: Heartbeat  |   | s09: Resilience |
| s02: Tools     | ---&amp;gt; | s04: Channels     | --&amp;gt; |   soul, memory, | -&amp;gt;|   &amp;amp; Cron        |--&amp;gt;|   &amp;amp; Concurrency |
|                |      | s05: Gateway      |     |   skills, prompt |  | s08: Delivery   |   | s10: Lanes      |
+----------------+      +-------------------+     +-----------------+   +-----------------+   +-----------------+
 while + dispatch        persist + route            personality + recall  proactive + reliable  retry + serialize
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  📋 Section Details
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Section&lt;/th&gt;
&lt;th&gt;Core Concept&lt;/th&gt;
&lt;th&gt;Lines&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;01&lt;/td&gt;
&lt;td&gt;Agent Loop&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;while True&lt;/code&gt; + &lt;code&gt;finish_reason&lt;/code&gt; -- that's an agent&lt;/td&gt;
&lt;td&gt;~175&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;02&lt;/td&gt;
&lt;td&gt;Tool Use&lt;/td&gt;
&lt;td&gt;Tools = schema dict + handler map. Model picks a name, you look it up&lt;/td&gt;
&lt;td&gt;~445&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;03&lt;/td&gt;
&lt;td&gt;Sessions&lt;/td&gt;
&lt;td&gt;JSONL: append on write, replay on read. Too big? Summarize old parts&lt;/td&gt;
&lt;td&gt;~890&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;04&lt;/td&gt;
&lt;td&gt;Channels&lt;/td&gt;
&lt;td&gt;Every platform differs, but they all produce the same &lt;code&gt;InboundMessage&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;~780&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;05&lt;/td&gt;
&lt;td&gt;Gateway&lt;/td&gt;
&lt;td&gt;Binding table maps (channel, peer) to agent. Most specific wins&lt;/td&gt;
&lt;td&gt;~625&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;06&lt;/td&gt;
&lt;td&gt;Intelligence&lt;/td&gt;
&lt;td&gt;System prompt = files on disk. Swap files, change personality&lt;/td&gt;
&lt;td&gt;~750&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;07&lt;/td&gt;
&lt;td&gt;Heartbeat &amp;amp; Cron&lt;/td&gt;
&lt;td&gt;Timer thread: "should I run?" + queue work alongside user messages&lt;/td&gt;
&lt;td&gt;~660&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;08&lt;/td&gt;
&lt;td&gt;Delivery&lt;/td&gt;
&lt;td&gt;Write to disk first, then send. Crashes can't lose messages&lt;/td&gt;
&lt;td&gt;~870&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;09&lt;/td&gt;
&lt;td&gt;Resilience&lt;/td&gt;
&lt;td&gt;3-layer retry onion: auth rotation, overflow compaction, tool-use loop&lt;/td&gt;
&lt;td&gt;~1130&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Concurrency&lt;/td&gt;
&lt;td&gt;Named lanes with FIFO queues, generation tracking, Future-based results&lt;/td&gt;
&lt;td&gt;~900&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  📁 Repository Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;learn-harness-engineering-by-building-a-mini-openclaw/
  README.md              English README
  .env.example           Configuration template
  requirements.txt       Python dependencies
  sessions/              All teaching sessions (code + docs)
    en/                  English
      s01_agent_loop.py  s01_agent_loop.md
      s02_tool_use.py    s02_tool_use.md
      ...                (10 .py + 10 .md)
  workspace/             Shared workspace samples
    SOUL.md  IDENTITY.md  TOOLS.md  USER.md
    HEARTBEAT.md  BOOTSTRAP.md  AGENTS.md  MEMORY.md
    CRON.json
    skills/example-skill/SKILL.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  📦 Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.11+&lt;/li&gt;
&lt;li&gt;An OpenAI-compatible API key (e.g. GitHub Models, Azure OpenAI, or any provider)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  💻 Running Locally (no cloud API required)
&lt;/h3&gt;

&lt;p&gt;All agents speak the OpenAI chat-completions protocol. Any local server that exposes a compatible endpoint works out of the box — no GPU required, CPU-only inference is supported by all three options below.&lt;/p&gt;




&lt;h4&gt;
  
  
  Option A — 🖥️ LM Studio
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://lmstudio.ai" rel="noopener noreferrer"&gt;LM Studio&lt;/a&gt; provides a GUI for downloading and serving models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Install &amp;amp; load a model&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download and install &lt;a href="https://lmstudio.ai" rel="noopener noreferrer"&gt;LM Studio&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Discover&lt;/strong&gt; tab, search for a small instruction-tuned model.
Good CPU-friendly choices: &lt;code&gt;Qwen2.5-7B-Instruct&lt;/code&gt;, &lt;code&gt;Mistral-7B-Instruct&lt;/code&gt;, &lt;code&gt;Phi-3-mini&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Download&lt;/strong&gt; next to your chosen model.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Start the local server&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open the &lt;strong&gt;Developer&lt;/strong&gt; tab (&lt;code&gt;&amp;lt;/&amp;gt;&lt;/code&gt; icon in the left sidebar).&lt;/li&gt;
&lt;li&gt;Select your model from the dropdown and click &lt;strong&gt;Start Server&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;LM Studio listens at &lt;code&gt;http://localhost:1234/v1&lt;/code&gt;. Copy the model identifier shown (e.g. &lt;code&gt;lmstudio-community/Qwen2.5-7B-Instruct-GGUF&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Configure &lt;code&gt;.env&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;lm-studio        &lt;span class="c"&gt;# any non-empty string works&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:1234/v1
&lt;span class="nv"&gt;MODEL_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;lmstudio-community/Qwen2.5-7B-Instruct-GGUF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h4&gt;
  
  
  Option B — 🦙 Ollama
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; is a lightweight CLI that manages and serves models with a single command.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Install Ollama&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# macOS / Linux&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://ollama.com/install.sh | sh

&lt;span class="c"&gt;# Windows: download the installer from https://ollama.com/download&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Pull a model &amp;amp; start the server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen2.5:7b          &lt;span class="c"&gt;# or: mistral, phi3, llama3.2, gemma2:2b …&lt;/span&gt;
ollama serve                    &lt;span class="c"&gt;# starts at http://localhost:11434&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;If you ran &lt;code&gt;ollama pull&lt;/code&gt; without &lt;code&gt;ollama serve&lt;/code&gt;, the server is already running in the background — no extra step needed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;3. Configure &lt;code&gt;.env&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ollama           &lt;span class="c"&gt;# any non-empty string works&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:11434/v1
&lt;span class="nv"&gt;MODEL_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;qwen2.5:7b            &lt;span class="c"&gt;# must match the name you pulled&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h4&gt;
  
  
  Option C — 🌐 GPT4All
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://www.nomic.ai/gpt4all" rel="noopener noreferrer"&gt;GPT4All&lt;/a&gt; offers a desktop app with a built-in API server mode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Install GPT4All&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download and install the desktop app from &lt;a href="https://www.nomic.ai/gpt4all" rel="noopener noreferrer"&gt;nomic.ai/gpt4all&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Download a model &amp;amp; enable the API server&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to &lt;strong&gt;Models&lt;/strong&gt; → browse and download a model (e.g. &lt;code&gt;Mistral 7B Instruct&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Open &lt;strong&gt;Settings → API Server&lt;/strong&gt;, toggle &lt;strong&gt;Enable API Server&lt;/strong&gt; on.&lt;/li&gt;
&lt;li&gt;The server starts at &lt;code&gt;http://localhost:4891/v1&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Configure &lt;code&gt;.env&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;gpt4all          &lt;span class="c"&gt;# any non-empty string works&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:4891/v1
&lt;span class="nv"&gt;MODEL_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Mistral 7B Instruct   &lt;span class="c"&gt;# must match the model name shown in the app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;4. Run (same for all options)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python sessions/en/s01_agent_loop.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tips for CPU inference&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Under 8 GB RAM:&lt;/strong&gt; use 1.5B–3B models — e.g. &lt;code&gt;Qwen2.5-1.5B-Instruct&lt;/code&gt;, &lt;code&gt;Llama-3.2-1B-Instruct&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;8 GB–16 GB RAM:&lt;/strong&gt; use 4-bit quantized 7B–8B models — e.g. &lt;code&gt;Llama-3.1-8B-Instruct (Q4)&lt;/code&gt;, &lt;code&gt;Mistral-7B-Instruct (Q4)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;16 GB+ RAM:&lt;/strong&gt; standard 7B–13B models work well without extra quantization.&lt;/li&gt;
&lt;li&gt;Keep context length at 4096 or lower in your server settings to reduce RAM pressure.&lt;/li&gt;
&lt;li&gt;The agents already cap &lt;code&gt;max_tokens&lt;/code&gt; at 8096, so small models won't be overwhelmed.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🧩 Dependencies
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="err"&gt;openai&amp;gt;=1.0.0&lt;/span&gt;
&lt;span class="err"&gt;python-dotenv&amp;gt;=1.0.0&lt;/span&gt;
&lt;span class="err"&gt;websockets&amp;gt;=12.0&lt;/span&gt;
&lt;span class="err"&gt;croniter&amp;gt;=2.0.0&lt;/span&gt;
&lt;span class="err"&gt;python-telegram-bot&amp;gt;=21.0&lt;/span&gt;
&lt;span class="err"&gt;httpx&amp;gt;=0.27.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🔗 Related Projects
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/shareAI-lab/learn-claude-code" rel="noopener noreferrer"&gt;learn-claude-code&lt;/a&gt;&lt;/strong&gt; -- A companion teaching repo that builds an agent &lt;strong&gt;framework&lt;/strong&gt; (nano Claude Code) from scratch in 12 progressive sessions. Where learn-harness-engineering-by-building-a-mini-openclaw focuses on gateway routing, channels, and proactive behavior, learn-claude-code dives deep into the agent's internal design: structured planning (TodoManager + nag), context compression (3-layer compact), file-based task persistence with dependency graphs, team coordination (JSONL mailboxes, shutdown/plan-approval FSM), autonomous self-organization, and git worktree isolation for parallel execution. If you want to understand how a production-grade unit agent works inside, start there.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  👥 About
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fuser-attachments%2Fassets%2Ffe8b852b-97da-4061-a467-9694906b5edf" class="article-body-image-wrapper"&gt;&lt;img width="1280" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fuser-attachments%2Fassets%2Ffe8b852b-97da-4061-a467-9694906b5edf" height="1280"&gt;&lt;/a&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;Scan with Wechat to fellow us,&lt;br&gt;&lt;br&gt;
or fellow on X: &lt;a href="https://x.com/baicai003" rel="noopener noreferrer"&gt;shareAI-Lab&lt;/a&gt;  &lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;If you found this helpful, let me know by leaving a 👍 or a comment!, or if you think this post could help someone, feel free to share it! Thank you very much! 😃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>llm</category>
    </item>
    <item>
      <title>🤖 Learn Harness Engineering by Building a Mini Claude Code 💻</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Wed, 22 Apr 2026 06:03:42 +0000</pubDate>
      <link>https://dev.to/truongpx396/learn-harness-engineering-by-building-a-mini-claude-code-45a9</link>
      <guid>https://dev.to/truongpx396/learn-harness-engineering-by-building-a-mini-claude-code-45a9</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Git Repo:&lt;/strong&gt; &lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code" rel="noopener noreferrer"&gt;truongpx396/learn-harness-engineering-by-building-mini-claude-code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is a fork of &lt;a href="https://github.com/shareAI-lab/learn-claude-code" rel="noopener noreferrer"&gt;shareAI-lab/learn-claude-code&lt;/a&gt; with all agents migrated from the Anthropic SDK to the &lt;strong&gt;OpenAI SDK&lt;/strong&gt; (&lt;code&gt;openai&lt;/code&gt; Python package). All agents work with any OpenAI-compatible endpoint — cloud APIs (OpenAI, GitHub Models, Azure OpenAI) or local servers (LM Studio, Ollama, GPT4All).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
🧠 Agency Comes from the Model. An Agent Product = Model + Harness.

&lt;ul&gt;
&lt;li&gt;📜 Where Agency Comes From&lt;/li&gt;
&lt;li&gt;❌ What an Agent Is NOT&lt;/li&gt;
&lt;li&gt;💡 The Mind Shift: From "Developing Agents" to Developing Harness&lt;/li&gt;
&lt;li&gt;🔧 What Harness Engineers Actually Do&lt;/li&gt;
&lt;li&gt;🎓 Why Claude Code — A Masterclass in Harness Engineering&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;🌌 The Vision: Fill the Universe with Real Agents&lt;/li&gt;

&lt;li&gt;⚙️ The Core Pattern&lt;/li&gt;

&lt;li&gt;⚠️ Scope (Important)&lt;/li&gt;

&lt;li&gt;

🚀 Quick Start

&lt;ul&gt;
&lt;li&gt;💻 Running Locally (no cloud API required)&lt;/li&gt;
&lt;li&gt;Option A — 🖥️ LM Studio&lt;/li&gt;
&lt;li&gt;Option B — 🦙 Ollama&lt;/li&gt;
&lt;li&gt;Option C — 🌐 GPT4All&lt;/li&gt;
&lt;li&gt;🌍 Web Platform&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;🗺️ Learning Path&lt;/li&gt;

&lt;li&gt;🏗️ Architecture&lt;/li&gt;

&lt;li&gt;📖 Documentation&lt;/li&gt;

&lt;li&gt;🔭 What's Next — from understanding to shipping&lt;/li&gt;

&lt;li&gt;Sister Repo: from &lt;em&gt;on-demand sessions&lt;/em&gt; to &lt;em&gt;always-on assistant&lt;/em&gt;
&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧠 Agency Comes from the Model. An Agent Product = Model + Harness.
&lt;/h2&gt;

&lt;p&gt;Before we talk about code, let's get one thing straight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agency -- the ability to perceive, reason, and act -- comes from model training, not from external code orchestration.&lt;/strong&gt; But a working agent product needs both the model and the harness. The model is the driver, the harness is the vehicle. This repo teaches you how to build the vehicle.&lt;/p&gt;

&lt;h3&gt;
  
  
  📜 Where Agency Comes From
&lt;/h3&gt;

&lt;p&gt;At the core of every agent is a neural network -- a Transformer, an RNN, a learned function -- that has been trained, through billions of gradient updates on action-sequence data, to perceive an environment, reason about goals, and take actions. Agency is never granted by the surrounding code. It is learned by the model during training.&lt;/p&gt;

&lt;p&gt;Humans are the best example. A biological neural network shaped by millions of years of evolutionary training, perceiving the world through senses, reasoning through a brain, acting through a body. When DeepMind, OpenAI, or Anthropic say "agent," the core of what they mean is always the same thing: &lt;strong&gt;a model that has learned to act, plus the infrastructure that lets it operate in a specific environment.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The proof is written in history:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;2013 -- DeepMind DQN plays Atari.&lt;/strong&gt; A single neural network, receiving only raw pixels and game scores, learned to play 7 Atari 2600 games -- surpassing all prior algorithms and beating human experts on 3 of them. By 2015, the same architecture scaled to &lt;a href="https://www.nature.com/articles/nature14236" rel="noopener noreferrer"&gt;49 games and matched professional human testers&lt;/a&gt;, published in &lt;em&gt;Nature&lt;/em&gt;. No game-specific rules. No decision trees. One model, learning from experience. That model was the agent.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;2019 -- OpenAI Five conquers Dota 2.&lt;/strong&gt; Five neural networks, having played &lt;a href="https://openai.com/index/openai-five-defeats-dota-2-world-champions/" rel="noopener noreferrer"&gt;45,000 years of Dota 2&lt;/a&gt; against themselves in 10 months, defeated &lt;strong&gt;OG&lt;/strong&gt; -- the reigning TI8 world champions -- 2-0 on a San Francisco livestream. In a subsequent public arena, the AI won 99.4% of 42,729 games against all comers. No scripted strategies. No meta-programmed team coordination. The models learned teamwork, tactics, and real-time adaptation entirely through self-play.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;2019 -- DeepMind AlphaStar masters StarCraft II.&lt;/strong&gt; AlphaStar &lt;a href="https://deepmind.google/blog/alphastar-mastering-the-real-time-strategy-game-starcraft-ii/" rel="noopener noreferrer"&gt;beat professional players 10-1&lt;/a&gt; in a closed-door match, and later achieved &lt;a href="https://www.nature.com/articles/d41586-019-03298-6" rel="noopener noreferrer"&gt;Grandmaster status&lt;/a&gt; on European servers -- top 0.15% of 90,000 players. A game with imperfect information, real-time decisions, and a combinatorial action space that dwarfs chess and Go. The agent? A model. Trained. Not scripted.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;2019 -- Tencent Jueyu dominates Honor of Kings.&lt;/strong&gt; Tencent AI Lab's "Jueyu" &lt;a href="https://www.jiemian.com/article/3371171.html" rel="noopener noreferrer"&gt;defeated KPL professional players&lt;/a&gt; in a full 5v5 match at the World Champion Cup. In 1v1 mode, pros won only &lt;a href="https://developer.aliyun.com/article/851058" rel="noopener noreferrer"&gt;1 out of 15 games and never survived past 8 minutes&lt;/a&gt;. Training intensity: one day equaled 440 human years. By 2021, Jueyu surpassed KPL pros across the full hero pool. No handcrafted matchup tables. No scripted compositions. A model that learned the entire game from scratch through self-play.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;2024-2025 -- LLM agents reshape software engineering.&lt;/strong&gt; Claude, GPT, Gemini -- large language models trained on the entirety of human code and reasoning -- are deployed as coding agents. They read codebases, write implementations, debug failures, coordinate in teams. The architecture is identical to every agent before them: a trained model, placed in an environment, given tools to perceive and act. The only difference is the scale of what they've learned and the generality of the tasks they solve.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every one of these milestones points to the same fact: &lt;strong&gt;agency -- the ability to perceive, reason, and act -- is trained, not coded.&lt;/strong&gt; But every agent also needed an environment to operate in: the Atari emulator, the Dota 2 client, the StarCraft II engine, the IDE and terminal. The model provides intelligence. The environment provides the action space. Together they form a complete agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  ❌ What an Agent Is NOT
&lt;/h3&gt;

&lt;p&gt;The word "agent" has been hijacked by an entire cottage industry of prompt plumbing.&lt;/p&gt;

&lt;p&gt;Drag-and-drop workflow builders. No-code "AI agent" platforms. Prompt-chain orchestration libraries. They all share the same delusion: that wiring together LLM API calls with if-else branches, node graphs, and hardcoded routing logic constitutes "building an agent."&lt;/p&gt;

&lt;p&gt;It doesn't. What they build is a Rube Goldberg machine -- an over-engineered, brittle pipeline of procedural rules, with an LLM wedged in as a glorified text-completion node. That is not an agent. That is a shell script with delusions of grandeur.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt plumbing "agents" are the fantasy of programmers who don't train models.&lt;/strong&gt; They attempt to brute-force intelligence by stacking procedural logic -- massive rule trees, node graphs, chain-of-prompt waterfalls -- and praying that enough glue code will somehow emergently produce autonomous behavior. It won't. You cannot engineer your way to agency. Agency is learned, not programmed.&lt;/p&gt;

&lt;p&gt;Those systems are dead on arrival: fragile, unscalable, fundamentally incapable of generalization. They are the modern resurrection of GOFAI (Good Old-Fashioned AI) -- the symbolic rule systems the field abandoned decades ago, now spray-painted with an LLM veneer. Different packaging, same dead end.&lt;/p&gt;

&lt;h3&gt;
  
  
  💡 The Mind Shift: From "Developing Agents" to Developing Harness
&lt;/h3&gt;

&lt;p&gt;When someone says "I'm developing an agent," they can only mean one of two things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Training the model.&lt;/strong&gt; Adjusting weights through reinforcement learning, fine-tuning, RLHF, or other gradient-based methods. Collecting task-process data -- the actual sequences of perception, reasoning, and action in real domains -- and using it to shape the model's behavior. This is what DeepMind, OpenAI, Tencent AI Lab, and Anthropic do. This is agent development in the truest sense.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Building the harness.&lt;/strong&gt; Writing the code that gives the model an environment to operate in. This is what most of us do, and it is the focus of this repository.&lt;/p&gt;

&lt;p&gt;A harness is everything the agent needs to function in a specific domain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Harness = Tools + Knowledge + Observation + Action Interfaces + Permissions

    Tools:          file I/O, shell, network, database, browser
    Knowledge:      product docs, domain references, API specs, style guides
    Observation:    git diff, error logs, browser state, sensor data
    Action:         CLI commands, API calls, UI interactions
    Permissions:    sandboxing, approval workflows, trust boundaries
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model decides. The harness executes. The model reasons. The harness provides context. The model is the driver. The harness is the vehicle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A coding agent's harness is its IDE, terminal, and filesystem access.&lt;/strong&gt; A farm agent's harness is its sensor array, irrigation controls, and weather data feeds. A hotel agent's harness is its booking system, guest communication channels, and facility management APIs. The agent -- the intelligence, the decision-maker -- is always the model. The harness changes per domain. The agent generalizes across them.&lt;/p&gt;

&lt;p&gt;This repo teaches you to build vehicles. Vehicles for coding. But the design patterns generalize to any domain: farm management, hotel operations, manufacturing, logistics, healthcare, education, scientific research. Anywhere a task needs to be perceived, reasoned about, and acted upon -- an agent needs a harness.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔧 What Harness Engineers Actually Do
&lt;/h3&gt;

&lt;p&gt;If you are reading this repository, you are likely a harness engineer -- and that is a powerful thing to be. Here is your real job:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Implement tools.&lt;/strong&gt; Give the agent hands. File read/write, shell execution, API calls, browser control, database queries. Each tool is an action the agent can take in its environment. Design them to be atomic, composable, and well-described.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Curate knowledge.&lt;/strong&gt; Give the agent domain expertise. Product documentation, architectural decision records, style guides, regulatory requirements. Load them on-demand (s05), not upfront. The agent should know what's available and pull what it needs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Manage context.&lt;/strong&gt; Give the agent clean memory. Subagent isolation (s04) prevents noise from leaking. Context compression (s06) prevents history from overwhelming. Task systems (s07) persist goals beyond any single conversation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Control permissions.&lt;/strong&gt; Give the agent boundaries. Sandbox file access. Require approval for destructive operations. Enforce trust boundaries between the agent and external systems. This is where safety engineering meets harness engineering.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Collect task-process data.&lt;/strong&gt; Every action sequence the agent executes in your harness is training signal. The perception-reasoning-action traces from real deployments are the raw material for fine-tuning the next generation of agent models. Your harness doesn't just serve the agent -- it can help improve the agent.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You are not writing the intelligence. You are building the world the intelligence inhabits. The quality of that world -- how clearly the agent can perceive, how precisely it can act, how rich its available knowledge is -- directly determines how effectively the intelligence can express itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build great harnesses. The agent will do the rest.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🎓 Why Claude Code — A Masterclass in Harness Engineering
&lt;/h3&gt;

&lt;p&gt;Why does this repository dissect Claude Code specifically?&lt;/p&gt;

&lt;p&gt;Because Claude Code is the most elegant and fully-realized agent harness we have seen. Not because of any single clever trick, but because of what it &lt;em&gt;doesn't&lt;/em&gt; do: it doesn't try to be the agent. It doesn't impose rigid workflows. It doesn't second-guess the model with elaborate decision trees. It provides the model with tools, knowledge, context management, and permission boundaries -- then gets out of the way.&lt;/p&gt;

&lt;p&gt;Look at what Claude Code actually is, stripped to its essence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code = one agent loop
            + tools (bash, read, write, edit, glob, grep, browser...)
            + on-demand skill loading
            + context compression
            + subagent spawning
            + task system with dependency graph
            + team coordination with async mailboxes
            + worktree isolation for parallel execution
            + permission governance
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. That's the entire architecture. Every component is a harness mechanism -- a piece of the world built for the agent to inhabit. The agent itself? It's Claude. A model. Trained by Anthropic on the full breadth of human reasoning and code. The harness doesn't make Claude smart. Claude is already smart. The harness gives Claude hands, eyes, and a workspace.&lt;/p&gt;

&lt;p&gt;This is why Claude Code is the ideal teaching subject: &lt;strong&gt;it demonstrates what happens when you trust the model and focus your engineering on the harness.&lt;/strong&gt; Every session in this repository (s01-s12) reverse-engineers one harness mechanism from Claude Code's architecture. By the end, you understand not just how Claude Code works, but the universal principles of harness engineering that apply to any agent in any domain.&lt;/p&gt;

&lt;p&gt;The lesson is not "copy Claude Code." The lesson is: &lt;strong&gt;the best agent products are built by engineers who understand that their job is harness, not intelligence.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🌌 The Vision: Fill the Universe with Real Agents
&lt;/h2&gt;

&lt;p&gt;This is not just about coding agents.&lt;/p&gt;

&lt;p&gt;Every domain where humans perform complex, multi-step, judgment-intensive work is a domain where agents can operate -- given the right harness. The patterns in this repository are universal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Estate management agent    = model + property sensors + maintenance tools + tenant comms
Agricultural agent         = model + soil/weather data + irrigation controls + crop knowledge
Hotel operations agent     = model + booking system + guest channels + facility APIs
Medical research agent     = model + literature search + lab instruments + protocol docs
Manufacturing agent        = model + production line sensors + quality controls + logistics
Education agent            = model + curriculum knowledge + student progress + assessment tools
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The loop is always the same. The tools change. The knowledge changes. The permissions change. The agent -- the model -- generalizes.&lt;/p&gt;

&lt;p&gt;Every harness engineer reading this repository is learning patterns that apply far beyond software engineering. You are learning to build the infrastructure for an intelligent, automated future. Every well-designed harness deployed in a real domain is one more place where an agent can perceive, reason, and act.&lt;/p&gt;

&lt;p&gt;First we fill the workshops. Then the farms, the hospitals, the factories. Then the cities. Then the planet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bash is all you need. Real agents are all the universe needs.&lt;/strong&gt;&lt;/p&gt;






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    THE AGENT PATTERN
                    =================

    User --&amp;gt; messages[] --&amp;gt; LLM --&amp;gt; response
                                      |
                            msg.tool_calls?
                           /              \
                         yes               no
                          |                 |
                    execute tools       return text
                    append results
                    loop back -------&amp;gt; messages[]


    That's the minimal loop. Every AI agent needs this loop.
    The MODEL decides when to call tools and when to stop.
    The CODE just executes what the model asks for.
    This repo teaches you to build what surrounds this loop --
    the harness that makes the agent effective in a specific domain.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;12 progressive sessions, from a simple loop to isolated autonomous execution.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Each session adds one harness mechanism. Each mechanism has one motto.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔁 &lt;strong&gt;s01&lt;/strong&gt;   &lt;em&gt;"One loop &amp;amp; Bash is all you need"&lt;/em&gt; — one tool + one loop = an agent&lt;/p&gt;

&lt;p&gt;🛠️ &lt;strong&gt;s02&lt;/strong&gt;   &lt;em&gt;"Adding a tool means adding one handler"&lt;/em&gt; — the loop stays the same; new tools register into the dispatch map&lt;/p&gt;

&lt;p&gt;📋 &lt;strong&gt;s03&lt;/strong&gt;   &lt;em&gt;"An agent without a plan drifts"&lt;/em&gt; — list the steps first, then execute; completion doubles&lt;/p&gt;

&lt;p&gt;🪄 &lt;strong&gt;s04&lt;/strong&gt;   &lt;em&gt;"Break big tasks down; each subtask gets a clean context"&lt;/em&gt; — subagents use independent messages[], keeping the main conversation clean&lt;/p&gt;

&lt;p&gt;📚 &lt;strong&gt;s05&lt;/strong&gt;   &lt;em&gt;"Load knowledge when you need it, not upfront"&lt;/em&gt; — inject via tool_result, not the system prompt&lt;/p&gt;

&lt;p&gt;🗜️ &lt;strong&gt;s06&lt;/strong&gt;   &lt;em&gt;"Context will fill up; you need a way to make room"&lt;/em&gt; — three-layer compression strategy for infinite sessions&lt;/p&gt;

&lt;p&gt;📁 &lt;strong&gt;s07&lt;/strong&gt;   &lt;em&gt;"Break big goals into small tasks, order them, persist to disk"&lt;/em&gt; — a file-based task graph with dependencies, laying the foundation for multi-agent collaboration&lt;/p&gt;

&lt;p&gt;⚡ &lt;strong&gt;s08&lt;/strong&gt;   &lt;em&gt;"Run slow operations in the background; the agent keeps thinking"&lt;/em&gt; — daemon threads run commands, inject notifications on completion&lt;/p&gt;

&lt;p&gt;👥 &lt;strong&gt;s09&lt;/strong&gt;   &lt;em&gt;"When the task is too big for one, delegate to teammates"&lt;/em&gt; — persistent teammates + async mailboxes&lt;/p&gt;

&lt;p&gt;📡 &lt;strong&gt;s10&lt;/strong&gt;   &lt;em&gt;"Teammates need shared communication rules"&lt;/em&gt; — one request-response pattern drives all negotiation&lt;/p&gt;

&lt;p&gt;🤝 &lt;strong&gt;s11&lt;/strong&gt;   &lt;em&gt;"Teammates scan the board and claim tasks themselves"&lt;/em&gt; — no need for the lead to assign each one&lt;/p&gt;

&lt;p&gt;🌿 &lt;strong&gt;s12&lt;/strong&gt;   &lt;em&gt;"Each works in its own directory, no interference"&lt;/em&gt; — tasks manage goals, worktrees manage directories, bound by ID&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  ⚙️ The Core Pattern
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;agent_loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SYSTEM&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;max_completion_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_calls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TOOL_HANDLERS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                             &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_call_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                             &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Every session layers one harness mechanism on top of this loop -- without changing the loop itself. The loop belongs to the agent. The mechanisms belong to the harness.&lt;/p&gt;
&lt;h2&gt;
  
  
  ⚠️ Scope (Important)
&lt;/h2&gt;

&lt;p&gt;This repository is a 0-&amp;gt;1 learning project for harness engineering -- building the environment that surrounds an agent model.&lt;br&gt;
It intentionally simplifies or omits several production mechanisms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full event/hook buses (for example PreToolUse, SessionStart/End, ConfigChange).
s12 includes only a minimal append-only lifecycle event stream for teaching.&lt;/li&gt;
&lt;li&gt;Rule-based permission governance and trust workflows&lt;/li&gt;
&lt;li&gt;Session lifecycle controls (resume/fork) and advanced worktree lifecycle controls&lt;/li&gt;
&lt;li&gt;Full MCP runtime details (transport/OAuth/resource subscribe/polling)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Treat the team JSONL mailbox protocol in this repo as a teaching implementation, not a claim about any specific production internals.&lt;/p&gt;
&lt;h2&gt;
  
  
  🚀 Quick Start
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code
&lt;span class="nb"&gt;cd &lt;/span&gt;learn-harness-engineering-by-building-mini-claude-code
python &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate          &lt;span class="c"&gt;# Windows: .venv\Scripts\activate&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env               &lt;span class="c"&gt;# Edit .env with your OPENAI_API_KEY&lt;/span&gt;

python agents/s01_agent_loop.py       &lt;span class="c"&gt;# Start here&lt;/span&gt;
python agents/s12_worktree_task_isolation.py  &lt;span class="c"&gt;# Full progression endpoint&lt;/span&gt;
python agents/s_full.py               &lt;span class="c"&gt;# Capstone: all mechanisms combined&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  💻 Running Locally (no cloud API required)
&lt;/h3&gt;

&lt;p&gt;All agents speak the OpenAI chat-completions protocol. Any local server that exposes a compatible endpoint works out of the box — no GPU required, CPU-only inference is supported by all three options below.&lt;/p&gt;


&lt;h4&gt;
  
  
  Option A — 🖥️ LM Studio
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://lmstudio.ai" rel="noopener noreferrer"&gt;LM Studio&lt;/a&gt; provides a GUI for downloading and serving models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Install &amp;amp; load a model&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download and install &lt;a href="https://lmstudio.ai" rel="noopener noreferrer"&gt;LM Studio&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Discover&lt;/strong&gt; tab, search for a small instruction-tuned model.
Good CPU-friendly choices: &lt;code&gt;Qwen2.5-7B-Instruct&lt;/code&gt;, &lt;code&gt;Mistral-7B-Instruct&lt;/code&gt;, &lt;code&gt;Phi-3-mini&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Download&lt;/strong&gt; next to your chosen model.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Start the local server&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open the &lt;strong&gt;Developer&lt;/strong&gt; tab (&lt;code&gt;&amp;lt;/&amp;gt;&lt;/code&gt; icon in the left sidebar).&lt;/li&gt;
&lt;li&gt;Select your model from the dropdown and click &lt;strong&gt;Start Server&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;LM Studio listens at &lt;code&gt;http://localhost:1234/v1&lt;/code&gt;. Copy the model identifier shown (e.g. &lt;code&gt;lmstudio-community/Qwen2.5-7B-Instruct-GGUF&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Configure &lt;code&gt;.env&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;lm-studio        &lt;span class="c"&gt;# any non-empty string works&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:1234/v1
&lt;span class="nv"&gt;MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;lmstudio-community/Qwen2.5-7B-Instruct-GGUF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h4&gt;
  
  
  Option B — 🦙 Ollama
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; is a lightweight CLI that manages and serves models with a single command.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Install Ollama&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# macOS / Linux&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://ollama.com/install.sh | sh

&lt;span class="c"&gt;# Windows: download the installer from https://ollama.com/download&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Pull a model &amp;amp; start the server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen2.5:7b          &lt;span class="c"&gt;# or: mistral, phi3, llama3.2, gemma2:2b …&lt;/span&gt;
ollama serve                    &lt;span class="c"&gt;# starts at http://localhost:11434&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;If you ran &lt;code&gt;ollama pull&lt;/code&gt; without &lt;code&gt;ollama serve&lt;/code&gt;, the server is already running in the background — no extra step needed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;3. Configure &lt;code&gt;.env&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ollama           &lt;span class="c"&gt;# any non-empty string works&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:11434/v1
&lt;span class="nv"&gt;MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;qwen2.5:7b                &lt;span class="c"&gt;# must match the name you pulled&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h4&gt;
  
  
  Option C — 🌐 GPT4All
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://www.nomic.ai/gpt4all" rel="noopener noreferrer"&gt;GPT4All&lt;/a&gt; offers a desktop app with a built-in API server mode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Install GPT4All&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download and install the desktop app from &lt;a href="https://www.nomic.ai/gpt4all" rel="noopener noreferrer"&gt;nomic.ai/gpt4all&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Download a model &amp;amp; enable the API server&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to &lt;strong&gt;Models&lt;/strong&gt; → browse and download a model (e.g. &lt;code&gt;Mistral 7B Instruct&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Open &lt;strong&gt;Settings → API Server&lt;/strong&gt;, toggle &lt;strong&gt;Enable API Server&lt;/strong&gt; on.&lt;/li&gt;
&lt;li&gt;The server starts at &lt;code&gt;http://localhost:4891/v1&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Configure &lt;code&gt;.env&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;gpt4all          &lt;span class="c"&gt;# any non-empty string works&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:4891/v1
&lt;span class="nv"&gt;MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Mistral 7B Instruct       &lt;span class="c"&gt;# must match the model name shown in the app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;4. Run (same for all options)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python agents/s01_agent_loop.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tips for CPU inference&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Under 8 GB RAM:&lt;/strong&gt; use 1.5B–3B models — e.g. &lt;code&gt;Qwen2.5-1.5B-Instruct&lt;/code&gt;, &lt;code&gt;Llama-3.2-1B-Instruct&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;8 GB–16 GB RAM:&lt;/strong&gt; use 4-bit quantized 7B–8B models — e.g. &lt;code&gt;Llama-3.1-8B-Instruct (Q4)&lt;/code&gt;, &lt;code&gt;Mistral-7B-Instruct (Q4)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;16 GB+ RAM:&lt;/strong&gt; standard 7B–13B models work well without extra quantization.&lt;/li&gt;
&lt;li&gt;Keep context length at 4096 or lower in your server settings to reduce RAM pressure.&lt;/li&gt;
&lt;li&gt;The agents already cap &lt;code&gt;max_completion_tokens&lt;/code&gt; at 4096, which is compatible with all small models.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  🌍 Web Platform
&lt;/h3&gt;

&lt;p&gt;Interactive visualizations, step-through diagrams, source viewer, and documentation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;web &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm run dev   &lt;span class="c"&gt;# http://localhost:3000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🗺️ Learning Path
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Phase 1: THE LOOP                    Phase 2: PLANNING &amp;amp; KNOWLEDGE
==================                   ==============================
s01  The Agent Loop          [1]     s03  TodoWrite               [5]
     while + tool_calls check            TodoManager + nag reminder
     |                                    |
     +-&amp;gt; s02  Tool Use            [4]     s04  Subagents            [5]
              dispatch map: name-&amp;gt;handler     fresh messages[] per child
                                              |
                                         s05  Skills               [5]
                                              SKILL.md via tool_result
                                              |
                                         s06  Context Compact      [5]
                                              3-layer compression

Phase 3: PERSISTENCE                 Phase 4: TEAMS
==================                   =====================
s07  Tasks                   [8]     s09  Agent Teams             [9]
     file-based CRUD + deps graph         teammates + JSONL mailboxes
     |                                    |
s08  Background Tasks        [6]     s10  Team Protocols          [12]
     daemon threads + notify queue        shutdown + plan approval FSM
                                          |
                                     s11  Autonomous Agents       [14]
                                          idle cycle + auto-claim
                                     |
                                     s12  Worktree Isolation      [16]
                                          task coordination + optional isolated execution lanes

                                     [N] = number of tools
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🏗️ Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;learn-harness-engineering-by-building-mini-claude-code/&lt;/span&gt;
&lt;span class="pi"&gt;|&lt;/span&gt;
&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="s"&gt;-- agents/                        # Python reference implementations (s01-s12 + s_full capstone)&lt;/span&gt;
&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="s"&gt;-- docs/{en}/               # Mental-model-first documentation (3 languages)&lt;/span&gt;
&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="s"&gt;-- web/                           # Interactive learning platform (Next.js)&lt;/span&gt;
&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="s"&gt;-- skills/                        # Skill files for s05&lt;/span&gt;
&lt;span class="err"&gt;+&lt;/span&gt;&lt;span class="s"&gt;-- .github/workflows/ci.yml      # CI: typecheck + build&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  📖 Documentation
&lt;/h2&gt;

&lt;p&gt;Mental-model-first: problem, solution, ASCII diagram, minimal code.&lt;br&gt;
Available in &lt;a href="//./docs/en/"&gt;English&lt;/a&gt; &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Session&lt;/th&gt;
&lt;th&gt;Topic&lt;/th&gt;
&lt;th&gt;Motto&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s01-the-agent-loop.md" rel="noopener noreferrer"&gt;s01&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;The Agent Loop&lt;/td&gt;
&lt;td&gt;&lt;em&gt;One loop &amp;amp; Bash is all you need&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s02-tool-use.md" rel="noopener noreferrer"&gt;s02&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Tool Use&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Adding a tool means adding one handler&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s03-todo-write.md" rel="noopener noreferrer"&gt;s03&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;TodoWrite&lt;/td&gt;
&lt;td&gt;&lt;em&gt;An agent without a plan drifts&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s04-subagent.md" rel="noopener noreferrer"&gt;s04&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Subagents&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Break big tasks down; each subtask gets a clean context&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s05-skill-loading.md" rel="noopener noreferrer"&gt;s05&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Skills&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Load knowledge when you need it, not upfront&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s06-context-compact.md" rel="noopener noreferrer"&gt;s06&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Context Compact&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Context will fill up; you need a way to make room&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s07-task-system.md" rel="noopener noreferrer"&gt;s07&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Tasks&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Break big goals into small tasks, order them, persist to disk&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s08-background-tasks.md" rel="noopener noreferrer"&gt;s08&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Background Tasks&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Run slow operations in the background; the agent keeps thinking&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s09-agent-teams.md" rel="noopener noreferrer"&gt;s09&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Agent Teams&lt;/td&gt;
&lt;td&gt;&lt;em&gt;When the task is too big for one, delegate to teammates&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s10-team-protocols.md" rel="noopener noreferrer"&gt;s10&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Team Protocols&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Teammates need shared communication rules&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s11-autonomous-agents.md" rel="noopener noreferrer"&gt;s11&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Autonomous Agents&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Teammates scan the board and claim tasks themselves&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/truongpx396/learn-harness-engineering-by-building-mini-claude-code/blob/main/docs/en/s12-worktree-task-isolation.md" rel="noopener noreferrer"&gt;s12&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Worktree + Task Isolation&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Each works in its own directory, no interference&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  🔭 What's Next — from understanding to shipping
&lt;/h2&gt;

&lt;p&gt;After the 12 sessions you understand how harness engineering works inside out. Two ways to put that knowledge to work:&lt;/p&gt;
&lt;h3&gt;
  
  
  Kode Agent CLI -- Open-Source Coding Agent CLI
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;npm i -g @shareai-lab/kode&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Skill &amp;amp; LSP support, Windows-ready, pluggable with GLM / MiniMax / DeepSeek and other open models. Install and go.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;strong&gt;&lt;a href="https://github.com/shareAI-lab/Kode-cli" rel="noopener noreferrer"&gt;shareAI-lab/Kode-cli&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Kode Agent SDK -- Embed Agent Capabilities in Your App
&lt;/h3&gt;

&lt;p&gt;The official Claude Code Agent SDK communicates with a full CLI process under the hood -- each concurrent user means a separate terminal process. Kode SDK is a standalone library with no per-user process overhead, embeddable in backends, browser extensions, embedded devices, or any runtime.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;strong&gt;&lt;a href="https://github.com/shareAI-lab/Kode-agent-sdk" rel="noopener noreferrer"&gt;shareAI-lab/Kode-agent-sdk&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Sister Repo: from &lt;em&gt;on-demand sessions&lt;/em&gt; to &lt;em&gt;always-on assistant&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;The harness this repo teaches is &lt;strong&gt;use-and-discard&lt;/strong&gt; -- open a terminal, give the agent a task, close when done, next session starts blank. That is the Claude Code model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt; proved another possibility: on top of the same agent core, two harness mechanisms turn the agent from "poke it to make it move" into "it wakes up every 30 seconds to look for work":&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Heartbeat&lt;/strong&gt; -- every 30s the harness sends the agent a message to check if there is anything to do. Nothing? Go back to sleep. Something? Act immediately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cron&lt;/strong&gt; -- the agent can schedule its own future tasks, executed automatically when the time comes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Add multi-channel IM routing (WhatsApp / Telegram / Slack / Discord, 13+ platforms), persistent context memory, and a Soul personality system, and the agent goes from a disposable tool to an always-on personal AI assistant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/shareAI-lab/claw0" rel="noopener noreferrer"&gt;claw0&lt;/a&gt;&lt;/strong&gt; is our companion teaching repo that deconstructs these harness mechanisms from scratch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;claw agent = agent core + heartbeat + cron + IM chat + memory + soul
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;learn-claude-code                   claw0
(agent harness core:                (proactive always-on harness:
 loop, tools, planning,              heartbeat, cron, IM channels,
 teams, worktree isolation)          memory, soul personality)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  About
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fuser-attachments%2Fassets%2Ffe8b852b-97da-4061-a467-9694906b5edf" class="article-body-image-wrapper"&gt;&lt;img width="1280" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fuser-attachments%2Fassets%2Ffe8b852b-97da-4061-a467-9694906b5edf" height="1280"&gt;&lt;/a&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;Scan with WeChat to follow us,&lt;br&gt;
or follow on X: &lt;a href="https://x.com/baicai003" rel="noopener noreferrer"&gt;shareAI-Lab&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Agency comes from the model. The harness makes agency real. Build great harnesses. The model will do the rest.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bash is all you need. Real agents are all the universe needs.&lt;/strong&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;If you found this helpful, let me know by leaving a 👍 or a comment!, or if you think this post could help someone, feel free to share it! Thank you very much! 😃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>🛠️ Harness Engineering — Quick Actionable Guide 🤖</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Mon, 20 Apr 2026 07:09:25 +0000</pubDate>
      <link>https://dev.to/truongpx396/harness-engineering-quick-actionable-guide-2b93</link>
      <guid>https://dev.to/truongpx396/harness-engineering-quick-actionable-guide-2b93</guid>
      <description>&lt;p&gt;&lt;em&gt;Distilled from the &lt;a href="https://github.com/walkinglabs/learn-harness-engineering" rel="noopener noreferrer"&gt;Learn Harness Engineering&lt;/a&gt; course by WalkingLabs, which synthesizes harness engineering theory and practice from &lt;a href="https://openai.com/index/harness-engineering/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;, &lt;a href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents" rel="noopener noreferrer"&gt;Anthropic&lt;/a&gt;, and industry practitioners.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The model is smart. The harness makes it reliable.&lt;/strong&gt;&lt;br&gt;
— A well-harnessed agent spent \$200/6hrs and built a working game. Without a harness, same model spent \$9/20min and produced garbage. The model didn't change. The harness did.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;📊 Hard numbers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📉 SWE-bench Verified top agents: ~50-60% pass rate — on &lt;em&gt;curated&lt;/em&gt; tasks with clear descriptions. In real repos? Even lower.&lt;/li&gt;
&lt;li&gt;🏗️ OpenAI's million-line experiment: 3 engineers + Codex → ~1,500 PRs over 5 months, ~3.5 PRs/person/day — but only after investing heavily in harness infrastructure.&lt;/li&gt;
&lt;li&gt;📄 A team added &lt;code&gt;AGENTS.md&lt;/code&gt; to a FastAPI project: same model went from failing all 3 runs to succeeding all 3 runs, with ~60% better context efficiency.&lt;/li&gt;
&lt;li&gt;🚀 A TypeScript + React project went from 20% success rate (bare repo) → 100% (full harness). Model never changed.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📑 Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🔍 What Is Harness Engineering?&lt;/li&gt;
&lt;li&gt;🧩 The 5 Subsystems of a Harness&lt;/li&gt;
&lt;li&gt;🚀 Quick Start: Your Minimal Harness (Do This Today)&lt;/li&gt;
&lt;li&gt;
📐 The 12 Core Principles

&lt;ul&gt;
&lt;li&gt;Principle 1: Strong Models ≠ Reliable Execution&lt;/li&gt;
&lt;li&gt;Principle 2: A Harness Is 5 Subsystems, Not a Better Prompt&lt;/li&gt;
&lt;li&gt;Principle 3: The Repo Is the Single Source of Truth&lt;/li&gt;
&lt;li&gt;Principle 4: Split Instructions — Don't Use One Giant File&lt;/li&gt;
&lt;li&gt;Principle 5: Persist Context Across Sessions&lt;/li&gt;
&lt;li&gt;Principle 6: Initialize Before Every Session&lt;/li&gt;
&lt;li&gt;Principle 7: One Feature at a Time — No Overreach (WIP=1)&lt;/li&gt;
&lt;li&gt;Principle 8: Feature Lists Are Harness Primitives&lt;/li&gt;
&lt;li&gt;Principle 9: Don't Let Agents Declare Victory Early&lt;/li&gt;
&lt;li&gt;Principle 10: Only Full-Pipeline Verification Counts&lt;/li&gt;
&lt;li&gt;Principle 11: Make the Agent's Runtime Observable&lt;/li&gt;
&lt;li&gt;Principle 12: Every Session Must Leave a Clean State&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;🔄 The Agent Session Lifecycle&lt;/li&gt;

&lt;li&gt;⚖️ Without Harness vs. With Harness&lt;/li&gt;

&lt;li&gt;📝 Additional Templates Worth Adding&lt;/li&gt;

&lt;li&gt;🎯 Key Takeaways&lt;/li&gt;

&lt;li&gt;🔬 How to Diagnose Harness Quality&lt;/li&gt;

&lt;li&gt;📚 References&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔍 What Is Harness Engineering?
&lt;/h2&gt;

&lt;p&gt;Harness engineering is &lt;strong&gt;building a complete working environment around an AI coding agent&lt;/strong&gt; so it produces reliable results. It's NOT about writing better prompts. It's about designing the &lt;strong&gt;system&lt;/strong&gt; the model operates inside.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You → give task → Agent reads harness files → Agent executes
                                                |
                                      harness governs every step:
                                      ├── Instructions: what to do, in what order
                                      ├── Scope: one feature at a time
                                      ├── State: progress log, feature list, git history
                                      ├── Verification: tests, lint, type-check
                                      └── Lifecycle: init at start, clean state at end
                                                |
                                                v
                                      Agent stops ONLY when
                                      verification passes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🧩 The 5 Subsystems of a Harness
&lt;/h2&gt;

&lt;p&gt;Every effective harness has exactly five parts:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Subsystem&lt;/th&gt;
&lt;th&gt;Job&lt;/th&gt;
&lt;th&gt;Key Files&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;📋 &lt;strong&gt;Instructions&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Tell the agent what to do, in what order, what to read first&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;AGENTS.md&lt;/code&gt;, &lt;code&gt;CLAUDE.md&lt;/code&gt;, &lt;code&gt;docs/&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;💾 &lt;strong&gt;State&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Track what's done, in-progress, and next. Persisted to disk so the next session picks up exactly where the last left off&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;claude-progress.md&lt;/code&gt;, &lt;code&gt;feature_list.json&lt;/code&gt;, &lt;code&gt;git log&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Verification&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Only passing tests count as evidence. Agent cannot declare victory without proof&lt;/td&gt;
&lt;td&gt;tests, lint, type-check, smoke runs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;🎯 &lt;strong&gt;Scope&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Constrain agent to ONE feature at a time. No overreach. No half-finishing three things&lt;/td&gt;
&lt;td&gt;feature boundaries, definition of done&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;🔄 &lt;strong&gt;Session Lifecycle&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Initialize at start. Clean up at end. Leave a clean restart path&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;init.sh&lt;/code&gt;, handoff notes, clean commits&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🚀 Quick Start: Your Minimal Harness (Do This Today)
&lt;/h2&gt;

&lt;p&gt;Drop these 4 files into your project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;YOUR PROJECT ROOT
├── AGENTS.md              ← the agent's operating manual
├── init.sh                ← runs install + verify + health check
├── feature_list.json      ← what features exist, which are done
├── claude-progress.md     ← what happened each session
└── src/                   ← your actual code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1. &lt;code&gt;AGENTS.md&lt;/code&gt; — The Operating Manual
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Agent Instructions&lt;/span&gt;

&lt;span class="gu"&gt;## Before Starting Any Work&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Run &lt;span class="sb"&gt;`./init.sh`&lt;/span&gt; to verify environment health
&lt;span class="p"&gt;2.&lt;/span&gt; Read &lt;span class="sb"&gt;`claude-progress.md`&lt;/span&gt; for context from last session
&lt;span class="p"&gt;3.&lt;/span&gt; Read &lt;span class="sb"&gt;`feature_list.json`&lt;/span&gt; to see what's done and what's next
&lt;span class="p"&gt;4.&lt;/span&gt; Check &lt;span class="sb"&gt;`git log --oneline -10`&lt;/span&gt; for recent changes

&lt;span class="gu"&gt;## Rules&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Work on exactly ONE feature at a time
&lt;span class="p"&gt;-&lt;/span&gt; Never declare "done" without passing tests
&lt;span class="p"&gt;-&lt;/span&gt; Run the full test suite before committing
&lt;span class="p"&gt;-&lt;/span&gt; Update &lt;span class="sb"&gt;`claude-progress.md`&lt;/span&gt; after every session
&lt;span class="p"&gt;-&lt;/span&gt; Update &lt;span class="sb"&gt;`feature_list.json`&lt;/span&gt; when a feature status changes
&lt;span class="p"&gt;-&lt;/span&gt; Commit only when the project is in a clean, resumable state

&lt;span class="gu"&gt;## Verification Checklist&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [ ] All tests pass
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Linter passes
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Type-check passes
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Feature works as specified
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. &lt;code&gt;init.sh&lt;/code&gt; — Environment Health Check
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=== Installing dependencies ==="&lt;/span&gt;
npm &lt;span class="nb"&gt;install
echo&lt;/span&gt; &lt;span class="s2"&gt;"=== Running tests ==="&lt;/span&gt;
npm &lt;span class="nb"&gt;test
echo&lt;/span&gt; &lt;span class="s2"&gt;"=== Type checking ==="&lt;/span&gt;
npx tsc &lt;span class="nt"&gt;--noEmit&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=== Environment healthy ==="&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. &lt;code&gt;feature_list.json&lt;/code&gt; — Machine-Readable Scope
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"features"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"User login"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"done"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"tests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"src/auth.test.ts"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F002"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Document import"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"in-progress"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"tests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"src/import.test.ts"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F003"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"not-started"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"tests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. &lt;code&gt;claude-progress.md&lt;/code&gt; — Session Memory
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Progress Log&lt;/span&gt;

&lt;span class="gu"&gt;## Session 3 — 2026-04-20&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Completed: F001 (user login) — all tests passing
&lt;span class="p"&gt;-&lt;/span&gt; In progress: F002 (document import) — parser done, validation pending
&lt;span class="p"&gt;-&lt;/span&gt; Blocked: none
&lt;span class="p"&gt;-&lt;/span&gt; Next session should: finish F002 validation logic, then run full test suite
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  📐 The 12 Core Principles
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Principle 1: Strong Models ≠ Reliable Execution
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Models ace benchmarks but fail on real multi-file engineering tasks.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Real tasks need multi-step coordination, not one-shot answers. Agents fail at five specific layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Task specification&lt;/strong&gt; — vague requirements → agent guesses wrong&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context provision&lt;/strong&gt; — implicit conventions not written down → agent violates them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution environment&lt;/strong&gt; — missing deps, wrong versions → agent wastes context on setup&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verification feedback&lt;/strong&gt; — no tests → agent says "done" when it's not&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State management&lt;/strong&gt; — no progress tracking → next session starts from zero&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Action:&lt;/strong&gt; When things fail, &lt;strong&gt;fix the harness first, not the model.&lt;/strong&gt; Attribute every failure to one of these five layers. One &lt;code&gt;AGENTS.md&lt;/code&gt; file might be more effective than upgrading to a more expensive model.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Diagnostic Loop:&lt;/strong&gt; Execute → observe failure → attribute to a specific harness layer → fix that layer → re-execute. After a few rounds, your harness gets stronger and agent performance stabilizes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Principle 2: A Harness Is 5 Subsystems, Not a Better Prompt
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; People think "better prompt = better results."&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Prompts don't persist state, verify work, or control scope.&lt;br&gt;
&lt;strong&gt;Action:&lt;/strong&gt; Implement all 5 subsystems: instructions, state, verification, scope, lifecycle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real data:&lt;/strong&gt; A team added subsystems one at a time to a TypeScript + React project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stage 1 (bare repo): 20% success rate&lt;/li&gt;
&lt;li&gt;Stage 2 (+AGENTS.md): 60%&lt;/li&gt;
&lt;li&gt;Stage 3 (+verification commands): 80%&lt;/li&gt;
&lt;li&gt;Stage 4 (+progress files): 80-100%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The feedback/verification subsystem has the lowest investment and highest ROI.&lt;/strong&gt; Start there.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Constrain, don't micromanage.&lt;/strong&gt; Use executable rules, not step-by-step instructions. OpenAI: "enforce invariants, don't micromanage implementation."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Harness debt is real.&lt;/strong&gt; Harness rots like code does. Audit regularly — remove outdated rules, update stale docs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Principle 3: The Repo Is the Single Source of Truth
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; If the agent can't see it in the repo, it doesn't exist. Your Slack history, Jira tickets, Confluence pages, verbal agreements — the agent sees &lt;strong&gt;none&lt;/strong&gt; of it.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Agents don't remember conversations. They read files. They can't ask a colleague.&lt;br&gt;
&lt;strong&gt;Action:&lt;/strong&gt; Put everything the agent needs INTO the repo: instructions, progress, feature definitions, docs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cold-Start Test&lt;/strong&gt; — open a fresh agent session (no verbal context) and see if it can answer:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What is this system?&lt;/li&gt;
&lt;li&gt;How is it organized?&lt;/li&gt;
&lt;li&gt;How do I run it?&lt;/li&gt;
&lt;li&gt;How do I verify it?&lt;/li&gt;
&lt;li&gt;What's the current progress?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If it can't answer → your repo has blind spots. Where the map is blank, the agent guesses — and guessing creates bugs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ACID Principles for Agent State:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Atomicity&lt;/strong&gt; — Each logical operation gets one git commit. If it fails midway, &lt;code&gt;git stash&lt;/code&gt; to roll back.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt; — All tests pass, lint reports zero errors before committing. Inconsistent states don't get committed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Isolation&lt;/strong&gt; — Multiple agents use separate progress files or git branches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Durability&lt;/strong&gt; — Cross-session knowledge must be persisted to files. What's only in memory doesn't count.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Place knowledge near code.&lt;/strong&gt; A 50-line &lt;code&gt;ARCHITECTURE.md&lt;/code&gt; in &lt;code&gt;src/api/&lt;/code&gt; is more useful than a 500-page Confluence doc nobody maintains.&lt;/p&gt;
&lt;h3&gt;
  
  
  Principle 4: Split Instructions — Don't Use One Giant File
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; One massive instruction file overwhelms the agent's context. A team's &lt;code&gt;AGENTS.md&lt;/code&gt; grew from 50 → 600 lines. Their agent success rate dropped from 72% to 45%.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Three killers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lost in the Middle Effect&lt;/strong&gt; (Liu et al., 2023): LLMs use info in the middle of long texts far less effectively than at the beginning or end. A critical rule at line 300 of 600 will likely be ignored.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context budget eaten alive&lt;/strong&gt;: A 600-line file consumes 10-20K tokens before the agent even starts working.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Priority conflicts&lt;/strong&gt;: Hard constraints, soft guidelines, and historical notes all look identical. The agent can't distinguish.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Action:&lt;/strong&gt; Keep &lt;code&gt;AGENTS.md&lt;/code&gt; at &lt;strong&gt;50-200 lines&lt;/strong&gt; as a routing file. Split details into topic docs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AGENTS.md (50-200 lines — overview + hard constraints + links)
├── docs/api-patterns.md        ← read when adding endpoints
├── docs/database-rules.md      ← read when modifying DB operations  
├── docs/testing-standards.md   ← reference when writing tests
└── docs/deployment.md          ← read only for deployment tasks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Rules for the entry file:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Put critical rules at the &lt;strong&gt;top or bottom&lt;/strong&gt;, never the middle&lt;/li&gt;
&lt;li&gt;Max 15 non-negotiable hard constraints&lt;/li&gt;
&lt;li&gt;Every instruction should have: a source (why), applicability (when), and expiry (when to remove)&lt;/li&gt;
&lt;li&gt;Audit regularly — delete outdated rules like you delete unused dependencies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; After splitting, a team's success rate went from 45% → 72%, and security constraint compliance went from 60% → 95%.&lt;/p&gt;

&lt;h3&gt;
  
  
  Principle 5: Persist Context Across Sessions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Session 2 starts fresh. Agent has no memory of Session 1. Tasks over 30 minutes see failure rates spike sharply without state.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Without persisted state, the agent re-does work, reverses deliberate decisions, or drifts from requirements like a game of telephone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context Anxiety&lt;/strong&gt; (discovered by Anthropic): When agents sense context is running low, they rush — skipping verification, choosing simple solutions over correct ones. Like guessing on remaining exam questions when time is almost up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Action:&lt;/strong&gt; Use three continuity artifacts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;claude-progress.md&lt;/code&gt;&lt;/strong&gt; — What's done, what's in progress, what's blocked, next steps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;DECISIONS.md&lt;/code&gt;&lt;/strong&gt; — Why option B was chosen over A, with date and reasoning. Prevents the next session from reversing deliberate choices.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git commits as checkpoints&lt;/strong&gt; — Commit after each atomic unit of work with clear messages explaining what and why.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Key metric: Rebuild Cost&lt;/strong&gt; — how long a new session takes to reach an executable state. Good harnesses: ~3 minutes. Bad harnesses: 15-20 minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real data:&lt;/strong&gt; A 12-feature blog system over 5 sessions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Without progress files: 58% features completed, 43% hidden defect rate&lt;/li&gt;
&lt;li&gt;With progress files: 100% features completed, 8% hidden defect rate, rebuild time reduced ~78%
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WITHOUT STATE                        WITH STATE
Session 1: does work                 Session 1: does work, writes progress
Session 2: starts from zero          Session 2: reads progress, continues
Result: rework &amp;amp; drift               Result: steady forward progress
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Principle 6: Initialize Before Every Session
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Agent starts coding in a broken environment (missing deps, failing tests). Code written before the test framework is configured = code without verification.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Initialization and implementation have different goals. Mixing them = doing both poorly. Anthropic's data: dedicated initialization phase → &lt;strong&gt;31% higher feature completion&lt;/strong&gt; in multi-session scenarios.&lt;br&gt;
&lt;strong&gt;Action:&lt;/strong&gt; Treat initialization as a &lt;strong&gt;separate phase&lt;/strong&gt;. The first session does ONLY initialization — no feature code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bootstrap Contract&lt;/strong&gt; — initialization is complete when four conditions are met:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;✅ Can start (&lt;code&gt;make setup&lt;/code&gt; succeeds)&lt;/li&gt;
&lt;li&gt;✅ Can test (at least one example test passes)&lt;/li&gt;
&lt;li&gt;✅ Can see progress (task breakdown file exists)&lt;/li&gt;
&lt;li&gt;✅ Can pick up next steps (progress file is readable)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Warm start &amp;gt;&amp;gt; Cold start.&lt;/strong&gt; Use project templates to preset standard structure. Starting from a template is 10x better than starting from an empty directory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Time invested in initialization is fully recovered in the next 3-4 sessions.&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Principle 7: One Feature at a Time — No Overreach (WIP=1)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Agents try to do 3 things at once, finish none of them properly. Context capacity C divided by k tasks = each task gets C/k attention. When that drops below minimum, nothing finishes.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Overreach and under-finish amplify each other. More code written ≠ more features completed — Anthropic found they're &lt;strong&gt;negatively correlated&lt;/strong&gt;. Agents using "small next step" (WIP=1) show &lt;strong&gt;37% higher task completion&lt;/strong&gt;.&lt;br&gt;
&lt;strong&gt;Action:&lt;/strong&gt; Enforce WIP=1 (Work-in-Progress Limit from Kanban). Write it explicitly in &lt;code&gt;AGENTS.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Work Rules&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Work on one feature at a time
&lt;span class="p"&gt;-&lt;/span&gt; Only start the next feature after the current one passes end-to-end verification
&lt;span class="p"&gt;-&lt;/span&gt; Don't "also refactor" feature B while implementing feature A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Real data:&lt;/strong&gt; REST API with 8 features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unconstrained: 5 features activated, ~800 lines across 12 files, 20% end-to-end pass → 37.5% completion by session 3&lt;/li&gt;
&lt;li&gt;WIP=1: 1 feature at a time, ~200 lines across 4 files, 100% pass → 87.5% completion by session 4&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;"Do less but finish" always beats "do more but leave half-done."&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Principle 8: Feature Lists Are Harness Primitives
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Vague scope = vague results. "Shopping cart mostly done" — what does "mostly" mean? Which tests passed? Nobody knows.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Feature lists aren't memos — they're the &lt;strong&gt;backbone&lt;/strong&gt; of the harness. The scheduler, verifier, and handoff reporter all depend on them.&lt;br&gt;
&lt;strong&gt;Action:&lt;/strong&gt; Every feature entry must be a &lt;strong&gt;triple&lt;/strong&gt;: &lt;code&gt;(behavior description, verification command, current state)&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F03"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"behavior"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"POST /cart/items with {product_id, quantity} returns 201"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"verification"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"curl -X POST localhost:3000/api/cart/items -d '{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;product_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:1}' | jq .status"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"not_started"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;State machine&lt;/strong&gt; — four states, controlled by the harness:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;not_started&lt;/code&gt; → &lt;code&gt;active&lt;/code&gt; (agent picks it up)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;active&lt;/code&gt; → &lt;code&gt;passing&lt;/code&gt; (only when verification command succeeds — &lt;strong&gt;pass-state gating&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;active&lt;/code&gt; → &lt;code&gt;blocked&lt;/code&gt; (dependency issue)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;passing&lt;/code&gt; is &lt;strong&gt;irreversible&lt;/strong&gt; — once verified, it stays verified&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Granularity rule:&lt;/strong&gt; Each feature should be completable in one session. "User can add items to cart" = good. "Implement the shopping cart" = too broad. "Create the name field on the Cart model" = too narrow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Structured feature lists show &lt;strong&gt;45% higher completion rate&lt;/strong&gt; than free-form tracking, with zero duplicate implementations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Principle 9: Don't Let Agents Declare Victory Early
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Agent says "done!" but tests fail, edge cases are broken, or code doesn't compile. Anthropic found that &lt;strong&gt;agents confidently praise their own work&lt;/strong&gt; — you must separate "the person who does the work" from "the person who checks the work."&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; &lt;strong&gt;Verification Gap&lt;/strong&gt; = the gap between the agent's confidence and actual correctness. This is the #1 failure mode.&lt;br&gt;
&lt;strong&gt;Action:&lt;/strong&gt; Write an explicit &lt;strong&gt;Definition of Done&lt;/strong&gt; for every task. Not "add a search feature" but:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Completion criteria:
&lt;span class="p"&gt;-&lt;/span&gt; New endpoint GET /api/search?q=xxx
&lt;span class="p"&gt;-&lt;/span&gt; Supports pagination, default 20 items
&lt;span class="p"&gt;-&lt;/span&gt; Results include highlighted snippets
&lt;span class="p"&gt;-&lt;/span&gt; All new code passes pytest
&lt;span class="p"&gt;-&lt;/span&gt; Type checking passes (mypy --strict)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;"Done" = verification passes. "The code looks fine" does NOT count. &lt;code&gt;curl returns 201&lt;/code&gt; DOES count.&lt;/p&gt;

&lt;h3&gt;
  
  
  Principle 10: Only Full-Pipeline Verification Counts
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Agent runs one unit test. Claims everything works.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Partial verification misses integration issues, type errors, and regressions.&lt;br&gt;
&lt;strong&gt;Action:&lt;/strong&gt; Require the full pipeline: &lt;code&gt;tests + lint + type-check + build + smoke run&lt;/code&gt;. All must pass.&lt;/p&gt;
&lt;h3&gt;
  
  
  Principle 11: Make the Agent's Runtime Observable
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; You can't fix what you can't see. Missing observability wastes &lt;strong&gt;30-50% of session time&lt;/strong&gt; on redundant diagnosis.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Without observability: agents can't distinguish "correct" from "looks correct," retries become blind guesses, and evaluation becomes subjective.&lt;br&gt;
&lt;strong&gt;Action:&lt;/strong&gt; Build &lt;strong&gt;two layers&lt;/strong&gt; of observability:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1 — Runtime signals:&lt;/strong&gt; Application lifecycle, feature path execution, errors with full context. The harness collects these automatically — don't rely on the agent to log its own actions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2 — Process observability:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sprint contracts&lt;/strong&gt; — before each task, define: what to change, what NOT to change, pass criteria, exclusions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evaluator rubrics&lt;/strong&gt; — turn "is it good?" into structured scoring:&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;A&lt;/th&gt;
&lt;th&gt;B&lt;/th&gt;
&lt;th&gt;C&lt;/th&gt;
&lt;th&gt;D&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Code correctness&lt;/td&gt;
&lt;td&gt;All tests pass&lt;/td&gt;
&lt;td&gt;Main flow passes&lt;/td&gt;
&lt;td&gt;Partial pass&lt;/td&gt;
&lt;td&gt;Build fails&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test coverage&lt;/td&gt;
&lt;td&gt;Main + edge cases&lt;/td&gt;
&lt;td&gt;Main flow only&lt;/td&gt;
&lt;td&gt;Skeleton only&lt;/td&gt;
&lt;td&gt;No tests&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Real data:&lt;/strong&gt; A "dark mode" task — without observability: 3-4 blind retries, 45 minutes. With sprint contract + rubric: 1 iteration, 15 minutes. &lt;strong&gt;3x efficiency.&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Principle 12: Every Session Must Leave a Clean State
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Agent leaves half-committed code, broken tests, uncommitted changes. "Clean up later" = never clean up.&lt;br&gt;
&lt;strong&gt;Why:&lt;/strong&gt; Entropy grows by default (Lehman's Laws). Without cleanup, a 12-week project degrades:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Week 1&lt;/th&gt;
&lt;th&gt;Week 12 (no cleanup)&lt;/th&gt;
&lt;th&gt;Week 12 (with cleanup)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Build pass rate&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;68%&lt;/td&gt;
&lt;td&gt;97%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test pass rate&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;61%&lt;/td&gt;
&lt;td&gt;95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Session startup&lt;/td&gt;
&lt;td&gt;5 min&lt;/td&gt;
&lt;td&gt;60+ min&lt;/td&gt;
&lt;td&gt;9 min&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Action:&lt;/strong&gt; Session completion = task passes verification &lt;strong&gt;AND&lt;/strong&gt; clean state check passes. Five dimensions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Session Exit Checklist&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Build passes (npm run build)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] All tests pass (npm test)  
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Feature list + progress updated
&lt;span class="p"&gt;-&lt;/span&gt; [ ] No debug code remaining (console.log, debugger, TODO)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Standard startup path works (npm run dev)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Quality Document&lt;/strong&gt; — maintain an active scorecard for each module (A/B/C/D). New sessions read it and know where to prioritize.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Periodically simplify the harness.&lt;/strong&gt; As models improve, some constraints become unnecessary overhead. Monthly: disable one harness component, run benchmarks. If results don't degrade → remove it permanently.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔄 The Agent Session Lifecycle (Follow This Every Time)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;START
  1. Agent reads AGENTS.md
  2. Agent runs init.sh (install, verify, health check)
  3. Agent reads claude-progress.md (what happened last time)
  4. Agent reads feature_list.json (what's done, what's next)
  5. Agent checks git log (recent changes)

SELECT
  6. Agent picks exactly ONE unfinished feature
  7. Agent works ONLY on that feature

EXECUTE
  8. Agent implements the feature
  9. Agent runs verification (tests, lint, type-check)
  10. If verification fails → fix and re-run
  11. If verification passes → record evidence

WRAP UP
  12. Agent updates claude-progress.md
  13. Agent updates feature_list.json
  14. Agent records what's still broken or unverified
  15. Agent commits (only when safe to resume)
  16. Agent leaves clean restart path for next session
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ⚖️ Without Harness vs. With Harness
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Without Harness&lt;/th&gt;
&lt;th&gt;With Harness&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🟡 &lt;strong&gt;Session start&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Agent starts fresh, no context&lt;/td&gt;
&lt;td&gt;Agent reads progress, picks up where it left off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟡 &lt;strong&gt;Scope&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Agent does random things&lt;/td&gt;
&lt;td&gt;Agent works on one specific feature&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟡 &lt;strong&gt;"Done"&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Agent says "looks good"&lt;/td&gt;
&lt;td&gt;Tests pass, lint clean, types check&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟡 &lt;strong&gt;Session end&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Half-committed mess&lt;/td&gt;
&lt;td&gt;Clean state, progress logged, ready for next session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟡 &lt;strong&gt;Your role&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Rescue &amp;amp; cleanup&lt;/td&gt;
&lt;td&gt;Review &amp;amp; approve&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟡 &lt;strong&gt;Result&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;You spend more time fixing than if you did it yourself&lt;/td&gt;
&lt;td&gt;Agent does the work, you verify the result&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  📝 Additional Templates Worth Adding
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;DECISIONS.md&lt;/code&gt; — Prevent Next Session From Reversing Your Choices
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Design Decisions&lt;/span&gt;

&lt;span class="gu"&gt;## 2026-04-15: Use Redis for user preferences caching&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Reason: High read frequency (every API call), small data size
&lt;span class="p"&gt;-&lt;/span&gt; Rejected: PostgreSQL materialized view (high change frequency)
&lt;span class="p"&gt;-&lt;/span&gt; Constraint: Cache TTL of 5 minutes, active invalidation on write

&lt;span class="gu"&gt;## 2026-04-18: Use Vitest over Jest&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Reason: Native ESM support, faster execution
&lt;span class="p"&gt;-&lt;/span&gt; Constraint: All test files use .test.ts extension
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Sprint Contract — For Complex Tasks
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Sprint Contract: Dark Mode Support&lt;/span&gt;

&lt;span class="gu"&gt;## Scope&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Modify the theme toggle component
&lt;span class="p"&gt;-&lt;/span&gt; Update global CSS variables
&lt;span class="p"&gt;-&lt;/span&gt; Add dark mode tests

&lt;span class="gu"&gt;## Verification Standards&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Visual regression tests pass
&lt;span class="p"&gt;-&lt;/span&gt; Main flow E2E tests pass
&lt;span class="p"&gt;-&lt;/span&gt; No flash of unstyled content (FOUC)

&lt;span class="gu"&gt;## Exclusions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; NOT handling print styles
&lt;span class="p"&gt;-&lt;/span&gt; NOT handling third-party component dark mode
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Richer &lt;code&gt;feature_list.json&lt;/code&gt; — With Verification Evidence
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"features"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"behavior"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"POST /api/register with {email, password} returns 201"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"verification"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"curl -X POST /api/register -d '{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;email&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;test@example.com&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}' | jq .status"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"passing"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"evidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"commit abc123, test output log"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"F02"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"behavior"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GET /api/search?q=xxx returns paginated results (default 20)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"verification"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pytest tests/test_search.py -x"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"active"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"evidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🎯 Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;🔧 &lt;strong&gt;The harness doesn't make the model smarter — it makes its output reliable&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;📂 &lt;strong&gt;Everything the agent needs must live in the repo&lt;/strong&gt; (if it can't see it, it doesn't exist)&lt;/li&gt;
&lt;li&gt;🎯 &lt;strong&gt;One feature at a time&lt;/strong&gt; — scope is the most underrated lever&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Never trust "done" without verification evidence&lt;/strong&gt; — tests must pass&lt;/li&gt;
&lt;li&gt;🧹 &lt;strong&gt;Every session must leave a clean state&lt;/strong&gt; — the next session's success depends on it&lt;/li&gt;
&lt;li&gt;💾 &lt;strong&gt;State must persist to disk&lt;/strong&gt; — memory dies between sessions, files don't&lt;/li&gt;
&lt;li&gt;🔄 &lt;strong&gt;Initialize before work, verify during work, clean up after work&lt;/strong&gt; — this is the lifecycle&lt;/li&gt;
&lt;li&gt;🔍 &lt;strong&gt;When things fail, fix the harness first&lt;/strong&gt; — attribute failures to one of five layers, fix that layer, re-run&lt;/li&gt;
&lt;li&gt;1️⃣ &lt;strong&gt;WIP=1&lt;/strong&gt; — finish one feature before starting the next. Less code, more completed features&lt;/li&gt;
&lt;li&gt;⚠️ &lt;strong&gt;Harness debt is real&lt;/strong&gt; — audit and simplify regularly. Rules that helped last month may be unnecessary overhead today&lt;/li&gt;
&lt;li&gt;🏁 &lt;strong&gt;"Do less but finish" beats "do more but leave half-done"&lt;/strong&gt; — always&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  🔬 How to Diagnose Harness Quality
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Isometric Model Control:&lt;/strong&gt; Keep the model fixed. Remove harness components one at a time. Measure which removal causes the biggest performance drop. That's your bottleneck — focus your effort there.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Remove this...&lt;/th&gt;
&lt;th&gt;If performance drops significantly...&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AGENTS.md&lt;/td&gt;
&lt;td&gt;Your instructions are critical&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verification commands&lt;/td&gt;
&lt;td&gt;Your feedback loop is carrying the weight&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Progress files&lt;/td&gt;
&lt;td&gt;Your state management is essential&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Feature list&lt;/td&gt;
&lt;td&gt;Your scope control is doing the heavy lifting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;init.sh&lt;/td&gt;
&lt;td&gt;Your initialization is preventing cascading failures&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  📚 References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/harness-engineering/" rel="noopener noreferrer"&gt;OpenAI: Harness Engineering&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents" rel="noopener noreferrer"&gt;Anthropic: Effective Harnesses for Long-Running Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/engineering/harness-design-long-running-apps" rel="noopener noreferrer"&gt;Anthropic: Harness Design for Long-Running Apps&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.langchain.com/the-anatomy-of-an-agent-harness/" rel="noopener noreferrer"&gt;LangChain: The Anatomy of an Agent Harness&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://martinfowler.com/articles/exploring-gen-ai/harness-engineering.html" rel="noopener noreferrer"&gt;Thoughtworks: Harness Engineering&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.humanlayer.dev/blog/skill-issue-harness-engineering-for-coding-agents" rel="noopener noreferrer"&gt;HumanLayer: Skill Issue — Harness Engineering for Coding Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2307.03172" rel="noopener noreferrer"&gt;Lost in the Middle: How Language Models Use Long Contexts (Liu et al., 2023)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://walkinglabs.github.io/learn-harness-engineering/en/" rel="noopener noreferrer"&gt;Course Website&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  - &lt;a href="https://github.com/walkinglabs/learn-harness-engineering" rel="noopener noreferrer"&gt;GitHub Repo&lt;/a&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;If you found this helpful, let me know by leaving a 👍 or a comment!, or if you think this post could help someone, feel free to share it! Thank you very much! 😃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>agents</category>
      <category>webdev</category>
    </item>
    <item>
      <title>🏗️ 📐 Harness Engineering: The Emerging Discipline of Making AI Agents Reliable 🤖</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Thu, 16 Apr 2026 08:21:39 +0000</pubDate>
      <link>https://dev.to/truongpx396/harness-engineering-the-emerging-discipline-of-making-ai-agents-reliable-mgc</link>
      <guid>https://dev.to/truongpx396/harness-engineering-the-emerging-discipline-of-making-ai-agents-reliable-mgc</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/truongpx396/harness-engineering-the-emerging-discipline-of-making-ai-agents-reliable-42gf" class="crayons-story__hidden-navigation-link"&gt;🏗️ 📐 Harness Engineering: The Emerging Discipline of Making AI Agents Reliable 🤖&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/truongpx396" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" alt="truongpx396 profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/truongpx396" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Truong Phung
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Truong Phung
                
              
              &lt;div id="story-author-preview-content-3503756" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/truongpx396" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Truong Phung&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/truongpx396/harness-engineering-the-emerging-discipline-of-making-ai-agents-reliable-42gf" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Apr 15&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/truongpx396/harness-engineering-the-emerging-discipline-of-making-ai-agents-reliable-42gf" id="article-link-3503756"&gt;
          🏗️ 📐 Harness Engineering: The Emerging Discipline of Making AI Agents Reliable 🤖
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/agents"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;agents&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/llm"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;llm&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/programming"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;programming&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/truongpx396/harness-engineering-the-emerging-discipline-of-making-ai-agents-reliable-42gf" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;5&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/truongpx396/harness-engineering-the-emerging-discipline-of-making-ai-agents-reliable-42gf#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            20 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
    </item>
    <item>
      <title>🏗️ 📐 Harness Engineering: The Emerging Discipline of Making AI Agents Reliable 🤖</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Wed, 15 Apr 2026 08:44:19 +0000</pubDate>
      <link>https://dev.to/truongpx396/harness-engineering-the-emerging-discipline-of-making-ai-agents-reliable-42gf</link>
      <guid>https://dev.to/truongpx396/harness-engineering-the-emerging-discipline-of-making-ai-agents-reliable-42gf</guid>
      <description>&lt;p&gt;&lt;em&gt;A comprehensive guide to the practice of shaping the environment around AI agents so they can work dependably — based on references from the &lt;a href="https://github.com/walkinglabs/awesome-harness-engineering" rel="noopener noreferrer"&gt;Awesome Harness Engineering&lt;/a&gt; collection.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What Is Harness Engineering?&lt;/li&gt;
&lt;li&gt;Why It Matters Now&lt;/li&gt;
&lt;li&gt;The Core Equation: Agent = Model + Harness&lt;/li&gt;
&lt;li&gt;Foundations &amp;amp; Key Mental Models&lt;/li&gt;
&lt;li&gt;Context Engineering: The Working Memory Budget&lt;/li&gt;
&lt;li&gt;Constraints, Guardrails &amp;amp; Safe Autonomy&lt;/li&gt;
&lt;li&gt;Specs, Agent Files &amp;amp; Workflow Design&lt;/li&gt;
&lt;li&gt;Evals &amp;amp; Observability&lt;/li&gt;
&lt;li&gt;Runtimes, Harnesses &amp;amp; Reference Implementations&lt;/li&gt;
&lt;li&gt;Benchmarks: Measuring Harness Quality&lt;/li&gt;
&lt;li&gt;Practical Playbook: Engineering Your Own Harness&lt;/li&gt;
&lt;li&gt;The Future of Harness Engineering&lt;/li&gt;
&lt;li&gt;Conclusion&lt;/li&gt;
&lt;li&gt;References &amp;amp; Further Reading&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  1. What Is Harness Engineering?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Harness engineering&lt;/strong&gt; is the practice of designing, building, and iterating on the environment, tooling, constraints, and feedback loops that surround an AI agent — everything that isn't the model itself. The term gained widespread traction in early 2026, popularized by field reports from OpenAI, Anthropic, LangChain, Thoughtworks, and HumanLayer, all converging on the same insight: &lt;em&gt;the reliability of an AI agent depends less on the model and more on the system wrapped around it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;As &lt;a href="https://blog.langchain.com/the-anatomy-of-an-agent-harness/" rel="noopener noreferrer"&gt;LangChain's Vivek Trivedy&lt;/a&gt; crystallized it:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Agent = Model + Harness.&lt;/strong&gt; If you're not the model, you're the harness.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A harness includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;System prompts&lt;/strong&gt; — the instructions that shape the agent's persona and constraints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools, skills, and MCP servers&lt;/strong&gt; — capabilities the agent can invoke&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bundled infrastructure&lt;/strong&gt; — filesystem, sandboxes, browsers, observability stacks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orchestration logic&lt;/strong&gt; — sub-agent spawning, handoffs, model routing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hooks and middleware&lt;/strong&gt; — deterministic control flow for compaction, continuation, lint checks, and verification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory and state management&lt;/strong&gt; — progress files, git history, structured knowledge bases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A raw model is &lt;em&gt;not&lt;/em&gt; an agent. It becomes one only when a harness gives it state, tool execution, feedback loops, and enforceable constraints. Harness engineering is the discipline of making all of that work well.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Why It Matters Now
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The "Skill Issue" Realization
&lt;/h3&gt;

&lt;p&gt;As &lt;a href="https://www.humanlayer.dev/blog/skill-issue-harness-engineering-for-coding-agents" rel="noopener noreferrer"&gt;HumanLayer argued&lt;/a&gt;, teams that blame weak agent results on model limitations are usually wrong. After hundreds of agent sessions across dozens of projects, the pattern is consistent:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It's not a model problem. It's a configuration problem.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Every time a team instinctively says "GPT-6 will fix it" or "we just need better instruction-following," the real fix is almost always in the harness — better context management, smarter tool selection, proper verification loops, or cleaner handoff artifacts.&lt;/p&gt;

&lt;h3&gt;
  
  
  The OpenAI Proof Point
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://openai.com/index/harness-engineering/" rel="noopener noreferrer"&gt;OpenAI's flagship field report&lt;/a&gt; provided dramatic evidence. A three-person engineering team built and shipped an internal product with &lt;strong&gt;zero manually-written code&lt;/strong&gt; — roughly a million lines across application logic, tests, CI, documentation, and tooling — all generated by Codex agents. The team averaged 3.5 merged PRs per engineer per day, and Codex runs regularly worked autonomously for &lt;strong&gt;six hours or more&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The key insight was that early progress was &lt;em&gt;slower&lt;/em&gt; than expected — not because Codex was incapable, but because &lt;strong&gt;the environment was underspecified&lt;/strong&gt;. The primary job of human engineers became enabling agents to do useful work: building the harness.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Because the only way to make progress was to get Codex to do the work, human engineers always stepped into the task and asked: 'what capability is missing, and how do we make it both legible and enforceable for the agent?'"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Harness Changes Move Benchmarks
&lt;/h3&gt;

&lt;p&gt;LangChain demonstrated that &lt;a href="https://blog.langchain.com/improving-deep-agents-with-harness-engineering/" rel="noopener noreferrer"&gt;harness changes alone can significantly improve benchmark performance&lt;/a&gt; — moving their coding agent from Top 30 to Top 5 on Terminal-Bench 2.0 by only changing the harness, not the model. Anthropic showed that &lt;a href="https://www.anthropic.com/engineering/infrastructure-noise" rel="noopener noreferrer"&gt;infrastructure configuration can move coding benchmark scores by more than many leaderboard gaps&lt;/a&gt;. The implication is profound: benchmarks often measure harness quality as much as — or more than — model quality.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. The Core Equation: Agent = Model + Harness
&lt;/h2&gt;

&lt;p&gt;LangChain's &lt;a href="https://blog.langchain.com/the-anatomy-of-an-agent-harness/" rel="noopener noreferrer"&gt;Anatomy of an Agent Harness&lt;/a&gt; provides the clearest decomposition. Working backwards from what models &lt;em&gt;cannot&lt;/em&gt; do natively reveals why each harness component exists:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What We Want&lt;/th&gt;
&lt;th&gt;What Models Can't Do Natively&lt;/th&gt;
&lt;th&gt;Harness Solution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Persistent memory&lt;/td&gt;
&lt;td&gt;Maintain durable state across interactions&lt;/td&gt;
&lt;td&gt;Filesystem, git, progress files, AGENTS.md&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Autonomous problem-solving&lt;/td&gt;
&lt;td&gt;Execute arbitrary code&lt;/td&gt;
&lt;td&gt;Bash tool, code execution sandboxes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Real-time knowledge&lt;/td&gt;
&lt;td&gt;Access information beyond training cutoff&lt;/td&gt;
&lt;td&gt;Web search, MCP tools, Context7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Safe operation&lt;/td&gt;
&lt;td&gt;Understand risk boundaries&lt;/td&gt;
&lt;td&gt;Sandboxes, allow-lists, network isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long-horizon coherence&lt;/td&gt;
&lt;td&gt;Work across multiple context windows&lt;/td&gt;
&lt;td&gt;Compaction, Ralph Loops, planning files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Self-verification&lt;/td&gt;
&lt;td&gt;Know if their work is correct&lt;/td&gt;
&lt;td&gt;Test runners, browser automation, linters&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The filesystem emerges as the most foundational harness primitive because it unlocks everything else: agents get a workspace, work can be incrementally persisted, and multiple agents can coordinate through shared files. Git adds versioning so agents can track work, rollback errors, and branch experiments.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Foundations &amp;amp; Key Mental Models
&lt;/h2&gt;

&lt;h3&gt;
  
  
  4.1 Feedforward and Feedback (Thoughtworks)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://martinfowler.com/articles/exploring-gen-ai/harness-engineering.html" rel="noopener noreferrer"&gt;Birgitta Böckeler's framework at Thoughtworks&lt;/a&gt; provides the most rigorous mental model for harness engineering. She frames it through two control mechanisms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Guides (feedforward controls)&lt;/strong&gt; — anticipate the agent's behavior and steer it &lt;em&gt;before&lt;/em&gt; it acts. They increase the probability of good results on the first attempt. Examples: AGENTS.md files, architecture documentation, skills, coding conventions, reference applications.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sensors (feedback controls)&lt;/strong&gt; — observe &lt;em&gt;after&lt;/em&gt; the agent acts and help it self-correct. Most powerful when they produce signals optimized for LLM consumption. Examples: linters with custom error messages, test suites, code review agents, browser screenshots.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without guides, the agent keeps repeating mistakes. Without sensors, the agent encodes rules but never finds out whether they worked. A good harness requires both.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.2 Computational vs. Inferential
&lt;/h3&gt;

&lt;p&gt;Each control can be either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Computational&lt;/strong&gt; — deterministic and fast, run by the CPU. Tests, linters, type checkers, structural analysis. Milliseconds to seconds; results are reliable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inferential&lt;/strong&gt; — semantic analysis, AI code review, "LLM as judge." Slower, more expensive, non-deterministic — but capable of richer judgment.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Control&lt;/th&gt;
&lt;th&gt;Direction&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Coding conventions&lt;/td&gt;
&lt;td&gt;Feedforward&lt;/td&gt;
&lt;td&gt;Inferential&lt;/td&gt;
&lt;td&gt;AGENTS.md, Skills&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Structural tests&lt;/td&gt;
&lt;td&gt;Feedback&lt;/td&gt;
&lt;td&gt;Computational&lt;/td&gt;
&lt;td&gt;ArchUnit tests checking module boundaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code review agent&lt;/td&gt;
&lt;td&gt;Feedback&lt;/td&gt;
&lt;td&gt;Inferential&lt;/td&gt;
&lt;td&gt;A review skill using a strong model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bootstrap scripts&lt;/td&gt;
&lt;td&gt;Feedforward&lt;/td&gt;
&lt;td&gt;Both&lt;/td&gt;
&lt;td&gt;Skill with instructions and a bootstrap script&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code mods&lt;/td&gt;
&lt;td&gt;Feedforward&lt;/td&gt;
&lt;td&gt;Computational&lt;/td&gt;
&lt;td&gt;OpenRewrite recipes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  4.3 The Cybernetic Governor
&lt;/h3&gt;

&lt;p&gt;The harness acts as a &lt;strong&gt;cybernetic governor&lt;/strong&gt; — combining feedforward and feedback to regulate the codebase toward its desired state. Böckeler identifies three regulation dimensions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Maintainability harness&lt;/strong&gt; — internal code quality (linters, complexity checks, coverage). The most mature category with extensive pre-existing tooling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture fitness harness&lt;/strong&gt; — system characteristics (performance, observability, security). Essentially &lt;a href="https://www.thoughtworks.com/en-de/radar/techniques/architectural-fitness-function" rel="noopener noreferrer"&gt;architectural fitness functions&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Behaviour harness&lt;/strong&gt; — functional correctness. The hardest category: how do we verify that the application does what we need? This remains the elephant in the room.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  4.4 The Three Pillars (Thoughtworks)
&lt;/h3&gt;

&lt;p&gt;Thoughtworks frames harness work into three pillars:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Context engineering&lt;/strong&gt; — managing what the agent knows and when&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architectural constraints&lt;/strong&gt; — enforcing invariants mechanically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Garbage collection&lt;/strong&gt; — fighting entropy and drift continuously&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  4.5 Control–Agency–Runtime (CAR) Decomposition
&lt;/h3&gt;

&lt;p&gt;An &lt;a href="https://www.preprints.org/manuscript/202603.1756" rel="noopener noreferrer"&gt;academic position paper&lt;/a&gt; proposes treating the harness layer as a first-class research object with three dimensions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Control&lt;/strong&gt; — constraints, guardrails, permissions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agency&lt;/strong&gt; — planning, decision-making, self-evaluation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime&lt;/strong&gt; — execution environment, tools, infrastructure&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  5. Context Engineering: The Working Memory Budget
&lt;/h2&gt;

&lt;p&gt;Context engineering is the practice of managing the agent's context window as a &lt;strong&gt;working memory budget&lt;/strong&gt; rather than a dumping ground. It is arguably the most critical aspect of harness engineering.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.1 The One Big File Anti-Pattern
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://openai.com/index/harness-engineering/" rel="noopener noreferrer"&gt;OpenAI learned the hard way&lt;/a&gt; that a monolithic AGENTS.md doesn't scale:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Context is a scarce resource.&lt;/strong&gt; A giant instruction file crowds out the task, the code, and the relevant docs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Too much guidance becomes non-guidance.&lt;/strong&gt; When everything is "important," nothing is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It rots instantly.&lt;/strong&gt; A monolithic manual turns into a graveyard of stale rules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's hard to verify.&lt;/strong&gt; A single blob doesn't lend itself to mechanical checks.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;Their solution: treat &lt;code&gt;AGENTS.md&lt;/code&gt; as a &lt;strong&gt;table of contents&lt;/strong&gt; (~100 lines) that points to deeper sources of truth in a structured &lt;code&gt;docs/&lt;/code&gt; directory. This enables &lt;strong&gt;progressive disclosure&lt;/strong&gt; — agents start with a small, stable entry point and are taught where to look next.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.2 Progressive Disclosure
&lt;/h3&gt;

&lt;p&gt;Progressive disclosure is the principle that agents should only receive specific instructions, knowledge, or tools when they actually need them. Loading everything upfront pushes the agent into what HumanLayer calls &lt;strong&gt;"the dumb zone"&lt;/strong&gt; — where context window fill degrades performance even on simple tasks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://research.trychroma.com/context-rot" rel="noopener noreferrer"&gt;Chroma's research on context rot&lt;/a&gt; provides empirical backing: models perform measurably worse at longer context lengths, and degradation is steeper when there's low semantic similarity between the query and the relevant information in context.&lt;/p&gt;

&lt;p&gt;Skills solve this: they're activated on demand, bringing in focused knowledge only when needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.3 Context-Efficient Backpressure
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.humanlayer.dev/blog/context-efficient-backpressure" rel="noopener noreferrer"&gt;HumanLayer's backpressure philosophy&lt;/a&gt; is essential: verification mechanisms must be &lt;strong&gt;context-efficient&lt;/strong&gt;. Running a full test suite after every change floods the context window with thousands of lines of passing tests. The agent loses track of its actual task.&lt;/p&gt;

&lt;p&gt;The rule: &lt;strong&gt;success is silent, only failures produce output.&lt;/strong&gt; Swallow the output of passing checks and only surface errors.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.4 Sub-Agents as Context Firewalls
&lt;/h3&gt;

&lt;p&gt;Sub-agents provide &lt;strong&gt;context isolation&lt;/strong&gt; — each gets a fresh, small, high-relevance context window for its task, and only the condensed result flows back to the parent. This is far more powerful than simply making context windows bigger:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"A bigger context window doesn't make the model better at finding the needle — it just makes the haystack bigger."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Effective sub-agent tasks: codebase exploration, grep/search operations, tracing information flow, research tasks — anything with a straightforward question and simple answer that requires many intermediate tool calls.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.5 Lessons from Manus
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://manus.im/blog/Context-Engineering-for-AI-Agents-Lessons-from-Building-Manus" rel="noopener noreferrer"&gt;Manus' playbook&lt;/a&gt; contributed specific techniques: KV-cache locality optimization, tool masking, filesystem memory, and keeping useful failures in-context while discarding noise.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.6 OpenHands Context Condensation
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://openhands.dev/blog/openhands-context-condensensation-for-more-efficient-ai-agents" rel="noopener noreferrer"&gt;OpenHands' approach&lt;/a&gt; to bounded conversation memory preserves goals, progress, critical files, and failing tests while condensing everything else — keeping long-running coding sessions efficient without losing essential state.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Constraints, Guardrails &amp;amp; Safe Autonomy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  6.1 Enforcing Invariants, Not Micromanaging
&lt;/h3&gt;

&lt;p&gt;OpenAI's approach is instructive: &lt;strong&gt;enforce boundaries centrally, allow autonomy locally.&lt;/strong&gt; They require Codex to parse data shapes at the boundary (&lt;a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/" rel="noopener noreferrer"&gt;parse, don't validate&lt;/a&gt;), but don't prescribe how. Each business domain follows a fixed layered architecture (Types → Config → Repo → Service → Runtime → UI) with strictly validated dependency directions enforced by &lt;strong&gt;custom linters and structural tests&lt;/strong&gt; — all Codex-generated.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"This is the kind of architecture you usually postpone until you have hundreds of engineers. With coding agents, it's an early prerequisite: the constraints are what allows speed without decay."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Custom linter error messages are written to &lt;strong&gt;inject remediation instructions into agent context&lt;/strong&gt; — a positive form of prompt injection that guides self-correction.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.2 Sandboxing and Controlled Execution
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/engineering/claude-code-sandboxing" rel="noopener noreferrer"&gt;Anthropic's work on sandboxing&lt;/a&gt; focuses on reducing approval friction without losing control. Rather than prompting humans for every action, better sandboxing and policy design allow agents to work more autonomously while staying within safe boundaries.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/engineering/code-execution-with-mcp" rel="noopener noreferrer"&gt;MCP-based code execution&lt;/a&gt; gives agents controlled execution power through explicit, inspectable tool boundaries — making it clear what the agent can and cannot do.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.3 Tool Design for Safety
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/engineering/writing-tools-for-agents" rel="noopener noreferrer"&gt;Anthropic's guidance on writing tools for agents&lt;/a&gt; emphasizes that tool interfaces should be easy for models to call correctly and safely. Poorly designed tools lead to misuse; well-designed tools guide the agent toward correct behavior.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.4 Prompt Injection Defense
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://openhands.dev/blog/mitigating-prompt-injection-attacks-in-software-agents" rel="noopener noreferrer"&gt;OpenHands' practical guide&lt;/a&gt; covers confirmation mode, analyzers, sandboxing, and hard policies for reducing prompt-injection risk. This is especially important given that MCP server tool descriptions are injected into system prompts — never connect to one you don't trust.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.5 Quality Checks in the Loop
&lt;/h3&gt;

&lt;p&gt;Rather than relying on after-the-fact manual review, &lt;a href="https://martinfowler.com/articles/exploring-gen-ai/ccmenu-quality.html" rel="noopener noreferrer"&gt;Thoughtworks advocates&lt;/a&gt; moving quality checks into the agent's own loop. &lt;a href="https://martinfowler.com/articles/exploring-gen-ai/anchoring-to-reference.html" rel="noopener noreferrer"&gt;Anchoring agents to reference applications&lt;/a&gt; constrains output with concrete exemplars. The question for humans becomes not "how do I review every line?" but &lt;a href="https://martinfowler.com/articles/exploring-gen-ai/humans-and-agents.html" rel="noopener noreferrer"&gt;"where should I strengthen the harness?"&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Specs, Agent Files &amp;amp; Workflow Design
&lt;/h2&gt;

&lt;h3&gt;
  
  
  7.1 AGENTS.md and Agent Files
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/agentsmd/agents.md" rel="noopener noreferrer"&gt;AGENTS.md&lt;/a&gt; is a lightweight open format for repo-local instructions that tell agents how to work inside a codebase. A related effort, &lt;a href="https://github.com/agentmd/agent.md" rel="noopener noreferrer"&gt;agent.md&lt;/a&gt;, pursues machine-readable agent instructions across projects and tools.&lt;/p&gt;

&lt;p&gt;The key insights from practitioners:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Keep it concise&lt;/strong&gt; — under 60 lines is a good target. HumanLayer's CLAUDE.md is under 60 lines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid auto-generating it&lt;/strong&gt; — LLM-generated agent files actually &lt;em&gt;hurt&lt;/em&gt; performance while costing 20%+ more tokens.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't include directory listings&lt;/strong&gt; — agents discover repository structure on their own just fine.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use progressive disclosure&lt;/strong&gt; — point to deeper resources rather than inlining everything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Make instructions universally applicable&lt;/strong&gt; — avoid conditional rules that confuse the model.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  7.2 Spec-Driven Development
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/github/spec-kit" rel="noopener noreferrer"&gt;GitHub's Spec Kit&lt;/a&gt; enables spec-driven development where agents execute against explicit product and engineering specifications. &lt;a href="https://martinfowler.com/articles/exploring-gen-ai/sdd-3-tools.html" rel="noopener noreferrer"&gt;Thoughtworks' analysis&lt;/a&gt; explains why strong specs make AI-assisted delivery more dependable: they give agents unambiguous goals to work toward.&lt;/p&gt;

&lt;h3&gt;
  
  
  7.3 The 12-Factor Agent
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.humanlayer.dev/blog/12-factor-agents" rel="noopener noreferrer"&gt;HumanLayer's 12-Factor Agents&lt;/a&gt; establishes operating principles for production agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explicit prompts over implicit behavior&lt;/li&gt;
&lt;li&gt;State ownership — agents manage their own state&lt;/li&gt;
&lt;li&gt;Clean pause-resume behavior&lt;/li&gt;
&lt;li&gt;Context discipline&lt;/li&gt;
&lt;li&gt;Reproducible workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The companion &lt;a href="https://www.12factoragentops.com/" rel="noopener noreferrer"&gt;12-Factor AgentOps&lt;/a&gt; extends these principles to operations.&lt;/p&gt;

&lt;h3&gt;
  
  
  7.4 Feature Lists for Long-Running Work
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents" rel="noopener noreferrer"&gt;Anthropic's approach&lt;/a&gt; to long-running agents uses an &lt;strong&gt;initializer agent&lt;/strong&gt; that generates a comprehensive feature list (200+ features for a web app), all initially marked as "failing." Subsequent coding agents work through features one at a time, marking them as passing only after verification. This prevents two critical failure modes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The one-shot attempt&lt;/strong&gt; — the agent tries to build everything at once, runs out of context mid-implementation, and leaves a broken state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Premature victory&lt;/strong&gt; — the agent sees existing progress and declares the job done.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The feature list uses JSON rather than Markdown because models are less likely to inappropriately modify structured JSON files.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Evals &amp;amp; Observability
&lt;/h2&gt;

&lt;h3&gt;
  
  
  8.1 Why Evals Are Hard for Agents
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents" rel="noopener noreferrer"&gt;Anthropic's guidance on demystifying evals&lt;/a&gt; highlights the fundamental challenge: agents have &lt;strong&gt;many possible trajectories&lt;/strong&gt; to success or failure. A single task can be completed through vastly different tool-call sequences, making traditional input-output evaluation insufficient.&lt;/p&gt;

&lt;h3&gt;
  
  
  8.2 Eval Taxonomies
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://developers.openai.com/blog/eval-skills/" rel="noopener noreferrer"&gt;OpenAI's eval guide&lt;/a&gt; introduces turning agent traces into repeatable evals with JSONL logs and deterministic checks. &lt;a href="https://blog.langchain.com/evaluating-deep-agents-our-learnings/" rel="noopener noreferrer"&gt;LangChain's breakdown&lt;/a&gt; distinguishes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Single-step evals&lt;/strong&gt; — does one tool call produce the right result?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full-run evals&lt;/strong&gt; — does the complete task get solved?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-turn evals&lt;/strong&gt; — does the agent handle conversations and evolving goals?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  8.3 Trace Grading
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://platform.openai.com/docs/guides/trace-grading" rel="noopener noreferrer"&gt;OpenAI's trace grading&lt;/a&gt; enables grading agent traces directly — especially helpful for long multi-step tasks where the final output alone doesn't reveal whether the agent's process was sound.&lt;/p&gt;

&lt;h3&gt;
  
  
  8.4 Verification Stacks
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://openhands.dev/blog/20260305-learning-to-verify-ai-generated-code" rel="noopener noreferrer"&gt;OpenHands' layered verification&lt;/a&gt; uses trajectory critics trained on production traces for reranking, early stopping, and review-time quality control. This goes beyond simple pass/fail testing to assess the quality of the agent's reasoning process.&lt;/p&gt;

&lt;h3&gt;
  
  
  8.5 Skill-Level Evals
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://openhands.dev/blog/evaluating-agent-skills" rel="noopener noreferrer"&gt;OpenHands' playbook&lt;/a&gt; emphasizes measuring whether a specific skill actually helps using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bounded tasks&lt;/li&gt;
&lt;li&gt;Deterministic verifiers&lt;/li&gt;
&lt;li&gt;No-skill baselines&lt;/li&gt;
&lt;li&gt;Trace review&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  8.6 Infrastructure Noise
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/engineering/infrastructure-noise" rel="noopener noreferrer"&gt;Anthropic's research on infrastructure noise&lt;/a&gt; shows that runtime configuration can move coding benchmark scores by more than many leaderboard gaps — meaning that benchmark results may reflect infrastructure choices more than model intelligence.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Runtimes, Harnesses &amp;amp; Reference Implementations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  9.1 The Framework/Runtime/Harness Distinction
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://blog.langchain.com/agent-frameworks-runtimes-and-harnesses-oh-my/" rel="noopener noreferrer"&gt;LangChain's decomposition&lt;/a&gt; clarifies what belongs where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Framework&lt;/strong&gt; — reusable abstractions for building agents (LangGraph, CrewAI)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime&lt;/strong&gt; — execution infrastructure (sandboxes, state management, scheduling)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Harness&lt;/strong&gt; — the task-specific configuration and environment around a particular agent deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  9.2 Notable Implementations
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/SWE-agent/SWE-agent" rel="noopener noreferrer"&gt;SWE-agent&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Mature research coding agent with inspectable harness, prompt, tools, and environment design&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://claude.com/blog/building-agents-with-the-claude-agent-sdk" rel="noopener noreferrer"&gt;Claude Agent SDK&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Production-oriented SDK with sessions, tools, and orchestration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/langchain-ai/deepagents" rel="noopener noreferrer"&gt;deepagents&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;LangChain's open-source project for building deeper, longer-running agents with middleware and harness patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/SethGammon/Citadel" rel="noopener noreferrer"&gt;Citadel&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Harness for Claude Code and Codex with isolated worktrees, multi-agent coordination, and persisted memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/inngest/agent-kit" rel="noopener noreferrer"&gt;AgentKit&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;TypeScript toolkit for building durable, workflow-aware agents on event-driven infrastructure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/harbor-framework/harbor" rel="noopener noreferrer"&gt;Harbor&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Generalized harness for evaluating and improving agents at scale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/raphaelchristi/harness-evolver" rel="noopener noreferrer"&gt;Harness Evolver&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Claude Code plugin that autonomously evolves agent harnesses using multi-agent proposers and evaluation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/SWE-agent/SWE-ReX" rel="noopener noreferrer"&gt;SWE-ReX&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Sandboxed code execution infrastructure for AI agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/olo-dot-io/Uni-CLI" rel="noopener noreferrer"&gt;Uni-CLI&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Universal CLI connecting agents to 134 sites via 711 declarative YAML pipelines with self-repair loop&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  9.3 Skills Ecosystem
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://skills.sh/" rel="noopener noreferrer"&gt;skills.sh&lt;/a&gt; is a community marketplace for discovering, sharing, and installing reusable AI agent skills across runtimes like Claude Code — making harness capabilities portable and composable. However, caution is warranted: skill registries have already been caught distributing malicious skills, so treat them like &lt;code&gt;npm install random-package&lt;/code&gt; and read what you're installing.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. Benchmarks: Measuring Harness Quality
&lt;/h2&gt;

&lt;p&gt;Benchmarks are especially useful when you want to compare &lt;strong&gt;harness quality&lt;/strong&gt;, not just model quality. They stress context handling, tool calling, environment control, verification logic, and the runtime scaffolding around the model.&lt;/p&gt;

&lt;h3&gt;
  
  
  10.1 Software Engineering Benchmarks
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Focus&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.swebench.com/" rel="noopener noreferrer"&gt;SWE-bench Verified&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Real GitHub issues and tests; makes harness choices around retrieval, patching, and validation highly visible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.tbench.ai/" rel="noopener noreferrer"&gt;Terminal-Bench&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Terminal-native agents in shells and filesystems; especially useful for comparing coding-agent harnesses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://openhands.dev/blog/evoclaw-benchmark" rel="noopener noreferrer"&gt;EvoClaw&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Dependent milestone sequences from real repo history; surfaces regression accumulation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  10.2 Web &amp;amp; Browser Benchmarks
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Focus&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://webarena.dev/" rel="noopener noreferrer"&gt;WebArena&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Self-hostable web environment for evaluating autonomous agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://jykoh.com/vwa" rel="noopener noreferrer"&gt;VisualWebArena&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Multimodal web agents with image and screenshot inputs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://huggingface.co/spaces/ServiceNow/browsergym-leaderboard" rel="noopener noreferrer"&gt;BrowserGym&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Reproducible framework comparing harnesses across multiple web benchmarks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ServiceNow/WorkArena" rel="noopener noreferrer"&gt;WorkArena&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Enterprise-style web workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  10.3 General &amp;amp; Multi-Domain Benchmarks
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Focus&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/THUDM/AgentBench" rel="noopener noreferrer"&gt;AgentBench&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Cross-environment: OS, databases, knowledge graphs, web browsing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://huggingface.co/datasets/gaia-benchmark/GAIA" rel="noopener noreferrer"&gt;GAIA&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;General AI assistant tasks comparing harness-level choices&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://os-world.github.io/" rel="noopener noreferrer"&gt;OSWorld&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Real computer-use across Ubuntu, Windows, macOS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://appworld.dev/" rel="noopener noreferrer"&gt;AppWorld&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Interactive coding agents with state-based and execution-based unit tests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://clawbench.net/" rel="noopener noreferrer"&gt;ClawBench&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Search, reasoning, coding, safety, and multi-turn conversation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  10.4 MCP-Specific Benchmarks
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Focus&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/modelscope/MCPBench" rel="noopener noreferrer"&gt;MCP Bench&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Tool accuracy, latency, and token use across MCP server types&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/eval-sys/mcpmark" rel="noopener noreferrer"&gt;MCPMark&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Stress-testing on real-world MCP tasks (Notion, GitHub, Postgres)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://osworld-mcp.github.io/" rel="noopener noreferrer"&gt;OSWorld-MCP&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Real-world computer tasks using Model Context Protocol&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  10.5 Leaderboards
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Leaderboard&lt;/th&gt;
&lt;th&gt;Focus&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.agent-arena.com/leaderboard" rel="noopener noreferrer"&gt;Agent Arena&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ELO-style ratings from head-to-head agent battles&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hal.cs.princeton.edu/" rel="noopener noreferrer"&gt;HAL&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Holistic agent evaluation with reliability, cost, and broad task coverage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://huggingface.co/spaces/galileo-ai/agent-leaderboard" rel="noopener noreferrer"&gt;Galileo Agent Leaderboard&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;LLM agents on task completion and tool calling across business domains&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  11. Practical Playbook: Engineering Your Own Harness
&lt;/h2&gt;

&lt;p&gt;Drawing from all the sources in the Awesome Harness Engineering collection, here is a practical playbook for building an effective harness.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Start with Agent Files (But Keep Them Lean)
&lt;/h3&gt;

&lt;p&gt;Create a concise &lt;code&gt;AGENTS.md&lt;/code&gt; or &lt;code&gt;CLAUDE.md&lt;/code&gt; at the root of your repository:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep it under 60 lines&lt;/li&gt;
&lt;li&gt;Treat it as a table of contents, not an encyclopedia&lt;/li&gt;
&lt;li&gt;Include: build commands, test commands, key conventions, project structure pointers&lt;/li&gt;
&lt;li&gt;Exclude: directory listings, conditional rules, auto-generated content&lt;/li&gt;
&lt;li&gt;Point to deeper &lt;code&gt;docs/&lt;/code&gt; for architectural decisions, design principles, and domain knowledge&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 2: Set Up Computational Feedback Loops
&lt;/h3&gt;

&lt;p&gt;These are your highest-leverage, lowest-cost sensors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Type checking&lt;/strong&gt; — catches structural errors deterministically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linting&lt;/strong&gt; with custom error messages that include remediation instructions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fast test suites&lt;/strong&gt; — run a targeted subset, not the full suite&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structural/architectural tests&lt;/strong&gt; — enforce module boundaries and dependency directions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Critical rule: success is silent.&lt;/strong&gt; Swallow output from passing checks; only surface errors to the agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Add Verification Tools
&lt;/h3&gt;

&lt;p&gt;Give the agent ways to verify its own work as a human would:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Browser automation&lt;/strong&gt; (Puppeteer, Playwright) for end-to-end UI verification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Screenshot capture&lt;/strong&gt; so the agent can visually inspect results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability stack&lt;/strong&gt; — expose logs via LogQL, metrics via PromQL, traces via TraceQL&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dev server per worktree&lt;/strong&gt; — isolate each agent's environment&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 4: Implement Hooks for Control Flow
&lt;/h3&gt;

&lt;p&gt;Use harness hooks to create deterministic checkpoints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pre-stop hooks&lt;/strong&gt;: run formatter + type checker before the agent finishes; re-engage on failure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notification hooks&lt;/strong&gt;: alert humans when the agent needs attention&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Approval hooks&lt;/strong&gt;: auto-approve safe operations, deny dangerous ones (e.g., migrations)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration hooks&lt;/strong&gt;: create PRs, post to Slack, set up preview environments&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 5: Use Sub-Agents for Context Control
&lt;/h3&gt;

&lt;p&gt;Don't use sub-agents as "frontend engineer" or "backend engineer" personas — that doesn't work. Use them as &lt;strong&gt;context firewalls&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Delegate research, grep, and exploration to sub-agents&lt;/li&gt;
&lt;li&gt;Use cheaper models (Sonnet, Haiku) for sub-agents; expensive models (Opus) for the parent&lt;/li&gt;
&lt;li&gt;Return condensed results with source citations (&lt;code&gt;filepath:line&lt;/code&gt; format)&lt;/li&gt;
&lt;li&gt;Keep the parent thread in the "smart zone" with minimal context pollution&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 6: Enforce Architecture Mechanically
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Define layered architecture with fixed dependency directions&lt;/li&gt;
&lt;li&gt;Write custom linters (the agent can write them!)&lt;/li&gt;
&lt;li&gt;Add structural tests that check invariants on every commit&lt;/li&gt;
&lt;li&gt;Encode "taste" as rules: structured logging, naming conventions, file size limits&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 7: Manage Long-Running Work
&lt;/h3&gt;

&lt;p&gt;For tasks spanning multiple context windows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use an &lt;strong&gt;initializer agent&lt;/strong&gt; to set up the environment: &lt;code&gt;init.sh&lt;/code&gt;, progress file, feature list, initial git commit&lt;/li&gt;
&lt;li&gt;Each subsequent &lt;strong&gt;coding agent&lt;/strong&gt; reads progress, works on one feature, commits, and writes a summary&lt;/li&gt;
&lt;li&gt;Always start a session by reading progress files and running a basic health check&lt;/li&gt;
&lt;li&gt;Always end a session with a clean, mergeable state&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 8: Fight Entropy Continuously
&lt;/h3&gt;

&lt;p&gt;OpenAI's "garbage collection" pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Encode &lt;strong&gt;golden principles&lt;/strong&gt; directly in the repository&lt;/li&gt;
&lt;li&gt;Run recurring background agents that scan for deviations&lt;/li&gt;
&lt;li&gt;Open targeted, small refactoring PRs that can be reviewed in under a minute&lt;/li&gt;
&lt;li&gt;Track quality grades per domain and per architectural layer&lt;/li&gt;
&lt;li&gt;Treat technical debt like a high-interest loan: pay it down daily&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 9: Make Knowledge Repository-Local
&lt;/h3&gt;

&lt;p&gt;Everything the agent needs must be in the repo:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slack discussions about architecture? Encode them as markdown&lt;/li&gt;
&lt;li&gt;Design decisions? Write ADRs&lt;/li&gt;
&lt;li&gt;Onboarding knowledge? Put it in structured docs&lt;/li&gt;
&lt;li&gt;Knowledge in people's heads? Doesn't exist for the agent&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;"From the agent's point of view, anything it can't access in-context while running effectively doesn't exist."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 10: Iterate Based on Failures
&lt;/h3&gt;

&lt;p&gt;The most important meta-principle:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Anytime you find an agent makes a mistake, you take the time to engineer a solution such that the agent never makes that mistake again." — Mitchell Hashimoto&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Don't design the ideal harness upfront. Bias toward shipping. Add configuration only when the agent actually fails. Throw away things that don't help. Distribute battle-tested configurations via repository-level config.&lt;/p&gt;




&lt;h2&gt;
  
  
  12. The Future of Harness Engineering
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Model-Harness Co-Evolution
&lt;/h3&gt;

&lt;p&gt;Today's frontier coding agents (Claude Code, Codex) are post-trained with models and harnesses in the loop. This creates a feedback cycle where useful primitives are discovered, added to the harness, and then used when training the next generation of models. But this co-evolution has interesting side effects: models can become &lt;strong&gt;over-fitted&lt;/strong&gt; to their training harness, performing worse in alternative harness configurations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Harnessability as a Design Criterion
&lt;/h3&gt;

&lt;p&gt;Not every codebase is equally amenable to harnessing. &lt;a href="https://martinfowler.com/articles/exploring-gen-ai/harness-engineering.html" rel="noopener noreferrer"&gt;Thoughtworks introduces "harnessability"&lt;/a&gt; — the structural properties that make a codebase governable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Strongly typed languages naturally have type-checking as a sensor&lt;/li&gt;
&lt;li&gt;Clearly definable module boundaries afford architectural constraint rules&lt;/li&gt;
&lt;li&gt;Mature frameworks abstract away details agents don't need to worry about&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Greenfield teams can bake harnessability in from day one. Legacy teams face the harder problem: the harness is most needed where it is hardest to build.&lt;/p&gt;

&lt;h3&gt;
  
  
  Harness Templates
&lt;/h3&gt;

&lt;p&gt;Enterprises have a few common service topologies (CRUD APIs, event processors, data dashboards). These may evolve into &lt;strong&gt;harness templates&lt;/strong&gt; — bundles of guides and sensors pre-configured for a topology. Teams may start picking tech stacks partly based on what harnesses are already available.&lt;/p&gt;

&lt;h3&gt;
  
  
  Autonomous Harness Evolution
&lt;/h3&gt;

&lt;p&gt;Projects like &lt;a href="https://github.com/raphaelchristi/harness-evolver" rel="noopener noreferrer"&gt;Harness Evolver&lt;/a&gt; point toward a future where agents &lt;strong&gt;autonomously improve their own harnesses&lt;/strong&gt; using multi-agent proposers, evaluation-backed selection, and git worktree isolation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Open Problems
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;How do we keep a harness coherent as it grows, with guides and sensors in sync?&lt;/li&gt;
&lt;li&gt;How far can we trust agents to make trade-offs when instructions conflict?&lt;/li&gt;
&lt;li&gt;If sensors never fire, is that high quality or inadequate detection?&lt;/li&gt;
&lt;li&gt;How do we evaluate harness coverage similar to code coverage?&lt;/li&gt;
&lt;li&gt;How does architectural coherence evolve over years in a fully agent-generated system?&lt;/li&gt;
&lt;li&gt;Can we generalize these findings beyond coding to scientific research, financial modeling, and other domains?&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  13. Conclusion
&lt;/h2&gt;

&lt;p&gt;Harness engineering represents a fundamental shift in how we think about AI-assisted software development. The discipline acknowledges a counterintuitive truth: &lt;strong&gt;the model is usually fine; the problem is the system around it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The implications reshape what it means to be a software engineer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Engineers become environment designers&lt;/strong&gt; — specifying intent, building feedback loops, and shaping constraints rather than writing code directly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture becomes an early prerequisite&lt;/strong&gt; — not a luxury for large teams, but a necessity for agent reliability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repository knowledge becomes the system of record&lt;/strong&gt; — everything the agent needs must be versioned, discoverable, and mechanically verifiable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quality is enforced mechanically&lt;/strong&gt; — once encoded, standards apply everywhere at once, at every hour.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Entropy is managed continuously&lt;/strong&gt; — technical debt is treated as a high-interest loan, paid down daily by background agents.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As &lt;a href="https://martinfowler.com/articles/exploring-gen-ai/harness-engineering.html" rel="noopener noreferrer"&gt;Thoughtworks' Böckeler&lt;/a&gt; frames it: building this outer harness is emerging as &lt;strong&gt;an ongoing engineering practice&lt;/strong&gt;, not a one-time configuration. Harnesses externalize what human developer experience brings to the table — conventions, quality intuitions, architectural judgment, organizational alignment — making it explicit, verifiable, and continuously enforceable.&lt;/p&gt;

&lt;p&gt;The field is young, evolving rapidly, and full of open questions. But one thing is clear: the teams that invest in harness engineering — shaping the environment around their agents — will get dramatically better results than those waiting for the next model release to solve their problems.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Building software still demands discipline, but the discipline shows up more in the scaffolding rather than the code." — OpenAI&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  14. References &amp;amp; Further Reading
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Foundational Articles
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://openai.com/index/harness-engineering/" rel="noopener noreferrer"&gt;Harness engineering: leveraging Codex in an agent-first world&lt;/a&gt; — OpenAI&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents" rel="noopener noreferrer"&gt;Effective harnesses for long-running agents&lt;/a&gt; — Anthropic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/engineering/harness-design-long-running-apps" rel="noopener noreferrer"&gt;Harness design for long-running application development&lt;/a&gt; — Anthropic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://blog.langchain.com/the-anatomy-of-an-agent-harness/" rel="noopener noreferrer"&gt;The Anatomy of an Agent Harness&lt;/a&gt; — LangChain&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://martinfowler.com/articles/exploring-gen-ai/harness-engineering.html" rel="noopener noreferrer"&gt;Harness Engineering&lt;/a&gt; — Thoughtworks&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/engineering/building-effective-agents" rel="noopener noreferrer"&gt;Building effective agents&lt;/a&gt; — Anthropic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.humanlayer.dev/blog/skill-issue-harness-engineering-for-coding-agents" rel="noopener noreferrer"&gt;Skill Issue: Harness Engineering for Coding Agents&lt;/a&gt; — HumanLayer&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.inngest.com/blog/your-agent-needs-a-harness-not-a-framework" rel="noopener noreferrer"&gt;Your Agent Needs a Harness, Not a Framework&lt;/a&gt; — Inngest&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.preprints.org/manuscript/202603.1756" rel="noopener noreferrer"&gt;Harness Engineering for Language Agents (CAR Decomposition)&lt;/a&gt; — Academic Paper&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Context &amp;amp; Memory
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents" rel="noopener noreferrer"&gt;Effective context engineering for AI agents&lt;/a&gt; — Anthropic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://manus.im/blog/Context-Engineering-for-AI-Agents-Lessons-from-Building-Manus" rel="noopener noreferrer"&gt;Context Engineering for AI Agents: Lessons from Building Manus&lt;/a&gt; — Manus&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://martinfowler.com/articles/exploring-gen-ai/context-engineering-coding-agents.html" rel="noopener noreferrer"&gt;Context Engineering for Coding Agents&lt;/a&gt; — Thoughtworks&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.humanlayer.dev/blog/advanced-context-engineering" rel="noopener noreferrer"&gt;Advanced Context Engineering for Coding Agents&lt;/a&gt; — HumanLayer&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.humanlayer.dev/blog/context-efficient-backpressure" rel="noopener noreferrer"&gt;Context-Efficient Backpressure for Coding Agents&lt;/a&gt; — HumanLayer&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://openhands.dev/blog/openhands-context-condensensation-for-more-efficient-ai-agents" rel="noopener noreferrer"&gt;OpenHands Context Condensation&lt;/a&gt; — OpenHands&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.humanlayer.dev/blog/writing-a-good-claude-md" rel="noopener noreferrer"&gt;Writing a good CLAUDE.md&lt;/a&gt; — HumanLayer&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Safety &amp;amp; Constraints
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/engineering/claude-code-sandboxing" rel="noopener noreferrer"&gt;Beyond permission prompts: making Claude Code more secure&lt;/a&gt; — Anthropic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/engineering/code-execution-with-mcp" rel="noopener noreferrer"&gt;Code execution with MCP&lt;/a&gt; — Anthropic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/engineering/writing-tools-for-agents" rel="noopener noreferrer"&gt;Writing effective tools for agents&lt;/a&gt; — Anthropic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://openhands.dev/blog/mitigating-prompt-injection-attacks-in-software-agents" rel="noopener noreferrer"&gt;Mitigating Prompt Injection Attacks&lt;/a&gt; — OpenHands&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://code.claude.com/docs" rel="noopener noreferrer"&gt;Claude Code: Best practices for agentic coding&lt;/a&gt; — Anthropic&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Workflow Design
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://github.com/agentsmd/agents.md" rel="noopener noreferrer"&gt;AGENTS.md&lt;/a&gt; — Open Standard&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/github/spec-kit" rel="noopener noreferrer"&gt;GitHub Spec Kit&lt;/a&gt; — GitHub&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.humanlayer.dev/blog/12-factor-agents" rel="noopener noreferrer"&gt;12 Factor Agents&lt;/a&gt; — HumanLayer&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.12factoragentops.com/" rel="noopener noreferrer"&gt;12-Factor AgentOps&lt;/a&gt; — AgentOps&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Evals &amp;amp; Observability
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://developers.openai.com/blog/eval-skills/" rel="noopener noreferrer"&gt;Testing Agent Skills Systematically with Evals&lt;/a&gt; — OpenAI&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents" rel="noopener noreferrer"&gt;Demystifying Evals for AI Agents&lt;/a&gt; — Anthropic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://blog.langchain.com/evaluating-deep-agents-our-learnings/" rel="noopener noreferrer"&gt;Evaluating Deep Agents&lt;/a&gt; — LangChain&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://blog.langchain.com/improving-deep-agents-with-harness-engineering/" rel="noopener noreferrer"&gt;Improving Deep Agents with harness engineering&lt;/a&gt; — LangChain&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Courses
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://github.com/walkinglabs/learn-harness-engineering" rel="noopener noreferrer"&gt;walkinglabs/learn-harness-engineering&lt;/a&gt; — A project-based course on making Codex and Claude Code more reliable&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Curated Collection
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://github.com/walkinglabs/awesome-harness-engineering" rel="noopener noreferrer"&gt;walkinglabs/awesome-harness-engineering&lt;/a&gt; — The comprehensive, community-maintained list that informed this article&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;em&gt;This article synthesizes insights from the &lt;a href="https://github.com/walkinglabs/awesome-harness-engineering" rel="noopener noreferrer"&gt;Awesome Harness Engineering&lt;/a&gt; collection — a curated list maintained by Walking Labs. All referenced works are credited to their original authors and organizations.&lt;/em&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;If you found this helpful, let me know by leaving a 👍 or a comment!, or if you think this post could help someone, feel free to share it! Thank you very much! 😃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>llm</category>
      <category>programming</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Sun, 22 Mar 2026 09:52:03 +0000</pubDate>
      <link>https://dev.to/truongpx396/-3m7i</link>
      <guid>https://dev.to/truongpx396/-3m7i</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/nocobase/top-11-open-source-no-code-ai-tools-with-the-most-github-stars-12pp" class="crayons-story__hidden-navigation-link"&gt;Top 11 Open Source No-Code AI Tools with the Most GitHub Stars&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/nocobase" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1349233%2Ffe1061e9-2897-4210-a0b2-a96c044ac3b2.jpg" alt="nocobase profile" class="crayons-avatar__image" width="400" height="400"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/nocobase" class="crayons-story__secondary fw-medium m:hidden"&gt;
              NocoBase
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                NocoBase
                
              
              &lt;div id="story-author-preview-content-2945998" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/nocobase" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1349233%2Ffe1061e9-2897-4210-a0b2-a96c044ac3b2.jpg" class="crayons-avatar__image" alt="" width="400" height="400"&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;NocoBase&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/nocobase/top-11-open-source-no-code-ai-tools-with-the-most-github-stars-12pp" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Oct 21 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/nocobase/top-11-open-source-no-code-ai-tools-with-the-most-github-stars-12pp" id="article-link-2945998"&gt;
          Top 11 Open Source No-Code AI Tools with the Most GitHub Stars
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/opensource"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;opensource&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/github"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;github&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/nocode"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;nocode&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/nocobase/top-11-open-source-no-code-ai-tools-with-the-most-github-stars-12pp" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;3&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/nocobase/top-11-open-source-no-code-ai-tools-with-the-most-github-stars-12pp#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            11 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>opensource</category>
      <category>github</category>
      <category>ai</category>
      <category>nocode</category>
    </item>
    <item>
      <title>🚀 Awesome Resources For Learning About System Design ⚡</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Sun, 26 Oct 2025 07:00:11 +0000</pubDate>
      <link>https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-39go</link>
      <guid>https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-39go</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je" class="crayons-story__hidden-navigation-link"&gt;🚀 Awesome Resources For Learning About System Design ⚡&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/truongpx396" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" alt="truongpx396 profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/truongpx396" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Truong Phung
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Truong Phung
                
              
              &lt;div id="story-author-preview-content-2086476" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/truongpx396" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Truong Phung&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Nov 8 '24&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je" id="article-link-2086476"&gt;
          🚀 Awesome Resources For Learning About System Design ⚡
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/webdev"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;webdev&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/kubernetes"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;kubernetes&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/tutorial"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;tutorial&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/devops"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;devops&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;10&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            3 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




</description>
      <category>webdev</category>
      <category>kubernetes</category>
      <category>tutorial</category>
      <category>devops</category>
    </item>
    <item>
      <title>🚀 Awesome Resources For Learning About System Design ⚡</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Fri, 17 Oct 2025 19:09:08 +0000</pubDate>
      <link>https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-4m2n</link>
      <guid>https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-4m2n</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je" class="crayons-story__hidden-navigation-link"&gt;🚀 Awesome Resources For Learning About System Design ⚡&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/truongpx396" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" alt="truongpx396 profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/truongpx396" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Truong Phung
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Truong Phung
                
              
              &lt;div id="story-author-preview-content-2086476" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/truongpx396" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Truong Phung&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Nov 8 '24&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je" id="article-link-2086476"&gt;
          🚀 Awesome Resources For Learning About System Design ⚡
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/webdev"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;webdev&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/kubernetes"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;kubernetes&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/tutorial"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;tutorial&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/devops"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;devops&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;10&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/truongpx396/awesome-resources-for-learning-about-system-design-8je#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            3 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




</description>
      <category>webdev</category>
      <category>kubernetes</category>
      <category>tutorial</category>
      <category>devops</category>
    </item>
    <item>
      <title>🐹 Golang Integration with Kafka and Uber ZapLog 📨</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Fri, 17 Oct 2025 19:08:17 +0000</pubDate>
      <link>https://dev.to/truongpx396/golang-integration-with-kafka-and-uber-zaplog-51o4</link>
      <guid>https://dev.to/truongpx396/golang-integration-with-kafka-and-uber-zaplog-51o4</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/truongpx396/golang-integration-with-kafka-and-uber-zaplog-2bn7" class="crayons-story__hidden-navigation-link"&gt;🐹 Golang Integration with Kafka and Uber ZapLog 📨&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/truongpx396" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" alt="truongpx396 profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/truongpx396" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Truong Phung
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Truong Phung
                
              
              &lt;div id="story-author-preview-content-2074715" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/truongpx396" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Truong Phung&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/truongpx396/golang-integration-with-kafka-and-uber-zaplog-2bn7" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Nov 3 '24&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/truongpx396/golang-integration-with-kafka-and-uber-zaplog-2bn7" id="article-link-2074715"&gt;
          🐹 Golang Integration with Kafka and Uber ZapLog 📨
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/webdev"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;webdev&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/go"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;go&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/kafka"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;kafka&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/tutorial"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;tutorial&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/truongpx396/golang-integration-with-kafka-and-uber-zaplog-2bn7" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;15&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/truongpx396/golang-integration-with-kafka-and-uber-zaplog-2bn7#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              4&lt;span class="hidden s:inline"&gt; comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            10 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




</description>
      <category>webdev</category>
      <category>go</category>
      <category>kafka</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>📘 TypeScript with ReactJS All in One ⚛️</title>
      <dc:creator>Truong Phung</dc:creator>
      <pubDate>Fri, 17 Oct 2025 08:12:12 +0000</pubDate>
      <link>https://dev.to/truongpx396/typescript-with-reactjs-all-in-one-b0b</link>
      <guid>https://dev.to/truongpx396/typescript-with-reactjs-all-in-one-b0b</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/truongpx396/typescript-with-reactjs-all-in-one-1oed" class="crayons-story__hidden-navigation-link"&gt;📘 TypeScript with ReactJS All in One ⚛️&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/truongpx396" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" alt="truongpx396 profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/truongpx396" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Truong Phung
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Truong Phung
                
              
              &lt;div id="story-author-preview-content-2099186" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/truongpx396" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2215325%2Ff0dca1b8-525d-45b6-bafc-f3d3141bc934.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Truong Phung&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/truongpx396/typescript-with-reactjs-all-in-one-1oed" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Nov 12 '24&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/truongpx396/typescript-with-reactjs-all-in-one-1oed" id="article-link-2099186"&gt;
          📘 TypeScript with ReactJS All in One ⚛️
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/webdev"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;webdev&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/javascript"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;javascript&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/react"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;react&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/tutorial"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;tutorial&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/truongpx396/typescript-with-reactjs-all-in-one-1oed" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/fire-f60e7a582391810302117f987b22a8ef04a2fe0df7e3258a5f49332df1cec71e.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;8&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/truongpx396/typescript-with-reactjs-all-in-one-1oed#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            5 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>react</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
