<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dibyanshu kumar</title>
    <description>The latest articles on DEV Community by Dibyanshu kumar (@dibyanshu_kumar).</description>
    <link>https://dev.to/dibyanshu_kumar</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3839151%2F584d08f6-a9e7-4a37-bc46-f26cd6e8f3cd.jpg</url>
      <title>DEV Community: Dibyanshu kumar</title>
      <link>https://dev.to/dibyanshu_kumar</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dibyanshu_kumar"/>
    <language>en</language>
    <item>
      <title>How I Made Claude Code Finish Tasks That Outlast Its Memory</title>
      <dc:creator>Dibyanshu kumar</dc:creator>
      <pubDate>Thu, 02 Apr 2026 15:44:40 +0000</pubDate>
      <link>https://dev.to/dibyanshu_kumar/how-i-made-claude-code-finish-tasks-that-outlast-its-memory-39ji</link>
      <guid>https://dev.to/dibyanshu_kumar/how-i-made-claude-code-finish-tasks-that-outlast-its-memory-39ji</guid>
      <description>&lt;h1&gt;
  
  
  How I Made Claude Code Finish Tasks That Outlast Its Memory
&lt;/h1&gt;

&lt;p&gt;You know the moment. You're 45 minutes into a big task — processing a batch of files, refactoring a module, running a multi-phase workflow. Claude has been crushing it. Then the responses get vague. It starts repeating itself. It forgets what it already did. Context window: full.&lt;/p&gt;

&lt;p&gt;You start a new conversation. Re-explain everything. Hope it picks up where it left off. It doesn't — it redoes half the work, skips the other half, and misses the files that were tricky the first time around.&lt;/p&gt;

&lt;p&gt;I got tired of being the relay mechanism between conversations. So I built one.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Idea: Tag-Team Relay
&lt;/h2&gt;

&lt;p&gt;The concept is stolen from wrestling. When a wrestler is gassed, they tag in a fresh partner. The fresh partner knows the game plan, knows what's been tried, and picks up where the last one left off.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/tag-team&lt;/code&gt; does this for Claude Code. A lightweight dispatcher spawns worker agents one at a time. Each worker makes as much progress as possible, then writes a structured handoff file before its context fills up. The next worker reads that file and continues.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/tag-team Extract JSON docs from all 200 files in /tmp/batch.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Three workers later, all 200 files are processed. Zero re-prompting.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Happens
&lt;/h2&gt;

&lt;p&gt;The dispatcher — your main conversation — stays lean. It never does the real work. It just:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Spawns Worker 1 with the task&lt;/li&gt;
&lt;li&gt;Worker 1 processes files until context gets high, writes a handoff file, returns&lt;/li&gt;
&lt;li&gt;Dispatcher reads the result, spawns Worker 2 with a pointer to the handoff&lt;/li&gt;
&lt;li&gt;Worker 2 reads the handoff, continues from exactly where Worker 1 stopped&lt;/li&gt;
&lt;li&gt;Repeat until done or max iterations reached&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key insight: the Agent tool returns only a short summary to the dispatcher. So even after 10 iterations, the dispatcher's context has barely grown. It can keep spawning workers indefinitely.&lt;/p&gt;
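
&lt;p&gt;As a sketch, the whole relay fits in a few lines of Python. Here &lt;code&gt;spawn_worker&lt;/code&gt; is a hypothetical stand-in for the Agent tool call, and the prompt strings are illustrative, not the skill's exact wording:&lt;/p&gt;

```python
def run_relay(task, spawn_worker, max_iterations=10):
    """Relay loop sketch: each worker returns only a short summary,
    so the dispatcher's context barely grows per iteration."""
    iteration = 0
    while iteration != max_iterations:
        if iteration == 0:
            prompt = f"Task: {task}"
        else:
            prompt = f"Read handoff-{iteration:03d}.md and continue the task."
        summary = spawn_worker(prompt)  # short string, not a full transcript
        if summary.startswith("ALL_DONE"):
            return summary
        iteration += 1
    return "MAX_ITERATIONS: resume with /tag-team resume"
```

&lt;p&gt;Because each call hands back a one-line summary rather than a transcript, ten iterations cost the dispatcher a handful of short strings.&lt;/p&gt;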

&lt;h2&gt;
  
  
  The Part That Actually Matters: Handoff Files
&lt;/h2&gt;

&lt;p&gt;The handoff file is the entire trick. Without it, this is just "start a new conversation and hope for the best." With it, each worker has perfect context about what happened before — without needing conversation history.&lt;/p&gt;

&lt;p&gt;Here's what a real handoff looks like (abbreviated):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Mission&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Goal: Extract JSON documentation from all 200 API endpoint files
&lt;span class="p"&gt;-&lt;/span&gt; Status: 40% complete (80 of 200 files processed)
&lt;span class="p"&gt;-&lt;/span&gt; Next step: Continue from file 81 (src/api/payments/refund.ts)

&lt;span class="gu"&gt;## Key Decisions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Decisions made: Using the compact schema format, not the verbose one.
  Output goes to docs/api/{module}/{endpoint}.json
&lt;span class="p"&gt;-&lt;/span&gt; Dead ends: Tried batch-reading 20 files at once — hit context limits
  fast. Switched to batches of 6 with interleaved writes.

&lt;span class="gu"&gt;## Progress&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Completed: Files 1-80 (auth/&lt;span class="ge"&gt;*, users/*&lt;/span&gt;, orders/&lt;span class="err"&gt;*&lt;/span&gt;)
&lt;span class="p"&gt;-&lt;/span&gt; In progress: None (clean handoff)
&lt;span class="p"&gt;-&lt;/span&gt; Remaining: Files 81-200 (payments/&lt;span class="ge"&gt;*, inventory/*&lt;/span&gt;, shipping/&lt;span class="ge"&gt;*, admin/*&lt;/span&gt;)

&lt;span class="gu"&gt;## Resume Instructions&lt;/span&gt;
Read the file list from /tmp/batch.txt. Skip to line 81. Process files
in batches of 6: read 6 files, write their JSON docs, repeat. Output
path pattern: docs/api/{module}/{endpoint}.json. Use the compact schema
format — see docs/api/auth/login.json for reference.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the &lt;strong&gt;Dead ends&lt;/strong&gt; section. Worker 1 tried batch-reading 20 files and it blew up. Worker 2 doesn't repeat that mistake — it already knows to use batches of 6. This is the thing that separates tag-team from naive restarts. Each worker inherits not just the progress, but the &lt;em&gt;lessons&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context Warning Protocol
&lt;/h2&gt;

&lt;p&gt;Workers don't just blindly run until they crash. A monitoring proxy tracks context usage and injects warnings at configurable thresholds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;70%&lt;/strong&gt; — Finish your current file. Don't start new large operations. Start composing your handoff mentally.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;80%&lt;/strong&gt; — Stop immediately. Write the handoff file. Return.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;90%&lt;/strong&gt; — Emergency. Dump whatever state you have and get out.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There's also a safety net: if a worker makes 50+ tool calls without seeing a context warning, it hands off anyway. Belt and suspenders.&lt;/p&gt;
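
&lt;p&gt;As a Python sketch (the names are mine, not the skill's), the worker-side decision boils down to a threshold table plus the tool-call safety net:&lt;/p&gt;

```python
# Worker-side sketch (hypothetical names): map context usage to an action,
# with the 50-tool-call safety net for when no warning arrives.
THRESHOLDS = [
    (90, "emergency_stop"),   # dump whatever state exists and exit
    (80, "force_handoff"),    # stop now, write the handoff file
    (70, "prepare_handoff"),  # finish current item, start nothing new
]

def action_for(context_pct, tool_calls, soft_limit=50):
    for pct, action in THRESHOLDS:
        if context_pct >= pct:
            return action
    if tool_calls >= soft_limit:
        return "force_handoff"
    return "continue"
```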

&lt;h2&gt;
  
  
  Wrapping Other Skills
&lt;/h2&gt;

&lt;p&gt;The part I didn't expect to be useful — but turned out to be the most useful — is the &lt;code&gt;--skill&lt;/code&gt; flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/tag-team PROJ-12345 --skill develop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This wraps my &lt;code&gt;/develop&lt;/code&gt; skill (a 12-phase Jira-to-PR orchestrator) inside the tag-team relay. If &lt;code&gt;/develop&lt;/code&gt; runs out of context mid-implementation, it doesn't die — it hands off to a fresh worker who picks up at the same phase.&lt;/p&gt;

&lt;p&gt;Any skill that might outlive a single context window becomes automatically resilient. The skill doesn't need to know about tag-team. Tag-team doesn't need to know about the skill. They compose.&lt;/p&gt;
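
&lt;p&gt;Under the hood the composition is just a prompt rewrite. A minimal sketch (function and parameter names are mine, and the exact wording differs):&lt;/p&gt;

```python
def build_worker_task(args, skill=None):
    # Hypothetical sketch: with --skill, the task handed to each worker
    # is rewritten to invoke the wrapped skill inside the relay.
    if skill is None:
        return args
    return (
        f"Run /{skill} {args}. This is a tag-team relay: follow the "
        "worker instructions for context warnings and handoff."
    )
```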

&lt;h2&gt;
  
  
  Resume and Status
&lt;/h2&gt;

&lt;p&gt;Sessions survive crashes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/tag-team resume     &lt;span class="c"&gt;# Pick up from the latest handoff file&lt;/span&gt;
/tag-team status     &lt;span class="c"&gt;# Show progress log and list all handoffs&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Old sessions get archived automatically when you start a new one, so you never lose historical state.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;For a real batch of 200 files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Without tag-team&lt;/strong&gt;: Manual restart 4-5 times, re-explaining context each time, ~30% of work redone across restarts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;With tag-team&lt;/strong&gt;: 3-4 workers, fully automatic, zero rework, progress file showing exactly what happened at each stage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The raw time is similar. The difference is I walked away and came back to a completed task instead of babysitting context windows.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use It
&lt;/h2&gt;

&lt;p&gt;Tag-team shines for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Batch processing&lt;/strong&gt;: Extracting, transforming, or generating files from a large input set&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long workflows&lt;/strong&gt;: Multi-phase skills that might exhaust context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refactoring at scale&lt;/strong&gt;: Renaming, restructuring, or updating patterns across many files&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Any task where you'd normally restart the conversation mid-way&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's overkill for tasks that fit in a single context window. If your task completes in one worker, the dispatcher overhead is wasted. The sweet spot is work that would take 2-10 context windows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;The skill is four files — a dispatcher (&lt;code&gt;SKILL.md&lt;/code&gt;), worker instructions, cross-phase policies, and a config. Drop them into &lt;code&gt;.claude/skills/tag-team/&lt;/code&gt; and you have relay-capable Claude Code.&lt;/p&gt;

&lt;p&gt;The full implementation is in my &lt;a href="https://github.com/anthropics/claude-code-skills-demo/tree/main/skills/tag-team" rel="noopener noreferrer"&gt;skills repo&lt;/a&gt;. The deep dive into the architecture — handoff format, config options, how the dispatcher loop works — is in the &lt;a href="https://dev.to/dibyanshu_kumar/tag-team-deep-dive-architecture-technical-reference-55a0"&gt;companion technical reference&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is part of a series on scaling Claude Code for enterprise workflows. Previously: &lt;a href="https://dev.to/dibyanshu_kumar/how-i-stopped-losing-work-to-context-window-overflow-in-claude-code-1hll"&gt;How I Stopped Losing Work to Context Window Overflow&lt;/a&gt;, &lt;a href="https://dev.to/dibyanshu_kumar/how-i-taught-an-ai-agent-to-save-its-own-progress-2d58"&gt;How I Taught an AI Agent to Save Its Own Progress&lt;/a&gt;, and &lt;a href="//blog-3-centralized-skill-management.md"&gt;Centralized Skill Management for Claude Code&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Tag-Team Deep Dive: Architecture &amp; Technical Reference</title>
      <dc:creator>Dibyanshu kumar</dc:creator>
      <pubDate>Thu, 02 Apr 2026 15:44:14 +0000</pubDate>
      <link>https://dev.to/dibyanshu_kumar/tag-team-deep-dive-architecture-technical-reference-55a0</link>
      <guid>https://dev.to/dibyanshu_kumar/tag-team-deep-dive-architecture-technical-reference-55a0</guid>
      <description>&lt;h1&gt;
  
  
  Tag-Team Deep Dive: Architecture &amp;amp; Technical Reference
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;This is the companion technical reference for "How I Made Claude Code Finish Tasks That Outlast Its Memory." That post covers the problem and the concept — this one covers the full architecture, configuration, and implementation details.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;p&gt;Tag-team has three layers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────┐
│  Dispatcher (SKILL.md)              │  ← Your conversation. Stays lean.
│  - Parses arguments                 │
│  - Manages the relay loop           │
│  - Tracks progress                  │
├─────────────────────────────────────┤
│  Workers (spawned agents)           │  ← Disposable. One at a time.
│  - Do the actual work               │
│  - Monitor their own context        │
│  - Write handoff files on exit      │
├─────────────────────────────────────┤
│  Handoff Files (.claude/tag-team/)  │  ← Persistent state on disk.
│  - Structured markdown              │
│  - Self-contained resume context    │
│  - Append-only progress log         │
└─────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The dispatcher never reads large files or does real work. It spawns agents, reads their short return messages, and appends to a progress log. This is why it can run 10+ iterations without hitting its own context limit.&lt;/p&gt;

&lt;h2&gt;
  
  
  File Structure
&lt;/h2&gt;

&lt;p&gt;After a 3-worker relay, your &lt;code&gt;.claude/tag-team/&lt;/code&gt; directory looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;.claude/tag-team/
├── progress.md          &lt;span class="c"&gt;# Append-only log of all iterations&lt;/span&gt;
├── handoff-001.md       &lt;span class="c"&gt;# Worker 1's state when it handed off&lt;/span&gt;
├── handoff-002.md       &lt;span class="c"&gt;# Worker 2's state when it handed off&lt;/span&gt;
└── archive-20260401-143022/  &lt;span class="c"&gt;# Previous session (if any)&lt;/span&gt;
    ├── progress.md
    └── handoff-001.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configuration
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;config.json&lt;/code&gt; controls relay behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_iterations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"handoff_dir"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;".claude/tag-team"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"handoff_prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"handoff"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"progress_file"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"progress.md"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"context_thresholds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"prepare_handoff"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;70&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"force_handoff"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"emergency_stop"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"worker"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tool_call_soft_limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bypassPermissions"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;max_iterations&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Hard cap on relay workers. Prevents runaway loops.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;context_thresholds.prepare_handoff&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;At this %, worker finishes current item and stops starting new work.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;context_thresholds.force_handoff&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;At this %, worker stops immediately and writes handoff.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;context_thresholds.emergency_stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;At this %, worker dumps partial state and exits.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;worker.tool_call_soft_limit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;If a worker makes this many tool calls without a context warning, it hands off proactively. Safety net for when the proxy monitoring is delayed.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;worker.mode&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Permission mode for worker agents. &lt;code&gt;bypassPermissions&lt;/code&gt; avoids approval prompts mid-relay.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The Dispatcher Loop (Pseudocode)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;parse arguments → (fresh | resume | status)

if status:
    read progress.md, list handoff files, display, stop

if resume:
    find highest-numbered handoff file
    set iteration = that number

if fresh:
    archive old session if exists
    create .claude/tag-team/
    write initial progress.md
    set iteration = 0

load worker-instructions.md

while iteration &amp;lt; max_iterations:
    build prompt:
        if iteration == 0: worker instructions + task description
        if iteration &amp;gt; 0:  worker instructions + "read handoff-{N}.md and continue"

    spawn agent(prompt, mode=bypassPermissions)

    parse result:
        "ALL_DONE: ..." → break
        "HANDOFF: ..." → increment iteration, continue
        else → check for handoff file, error if missing

    append to progress.md

display final report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
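
&lt;p&gt;For concreteness, here's the same loop as runnable Python. This is a sketch: &lt;code&gt;spawn_agent&lt;/code&gt; is a hypothetical stand-in for the Agent tool, and the progress-log entry is abbreviated:&lt;/p&gt;

```python
from pathlib import Path

def relay(task, spawn_agent, workdir=".claude/tag-team", max_iterations=10):
    """Runnable sketch of the loop above. spawn_agent is a hypothetical
    stand-in for the Agent tool and returns a short summary string."""
    root = Path(workdir)
    root.mkdir(parents=True, exist_ok=True)
    iteration = 0
    while iteration != max_iterations:
        if iteration == 0:
            prompt = task
        else:
            prompt = f"Read {root}/handoff-{iteration:03d}.md and continue."
        result = spawn_agent(prompt)
        with (root / "progress.md").open("a") as log:  # append-only log
            log.write(f"## Iteration {iteration + 1}\n- Summary: {result}\n")
        if result.startswith("ALL_DONE"):
            return "COMPLETED"
        iteration += 1
    return "MAX_ITERATIONS"
```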



&lt;h2&gt;
  
  
  Handoff File Format (Complete)
&lt;/h2&gt;

&lt;p&gt;Every handoff file follows this structure. The sections are mandatory — workers are instructed to fill all of them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Mission&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Goal: &lt;span class="nt"&gt;&amp;lt;original&lt;/span&gt; &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;copied&lt;/span&gt; &lt;span class="na"&gt;verbatim&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Status: &lt;span class="nt"&gt;&amp;lt;percentage&lt;/span&gt; &lt;span class="na"&gt;done&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt; &lt;span class="na"&gt;completed&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;total&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Next step: &lt;span class="nt"&gt;&amp;lt;single&lt;/span&gt; &lt;span class="na"&gt;most&lt;/span&gt; &lt;span class="na"&gt;important&lt;/span&gt; &lt;span class="na"&gt;thing&lt;/span&gt; &lt;span class="na"&gt;for&lt;/span&gt; &lt;span class="na"&gt;the&lt;/span&gt; &lt;span class="na"&gt;next&lt;/span&gt; &lt;span class="na"&gt;worker&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

&lt;span class="gu"&gt;## Technical State&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Files modified: &lt;span class="nt"&gt;&amp;lt;full&lt;/span&gt; &lt;span class="na"&gt;paths&lt;/span&gt; &lt;span class="na"&gt;of&lt;/span&gt; &lt;span class="na"&gt;every&lt;/span&gt; &lt;span class="na"&gt;file&lt;/span&gt; &lt;span class="na"&gt;changed&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Files created: &lt;span class="nt"&gt;&amp;lt;full&lt;/span&gt; &lt;span class="na"&gt;paths&lt;/span&gt; &lt;span class="na"&gt;of&lt;/span&gt; &lt;span class="na"&gt;every&lt;/span&gt; &lt;span class="na"&gt;file&lt;/span&gt; &lt;span class="na"&gt;created&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Active entities: &lt;span class="nt"&gt;&amp;lt;repos&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;directories&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;configs&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;temp&lt;/span&gt; &lt;span class="na"&gt;files&lt;/span&gt; &lt;span class="na"&gt;in&lt;/span&gt; &lt;span class="na"&gt;play&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Working directory: &lt;span class="nt"&gt;&amp;lt;cwd&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt; &lt;span class="na"&gt;relevant&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

&lt;span class="gu"&gt;## Key Decisions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Decisions made: &lt;span class="nt"&gt;&amp;lt;choices&lt;/span&gt; &lt;span class="na"&gt;that&lt;/span&gt; &lt;span class="na"&gt;the&lt;/span&gt; &lt;span class="na"&gt;next&lt;/span&gt; &lt;span class="na"&gt;worker&lt;/span&gt; &lt;span class="na"&gt;MUST&lt;/span&gt; &lt;span class="na"&gt;preserve&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Dead ends: &lt;span class="nt"&gt;&amp;lt;approaches&lt;/span&gt; &lt;span class="na"&gt;that&lt;/span&gt; &lt;span class="na"&gt;FAILED&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="na"&gt;saves&lt;/span&gt; &lt;span class="na"&gt;the&lt;/span&gt; &lt;span class="na"&gt;next&lt;/span&gt; &lt;span class="na"&gt;worker&lt;/span&gt; &lt;span class="na"&gt;from&lt;/span&gt; &lt;span class="na"&gt;repeating&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Constraints: &lt;span class="nt"&gt;&amp;lt;rules&lt;/span&gt; &lt;span class="na"&gt;from&lt;/span&gt; &lt;span class="na"&gt;the&lt;/span&gt; &lt;span class="na"&gt;original&lt;/span&gt; &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;copied&lt;/span&gt; &lt;span class="na"&gt;exactly&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

&lt;span class="gu"&gt;## Progress&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Completed: &lt;span class="nt"&gt;&amp;lt;numbered&lt;/span&gt; &lt;span class="na"&gt;list&lt;/span&gt; &lt;span class="na"&gt;with&lt;/span&gt; &lt;span class="na"&gt;brief&lt;/span&gt; &lt;span class="na"&gt;descriptions&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; In progress: &lt;span class="nt"&gt;&amp;lt;what&lt;/span&gt; &lt;span class="na"&gt;was&lt;/span&gt; &lt;span class="na"&gt;being&lt;/span&gt; &lt;span class="na"&gt;worked&lt;/span&gt; &lt;span class="na"&gt;on&lt;/span&gt; &lt;span class="na"&gt;at&lt;/span&gt; &lt;span class="na"&gt;handoff&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="na"&gt;include&lt;/span&gt; &lt;span class="na"&gt;partial&lt;/span&gt; &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Remaining: &lt;span class="nt"&gt;&amp;lt;numbered&lt;/span&gt; &lt;span class="na"&gt;list&lt;/span&gt; &lt;span class="na"&gt;of&lt;/span&gt; &lt;span class="na"&gt;what&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="na"&gt;s&lt;/span&gt; &lt;span class="na"&gt;left&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

&lt;span class="gu"&gt;## Resume Instructions&lt;/span&gt;
&amp;lt;Written as a briefing for a new teammate who has never seen this task.
Explicit and actionable. Includes file paths, exact commands, specific
next steps. The next worker reads ONLY this section — it doesn't have
the conversation history.&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why Each Section Exists
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mission&lt;/strong&gt;: Prevents goal drift across workers. The task description is copied verbatim so Worker 5 is solving the same problem as Worker 1.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technical State&lt;/strong&gt;: Workers need to know what's on disk. Without this, they waste tool calls re-discovering the file layout.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Decisions / Dead ends&lt;/strong&gt;: The highest-value section. Dead ends are the difference between a smart relay and a dumb restart. If Worker 1 learned that batch-reading 20 files blows up context, Worker 2 shouldn't rediscover that.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Progress&lt;/strong&gt;: Explicit item-level tracking. "Completed files 1-80" is actionable. "Made good progress" is not.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resume Instructions&lt;/strong&gt;: Self-contained. Written for someone with zero context. This is what the next worker actually executes from.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Worker Rules
&lt;/h2&gt;

&lt;p&gt;Workers follow five operational rules designed to prevent the most common failure modes:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Interleave reads and writes
&lt;/h3&gt;

&lt;p&gt;The #1 cause of context exhaustion: reading 50 files into context, then trying to write outputs for all of them. By the time you start writing, you've forgotten the early files.&lt;/p&gt;

&lt;p&gt;Instead: read 5-8 files, write their outputs, repeat. Each batch is a self-contained unit of work that survives a handoff.&lt;/p&gt;
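
&lt;p&gt;A Python sketch of that batching pattern (the &lt;code&gt;read&lt;/code&gt;, &lt;code&gt;transform&lt;/code&gt;, and &lt;code&gt;write&lt;/code&gt; callables are placeholders for whatever the task does per file):&lt;/p&gt;

```python
def process_in_batches(paths, read, transform, write, batch_size=6):
    # Rule 1 as code: small read burst, then write everything from that
    # burst to disk before touching the next batch.
    done = []
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        contents = [read(p) for p in batch]
        for path, content in zip(batch, contents):
            write(path, transform(content))
            done.append(path)
    return done
```

&lt;p&gt;Each completed batch is durable on disk, so a handoff in the middle loses at most one batch of reads.&lt;/p&gt;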

&lt;h3&gt;
  
  
  2. Save work to disk frequently
&lt;/h3&gt;

&lt;p&gt;Every file written to disk is progress that survives a handoff. Unwritten work that's only in the conversation context is lost when the worker exits. Write early, write often.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Prefer small committed units
&lt;/h3&gt;

&lt;p&gt;Finish one item completely before starting the next. A half-processed file is harder to resume than an unstarted one — the next worker has to figure out what's done and what isn't.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Track your own progress
&lt;/h3&gt;

&lt;p&gt;Workers maintain a mental count of items completed vs. total. This count goes into the handoff file and the return message. Without it, the dispatcher can't report meaningful progress.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Don't re-read what the previous worker summarized
&lt;/h3&gt;

&lt;p&gt;If the handoff file says "file X contains a REST endpoint for user creation," trust it. Only re-read source files when you need to generate output from them. This saves context for actual work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The --skill Flag
&lt;/h2&gt;

&lt;p&gt;The composition model is simple: when &lt;code&gt;--skill&lt;/code&gt; is provided, the task description is rewritten to invoke that skill:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Without --skill&lt;/span&gt;
/tag-team "Process all 200 files"
→ Task sent to worker: "Process all 200 files"

&lt;span class="gh"&gt;# With --skill&lt;/span&gt;
/tag-team PROJ-12345 --skill develop
→ Task sent to worker: "Run /develop PROJ-12345. This is a tag-team relay
   — follow the worker instructions for context warning handling and handoff."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The wrapped skill doesn't know it's inside a relay. It just runs normally. If it exhausts context, the worker's context-warning protocol kicks in, writes a handoff, and the next worker resumes the skill from the handoff state.&lt;/p&gt;

&lt;p&gt;This means any long-running skill becomes context-resilient without modifying the skill itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Error Handling
&lt;/h2&gt;

&lt;p&gt;Three failure modes and how each is handled:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Failure&lt;/th&gt;
&lt;th&gt;Detection&lt;/th&gt;
&lt;th&gt;Response&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Worker exits without &lt;code&gt;ALL_DONE&lt;/code&gt; or &lt;code&gt;HANDOFF&lt;/code&gt; prefix&lt;/td&gt;
&lt;td&gt;Dispatcher checks if handoff file exists at expected path&lt;/td&gt;
&lt;td&gt;If file exists: treat as handoff. If not: log error, stop relay, show full output to user.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Worker crashes mid-handoff&lt;/td&gt;
&lt;td&gt;Handoff file is incomplete or missing expected sections&lt;/td&gt;
&lt;td&gt;Dispatcher warns user and stops. Suggests &lt;code&gt;/tag-team resume&lt;/code&gt; after manual inspection.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max iterations reached&lt;/td&gt;
&lt;td&gt;Loop counter&lt;/td&gt;
&lt;td&gt;Dispatcher reports progress and suggests &lt;code&gt;/tag-team resume&lt;/code&gt; to continue.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Workers are instructed to write a handoff file even on errors — partial state is better than no state.&lt;/p&gt;
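
&lt;p&gt;The first row of that table reduces to a small classifier. A Python sketch (names are illustrative, not the skill's internals):&lt;/p&gt;

```python
from pathlib import Path

def classify_result(result, expected_handoff):
    # Dispatcher-side sketch: a well-formed prefix wins; otherwise the
    # presence of the handoff file on disk decides.
    if result.startswith("ALL_DONE"):
        return "done"
    if result.startswith("HANDOFF"):
        return "handoff"
    if Path(expected_handoff).exists():
        return "handoff"  # worker forgot the prefix but left valid state
    return "error"        # stop the relay and surface the full output
```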

&lt;h2&gt;
  
  
  Session Management
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Fresh Start with Existing Session
&lt;/h3&gt;

&lt;p&gt;If &lt;code&gt;.claude/tag-team/&lt;/code&gt; already has files from a previous run, the dispatcher warns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Found existing tag-team session with 3 handoff files.
Use `/tag-team resume` to continue, or confirm to start fresh (will archive old files).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On confirmation, existing files are moved to &lt;code&gt;.claude/tag-team/archive-{timestamp}/&lt;/code&gt;. Nothing is deleted.&lt;/p&gt;
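
&lt;p&gt;A sketch of that archive step in Python (assuming the session files are the &lt;code&gt;.md&lt;/code&gt; files at the top level of the directory):&lt;/p&gt;

```python
import time
from pathlib import Path

def archive_session(root=".claude/tag-team"):
    # Move every top-level session file into a timestamped archive
    # directory; nothing is deleted.
    root = Path(root)
    session_files = list(root.glob("*.md"))
    if not session_files:
        return None
    dest = root / time.strftime("archive-%Y%m%d-%H%M%S")
    dest.mkdir()
    for path in session_files:
        path.rename(dest / path.name)
    return dest
```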

&lt;h3&gt;
  
  
  Resume Detection
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;/tag-team resume&lt;/code&gt; globs for &lt;code&gt;handoff-*.md&lt;/code&gt;, sorts numerically, takes the highest number, validates the file has the expected sections, then continues the loop from that iteration.&lt;/p&gt;
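
&lt;p&gt;In Python, that detection logic looks roughly like this (the section names follow the handoff format; which sections count as required is my assumption):&lt;/p&gt;

```python
import re
from pathlib import Path

REQUIRED = ("## Mission", "## Progress", "## Resume Instructions")

def latest_handoff(root=".claude/tag-team"):
    # Highest-numbered handoff wins; a handoff missing required
    # sections is treated as corrupt rather than silently resumed.
    best, best_n = None, 0
    for path in Path(root).glob("handoff-*.md"):
        match = re.search(r"handoff-(\d+)\.md", path.name)
        if match and int(match.group(1)) > best_n:
            best, best_n = path, int(match.group(1))
    if best is None:
        return None, 0
    text = best.read_text()
    if all(section in text for section in REQUIRED):
        return best, best_n
    raise ValueError(f"{best} is missing required handoff sections")
```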

&lt;h3&gt;
  
  
  Progress File
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;progress.md&lt;/code&gt; is an append-only log. After each worker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Iteration 2 (Worker-2)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Completed: 2026-04-02T14:35:22Z
&lt;span class="p"&gt;-&lt;/span&gt; Result: HANDOFF
&lt;span class="p"&gt;-&lt;/span&gt; Summary: Processed files 81-150. Handed off at context 78%. Next: file 151...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
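&lt;p&gt;Append-only means nothing earlier is ever rewritten: each worker just adds its own block. A sketch of that append step, with field names mirroring the entry format above (the path and function signature are assumptions):&lt;/p&gt;

```python
from datetime import datetime, timezone

def log_iteration(iteration, worker, result, summary,
                  progress_path=".claude/tag-team/progress.md"):
    """Append one iteration record; earlier entries are never touched."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    entry = (
        f"## Iteration {iteration} (Worker-{worker})\n"
        f"- Completed: {stamp}\n"
        f"- Result: {result}\n"
        f"- Summary: {summary}\n\n"
    )
    with open(progress_path, "a") as f:  # "a" mode makes this append-only
        f.write(entry)
```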



&lt;p&gt;After relay completion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Relay Complete&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Total iterations: 3
&lt;span class="p"&gt;-&lt;/span&gt; Final result: COMPLETED
&lt;span class="p"&gt;-&lt;/span&gt; Completed: 2026-04-02T14:52:08Z
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  When Tag-Team Adds Overhead vs. Value
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Overhead wins&lt;/strong&gt; (don't use tag-team):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Task fits in one context window&lt;/li&gt;
&lt;li&gt;Task requires deep cross-file reasoning where splitting work loses coherence&lt;/li&gt;
&lt;li&gt;Task is interactive and needs human input at multiple points&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Value wins&lt;/strong&gt; (use tag-team):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Batch processing 20+ files&lt;/li&gt;
&lt;li&gt;Multi-phase workflows that routinely exhaust context&lt;/li&gt;
&lt;li&gt;Any task where you've restarted a conversation more than once&lt;/li&gt;
&lt;li&gt;Wrapping skills that sometimes fail due to context limits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The break-even point is roughly when you'd need 2+ manual restarts without tag-team.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is part of a series on scaling Claude Code for enterprise workflows. Previously: &lt;a href="https://dev.to/dibyanshu_kumar/how-i-stopped-losing-work-to-context-window-overflow-in-claude-code-1hll"&gt;How I Stopped Losing Work to Context Window Overflow&lt;/a&gt;, &lt;a href="https://dev.to/dibyanshu_kumar/how-i-taught-an-ai-agent-to-save-its-own-progress-2d58"&gt;How I Taught an AI Agent to Save Its Own Progress&lt;/a&gt;, and &lt;a href="https://dev.to/dibyanshu_kumar/why-i-built-a-centralized-skill-registry-instead-of-using-claude-code-plugins-cla"&gt;Why I Built a Centralized Skill Registry Instead of Using Claude Code Plugins&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Why I Built a Centralized Skill Registry Instead of Using Claude Code Plugins</title>
      <dc:creator>Dibyanshu kumar</dc:creator>
      <pubDate>Wed, 25 Mar 2026 03:46:28 +0000</pubDate>
      <link>https://dev.to/dibyanshu_kumar/why-i-built-a-centralized-skill-registry-instead-of-using-claude-code-plugins-cla</link>
      <guid>https://dev.to/dibyanshu_kumar/why-i-built-a-centralized-skill-registry-instead-of-using-claude-code-plugins-cla</guid>
      <description>&lt;h1&gt;
  
  
  Why I Built a Centralized Skill Registry Instead of Using Claude Code Plugins
&lt;/h1&gt;

&lt;p&gt;Claude Code has a plugin system. It's well-designed — namespaced skills, marketplace distribution, versioned releases. So why did I build my own centralized skill management layer on top of plain &lt;code&gt;.claude/skills/&lt;/code&gt; directories?&lt;/p&gt;

&lt;p&gt;Because plugins solve the &lt;em&gt;distribution&lt;/em&gt; problem. I needed to solve the &lt;em&gt;coordination&lt;/em&gt; problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;I work across multiple repos — different tech stacks, different teams, different build systems. Each repo needs Claude Code skills tailored to its architecture: how to run tests, how to structure PRs, what patterns to follow in code review.&lt;/p&gt;

&lt;p&gt;At first, I copied skill files into each repo's &lt;code&gt;.claude/skills/&lt;/code&gt; directory. Within a week, the copies drifted. I'd fix a bug in one repo's &lt;code&gt;/review-pr&lt;/code&gt; skill and forget to propagate it. Worse, some skills — like &lt;code&gt;/develop&lt;/code&gt; (a 12-phase Jira-to-PR orchestrator) — span 16 files and 4,000+ lines. Copy-pasting that across repos was not sustainable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Plugins?
&lt;/h2&gt;

&lt;p&gt;Claude Code plugins were the obvious answer. They're designed for exactly this: package skills into a distributable unit, install across projects. But when I evaluated them against my requirements, several gaps emerged.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Namespacing Adds Friction to Muscle Memory
&lt;/h3&gt;

&lt;p&gt;Plugin skills are namespaced: &lt;code&gt;/my-plugin:develop&lt;/code&gt; instead of &lt;code&gt;/develop&lt;/code&gt;. This is a good design decision for preventing conflicts in the ecosystem, but when you're the sole consumer of your skills across your own repos, the namespace is overhead. I want to type &lt;code&gt;/develop PROJ-123&lt;/code&gt; everywhere, not &lt;code&gt;/my-plugin:develop PROJ-123&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Standalone &lt;code&gt;.claude/skills/&lt;/code&gt; directories give you bare &lt;code&gt;/skill-name&lt;/code&gt; invocations. I wanted that simplicity &lt;em&gt;and&lt;/em&gt; centralized management.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. No Built-in Multi-Repo Coordination
&lt;/h3&gt;

&lt;p&gt;Plugins are install-and-forget — you install them per project and they work independently. But my workflow requires &lt;em&gt;cross-repo awareness&lt;/em&gt;. When a Jira ticket comes in, I need to figure out &lt;em&gt;which&lt;/em&gt; repo it belongs to before executing any skill.&lt;/p&gt;

&lt;p&gt;I built a &lt;code&gt;/dispatch&lt;/code&gt; skill that scores Jira tickets against a registry of projects using weighted signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Component match: +10 points&lt;/li&gt;
&lt;li&gt;Label match: +5 points&lt;/li&gt;
&lt;li&gt;Keyword match: +2 points (capped at +10)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the top score is &amp;gt;= 10 and &amp;gt;= 2x the runner-up, it auto-routes. Otherwise, it asks the user. Plugins have no concept of this — they don't know about other repos or how to route work between them.&lt;/p&gt;
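&lt;p&gt;The scoring rule is easy to express directly. A minimal sketch: the ticket and registry-entry shapes here are assumptions, but the weights and the auto-route threshold come straight from the rule above:&lt;/p&gt;

```python
def score_project(ticket, project):
    """Weighted match score between a Jira ticket and a registry entry."""
    score = 0
    if ticket.get("component") in project.get("components", []):
        score += 10                      # component match: +10
    for label in ticket.get("labels", []):
        if label in project.get("labels", []):
            score += 5                   # label match: +5
    keyword_points = 0
    text = ticket.get("summary", "").lower()
    for kw in project.get("keywords", []):
        if kw.lower() in text:
            keyword_points += 2          # keyword match: +2
    score += min(keyword_points, 10)     # keyword points capped at +10
    return score

def route(ticket, registry):
    """Auto-route if the top score is at least 10 and at least 2x the runner-up."""
    ranked = sorted(((score_project(ticket, p), p["name"]) for p in registry),
                    reverse=True)
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else (0, None)
    if top[0] >= 10 and top[0] >= 2 * runner_up[0]:
        return top[1]   # confident match: auto-route
    return None         # ambiguous: fall back to asking the user
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; for the ambiguous case keeps the human in the loop exactly when the signals disagree.&lt;/p&gt;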

&lt;h3&gt;
  
  
  3. Per-Project Configuration Without Forking
&lt;/h3&gt;

&lt;p&gt;Each repo has different thresholds for when a PR gets a full multi-agent review vs. a quick single-agent pass. Different build commands. Different branch naming conventions. Different default branches.&lt;/p&gt;

&lt;p&gt;Plugins handle this with plugin-level &lt;code&gt;settings.json&lt;/code&gt;, but that configuration lives &lt;em&gt;inside&lt;/em&gt; the plugin. If two repos need different thresholds for the same skill, you'd need either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Two separate plugins (defeats the purpose of sharing)&lt;/li&gt;
&lt;li&gt;Configuration logic inside the skill that reads from some external source&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I went with the second approach directly: a &lt;code&gt;registry.json&lt;/code&gt; that stores per-project metadata, and a &lt;code&gt;config.json&lt;/code&gt; per project profile that tunes skill behavior. The skills are shared; the configuration is project-specific. The central repo has a directory per project profile, each with its own skills, agents, and config. A top-level &lt;code&gt;dispatch/&lt;/code&gt; skill handles cross-repo routing.&lt;/p&gt;
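&lt;p&gt;To make the split concrete, here is what one registry entry might look like. The field names are an assumption for illustration, not a published schema; the point is the kinds of metadata involved (repo path, profile directory, routing signals, skills, build commands):&lt;/p&gt;

```json
{
  "projects": [
    {
      "name": "payments-service",
      "repoPath": "~/work/payments-service",
      "profile": "profiles/payments",
      "stack": ["kotlin", "gradle"],
      "routing": {
        "components": ["payments", "billing"],
        "labels": ["team-payments"],
        "keywords": ["invoice", "checkout", "refund"]
      },
      "skills": ["develop", "review-pr"],
      "buildCommand": "./gradlew build"
    }
  ]
}
```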

&lt;h3&gt;
  
  
  4. Symlinks &amp;gt; Install Cycles
&lt;/h3&gt;

&lt;p&gt;When I update a skill, the change should be live &lt;em&gt;immediately&lt;/em&gt; in every repo. No reinstall, no version bump, no marketplace push.&lt;/p&gt;

&lt;p&gt;Symlinks do this. Each project's &lt;code&gt;.claude/skills&lt;/code&gt; directory is a symlink pointing to the corresponding profile in the central repo. Edit a skill file in the central repo, and every project sees it instantly. This is critical during active development — when I'm iterating on a skill, I don't want a publish-install cycle between each test.&lt;/p&gt;

&lt;p&gt;Plugins require either re-running &lt;code&gt;--plugin-dir&lt;/code&gt; or doing &lt;code&gt;/reload-plugins&lt;/code&gt;. With symlinks and &lt;code&gt;--add-dir&lt;/code&gt;, Claude Code's live change detection picks up edits automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. The Registry as a Metadata Layer
&lt;/h3&gt;

&lt;p&gt;The real power isn't just shared skills — it's the &lt;code&gt;registry.json&lt;/code&gt; that sits above them. Each project entry contains its repo path, profile directory, tech stack, Jira routing signals (components, keywords, labels), available skills, and build commands. This enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic routing&lt;/strong&gt;: &lt;code&gt;/dispatch&lt;/code&gt; reads the registry to score and route tickets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skill validation&lt;/strong&gt;: Before executing, verify the target project actually supports that skill&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Status checks&lt;/strong&gt;: &lt;code&gt;/dispatch status&lt;/code&gt; verifies all symlinks are intact and repos exist&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Project discovery&lt;/strong&gt;: &lt;code&gt;/dispatch list&lt;/code&gt; shows all registered projects in a table&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this exists in the plugin model because plugins don't need it — they're scoped to a single project. A centralized registry is only valuable when you're managing skills &lt;em&gt;across&lt;/em&gt; projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup Script
&lt;/h2&gt;

&lt;p&gt;A setup script reads the registry and creates all the symlinks — each project's &lt;code&gt;.claude/skills&lt;/code&gt; and &lt;code&gt;.claude/agents&lt;/code&gt; directories point back to the central repo's profile for that project.&lt;/p&gt;

&lt;p&gt;Key decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Symlink subdirectories, not the entire &lt;code&gt;.claude/&lt;/code&gt;&lt;/strong&gt; — this preserves each project's local &lt;code&gt;settings.local.json&lt;/code&gt; and session artifacts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backup before replacing&lt;/strong&gt; — if a project already has a &lt;code&gt;skills/&lt;/code&gt; directory, back it up with a timestamp&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotent&lt;/strong&gt; — running it twice is safe; it skips correct symlinks&lt;/li&gt;
&lt;/ul&gt;
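&lt;p&gt;The three decisions above fit in a short function. A sketch, assuming the central repo keeps one profile directory per project (the layout and names are illustrative, not the actual script):&lt;/p&gt;

```python
import os
import shutil
from datetime import datetime
from pathlib import Path

def link_profile(repo_path, central_profile, subdirs=("skills", "agents")):
    """Symlink a project's .claude subdirectories to its central profile.

    Mirrors the three decisions: link subdirectories only (so
    settings.local.json stays local), back up anything replaced, and
    skip links that already point to the right place (idempotent).
    """
    for name in subdirs:
        target = Path(central_profile, name).resolve()
        link = Path(repo_path, ".claude", name)
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink() and link.resolve() == target:
            continue  # already correct: running twice is safe
        if link.exists() and not link.is_symlink():
            stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
            shutil.move(str(link), str(link) + ".bak-" + stamp)  # backup first
        elif link.is_symlink():
            link.unlink()  # stale symlink pointing somewhere else
        os.symlink(target, link, target_is_directory=True)
```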

&lt;p&gt;New team member onboarding: clone the central repo, run the setup script, done. Every project gets the latest skills.&lt;/p&gt;

&lt;h2&gt;
  
  
  When Plugins Are the Right Choice
&lt;/h2&gt;

&lt;p&gt;This approach isn't universally better than plugins. Plugins win when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You're distributing to the community&lt;/strong&gt; — namespacing and marketplaces matter&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You want versioned releases&lt;/strong&gt; — semver, changelogs, controlled rollouts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skills are self-contained&lt;/strong&gt; — no cross-repo coordination needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You need to bundle MCP servers or hooks&lt;/strong&gt; — plugins package these alongside skills&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple consumers with different needs&lt;/strong&gt; — the marketplace model handles this well&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My approach wins when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You control all the target repos&lt;/strong&gt; — no need for marketplace discovery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skills need cross-repo awareness&lt;/strong&gt; — routing, shared configuration, project metadata&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You want instant propagation&lt;/strong&gt; — symlinks over install cycles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bare skill names matter&lt;/strong&gt; — &lt;code&gt;/develop&lt;/code&gt; over &lt;code&gt;/plugin:develop&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-project config must be separate from skill logic&lt;/strong&gt; — registry + config.json pattern&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Hybrid Path
&lt;/h2&gt;

&lt;p&gt;These aren't mutually exclusive. You could package a centralized skill registry &lt;em&gt;as&lt;/em&gt; a plugin that manages symlinks and registry state. Or use plugins for truly standalone skills (like a generic &lt;code&gt;/explain-code&lt;/code&gt;) while using the registry pattern for workflow skills that need cross-repo context.&lt;/p&gt;

&lt;p&gt;The point isn't that plugins are wrong. It's that "how do I share skills across repos?" and "how do I coordinate AI workflows across repos?" are different problems. Plugins answer the first. A centralized registry with dispatch routing answers the second.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is part 3 of a series on scaling Claude Code for enterprise workflows. Previously: &lt;a href="https://dev.to/dibyanshu_kumar/how-i-stopped-losing-work-to-context-window-overflow-in-claude-code-1hll"&gt;How I Stopped Losing Work to Context Window Overflow&lt;/a&gt; and &lt;a href="https://dev.to/dibyanshu_kumar/how-i-taught-an-ai-agent-to-save-its-own-progress-2d58"&gt;How I Taught an AI Agent to Save Its Own Progress&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>devtools</category>
      <category>plugins</category>
    </item>
    <item>
      <title>How I Taught an AI Agent to Save Its Own Progress</title>
      <dc:creator>Dibyanshu kumar</dc:creator>
      <pubDate>Mon, 23 Mar 2026 03:20:14 +0000</pubDate>
      <link>https://dev.to/dibyanshu_kumar/how-i-taught-an-ai-agent-to-save-its-own-progress-2d58</link>
      <guid>https://dev.to/dibyanshu_kumar/how-i-taught-an-ai-agent-to-save-its-own-progress-2d58</guid>
      <description>&lt;p&gt;AI coding agents are stateless. Every time you start a new session, the agent has no memory of what happened before. If the session crashes, if you close the terminal, if context runs out — everything the agent knew is gone.&lt;/p&gt;

&lt;p&gt;I needed my agent to handle multi-hour development workflows. So I built a checkpoint system that lets the AI save and restore its own progress.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem With Long Workflows
&lt;/h2&gt;

&lt;p&gt;I use Claude Code for full development cycles — not just "write a function" tasks, but the whole thing: read a Jira ticket, write a design document, get it reviewed, implement across multiple modules, run tests, create PRs.&lt;/p&gt;

&lt;p&gt;That's a lot of steps. And any one of them can fail:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The session crashes mid-implementation&lt;/li&gt;
&lt;li&gt;Context window fills up during code review&lt;/li&gt;
&lt;li&gt;I close my laptop and come back the next day&lt;/li&gt;
&lt;li&gt;A reviewer agent times out&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without checkpoints, I'd restart from scratch every time. Read the ticket again. Regenerate the design. Redo work that was already done.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I broke the development workflow into phases with two types of boundaries: &lt;strong&gt;automatic checkpoints&lt;/strong&gt; (the AI saves state on its own) and &lt;strong&gt;human gates&lt;/strong&gt; (the AI stops and waits for my approval).&lt;/p&gt;

&lt;p&gt;The workflow looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gather Context → Write Design → Review Design
    → CHECKPOINT 1: I approve or edit the design →
Implement → Review Code
    → CHECKPOINT 2: I approve or reject the code →
Fix Issues → Commit → Create PR → Respond to PR Comments
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each phase saves its status and artifacts to persistent storage. When a session dies, the next session picks up where the last one left off.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Checkpoints Work
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Saving State
&lt;/h3&gt;

&lt;p&gt;After each phase completes, the agent writes a checkpoint — a record of what was done, what was produced, and what comes next. The checkpoint includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Phase name and status (completed, in-progress, failed)&lt;/li&gt;
&lt;li&gt;Artifacts produced (design doc path, review report, branch names)&lt;/li&gt;
&lt;li&gt;Context needed for resumption (which modules are done, which review round we're on)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't conversation history. It's structured metadata about the workflow's progress.&lt;/p&gt;

&lt;h3&gt;
  
  
  Resuming
&lt;/h3&gt;

&lt;p&gt;When I start a new session and say "resume," the agent runs a reconciliation step:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check persistent storage for saved checkpoints&lt;/li&gt;
&lt;li&gt;Scan disk for artifacts (does the design doc exist? are there feature branches?)&lt;/li&gt;
&lt;li&gt;Reconcile — disk is the source of truth, checkpoints are supplementary&lt;/li&gt;
&lt;li&gt;Determine the first incomplete phase and jump to it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key insight: &lt;strong&gt;disk artifacts are more reliable than metadata.&lt;/strong&gt; If a design document exists on disk but the checkpoint says the design phase is "in progress," trust the disk. The file is there. The phase is done.&lt;/p&gt;
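&lt;p&gt;As a sketch, the reconciliation step might look like this. The phase names and artifact paths are assumptions; what matters is the order of trust: an artifact on disk marks a phase done regardless of what any checkpoint metadata says:&lt;/p&gt;

```python
from pathlib import Path

# Ordered phases mapped to the on-disk artifact that proves completion.
# Names and paths are illustrative, not the real skill's layout.
PHASES = [
    ("design", "docs/design.md"),
    ("design-review", "docs/design-review.md"),
    ("implement", "docs/impl-complete.marker"),
]

def first_incomplete_phase(workdir):
    """Return the first phase to (re)run, trusting disk over metadata.

    If the artifact exists, the phase is done, even if a checkpoint
    record still says "in-progress". The file is there; the phase is done.
    """
    for phase, artifact in PHASES:
        if not Path(workdir, artifact).exists():
            return phase  # no artifact on disk: resume here
    return None  # every artifact exists: workflow already complete
```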

&lt;h3&gt;
  
  
  Human Gates
&lt;/h3&gt;

&lt;p&gt;Two points in the workflow require my explicit approval:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After design review&lt;/strong&gt; — the agent presents the design document, review findings, and asks: approve, edit, or reject? If I say "edit," it applies my changes to the design doc and automatically re-runs the review. This loops until I approve.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After code review&lt;/strong&gt; — same pattern. Approve, fix issues, or reject. If there are critical findings, the agent auto-fixes them before I even see the checkpoint.&lt;/p&gt;

&lt;p&gt;These gates exist because some decisions shouldn't be automated. The agent can write code all day, but I decide whether the design makes sense.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes This Work
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Phase-Level Granularity
&lt;/h3&gt;

&lt;p&gt;I don't checkpoint every tool call or every message. I checkpoint at phase boundaries — after "gather context" is done, after "write design" is done, after each module is implemented. This keeps the checkpoint data small and meaningful.&lt;/p&gt;

&lt;h3&gt;
  
  
  Module-Level Progress
&lt;/h3&gt;

&lt;p&gt;Implementation can span five or six modules. The checkpoint tracks which modules are completed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Implementation progress (2/5 modules):
  [DONE] module-a
  [DONE] module-b
  [    ] module-c  ← resuming here
  [    ] module-d
  [    ] module-e
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the session dies after module 2, the next session skips straight to module 3.&lt;/p&gt;

&lt;h3&gt;
  
  
  Timeout Recovery
&lt;/h3&gt;

&lt;p&gt;Sometimes a reviewer agent times out — it hits its turn limit before finishing. Instead of re-running everything, the checkpoint records which reviewers completed and which didn't. On resume, I can choose to re-run just the failed reviewer and merge its findings into the existing report.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Checkpoints should be boring.&lt;/strong&gt; They're not a feature users interact with. They're infrastructure that makes everything else reliable. The best checkpoint system is one you never think about — sessions crash, you resume, and it just works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Disk is a better source of truth than a database.&lt;/strong&gt; Files on disk are visible, auditable, and survive any kind of failure. A database record that says "design phase complete" is useless if the design file doesn't exist. Check the artifacts, not the metadata.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Human gates are the real value.&lt;/strong&gt; Automatic checkpointing is nice, but the ability to pause the workflow, inspect the output, and say "go back and fix this" — that's what makes the difference between an AI assistant and an AI that runs off and does whatever it wants.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI agents need state management, not just prompts.&lt;/strong&gt; We spend a lot of time crafting perfect prompts, but the hard problem isn't getting the AI to write good code. It's getting the AI to pick up where it left off without losing context, repeating work, or forgetting decisions that were already made.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;— DK&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agentai</category>
      <category>workflow</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How I Stopped Losing Work to Context Window Overflow in Claude Code</title>
      <dc:creator>Dibyanshu kumar</dc:creator>
      <pubDate>Mon, 23 Mar 2026 02:54:31 +0000</pubDate>
      <link>https://dev.to/dibyanshu_kumar/how-i-stopped-losing-work-to-context-window-overflow-in-claude-code-1hll</link>
      <guid>https://dev.to/dibyanshu_kumar/how-i-stopped-losing-work-to-context-window-overflow-in-claude-code-1hll</guid>
      <description>&lt;p&gt;If you use Claude Code for long coding sessions, you've probably experienced this: you're 40 minutes in, deep in a complex refactor, and the model starts forgetting things. It repeats itself. It loses track of what files it already edited. Then the session just dies — context window full, conversation over, work lost.&lt;/p&gt;

&lt;p&gt;I got tired of it and built a proxy to fix it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;LLM coding tools like Claude Code send everything — system prompts, tool definitions, project context, and your entire conversation history — in every API request. As the conversation grows, the payload approaches the model's context limit silently. There's no progress bar. No warning. The tool doesn't tell you "hey, you're at 80%, maybe wrap up."&lt;/p&gt;

&lt;p&gt;When it finally overflows, you lose the session. Whatever the model was working on, whatever context it had built up — gone. You start a new conversation from scratch.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Tried First
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Manual summarization&lt;/strong&gt; — I'd try to remember to ask the model to write a summary before context ran out. But I'd forget, or misjudge how much room was left.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shorter sessions&lt;/strong&gt; — Breaking work into tiny chunks defeats the purpose of having an AI coding assistant handle complex, multi-step tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt caching&lt;/strong&gt; — I built an entire cache optimization layer with volatility-based decomposition. Six layers, hash-based change detection, provider-specific cache hints. It was elegant in theory. In practice, it didn't meaningfully reduce costs or prevent overflows. I disabled it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Worked
&lt;/h2&gt;

&lt;p&gt;I built a local HTTP proxy called Prefixion that sits between Claude Code and the Anthropic API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code → Prefixion (localhost:8080) → api.anthropic.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It doesn't modify your prompts for caching. It doesn't try to be clever. It does two things well:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Context Window Warnings
&lt;/h3&gt;

&lt;p&gt;Every request passes through the proxy. Prefixion estimates token usage from the payload size and tracks where you are relative to the model's context limit.&lt;/p&gt;

&lt;p&gt;When you cross a threshold, it injects a warning directly into the conversation — appended to your last message so the model sees it as an urgent instruction:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;At 70%&lt;/strong&gt; — a gentle alert:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"This conversation has used 72% of its context window. Write a conversation summary and suggest starting a new conversation."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;At 80%&lt;/strong&gt; — a firm warning:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"STOP. BEFORE responding to the user, write a conversation summary. Tell the user to start a new conversation."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;At 90%&lt;/strong&gt; — an emergency stop:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"STOP ALL WORK IMMEDIATELY. Do not make any more tool calls. Write a conversation summary. This conversation must end now."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The warnings escalate per conversation, so you only see each level once. And because they're injected into the user message (not the system prompt), they don't break any existing cache prefixes.&lt;/p&gt;
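&lt;p&gt;A sketch of the tiering and injection logic: the thresholds come from above, but the rough 4-bytes-per-token estimate and the message shape are assumptions about how a proxy like this might work, not Prefixion's actual code:&lt;/p&gt;

```python
import json

# Escalating warnings, highest threshold first; each fires at most once.
TIERS = [
    (0.90, "STOP ALL WORK IMMEDIATELY. Do not make any more tool calls. "
           "Write a conversation summary. This conversation must end now."),
    (0.80, "STOP. BEFORE responding to the user, write a conversation "
           "summary. Tell the user to start a new conversation."),
    (0.70, "This conversation has used most of its context window. Write a "
           "conversation summary and suggest starting a new conversation."),
]

def maybe_inject_warning(request, context_limit_tokens, fired_levels):
    """Append the highest applicable warning to the last user message."""
    # Rough token estimate from payload size (about 4 bytes per token).
    payload_bytes = len(json.dumps(request).encode())
    usage = payload_bytes / (context_limit_tokens * 4)
    for level, text in TIERS:
        if usage >= level and level not in fired_levels:
            fired_levels.add(level)  # each level only fires once
            for msg in reversed(request["messages"]):
                if msg["role"] == "user":
                    msg["content"] += "\n\n[CONTEXT WARNING] " + text
                    return True
    return False
```

&lt;p&gt;Mutating the last user message, rather than the system prompt, is what keeps any cached prefix intact.&lt;/p&gt;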

&lt;p&gt;The result: the model writes a summary file — what was accomplished, current status, open items, key files modified — before the session dies. When you start a new conversation, you have full context to pick up where you left off.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Everything Gets Tracked
&lt;/h3&gt;

&lt;p&gt;Every turn is logged to a local SQLite database with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Input/output token counts&lt;/li&gt;
&lt;li&gt;Cache read/write tokens (from the API response)&lt;/li&gt;
&lt;li&gt;Calculated cost in USD&lt;/li&gt;
&lt;li&gt;Guard events that fired (which warnings triggered, when)&lt;/li&gt;
&lt;/ul&gt;
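&lt;p&gt;A minimal version of that per-turn log, assuming a schema roughly like this (the column names are illustrative; the cache token field names follow the Anthropic API's usage block):&lt;/p&gt;

```python
import sqlite3

def init_db(path):
    """Open the database and create the turns table if needed."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS turns (
            id INTEGER PRIMARY KEY,
            conversation_id TEXT,
            input_tokens INTEGER,
            output_tokens INTEGER,
            cache_read_tokens INTEGER,
            cache_write_tokens INTEGER,
            cost_usd REAL,
            guard_events TEXT
        )""")
    return conn

def log_turn(conn, conversation_id, usage, cost_usd, guard_events=""):
    """Record one request/response turn from the API response's usage block."""
    conn.execute(
        "INSERT INTO turns (conversation_id, input_tokens, output_tokens, "
        "cache_read_tokens, cache_write_tokens, cost_usd, guard_events) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (conversation_id,
         usage.get("input_tokens", 0),
         usage.get("output_tokens", 0),
         usage.get("cache_read_input_tokens", 0),
         usage.get("cache_creation_input_tokens", 0),
         cost_usd,
         guard_events))
    conn.commit()
```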

&lt;p&gt;There's a web dashboard where you can browse conversations, see per-turn token breakdowns, and check guard efficiency metrics. It's useful for understanding how your sessions actually behave — which ones cost the most, where context fills up fastest, how often you hit the wall.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It's Set Up
&lt;/h2&gt;

&lt;p&gt;Point Claude Code at &lt;code&gt;http://localhost:8080&lt;/code&gt; as the API base URL and start the proxy. That's it. Auth headers pass through untouched. Streaming works. If the proxy fails for any reason, it forwards the original request unmodified — the "do no harm" principle.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The real problem isn't cost — it's session reliability.&lt;/strong&gt; I started this project trying to optimize prompt caching and reduce API bills. That turned out to be the wrong problem. The thing that actually hurt was losing work. A $2 session that crashes is worse than a $4 session that finishes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Warnings need to be injected, not displayed.&lt;/strong&gt; A notification in a sidebar doesn't help. The model needs to see the warning as an instruction it can act on. Injecting it into the conversation is crude but effective.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LLM tools will probably build these features natively.&lt;/strong&gt; Context awareness, session handoff — these should be built into Claude Code and Cursor and Aider. Until they are, a proxy is a clean way to add them without forking anything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Should You Build One?
&lt;/h2&gt;

&lt;p&gt;Honestly — probably not. If you're a casual user, shorter sessions and manual summaries work fine. If you're a power user running 60-minute sessions on complex codebases, the context overflow problem is real and a proxy like this helps.&lt;/p&gt;

&lt;p&gt;But the ideas are what matter more than the code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Monitor context usage&lt;/strong&gt; and intervene before it's too late&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inject warnings as model instructions&lt;/strong&gt;, not UI notifications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Always write a summary&lt;/strong&gt; before a session ends, not after&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are patterns any tool can implement. The proxy approach is just one way to do it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;— DK&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>llm</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
