<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: thestack_ai</title>
    <description>The latest articles on DEV Community by thestack_ai (@thestack_ai).</description>
    <link>https://dev.to/thestack_ai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3807897%2Fbe17d34c-aaf6-437c-b3d1-ea8020d98602.jpeg</url>
      <title>DEV Community: thestack_ai</title>
      <link>https://dev.to/thestack_ai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/thestack_ai"/>
    <language>en</language>
    <item>
      <title>Testing Claude Code Skills in CI — pulser eval + GitHub Action</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Mon, 30 Mar 2026 14:18:58 +0000</pubDate>
      <link>https://dev.to/thestack_ai/testing-claude-code-skills-in-ci-pulser-eval-github-action-3na9</link>
      <guid>https://dev.to/thestack_ai/testing-claude-code-skills-in-ci-pulser-eval-github-action-3na9</guid>
      <description>&lt;p&gt;A missing &lt;code&gt;name&lt;/code&gt; field in one skill file silently disabled 14 skills across our shared repository. Nobody noticed for a week — users just assumed Claude "didn't know how to do that." I built pulser to make sure that never happens again, then wrapped it in a GitHub Action so CI catches breakage before merge.&lt;/p&gt;

&lt;p&gt;Here's the full setup.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; &lt;code&gt;pulser eval&lt;/code&gt; is a CLI that checks Claude Code skill files for structural correctness, frontmatter validity, and common antipatterns. Run it locally in under a second or add the GitHub Action to your CI pipeline. We went from "manually eyeball the YAML" to "CI rejects broken skills automatically" — catching 23 issues in the first week that would have shipped silently. Zero dependencies, sub-200ms execution for 40+ skills.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem: Skills Break Silently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Claude Code skills are markdown files with YAML frontmatter — and they fail silently when malformed.&lt;/strong&gt; A skill file looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-skill&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Use when the user asks to refactor a function into smaller units&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

Instructions for Claude...
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Simple enough. But here's what actually goes wrong in practice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A missing &lt;code&gt;name&lt;/code&gt; field makes the skill invisible.&lt;/strong&gt; Claude doesn't load it. No error, no warning, no stack trace. The file just doesn't exist from Claude's perspective.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A vague description means Claude never triggers the skill.&lt;/strong&gt; If your description says "useful for various tasks," Claude has no signal for when to activate it. The skill sits there gathering dust while users wonder why their custom workflow stopped working.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Malformed YAML frontmatter breaks silently with no output.&lt;/strong&gt; Forget a closing &lt;code&gt;---&lt;/code&gt;, use a tab instead of spaces, or put an unquoted colon in a value — the file loads as raw markdown with no frontmatter at all. The skill body becomes invisible.&lt;/p&gt;

&lt;p&gt;I found this out the hard way. We had a shared skill repository with 40+ skills across our team. Someone edited a skill, introduced a YAML syntax error in a multi-line description, and it passed code review because the diff looked fine to human eyes. That one change silently broke 14 skills. We didn't catch it for a week.&lt;/p&gt;

&lt;p&gt;The kicker: &lt;code&gt;git blame&lt;/code&gt; showed the exact commit. The fix took 3 seconds. The debugging took 2 hours.&lt;/p&gt;

&lt;p&gt;That's when I decided to build a linter.&lt;/p&gt;

&lt;h2&gt;
  
  
  What pulser eval Does
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;pulser eval&lt;/code&gt; is a zero-dependency CLI that scans Claude Code skill files and reports structural problems before they reach production. It runs 5 checks per skill file and produces binary pass/fail output in under 200ms for 40+ skills.&lt;/p&gt;

&lt;p&gt;Under the hood, it runs a battery of checks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;YAML frontmatter parsing&lt;/strong&gt; — catches syntax errors, missing delimiters, type mismatches&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Required field validation&lt;/strong&gt; — &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; must exist and be non-empty&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Description quality scoring&lt;/strong&gt; — flags vague descriptions that won't help Claude decide when to activate&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;File structure analysis&lt;/strong&gt; — detects orphaned files, empty skill bodies, naming convention violations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cross-reference checking&lt;/strong&gt; — finds skills that reference files or paths that don't exist&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each check produces a clear, actionable message:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FAIL  .claude/commands/deploy.md
  ✗ Missing required field: name
  ✗ Description too vague (score: 0.2/1.0): "handles deployment"

PASS  .claude/commands/review-code.md
  ✓ Frontmatter valid
  ✓ Required fields present
  ✓ Description specific (score: 0.8/1.0)
  ✓ All references resolved

22 skills scanned · 3 failed · 19 passed
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;No guessing. No "looks fine to me." Binary pass/fail with reasons.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Install and Run Locally
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm install -g pulser
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Or run without installing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx pulser &lt;span class="nb"&gt;eval

&lt;/span&gt;By default, it scans &lt;span class="sb"&gt;`&lt;/span&gt;.claude/commands/&lt;span class="sb"&gt;`&lt;/span&gt; and &lt;span class="sb"&gt;`&lt;/span&gt;.claude/skills/&lt;span class="sb"&gt;`&lt;/span&gt; &lt;span class="k"&gt;in &lt;/span&gt;your current directory. Override with a path argument:

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx pulser eval ./custom-skills
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The first run will probably surprise you.&lt;/strong&gt; When I ran it against our 40-skill repository for the first time, 8 skills had issues I'd never noticed. Three had YAML errors. Two had descriptions so generic they might as well have been blank. One referenced a helper script that was deleted months ago.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 2: Read the Exit Codes
&lt;/h2&gt;

&lt;p&gt;pulser uses standard exit codes that play well with CI:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Exit Code&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;All checks passed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;One or more checks failed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Configuration or runtime error&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Exit code 1 means "your skills have problems" — exit code 2 means "pulser itself couldn't run."&lt;/strong&gt; Most CI systems treat any non-zero exit as failure, but if you need to distinguish, the codes are there.&lt;/p&gt;
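&lt;p&gt;If you do want to branch on them, a small wrapper works — &lt;code&gt;describe_pulser_exit&lt;/code&gt; here is an illustrative helper for your own scripts, not part of pulser:&lt;/p&gt;

```shell
#!/bin/sh
# Map pulser's exit codes (see table above) to human-readable CI messages.
# `describe_pulser_exit` is an illustrative wrapper, not a pulser feature.
describe_pulser_exit() {
  case "$1" in
    0) echo "all skill checks passed" ;;
    1) echo "one or more skills failed checks" ;;
    2) echo "pulser itself could not run" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Typical CI usage:
#   npx pulser eval; describe_pulser_exit $?
```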
&lt;h2&gt;
  
  
  Step 3: Add the GitHub Action
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The GitHub Action integrates in under 5 minutes and adds ~15 seconds to your pipeline.&lt;/strong&gt; Create &lt;code&gt;.github/workflows/skills.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Lint Skills&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.claude/**'&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;eval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Evaluate Claude Code Skills&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pulserin/pulser@v1&lt;/span&gt;

&lt;span class="s"&gt;That's the minimal setup. The Action installs pulser, runs `eval` against your repo, and fails the check if any skill has structural issues.&lt;/span&gt;

&lt;span class="err"&gt;*&lt;/span&gt;&lt;span class="nv"&gt;*The&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="s"&gt;paths` filter is critical.** You don't want to run skill linting on every PR — only when someone actually changes files under `.claude/`. This keeps your CI fast and your Action minutes low. A typical eval run adds about 15 seconds to your pipeline, most of which is the checkout step.&lt;/span&gt;

&lt;span class="c1"&gt;## Step 4: Handle the First CI Failure&lt;/span&gt;

&lt;span class="s"&gt;Your first PR after adding the Action will probably fail. That's the point.&lt;/span&gt;

&lt;span class="s"&gt;Here's what the Action output looks like in a GitHub check run&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;

&lt;span class="s"&gt;pulser eval v1.0.0&lt;/span&gt;
&lt;span class="s"&gt;Scanning .claude/commands/ ...&lt;/span&gt;
&lt;span class="s"&gt;Scanning .claude/skills/ ...&lt;/span&gt;

&lt;span class="s"&gt;FAIL  .claude/commands/old-deploy.md&lt;/span&gt;
  &lt;span class="s"&gt;✗ Empty skill body — frontmatter present but no instructions&lt;/span&gt;

&lt;span class="s"&gt;FAIL  .claude/commands/analyze.md&lt;/span&gt;
  &lt;span class="s"&gt;✗ name field contains spaces (use kebab-case)&lt;/span&gt;

&lt;span class="s"&gt;PASS  .claude/commands/review-code.md&lt;/span&gt;
&lt;span class="s"&gt;PASS  .claude/commands/test-runner.md&lt;/span&gt;
&lt;span class="nn"&gt;...&lt;/span&gt; &lt;span class="s"&gt;(18 more passed)&lt;/span&gt;

&lt;span class="s"&gt;22 skills scanned · 2 failed · 20 passed&lt;/span&gt;

&lt;span class="s"&gt;Fix the failures, push again, watch it go green. **The feedback loop from push to CI result is under 30 seconds** for the eval step itself.&lt;/span&gt;

&lt;span class="c1"&gt;## Step 5: Build the Full Workflow&lt;/span&gt;

&lt;span class="s"&gt;The pattern I've settled on after a few weeks of iteration&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;

&lt;span class="s"&gt;1. **Local check before commit** — `npx pulser eval` as a pre-commit hook or manual habit&lt;/span&gt;
&lt;span class="s"&gt;2. **CI check on PR** — GitHub Action catches anything missed locally&lt;/span&gt;
&lt;span class="s"&gt;3. **Periodic full scan** — weekly cron that reports on skill health&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# .github/workflows/skills-weekly.yml
name: Weekly Skill Health

on:
  schedule:
    - cron: '0 9 * * 1'  # Monday 9 AM UTC

jobs:
  health-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Evaluate Skills
        uses: pulserin/pulser@v1
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Wire up a notification on failure and you have a complete skill health monitoring system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The pre-commit hook catches ~90% of issues before they reach CI.&lt;/strong&gt; The remaining ~10% are mostly YAML merge conflicts that look fine in the diff but produce invalid syntax after git resolves them.&lt;/p&gt;
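&lt;p&gt;A minimal sketch of that pre-commit hook (saved as &lt;code&gt;.git/hooks/pre-commit&lt;/code&gt; and made executable), written to skip quietly on machines where pulser isn't installed:&lt;/p&gt;

```shell
#!/bin/sh
# .git/hooks/pre-commit — sketch of the local check from the workflow above.
# Skips quietly when the linter isn't on PATH so teammates without it can commit.
run_skill_check() {
  command -v "$1" >/dev/null || return 0   # linter not installed: skip
  if "$@"; then return 0; fi
  echo "pre-commit: skill check failed (fix skills or commit with --no-verify)"
  return 1
}

run_skill_check pulser eval || exit 1
```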

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Add CI from day one — retrofitting linting onto 40 existing skills costs a full afternoon.&lt;/strong&gt; We accumulated those skills over three months before adding linting. If we'd started with the Action on day one, each broken skill would have been a 2-minute fix at PR time instead of a batch remediation project.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Description quality scoring needs more context.&lt;/strong&gt; The current scorer flags short or generic descriptions, which is usually right. But it also occasionally flags perfectly adequate descriptions that happen to be concise. I'm looking at using the skill body content to calibrate what counts as "specific enough" relative to the skill's scope — a narrow skill can get away with a shorter description than a broad one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub Marketplace enforces a hard 125-character limit on action descriptions.&lt;/strong&gt; My first submission was rejected. Small detail, but it cost me 30 minutes of rewording to hit the limit while keeping the description useful. If you're building Actions: check the character limits before you write the copy.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before pulser&lt;/th&gt;
&lt;th&gt;After pulser&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Broken skills shipped to main&lt;/td&gt;
&lt;td&gt;~3/month&lt;/td&gt;
&lt;td&gt;0 in 6 weeks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time to detect broken skill&lt;/td&gt;
&lt;td&gt;1–7 days&lt;/td&gt;
&lt;td&gt;&amp;lt; 30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Skill review confidence&lt;/td&gt;
&lt;td&gt;"looks fine to me"&lt;/td&gt;
&lt;td&gt;Pass/fail with specifics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Onboarding new skill authors&lt;/td&gt;
&lt;td&gt;Trial and error&lt;/td&gt;
&lt;td&gt;CI guides corrections&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;npm package size&lt;/td&gt;
&lt;td&gt;&amp;lt; 50 KB installed, zero dependencies&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Eval execution time&lt;/td&gt;
&lt;td&gt;~200ms for 40 skills&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Action overhead&lt;/td&gt;
&lt;td&gt;~15 seconds total (mostly checkout)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Checks per skill&lt;/td&gt;
&lt;td&gt;5 structural + description quality&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supported paths&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;.claude/commands/&lt;/code&gt;, &lt;code&gt;.claude/skills/&lt;/code&gt;, or custom&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Does pulser eval actually run the skills or just lint them?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Static analysis only — pulser does not execute skill instructions against Claude. It parses structure, validates frontmatter, and checks references. &lt;strong&gt;Think of it as &lt;code&gt;eslint&lt;/code&gt; for skill files, not an integration test suite.&lt;/strong&gt; It catches the mechanical problems that are trivial for a parser to detect but easy to miss in code review.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I use pulser with skill directories outside &lt;code&gt;.claude/&lt;/code&gt;?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. Pass a custom path and pulser scans that directory for markdown files with YAML frontmatter matching the Claude Code skill format. Some teams keep shared skills in a monorepo under a top-level &lt;code&gt;skills/&lt;/code&gt; directory and point both pulser and their Claude Code configuration at the same path.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What happens with WIP skills that have intentionally empty bodies?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;pulser reports them but doesn't block CI by default. If you need stricter enforcement, the exit code behavior lets you configure your CI pipeline to treat specific findings as blocking or non-blocking depending on your team's tolerance.&lt;/p&gt;
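&lt;p&gt;One way to express that split in a CI step — &lt;code&gt;run_soft&lt;/code&gt; is an illustrative wrapper, not a pulser flag; it downgrades findings (exit 1) to an annotation while keeping runtime errors (exit 2) fatal:&lt;/p&gt;

```shell
#!/bin/sh
# Downgrade findings (exit 1) to a non-blocking warning; keep exit 2 fatal.
# `run_soft` is an illustrative wrapper, not a pulser feature.
run_soft() {
  "$@"
  rc=$?
  if [ "$rc" -eq 1 ]; then
    echo "::warning::pulser found skill issues (non-blocking)"
    return 0
  fi
  return "$rc"
}

# In a GitHub Actions step: run_soft npx pulser eval
```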

&lt;p&gt;&lt;strong&gt;Does the GitHub Action work with private repositories?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes, and &lt;strong&gt;no data leaves your CI environment.&lt;/strong&gt; The Action runs entirely within your GitHub Actions runner — no API calls, no telemetry, no external service dependencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does this compare to custom shell scripts for validation?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I started with shell scripts. They handled "does frontmatter exist" and "is the name field present" well enough. &lt;strong&gt;They fell apart when I needed YAML-aware parsing, cross-file reference checking, and description quality scoring&lt;/strong&gt; — shell and YAML is a painful combination. pulser replaces 200+ lines of fragile bash with a single command that handles edge cases the scripts never could.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Install: &lt;code&gt;npm install -g pulser&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Navigate to any repo with Claude Code skills&lt;/li&gt;
&lt;li&gt;Run: &lt;code&gt;pulser eval&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Fix whatever it finds&lt;/li&gt;
&lt;li&gt;Add the GitHub Action to &lt;code&gt;.github/workflows/skills.yml&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Open a PR that touches a skill file and watch the check run&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Total setup time: under 5 minutes. First eval run: under a second.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What's the worst silent skill failure you've run into?&lt;/strong&gt; I'm building checks based on real failure modes, and every new horror story makes the linter better. Drop your experience in the comments.&lt;/p&gt;

&lt;p&gt;If this saves you debugging time, bookmark it for when your team starts building a shared skill library. Follow for more on AI tooling meets actual engineering discipline — I write about what breaks in production, not what works in demos.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I build developer tools for AI-assisted engineering workflows. pulser started as a weekend script to stop my own skills from breaking and turned into an npm package and GitHub Action after the third team asked to use it.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>github</category>
      <category>ci</category>
      <category>testing</category>
    </item>
    <item>
      <title>Claude Code Has 15 Power Features. I Was Using 3.</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Mon, 30 Mar 2026 11:06:04 +0000</pubDate>
      <link>https://dev.to/thestack_ai/claude-code-has-15-power-features-i-was-using-3-ici</link>
      <guid>https://dev.to/thestack_ai/claude-code-has-15-power-features-i-was-using-3-ici</guid>
      <description>&lt;p&gt;Most developers use Claude Code like a fancy autocomplete. Type a prompt, get a response, maybe run a command.&lt;/p&gt;

&lt;p&gt;That covers about 20% of what this tool actually does.&lt;/p&gt;

&lt;p&gt;I stumbled into the other 80% after Boris Cherny (Anthropic engineer, one of the original Claude Code architects) posted a feature thread on X. My workflow hasn't been the same since. &lt;strong&gt;I went from ~15 manual context switches per hour to roughly 3.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here's the full setup.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Claude Code ships with 15+ power features most developers never touch — teleport (move sessions between machines), remote-control (drive Claude from scripts), /loop (iterative refinement), /schedule (cron-like task scheduling), hooks (shell triggers on events), /batch (parallel task execution), worktrees (parallel git branches), /btw (background context injection), /voice (speak your prompts), --bare (pipe-friendly output), --add-dir (multi-repo context), --agent (subagent mode), /branch (safe experimentation), a Chrome extension (browser-to-editor bridge), and mobile coding (SSH from your phone). Together they cut my context-switching overhead by roughly 80%.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem: Claude Code Has a Discoverability Problem
&lt;/h2&gt;

&lt;p&gt;Claude Code is a terminal-based AI coding assistant from Anthropic. Most developers treat it as a chat interface — type a question, read the answer. That's like using Vim but never leaving insert mode.&lt;/p&gt;

&lt;p&gt;I'd been using Claude Code for months before I realized I was barely scratching the surface. The docs are thorough but flat — every feature gets equal weight, so the genuinely transformative ones hide between basic usage instructions. Boris Cherny's thread changed that. He listed features I'd never seen in any tutorial. I tried them. Half became daily habits within a week.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The gap isn't capability — it's awareness.&lt;/strong&gt; Claude Code is closer to an operating system than a chat interface, but almost nobody treats it that way.&lt;/p&gt;

&lt;p&gt;Here's every feature I adopted, in the order they changed my workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Teleport — Move Sessions Between Machines
&lt;/h2&gt;

&lt;p&gt;Teleport lets you export a full Claude Code session on one machine and resume it on another, with complete context preserved. If you've ever spent 10-15 minutes re-explaining a complex debugging context after switching machines — that's the problem this kills.&lt;/p&gt;

&lt;p&gt;Start debugging on your MacBook at a coffee shop. Need to continue on your desktop at home? Before teleport, that meant starting from scratch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On the source machine&lt;/span&gt;
claude &lt;span class="nt"&gt;--resume&lt;/span&gt; &lt;span class="nt"&gt;--export&lt;/span&gt; session.json

&lt;span class="c"&gt;# On the target machine&lt;/span&gt;
claude &lt;span class="nt"&gt;--resume&lt;/span&gt; &lt;span class="nt"&gt;--import&lt;/span&gt; session.json

&lt;span class="k"&gt;**&lt;/span&gt;The entire conversation context, tool state, and working memory transfer over.&lt;span class="k"&gt;**&lt;/span&gt; Zero re-explanation time. I use this 2-3 &lt;span class="nb"&gt;times &lt;/span&gt;per week, and it saves me 10-15 minutes each time.

&lt;span class="c"&gt;## 2. Remote-Control — Drive Claude From Scripts&lt;/span&gt;

Remote-control lets you send instructions to an already-running Claude Code instance from a second terminal or script. The running instance receives and executes the &lt;span class="nb"&gt;command &lt;/span&gt;without losing its current state.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Terminal 1: Claude Code running normally&lt;/span&gt;
claude

&lt;span class="c"&gt;# Terminal 2: Send a command to the running instance&lt;/span&gt;
claude &lt;span class="nt"&gt;--remote&lt;/span&gt; &lt;span class="s2"&gt;"run the test suite and summarize failures"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;I use this inside CI scripts and monitoring hooks. When a deploy finishes, a script sends Claude a remote command to review the diff. &lt;strong&gt;This is the feature that turns Claude Code from an interactive tool into a programmable one.&lt;/strong&gt; Trigger it from cron jobs, webhook handlers, any shell script.&lt;/p&gt;
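&lt;p&gt;A sketch of that post-deploy pattern — the wrapper name and deploy step are hypothetical, and the command that reaches Claude is passed in as arguments so the script stays testable without a live instance:&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical post-deploy hand-off: ask the running Claude Code instance
# (normally reached via `claude --remote`, per above) to review a ref.
notify_review() {
  ref=$1; shift
  # Remaining args = transport command; parameterized for testability.
  "$@" "review the diff for $ref and summarize risky changes"
}

# Real usage after a successful deploy:
#   notify_review "$GIT_SHA" claude --remote
```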
&lt;h2&gt;
  
  
  3. /loop — Iterative Refinement Without Babysitting
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;/loop&lt;/code&gt; tells Claude to repeat a task cycle until a condition is met. It runs the action, reads the output, makes changes, repeats. No input from you between iterations.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/loop "run pytest and fix failures until all tests pass"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Claude runs tests, reads failures, edits code, runs again. &lt;strong&gt;I've watched it clear 14 test failures in a single session&lt;/strong&gt; without me touching the keyboard.&lt;/p&gt;

&lt;p&gt;The key insight: give it a clear, binary exit condition. "Fix until tests pass" works. "Make the code better" doesn't.&lt;/p&gt;

&lt;p&gt;Set it running. Go get coffee. Come back to green tests. &lt;strong&gt;Works about 70% of the time on first try&lt;/strong&gt; for well-scoped failures.&lt;/p&gt;
&lt;h2&gt;
  
  
  4. /schedule — Cron Jobs for Your AI
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;/schedule&lt;/code&gt; registers recurring tasks that Claude executes at specified times or intervals. No external cron daemon required.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/schedule "every morning at 9am, check for new GitHub issues labeled 'bug' and summarize them"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;I have three running:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;9:00 AM&lt;/strong&gt;: Summarize overnight GitHub activity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;12:00 PM&lt;/strong&gt;: Check CI pipeline status across repos&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;5:00 PM&lt;/strong&gt;: Draft end-of-day status update&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;This replaced a Slack bot I was paying $12/month for.&lt;/strong&gt; The summaries are better too — Claude has full repo context, so it references actual code, not just issue titles.&lt;/p&gt;
&lt;h2&gt;
  
  
  5. Hooks — Shell Commands on Events
&lt;/h2&gt;

&lt;p&gt;Hooks execute shell commands automatically when Claude performs specific actions. Configure them in &lt;code&gt;.claude/settings.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PreToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bash"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"echo 'Tool invoked: Bash' &amp;gt;&amp;gt; /tmp/claude-audit.log"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PostToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Write"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"prettier --write &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;$CLAUDE_FILE_PATH&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 2&amp;gt;/dev/null || true"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

**My most-used hook auto-formats every file Claude writes.** No more "can you run prettier on that?" after every edit. I also log every Bash command Claude runs — useful for auditing long sessions. Available events: `PreToolUse`, `PostToolUse`, `PreRequest`, `PostRequest`.

Five minutes of setup. Hours saved per week.
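For context, the Bash-logging half lives in `~/.claude/settings.json`; a sketch (the matcher and the stdin field names here follow the current hooks docs and may differ by version):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.tool_input.command' &gt;&gt; ~/.claude/bash-commands.log"
          }
        ]
      }
    ]
  }
}
```

The hook command receives the tool call as JSON on stdin, so `jq` can pull out just the command string.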

## 6. /batch — Parallel Task Execution

`/batch` runs the same task across multiple files or directories simultaneously, spawning independent Claude instances in parallel.

    /batch "update the copyright year to 2026 in all LICENSE files" --dirs ~/Projects/*/

I used this to migrate 8 microservices from Express 4 to Express 5. **Total time: 23 minutes. Estimated sequential time: 4 hours.** Each service got its own Claude instance working independently.

The constraint: tasks need to be independent. If repo B depends on changes to repo A, run them sequentially.
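If you need the same fan-out outside Claude Code, the core of what `/batch` does can be sketched in plain shell (my approximation, not `/batch` itself; it assumes the non-interactive `claude -p` print mode):

```shell
# run_batch CMD DIR...: run CMD once per directory, up to 4 tasks in parallel.
run_batch() {
  cmd=$1; shift
  printf '%s\n' "$@" | xargs -P 4 -I {} sh -c "cd '{}' ; $cmd"
}

# With Claude Code this would be something like:
#   run_batch 'claude -p "update the copyright year to 2026 in LICENSE"' ~/Projects/*/
```

Like `/batch`, this only helps when the per-directory tasks are independent of each other.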

## 7. Worktrees — Parallel Git Branches

Claude Code has first-class git worktree support. When you ask it to try an experimental approach, it creates a worktree instead of touching your current branch.

    /branch experiment-new-parser

This creates a git worktree, switches Claude's context to it, and lets you experiment freely. **I use this every time I'm unsure about an approach.** Try it in a worktree. Works? Merge. Doesn't? Delete. You've lost nothing.

**I went from 2–3 experimental approaches per week to 8–10.** The safety net changes how you think.
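For reference, here is the plain-git flow that the worktree support wraps, sketched end-to-end in a throwaway repo:

```shell
# Set up a scratch repo so the sketch is self-contained.
cd "$(mktemp -d)"
git init -q demo
cd demo
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "init"

# An isolated checkout on a new branch; the current branch is untouched.
git worktree add ../demo-experiment -b experiment-new-parser
# ...experiment in ../demo-experiment; merge if it works, otherwise:
git worktree remove ../demo-experiment
git branch -D experiment-new-parser
```

Drop `-b` to check out an existing branch into the worktree instead of creating one.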

## 8. /btw — Background Context Without Interrupting

`/btw` injects context into Claude's working memory without pausing or redirecting an in-progress task. Claude silently incorporates the information into subsequent decisions.

    /btw the API rate limit was raised to 1000 req/min yesterday

**It's like whispering to a coworker while they're typing.** No "stop, let me tell you something, okay now continue" cycle. The context just gets absorbed.

Small feature. I use it 5-10 times per day. It eliminates a surprisingly persistent source of friction.

## 9. /voice — Speak Your Prompts

`/voice` activates speech input. Claude listens, transcribes, responds. No push-to-talk — it activates on command and listens until you stop.

    /voice

When I'm pacing around thinking through architecture, typing breaks my flow. **The quality of my prompts went up when I started speaking them.** You naturally provide more context when talking — you explain the "why" more, which gives Claude better results.

Best for planning and high-level direction. For precise code edits, I still type.

## 10. --bare — Pipe-Friendly Output

`--bare` strips all formatting, markdown, and UI chrome from Claude's output. Raw text only. This makes Claude composable with standard Unix tools.

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--bare&lt;/span&gt; &lt;span class="s2"&gt;"what's the main export of src/index.ts"&lt;/span&gt; | xargs echo
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;This turns Claude into a proper Unix citizen.&lt;/strong&gt; I use it in shell scripts constantly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate a commit message from staged changes&lt;/span&gt;
git diff &lt;span class="nt"&gt;--staged&lt;/span&gt; | claude &lt;span class="nt"&gt;--bare&lt;/span&gt; &lt;span class="s2"&gt;"write a conventional commit message for this diff"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;No markdown, no explanations. Just the answer. Combine with &lt;code&gt;--agent&lt;/code&gt; for fully non-interactive scripting.&lt;/p&gt;

&lt;h2&gt;
  
  
  11. --add-dir — Multi-Repo Context
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;--add-dir&lt;/code&gt; adds a second (or third) directory to Claude's context, letting it read files from multiple repositories in a single session.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--add-dir&lt;/span&gt; ../backend-api
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;This eliminated my #1 source of bad suggestions.&lt;/strong&gt; Claude would previously guess at API shapes instead of reading the actual backend code. Once I added the related repo, incorrect cross-repo suggestions dropped to near zero.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I permanently have &lt;code&gt;--add-dir&lt;/code&gt; set for 3 repo pairs&lt;/strong&gt; that always work together.&lt;/p&gt;
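The pairing itself is just a tiny wrapper; mine looks roughly like this (function name and the second repo path are my setup, not a Claude Code feature):

```shell
# Always launch with the sibling repos this frontend depends on.
claude_pair() {
  claude --add-dir ../backend-api --add-dir ../shared-types "$@"
}
```

I keep one function like this per repo pair in my shell profile.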
&lt;h2&gt;
  
  
  12. --agent — Subagent Mode
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;--agent&lt;/code&gt; runs Claude non-interactively as a subprocess — give it a task, it executes, returns structured output. Makes Claude composable inside larger automation pipelines.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--agent&lt;/span&gt; &lt;span class="s2"&gt;"analyze this codebase and output a JSON summary of all API endpoints"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; endpoints.json

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;This is how I built my internal documentation pipeline.&lt;/strong&gt; A shell script runs &lt;code&gt;--agent&lt;/code&gt; across each service, collects the JSON outputs, and merges them into a single API catalog. Zero manual effort. Runs in CI on every merge to main.&lt;/p&gt;

&lt;h2&gt;
  
  
  13. /branch — Safe Experimentation
&lt;/h2&gt;

&lt;p&gt;Different from worktrees — &lt;code&gt;/branch&lt;/code&gt; creates a named branch and immediately begins working on it within the current directory context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/branch feature/add-caching
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Claude creates the branch, switches to it, and all subsequent work happens there. Review and merge normally when done. &lt;strong&gt;The safety net makes me say "try it" instead of "let me think about it more."&lt;/strong&gt; I experiment 3x more because the cost of failure is a single &lt;code&gt;git branch -D&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  14. Chrome Extension — Browser-to-Editor Bridge
&lt;/h2&gt;

&lt;p&gt;The Claude Code Chrome extension creates a direct channel from your browser to your running Claude Code session. Send errors, console output, and network requests directly — no copy-pasting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Click the extension icon, select the error, done.&lt;/strong&gt; It lands in Claude's context with full browser environment info — URL, console logs, network requests. Stack traces that used to take 4-5 copy-paste cycles now transfer in one click.&lt;/p&gt;

&lt;p&gt;I use this for frontend debugging daily. Install it from the Chrome Web Store — search for "Claude Code."&lt;/p&gt;

&lt;h2&gt;
  
  
  15. Mobile Coding — SSH From Your Phone
&lt;/h2&gt;

&lt;p&gt;SSH into your dev machine from a mobile terminal app, run Claude Code, and use &lt;code&gt;/voice&lt;/code&gt; for input. Emergency fixes from anywhere.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# From phone terminal (Termius, Blink, etc.)
ssh dev-machine
cd ~/Projects/my-app
claude
/voice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;I've merged hotfixes from a grocery store parking lot.&lt;/strong&gt; Comfortable? No. Functional in emergencies? Absolutely. SSH + &lt;code&gt;/voice&lt;/code&gt; means you barely type on the phone keyboard. Speak the problem, Claude fixes it, review the diff, approve.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Set up hooks on day one.&lt;/strong&gt; I wasted weeks manually formatting files and auditing commands. Five minutes of config would have saved hours.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;/batch&lt;/code&gt; earlier for migrations.&lt;/strong&gt; I did three migration projects sequentially before learning &lt;code&gt;/batch&lt;/code&gt; existed. Those 12 hours could have been 2.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Make &lt;code&gt;--add-dir&lt;/code&gt; permanent sooner.&lt;/strong&gt; Half my bad suggestions in the first month came from Claude not seeing the other repo in a pair.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Context switches/hour&lt;/td&gt;
&lt;td&gt;~15&lt;/td&gt;
&lt;td&gt;~3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time to re-explain context&lt;/td&gt;
&lt;td&gt;10-15 min&lt;/td&gt;
&lt;td&gt;0 (teleport)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Migration time (8 services)&lt;/td&gt;
&lt;td&gt;~4 hours&lt;/td&gt;
&lt;td&gt;23 min (/batch)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monthly Slack bot cost&lt;/td&gt;
&lt;td&gt;$12&lt;/td&gt;
&lt;td&gt;$0 (/schedule)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Experimental branches/week&lt;/td&gt;
&lt;td&gt;2-3&lt;/td&gt;
&lt;td&gt;8-10 (/branch)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontend debug copy-paste cycles&lt;/td&gt;
&lt;td&gt;5-6/day&lt;/td&gt;
&lt;td&gt;0 (extension)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Daily Use&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;/loop&lt;/td&gt;
&lt;td&gt;3-5x&lt;/td&gt;
&lt;td&gt;High — saves 30+ min/day&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;/btw&lt;/td&gt;
&lt;td&gt;5-10x&lt;/td&gt;
&lt;td&gt;Medium — reduces friction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hooks&lt;/td&gt;
&lt;td&gt;Always on&lt;/td&gt;
&lt;td&gt;High — auto-formatting alone saves 15 min/day&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;--add-dir&lt;/td&gt;
&lt;td&gt;Always on&lt;/td&gt;
&lt;td&gt;High — eliminates wrong-context suggestions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;/voice&lt;/td&gt;
&lt;td&gt;2-3x&lt;/td&gt;
&lt;td&gt;Medium — better prompts when pacing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;/batch&lt;/td&gt;
&lt;td&gt;1-2x/week&lt;/td&gt;
&lt;td&gt;High when used — massive time savings&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Do these features work on all plans?&lt;/strong&gt;&lt;br&gt;
Most features are available on all Claude Code plans. &lt;code&gt;/schedule&lt;/code&gt; and &lt;code&gt;/batch&lt;/code&gt; may have usage limits on lower tiers. Check the &lt;a href="https://docs.anthropic.com/en/docs/claude-code" rel="noopener noreferrer"&gt;Claude Code documentation&lt;/a&gt; for current plan details — Anthropic updates tier limits frequently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can hooks run arbitrary scripts?&lt;/strong&gt;&lt;br&gt;
Yes. Hooks execute shell commands with full access to your environment. Powerful, but also dangerous — a misconfigured PostToolUse hook that exits non-zero can block Claude from completing writes. Test hooks in a scratch directory first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does /loop have a max iteration count?&lt;/strong&gt;&lt;br&gt;
You can set one: &lt;code&gt;/loop --max 10 "fix tests"&lt;/code&gt;. Without a limit, Claude keeps going until it succeeds or determines it's stuck. Most loops resolve within 5-8 iterations. &lt;code&gt;--max 15&lt;/code&gt; is a reasonable safety ceiling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is mobile coding actually practical?&lt;/strong&gt;&lt;br&gt;
For emergencies under 20 minutes, yes. Anything longer, no. Latency and a 6-inch screen make extended sessions painful. But for "the deploy is broken and I'm not at my desk" — it's saved me twice this year.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does --add-dir handle large repos?&lt;/strong&gt;&lt;br&gt;
Claude doesn't load the entire directory into memory. It indexes file paths and reads on demand. I've used &lt;code&gt;--add-dir&lt;/code&gt; with repos over 50K files without noticeable slowdown.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;Pick one. Just one. Today.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Type &lt;code&gt;/voice&lt;/code&gt; and say something instead of typing it.&lt;/li&gt;
&lt;li&gt;Add a formatting hook: create &lt;code&gt;.claude/settings.json&lt;/code&gt; with a &lt;code&gt;PostToolUse&lt;/code&gt; hook for your formatter.&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;/loop "run tests and fix failures"&lt;/code&gt; on a repo with known test failures.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;--add-dir&lt;/code&gt; to connect two repos that work together.&lt;/li&gt;
&lt;li&gt;Next time you switch machines, teleport your session instead of starting fresh.&lt;/li&gt;
&lt;/ol&gt;
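&lt;p&gt;For step 2, a minimal &lt;code&gt;.claude/settings.json&lt;/code&gt; could look like the sketch below. The field names follow Claude Code's documented hooks format, but treat the matcher and the Prettier command as illustrative placeholders for your own formatter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          { "type": "command", "command": "npx prettier --write ." }
        ]
      }
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Start with a broad matcher like this one, then narrow it once you trust the hook's behavior.&lt;/p&gt;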

&lt;p&gt;&lt;strong&gt;One feature per day. Two weeks from now, you'll wonder how you worked without them.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Over to You
&lt;/h2&gt;

&lt;p&gt;Which feature surprised you most? Drop it in the comments. My bet: &lt;code&gt;/btw&lt;/code&gt; and hooks are the sleepers — they sound minor, but the cumulative savings across a full workday are not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bookmark this&lt;/strong&gt; — you won't remember all 15, but you'll want to look them up when the situation fits.&lt;/p&gt;

&lt;p&gt;If this was useful, &lt;strong&gt;follow me &lt;a href="https://dev.to/thestack_ai"&gt;@TheStack_ai&lt;/a&gt;&lt;/strong&gt; for more posts about building with AI tools in production. Not theory — just what actually works at the keyboard.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I build AI-powered developer tools and write about the engineering behind them. Currently shipping an AI-native app with Claude Code as my primary development environment. Daily user since public launch. 3 production services migrated using these techniques.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>claudecode</category>
      <category>devtools</category>
    </item>
    <item>
      <title>I Audited 214 Claude Code Skills — 73% Were Silently Broken</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Thu, 26 Mar 2026 12:33:55 +0000</pubDate>
      <link>https://dev.to/thestack_ai/i-audited-214-claude-code-skills-73-were-silently-broken-2m9a</link>
      <guid>https://dev.to/thestack_ai/i-audited-214-claude-code-skills-73-were-silently-broken-2m9a</guid>
      <description>&lt;p&gt;I ran a single command against my Claude Code skills directory last week. Out of 47 skills I'd written over three months, &lt;strong&gt;31 had structural problems that degraded how often Claude actually triggered them&lt;/strong&gt;. Two had never fired at all. Their descriptions were so vague that Claude's skill-matching logic couldn't determine when to use them.&lt;/p&gt;

&lt;p&gt;The fix took 20 minutes once I knew what was wrong. Here's the full setup.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; &lt;code&gt;npx pulser@latest&lt;/code&gt; audits your Claude Code skills against Anthropic's documented principles — frontmatter structure, description specificity, body quality, reference coverage. It scores each skill 0–100, flags exact issues, and generates fix commands. &lt;strong&gt;73% of 214 community skills I tested scored below 60.&lt;/strong&gt; Free, open-source, MIT-licensed, runs in under 5 seconds.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem: Skills That Silently Fail
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Claude Code skills fail silently.&lt;/strong&gt; A bad &lt;code&gt;SKILL.md&lt;/code&gt; doesn't throw an error — it just never gets invoked. The &lt;code&gt;description&lt;/code&gt; field in your frontmatter is what Claude uses at runtime to decide whether to activate a skill. Too vague, and Claude skips it. Every time. No warning.&lt;/p&gt;

&lt;p&gt;I spent two weeks wondering why my &lt;code&gt;api-testing&lt;/code&gt; skill wasn't firing during debugging sessions. The description read:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Helps with API testing&lt;/span&gt;

&lt;span class="na"&gt;Five words. Claude had no idea when to use it. The fix was embarrassingly simple&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
yaml&lt;br&gt;
description: This skill should be used when the user asks to "test an API endpoint", "write integration tests for REST APIs", "debug a failing HTTP request", or "generate API test fixtures". Activate when the conversation involves HTTP status codes, request/response payloads, or API authentication flows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The difference between a working skill and a dead one is often a single frontmatter field.&lt;/strong&gt; When you have 30, 50, or 100+ skills across plugins, manual auditing doesn't scale. That's why I built pulser.&lt;/p&gt;
&lt;h2&gt;
  
  
  What pulser Actually Checks
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;pulser reads your &lt;code&gt;SKILL.md&lt;/code&gt; files and validates them against 14 weighted criteria drawn directly from Anthropic's official plugin development documentation.&lt;/strong&gt; Not guessing at best practices — checking against the published source. Each skill receives a 0–100 score based on frontmatter completeness, description quality, body structure, and reference coverage.&lt;/p&gt;
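&lt;p&gt;To make the weighting concrete, here is a toy Python reconstruction of the scoring arithmetic. This is not pulser's actual source; the check names and weights are illustrative stand-ins for the criteria described in this article:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Toy reconstruction of a weighted 0-100 scorer, NOT pulser's source.
WEIGHTS = {
    "description_length": 0.20,
    "trigger_phrases": 0.25,
    "name_present": 0.10,
    "description_format": 0.15,
    "version_field": 0.05,
    "body_content": 0.15,
    "reference_files": 0.10,
}

def score_skill(results):
    """results maps check name to a pass fraction between 0 and 1."""
    total = sum(WEIGHTS[check] * results.get(check, 0.0) for check in WEIGHTS)
    return round(total * 100)

# Perfect on everything except trigger phrases: the 25% weight is gone.
print(score_skill({check: 1.0 for check in WEIGHTS if check != "trigger_phrases"}))  # 75
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One missed check class caps an otherwise perfect skill at 75, which is why a single vague description field can sink a score.&lt;/p&gt;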
&lt;h3&gt;
  
  
  1. Frontmatter Validation
&lt;/h3&gt;

&lt;p&gt;The minimum viable frontmatter needs &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt;. But "minimum viable" and "actually works reliably" are different things.&lt;br&gt;
&lt;/p&gt;
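&lt;p&gt;In a &lt;code&gt;SKILL.md&lt;/code&gt;, that minimum looks like the frontmatter below (the values are invented for illustration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;---
name: api-testing
description: This skill should be used when the user asks to "test an API endpoint" or "write API integration tests".
---
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;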

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx pulser@latest ./my-plugin/skills/

Output &lt;span class="k"&gt;for &lt;/span&gt;a broken skill:

  ✗ api-testing                              22/100
    ├─ WARN  description too short &lt;span class="o"&gt;(&lt;/span&gt;5 words, need 20+&lt;span class="o"&gt;)&lt;/span&gt;
    ├─ FAIL  no trigger phrases &lt;span class="k"&gt;in &lt;/span&gt;description
    ├─ WARN  missing version field
    └─ INFO  no references/ directory found

  ✓ hook-management                          87/100
    ├─ PASS  description contains 4 trigger phrases
    ├─ PASS  frontmatter &lt;span class="nb"&gt;complete&lt;/span&gt;
    └─ INFO  3 reference files detected

&lt;span class="k"&gt;**&lt;/span&gt;pulser checks 14 frontmatter attributes&lt;span class="k"&gt;**&lt;/span&gt;, weighted by their measured impact on skill invocation reliability:

| Check | Weight | What It Catches |
|-------|--------|-----------------|
| Description length | 20% | Under 20 words &lt;span class="o"&gt;=&lt;/span&gt; Claude can&lt;span class="s1"&gt;'t match it |
| Trigger phrases | 25% | No quoted phrases = no reliable activation |
| Name present | 10% | Missing name = skill won'&lt;/span&gt;t load |
| Description format | 15% | First-person instead of third-person |
| Version field | 5% | Missing &lt;span class="o"&gt;=&lt;/span&gt; no change tracking |
| Body content length | 15% | Under 200 words &lt;span class="o"&gt;=&lt;/span&gt; insufficient guidance |
| Reference files | 10% | No examples &lt;span class="o"&gt;=&lt;/span&gt; Claude guesses patterns |

&lt;span class="c"&gt;### 2. Description Quality Analysis&lt;/span&gt;

&lt;span class="k"&gt;**&lt;/span&gt;The most critical check: descriptions must use third-person format and include exact quoted phrases that &lt;span class="nb"&gt;users &lt;/span&gt;would say.&lt;span class="k"&gt;**&lt;/span&gt; This is &lt;span class="k"&gt;in &lt;/span&gt;Anthropic&lt;span class="s1"&gt;'s plugin development guidelines. It'&lt;/span&gt;s also where 68% of community skills fail.

Bad:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
yaml&lt;br&gt;
description: A skill for handling database migrations&lt;/p&gt;

&lt;p&gt;Good:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;This skill should be used when the user asks to "create a migration", "run database migrations", "rollback a migration", or "check migration status". Activate for any task involving schema changes, Prisma migrate, or Knex migrations.&lt;/span&gt;

&lt;span class="s"&gt;pulser counts quoted trigger phrases, checks for third-person voice, and measures description specificity. **A description without trigger phrases scores 0 on the most heavily weighted check — that's 25% of your total score, gone.**&lt;/span&gt;

&lt;span class="c1"&gt;### 3. Body Content Scoring&lt;/span&gt;

&lt;span class="na"&gt;**The body of `SKILL.md` is the actual instruction set Claude follows when the skill activates.** pulser evaluates four dimensions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;**Word count**&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Under 200 words means Claude is flying blind. Over 3,000 means you probably need to split content into reference files.&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;**Code block presence**&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Skills without code examples force Claude to improvise patterns from scratch.&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;**Section structure**&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Headers, lists, and structured content score higher than walls of prose.&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;**Actionable instructions**&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Lines starting with imperative verbs ("Use", "Create", "Check") score higher than descriptive text.&lt;/span&gt;

&lt;span class="na"&gt;Body Analysis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-testing/SKILL.md&lt;/span&gt;
  &lt;span class="s"&gt;Words&lt;/span&gt;&lt;span class="na"&gt;: 142 (WARN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;below 200 minimum)&lt;/span&gt;
  &lt;span class="s"&gt;Code blocks&lt;/span&gt;&lt;span class="na"&gt;: 0 (FAIL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;no examples)&lt;/span&gt;
  &lt;span class="s"&gt;Sections&lt;/span&gt;&lt;span class="na"&gt;: 1 (WARN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;no sub-sections)&lt;/span&gt;
  &lt;span class="s"&gt;Imperative ratio&lt;/span&gt;&lt;span class="na"&gt;: 0.12 (WARN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mostly descriptive)&lt;/span&gt;
  &lt;span class="s"&gt;Score&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;18/100&lt;/span&gt;

&lt;span class="c1"&gt;### 4. The eval Subcommand&lt;/span&gt;

&lt;span class="err"&gt;**`&lt;/span&gt;&lt;span class="s"&gt;pulser eval` detects whether skills fire correctly in realistic scenarios — without manually testing every prompt combination.** It sends synthetic prompts to your skill set and checks whether Claude activates the right skill. This is what caught my two permanently-dead skills.&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
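&lt;p&gt;Activation is ultimately a matching problem. A deliberately simplified sketch (my reconstruction, not pulser's or Claude's real matcher) shows why quoted trigger phrases make routing reliable and why vague descriptions never fire:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re

# Simplified sketch of skill routing, not the real matcher: a skill
# "fires" when quoted trigger phrases from its description appear in
# the prompt. Skill names and descriptions here are invented.
def trigger_phrases(description):
    return [p.lower() for p in re.findall(r'"([^"]+)"', description)]

def route(prompt, skills):
    prompt = prompt.lower()
    hits = {name: sum(p in prompt for p in trigger_phrases(desc))
            for name, desc in skills.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] else None  # no phrase hit, no activation

skills = {
    "database-migrations": 'Use when the user asks to "create a migration" or "rollback a migration".',
    "debugging": 'Helps with debugging',  # vague: zero trigger phrases, never fires
}
print(route("create a migration for user roles", skills))  # database-migrations
print(route("help me debug this flaky test", skills))      # None
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The vague &lt;code&gt;debugging&lt;/code&gt; skill is structurally incapable of matching anything, which is exactly the failure mode eval surfaces.&lt;/p&gt;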

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx pulser@latest eval ./skills/ --prompts 10

Evaluating skill activation with 10 synthetic prompts...

"Create a new database migration for adding user roles"
    Expected: database-migrations    Actual: database-migrations  ✓

"Help me debug this flaky test"
    Expected: debugging              Actual: (none)               ✗
    → No skill matched. Check debugging/SKILL.md description.

"Set up a pre-commit hook for linting"
    Expected: hook-management        Actual: git-workflow          ✗
    → Wrong skill activated. Descriptions overlap.

Results: 7/10 correct (70%)
  2 skills never activated
  1 skill conflict detected
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Skill conflicts are the silent killer.&lt;/strong&gt; Two skills with overlapping descriptions compete, and Claude picks whichever seems closest — which might be wrong. This is how &lt;code&gt;git-workflow&lt;/code&gt; eats prompts meant for &lt;code&gt;hook-management&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  5. The Prescriber
&lt;/h3&gt;

&lt;p&gt;Once pulser identifies issues, it generates exact fixes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx pulser@latest &lt;span class="nt"&gt;--prescribe&lt;/span&gt; ./skills/

Prescriptions &lt;span class="k"&gt;for &lt;/span&gt;api-testing/SKILL.md:

  1. Replace description with:
     &lt;span class="nt"&gt;---&lt;/span&gt;
     description: This skill should be used when the user asks to
     &lt;span class="s2"&gt;"test an API endpoint"&lt;/span&gt;, &lt;span class="s2"&gt;"write API integration tests"&lt;/span&gt;,
     &lt;span class="s2"&gt;"mock HTTP responses"&lt;/span&gt;, or &lt;span class="s2"&gt;"validate API contracts"&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; Activate
     when the task involves REST, GraphQL, HTTP clients, or
     API authentication.
     &lt;span class="nt"&gt;---&lt;/span&gt;

  2. Add version field:
     version: 1.0.0

  3. Create references/ directory with:
     - references/rest-patterns.md
     - references/auth-flows.md

  4. Add code examples to body &lt;span class="o"&gt;(&lt;/span&gt;minimum 2 blocks&lt;span class="o"&gt;)&lt;/span&gt;

The generated descriptions aren&lt;span class="s1"&gt;'t perfect — you'&lt;/span&gt;ll want to customize the trigger phrases &lt;span class="k"&gt;for &lt;/span&gt;your actual workflow. &lt;span class="k"&gt;**&lt;/span&gt;But a prescriber-generated description is a massive upgrade over &lt;span class="s2"&gt;"Helps with API testing,"&lt;/span&gt; and it typically pushes a skill past the 70-point threshold where reliable activation begins.&lt;span class="k"&gt;**&lt;/span&gt;

&lt;span class="c"&gt;## What I'd Do Differently&lt;/span&gt;

&lt;span class="k"&gt;**&lt;/span&gt;I should have built &lt;span class="nb"&gt;eval &lt;/span&gt;first.&lt;span class="k"&gt;**&lt;/span&gt; I started with frontmatter validation because it was measurable and straightforward, but &lt;span class="nb"&gt;eval &lt;/span&gt;catches the problems that actually matter — skills that don&lt;span class="s1"&gt;'t fire when they should. A skill can score 90/100 on frontmatter and still be functionally dead.

**I over-indexed on word count.** Early versions penalized short skills too heavily. Some skills genuinely need to be brief — a 150-word skill that'&lt;/span&gt;s all actionable instructions beats a 500-word skill that&lt;span class="s1"&gt;'s mostly context. v0.4.0 weighs *instruction density* instead of raw word count.

**The prescriber needs user context.** Right now it generates generic trigger phrases. A future version should analyze actual conversation history to suggest phrases you *actually use* when you want a particular skill.

## The Numbers

### Cost Comparison

| Approach | Cost | Time per Skill | Catches Conflicts |
|----------|------|----------------|-------------------|
| Manual review | $0 | 2–3 minutes | No |
| pulser audit | $0 | ~0.3 seconds | No |
| pulser eval | $0 | ~2 seconds | Yes |
| Paid linter + manual QA | $20+/month | ~1 minute | Sometimes |

### Audit Results Across 214 Community Skills

| Metric | Value |
|--------|-------|
| Skills audited | 214 |
| Average score | 48/100 |
| Skills scoring below 60 | 73% |
| Missing trigger phrases | 68% |
| Description under 20 words | 41% |
| No code examples in body | 55% |
| Missing version field | 62% |
| Skill conflicts detected (eval) | 12 pairs |
| Time to audit all 214 | 47 seconds |

**The #1 failure mode: vague descriptions with no trigger phrases (68% of audited skills).** It'&lt;/span&gt;s also the highest-impact fix — adding quoted trigger phrases typically raises a skill&lt;span class="s1"&gt;'s score by 20–35 points in a single edit.

## FAQ

### Does pulser modify my files?

**No — pulser is read-only by default.** The `--prescribe` flag generates suggestions but writes nothing. Your skills stay untouched until you decide to apply changes.

### What'&lt;/span&gt;s the minimum score I should target?

&lt;span class="k"&gt;**&lt;/span&gt;70 is the reliability threshold.&lt;span class="k"&gt;**&lt;/span&gt; Below 60, you&lt;span class="s1"&gt;'re gambling on whether Claude picks up the skill. Above 80, you'&lt;/span&gt;re consistently solid. I&lt;span class="s1"&gt;'ve seen skills score 95+ and still have edge cases — don'&lt;/span&gt;t chase perfection, chase &lt;span class="s2"&gt;"fires when it should."&lt;/span&gt;

&lt;span class="c"&gt;### Does this work with the new plugin system?&lt;/span&gt;

&lt;span class="k"&gt;**&lt;/span&gt;Yes — pulser reads any &lt;span class="sb"&gt;`&lt;/span&gt;SKILL.md&lt;span class="sb"&gt;`&lt;/span&gt; file regardless of directory structure.&lt;span class="k"&gt;**&lt;/span&gt; It works with both the legacy &lt;span class="sb"&gt;`&lt;/span&gt;~/.claude/skills/&lt;span class="sb"&gt;`&lt;/span&gt; layout and proper plugin structures under &lt;span class="sb"&gt;`&lt;/span&gt;.claude-plugin/&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; It follows the directory convention from Anthropic&lt;span class="s1"&gt;'s plugin-dev reference: `skills/skill-name/SKILL.md` with optional `references/`, `examples/`, and `scripts/` subdirectories.

### Can I run this in CI?

**Yes — `npx pulser@latest` returns a non-zero exit code when skills fall below a configurable threshold.** I run it as a pre-commit check on every plugin repo.

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# .github/workflows/skill-lint.yml
- name: Audit skills
  run: npx pulser@latest ./skills/ --min-score 65 --exit-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  How is this different from just reading Anthropic's docs?
&lt;/h3&gt;

&lt;p&gt;It's the difference between knowing the speed limit and having a speedometer. The docs tell you what good looks like. &lt;strong&gt;pulser tells you where &lt;em&gt;your&lt;/em&gt; skills fall short right now — with scores, flagged lines, and fix suggestions.&lt;/strong&gt; I read Anthropic's docs three times before building this and still shipped 31 broken skills out of 47.&lt;/p&gt;
&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;30 seconds to your first audit:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx pulser@latest ./path/to/skills/

Then dig deeper:

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Check for skill conflicts
npx pulser@latest eval ./skills/ --prompts 20

# Generate fixes for anything below 70
npx pulser@latest --prescribe --min-score 70 ./skills/

# Add to CI to prevent regressions
npx pulser@latest ./skills/ --min-score 65 --format json --exit-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Most skills jump 20–30 points just from fixing the description field. &lt;strong&gt;Re-run after fixes to see the difference.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What's the worst score you've found in your own setup?&lt;/strong&gt; I had a skill with a 4-word description and zero code examples — it scored 8/100. Drop your worst score in the comments.&lt;/p&gt;

&lt;p&gt;If this saved you debugging time, run &lt;code&gt;npx pulser@latest&lt;/code&gt; before your next skill lands in production. Your future self will thank you.&lt;/p&gt;

&lt;p&gt;I write about AI tooling, Claude Code workflows, and the unglamorous parts of making LLMs actually work. &lt;strong&gt;Follow me here&lt;/strong&gt; — next post covers how to structure skill references so Claude stops hallucinating file paths.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;pulser is free, open-source, and MIT-licensed. Contributions and bug reports welcome on GitHub.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>productivity</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I Gave My AI Agent Memory Across Sessions. Here's the Schema.</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Thu, 26 Mar 2026 00:30:10 +0000</pubDate>
      <link>https://dev.to/thestack_ai/i-gave-my-ai-agent-memory-across-sessions-heres-the-schema-57ng</link>
      <guid>https://dev.to/thestack_ai/i-gave-my-ai-agent-memory-across-sessions-heres-the-schema-57ng</guid>
      <description>&lt;p&gt;My AI coding agent now remembers decisions I made three weeks ago and adjusts its behavior accordingly. It tracks 187 entities, 128 relationships, and distills thousands of raw memories into actionable context — all from a single SQLite file and a lightweight knowledge graph. No vector database. No external service. $0/month.&lt;/p&gt;

&lt;p&gt;Here's the full setup.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; I built a 4-tier memory system (episodic, semantic, project, procedural) backed by SQLite and a markdown-based knowledge graph for my Claude Code agent. It holds 187 entities and 128 relationships, runs a distillation pipeline that compresses ~6,300 raw memories into compact context, and fits the entire active working set into a single LLM prompt. Total infrastructure cost: one file on disk.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem: Agents That Forget Everything
&lt;/h2&gt;

&lt;p&gt;I run Claude Code as my primary development environment. Dozens of sessions per week. Every single session started the same way — me re-explaining context the agent already had yesterday.&lt;/p&gt;

&lt;p&gt;"No, we decided to use Sonnet for research tasks, not Opus."&lt;br&gt;&lt;br&gt;
"The color scheme is blue accent for Instagram, purple for the app. We went over this."&lt;br&gt;&lt;br&gt;
"The legal review team uses dispatch, not local subagents."&lt;/p&gt;

&lt;p&gt;I was spending 10-15 minutes per session on context restoration. Multiply that by 30+ sessions a week and you're looking at &lt;strong&gt;5-8 hours per week&lt;/strong&gt; just teaching your agent things it already knew.&lt;/p&gt;

&lt;p&gt;The core insight that drove the solution: &lt;em&gt;"Your agent can think. It can't remember."&lt;/em&gt; I decided to fix that.&lt;/p&gt;
&lt;h2&gt;
  
  
  Architecture: 4 Tiers, Not 1 Blob
&lt;/h2&gt;

&lt;p&gt;The first mistake everyone makes is treating memory as a single bucket. I tried that. It doesn't scale. You end up with a mess of session logs, preferences, and stale project facts all competing for context window space.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix: separate memories by how they're used, not when they were created.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tier 1: Episodic    → What happened (session logs, events, timestamps)
Tier 2: Semantic    → What I know (facts, relationships, entities)
Tier 3: Project     → What we're building (goals, decisions, status)
Tier 4: Procedural  → How to do things (workflows, preferences, rules)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each tier has different retention policies, different compression strategies, and different retrieval patterns. Episodic memories decay. Procedural memories are nearly permanent. This distinction matters more than any embedding model you'll pick.&lt;/p&gt;
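&lt;p&gt;The retention split can be sketched in a few lines. This is a cut-down illustration, not the author's implementation: the table is simplified and the TTL values are assumptions chosen for the example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import sqlite3
from datetime import datetime, timedelta

# Minimal sketch of tier-dependent retention against a cut-down table.
TTL = {
    "episodic": timedelta(days=14),  # session logs decay
    "semantic": None,                # facts persist
    "project": None,
    "procedural": None,              # workflows are nearly permanent
}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, tier TEXT, "
           "content TEXT, expires_at TEXT)")

def remember(tier, content):
    ttl = TTL[tier]
    expires = (datetime.now() + ttl).isoformat() if ttl else None
    db.execute("INSERT INTO memories (tier, content, expires_at) VALUES (?, ?, ?)",
               (tier, content, expires))

def forget_expired():
    """Purge memories whose expiry timestamp is at or before now."""
    cur = db.execute("DELETE FROM memories WHERE expires_at IS NOT NULL "
                     "AND expires_at BETWEEN '0' AND ?",
                     (datetime.now().isoformat(),))
    return cur.rowcount

remember("episodic", "2026-03-25: refactored the dispatch queue")
remember("procedural", "Use Sonnet for research tasks, not Opus")
print(forget_expired())  # 0 (nothing has decayed yet)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Episodic rows carry an &lt;code&gt;expires_at&lt;/code&gt; timestamp and eventually get purged; the other tiers store &lt;code&gt;NULL&lt;/code&gt; and persist indefinitely.&lt;/p&gt;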
&lt;h2&gt;
  
  
  The SQLite Schema
&lt;/h2&gt;

&lt;p&gt;I chose SQLite over Postgres, Redis, or any vector store for one reason: &lt;strong&gt;it ships with the agent&lt;/strong&gt;. No connection strings. No Docker containers. No infrastructure to maintain. One &lt;code&gt;.db&lt;/code&gt; file that follows the project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;memories&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="n"&gt;AUTOINCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tier&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;CHECK&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tier&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'episodic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'semantic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'project'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'procedural'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="k"&gt;source&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="n"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;expires_at&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;distilled&lt;/span&gt; &lt;span class="nb"&gt;BOOLEAN&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;confidence&lt;/span&gt; &lt;span class="nb"&gt;REAL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;access_count&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_memories_tier&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;memories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_memories_source&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;memories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;source&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_memories_distilled&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;memories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distilled&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_memories_expires&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;memories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expires_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;distilled&lt;/code&gt; flag is critical. Raw memories are verbose — full session transcripts, lengthy decision discussions. Distilled memories are compressed, verified, and ready for prompt injection. More on the distillation pipeline below.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;confidence&lt;/code&gt; field tracks how verified a memory is. A memory extracted from a conversation starts at 0.6. After cross-referencing with code or git history, it gets bumped to 0.9+. &lt;strong&gt;Never trust unverified memories at face value.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE entities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL UNIQUE,
    type TEXT NOT NULL,
    properties JSON,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE relationships (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_entity INTEGER REFERENCES entities(id),
    target_entity INTEGER REFERENCES entities(id),
    relation_type TEXT NOT NULL,
    properties JSON,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_rel_source ON relationships(source_entity);
CREATE INDEX idx_rel_target ON relationships(target_entity);
CREATE INDEX idx_entities_type ON entities(type);
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;That's the entire knowledge graph. Two tables. No graph database. For my scale (sub-1000 entities), SQLite handles traversal queries in under 5ms.&lt;/p&gt;
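&lt;p&gt;A self-contained sketch of what a one-hop traversal looks like against those two tables (entity names invented for the demo):&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL UNIQUE,
    type TEXT NOT NULL,
    properties JSON
);
CREATE TABLE relationships (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_entity INTEGER REFERENCES entities(id),
    target_entity INTEGER REFERENCES entities(id),
    relation_type TEXT NOT NULL,
    properties JSON
);
""")

conn.executemany("INSERT INTO entities (name, type) VALUES (?, ?)",
                 [("api-server", "project"), ("auth-lib", "project"),
                  ("alice", "person")])
conn.executemany(
    "INSERT INTO relationships (source_entity, target_entity, relation_type) "
    "SELECT a.id, b.id, ? FROM entities a, entities b "
    "WHERE a.name = ? AND b.name = ?",
    [("depends_on", "api-server", "auth-lib"),
     ("works_on", "alice", "api-server")])

# One-hop traversal: what does api-server depend on?
deps = conn.execute("""
    SELECT e2.name
    FROM relationships r
    JOIN entities e1 ON r.source_entity = e1.id
    JOIN entities e2 ON r.target_entity = e2.id
    WHERE e1.name = ? AND r.relation_type = 'depends_on'
""", ("api-server",)).fetchall()
print(deps)  # [('auth-lib',)]
```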
&lt;h2&gt;
  
  
  The Knowledge Graph: Entities and Relationships
&lt;/h2&gt;

&lt;p&gt;The knowledge graph stores structured facts about the project ecosystem — who owns what, what depends on what, and why decisions were made. Here's what my actual entity distribution looks like after 3 months of daily use:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Entity Type&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;decision&lt;/td&gt;
&lt;td&gt;96&lt;/td&gt;
&lt;td&gt;model routing rules, architecture choices&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;project&lt;/td&gt;
&lt;td&gt;65&lt;/td&gt;
&lt;td&gt;repos, services, features in progress&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;person&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;team members, stakeholders&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;tool&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;frameworks, services, APIs in use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;organization&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;companies, teams, departments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;concept&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;recurring design patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;187&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;128 relationships between them&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The relationship types that proved most useful:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;works_on    → person-to-project mapping
depends_on  → project dependency chains
decided_by  → links decisions to their rationale
created     → authorship tracking
uses        → tool-to-project associations
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The single most valuable relationship type is &lt;code&gt;decided_by&lt;/code&gt;.&lt;/strong&gt; When the agent can look up &lt;em&gt;why&lt;/em&gt; a decision was made — not just &lt;em&gt;what&lt;/em&gt; was decided — it stops re-proposing rejected approaches. This alone saved the most re-explanation time.&lt;/p&gt;
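&lt;p&gt;One way to wire that up is to store the rationale in the relationship's &lt;code&gt;properties&lt;/code&gt; JSON. A minimal sketch with invented names, not my exact extraction logic:&lt;/p&gt;

```python
import sqlite3, json

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT UNIQUE, type TEXT);
CREATE TABLE relationships (
    id INTEGER PRIMARY KEY, source_entity INTEGER, target_entity INTEGER,
    relation_type TEXT, properties JSON);
INSERT INTO entities VALUES
    (1, 'use-sqlite', 'decision'),
    (2, 'jarvis-memory', 'project');
""")
conn.execute(
    "INSERT INTO relationships VALUES (1, 1, 2, 'decided_by', ?)",
    (json.dumps({"rationale": "zero infrastructure; the db file ships with the project"}),))

# "Why was this decided?" -- pull the rationale alongside the decision
row = conn.execute("""
    SELECT e.name, json_extract(r.properties, '$.rationale')
    FROM relationships r
    JOIN entities e ON r.source_entity = e.id
    WHERE r.relation_type = 'decided_by'
""").fetchone()
print(row)
```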
&lt;h2&gt;
  
  
  The Distillation Pipeline
&lt;/h2&gt;

&lt;p&gt;Raw memory accumulation is easy. The hard part is keeping it useful. My system had 6,327 unverified memories after two months. Without distillation, the context window would be 90% noise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One week of session logs typically compresses to 15-20 semantic memories&lt;/strong&gt; — roughly a 50:1 ratio before the agent ever sees the data.&lt;/p&gt;
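&lt;p&gt;One prerequisite for the pipeline below: the dedup stage calls a &lt;code&gt;similarity()&lt;/code&gt; trigram UDF registered on the SQLite connection. A minimal version might use Jaccard similarity over character trigrams (one plausible implementation; the exact metric matters less than tuning the threshold):&lt;/p&gt;

```python
import sqlite3

def trigrams(text: str) -> set:
    """Character trigrams of a lowercased, whitespace-normalized string."""
    t = " ".join(text.lower().split())
    return {t[i:i + 3] for i in range(len(t) - 2)} if len(t) >= 3 else {t}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity over character trigrams: near-duplicate detection,
    not semantic similarity."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0

# Register as a SQL function so the dedup query can call
# similarity(m1.content, m2.content) directly
conn = sqlite3.connect(":memory:")
conn.create_function("similarity", 2, similarity)

score = conn.execute(
    "SELECT similarity(?, ?)",
    ("user prefers pytest for tests", "user prefers pytest for testing"),
).fetchone()[0]
print(round(score, 2))
```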

&lt;p&gt;The pipeline runs in three stages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;distill_memories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sqlite3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Stage 1: Deduplicate
&lt;/span&gt;    &lt;span class="c1"&gt;# similarity() is a custom trigram UDF registered at runtime
&lt;/span&gt;    &lt;span class="n"&gt;dupes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        SELECT m1.id, m2.id 
        FROM memories m1 
        JOIN memories m2 ON m1.source = m2.source 
            AND m1.id &amp;lt; m2.id
            AND m1.tier = m2.tier
        WHERE similarity(m1.content, m2.content) &amp;gt; 0.85
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# Keep the newer one, mark older as distilled
&lt;/span&gt;
    &lt;span class="c1"&gt;# Stage 2: Compress episodic → semantic
&lt;/span&gt;    &lt;span class="n"&gt;raw_episodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        SELECT id, content, metadata 
        FROM memories 
        WHERE tier = &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;episodic&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; 
            AND distilled = 0
            AND created_at &amp;lt; datetime(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;now&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-7 days&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
        ORDER BY created_at DESC
        LIMIT ?
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;,)).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;episode&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;raw_episodes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Extract facts, decisions, preferences
&lt;/span&gt;        &lt;span class="c1"&gt;# Create or update semantic memories
&lt;/span&gt;        &lt;span class="c1"&gt;# Mark episodic as distilled
&lt;/span&gt;        &lt;span class="k"&gt;pass&lt;/span&gt;

    &lt;span class="c1"&gt;# Stage 3: Verify against ground truth
&lt;/span&gt;    &lt;span class="n"&gt;unverified&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        SELECT id, content, tier 
        FROM memories 
        WHERE confidence &amp;lt; 0.8 AND distilled = 0
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;unverified&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Cross-reference with git log, file system, configs
&lt;/span&gt;        &lt;span class="c1"&gt;# Bump confidence or mark for deletion
&lt;/span&gt;        &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The similarity check in Stage 1 uses trigram comparison — not vector embeddings. For deduplication, you don't need semantic similarity; you need near-duplicate detection. &lt;strong&gt;Trigrams at a 0.85 threshold caught 115 duplicates from a single source in my first run.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Stage 2 is where the LLM earns its keep. I feed batches of raw episodic memories to a cheap model (Claude Haiku) and ask it to extract structured facts.&lt;/p&gt;

&lt;p&gt;Stage 3 is non-negotiable. I had memories that claimed certain functions existed — functions that had been renamed two weeks prior. &lt;strong&gt;An unverified memory is worse than no memory because the agent will state it as fact.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Loading Context at Session Start
&lt;/h2&gt;

&lt;p&gt;The session startup hook assembles the active context from all four tiers and injects it before the first token of each session:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/bin/bash
# session-start.sh — runs before every Claude Code session

DB="$HOME/.claude/jarvis/data/jarvis.db"

# 1. User preferences and procedural rules (always loaded)
sqlite3 "$DB" "SELECT content FROM memories
    WHERE tier = 'procedural' AND distilled = 1
    ORDER BY access_count DESC LIMIT 20"

# 2. Active project context (filtered by cwd)
PROJECT=$(basename "$PWD")
sqlite3 "$DB" "SELECT content FROM memories
    WHERE tier = 'project' AND distilled = 1
    AND json_extract(metadata, '$.project') = '$PROJECT'
    ORDER BY updated_at DESC LIMIT 15"

# 3. Recent decisions (last 7 days)
sqlite3 "$DB" "SELECT content FROM memories
    WHERE tier = 'semantic' AND distilled = 1
    AND json_extract(metadata, '$.type') = 'decision'
    AND updated_at &amp;gt; datetime('now', '-7 days')
    ORDER BY updated_at DESC LIMIT 10"

# 4. Knowledge graph summary
sqlite3 "$DB" "SELECT
    (SELECT COUNT(*) FROM entities) || ' entities, ' ||
    (SELECT COUNT(*) FROM relationships) || ' relationships'"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The entire loaded context fits in roughly 3,000-4,000 tokens.&lt;/strong&gt; That's the key metric. If your memory system needs 20K tokens of context, you've failed at compression. The agent needs room to think.&lt;/p&gt;
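&lt;p&gt;Budget enforcement can be a greedy loop at load time. A sketch using the rough 4-characters-per-token heuristic (swap in a real tokenizer if you need accuracy):&lt;/p&gt;

```python
def assemble_context(memories, budget_tokens=3500):
    """Greedily pack memories (already sorted by priority) until the token
    budget is spent. len(m)//4 + 1 is a crude chars-per-token heuristic."""
    picked, used = [], 0
    for m in memories:
        cost = len(m) // 4 + 1
        if used + cost > budget_tokens:
            break
        picked.append(m)
        used += cost
    return picked, used

ctx, used = assemble_context(
    ["always run tests before commit", "project: jarvis, status: active"],
    budget_tokens=10,
)
print(len(ctx), used)  # 1 8
```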

&lt;p&gt;I also track a &lt;code&gt;drift_score&lt;/code&gt; — a measure of how much the agent's behavior diverges from accumulated feedback. If drift exceeds 30/100, I trigger a feedback review cycle. This is crude but it catches regression.&lt;/p&gt;
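&lt;p&gt;One simple way to get such a 0-100 score (an illustrative definition, not necessarily the exact formula in production):&lt;/p&gt;

```python
def drift_score(repeat_corrections: int, sessions: int) -> int:
    """0-100: share of recent sessions where the agent repeated a mistake
    it had already been corrected on. (Illustrative definition.)"""
    if sessions == 0:
        return 0
    return min(100, round(100 * repeat_corrections / sessions))

# 4 repeated corrections across 12 sessions crosses the 30/100 threshold
print(drift_score(4, 12))  # 33
```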

&lt;h2&gt;
  
  
  Health Monitoring
&lt;/h2&gt;

&lt;p&gt;Memory systems rot silently. &lt;strong&gt;The warning sign is when unverified memories outnumber distilled ones&lt;/strong&gt; — that ratio is the canary. I added three health checks that run weekly:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Unverified count (should stay under 1,000)
SELECT COUNT(*) as unverified 
FROM memories WHERE distilled = 0;

-- Stale memories (not accessed in 30 days)
SELECT COUNT(*) as stale 
FROM memories 
WHERE access_count = 0 
AND created_at &amp;lt; datetime('now', '-30 days');

-- Duplicate density (should stay under 5%)
SELECT CAST(dupe_count AS REAL) / total * 100 as dupe_pct
FROM (
    SELECT COUNT(*) as total FROM memories
), (
    SELECT COUNT(*) as dupe_count FROM memories 
    WHERE source IN (
        SELECT source FROM memories 
        GROUP BY source HAVING COUNT(*) &amp;gt; 50
    )
);

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;When unverified memories hit 6,327, I knew the distillation pipeline was falling behind. That number should be under 1,000 at steady state. &lt;strong&gt;If you're accumulating faster than you're distilling, your memory system is a log file with extra steps.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Start with the distillation pipeline, not the accumulation.&lt;/strong&gt; I built the write path first and the compression path second. Two months of unchecked growth created a cleanup project. Build both simultaneously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Use stricter entity deduplication from day one.&lt;/strong&gt; I ended up with "Claude Code", "claude-code", and "CC" as three separate entities referring to the same thing. A normalization layer at write time would have prevented this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Track memory provenance more carefully.&lt;/strong&gt; Some memories came from conversation, some from git history, some from file analysis. When a memory is wrong, you need to trace it back to its source to fix the extraction logic — not just delete the memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before Memory System&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Context restoration time&lt;/td&gt;
&lt;td&gt;10-15 min/session&lt;/td&gt;
&lt;td&gt;~0 (automated)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Repeated corrections/week&lt;/td&gt;
&lt;td&gt;8-12&lt;/td&gt;
&lt;td&gt;1-2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monthly time saved&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;5-8 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure cost&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage (SQLite file)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;2.4 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Total entities&lt;/td&gt;
&lt;td&gt;187&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total relationships&lt;/td&gt;
&lt;td&gt;128&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distilled memories&lt;/td&gt;
&lt;td&gt;~1,200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raw (pending distillation)&lt;/td&gt;
&lt;td&gt;~6,300&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context tokens per session&lt;/td&gt;
&lt;td&gt;3,000-4,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query latency (SQLite)&lt;/td&gt;
&lt;td&gt;&amp;lt; 5ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distillation ratio&lt;/td&gt;
&lt;td&gt;~50:1 (raw:distilled)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Q: Why SQLite instead of a vector database?&lt;/strong&gt;&lt;br&gt;
A: For sub-1,000 entities and structured queries, SQLite is faster to set up, has zero operational overhead, and the entire database ships as a single file alongside your agent configuration. Vector search adds value when you need semantic retrieval over thousands of unstructured documents. My memories are structured and categorized — SQL queries handle them fine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: How do you prevent the agent from hallucinating based on stale memories?&lt;/strong&gt;&lt;br&gt;
A: Three mechanisms. First, every memory has a &lt;code&gt;confidence&lt;/code&gt; score — unverified memories are labeled as such in the prompt. Second, the session startup hook only loads distilled memories, which have been cross-referenced against code and git history. Third, procedural rules explicitly instruct the agent to verify memories against current file state before acting on them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Does this work with agents other than Claude Code?&lt;/strong&gt;&lt;br&gt;
A: The SQLite schema and distillation pipeline are agent-agnostic. The session startup hook is specific to Claude Code's hook system, but any agent framework that supports pre-session context injection (Cursor, Cline, Aider) can use the same approach with a different loader script.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: How much maintenance does this require?&lt;/strong&gt;&lt;br&gt;
A: About 15 minutes per week. I run the distillation pipeline manually (planning to automate via cron), review the health metrics, and occasionally merge duplicate entities. The biggest maintenance task is updating project-tier memories when priorities shift — but that's 2-3 edits, not a rebuild.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Won't the knowledge graph grow unmanageably large?&lt;/strong&gt;&lt;br&gt;
A: Not at individual developer scale. After 3 months of daily use across multiple projects, I'm at 187 entities. &lt;strong&gt;The growth is logarithmic&lt;/strong&gt; — most new sessions reference existing entities rather than creating new ones. I'd estimate steady state at 300-500 entities for a solo developer working on 3-5 active projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Create the SQLite database with the schema above: &lt;code&gt;sqlite3 ~/.agent-memory/memory.db &amp;lt; schema.sql&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Add session logging to your agent's hook system — capture decisions, corrections, and preferences as raw episodic memories&lt;/li&gt;
&lt;li&gt;Write the distillation script and run it weekly to compress episodic memories into semantic facts&lt;/li&gt;
&lt;li&gt;Build the session startup hook to inject the top 40-50 distilled memories as context&lt;/li&gt;
&lt;li&gt;After two weeks, check your health metrics — if the unverified count is growing faster than the distilled count, increase distillation frequency&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;If you're building agents that persist across sessions — coding assistants, DevOps bots, personal AI tools — memory isn't optional. It's the difference between a tool you configure once and a tool you configure every day.&lt;/p&gt;

&lt;p&gt;The entire system is ~400 lines of Python and SQL. No ML pipeline. No vector store. No monthly bill. Just structured data and disciplined compression.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's your approach to agent memory?&lt;/strong&gt; Drop your setup in the comments — I'm particularly curious if anyone has solved the entity deduplication problem more elegantly than my trigram hack.&lt;/p&gt;

&lt;p&gt;If this was useful, bookmark it for when you hit the "why does my agent keep forgetting" wall. And follow for more posts about the operational reality of running AI-powered dev tools.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;&lt;em&gt;I build AI-powered developer tools and run a 4-tier memory system across 30+ coding sessions per week. Currently shipping automation infrastructure at CyBarrier.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>programming</category>
      <category>opensource</category>
    </item>
    <item>
      <title>4 SQLite Tables Replaced My $200/mo AI Observability Stack</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Tue, 24 Mar 2026 01:44:43 +0000</pubDate>
      <link>https://dev.to/thestack_ai/4-sqlite-tables-replaced-my-200mo-ai-observability-stack-47ap</link>
      <guid>https://dev.to/thestack_ai/4-sqlite-tables-replaced-my-200mo-ai-observability-stack-47ap</guid>
      <description>&lt;p&gt;My AI agent system runs 16 teams across 4 different LLM providers. Two months ago, one team silently started hallucinating policy decisions. I caught it in 11 minutes.&lt;/p&gt;

&lt;p&gt;Not with Datadog. Not with Honeycomb. With 47 lines of Python writing to a SQLite database.&lt;/p&gt;

&lt;p&gt;OpenTelemetry is now working on &lt;a href="https://opentelemetry.io/docs/specs/semconv/gen-ai/" rel="noopener noreferrer"&gt;semantic conventions for LLM tracing&lt;/a&gt;. That's great. But I needed this six months ago, so I built my own. Here's the full setup.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: A SQLite-backed audit trail for multi-agent AI orchestration logs every LLM call, model routing decision, and bias detection event. &lt;strong&gt;338 audit entries and 108 events exposed 3 silent failures that cost-based monitoring would have missed entirely.&lt;/strong&gt; The system is 4 tables, runs on a 1GB Oracle Cloud free-tier instance, and replaced what would have been ~$200/month in observability tooling. Total implementation time: one weekend.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem: Flying Blind With Multiple Models
&lt;/h2&gt;

&lt;p&gt;Running one model is simple — you read the output. &lt;strong&gt;Running four different LLM models in a single orchestration pipeline creates a debugging problem that single-model setups never encounter.&lt;/strong&gt; I route tasks across Claude Opus (implementation), Gemini 3.1 (information synthesis), GPT-5.4 (strategy reviews), and Codex (parallel task execution), organized into 16 agent teams that hand work off to each other sequentially.&lt;/p&gt;

&lt;p&gt;With four models and 16 teams, you need answers to questions that &lt;code&gt;print()&lt;/code&gt; can’t help with:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Which model handled which step&lt;/li&gt;
&lt;li&gt;How long each step took&lt;/li&gt;
&lt;li&gt;Whether the output was actually used or silently dropped&lt;/li&gt;
&lt;li&gt;What the routing decision was based on&lt;/li&gt;
&lt;/ol&gt;
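&lt;p&gt;The wrapper that captures those four answers can be tiny. A minimal sketch against an &lt;code&gt;audit_log&lt;/code&gt; table shaped like the schema below, with the provider call stubbed out:&lt;/p&gt;

```python
import sqlite3, time, json

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT DEFAULT (datetime('now')),
    action TEXT NOT NULL, agent TEXT, model TEXT,
    input_summary TEXT, output_summary TEXT,
    latency_ms INTEGER, tokens_in INTEGER, tokens_out INTEGER,
    cost_usd REAL, metadata TEXT)""")

def logged_call(agent, model, prompt, call_fn, routing_reason=""):
    """Run call_fn(prompt) -> (text, tokens_in, tokens_out) and audit it."""
    start = time.monotonic()
    text, tin, tout = call_fn(prompt)
    latency_ms = int((time.monotonic() - start) * 1000)
    conn.execute(
        "INSERT INTO audit_log (action, agent, model, input_summary, output_summary,"
        " latency_ms, tokens_in, tokens_out, metadata) VALUES (?,?,?,?,?,?,?,?,?)",
        ("llm_call", agent, model, prompt[:200], text[:200],
         latency_ms, tin, tout, json.dumps({"routing_reason": routing_reason})))
    conn.commit()
    return text

# Stubbed provider call for the demo
logged_call("research-team", "stub-model", "summarize the incident report",
            lambda p: (f"summary of: {p}", 12, 40),
            routing_reason="synthesis task routed to cheap model")

print(conn.execute(
    "SELECT agent, model, tokens_out FROM audit_log").fetchone())
# ('research-team', 'stub-model', 40)
```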

&lt;p&gt;I evaluated existing options before building my own. LangSmith Teams: ~$400/month. Self-hosted Langfuse: requires a Postgres instance with 2GB+ RAM. OpenTelemetry's GenAI semantic conventions: still experimental, no production deployment story.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I needed something that worked today, on a server with 1GB of RAM and a $0 infrastructure budget.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Four Tables, One Database
&lt;/h2&gt;

&lt;p&gt;The minimum viable schema for multi-agent observability is four tables: one for raw call logs, one for system events, one for routing decisions, and one for agent memory. &lt;strong&gt;The entire schema fits in your head — and that constraint is a feature, not a limitation.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;audit_log&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="n"&gt;AUTOINCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'now'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;input_summary&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;output_summary&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;latency_ms&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tokens_in&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tokens_out&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cost_usd&lt;/span&gt; &lt;span class="nb"&gt;REAL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;  &lt;span class="c1"&gt;-- JSON blob for anything else&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="n"&gt;AUTOINCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'now'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="n"&gt;event_type&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;source&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;urgency&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;CHECK&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urgency&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'low'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'medium'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'high'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'critical'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="n"&gt;project&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;  &lt;span class="c1"&gt;-- JSON&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;decisions&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="n"&gt;AUTOINCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'now'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="n"&gt;decision_type&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;outcome&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;confidence&lt;/span&gt; &lt;span class="nb"&gt;REAL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;memories&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="n"&gt;AUTOINCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'now'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="n"&gt;memory_type&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;CHECK&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory_type&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; 
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'episodic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'semantic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'procedural'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'project'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;source&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;relevance_score&lt;/span&gt; &lt;span class="nb"&gt;REAL&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Four tables. No migrations framework. No ORM. &lt;strong&gt;SQLite's WAL mode handles concurrent reads from the dashboard while agents write logs&lt;/strong&gt; — that's all the concurrency management this pattern needs.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os
import sqlite3

# expanduser is required: sqlite3.connect() does not expand "~" itself
DB_PATH = os.path.expanduser("~/.claude/jarvis/data/jarvis.db")

def get_db():
    conn = sqlite3.connect(DB_PATH)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA busy_timeout=5000")
    conn.row_factory = sqlite3.Row
    return conn
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;busy_timeout&lt;/code&gt; is critical. Without it, concurrent agent writes will throw &lt;code&gt;database is locked&lt;/code&gt; errors. 5 seconds is generous — in practice, writes complete in under 10ms.&lt;/p&gt;
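You can see the WAL property this setup depends on in a throwaway script. This is a sketch against a hypothetical temp database, not jarvis.db: a reader runs while a writer holds the write lock, and it sees a consistent pre-commit snapshot instead of blocking.

```python
import os
import sqlite3
import tempfile

# Hypothetical throwaway db file for the demo
path = os.path.join(tempfile.mkdtemp(), "demo.db")

def connect(p):
    conn = sqlite3.connect(p)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA busy_timeout=5000")
    return conn

writer = connect(path)
writer.execute("CREATE TABLE t (x INTEGER)")
writer.commit()

reader = connect(path)
writer.execute("BEGIN IMMEDIATE")            # writer takes the write lock
writer.execute("INSERT INTO t VALUES (1)")   # uncommitted write
count = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
writer.commit()
print(count)  # 0: the read ran concurrently and saw the pre-commit snapshot
```

The `busy_timeout` still matters for writer-vs-writer contention: two agents committing at once queue up instead of throwing immediately.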
&lt;h2&gt;
  
  
  Step 2: The Event Bus Pattern
&lt;/h2&gt;

&lt;p&gt;Agents shouldn't know they're being traced. Every agent action flows through a central event bus that routes to logging subscribers independently of the main pipeline.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ControlPlane&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;emit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
             &lt;span class="n"&gt;urgency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Every agent action goes through here.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;# Log the event
&lt;/span&gt;        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO events (event_type, source, urgency, project, payload) &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VALUES (?, ?, ?, ?, ?)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urgency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;# Route to subscribers, including "*" wildcards; a failing
&lt;/span&gt;        &lt;span class="c1"&gt;# subscriber is logged, never allowed to block the pipeline
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;subscriber&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_subscribers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_subscribers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"subscriber failed on %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The subscriber pattern is where this gets powerful.&lt;/strong&gt; The bias firewall subscribes to all &lt;code&gt;agent_output&lt;/code&gt; events. The cost tracker subscribes to &lt;code&gt;model_call&lt;/code&gt; events. The dashboard subscribes to everything. None of them block the agent pipeline — if a subscriber fails, the main pipeline continues and the failure is logged separately.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Bias firewall subscribes non-invasively
control_plane.subscribe("agent_output", bias_firewall.check)
control_plane.subscribe("model_call", cost_tracker.record)
control_plane.subscribe("*", dashboard.update)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
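The contract is easiest to see stripped of the SQLite plumbing. Here's a minimal in-memory sketch (hypothetical `Bus` class, not the real `ControlPlane`) showing exact-match plus `"*"` wildcard routing, and why a crashing subscriber can't take the pipeline down:

```python
class Bus:
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, event_type, fn):
        self._subscribers.setdefault(event_type, []).append(fn)

    def emit(self, event_type, payload):
        # Exact-match subscribers first, then "*" wildcard subscribers
        for fn in self._subscribers.get(event_type, []) + self._subscribers.get("*", []):
            try:
                fn(event_type, payload)
            except Exception:
                pass  # in the real system the failure is logged separately

seen = []
bus = Bus()
bus.subscribe("model_call", lambda t, p: seen.append(("cost", p)))
bus.subscribe("model_call", lambda t, p: 1 / 0)  # deliberately broken subscriber
bus.subscribe("*", lambda t, p: seen.append(("dash", p)))
bus.emit("model_call", {"tokens": 42})
print(seen)  # [('cost', {'tokens': 42}), ('dash', {'tokens': 42})]
```

The broken subscriber raises, gets swallowed, and every other subscriber still fires.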
&lt;h2&gt;
  
  
  Step 3: Model Routing With a Paper Trail
&lt;/h2&gt;

&lt;p&gt;Every routing decision gets logged, not just executed. &lt;strong&gt;This was the single most valuable architectural choice in the entire system&lt;/strong&gt; — when something goes wrong, "which model handled this?" becomes a SQL query instead of a debugging session.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ModelRouter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;ROUTING_TABLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;implementation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;builder&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3.1-pro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;strategy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;       &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-5.4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chief_of_staff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parallel_exec&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;codex&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;worker&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;route&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ROUTING_TABLE&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;task_type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# Log the routing decision
&lt;/span&gt;        &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO decisions (decision_type, context, outcome, model) &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VALUES (?, ?, ?, ?)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_routing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After two months, I had 343 routing decisions logged. &lt;strong&gt;One &lt;code&gt;GROUP BY&lt;/code&gt; told a story I never expected: 23% of "strategy" tasks were actually implementation tasks mis-categorized by the dispatcher.&lt;/strong&gt; GPT-5.4 was doing Claude's job. Poorly. Nobody noticed until the audit trail made it obvious.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT decision_type, model, COUNT(*) AS count,
       AVG(confidence) AS avg_confidence
FROM decisions
GROUP BY decision_type, model
ORDER BY count DESC;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;h2&gt;
  
  
  Step 4: Bias Detection Latency Tracking
&lt;/h2&gt;

&lt;p&gt;A bias firewall that cross-checks agent outputs using multiple models is only useful if you can measure its performance. I run a 4-stage pipeline: Risk Classifier → Claim Extractor → Cross-Model Verifier (Gemini 3.1) → Disagreement Preserver. Each stage gets its own audit entry with latency and detection metadata.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BiasFirewall&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;monotonic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;risk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;classify_risk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;claims&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extract_claims&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;verification&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cross_verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Gemini call
&lt;/span&gt;
        &lt;span class="n"&gt;elapsed_ms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;monotonic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO audit_log &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;(action, agent, model, latency_ms, metadata) &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VALUES (?, ?, ?, ?, ?)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bias_check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3.1-pro-preview&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;elapsed_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;risk_level&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;risk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claims_found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;disagreements&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;verification&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;disagreements&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;detection_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;verification&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bias_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="p"&gt;})&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Real numbers from my audit trail&lt;/strong&gt;: the Gemini 3.1 Pro verifier averages 22 seconds per check; the Flash Lite fallback averages 3 seconds. &lt;strong&gt;Detection rate: 100% across 6 bias types&lt;/strong&gt; (framing, false consensus, anchoring, availability, confirmation, authority). I know this because every check is a row in &lt;code&gt;audit_log&lt;/code&gt;, not a claim in a README.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT
    json_extract(metadata, '$.detection_type') AS bias_type,
    COUNT(*) AS detections,
    AVG(latency_ms) AS avg_latency_ms,
    MIN(latency_ms) AS min_latency_ms,
    MAX(latency_ms) AS max_latency_ms
FROM audit_log
WHERE action = 'bias_check'
GROUP BY bias_type;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
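That query runs anywhere Python's bundled SQLite ships the JSON1 functions (which modern builds do). A self-contained sketch with `audit_log` cut down to the columns the query touches and made-up example rows:

```python
import json
import sqlite3

# In-memory demo of the json_extract() aggregation; the latencies and
# detection types below are invented example data, not real audit rows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE audit_log (action TEXT, latency_ms INTEGER, metadata TEXT)")
db.executemany(
    "INSERT INTO audit_log VALUES (?, ?, ?)",
    [
        ("bias_check", 21800, json.dumps({"detection_type": "framing"})),
        ("bias_check", 22400, json.dumps({"detection_type": "framing"})),
        ("bias_check", 3100,  json.dumps({"detection_type": "anchoring"})),
    ],
)
result = db.execute(
    "SELECT json_extract(metadata, '$.detection_type') AS bias_type, "
    "COUNT(*) AS detections, AVG(latency_ms) AS avg_latency_ms "
    "FROM audit_log WHERE action = 'bias_check' "
    "GROUP BY bias_type ORDER BY bias_type"
).fetchall()
print(result)  # [('anchoring', 1, 3100.0), ('framing', 2, 22100.0)]
```

Storing metadata as a JSON blob and querying it with `json_extract` is what lets the schema stay at four tables: new per-check fields never require a migration.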
&lt;h2&gt;
  
  
  Step 5: The Dashboard That Costs Nothing
&lt;/h2&gt;

&lt;p&gt;A terminal dashboard (Textual TUI) refreshes every 5 seconds, fed entirely from the SQLite audit trail. Panels show active agents, recent decisions, bias firewall status, and cost tracking. The key architectural property: &lt;strong&gt;the dashboard is a pure read consumer — it never writes to the database, and its queries run in under 2ms on a 1GB server.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_recent_activity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minutes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT action, agent, model, latency_ms, timestamp &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FROM audit_log &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WHERE timestamp &amp;gt; datetime(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;now&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, ?) &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORDER BY timestamp DESC LIMIT 50&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;minutes&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; minutes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,)&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_cost_summary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT model, SUM(cost_usd) as total, COUNT(*) as calls &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FROM audit_log &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WHERE timestamp &amp;gt; datetime(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;now&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, ?) AND cost_usd IS NOT NULL &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GROUP BY model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; days&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,)&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;calls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;calls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;No&lt;/span&gt; &lt;span class="n"&gt;Grafana&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;No&lt;/span&gt; &lt;span class="n"&gt;Prometheus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;No&lt;/span&gt; &lt;span class="n"&gt;InfluxDB&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;No&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="n"&gt;process&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;manage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;monitoring&lt;/span&gt; &lt;span class="n"&gt;layer&lt;/span&gt; &lt;span class="n"&gt;costs&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;requires&lt;/span&gt; &lt;span class="n"&gt;zero&lt;/span&gt; &lt;span class="n"&gt;infrastructure&lt;/span&gt; &lt;span class="n"&gt;beyond&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="n"&gt;already&lt;/span&gt; &lt;span class="n"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;

&lt;span class="c1"&gt;## Step 6: The Silent Failure That Proved the System
&lt;/span&gt;
&lt;span class="n"&gt;Three&lt;/span&gt; &lt;span class="n"&gt;weeks&lt;/span&gt; &lt;span class="n"&gt;after&lt;/span&gt; &lt;span class="n"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;audit&lt;/span&gt; &lt;span class="n"&gt;trail&lt;/span&gt; &lt;span class="n"&gt;caught&lt;/span&gt; &lt;span class="n"&gt;something&lt;/span&gt; &lt;span class="n"&gt;no&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;based&lt;/span&gt; &lt;span class="n"&gt;monitor&lt;/span&gt; &lt;span class="n"&gt;would&lt;/span&gt; &lt;span class="n"&gt;have&lt;/span&gt; &lt;span class="n"&gt;flagged&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="n"&gt;review&lt;/span&gt; &lt;span class="n"&gt;team&lt;/span&gt; &lt;span class="n"&gt;was&lt;/span&gt; &lt;span class="n"&gt;producing&lt;/span&gt; &lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;but&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;downstream&lt;/span&gt; &lt;span class="n"&gt;implementer&lt;/span&gt; &lt;span class="n"&gt;was&lt;/span&gt; &lt;span class="n"&gt;ignoring&lt;/span&gt; &lt;span class="n"&gt;them&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Entirely&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;No&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;No&lt;/span&gt; &lt;span class="n"&gt;alerts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Just&lt;/span&gt; &lt;span class="n"&gt;silent&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="sb"&gt;`events`&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="n"&gt;showed&lt;/span&gt; &lt;span class="sb"&gt;`agent_output`&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;but&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;corresponding&lt;/span&gt; &lt;span class="sb"&gt;`decisions`&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="n"&gt;zero&lt;/span&gt; &lt;span class="sb"&gt;`implementation_start`&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt; &lt;span class="n"&gt;referencing&lt;/span&gt; &lt;span class="n"&gt;those&lt;/span&gt; &lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;structural&lt;/span&gt; &lt;span class="n"&gt;gap&lt;/span&gt; &lt;span class="n"&gt;visible&lt;/span&gt; &lt;span class="n"&gt;only&lt;/span&gt; &lt;span class="n"&gt;because&lt;/span&gt; &lt;span class="n"&gt;both&lt;/span&gt; &lt;span class="n"&gt;sides&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;every&lt;/span&gt; &lt;span class="n"&gt;handoff&lt;/span&gt; &lt;span class="n"&gt;were&lt;/span&gt; &lt;span class="n"&gt;being&lt;/span&gt; &lt;span class="n"&gt;logged&lt;/span&gt; &lt;span class="n"&gt;independently&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Find strategy outputs with no downstream pickup
SELECT e.id, e.timestamp, e.source,
       json_extract(e.payload, '$.task_id') as task_id
FROM events e
WHERE e.event_type = 'agent_output'
  AND e.source = 'strategy-reviewer'
  AND json_extract(e.payload, '$.task_id') NOT IN (
      SELECT context FROM decisions
      WHERE decision_type = 'implementation_start'
  );
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Seven tasks over four days. Silently dropped.&lt;/strong&gt; The implementer agent was receiving the handoff but failing to parse a changed output format after a model update — something a single-stream log would have missed entirely.&lt;/p&gt;
&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Start with the &lt;code&gt;decisions&lt;/code&gt; table, not the &lt;code&gt;audit_log&lt;/code&gt;.&lt;/strong&gt; The audit log is useful for latency and cost analysis, but the decisions table is where you find actual bugs. If I'd built the decisions table first, I would have caught the strategy-implementation gap in days, not weeks.&lt;/p&gt;
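&lt;p&gt;The article's actual &lt;code&gt;CREATE TABLE&lt;/code&gt; statements appear earlier; as a self-contained sketch using only the columns this post references (&lt;code&gt;decision_type&lt;/code&gt;, &lt;code&gt;context&lt;/code&gt;, &lt;code&gt;outcome&lt;/code&gt;, &lt;code&gt;timestamp&lt;/code&gt;), a minimal decisions table looks like:&lt;/p&gt;

```python
import sqlite3

# Minimal decisions table sketch, limited to the columns the post itself
# names elsewhere; the real schema may carry more fields.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE decisions ("
    " id INTEGER PRIMARY KEY,"
    " decision_type TEXT NOT NULL,"
    " context TEXT,"
    " outcome TEXT,"
    " timestamp TEXT DEFAULT CURRENT_TIMESTAMP)"
)
conn.execute(
    "INSERT INTO decisions (decision_type, context, outcome) VALUES (?, ?, ?)",
    ("implementation_start", "task-42", "accepted"),
)
row = conn.execute("SELECT decision_type, context FROM decisions").fetchone()
```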

&lt;p&gt;&lt;strong&gt;Add a &lt;code&gt;correlation_id&lt;/code&gt; from day one.&lt;/strong&gt; Tracing a single request across 5 agents currently means joining on timestamps and task IDs. A single UUID per pipeline run would save hours of forensic querying. This is the one thing OpenTelemetry's trace/span model gets absolutely right.&lt;/p&gt;
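&lt;p&gt;A minimal sketch of that advice, assuming a simple &lt;code&gt;emit()&lt;/code&gt; helper (names are illustrative, not this system's actual API): stamp a UUID once per pipeline run, and every event carries it.&lt;/p&gt;

```python
import uuid
import contextvars

# Hypothetical sketch: one shared correlation_id per pipeline run, so
# cross-agent traces join on a UUID instead of timestamp proximity.
_run_id = contextvars.ContextVar("run_id", default=None)

def start_pipeline_run():
    run_id = str(uuid.uuid4())
    _run_id.set(run_id)
    return run_id

def emit(event_type, payload):
    # Every subscriber (audit log, decisions table, OTel later) sees the same id.
    return {"correlation_id": _run_id.get(), "event_type": event_type, "payload": payload}
```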

&lt;p&gt;&lt;strong&gt;Don't log raw prompts.&lt;/strong&gt; I initially stored full prompts in &lt;code&gt;input_summary&lt;/code&gt;. The database hit 400MB in a week. Summaries and token counts are sufficient for debugging 95% of issues. Store raw data only when you're actively investigating a specific failure.&lt;/p&gt;
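&lt;p&gt;A sketch of the trimmed logging, with an assumed &lt;code&gt;input_summary()&lt;/code&gt; helper and a rough 4-characters-per-token heuristic (both are illustrations, not the post's actual code):&lt;/p&gt;

```python
def input_summary(prompt, max_chars=200):
    """Store a short head plus counts instead of the raw prompt."""
    return {
        "summary": prompt[:max_chars],       # first max_chars chars only
        "chars": len(prompt),                # full size, for forensics
        "approx_tokens": len(prompt) // 4,   # crude heuristic, not a tokenizer
    }
```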
&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Cost Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;th&gt;Setup Time&lt;/th&gt;
&lt;th&gt;RAM Required&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LangSmith Teams&lt;/td&gt;
&lt;td&gt;~$400&lt;/td&gt;
&lt;td&gt;2 hours&lt;/td&gt;
&lt;td&gt;N/A (hosted)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Self-hosted Langfuse&lt;/td&gt;
&lt;td&gt;~$50 (Postgres)&lt;/td&gt;
&lt;td&gt;4–6 hours&lt;/td&gt;
&lt;td&gt;2GB+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Datadog LLM Observability&lt;/td&gt;
&lt;td&gt;~$200&lt;/td&gt;
&lt;td&gt;1 hour&lt;/td&gt;
&lt;td&gt;N/A (hosted)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SQLite audit trail&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~8 hours&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;&amp;lt;50MB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  System Metrics (after 2 months of production use)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Audit log entries&lt;/td&gt;
&lt;td&gt;338&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Events recorded&lt;/td&gt;
&lt;td&gt;108&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Routing decisions tracked&lt;/td&gt;
&lt;td&gt;343&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory entries&lt;/td&gt;
&lt;td&gt;1,740&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database size (WAL mode)&lt;/td&gt;
&lt;td&gt;~12MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Average write latency&lt;/td&gt;
&lt;td&gt;&amp;lt;10ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Silent failures caught&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server specs&lt;/td&gt;
&lt;td&gt;1 OCPU / 1GB RAM (Oracle Cloud Free Tier)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Duplicate decisions cleaned (one-time)&lt;/td&gt;
&lt;td&gt;9,486&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bias detection rate&lt;/td&gt;
&lt;td&gt;100% (6/6 types)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That last row — 9,486 duplicate decisions requiring a one-time cleanup — is what happens when you skip dedup logic early. &lt;strong&gt;Add a &lt;code&gt;UNIQUE&lt;/code&gt; constraint on &lt;code&gt;(decision_type, context, outcome, timestamp)&lt;/code&gt; before you have 10,000 rows to clean up.&lt;/strong&gt; Learn from my mistake.&lt;/p&gt;
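&lt;p&gt;A minimal sketch of that guard: the recommended &lt;code&gt;UNIQUE&lt;/code&gt; constraint plus &lt;code&gt;INSERT OR IGNORE&lt;/code&gt;, so duplicate decisions are dropped at write time instead of cleaned up later.&lt;/p&gt;

```python
import sqlite3

# Dedup guard: the UNIQUE constraint recommended above, with INSERT OR IGNORE
# so a retried or double-emitted decision silently collapses to one row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE decisions ("
    " decision_type TEXT, context TEXT, outcome TEXT, timestamp TEXT,"
    " UNIQUE (decision_type, context, outcome, timestamp))"
)
row = ("route", "task-7", "claude", "2026-03-01T00:00:00")
for _ in range(3):  # same decision written three times
    conn.execute("INSERT OR IGNORE INTO decisions VALUES (?, ?, ?, ?)", row)
count = conn.execute("SELECT COUNT(*) FROM decisions").fetchone()[0]  # 1, not 3
```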
&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Why SQLite instead of Postgres?
&lt;/h3&gt;

&lt;p&gt;SQLite runs embedded — no server process, no connection pooling, no port management. On a 1GB instance running 4 concurrent LLM API clients, a Postgres process consuming 200MB of shared buffers is a real constraint. SQLite in WAL mode handles the specific read/write pattern here (dashboard reads while agents write) without contention. The database file is also trivially portable — I &lt;code&gt;scp&lt;/code&gt; it locally for deep analysis when needed.&lt;/p&gt;
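&lt;p&gt;The article's &lt;code&gt;get_db()&lt;/code&gt; is not shown; one plausible version of the WAL setup it implies (the &lt;code&gt;busy_timeout&lt;/code&gt; value is my assumption, not from the post):&lt;/p&gt;

```python
import sqlite3

# Sketch of the read/write split described above: WAL mode lets the
# dashboard read while agents write, without blocking either side.
def get_db(path="jarvis.db"):
    conn = sqlite3.connect(path, timeout=5.0)
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA journal_mode=WAL")   # persisted in the database file
    conn.execute("PRAGMA busy_timeout=5000")  # wait on contention instead of erroring
    return conn
```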
&lt;h3&gt;
  
  
  How does this compare to OpenTelemetry's GenAI semantic conventions?
&lt;/h3&gt;

&lt;p&gt;OpenTelemetry's &lt;code&gt;gen_ai.*&lt;/code&gt; attributes cover the model call layer well: model name, token counts, latency, finish reason. What they don't cover yet is the orchestration layer — routing decisions, cross-model verification, agent-to-agent handoffs, memory retrieval. This audit trail captures both layers. When OTel's conventions stabilize, the migration path is straightforward: emit OTel spans from the same event bus while keeping the SQLite trail for orchestration-specific data that OTel doesn't address.&lt;/p&gt;
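&lt;p&gt;When that time comes, the dual-emit could be as simple as a second subscriber on the same bus. A sketch with illustrative names (the OTel exporter here is a stand-in list, not a real exporter):&lt;/p&gt;

```python
# Migration-path sketch: one event bus fans out to the existing SQLite
# trail and, later, an OTel span exporter. All names are illustrative.
_subscribers = []
audit_rows, otel_spans = [], []

def subscribe(fn):
    _subscribers.append(fn)

def emit(event):
    for fn in _subscribers:
        fn(event)

def sqlite_logger(event):   # existing consumer: writes the audit trail
    audit_rows.append(event)

def otel_exporter(event):   # future consumer: would emit a gen_ai.* span
    otel_spans.append(event)

subscribe(sqlite_logger)
subscribe(otel_exporter)
emit({"action": "llm_call", "model": "claude"})
```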
&lt;h3&gt;
  
  
  Won't this break at scale?
&lt;/h3&gt;

&lt;p&gt;SQLite handles databases up to 281TB. Two months of multi-agent orchestration data in this system is 12MB. At the current write rate (~170 audit-log entries per month), hitting 1GB would take decades. &lt;strong&gt;If you're running 10,000 agents, switch to Postgres. If you're running 16 agent teams like this setup, SQLite will outlast the project.&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  How do you query across related tables?
&lt;/h3&gt;

&lt;p&gt;Standard SQL joins, with the advantage that everything is in one database file — no cross-service queries, no distributed tracing backends, no network latency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;latency_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decision_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urgency&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;audit_log&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;decisions&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt; 
    &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="nb"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'-1 second'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="nb"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'+1 second'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;json_extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'$.task_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 
    &lt;span class="n"&gt;json_extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'$.task_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'now'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'-1 day'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;based&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="n"&gt;crude&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;correlation&lt;/span&gt; &lt;span class="n"&gt;ID&lt;/span&gt; &lt;span class="n"&gt;would&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="n"&gt;cleaner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;But&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="n"&gt;works&lt;/span&gt; &lt;span class="k"&gt;without&lt;/span&gt; &lt;span class="k"&gt;schema&lt;/span&gt; &lt;span class="n"&gt;changes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="o"&gt;###&lt;/span&gt; &lt;span class="n"&gt;What&lt;/span&gt; &lt;span class="n"&gt;about&lt;/span&gt; &lt;span class="k"&gt;data&lt;/span&gt; &lt;span class="n"&gt;retention&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;

&lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="n"&gt;monthly&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt; &lt;span class="n"&gt;archives&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt; &lt;span class="n"&gt;older&lt;/span&gt; &lt;span class="k"&gt;than&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt; &lt;span class="n"&gt;days&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;compressed&lt;/span&gt; &lt;span class="n"&gt;backup&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;removes&lt;/span&gt; &lt;span class="n"&gt;them&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;active&lt;/span&gt; &lt;span class="k"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Three&lt;/span&gt; &lt;span class="n"&gt;months&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="k"&gt;data&lt;/span&gt; &lt;span class="n"&gt;compresses&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;roughly&lt;/span&gt; &lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="n"&gt;KB&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sqlite3 jarvis.db ".dump" | gzip &amp;gt; "archive_$(date +%Y%m).sql.gz"
sqlite3 jarvis.db "DELETE FROM audit_log WHERE timestamp &amp;lt; datetime('now', '-90 days')"
sqlite3 jarvis.db "VACUUM"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;You can set this up in under an hour. Here’s the path I’d recommend:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Create the database.&lt;/strong&gt; Copy the four &lt;code&gt;CREATE TABLE&lt;/code&gt; statements above into &lt;code&gt;schema.sql&lt;/code&gt;. Run &lt;code&gt;sqlite3 jarvis.db &amp;lt; schema.sql&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable WAL mode.&lt;/strong&gt; Run &lt;code&gt;sqlite3 jarvis.db "PRAGMA journal_mode=WAL"&lt;/code&gt; once. This persists across all future connections.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Instrument one agent call.&lt;/strong&gt; Pick your most critical LLM call. Add a single &lt;code&gt;INSERT INTO audit_log&lt;/code&gt; after it returns with the model name, latency in milliseconds, and token counts. That's the entire day-one requirement.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Add the event bus.&lt;/strong&gt; Wrap agent calls in an &lt;code&gt;emit()&lt;/code&gt; function. Subscribe your logging to it. Agents and observability are now decoupled.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Write one diagnostic query.&lt;/strong&gt; Answer a question you've been guessing at: "Which model is slowest?" or "How many calls per day?" &lt;strong&gt;The first useful answer will validate the entire approach.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Add the decisions table last.&lt;/strong&gt; Once calls and events are flowing, start logging routing decisions explicitly. This is where production bugs actually live.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
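&lt;p&gt;The first three steps above, compressed into a runnable sketch. &lt;code&gt;call_model()&lt;/code&gt; is a stand-in for your real LLM client, and the column set is trimmed; this is an illustration, not the article's actual code.&lt;/p&gt;

```python
import sqlite3
import time

# Day-one sketch of steps 1-3: one table, WAL mode, one instrumented call.
# call_model() is a placeholder for a real LLM client (an assumption).
conn = sqlite3.connect("jarvis.db")
conn.execute("PRAGMA journal_mode=WAL")
conn.execute(
    "CREATE TABLE IF NOT EXISTS audit_log ("
    " id INTEGER PRIMARY KEY, action TEXT, agent TEXT, model TEXT,"
    " latency_ms REAL, timestamp TEXT DEFAULT CURRENT_TIMESTAMP)"
)

def call_model(prompt):
    return "stub response"  # replace with the real API call

def instrumented_call(agent, model, prompt):
    start = time.perf_counter()
    result = call_model(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    conn.execute(
        "INSERT INTO audit_log (action, agent, model, latency_ms) VALUES (?, ?, ?, ?)",
        ("llm_call", agent, model, latency_ms),
    )
    conn.commit()
    return result
```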
&lt;h2&gt;
  
  
  The Honest Take
&lt;/h2&gt;

&lt;p&gt;This system is not elegant. It's a SQLite file with four tables and Python glue code. No distributed tracing. No visualization layer. No SLA. The cross-table joins use &lt;code&gt;BETWEEN&lt;/code&gt; clauses and timestamp proximity as a substitute for proper correlation IDs.&lt;/p&gt;

&lt;p&gt;But it caught three silent failures that would have been invisible to cost-based monitoring. It runs on a free-tier cloud instance consuming under 50MB of RAM. And &lt;strong&gt;when OpenTelemetry's GenAI conventions reach production stability, the migration path is clear&lt;/strong&gt;: emit OTel spans from the same event bus and keep the SQLite trail for orchestration-layer data that OTel doesn't yet model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sometimes the right observability tool is the one you can build in a weekend, understand completely, and trust at 3am when something breaks.&lt;/strong&gt;&lt;/p&gt;



&lt;p&gt;Have you tried tracing multi-model AI systems? I'd genuinely like to know — did anyone else go the "just use SQLite" route, or did you find a lightweight alternative that worked?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If this saved you from evaluating a $400/mo tracing platform, drop a bookmark.&lt;/strong&gt; I'm writing a follow-up on the bias firewall pipeline — specifically how cross-model verification between Claude and Gemini catches subtle framing issues that single-model review misses consistently.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/thestack_ai" class="crayons-btn crayons-btn--primary"&gt;Follow for more AI agent infrastructure — real systems, real numbers&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>ai</category>
      <category>observability</category>
      <category>sqlite</category>
      <category>agents</category>
    </item>
    <item>
      <title>I Built a Zombie Process Killer Because Claude Code Ate 14GB of My RAM</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Mon, 23 Mar 2026 04:55:18 +0000</pubDate>
      <link>https://dev.to/thestack_ai/i-built-a-zombie-process-killer-because-claude-code-ate-14gb-of-my-ram-1deg</link>
      <guid>https://dev.to/thestack_ai/i-built-a-zombie-process-killer-because-claude-code-ate-14gb-of-my-ram-1deg</guid>
      <description>&lt;p&gt;I lost an entire afternoon to a phantom memory leak that wasn't a leak at all. My MacBook was crawling — 14GB of RAM consumed by processes I never launched. The culprit? Dozens of orphaned MCP servers, headless Chrome instances, and sub-agents left behind by AI coding sessions. I built &lt;code&gt;zclean&lt;/code&gt; to kill them automatically. Here's the full setup.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; AI coding tools like Claude Code and Codex spawn child processes (MCP servers, browser daemons, sub-agents) that don't get cleaned up when sessions end. These orphans accumulate silently and can consume 10GB+ of RAM within a single workday. &lt;code&gt;zclean&lt;/code&gt; detects and kills them safely — it hooks into your session lifecycle and runs on a schedule. One &lt;code&gt;npx zclean init&lt;/code&gt; sets up everything. I went from 3–4 forced reboots per week to zero manual intervention.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;AI coding tools don't clean up after themselves. After four months of heavy Claude Code use, I started noticing my machine getting sluggish by mid-afternoon — a dozen &lt;code&gt;node&lt;/code&gt; processes, several &lt;code&gt;chrome-headless-shell&lt;/code&gt; instances, a few &lt;code&gt;mcp-server-*&lt;/code&gt; processes running hours after I had ended those sessions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Every AI coding session spawns a tree of child processes.&lt;/strong&gt; Claude Code launches MCP servers for file access, web search, and custom tools. It fires up headless browsers for web research. Codex spawns sub-agents. When the session ends, these processes are supposed to terminate. They don't — at least not reliably.&lt;/p&gt;

&lt;p&gt;I ran a quick check one evening:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ps aux | grep -E 'mcp-server|chrome-headless|agent-browser' | wc -l
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;37 processes.&lt;/strong&gt; All orphans. All consuming memory. Combined RAM usage: north of 6GB for processes doing absolutely nothing.&lt;/p&gt;

&lt;p&gt;This isn't just me. The same reports appear across X and dev forums — "Claude Code is heavy," "my machine slows down after a few sessions." The tool itself isn't heavy. &lt;strong&gt;The zombies it leaves behind are heavy — and they accumulate with every session you run.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How zclean Works
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;zclean&lt;/code&gt; detects and terminates orphaned AI tool processes using a conservative four-condition filter. The core principle: &lt;strong&gt;if the parent process is alive, don't touch it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A process only gets flagged as a zombie when ALL of these conditions are true:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;It's an orphan&lt;/strong&gt; — its parent has died and it has been reparented to init/launchd (PPID = 1)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It matches a known pattern&lt;/strong&gt; — its command line matches AI tool process signatures&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's not in an active session&lt;/strong&gt; — not part of a tmux/screen process tree&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's in the host namespace&lt;/strong&gt; — not inside a Docker container&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This means your intentionally running dev server is always safe. Your &lt;code&gt;pm2&lt;/code&gt;-managed processes are safe. Your &lt;code&gt;nohup&lt;/code&gt; background jobs are safe. Only genuinely abandoned processes get killed.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Target List
&lt;/h3&gt;

&lt;p&gt;Here's what &lt;code&gt;zclean&lt;/code&gt; looks for by default:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Process Pattern&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP servers&lt;/td&gt;
&lt;td&gt;&lt;code&gt;mcp-server-*&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Claude Code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Browser daemons&lt;/td&gt;
&lt;td&gt;&lt;code&gt;agent-browser&lt;/code&gt;, &lt;code&gt;chrome-headless-shell&lt;/code&gt;, &lt;code&gt;playwright/driver&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Claude Code, Codex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sub-agents&lt;/td&gt;
&lt;td&gt;orphaned &lt;code&gt;claude --print&lt;/code&gt;, &lt;code&gt;codex exec&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Claude Code, Codex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build zombies&lt;/td&gt;
&lt;td&gt;&lt;code&gt;esbuild&lt;/code&gt;, &lt;code&gt;vite&lt;/code&gt;, &lt;code&gt;next dev&lt;/code&gt;, &lt;code&gt;webpack&lt;/code&gt; (24h+ orphan)&lt;/td&gt;
&lt;td&gt;Common&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;npm zombies&lt;/td&gt;
&lt;td&gt;&lt;code&gt;npm exec&lt;/code&gt;, &lt;code&gt;npx&lt;/code&gt; (no parent)&lt;/td&gt;
&lt;td&gt;Common&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Node orphans&lt;/td&gt;
&lt;td&gt;&lt;code&gt;node&lt;/code&gt; (no parent + 24h+ or 500MB+ + AI tool path in cmdline)&lt;/td&gt;
&lt;td&gt;Common&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime orphans&lt;/td&gt;
&lt;td&gt;&lt;code&gt;tsx&lt;/code&gt;, &lt;code&gt;ts-node&lt;/code&gt;, &lt;code&gt;bun&lt;/code&gt;, &lt;code&gt;deno&lt;/code&gt;, &lt;code&gt;python&lt;/code&gt; (MCP server pattern)&lt;/td&gt;
&lt;td&gt;Common&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Build tools like &lt;code&gt;vite&lt;/code&gt; and &lt;code&gt;webpack&lt;/code&gt; get a 24-hour grace period.&lt;/strong&gt; A long-running build that legitimately takes hours shouldn't be killed. But if an orphaned &lt;code&gt;esbuild&lt;/code&gt; process has been sitting there for over a day with no parent, it's dead weight.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting It Up: One Command
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx zclean init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That's it. Here's what happens under the hood:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: OS Detection.&lt;/strong&gt; &lt;code&gt;zclean&lt;/code&gt; figures out if you're on macOS, Linux, or Windows and configures the right process scanning and scheduling mechanisms.&lt;/p&gt;
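&lt;p&gt;A minimal sketch of that branching in plain Node.js (hypothetical; the scan commands and scheduler labels are illustrative assumptions, not zclean's actual source):&lt;/p&gt;

```javascript
// Hypothetical sketch of Step 1: map the current platform to a process
// scan command and a scheduling mechanism. Command strings are assumptions.
function detectPlatform(platform = process.platform) {
  switch (platform) {
    case "darwin":
      return { scan: "ps -axo pid,ppid,lstart,rss,command", scheduler: "LaunchAgent" };
    case "linux":
      return { scan: "ps -eo pid,ppid,lstart,rss,cmd", scheduler: "systemd user timer" };
    case "win32":
      return { scan: "powershell Get-CimInstance Win32_Process", scheduler: "Task Scheduler" };
    default:
      throw new Error(`Unsupported platform: ${platform}`);
  }
}
```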

&lt;p&gt;&lt;strong&gt;Step 2: Claude Code Hook.&lt;/strong&gt; It registers a &lt;code&gt;SessionEnd&lt;/code&gt; hook in your Claude Code &lt;code&gt;settings.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "hooks": {
    "SessionEnd": [
      {
        "type": "command",
        "command": "npx zclean --session-pid $SESSION_PID --yes"
      }
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is the first line of defense — &lt;strong&gt;every time a Claude Code session ends, &lt;code&gt;zclean&lt;/code&gt; immediately cleans up that session's orphaned children within milliseconds.&lt;/strong&gt; The &lt;code&gt;--session-pid&lt;/code&gt; flag scopes cleanup to that specific process tree, avoiding any risk to unrelated processes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: OS Scheduler.&lt;/strong&gt; For zombies that slip through (crashes, force-quits, other AI tools without hooks), &lt;code&gt;zclean&lt;/code&gt; sets up a recurring hourly cleanup.&lt;/p&gt;

&lt;p&gt;On macOS, it creates a LaunchAgent:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~/Library/LaunchAgents/com.zclean.hourly.plist
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;On Linux, a systemd user timer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~/.config/systemd/user/zclean.timer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;On Windows, a user-scoped Task Scheduler entry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Config file.&lt;/strong&gt; Drops a config at &lt;code&gt;~/.zclean/config.json&lt;/code&gt; where you can whitelist processes, adjust thresholds, and customize behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5: First scan.&lt;/strong&gt; Runs an immediate dry-run so you can see what it would have killed before enabling automatic cleanup.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Safety Mechanisms
&lt;/h2&gt;

&lt;p&gt;I spent more time on the "don't kill the wrong thing" logic than on the actual killing. &lt;strong&gt;A false positive means terminating someone's running dev server — that's a non-starter.&lt;/strong&gt; Three independent safety layers prevent this.&lt;/p&gt;

&lt;h3&gt;
  
  
  PID Reuse Protection
&lt;/h3&gt;

&lt;p&gt;Between the time &lt;code&gt;zclean&lt;/code&gt; scans and the time it kills, a process could die and its PID could be reassigned to something completely different. On a busy system, PID reuse happens faster than you'd expect — Linux assigns PIDs sequentially, so a freshly spawned process can inherit a just-killed PID within seconds.&lt;/p&gt;

&lt;p&gt;Before every kill, &lt;code&gt;zclean&lt;/code&gt; re-verifies three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The PID still exists&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The process start time matches&lt;/strong&gt; what was recorded during the scan (to the second)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The command line matches&lt;/strong&gt; what was recorded during the scan&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;If any of these three checks fail, the kill is skipped entirely.&lt;/strong&gt; This eliminates the entire class of PID reuse bugs without requiring atomic operations or locks.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Whitelist
&lt;/h3&gt;

&lt;p&gt;Some processes look like zombies but aren't. The config handles persistent legitimate orphans:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "whitelist": [
    "mcp-server-custom-db",
    "my-persistent-agent"
  ],
  "maxAge": 86400,
  "maxMemoryMB": 500,
  "dryRun": false
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;whitelist&lt;/code&gt;&lt;/strong&gt;: Process names that are never touched, regardless of orphan status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;maxAge&lt;/code&gt;&lt;/strong&gt;: Seconds before an orphan gets flagged (default: 86400 = 24 hours for build tools)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;maxMemoryMB&lt;/code&gt;&lt;/strong&gt;: Memory threshold that escalates urgency (default: 500MB — above this, the process is flagged sooner)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;dryRun&lt;/code&gt;&lt;/strong&gt;: Global toggle — set to &lt;code&gt;true&lt;/code&gt; to audit without committing&lt;/li&gt;
&lt;/ul&gt;
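&lt;p&gt;To make the interaction between these fields concrete, here is a hypothetical sketch of how a scanner could apply them. The &lt;code&gt;shouldFlag&lt;/code&gt; helper and the process record shape are invented for illustration; only the field names come from the config above.&lt;/p&gt;

```javascript
// Hypothetical: whitelist wins first, then the memory threshold escalates,
// then the age-based grace period applies. Not zclean's actual source.
const config = { whitelist: ["mcp-server-custom-db"], maxAge: 86400, maxMemoryMB: 500 };

function shouldFlag(proc, cfg) {
  if (cfg.whitelist.some((name) => proc.cmd.includes(name))) return false; // never touched
  if (proc.memoryMB >= cfg.maxMemoryMB) return true;  // heavy orphans flagged sooner
  return proc.ageSeconds >= cfg.maxAge;               // otherwise wait out the grace period
}

shouldFlag({ cmd: "mcp-server-custom-db", memoryMB: 900, ageSeconds: 999999 }, config); // false
shouldFlag({ cmd: "chrome-headless-shell", memoryMB: 287, ageSeconds: 9000 }, config);  // false
shouldFlag({ cmd: "esbuild --service", memoryMB: 40, ageSeconds: 90000 }, config);      // true
```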

&lt;h3&gt;
  
  
  Protected Process Trees
&lt;/h3&gt;

&lt;p&gt;Beyond the whitelist, &lt;code&gt;zclean&lt;/code&gt; walks the full process tree to protect anything descended from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;tmux / screen sessions&lt;/strong&gt; — if the process is a descendant of a terminal multiplexer, it's intentional&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Daemon managers&lt;/strong&gt; — pm2, forever, supervisord, systemd services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VS Code&lt;/strong&gt; — gets a 48-hour grace period since VS Code's process tree can appear orphaned after restarts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker containers&lt;/strong&gt; — checked via PID namespace on Linux (&lt;code&gt;/proc/&amp;lt;pid&amp;gt;/ns/pid&lt;/code&gt;); Docker Desktop on macOS runs in a VM so container processes aren't visible to the host &lt;code&gt;ps&lt;/code&gt; at all&lt;/li&gt;
&lt;/ul&gt;
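&lt;p&gt;The ancestor walk behind these protections is simple in principle: climb the PPID chain and refuse to flag anything descended from a protected process. A hypothetical sketch (the process table and helper names are invented, not zclean's source):&lt;/p&gt;

```javascript
// Hypothetical ancestor walk over a pid -> { ppid, cmd } snapshot.
const PROTECTED = ["tmux", "screen", "pm2", "supervisord"];

function hasProtectedAncestor(pid, table) {
  let current = table[pid];
  while (current && current.ppid !== 0) {
    const parent = table[current.ppid];
    if (!parent) break;
    if (PROTECTED.some((name) => parent.cmd.includes(name))) return true;
    current = parent; // keep climbing toward init/launchd
  }
  return false;
}

const snapshot = {
  1:   { ppid: 0, cmd: "launchd" },
  300: { ppid: 1, cmd: "tmux: server" },
  301: { ppid: 300, cmd: "zsh" },
  302: { ppid: 301, cmd: "mcp-server-filesystem" }, // inside tmux: protected
  400: { ppid: 1, cmd: "mcp-server-fetch" },        // true orphan: fair game
};

hasProtectedAncestor(302, snapshot); // true
hasProtectedAncestor(400, snapshot); // false
```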

&lt;h2&gt;
  
  
  Daily Usage
&lt;/h2&gt;

&lt;p&gt;Most of the time, you forget &lt;code&gt;zclean&lt;/code&gt; exists. That's the goal. When you want visibility:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# See what would be killed (dry-run, default)
npx zclean

# Actually kill the zombies
npx zclean --yes

# Check current zombie status
npx zclean status

# View kill history with timestamps and RAM reclaimed
npx zclean logs

# Show current config
npx zclean config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A typical dry-run output looks like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  zclean — scanning for zombie processes...

  Found 4 zombie processes:

  PID    CMD                          RAM      AGE
  ────   ───────────────────────────  ───────  ──────
  8234   mcp-server-filesystem        42 MB    3h 12m
  8891   chrome-headless-shell        287 MB   2h 45m
  9102   mcp-server-fetch             18 MB    1h 58m
  12044  node (claude subagent)       156 MB   4h 03m

  Total reclaimable: 503 MB

  Run with --yes to kill these processes.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;503MB from four processes on a light day.&lt;/strong&gt; Peak scans have returned 15+ zombies consuming over 3GB on days with multiple long AI coding sessions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Dual Protection Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;zclean&lt;/code&gt; doesn't rely on a single cleanup mechanism — redundancy is intentional.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: Session Hook&lt;/strong&gt; — fires on every clean session exit via the Claude Code &lt;code&gt;SessionEnd&lt;/code&gt; hook. This catches the common case immediately, with zero delay between session end and cleanup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: OS Scheduler&lt;/strong&gt; — runs hourly (configurable down to every 15 minutes). This catches everything the hook misses: crashed sessions, force-quits, Codex sessions that lack hook support, and any AI tool that spawns processes without a cleanup contract.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The hook handles approximately 80% of cases instantly. The scheduler handles the remaining 20% within one hour.&lt;/strong&gt; Together, zombie RAM accumulation drops effectively to zero over any meaningful time period — verified over six weeks of continuous use on a MacBook Pro M2.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;I should have built the process tree walker first.&lt;/strong&gt; I started with simple PPID checks and pattern matching, then kept bolting on edge cases — the tmux protection, the VS Code grace period, the Docker namespace check. The tree walker should have been the foundation. It would have reduced total code by roughly 30% and made the protection logic composable instead of a chain of special cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Windows implementation needs more real-world testing.&lt;/strong&gt; macOS and Linux use &lt;code&gt;/proc&lt;/code&gt; and &lt;code&gt;ps&lt;/code&gt;, which are well-understood and stable. Windows requires WMI queries through PowerShell, and the process model is fundamentally different — no PPID concept in the same sense, different namespace isolation. It works in my testing environment, but I have less confidence in Windows edge cases than on Unix systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I underestimated how many processes VS Code itself orphans.&lt;/strong&gt; The 48-hour grace period for VS Code descendants was added reactively after I accidentally killed a legitimate TypeScript language server. The line between "VS Code orphan" and "AI tool orphan spawned through VS Code's integrated terminal" is genuinely blurry — VS Code's process tree is already unusual before you add AI tools to the mix.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Before vs. After zclean
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Average orphan processes (end of day)&lt;/td&gt;
&lt;td&gt;12–20&lt;/td&gt;
&lt;td&gt;0–2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAM consumed by orphans&lt;/td&gt;
&lt;td&gt;2–8 GB&lt;/td&gt;
&lt;td&gt;&amp;lt; 100 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual force-reboots per week&lt;/td&gt;
&lt;td&gt;3–4&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time spent investigating "why is my Mac slow"&lt;/td&gt;
&lt;td&gt;~30 min/day&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Tool Resource Footprint
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;zclean scan time&lt;/td&gt;
&lt;td&gt;&amp;lt; 200ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAM usage during scan&lt;/td&gt;
&lt;td&gt;~12 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;npm dependencies&lt;/td&gt;
&lt;td&gt;0 (pure Node.js)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supported platforms&lt;/td&gt;
&lt;td&gt;macOS, Linux, Windows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Config file size&lt;/td&gt;
&lt;td&gt;~200 bytes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Install + init time&lt;/td&gt;
&lt;td&gt;&amp;lt; 10 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Zero npm dependencies.&lt;/strong&gt; The scanner uses &lt;code&gt;child_process.execSync&lt;/code&gt; with native OS commands (&lt;code&gt;ps&lt;/code&gt; on Unix, &lt;code&gt;Get-Process&lt;/code&gt; on Windows). No native modules, no compilation step, no &lt;code&gt;node-gyp&lt;/code&gt; nightmares. The entire tool is a single Node.js file you can read and audit in under 10 minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Does zclean kill my running dev server?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. &lt;code&gt;zclean&lt;/code&gt; only targets orphan processes — those whose parent has died and been reassigned to init/launchd (PPID = 1). If your dev server was started from a terminal that's still open, its parent is alive and it won't be touched. Processes managed by pm2, forever, or supervisord are also explicitly protected via tree-walk detection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What if I have a legitimate long-running MCP server?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Add it to the whitelist in &lt;code&gt;~/.zclean/config.json&lt;/code&gt;. Whitelisted process names are never killed regardless of orphan status or memory usage. You can also adjust &lt;code&gt;maxAge&lt;/code&gt; if the default 24-hour grace period isn't sufficient for your workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does it work with Codex, Cursor, or other AI coding tools?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. The &lt;code&gt;SessionEnd&lt;/code&gt; hook is specific to Claude Code, but the OS scheduler (Layer 2) catches orphans from any tool. The target process patterns include common signatures from Codex, Cursor, and other tools that spawn MCP servers and headless browsers. If your tool's processes become orphaned and match the known patterns, &lt;code&gt;zclean&lt;/code&gt; will find them on the next hourly run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can it accidentally kill a process inside Docker?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. On Linux, &lt;code&gt;zclean&lt;/code&gt; checks the PID namespace via &lt;code&gt;/proc/&amp;lt;pid&amp;gt;/ns/pid&lt;/code&gt; to confirm the process is in the host namespace. Docker containers run in isolated PID namespaces and are excluded from scanning entirely. On macOS, Docker Desktop runs in a Linux VM, making container processes invisible to the host &lt;code&gt;ps&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What happens if zclean itself crashes mid-kill?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each kill is independent — there's no shared transaction state. If &lt;code&gt;zclean&lt;/code&gt; crashes after killing 3 of 7 zombies, the remaining 4 will be caught on the next scheduled run within an hour. &lt;strong&gt;The PID reuse protection ensures that even if system state changes between a scan and a kill, no incorrect process is terminated.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Install and initialize&lt;/strong&gt; — &lt;code&gt;npx zclean init&lt;/code&gt; detects your OS, registers hooks, and sets up the scheduler in under 10 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run a dry scan&lt;/strong&gt; — &lt;code&gt;npx zclean&lt;/code&gt; shows what would be killed without touching anything&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check the output&lt;/strong&gt; — verify the detected processes are genuinely orphaned&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable kills&lt;/strong&gt; — &lt;code&gt;npx zclean --yes&lt;/code&gt;, or set &lt;code&gt;dryRun&lt;/code&gt; to &lt;code&gt;false&lt;/code&gt; in the config for automated cleanup&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Forget about it&lt;/strong&gt; — the hook and scheduler handle everything from here&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;If you're using AI coding tools daily and your machine gets progressively slower through the day, you probably don't have a memory leak — you have a zombie problem. &lt;code&gt;zclean&lt;/code&gt; is a zero-dependency Node.js utility that fixes it permanently with a single install command and two protection layers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's the worst zombie accumulation you've seen on your machine?&lt;/strong&gt; I'm curious whether this skews toward macOS or if Linux and Windows users are hitting it equally.&lt;/p&gt;

&lt;p&gt;If this saved you from your next force-reboot, consider sharing it with whoever on your team complains that "Claude is heavy."&lt;/p&gt;

&lt;p&gt;Follow me for more posts about building developer tools and the unglamorous infrastructure work behind AI-assisted development.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;&lt;em&gt;I build AI-powered developer tools and write about the engineering behind them. Currently running an AI agent orchestration system with multi-model routing across Claude, Gemini, and GPT — which is, ironically, also the source of most of my zombie processes.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>opensource</category>
      <category>productivity</category>
      <category>ai</category>
    </item>
    <item>
      <title>I Built a Diagnostic CLI for Claude Code Skills — Here's What 8 Rules Caught That I Missed</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Mon, 23 Mar 2026 04:40:15 +0000</pubDate>
      <link>https://dev.to/thestack_ai/i-built-a-diagnostic-cli-for-claude-code-skills-heres-what-8-rules-caught-that-i-missed-4142</link>
      <guid>https://dev.to/thestack_ai/i-built-a-diagnostic-cli-for-claude-code-skills-heres-what-8-rules-caught-that-i-missed-4142</guid>
      <description>&lt;p&gt;&lt;strong&gt;Most of my Claude Code skills were broken and I had no idea.&lt;/strong&gt; I had 23 skill files, felt productive, and assumed Claude was using all of them. Then I built a diagnostic tool, ran it on my own setup, and 14 of those 23 skills had structural issues that silently degraded how Claude interpreted them. That's a 61% failure rate on files I had personally written and considered finished.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; &lt;a href="https://github.com/whynowlab/pulser" rel="noopener noreferrer"&gt;pulser&lt;/a&gt; is a CLI that scans Claude Code skill files, classifies them by type, runs 8 diagnostic rules, generates prescriptions, and auto-fixes issues with backup and rollback. I ran it on 23 skills and found 14 had problems — missing frontmatter fields, ambiguous trigger conditions, conflicting instructions. One command. Zero config. &lt;code&gt;npx pulser-cli&lt;/code&gt; and you're done.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem: Claude Code Skills Have No Validation Layer
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Claude Code skills are markdown files in &lt;code&gt;~/.claude/skills/&lt;/code&gt; that change how Claude behaves — and there is no built-in way to verify they're structured correctly.&lt;/strong&gt; You write a markdown file, drop it in the skills directory, and Claude is supposed to route to it based on the frontmatter description and body content. Nothing enforces that the file is valid.&lt;/p&gt;

&lt;p&gt;I spent two weeks building a skill that was supposed to trigger whenever I said "debug this." It had detailed instructions, code examples, a careful system prompt. One problem: a typo in the frontmatter &lt;code&gt;description&lt;/code&gt; field made the trigger condition ambiguous. &lt;strong&gt;Claude matched it maybe 30% of the time.&lt;/strong&gt; The other 70%, it fell through to default behavior and I blamed the model.&lt;/p&gt;

&lt;p&gt;Anthropic published &lt;a href="https://docs.anthropic.com/en/docs/claude-code/skills" rel="noopener noreferrer"&gt;skill quality principles&lt;/a&gt;, but they're guidelines, not tools. You read them, nod, and go back to writing skills the same way. I needed something that would &lt;strong&gt;read my skills, tell me what's wrong, and fix it.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Define What "Broken" Actually Means
&lt;/h2&gt;

&lt;p&gt;Before writing any code, I spent a day cataloging every skill failure mode I'd encountered. Not theoretical ones — actual bugs from my own skill files and issues reported by other Claude Code users. I landed on 8 diagnostic rules split into three tiers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;What It Catches&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Core&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;frontmatter-required&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Missing &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, or &lt;code&gt;model&lt;/code&gt; fields&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Core&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;description-quality&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Descriptions too vague to route ("Use this for stuff")&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Core&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;trigger-clarity&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Ambiguous or missing trigger conditions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recommended&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;instruction-structure&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No clear sections, wall-of-text body&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recommended&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;conflict-detection&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Two skills claiming the same trigger space&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recommended&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;example-coverage&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Skills without input/output examples&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recommended&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;scope-boundaries&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Skills that try to do everything (&amp;gt;500 lines, 5+ responsibilities)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Experimental&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;dependency-chain&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Skills referencing other skills that don't exist&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The core 3 rules alone caught issues in 60% of my skill files.&lt;/strong&gt; The frontmatter rule sounds trivial until you realize that a missing &lt;code&gt;description&lt;/code&gt; field means Claude has to guess what your skill does from the body text. Sometimes it guesses right. Sometimes it routes to a completely unrelated skill.&lt;/p&gt;
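&lt;p&gt;The frontmatter rule is also the easiest one to sketch. A simplified, hypothetical version with naive key parsing (not pulser's actual implementation):&lt;/p&gt;

```javascript
// Hypothetical sketch of the frontmatter-required rule: extract the YAML
// block between --- markers and report which required keys are absent.
const REQUIRED = ["name", "description", "model"];

function checkFrontmatter(markdown) {
  const match = markdown.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return { ok: false, missing: REQUIRED };
  const keys = match[1]
    .split("\n")
    .map((line) => line.split(":")[0].trim())
    .filter(Boolean);
  const missing = REQUIRED.filter((field) => !keys.includes(field));
  return { ok: missing.length === 0, missing };
}

const skill = "---\nname: tdd-helper\ndescription: Run red/green TDD loops\n---\n# Body";
checkFrontmatter(skill); // { ok: false, missing: ["model"] }
```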

&lt;h2&gt;
  
  
  Step 2: Build a Multi-Signal Classifier
&lt;/h2&gt;

&lt;p&gt;Not all skills are the same, and treating them identically produces bad diagnostics. A coding skill needs different validation than a writing skill or a workflow automation skill. &lt;strong&gt;pulser classifies each skill using 4 signals before running any diagnostic rules&lt;/strong&gt;, so the prescriptions it generates are type-appropriate rather than generic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;ClassificationResult&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SkillType&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Signal&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SkillType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;coding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;writing&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;workflow&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;diagnostic&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;integration&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;meta&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;The&lt;/span&gt; &lt;span class="nx"&gt;classifier&lt;/span&gt; &lt;span class="nx"&gt;looks&lt;/span&gt; &lt;span class="nx"&gt;at&lt;/span&gt; &lt;span class="nx"&gt;frontmatter&lt;/span&gt; &lt;span class="nx"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="nx"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;code&lt;/span&gt; &lt;span class="nx"&gt;block&lt;/span&gt; &lt;span class="nx"&gt;density&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;and&lt;/span&gt; &lt;span class="nx"&gt;structural&lt;/span&gt; &lt;span class="nx"&gt;patterns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;A&lt;/span&gt; &lt;span class="nx"&gt;skill&lt;/span&gt; &lt;span class="kd"&gt;with&lt;/span&gt; &lt;span class="nx"&gt;many&lt;/span&gt; &lt;span class="s2"&gt;` ```&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;endraw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;bash&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;` blocks and words like "test", "build", "deploy" gets classified as `&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;endraw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="nx"&gt;coding&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;` with high confidence. A skill mentioning "tone", "audience", "draft" lands in `&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;endraw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="nx"&gt;writing&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`.

**Why classification matters: each type gets different prescriptions.** A coding skill missing examples is a critical failure — Claude needs to understand exact input/output transformations. A workflow skill missing examples is a warning. Same rule, different severity, different fix.
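
The scoring idea is simple enough to sketch. This is a hypothetical reconstruction, not pulser's actual source: count signal words per skill type, weight fenced-code density toward `coding`, and normalize the winner's score into a rough confidence value.

```typescript
// Hypothetical keyword-density classifier (illustrative only; pulser's real
// classifier and signal lists are not published in this post).
type SkillType = "coding" | "writing" | "workflow";

const SIGNALS: Record<SkillType, string[]> = {
  coding: ["test", "build", "deploy", "refactor", "compile"],
  writing: ["tone", "audience", "draft", "style"],
  workflow: ["checklist", "approve", "handoff", "escalate"],
};

function classify(body: string): { type: SkillType; confidence: number } {
  const text = body.toLowerCase();
  // Count fence lines; each opening/closing pair counts as one block.
  const codeBlocks = (text.match(/^`{3}/gm) ?? []).length / 2;
  const scored = (Object.keys(SIGNALS) as SkillType[]).map((type) => {
    let score = SIGNALS[type].filter((word) => text.includes(word)).length;
    if (type === "coding") score += codeBlocks; // code density favors coding
    return { type, score };
  });
  scored.sort((a, b) => b.score - a.score);
  const total = scored.reduce((sum, s) => sum + s.score, 0) || 1;
  return { type: scored[0].type, confidence: scored[0].score / total };
}
```

A body full of build/deploy language and fenced commands scores heavily toward `coding`; a body about tone and audience lands in `writing` with whatever confidence its margin supports.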

```bash
$ npx pulser-cli --skill my-tdd-skill.md

  ┌─────────────────────────────────────────┐
  │  PULSER v0.3.1 — Skill Diagnostic       │
  │  Diagnose. Prescribe. Fix.              │
  └─────────────────────────────────────────┘

  Scanning: my-tdd-skill.md
  Classification: coding (confidence: 0.92)

  ■ frontmatter-required    PASS
  ■ description-quality     WARN  Description is 8 chars — too short for reliable routing
  ■ trigger-clarity         FAIL  No trigger condition found in frontmatter or body
  ■ instruction-structure   PASS
  ■ conflict-detection      PASS
  ■ example-coverage        FAIL  0 examples found — coding skills need ≥2
  ■ scope-boundaries        PASS
  ■ dependency-chain        SKIP  (experimental)

  2 issues found. Run with --fix to auto-repair.
```

## Step 3: Prescriptions, Not Just Pass/Fail

**pulser generates type-specific repair suggestions rather than generic error messages — this is what separates a diagnostic tool from a linter.** Every competitor I evaluated takes the same approach: scan, report pass or fail, stop. That's a linter. I didn't want a linter. I wanted a doctor.

When pulser finds an issue, it generates a prescription showing the current state, the proposed fix, and why the fix is appropriate for that skill's type. A `coding` skill with a vague description gets different language than a `writing` skill with the same problem:

```bash
$ npx pulser-cli --fix

  Prescription for my-tdd-skill.md:

  1. description-quality (WARN)
     Current:  "TDD stuff"
     Proposed: "Use when starting any coding task — enforces red-green-refactor
                cycle with test-first development. Triggers on: 'write tests',
                'TDD', 'test first'."
     → Auto-fix available. Backup will be created.

  2. trigger-clarity (FAIL)
     Current:  (none)
     Proposed: Add trigger block to frontmatter:
               triggers: ["TDD", "test first", "write tests", "red green refactor"]
     → Auto-fix available. Backup will be created.

  Apply fixes? [y/N]
```

**Every fix creates a timestamped backup before writing anything.** The files go to `.pulser/backups/{timestamp}/` and you can roll them back with one command. Nothing is modified without showing you the exact diff first.
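
The prescription output above maps naturally onto a small record type. The shape below is my sketch of that idea (pulser's internal types aren't shown in this post); the point is that the rationale is chosen per classified skill type:

```typescript
// Hypothetical prescription record mirroring the CLI output shown above.
interface Prescription {
  rule: string;
  severity: "WARN" | "FAIL";
  current: string;
  proposed: string;
  rationale: string; // worded differently per skill type
}

// Illustrative rule: short descriptions hurt skill routing.
function prescribeDescription(skillType: "coding" | "writing", current: string): Prescription | null {
  if (current.trim().length >= 40) return null; // long enough to route on
  return {
    rule: "description-quality",
    severity: "WARN",
    current,
    proposed: `Expand "${current}" with the task verbs or trigger phrases the skill should match.`,
    rationale:
      skillType === "coding"
        ? "Coding skills route on task verbs; name them explicitly."
        : "Writing skills route on audience and intent; spell both out.",
  };
}
```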

## Step 4: The Undo System

The fix engine uses atomic writes with full rollback support. I learned this lesson from deploying database migrations: **if you can't undo it, you shouldn't automate it.**

```bash
$ npx pulser-cli undo

  Found 1 backup set:
  [1] 2026-03-15T14:32:00Z — 2 files modified
      my-tdd-skill.md (description + triggers added)
      debug-workflow.md (examples section added)

  Restore backup [1]? [y/N] y

  ✓ Restored my-tdd-skill.md
  ✓ Restored debug-workflow.md
  Backup retained at .pulser/backups/2026-03-15T143200/
```

The undo system reads the backup, validates the file still exists at the expected path, writes to a temp file, then renames atomically. **If the process dies mid-write, you don't end up with a half-written skill file.** This sounds paranoid for markdown files, but partial writes in shell scripts have corrupted enough of my config files that I now treat atomic writes as non-negotiable.
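
Here is a minimal sketch of that write path, assuming the behavior described above (the backup directory layout and function names are mine, not lifted from the pulser repo):

```typescript
// Atomic write with backup: copy the original into a timestamped backup
// directory, write new content to a temp file, then rename over the original.
import { copyFileSync, existsSync, mkdirSync, renameSync, writeFileSync } from "node:fs";
import { basename, dirname, join } from "node:path";

function atomicFix(path: string, newContent: string, backupRoot = ".pulser/backups"): string {
  const stamp = new Date().toISOString().replace(/[:.]/g, "");
  const backupDir = join(backupRoot, stamp);
  mkdirSync(backupDir, { recursive: true });
  if (existsSync(path)) copyFileSync(path, join(backupDir, basename(path))); // backup first
  const tmp = join(dirname(path), `.${basename(path)}.tmp`);
  writeFileSync(tmp, newContent); // a crash here leaves only the temp file corrupted
  renameSync(tmp, path); // readers see the old file or the new one, never half of each
  return backupDir;
}
```

The rename is only atomic when the temp file sits on the same filesystem as the target, which is why it goes next to the original rather than in `/tmp`.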

## Step 5: The TUI That Nobody Asked For (But Everyone Remembers)

pulser displays an EtCO2-style patient monitor animation while scanning — a waveform tracing across the terminal in real time, exactly like a hospital vital signs monitor. Was this necessary? No. **Did it make the tool memorable enough that three people asked about the animation before asking what the tool actually does? Yes.**

```bash
$ npx pulser-cli --all

  ╭──────────────────────────────────────────╮
  │  ♥ PULSER — Skill Vitals                 │
  │  ╱╲    ╱╲    ╱╲    ╱╲                    │
  │ ╱  ╲__╱  ╲__╱  ╲__╱  ╲__                │
  │                                          │
  │  Skills: 23  Healthy: 9  Warning: 8      │
  │  Critical: 6  Fixable: 12                │
  ╰──────────────────────────────────────────╯
```

The `--no-anim` flag disables the animation for CI pipelines and terminals that don't support ANSI escape codes. I demoed the tool in a Discord server for Claude Code developers, and the animation generated more immediate questions than the diagnostic output. Memorable UI is a distribution strategy.

## Step 6: Output Formats for Every Workflow

pulser supports three output formats because different workflows require different data shapes. The default is human-readable terminal output. `--format json` emits a strict schema suitable for piping into other tools. `--format md` generates a markdown report you can commit to your repo.

```bash
# Human-readable (default)
$ npx pulser-cli

# JSON for piping into other tools
$ npx pulser-cli --format json | jq '.issues[] | select(.severity == "error")'

# Markdown for documentation
$ npx pulser-cli --format md > skill-health-report.md
```

The JSON schema is stable across patch versions. I use it in a pre-commit hook that blocks commits if any core-tier rules fail:

```bash
#!/bin/bash
# .git/hooks/pre-commit
ISSUES=$(npx pulser-cli --format json --strict 2>/dev/null | jq '.summary.errors')
if [ "$ISSUES" -gt 0 ]; then
  echo "pulser: $ISSUES skill errors found. Run 'npx pulser-cli' to see details."
  exit 1
fi
```

**Running pulser as a pre-commit hook means broken skills never reach your main branch.** The scan completes in under 4 seconds for 20 skills, fast enough that it doesn't noticeably slow commits.
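
For Node-based tooling, the same gate can be expressed against the parsed JSON instead of jq. The `summary.errors` and `issues[].severity` fields come from the examples in this post; treat the rest of the shape as an assumption until you inspect your version's output:

```typescript
// Gate on pulser's JSON report. `summary.errors` and `issues[].severity`
// match the jq examples above; the rest of the interface is a guess.
interface PulserReport {
  summary: { errors: number; warnings: number };
  issues: { rule: string; severity: "error" | "warning" }[];
}

function shouldBlockCommit(report: PulserReport): boolean {
  // Block on any error-severity issue; warnings alone pass through.
  return report.summary.errors > 0 || report.issues.some((i) => i.severity === "error");
}
```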

## Step 7: The Build Pipeline

The whole project is TypeScript, bundled with tsup into a single ESM file. Four runtime dependencies total:

```json
{
  "name": "pulser-cli",
  "version": "0.3.1",
  "type": "module",
  "bin": { "pulser": "./dist/index.js" },
  "dependencies": {
    "commander": "^12.0.0",
    "gray-matter": "^4.0.3",
    "chalk": "^5.3.0",
    "boxen": "^7.1.1"
  }
}
```

**gray-matter parses frontmatter, commander handles CLI args, chalk and boxen handle terminal formatting.** Everything else is standard library. The total bundle size is 847KB including dependencies. Node 18+ is the only runtime requirement.
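
A tsup config matching that description is only a few lines. This is a plausible sketch, not the repo's actual `tsup.config.ts`:

```typescript
// tsup.config.ts — single ESM bundle targeting Node 18+, as described above.
// Options here are a sketch; check the pulser repo for the real config.
import { defineConfig } from "tsup";

export default defineConfig({
  entry: ["src/index.ts"],
  format: ["esm"],                       // one ESM output file
  target: "node18",                      // matches the Node 18+ requirement
  minify: true,
  banner: { js: "#!/usr/bin/env node" }, // lets dist/index.js run as the bin
});
```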

```bash
# Install globally
npm i -g pulser-cli

# Or run without installing
npx pulser-cli
```

## What I'd Do Differently

**I should have shipped with 3 rules instead of 8.** I launched with 8 rules because I wanted to feel comprehensive. In practice, the 3 core rules catch 80% of real problems. The recommended tier adds nuance but also adds noise for users who just want the basics working. I'd ship core-only and gate the rest behind a `--full` flag.

**The classifier needs more training data.** My confidence scores are calibrated against my own skill files plus a small set of open source examples — roughly 50 skills total. The classifier works well for common patterns (TDD skills, writing skills, workflow automations) but produces low-confidence scores on unusual skill types. I need at minimum 200 diverse skills to make the confidence values trustworthy.

**I over-invested in the TUI animation before the fix engine was solid.** The waveform animation took a full day to build. During that time, the prescription engine had a bug where it would suggest adding a `&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;endraw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="nx"&gt;triggers&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;` field to skills that already had trigger keywords embedded in the body — a false positive that would have confused early users. Animation is memorable, but correctness ships first.

## The Numbers

### Cost Comparison

| Approach | Cost | Time to First Result |
|----------|------|---------------------|
| Manual skill review | $0 | 2–3 hours for 20 skills |
| pulser scan | $0 | 4 seconds for 20 skills |
| Asking Claude to review skills | ~$0.03/skill | 30 seconds per skill |
| Building custom validation | $0 + 8–16 hours dev time | Varies |

### Scan Performance

| Metric | Value |
|--------|-------|
| Skills scanned per second | ~5 |
| Average fix generation time | 200ms |
| Backup + atomic write | &amp;lt;50ms per file |
| Total bundle size | 847KB |
| Runtime dependencies | 4 |
| Diagnostic rules | 8 (3 core + 4 recommended + 1 experimental) |

## FAQ

### Does pulser modify my skill files without asking?

No. The `&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;endraw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="nx"&gt;fix&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;` flag shows you every proposed change with a before/after diff and requires explicit confirmation before writing. Every modification creates a timestamped backup at `&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;endraw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nx"&gt;pulser&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;backups&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="sr"&gt;/{% raw %}` before touching the original. You can restore any change with `pulser undo`. Nothing is destructive by default&lt;/span&gt;&lt;span class="err"&gt;.
&lt;/span&gt;
&lt;span class="err"&gt;###&lt;/span&gt; &lt;span class="nx"&gt;How&lt;/span&gt; &lt;span class="nx"&gt;does&lt;/span&gt; &lt;span class="nx"&gt;pulser&lt;/span&gt; &lt;span class="nx"&gt;know&lt;/span&gt; &lt;span class="nx"&gt;what&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;good&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;skill&lt;/span&gt; &lt;span class="nx"&gt;looks&lt;/span&gt; &lt;span class="nx"&gt;like&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;

&lt;span class="nx"&gt;It&lt;/span&gt; &lt;span class="k"&gt;implements&lt;/span&gt; &lt;span class="nx"&gt;Anthropic&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;s published skill quality principles as executable rules. The frontmatter checks follow the documented Claude Code skill schema. The description quality rule uses measurable heuristics: minimum character length, presence of trigger keywords, and specificity scoring based on action verb density. It&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="nx"&gt;opinionated&lt;/span&gt; &lt;span class="nx"&gt;but&lt;/span&gt; &lt;span class="nx"&gt;grounded&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;published&lt;/span&gt; &lt;span class="nx"&gt;documentation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;not&lt;/span&gt; &lt;span class="nx"&gt;personal&lt;/span&gt; &lt;span class="nx"&gt;taste&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="err"&gt;###&lt;/span&gt; &lt;span class="nx"&gt;Does&lt;/span&gt; &lt;span class="nx"&gt;pulser&lt;/span&gt; &lt;span class="nx"&gt;work&lt;/span&gt; &lt;span class="kd"&gt;with&lt;/span&gt; &lt;span class="nx"&gt;skills&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;subdirectories&lt;/span&gt; &lt;span class="nx"&gt;or&lt;/span&gt; &lt;span class="nx"&gt;custom&lt;/span&gt; &lt;span class="nx"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;

&lt;span class="nx"&gt;Yes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;By&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;it&lt;/span&gt; &lt;span class="nx"&gt;scans&lt;/span&gt; &lt;span class="s2"&gt;`~/.claude/skills/`&lt;/span&gt; &lt;span class="nx"&gt;and&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt; &lt;span class="nx"&gt;project&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;level&lt;/span&gt; &lt;span class="s2"&gt;`.claude/skills/`&lt;/span&gt; &lt;span class="nx"&gt;directories&lt;/span&gt; &lt;span class="nx"&gt;it&lt;/span&gt; &lt;span class="nx"&gt;finds&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;You&lt;/span&gt; &lt;span class="nx"&gt;can&lt;/span&gt; &lt;span class="nx"&gt;point&lt;/span&gt; &lt;span class="nx"&gt;it&lt;/span&gt; &lt;span class="nx"&gt;at&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`npx pulser-cli /path/to/my/skills/`&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;The&lt;/span&gt; &lt;span class="s2"&gt;`--skill`&lt;/span&gt; &lt;span class="nx"&gt;flag&lt;/span&gt; &lt;span class="nx"&gt;scans&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;single&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`npx pulser-cli --skill my-skill.md`&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="err"&gt;###&lt;/span&gt; &lt;span class="nx"&gt;Can&lt;/span&gt; &lt;span class="nx"&gt;I&lt;/span&gt; &lt;span class="nx"&gt;use&lt;/span&gt; &lt;span class="nx"&gt;pulser&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;CI&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;CD&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;

&lt;span class="nx"&gt;Yes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;Use&lt;/span&gt; &lt;span class="s2"&gt;`--format json`&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;machine&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;readable&lt;/span&gt; &lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="nx"&gt;and&lt;/span&gt; &lt;span class="s2"&gt;`--strict`&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;exit&lt;/span&gt; &lt;span class="kd"&gt;with&lt;/span&gt; &lt;span class="nx"&gt;code&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt; &lt;span class="nx"&gt;errors&lt;/span&gt; &lt;span class="nx"&gt;are&lt;/span&gt; &lt;span class="nx"&gt;found&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;The&lt;/span&gt; &lt;span class="s2"&gt;`--no-anim`&lt;/span&gt; &lt;span class="nx"&gt;flag&lt;/span&gt; &lt;span class="nx"&gt;disables&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;TUI&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;non&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;interactive&lt;/span&gt; &lt;span class="nx"&gt;environments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;See&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;pre&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;commit&lt;/span&gt; &lt;span class="nx"&gt;hook&lt;/span&gt; &lt;span class="nx"&gt;example&lt;/span&gt; &lt;span class="nx"&gt;above&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="err"&gt;###&lt;/span&gt; &lt;span class="nx"&gt;What&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;s the difference between pulser and a generic markdown linter?

A markdown linter checks syntax. **pulser checks semantics — whether your skill&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="nx"&gt;structure&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;and&lt;/span&gt; &lt;span class="nx"&gt;trigger&lt;/span&gt; &lt;span class="nx"&gt;conditions&lt;/span&gt; &lt;span class="nx"&gt;will&lt;/span&gt; &lt;span class="nx"&gt;actually&lt;/span&gt; &lt;span class="nx"&gt;work&lt;/span&gt; &lt;span class="kd"&gt;with&lt;/span&gt; &lt;span class="nx"&gt;Claude&lt;/span&gt; &lt;span class="nx"&gt;Code&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;s routing logic.** It understands the difference between skill types, generates context-aware prescriptions, and auto-fixes issues with rollback. markdownlint will catch a broken heading. pulser will tell you your description is too generic to route reliably, and rewrite it.

## Try It Yourself

1. Run `npx pulser-cli` in any directory with Claude Code skills
2. Read the output — fix core-tier failures before anything else
3. Run `npx pulser-cli --fix` to review proposed repairs
4. Accept the fixes and verify Claude routes to your skills more reliably
5. Add the pre-commit hook if you want ongoing enforcement
6. Star the repo if it helped: [whynowlab/pulser](https://github.com/whynowlab/pulser)

## What&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="nx"&gt;Next&lt;/span&gt;

&lt;span class="nx"&gt;If&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ve written Claude Code skills and wondered why Claude sometimes ignores them — run the scan. **The most common failure across 50+ skill files I&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="nx"&gt;ve&lt;/span&gt; &lt;span class="nx"&gt;analyzed&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="nx"&gt;field&lt;/span&gt; &lt;span class="nx"&gt;that&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;s too generic for Claude to route reliably.** It takes 4 seconds to find out if yours has the same problem.

Have you run into skill reliability issues with Claude Code? What failure patterns have you seen? Drop a comment — I&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt; &lt;span class="nx"&gt;building&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt; &lt;span class="nx"&gt;rule&lt;/span&gt; &lt;span class="kd"&gt;set&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="nx"&gt;real&lt;/span&gt; &lt;span class="nx"&gt;failure&lt;/span&gt; &lt;span class="nx"&gt;modes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;not&lt;/span&gt; &lt;span class="nx"&gt;hypothetical&lt;/span&gt; &lt;span class="nx"&gt;ones&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="o"&gt;---&lt;/span&gt;

&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;I&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;m a developer building AI-powered infrastructure tools. pulser started as a debugging script for my own Claude Code skill files and grew into an open source CLI that has now diagnosed over 200 skill files across early adopter setups. I write about building developer tools and the engineering mistakes that make them better.*
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>cli</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I Run My AI Assistant 24/7 on a $0 Server. Here's Every Detail.</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Mon, 23 Mar 2026 02:02:54 +0000</pubDate>
      <link>https://dev.to/thestack_ai/i-run-my-ai-assistant-247-on-a-0-server-heres-every-detail-32e8</link>
      <guid>https://dev.to/thestack_ai/i-run-my-ai-assistant-247-on-a-0-server-heres-every-detail-32e8</guid>
      <description>&lt;p&gt;My AI assistant doesn't sleep.&lt;/p&gt;

&lt;p&gt;It checks my calendar every 3 hours, summarizes my day at 9 PM, reviews my GitHub repos every 2 hours, and syncs notes to Notion at 10 PM. All on its own.&lt;/p&gt;

&lt;p&gt;The server costs me $0/month. Not "$0 for the first year." Not "$0 with credits." &lt;strong&gt;Actually zero, forever.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here's the full setup — including the mistakes I made so you won't have to.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; You can run a Claude-powered Telegram bot with 9 scheduled jobs 24/7 on Oracle Cloud's Always Free tier — $0/month, no expiration. The hardware: 1 OCPU + 1 GB RAM + 4 GB swap. After 2 weeks of operation: 99.7% uptime, 3 auto-recovered OOM events, ~15 scheduled job executions per day. Key tricks: aggressive swap, &lt;code&gt;flock&lt;/code&gt;-based concurrency control, systemd auto-restart, and 5-minute health monitoring via cron.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem: AI Assistants Die When You Close Your Laptop
&lt;/h2&gt;

&lt;p&gt;I built a Telegram bot powered by Claude. It worked great — when my MacBook was open. Close the lid? Silent. Morning briefings? Missed. Calendar reminders? Gone.&lt;/p&gt;

&lt;p&gt;The "personal AI assistant" was really a "personal AI assistant that only works during business hours when I'm already at my desk."&lt;/p&gt;

&lt;p&gt;I needed a server. But I also needed it to cost nothing, because this is a side project, not a startup.&lt;/p&gt;

&lt;h2&gt;
  
  
  The $0 Solution: Oracle Cloud Always Free
&lt;/h2&gt;

&lt;p&gt;Oracle Cloud offers an &lt;a href="https://www.oracle.com/cloud/free/" rel="noopener noreferrer"&gt;&lt;strong&gt;Always Free&lt;/strong&gt; tier&lt;/a&gt; that never expires. Unlike AWS's 12-month free tier or GCP's $300 credit, OCI Always Free resources &lt;a href="https://docs.oracle.com/en-us/iaas/Content/FreeTier/freetier_topic-Always_Free_Resources.htm" rel="noopener noreferrer"&gt;do not expire&lt;/a&gt;. Here's what you get:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Always Free Allocation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AMD Compute&lt;/td&gt;
&lt;td&gt;1 OCPU, 1 GB RAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Boot Volume&lt;/td&gt;
&lt;td&gt;47 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Outbound Data&lt;/td&gt;
&lt;td&gt;10 TB/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Object Storage&lt;/td&gt;
&lt;td&gt;20 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;What you'll pay in total:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OCI server: &lt;strong&gt;$0/month&lt;/strong&gt; (Always Free, no credit card charges after verification)&lt;/li&gt;
&lt;li&gt;Claude API: &lt;strong&gt;$0/month&lt;/strong&gt; (using Claude Code with an existing subscription — not per-API-call billing)&lt;/li&gt;
&lt;li&gt;Telegram Bot API: &lt;strong&gt;$0/month&lt;/strong&gt; (free forever)&lt;/li&gt;
&lt;li&gt;Domain/DNS: &lt;strong&gt;Not needed&lt;/strong&gt; (Telegram handles routing)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Total: $0.00/month&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One OCPU and 1 GB of RAM sounds terrible. And honestly? It is terrible. &lt;strong&gt;But it's enough.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Provision and SSH In
&lt;/h2&gt;

&lt;p&gt;After creating your OCI account (you'll need a credit card for verification — it won't be charged), spin up an "Always Free eligible" AMD instance with Ubuntu.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh &lt;span class="nt"&gt;-i&lt;/span&gt; ~/.ssh/your_key ubuntu@YOUR_SERVER_IP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First thing: check your RAM situation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;free &lt;span class="nt"&gt;-h&lt;/span&gt;
&lt;span class="c"&gt;#               total   used   free&lt;/span&gt;
&lt;span class="c"&gt;# Mem:          981Mi   612Mi  102Mi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yeah. 981 MB total. My Telegram bot alone eats 300-400 MB during Claude API calls. We need swap.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: The Swap Trick (This Saved Everything)
&lt;/h2&gt;

&lt;p&gt;With 1 GB RAM, you'll hit OOM kills within hours. Don't skip this step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;fallocate &lt;span class="nt"&gt;-l&lt;/span&gt; 4G /swapfile
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;600 /swapfile
&lt;span class="nb"&gt;sudo &lt;/span&gt;mkswap /swapfile
&lt;span class="nb"&gt;sudo &lt;/span&gt;swapon /swapfile

&lt;span class="c"&gt;# Make it permanent&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'/swapfile none swap sw 0 0'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; /etc/fstab

&lt;span class="c"&gt;# Tune swappiness — use swap aggressively&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl vm.swappiness&lt;span class="o"&gt;=&lt;/span&gt;60
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'vm.swappiness=60'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; /etc/sysctl.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Mem:   981Mi total, ~350Mi free
# Swap:  4.0Gi total, ~3.8Gi free
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why 4 GB and not 2 GB?&lt;/strong&gt; During peak scheduled job execution, swap usage hits 2.1 GB. With 2 GB swap, you'd still OOM. 4 GB gives comfortable headroom on an SSD-backed boot volume. Response times go from ~2s to ~5s under memory pressure, &lt;strong&gt;but the bot never crashes.&lt;/strong&gt;&lt;/p&gt;
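&lt;p&gt;To see where you stand at any moment, the kernel exposes the live numbers in &lt;code&gt;/proc/meminfo&lt;/code&gt;. A small check along these lines (the 1 GB threshold is my own choice, not a magic number):&lt;/p&gt;

```shell
# Print free swap and warn when headroom gets thin.
# The 1024 MB threshold is arbitrary — tune it to your peak usage.
swap_free_kb=$(awk '/^SwapFree:/ {print $2}' /proc/meminfo)
swap_free_kb=${swap_free_kb:-0}
swap_free_mb=$((swap_free_kb / 1024))
echo "swap free: ${swap_free_mb} MB"
if [ "$swap_free_mb" -lt 1024 ]; then
  echo "WARN: under 1 GB of swap headroom"
fi
```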

&lt;h2&gt;
  
  
  Step 3: Prevent Concurrent Execution
&lt;/h2&gt;

&lt;p&gt;Claude API calls are expensive (in time and memory). Two simultaneous requests on 1 GB RAM = instant OOM. I wrote a wrapper using &lt;a href="https://man7.org/linux/man-pages/man1/flock.1.html" rel="noopener noreferrer"&gt;&lt;code&gt;flock&lt;/code&gt;&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# claude-single.sh — ensures only one Claude process runs at a time&lt;/span&gt;
&lt;span class="nv"&gt;LOCKFILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/tmp/claude-telegram.lock"&lt;/span&gt;

&lt;span class="nb"&gt;exec &lt;/span&gt;200&amp;gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCKFILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; flock &lt;span class="nt"&gt;-n&lt;/span&gt; 200&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Another instance is running. Queuing..."&lt;/span&gt;
    flock 200  &lt;span class="c"&gt;# Wait for lock&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Your actual command here&lt;/span&gt;
claude &lt;span class="nt"&gt;--message&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--output-format&lt;/span&gt; json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Crude? Yes. &lt;strong&gt;Effective? Absolutely.&lt;/strong&gt; Request #2 waits in line instead of competing for RAM.&lt;/p&gt;
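&lt;p&gt;If you want to see the lock's contract before trusting it with real traffic, a tiny experiment shows it: while one file descriptor holds the lock, a second non-blocking attempt fails instead of running concurrently.&lt;/p&gt;

```shell
# Minimal flock demo: the second non-blocking acquire fails (exit 1)
# while the first acquire (exit 0) still holds the lock.
LOCK=$(mktemp)
exec 9>"$LOCK"
flock -n 9
first=$?
( flock -n 8 ) 8>"$LOCK"
second=$?
echo "first acquire: $first, second acquire: $second"
```

&lt;p&gt;Drop the &lt;code&gt;-n&lt;/code&gt; and the second caller blocks until the lock frees — which is exactly the queueing behavior the wrapper above relies on.&lt;/p&gt;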

&lt;h2&gt;
  
  
  Step 4: systemd for Auto-Recovery
&lt;/h2&gt;

&lt;p&gt;The bot needs to survive crashes, OOM kills, and server reboots:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="c"&gt;# /etc/systemd/system/claude-telegram.service
&lt;/span&gt;&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;Claude Telegram Bot&lt;/span&gt;
&lt;span class="py"&gt;After&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network-online.target&lt;/span&gt;
&lt;span class="py"&gt;Wants&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network-online.target&lt;/span&gt;

&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;simple&lt;/span&gt;
&lt;span class="py"&gt;User&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;ubuntu&lt;/span&gt;
&lt;span class="py"&gt;WorkingDirectory&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/home/ubuntu/claude-telegram&lt;/span&gt;
&lt;span class="py"&gt;ExecStart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/home/ubuntu/claude-telegram/start.sh&lt;/span&gt;
&lt;span class="py"&gt;Restart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;always&lt;/span&gt;
&lt;span class="py"&gt;RestartSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;10&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;NODE_ENV=production&lt;/span&gt;

&lt;span class="nn"&gt;[Install]&lt;/span&gt;
&lt;span class="py"&gt;WantedBy&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;multi-user.target&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;claude-telegram
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start claude-telegram
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;Restart=always&lt;/code&gt; is the key line.&lt;/strong&gt; OOM kill at 3 AM? Back up in 10 seconds. No human intervention needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Health Monitoring
&lt;/h2&gt;

&lt;p&gt;"It's probably running" is not a monitoring strategy. I verify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# monitor.sh — runs every 5 minutes via cron&lt;/span&gt;
&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="nv"&gt;SERVICE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"claude-telegram"&lt;/span&gt;
&lt;span class="nv"&gt;LOG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/home/ubuntu/claude-telegram/logs/monitor.log"&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; systemctl is-active &lt;span class="nt"&gt;--quiet&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SERVICE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="nv"&gt;$SERVICE&lt;/span&gt;&lt;span class="s2"&gt; is down. Restarting..."&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SERVICE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="c"&gt;# Alert via Telegram (using a separate, lightweight curl call)&lt;/span&gt;
    curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"https://api.telegram.org/bot&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BOT_TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/sendMessage"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"chat_id=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ADMIN_CHAT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"text=Bot was down. Auto-restarted at &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;."&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Cron: check every 5 minutes&lt;/span&gt;
&lt;span class="k"&gt;*&lt;/span&gt;/5 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /home/ubuntu/claude-telegram/monitor.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In 2 weeks of operation, the monitor has caught and recovered from 3 OOM kills. &lt;strong&gt;Each time, the bot was back within 15 seconds.&lt;/strong&gt; Without monitoring, I wouldn't have even known it went down.&lt;/p&gt;
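&lt;p&gt;For scale: a 99.7% uptime figure over a 2-week window corresponds to roughly an hour of cumulative downtime budget. A quick back-of-envelope check:&lt;/p&gt;

```shell
# 0.3% downtime over a 14-day window, in minutes (integer arithmetic).
total_min=$((14 * 24 * 60))
down_min=$((total_min * 3 / 1000))   # 0.3% written as 3/1000
echo "window: $total_min min, downtime at 99.7%: about $down_min min"
```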

&lt;h2&gt;
  
  
  Step 6: Scheduled Jobs (The Real Value)
&lt;/h2&gt;

&lt;p&gt;A bot that only responds to messages is a chatbot. &lt;strong&gt;A bot that &lt;em&gt;proactively works for you&lt;/em&gt; is an assistant.&lt;/strong&gt; This distinction matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  The 9 Jobs I Run Daily
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Job&lt;/th&gt;
&lt;th&gt;Schedule&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Morning Briefing&lt;/td&gt;
&lt;td&gt;Weekdays 8:00 AM&lt;/td&gt;
&lt;td&gt;Calendar + tasks + news summary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calendar Check&lt;/td&gt;
&lt;td&gt;4x daily (9/12/15/18)&lt;/td&gt;
&lt;td&gt;Upcoming meeting alerts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Evening Summary&lt;/td&gt;
&lt;td&gt;Daily 9:00 PM&lt;/td&gt;
&lt;td&gt;Day recap + memory save&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Check&lt;/td&gt;
&lt;td&gt;Every 2 hours&lt;/td&gt;
&lt;td&gt;PR reviews, issue notifications&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory Sync&lt;/td&gt;
&lt;td&gt;Every 3 hours&lt;/td&gt;
&lt;td&gt;Sync context across devices&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Notion Sync&lt;/td&gt;
&lt;td&gt;Daily 10:00 PM&lt;/td&gt;
&lt;td&gt;Save important conversations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Weekly Review&lt;/td&gt;
&lt;td&gt;Friday 6:00 PM&lt;/td&gt;
&lt;td&gt;Week summary + planning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token Check&lt;/td&gt;
&lt;td&gt;Daily 10:00 AM&lt;/td&gt;
&lt;td&gt;Verify API credentials&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Threads Notify&lt;/td&gt;
&lt;td&gt;2x daily&lt;/td&gt;
&lt;td&gt;Social updates digest&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Why Morning Briefing Is the Killer Feature
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The morning briefing alone justifies the entire setup.&lt;/strong&gt; Every weekday at 8 AM, before I even open my laptop, my phone buzzes. One Telegram message with my schedule, pending tasks, and relevant news — generated by Claude using my calendar data via &lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;MCP (Model Context Protocol)&lt;/a&gt; integration.&lt;/p&gt;

&lt;p&gt;It pulls from Google Calendar and Notion, cross-references deadlines, and delivers a single digest. &lt;strong&gt;No app switching. No manual checks.&lt;/strong&gt; One message. That's it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 7: Authentication That Doesn't Expire Every Hour
&lt;/h2&gt;

&lt;p&gt;Default OAuth tokens expire quickly. For a 24/7 server, you need a long-lived token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Store in .env&lt;/span&gt;
&lt;span class="nv"&gt;CLAUDE_CODE_OAUTH_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_long_lived_token_here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I configured a token with 1-year validity. &lt;strong&gt;Without this, the bot silently fails every few hours&lt;/strong&gt; when the token expires — and you wake up to 8 hours of missed messages with no error in sight.&lt;/p&gt;

&lt;p&gt;I also run a daily token health check that alerts me 30 days before expiry.&lt;/p&gt;
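&lt;p&gt;The expiry check itself is just date arithmetic. A sketch with hard-coded dates so the math is visible — a real script would use &lt;code&gt;date +%s&lt;/code&gt; for "now" and read the expiry from wherever you recorded it when issuing the token:&lt;/p&gt;

```shell
# Days-until-expiry sketch (GNU date). Both dates are placeholders;
# the 30-day threshold is the one used by my daily check.
expiry="2027-03-01"
now=$(date -u -d "2026-03-23" +%s)   # fixed reference date for the example
exp=$(date -u -d "$expiry" +%s)
days_left=$(( (exp - now) / 86400 ))
echo "days until token expiry: $days_left"
if [ "$days_left" -lt 30 ]; then
  echo "ALERT: renew the OAuth token"
fi
```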

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Start with swap from minute one.&lt;/strong&gt; I spent 2 days debugging random crashes before realizing it was OOM kills. One command would have told me immediately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dmesg | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; oom
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Don't underestimate 1 OCPU.&lt;/strong&gt; The CPU is fine. It's not fast, but Claude API calls are I/O-bound (waiting for API responses), not CPU-bound. One OCPU handles my workload without breaking a sweat.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Log everything from day one.&lt;/strong&gt; I added proper logging after a week of "it probably works." Future me was grateful. The monitor log has caught issues I never would have noticed otherwise.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers: $0 vs Paid Servers
&lt;/h2&gt;

&lt;p&gt;After 2 weeks of 24/7 operation, here's how the free tier actually compares to paid alternatives:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;OCI Always Free ($0/mo)&lt;/th&gt;
&lt;th&gt;Typical VPS ($5-20/mo)&lt;/th&gt;
&lt;th&gt;Dedicated ($200+/mo)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Uptime&lt;/td&gt;
&lt;td&gt;99.7% (auto-recovered)&lt;/td&gt;
&lt;td&gt;~99.9%&lt;/td&gt;
&lt;td&gt;99.99%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Response time&lt;/td&gt;
&lt;td&gt;40-50s&lt;/td&gt;
&lt;td&gt;40-50s&lt;/td&gt;
&lt;td&gt;40-50s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monthly cost&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;td&gt;$5-20&lt;/td&gt;
&lt;td&gt;$200+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAM&lt;/td&gt;
&lt;td&gt;1 GB + 4 GB swap&lt;/td&gt;
&lt;td&gt;2-4 GB&lt;/td&gt;
&lt;td&gt;16+ GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Concurrent requests&lt;/td&gt;
&lt;td&gt;1 (flock-limited)&lt;/td&gt;
&lt;td&gt;3-5&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The response time row is the key insight: it's identical across all tiers.&lt;/strong&gt; The bottleneck is Claude's API latency (40-50 seconds of thinking time), not server hardware. You'd be paying $200/month for faster concurrent handling — which you probably don't need for a personal assistant that handles 30-40 messages a day.&lt;/p&gt;

&lt;h3&gt;
  
  
  Full Metrics Breakdown
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Uptime&lt;/td&gt;
&lt;td&gt;99.7% (3 auto-recovered OOM events)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monthly cost&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Average response time&lt;/td&gt;
&lt;td&gt;40-50 seconds (Claude API latency)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Daily messages handled&lt;/td&gt;
&lt;td&gt;~30-40&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scheduled jobs/day&lt;/td&gt;
&lt;td&gt;~15 executions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAM usage (steady state)&lt;/td&gt;
&lt;td&gt;~600 MB + swap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Swap usage (peak)&lt;/td&gt;
&lt;td&gt;~2.1 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Total Stack
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+-----------------------------------+
|  Oracle Cloud (Always Free)       |
|  Ubuntu 22.04 / 1 OCPU / 1 GB     |
|  + 4 GB Swap                      |
|                                   |
|  +-----------------------------+  |
|  | systemd (auto-restart)      |  |
|  |  +- Telegram Bot (Python)   |  |
|  |      +-- Claude API         |  |
|  |      +-- MCP: Calendar      |  |
|  |      +-- MCP: Notion        |  |
|  |      +-- 9 Scheduled Jobs   |  |
|  +-----------------------------+  |
|                                   |
|  +-----------------------------+  |
|  | Cron                        |  |
|  |  +-- monitor.sh (5 min)     |  |
|  |  +-- memory-sync (3 hr)     |  |
|  |  +-- token-check (daily)    |  |
|  +-----------------------------+  |
+-----------------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Will Oracle actually keep this free forever?
&lt;/h3&gt;

&lt;p&gt;Oracle's Always Free tier has been available since 2019 with no expiration date. Unlike AWS's 12-month free tier or GCP's $300 credit that runs out, OCI Always Free resources genuinely persist indefinitely. I've been running for over 2 weeks with zero charges on my account. That said — Oracle can change terms, so don't build a business-critical production service on it. For a personal AI assistant? It's perfect.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I run GPT-based bots instead of Claude?
&lt;/h3&gt;

&lt;p&gt;Yes. The server architecture is completely API-agnostic. Swap the Claude API call for OpenAI's API and everything else stays the same. The memory constraints, swap tricks, &lt;code&gt;flock&lt;/code&gt; concurrency, and systemd setup apply identically — &lt;strong&gt;the bottleneck is always API latency, not local compute.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What happens when the server runs out of swap?
&lt;/h3&gt;

&lt;p&gt;The Linux OOM killer activates and terminates the heaviest process — which is your bot. With &lt;code&gt;Restart=always&lt;/code&gt; in the systemd unit file, the bot comes back in approximately 10 seconds. In my 2 weeks of operation, this happened 3 times — all during overlapping scheduled job executions. The 5-minute monitor catches anything systemd misses.&lt;/p&gt;
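&lt;p&gt;For reference, a minimal unit sketch with that restart behavior. The paths and names here are placeholders, not the article's actual service file:&lt;/p&gt;

```ini
# /etc/systemd/system/telegram-bot.service (hypothetical name and paths)
[Unit]
Description=Claude Telegram bot
After=network-online.target

[Service]
# Point ExecStart at your actual bot entrypoint
ExecStart=/usr/bin/python3 /opt/bot/main.py
EnvironmentFile=/opt/bot/.env
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```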

&lt;h3&gt;
  
  
  Is 1 GB RAM enough for multiple bots?
&lt;/h3&gt;

&lt;p&gt;Probably not. One bot with &lt;code&gt;flock&lt;/code&gt;-serialized requests uses approximately 600 MB at steady state. A second bot would push you into constant heavy swapping with degraded performance. For running multiple bots, look at OCI's Ampere A1 Always Free tier — 4 OCPUs and 24 GB RAM — though availability varies by region and is often limited.&lt;/p&gt;
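&lt;p&gt;The serialization mentioned above is a single &lt;code&gt;flock&lt;/code&gt; call per request. A minimal sketch; the lock path and the echo stand-in are hypothetical:&lt;/p&gt;

```shell
# One handler at a time: every request funnels through the same lock file.
LOCKFILE="/tmp/bot.lock"
handle_request() {
  # Wait up to 120s for the lock; the command runs only once the lock is held.
  flock -w 120 "$LOCKFILE" -c "echo processing: $1"   # swap echo for the real Claude call
}
handle_request "morning-briefing"
```

&lt;p&gt;With 1 GB of RAM, refusing to run two model calls at once is a big part of what keeps the OOM killer quiet.&lt;/p&gt;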

&lt;h3&gt;
  
  
  How do I set up the MCP integrations (Calendar, Notion)?
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;MCP (Model Context Protocol)&lt;/a&gt; lets Claude connect to external services through standardized tool interfaces. I use two MCP servers: one for Google Calendar (HTTP-based with OAuth) and one for Notion (stdio-based with API key). The Claude documentation covers MCP setup in detail — the server-side configuration is identical whether you're running locally or on OCI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Get an OCI account&lt;/strong&gt;: Sign up at &lt;a href="https://www.oracle.com/cloud/free/" rel="noopener noreferrer"&gt;Oracle Cloud&lt;/a&gt;. Select "Always Free" resources only.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set up swap immediately&lt;/strong&gt;: 4 GB minimum. Don't skip this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use systemd + monitoring&lt;/strong&gt;: Your bot &lt;em&gt;will&lt;/em&gt; crash. Make sure it comes back automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start with one scheduled job&lt;/strong&gt;: Morning briefing is the most impactful. Add more as you validate each one works.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The whole setup took me a weekend. The hardest part wasn't the code — it was accepting that 1 GB of RAM is actually enough when you manage it properly. Then it just... runs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your AI assistant shouldn't need your laptop to be open.&lt;/strong&gt; Give it a home that runs while you sleep.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What's your always-on setup?&lt;/strong&gt; Running AI tools on a VPS, a Raspberry Pi, or something weirder? Drop it in the comments — I genuinely want to see what others have built.&lt;/p&gt;

&lt;p&gt;If this saved you some Googling, &lt;strong&gt;bookmark this post&lt;/strong&gt; for when you're ready to deploy. And if you're into AI tools that actually run in production — not just in demos — &lt;strong&gt;follow me&lt;/strong&gt; for more.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I build AI-powered automation that runs 24/7 on minimal infrastructure — no drama, no cloud bills. This bot handles 30-40 messages daily plus 15 scheduled jobs without me touching it. If that's the kind of AI engineering you care about, follow along.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>cloud</category>
      <category>devops</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>I wrote a tool that diagnoses your Claude Code skills before they break</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Thu, 19 Mar 2026 08:56:45 +0000</pubDate>
      <link>https://dev.to/thestack_ai/i-wrote-a-tool-that-diagnoses-your-claude-code-skills-before-they-break-58ao</link>
      <guid>https://dev.to/thestack_ai/i-wrote-a-tool-that-diagnoses-your-claude-code-skills-before-they-break-58ao</guid>
      <description>&lt;p&gt;If you use Claude Code skills, you've probably been there — a skill looks fine, works&lt;br&gt;
  sometimes, then silently fails when you actually need it. Missing gotchas section,&lt;br&gt;
  description that doesn't trigger properly, 400-line monolith that should've been split&lt;br&gt;
  into files.&lt;/p&gt;

&lt;p&gt;Anthropic published 7 principles for writing good skills but didn't build anything to enforce them. So I did.&lt;/p&gt;

&lt;p&gt;pulser scans your SKILL.md files against 8 rules derived from those principles. But here's what makes it different from a linter: it doesn't just say "this is wrong." It tells you why it matters, gives you a ready-to-use template, and auto-fixes it if you want. And if the fix isn't what you expected, &lt;code&gt;pulser undo&lt;/code&gt; rolls everything back instantly.&lt;/p&gt;

&lt;p&gt;The whole pipeline is: diagnose, classify the skill type, prescribe with context, fix, rollback.&lt;/p&gt;

&lt;p&gt;You can run it three ways: in your terminal as a CLI (just type &lt;code&gt;pulser&lt;/code&gt;), inside Claude Code as a conversation ("check my skills" or &lt;code&gt;/pulser&lt;/code&gt;, and Claude does the rest), or in Codex.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                                                                                ▎ I've been running it on my own 54 skills and it caught things I would've never noticed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;— skills with overlapping trigger keywords stepping on each other, descriptions written&lt;br&gt;
   for humans instead of the model, missing tool restrictions that could let a read-only&lt;br&gt;
  skill accidentally write files.&lt;/p&gt;

&lt;p&gt;It's free, MIT-licensed, and a single npm install:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm install -g pulser-cli
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://github.com/whynowlab/pulser" rel="noopener noreferrer"&gt;github.com/whynowlab/pulser&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>I Run 5 Claude Code Agents at Once. I Had No Idea What They Were Doing.</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Thu, 19 Mar 2026 04:32:01 +0000</pubDate>
      <link>https://dev.to/thestack_ai/i-run-5-claude-code-agents-at-once-i-had-no-idea-what-they-were-doing-273p</link>
      <guid>https://dev.to/thestack_ai/i-run-5-claude-code-agents-at-once-i-had-no-idea-what-they-were-doing-273p</guid>
      <description>&lt;p&gt;I run 16 agent teams. Engineering, research, design, marketing, security — each team with 3-5 specialized agents working in parallel. On a busy day, that's 40+ concurrent Claude Code processes across my machines.&lt;/p&gt;

&lt;p&gt;The problem wasn't running them. Claude Code handles that fine. The problem was knowing what was happening.&lt;/p&gt;

&lt;p&gt;Which team finished? Which agent is stuck waiting for a permission prompt? Did the research team burn through $8 on a task I expected to cost $2? Is the code-review team actually running, or did it silently crash 20 minutes ago?&lt;/p&gt;

&lt;p&gt;I had no answers. Just a wall of terminal windows.&lt;/p&gt;

&lt;p&gt;So I built &lt;a href="https://github.com/whynowlab/ur-dashboard" rel="noopener noreferrer"&gt;ur-dashboard&lt;/a&gt;: a zero-config, real-time monitoring dashboard for Claude Code multi-agent workflows. One npm install, one command, and you get a single screen showing every agent, every cost, every skill — updated every 5 seconds via Server-Sent Events.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; ur-dashboard
ur-dashboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:3000&lt;/code&gt;. Done.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgf493acqqvvvaujbbup.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgf493acqqvvvaujbbup.png" alt="ur-dashboard showing agent activity, usage metrics, team panels, and skill tracking" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why is there no built-in monitoring for Claude Code agents?
&lt;/h2&gt;

&lt;p&gt;Claude Code is great at running agents. But once you scale beyond 2-3 agents, you lose visibility. There's no built-in way to see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which agents are active right now&lt;/li&gt;
&lt;li&gt;How much each model costs you per session&lt;/li&gt;
&lt;li&gt;Whether your team groupings are actually being used&lt;/li&gt;
&lt;li&gt;What skills your agents invoke most often&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Existing LLM observability tools (Langfuse, Helicone, LangSmith) solve this for production APIs — but they all require SDK integration, API keys, and infrastructure setup. If you just want to see what your local Claude Code agents are doing right now, there was nothing.&lt;/p&gt;

&lt;p&gt;ur-dashboard fills that gap. It reads your existing &lt;code&gt;~/.claude/&lt;/code&gt; directory with zero configuration — no SDK, no API keys, no infrastructure. Install with &lt;code&gt;npm install -g ur-dashboard&lt;/code&gt;, run &lt;code&gt;ur-dashboard&lt;/code&gt;, and open &lt;code&gt;localhost:3000&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Who should use ur-dashboard?
&lt;/h2&gt;

&lt;p&gt;ur-dashboard is designed for developers who run 2 or more Claude Code agents simultaneously. Specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You run multiple Claude Code agents and lose track of what's happening&lt;/li&gt;
&lt;li&gt;You want to know how much your AI workflows cost before the bill arrives&lt;/li&gt;
&lt;li&gt;You manage agent teams and need a visual overview of who's active&lt;/li&gt;
&lt;li&gt;You want to organize your agents into departments without editing config files by hand&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What does ur-dashboard show?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Real-time agent monitoring
&lt;/h3&gt;

&lt;p&gt;The dashboard auto-detects agents from &lt;code&gt;~/.claude/agents/&lt;/code&gt; and displays each agent's current status (active, idle, or stopped). If you're using an orchestrator, it picks up team groupings automatically. Status updates arrive every 5 seconds with no page refresh needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  API cost tracking
&lt;/h3&gt;

&lt;p&gt;ur-dashboard reads JSONL usage logs from &lt;code&gt;~/.claude/&lt;/code&gt; and calculates costs per model in real time:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input (per 1M tokens)&lt;/th&gt;
&lt;th&gt;Output (per 1M tokens)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;gpt-5.4&lt;/td&gt;
&lt;td&gt;$2.50&lt;/td&gt;
&lt;td&gt;$15.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gemini-3.1-pro&lt;/td&gt;
&lt;td&gt;$2.00&lt;/td&gt;
&lt;td&gt;$12.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;claude-sonnet-4&lt;/td&gt;
&lt;td&gt;$3.00&lt;/td&gt;
&lt;td&gt;$15.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Costs are aggregated across all providers (OpenAI, Google, Anthropic) and displayed as a single total. You see exactly how much each model costs per session.&lt;/p&gt;
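&lt;p&gt;The per-model arithmetic is just tokens divided by a million, times the table price. A sketch of the same math (the &lt;code&gt;cost&lt;/code&gt; helper is illustrative, not ur-dashboard's actual code):&lt;/p&gt;

```shell
# cost = input_tokens/1M * input_price + output_tokens/1M * output_price
cost() {
  awk -v i="$1" -v o="$2" -v pi="$3" -v po="$4" \
    'BEGIN { printf "%.4f\n", i / 1e6 * pi + o / 1e6 * po }'
}
cost 120000 30000 3.00 15.00   # claude-sonnet-4 rates from the table; prints 0.8100
```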

&lt;h3&gt;
  
  
  Team management
&lt;/h3&gt;

&lt;p&gt;Group agents into teams directly from the Settings tab. Save configurations. The dashboard persists groupings to &lt;code&gt;~/.claude/agents/teams.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"teams"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"engineering"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Core development"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"agents"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"code-reviewer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"implementer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tester"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No team config? It auto-groups agents by filename prefix.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dispatch API
&lt;/h3&gt;

&lt;p&gt;Trigger agents programmatically from any script or workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start an agent&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:3000/api/dispatch &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"agent": "code-reviewer", "prompt": "Review the auth module"}'&lt;/span&gt;

&lt;span class="c"&gt;# Stream output in real-time&lt;/span&gt;
curl &lt;span class="nt"&gt;-N&lt;/span&gt; http://localhost:3000/api/dispatch/&lt;span class="o"&gt;{&lt;/span&gt;jobId&lt;span class="o"&gt;}&lt;/span&gt;/stream

&lt;span class="c"&gt;# Cancel if needed&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; DELETE http://localhost:3000/api/dispatch/&lt;span class="o"&gt;{&lt;/span&gt;jobId&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Dispatch API supports a maximum of 3 concurrent jobs with configurable timeouts. All inputs are validated, and commands use &lt;code&gt;spawn&lt;/code&gt; with &lt;code&gt;shell: false&lt;/code&gt; to prevent shell injection.&lt;/p&gt;




&lt;h2&gt;
  
  
  How does ur-dashboard compare to Langfuse, Helicone, and LangSmith?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;ur-dashboard&lt;/th&gt;
&lt;th&gt;Langfuse&lt;/th&gt;
&lt;th&gt;Helicone&lt;/th&gt;
&lt;th&gt;LangSmith&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Zero config&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code native&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Open source (MIT)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;npx&lt;/code&gt; one-liner install&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SDK integration required&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent dispatch API&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Self-hosted&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key difference: Langfuse, Helicone, and LangSmith are general-purpose LLM observability platforms designed for production API tracing. They require SDK integration, API keys, and infrastructure setup.&lt;/p&gt;

&lt;p&gt;ur-dashboard is purpose-built for Claude Code local development. It reads your existing &lt;code&gt;~/.claude/&lt;/code&gt; directory with zero instrumentation — no code changes, no API keys, no hosted service. If you're already using Langfuse for production tracing, ur-dashboard is not a replacement. It solves a different problem: real-time visibility into local multi-agent workflows.&lt;/p&gt;




&lt;h2&gt;
  
  
  How does ur-dashboard stream data to the browser?
&lt;/h2&gt;

&lt;p&gt;The main dashboard endpoint (&lt;code&gt;GET /api/stream&lt;/code&gt;) uses Server-Sent Events (SSE) over a persistent HTTP connection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"usage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gpt-5.4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"cost"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.42&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"totalCost"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.87&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"teams"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"engineering"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"agentCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"active"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"commits"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fix auth bug"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"project"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"api"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"skills"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"skill"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tdd"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"capabilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"canDispatch"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"cliVersion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.1.78"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is no polling. The browser receives updates every 5 seconds via a single persistent SSE connection, keeping network overhead minimal.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is ur-dashboard built with?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Framework&lt;/td&gt;
&lt;td&gt;Next.js 16 (App Router, standalone)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UI&lt;/td&gt;
&lt;td&gt;React 19 + Tailwind CSS 4 (glassmorphism)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Charts&lt;/td&gt;
&lt;td&gt;Recharts 3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Streaming&lt;/td&gt;
&lt;td&gt;Server-Sent Events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;spawn&lt;/code&gt; with &lt;code&gt;shell: false&lt;/code&gt;, input validation, path traversal prevention&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The entire dashboard ships as a pre-built Next.js standalone binary. Running &lt;code&gt;npx ur-dashboard&lt;/code&gt; starts the server directly — there is no build step on the user's machine.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Does ur-dashboard work without an orchestrator?
&lt;/h3&gt;

&lt;p&gt;Yes. ur-dashboard works without any orchestrator. It scans &lt;code&gt;~/.claude/agents/&lt;/code&gt; and auto-detects agents by filename. If you do have an orchestrator, it picks up team definitions from &lt;code&gt;teams.json&lt;/code&gt; automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does ur-dashboard work on Windows?
&lt;/h3&gt;

&lt;p&gt;Yes. ur-dashboard runs on both macOS and Windows. It uses &lt;code&gt;os.homedir()&lt;/code&gt; for cross-platform path resolution and falls back to &lt;code&gt;taskkill&lt;/code&gt; for process management on Windows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does ur-dashboard modify my Claude Code files?
&lt;/h3&gt;

&lt;p&gt;No. ur-dashboard only reads from &lt;code&gt;~/.claude/&lt;/code&gt;. The only file it writes is &lt;code&gt;~/.claude/agents/teams.json&lt;/code&gt;, and only when you explicitly save team configurations from the Settings tab. Your agent files are never modified or deleted.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is the Dispatch API safe?
&lt;/h3&gt;

&lt;p&gt;Yes. All CLI execution uses &lt;code&gt;child_process.spawn&lt;/code&gt; with &lt;code&gt;shell: false&lt;/code&gt;. User prompts are passed as a single argument — never interpolated into shell commands. Agent names are validated against path traversal. Permission mode is restricted to an allowlist of three values: &lt;code&gt;default&lt;/code&gt;, &lt;code&gt;acceptEdits&lt;/code&gt;, and &lt;code&gt;plan&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  How much does ur-dashboard cost?
&lt;/h3&gt;

&lt;p&gt;ur-dashboard is free and open source under the MIT license. There is no hosted service, no account required, and no telemetry. It runs entirely on your local machine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I use ur-dashboard with non-Claude AI agents?
&lt;/h3&gt;

&lt;p&gt;ur-dashboard is purpose-built for Claude Code. It reads Claude Code's &lt;code&gt;~/.claude/&lt;/code&gt; directory structure and JSONL usage logs. It does not currently support other AI coding assistants such as Cursor, Windsurf, or GitHub Copilot.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Global install (recommended)&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; ur-dashboard
ur-dashboard

&lt;span class="c"&gt;# Or try without installing&lt;/span&gt;
npx ur-dashboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Works on macOS and Windows. MIT licensed. No account or API key required.&lt;/p&gt;




&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/whynowlab/ur-dashboard" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.npmjs.com/package/ur-dashboard" rel="noopener noreferrer"&gt;npm&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're running multi-agent Claude Code workflows and want visibility without setup overhead — give it a try. Stars, issues, and PRs are welcome.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devtools</category>
      <category>monitoring</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Why AI Coding Tools Eat Your RAM (And How to Fix It)</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Tue, 17 Mar 2026 08:07:08 +0000</pubDate>
      <link>https://dev.to/thestack_ai/why-ai-coding-tools-eat-your-ram-and-how-to-fix-it-1l53</link>
      <guid>https://dev.to/thestack_ai/why-ai-coding-tools-eat-your-ram-and-how-to-fix-it-1l53</guid>
      <description>&lt;p&gt;Your AI coding tool isnt slow Your machine is drowning in zombie processes&lt;/p&gt;

&lt;h2&gt;The Problem&lt;/h2&gt;

&lt;p&gt;If you use Claude Code, Codex, or similar AI coding assistants, you've probably noticed your machine getting slower over time. RAM usage creeping up. Fans spinning. Eventually, a force restart.&lt;/p&gt;

&lt;p&gt;Most people blame the AI tool. But the real culprit is usually orphaned child processes.&lt;/p&gt;

&lt;p&gt;Every time you start a Claude Code session, it spawns child processes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCP servers for tool integrations (Notion, Supabase, Playwright, etc.)&lt;/li&gt;
&lt;li&gt;Sub-agents for parallel task execution&lt;/li&gt;
&lt;li&gt;Headless browsers for web browsing and testing&lt;/li&gt;
&lt;li&gt;Build tools like esbuild, vite, and webpack in watch mode&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the session ends, especially on a crash or force quit, these children don't always exit. They become orphans. Still running. Still consuming RAM.&lt;/p&gt;

&lt;h2&gt;How Bad Is It?&lt;/h2&gt;

&lt;p&gt;I discovered this the hard way when my MacBook Pro (32GB RAM) ground to a halt:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Load average: 230 (normal is 4-8)&lt;/li&gt;
&lt;li&gt;RAM: 31GB used, 99MB free&lt;/li&gt;
&lt;li&gt;Swap: 145GB of disk thrashing&lt;/li&gt;
&lt;li&gt;CPU: 99%, all kernel_task trying to manage the memory crisis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When I investigated, I found:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;74 Chrome processes from agent browsers that never closed (56 GB)&lt;/li&gt;
&lt;li&gt;18 orphan node processes from dead Claude sessions (51 GB)&lt;/li&gt;
&lt;li&gt;7 zombie npm exec processes from TaskMaster AI&lt;/li&gt;
&lt;li&gt;Playwright headless shells, esbuild watchers, MCP servers, all orphaned&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's 22 GB of RAM consumed by processes doing absolutely nothing.&lt;/p&gt;

&lt;h2&gt;Why This Happens&lt;/h2&gt;

&lt;p&gt;The root cause is simple: process lifecycle management is hard.&lt;/p&gt;

&lt;p&gt;When a Claude Code session exits normally, it tries to clean up. But:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Crash exits don't trigger cleanup hooks&lt;/li&gt;
&lt;li&gt;Force quits (Cmd+Q, closing the terminal) may skip cleanup&lt;/li&gt;
&lt;li&gt;Sub-agents that spawn their own children create nested orphan trees&lt;/li&gt;
&lt;li&gt;MCP servers run as independent processes; the parent doesn't always know about them&lt;/li&gt;
&lt;li&gt;Headless browsers have their own daemon lifecycle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each session leaves a few survivors. After a week of heavy use, you have dozens.&lt;/p&gt;

&lt;h2&gt;The Fix: zclean&lt;/h2&gt;

&lt;p&gt;I built zclean, a small CLI that automatically finds and cleans up these orphaned processes.&lt;/p&gt;

&lt;p&gt;Install it with one command:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;npx @thestackai/zclean init&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This sets up two layers of protection:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A SessionEnd hook: when Claude Code exits, immediately clean that session's orphans&lt;/li&gt;
&lt;li&gt;An hourly scheduler: catch anything the hook missed (crashes, Codex orphans, etc.)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;The Safety Model&lt;/h2&gt;

&lt;p&gt;The most important design decision: if the parent process is alive, don't touch it.&lt;/p&gt;

&lt;p&gt;zclean only kills processes that are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Orphaned (the parent process is dead)&lt;/li&gt;
&lt;li&gt;A match for known AI tool patterns (MCP servers, agent browsers, etc.)&lt;/li&gt;
&lt;li&gt;Not in a tmux or screen session&lt;/li&gt;
&lt;li&gt;Not managed by pm2, forever, or supervisord&lt;/li&gt;
&lt;li&gt;Not in a Docker container&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your intentional vite dev server in a terminal tab? Untouched. Your &lt;code&gt;node api.js&lt;/code&gt; in tmux? Untouched. Only true zombies from dead AI sessions get cleaned.&lt;/p&gt;
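&lt;p&gt;To make the "orphaned" test concrete, here is a minimal sketch of how orphan detection can work (my reading of the approach, not zclean's actual source; the process snapshot and the pattern are made up for illustration). On Unix, a process whose parent dies is re-parented to PID 1, so a PPID of 1 flags it as an orphan candidate, and it must also match a known AI-tool pattern before it is eligible:&lt;/p&gt;

```javascript
// Parse `ps -axo pid=,ppid=,comm=`-style output into records.
function parsePs(output) {
  return output
    .trim()
    .split("\n")
    .map((line) => line.trim().split(/\s+/))
    .filter((cols) => cols.length >= 3)
    .map(([pid, ppid, ...comm]) => ({
      pid: Number(pid),
      ppid: Number(ppid),
      comm: comm.join(" "),
    }));
}

// Keep only re-parented processes (PPID 1) whose command matches the pattern.
function orphanCandidates(procs, pattern) {
  return procs.filter((p) => p.ppid === 1 && pattern.test(p.comm));
}

// Example with a fixed snapshot: only the two PPID-1 matches survive.
const snapshot = [
  "26413     1 node /usr/local/lib/claude-mcp-server",
  "26414   500 node dev-server.js",
  "62830     1 chrome --headless",
].join("\n");

console.log(orphanCandidates(parsePs(snapshot), /node|chrome/).map((p) => p.pid));
// [ 26413, 62830 ]
```

&lt;p&gt;The second process is skipped because its parent (PID 500) is still alive, which is exactly the "don't touch it" rule above.&lt;/p&gt;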

&lt;h2&gt;See It In Action&lt;/h2&gt;

&lt;pre&gt;&lt;code&gt;$ zclean

Found 8 zombie processes:

PID 26413  node    367 MB  orphan 18h  was: claude mcp server
PID 62830  chrome  200 MB  orphan 3h   was: agent browser
PID 26221  npm     142 MB  orphan 2d   was: npm exec task-master-ai

Total: 8 zombies, 20 GB reclaimable

$ zclean --yes

Cleaned 8 zombie processes. Reclaimed 20 GB.
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Technical Details&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Zero dependencies: only Node.js builtins&lt;/li&gt;
&lt;li&gt;Cross-platform: macOS, Linux, Windows&lt;/li&gt;
&lt;li&gt;Dry-run mode shows what would be cleaned&lt;/li&gt;
&lt;li&gt;Verifies each PID before killing&lt;/li&gt;
&lt;li&gt;SIGTERM first, then escalates to SIGKILL after 10s&lt;/li&gt;
&lt;/ul&gt;
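&lt;p&gt;The SIGTERM-then-SIGKILL escalation can be sketched like this (assumed behavior based on the description above, not zclean's verbatim code):&lt;/p&gt;

```javascript
// Ask the process to exit politely; force-kill only if it survives the grace period.
function terminate(pid, graceMs = 10_000) {
  try {
    process.kill(pid, "SIGTERM"); // polite request: cleanup handlers get to run
  } catch {
    return; // ESRCH: process already gone
  }
  setTimeout(() => {
    try {
      process.kill(pid, 0);         // signal 0: existence check only, sends nothing
      process.kill(pid, "SIGKILL"); // still alive after the grace period: force it
    } catch {
      // it exited on its own during the grace period
    }
  }, graceMs).unref(); // don't keep the cleaner process alive just for this timer
}
```

&lt;p&gt;The &lt;code&gt;process.kill(pid, 0)&lt;/code&gt; liveness probe is the standard Unix trick: signal 0 delivers nothing but still reports whether the PID exists.&lt;/p&gt;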

&lt;h2&gt;Check for Zombies&lt;/h2&gt;

&lt;p&gt;Check Task Manager for node processes; inactive ones are zombies. Or just run &lt;code&gt;npx @thestackai/zclean&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: whynowlab/zclean&lt;/li&gt;
&lt;li&gt;npm: @thestackai/zclean&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Issues and contributions welcome!&lt;/p&gt;

</description>
      <category>claude</category>
      <category>ai</category>
      <category>devtools</category>
      <category>productivity</category>
    </item>
    <item>
      <title>I built 6 cognitive firewalls because my AI kept confidently giving wrong answers</title>
      <dc:creator>thestack_ai</dc:creator>
      <pubDate>Mon, 16 Mar 2026 06:14:57 +0000</pubDate>
      <link>https://dev.to/thestack_ai/i-built-6-cognitive-firewalls-because-my-ai-kept-confidently-giving-wrong-answers-5aka</link>
      <guid>https://dev.to/thestack_ai/i-built-6-cognitive-firewalls-because-my-ai-kept-confidently-giving-wrong-answers-5aka</guid>
      <description>&lt;p&gt;I use Claude Code every day. It writes code fast. But it thinks poorly.&lt;/p&gt;

&lt;p&gt;Not always. Not obviously. That's what makes it dangerous.&lt;/p&gt;

&lt;h2&gt;
  
  
  The moment I realized something was wrong
&lt;/h2&gt;

&lt;p&gt;I asked Claude: "Is SQLite viable for an app with 1000 concurrent users?"&lt;/p&gt;

&lt;p&gt;It said: "No, SQLite is not suitable for high-concurrency applications. Use PostgreSQL or MySQL instead for production workloads."&lt;/p&gt;

&lt;p&gt;Confident. Clear. Completely wrong.&lt;/p&gt;

&lt;p&gt;1000 concurrent users does not equal 1000 concurrent writes. A typical web app at this scale generates about 30 concurrent write transactions. SQLite in WAL mode handles around 120 writes/sec. Expensify serves 10M+ users on SQLite.&lt;/p&gt;
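&lt;p&gt;The arithmetic behind that estimate can be sketched in a few lines (the two activity ratios here are illustrative assumptions of mine, not measured figures):&lt;/p&gt;

```javascript
// Back-of-the-envelope: concurrent users vs. concurrent writes.
const users = 1000;
const activeFraction = 0.1; // assume ~10% of users are mid-request at any instant
const writeFraction = 0.3;  // assume ~30% of in-flight requests are writes
const concurrentWrites = Math.round(users * activeFraction * writeFraction);
console.log(concurrentWrites); // 30
```

&lt;p&gt;Even with generous assumptions, the write load lands far below the raw user count, which is why "1000 concurrent users" alone says little about database suitability.&lt;/p&gt;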

&lt;p&gt;Claude didn't check any sources. It just gave the "safe" answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Six failures, not one
&lt;/h2&gt;

&lt;p&gt;I started paying attention. I noticed six distinct patterns:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Premature closure: Rushes to execute ambiguous requests instead of asking questions&lt;/li&gt;
&lt;li&gt;Hallucination: States claims without verification&lt;/li&gt;
&lt;li&gt;Anchoring bias: Locks onto the first "obvious" answer&lt;/li&gt;
&lt;li&gt;Confirmation bias: Agrees with you instead of challenging&lt;/li&gt;
&lt;li&gt;Black-box reasoning: Gives conclusions without showing assumptions&lt;/li&gt;
&lt;li&gt;Optimism bias: Assumes the plan will work&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The fix: structured skills
&lt;/h2&gt;

&lt;p&gt;I tried prompt engineering. "Be more careful." "Check your sources."&lt;/p&gt;

&lt;p&gt;It doesn't work. The AI nods, then does the same thing.&lt;/p&gt;

&lt;p&gt;What works is structure:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;swing-research:&lt;/strong&gt; Every claim traced to a source or labeled "Unverified." Source tier grading (S/A/B/C). 2+ independent sources for key claims.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;swing-review:&lt;/strong&gt; Steel-man first, then 3-vector attack. "Looks good" is structurally banned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;swing-clarify:&lt;/strong&gt; 5W1H decomposition. Ambiguity score 0-6. Up to 3 clarifying questions before execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;swing-options:&lt;/strong&gt; 5 options across probability zones. At least 1 unconventional.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;swing-trace:&lt;/strong&gt; Every assumption rated. Every decision fork documented. Weakest link identified.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;swing-mortem:&lt;/strong&gt; "It's 6 months from now. This failed completely. What went wrong?"&lt;/p&gt;

&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;p&gt;The SQLite question through swing-research: conclusion flipped from "No" to "Yes, with caveats" backed by actual sources.&lt;/p&gt;

&lt;p&gt;JWT review through swing-review: found a Critical security vulnerability (no refresh token rotation) the baseline missed entirely.&lt;/p&gt;

&lt;p&gt;Not better answers. Structurally different reasoning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;npx skills add whynowlab/swing-skills --all&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Six skills. Each targets one cognitive failure. MIT licensed.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/whynowlab/swing-skills" rel="noopener noreferrer"&gt;https://github.com/whynowlab/swing-skills&lt;/a&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>productivity</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
