DEV Community

Cover image for Claude Skills Fail Silently. Here Is My Solution.
GEM² Inc.
GEM² Inc.

Posted on

Claude Skills Fail Silently. Here Is My Solution.

"Three structural wounds that prose skills cannot fix. 12 lifecycle skills that do. Install in 30 seconds."


I've been using SKILLs every day, with almost zero failure. That's not because I got lucky. It's because I stopped writing them as prose — and I built 12 lifecycle skills that orchestrate all the others, the way Unix commands orchestrate any pipeline.

The problem is not SKILL itself — it is the failure of linguistic precision, and the failure to select the correct skill for a specific situation.


Your skills break and you never know.

Three independent studies found the same thing: 56% non-invocation (Vercel), 50% success rate (Villega), 77% activation (Seleznov, n=650). No error. No warning. No log. Claude proceeds without the skill and the output looks plausible.

That is only one of three wounds.


The three wounds

1. Silent Scope Decay — your prose constraints dilute under context compaction. Your architecture boundaries drift between sessions. Neither failure is detectable.

2. Judgment Theater — you review AI output but have no contract to check against. The review feels real. It is structurally empty.

3. Trigger Collision — the wrong skill fires, or no skill fires at all, and the failure is unobservable. More skills, more ambiguity, more compounding.


What I did

I stopped writing skills as prose. I built 12 lifecycle skills that orchestrate all the others — the way Unix commands orchestrate any pipeline.

The skills don't encode domain knowledge. They govern the lifecycle of work itself: plan it, execute it, verify it against a typed contract, archive the proven result. The domain knowledge lives in contracts, not skills. The contracts compound. The skills stay fixed.

/plan-work → /proceed-work → /verify-work → /archive-work
     ^                                              |
     +──── /search-kg (proven patterns feed back) ──+
Enter fullscreen mode Exit fullscreen mode

Every verified contract becomes searchable prior art for the next session. You never start from zero.


Taste it

npm i @gem_squared/tpmn-skill-install
Enter fullscreen mode Exit fullscreen mode

The 12 skills. MIT-licensed. Works with Claude Code out of the box. No infrastructure. No server. Git + filesystem.

From GitHub: github.com/gem-squared/tpmn-skill


What the 12 skills do

They don't encode domain knowledge. They govern the lifecycle of work itself.

$ claude --permission-mode auto

/search-kg          Search proven patterns from prior work
     |
/plan-work          Write CONTRACTs (A → B | P) for each unit
     |
/proceed-work       Execute one unit, verify inline, retry on failure
     |              (repeat for each unit)
     |
/verify-work        Verify Result vs CONTRACT.B (per-unit or batch)
     |
/archive-work       Move to archive/, git commit, compounds.
     |
/extract-skill      Convert proven contracts into reusable skills (optional)
Enter fullscreen mode Exit fullscreen mode

Every unit of work has a CONTRACT:

  • A — input state: what exists before the work
  • B — output state: what must exist after (always a state, never an action)
  • P — precondition: what must be true to start
  • Clarity % — how well-defined the scope is (0–100)
Skill What
/init-session Bootstrap project files, detect layer
/check-session Read-only status report
/search-kg Search proven contracts in archive
/search-skill Discover installed + archived skills
/plan-work Decompose work into 1–9 contracted units
/proceed-work Execute one unit, verify inline, retry on failure
/update-work-plan Add, modify, abort, reorder PENDING units
/verify-work Verify Results against CONTRACTs (per-unit or batch)
/extract-skill Convert proven WP into reusable skill
/skill-to-kg Sweep non-core skills to archive, restore specific
/archive-work Finalize WP, git commit
/end-session Commit session state for recovery

The lifecycle stays fixed. The knowledge compounds. That is the inversion: stop growing your skill library, grow your contract archive.


How prose fails — and why contracts don't

Silent Scope Decay. Claude compacts long context by summarizing. "When preconditions hold, plan the work..." compacts to "Plans work" — constraints gone, silently. TPMN uses algebraic notation: P ≜ work ≠ ⊥ ∧ project_slug ≠ ⊥. Remove a conjunct and the formula visibly breaks. Breakage is detectable, not silent.

Judgment Theater. "Does it look right?" is not verification. /verify-work runs three deterministic checks every time: field coverage (did B produce every required field?), type conformance, constraint satisfaction. STATE = SUCCESS is structural. Not an opinion.

Trigger Collision. 900,000+ skills in the ecosystem. Even in your private project, the same dynamic applies: more skills, more ambiguity, more compounding failure. TPMN solves this with two inversions — lifecycle orchestration (12 fixed skills, deterministic selection) and knowledge compounds (every verified contract becomes searchable prior art via /search-kg).


Try it

claude --permission-mode auto

/init-session       # bootstrap — must be first
/plan-work          # decompose into contracted units
/proceed-work       # execute one unit
/verify-work        # verify against contract — SUCCESS or FAILURE
/archive-work       # proven. searchable. compounds.
Enter fullscreen mode Exit fullscreen mode

The full analysis

Each wound has a structural proof — why prose fails, why contracts don't, and how 130+ real work plans validated the model. Read the deep dive:

Three Wounds That Prose Skills Cannot Fix — The Full Analysis

The spec is public. CC-BY-4.0. The proof is verifiable. The rest is yours.


David Seo — GEM2.AI

Top comments (0)