GEM² Inc.
Claude Skills want ALL

"AI is a mathematical and logical system. TPMN is Algebraic Logical Language, ALL, to communicate with AI"


For engineering work, I thought, AI needs an engineering language: not for understanding, but for parsing. I needed a language that an AI agent could parse unambiguously, that a human could read without a manual, and that survived context compaction intact.

That language is TPMN — an Algebraic Logical Language (ALL).

TPMN stands for four sources:
TLA+ (temporal logic for concurrent systems),
Panini (the ancient Sanskrit grammarian who solved semantic disambiguation),
Mathematical notation (for formal constraints),
Natural language (for the subjective meaning that symbols cannot carry alone).
Each source contributes something specific. Together, they form ALL — the language for specifying AI skills.

If you have read the introduction, you know the F: A → B | P pattern. This post explains why algebraic notation is necessary, what each layer contributes, and how the complete language works.

Why algebra, not prose

AI is a mathematical and logical system. It processes tokens through transformer architectures — linear algebra, attention mechanisms, probability distributions. The internal representation is mathematical.

Yet we communicate with AI in prose. We write skills as paragraphs of instructions. We describe workflows in bullet points. Then we are surprised when AI interprets ambiguously.

The mismatch is fundamental: we are writing in natural language to a mathematical system. Prose is optimized for human communication — nuance, context, implication. These qualities matter when AI infers intent in natural conversation. But they become the core cause of hallucination when AI must execute engineering work precisely, correctly, and consistently.

I think hallucination is not a fault in AI's processing logic. It is extrapolation under context dilution and drift.

TPMN is an Algebraic Logical Language that I created to supersede NL-based prompting. It is not invented from scratch — each of its four layers draws from a historically proven formalism.

An algebraic expression like P: title ≠ ⊥ ∧ project_slug ≠ ⊥ has exactly one interpretation. A prose instruction like "make sure the title and project are provided" has many — what counts as "provided"? Is an empty string provided? Is a whitespace-only string? The algebra eliminates the question.
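A minimal Python sketch of what that precondition forces you to decide up front. The field names come from the example above; the rule that whitespace-only strings count as ⊥ is an assumption made explicit here, exactly the kind of decision the prose version leaves open:

```python
def is_bottom(value):
    """Model ⊥: a value is absent if it is None or a string with no visible content."""
    return value is None or (isinstance(value, str) and value.strip() == "")

def precondition(a):
    # P: title ≠ ⊥ ∧ project_slug ≠ ⊥
    return not is_bottom(a.get("title")) and not is_bottom(a.get("project_slug"))

assert precondition({"title": "Post", "project_slug": "tpmn"})
assert not precondition({"title": "   ", "project_slug": "tpmn"})  # whitespace is ⊥
assert not precondition({"project_slug": "tpmn"})                  # missing is ⊥
```

Whatever you decide about empty strings, the decision lives in one place and applies identically every time.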

This is not about making things harder for humans. The algebra is readable. 𝔹 means boolean. 𝕊 means string. ⊥ means absent. ∧ means AND. You learn the symbols once. They never change meaning.

AI cannot read your mind. AI is not a magic wand.
You can talk to AI in any persona — but the persona does not change what AI is. When you want humanistic discourse, nuance and reading between the lines matter. You bring a shared vocabulary and context to make that work.

The same applies to engineering. When you ask AI to do engineering work, you need basic mathematical terms and logical structure to describe your need clearly.

TPMN is created for exactly this: axiomatic rigor, procedural clarity, and impersonality — the Economy of Expression for communicating engineering work to AI with minimum ambiguity.


The four layers of TPMN

T — TLA+ (structural layer)

Leslie Lamport's TLA+ provides the structural backbone. Records, sequences, definitions:

```
(* Records — named field structures *)
Person ≜ [name: 𝕊, age: ℕ]

(* Sequences — ordered steps *)
Pipeline ≜ <<plan, design, implement, test, deploy, verify>>

(* Definitions — binding names to meanings *)
Init_Session: A → B | P ≜ [...]
```

The symbol ≜ (defined as) comes directly from TLA+. It is visually distinct from = (equality) and → (transformation). When you see ≜, you know: this is a definition, not a comparison.

Sets define enumerations — the closed universe of valid values:

```
Status ≜ {PENDING, IN_PROGRESS, COMPLETED, BLOCKED, ABORTED}
```

When a TPMN contract says mode: {feature, bug}, AI knows there are exactly two valid values. Not "feature, bug, or similar." Exactly two. This is how you eliminate the drift that makes prose skills unreliable.
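A sketch of what the closed universe means operationally, in Python. The mode values are from the contract above; the validator function is illustrative, not part of any spec:

```python
# The closed set from the contract mode: {feature, bug}. Exactly two values.
VALID_MODES = frozenset({"feature", "bug"})

def validate_mode(mode):
    """Reject anything outside the closed universe -- no coercion, no 'similar'."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode {mode!r} is not in {sorted(VALID_MODES)}")
    return mode

assert validate_mode("bug") == "bug"
assert "enhancement" not in VALID_MODES  # "or similar" does not exist
```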

P — Panini (conflict resolution layer)

Panini solved the problem of rule conflict and ambiguity in Sanskrit grammar 2,500 years ago. His Ashtadhyayi — ~4,000 rules generating every valid Sanskrit form — is a deterministic generative system. When two rules compete for the same derivation, meta-rules (paribhasha) resolve the conflict so that exactly one wins. No ambiguity survives.

The problem is identical to what makes prose skills fail: multiple valid interpretations with no resolution mechanism. Panini's solution was not to write more prose — it was to build a formal system where conflicts are resolved by structure, not by judgment.

TPMN applies this principle through three patterns:

  1. Typed categories — Panini classified phonemes and morphemes into formal categories (pratyahara) so that rules could target precise classes, not vague descriptions. TPMN does the same with typed fields: two fields named status are not the same field if one is 𝕊 and the other is {PENDING, IN_PROGRESS, COMPLETED}. The type resolves the conflict.

  2. Grounding 5W — Panini's rules are ordered and scoped — each rule declares exactly when it applies. TPMN mirrors this: every skill declares who, what, when, where, why. The "when" and "what" fields are the most critical — Claude Skills already recommends them. But with the remaining 3W (who, where, why), the AI can resolve which skill applies with minimal guessing.

  3. Exception over default — Panini's system uses the utsarga/apavada principle: specific exceptions override general rules. TPMN adapts this as negative contracts — ¬B declares what a skill explicitly never does. AI agents are helpful by default. They will cross boundaries if those boundaries are not declared. The specific exclusion overrides the general helpfulness.
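Pattern 1 can be sketched in a few lines of Python. The type markers here are illustrative stand-ins for 𝕊 and the enumeration, not TPMN syntax:

```python
# Two fields named "status" are different fields once their types differ.
STATUS_ENUM = {"PENDING", "IN_PROGRESS", "COMPLETED"}

def check_field(value, field_type):
    """field_type is either the marker 'S' (any string, standing in for 𝕊)
    or a closed set (enum membership, exact)."""
    if field_type == "S":
        return isinstance(value, str)
    return value in field_type

assert check_field("anything goes", "S")           # 𝕊 accepts any string
assert check_field("PENDING", STATUS_ENUM)         # enum accepts its members
assert not check_field("anything goes", STATUS_ENUM)  # same name, different field
```

The same string passes one field's type and fails the other's: the type, not the name, resolves which rule applies.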

M — Mathematical notation (decisional layer)

TLA+ provides structure — records, sequences, definitions. Mathematical notation provides the logic that fills those structures: programmatic decisions and formal constraints.

TLA+ defines the container: A ≜ [name: 𝕊, age: ℕ]. Math notation writes the rule that decides whether the container is valid: name ≠ ⊥ ∧ age > 0. Every precondition P in F: A → B | P, every invariant, every verification predicate — the logic that determines pass or fail — is math notation.

The core symbols:

| Symbol | Meaning | Role |
| --- | --- | --- |
| ∧ | AND — conjunction | Combining predicates in P |
| ∨ | OR — disjunction | Alternative conditions |
| ¬ | NOT — negation | Exclusion, negative contracts |
| ∈ | Element of — membership | Type checking, set membership |
| ∀ | For all — universal | Field coverage, invariants |
| ⟹ | Implies | Chain invariant between flow steps |
| ⟺ | If and only if | STATE verification predicate |

For example, the STATE verification predicate — how we determine SUCCESS or FAILURE — is pure M-layer:

```
STATE = SUCCESS ⟺
  (∀ field ∈ B: b[field] ≠ ⊥ ∧ type(b[field]) = B[field].type)
  ∧ P(a, b) holds
```

This is structural type-checking, not subjective judgment. AI can self-evaluate against its own CONTRACT — and that makes drift detectable rather than silent.
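A minimal Python sketch of that predicate, assuming the B schema is a mapping from field name to expected type and P is a callable. The field names are invented for illustration:

```python
def state_is_success(b, b_schema, a, p):
    """STATE = SUCCESS iff every B field is present with the right type
    and the predicate P(a, b) holds -- structural check, not judgment."""
    fields_ok = all(
        b.get(f) is not None and isinstance(b[f], t)
        for f, t in b_schema.items()
    )
    return fields_ok and p(a, b)

schema = {"session_id": str, "resumed": bool}   # illustrative B schema
a = {"project_slug": "tpmn"}
b = {"session_id": "s-001", "resumed": False}

assert state_is_success(b, schema, a, lambda a, b: a["project_slug"] != "")
assert not state_is_success({"session_id": None, "resumed": False}, schema, a,
                            lambda a, b: True)   # ⊥ field fails structurally
```

Because the check is mechanical, an agent can run it against its own output and report FAILURE instead of papering over a missing field.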

Epistemic tags extend the M layer with claim provenance — unique to TPMN:

| Symbol | Meaning |
| --- | --- |
| ⊢ | Grounded — claim from verifiable fact |
|  | Inferred — derived from grounded claims |
|  | Extrapolated — beyond evidence |

In a skill's invariants, every clause carries its tag:

```
INV ≜ [
  ⊢ Strictly read-only — never modifies any file,
  ⊢ B is state report — AI decides next action based on B,
  ⊢ MANDATE: session state detection only
]
```

Every rule is ⊢-tagged — grounded. During context compaction, ⊢-tagged claims survive as hard constraints inside code blocks. Prose instructions like "never modify files" get summarized away. ⊢ NEVER modify any file inside a TPMN block survives because code blocks are treated as atomic by summarizers.

N — Natural language (meaning layer)

Formal notation cannot carry subjective meaning. "ARCHITECT beginning a work session" is not expressible in algebra. So TPMN uses NL in controlled positions:

  • (* ... *) inline comments — explanation alongside formal structure
  • String values inside records — who: "ARCHITECT beginning a work session"
  • Flow step action fields — operational descriptions of what to do

The rule: NL complements, it does not replace. The structure constrains; the NL explains. NL is never used for definitions, types, or logic — only for the human meaning that symbols cannot carry alone.


The contract: F: A → B | P

The four layers combine into the contract — the core of every UNIT-SKILL:

```
Skill_Name: A → B | P ≜ [
  A: [input fields with types],
  B: [output STATE fields with types],
  P: precondition predicates joined by ∧
]

¬B ≜ [
  ⊢ NEVER {boundary this skill must not cross},
  ⊢ NEVER {sibling mandate it must not assume},
  ⊢ NEVER {side effect it must not produce}
]
```
  • A — input state. Typed record. What the skill receives.
  • B — output state. Must be state, never action. The core invariant: B is what the skill produces, not what happens next.
  • P — preconditions. Conjunction of predicates that must all hold before execution.
  • ¬B — negative contract. What the skill explicitly never does.
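The contract's runtime meaning can be sketched in Python: P gates execution, and B is a state record, not an action. The predicate and body below are invented examples, not from the spec:

```python
def run_skill(a, precondition, body):
    """F: A -> B | P -- refuse to execute when P fails; otherwise produce
    an output STATE record (never 'what happens next')."""
    if not precondition(a):
        return {"STATE": "FAILURE", "reason": "precondition P violated"}
    b = body(a)
    return {"STATE": "SUCCESS", **b}

p = lambda a: a.get("title") not in (None, "")            # illustrative P
body = lambda a: {"slug": a["title"].lower().replace(" ", "-")}

assert run_skill({"title": "Claude Skills"}, p, body)["slug"] == "claude-skills"
assert run_skill({}, p, body)["STATE"] == "FAILURE"
```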

The flow: ordered steps with chain invariant

```
Flow ≜ <<
  [name: "step_name",
   action: "what to do",
   pre:  precondition_predicate,
   post: postcondition_predicate],
  ...
>>
```

Key constraints:

  • Maximum 5 steps, preferably 3 or fewer — our reliability model is based on 0.8^N decay: at 80% per-step reliability, a 5-step flow completes correctly only ~33% of the time. Claude's official guidance says "one skill, one job" and "keep SKILL.md under 500 lines" — the step limit is TPMN's structural enforcement of the same principle
  • The chain invariant holds: ∀ i ∈ 1..N-1: Flow[i].post ⟹ Flow[i+1].pre
  • Flow is linear — no branching between steps (branching is the AI's job)
  • IF/THEN/ELSE within a step's action field is acceptable (local logic)
  • IF/THEN/ELSE between steps is not — split into separate skills

The <<...>> syntax is TLA+ sequence notation — it signals "ordered operations" rather than "data structure."
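The chain invariant above can be checked mechanically. A Python sketch, assuming each step carries pre/post predicates over a shared state dict (the step contents are invented for illustration):

```python
def check_chain(flow, states):
    """∀ i ∈ 1..N-1: Flow[i].post ⟹ Flow[i+1].pre.
    flow: list of dicts with 'pre'/'post' callables; states[i] is the
    state after step i."""
    assert len(flow) <= 5, "maximum 5 steps"
    for i in range(len(flow) - 1):
        # If step i's post holds on its produced state, step i+1's pre must too.
        if flow[i]["post"](states[i]) and not flow[i + 1]["pre"](states[i]):
            return False
    return True

flow = [
    {"pre": lambda s: True,        "post": lambda s: "data" in s},
    {"pre": lambda s: "data" in s, "post": lambda s: "report" in s},
]
assert check_chain(flow, [{"data": [1]}, {"data": [1], "report": "ok"}])

bad = [
    {"pre": lambda s: True,          "post": lambda s: "data" in s},
    {"pre": lambda s: "parsed" in s, "post": lambda s: True},
]
assert not check_chain(bad, [{"data": [1]}, {}])  # post₁ does not establish pre₂
```

A broken chain surfaces as a failed check at authoring time, instead of a mid-flow surprise at execution time.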


The grounding: 5W record

Every skill anchors itself in context:

```
Grounding_5W ≜ [
  who:   "actor — the role invoking this skill",
  what:  "deliverable — the state transformation",
  when:  "condition — when AI should select this skill",
  where: "scope — boundary of applicability",
  why:   "rationale — why this skill exists as a separate unit"
]
```

All five fields are required: ∀ skill: ∀ w ∈ {who, what, when, where, why}: |skill.grounding[w]| > 0

Skills without explicit grounding drift in meaning over time. The "when" field is the most important — it tells the AI exactly when to select this skill.
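The requirement reduces to a one-line check. A Python sketch (the example values are illustrative):

```python
# ∀ w ∈ {who, what, when, where, why}: |grounding[w]| > 0
REQUIRED_W = ("who", "what", "when", "where", "why")

def grounding_complete(grounding):
    return all(len(grounding.get(w, "")) > 0 for w in REQUIRED_W)

g = {"who": "ARCHITECT", "what": "session init", "when": "session start",
     "where": "project root", "why": "single entry point"}
assert grounding_complete(g)
assert not grounding_complete({**g, "why": ""})   # empty rationale fails
```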


Compaction survival: why formal notation wins

This is the practical argument that matters most.

When an AI agent's context window fills, earlier content gets compacted. Claude Code re-attaches skills post-compaction, but only the first 5,000 tokens per skill within a 25,000-token shared budget. Skills invoked earlier can be dropped entirely. And even within the budget, prose instructions lose nuance — research shows summarization achieves ~6:1 compression, meaning each surviving sentence must carry six times the semantic density of the original.

```
(* ~40 tokens — exact constraints preserved: *)
P: title ≠ ⊥ ∧ project_slug ≠ ⊥
¬B ≜ [⊢ NEVER modify any file]
Flow ≜ <<S₁, S₂, S₃>>

(* ~60 tokens — same information, but compressible: *)
"Make sure the title and project slug are both provided before
running. The skill should never modify any files. Execute
the three steps in order: first read, then query, then report."
```

Both fit within the re-attachment budget. But when the budget is tight and the summarizer compresses, the prose version loses "never modify any files" while the TPMN version keeps ⊢ NEVER modify any file intact — because there is nothing to compress. The algebra is already at minimum expression. That is why AI needs ALL.

And density compounds in complex projects

TPMN is not built for small scripts. It is built for complex, dense projects — the kind where you run 10, 20, 50 concurrent workflows across a long session. That is exactly where algebraic density pays off twice: once in compaction survival, and again in raw token savings.

I measured this directly. I took 6 real Claude Skills (3 official Anthropic, 3 community) and converted each into TPMN workplan contracts. The results: bespoke skills average 3,583 tokens each. The equivalent TPMN contracts average 672 tokens — a 5.3x compression ratio. The largest compression was 9.6x (Anthropic's skill-creator: 8,916 → 929 tokens), because most of its budget went to process scaffolding that TPMN's core skills already handle. The smallest was 1.9x (webapp-testing), because it was already lean.

TPMN's 12 core skills cost ~20,000 tokens as shared infrastructure, loaded once. Each additional workflow adds only ~672 tokens. Bespoke skills have zero infrastructure cost but pay ~3,583 tokens per workflow. At 7+ concurrent workflows, TPMN is cheaper in total. At 20, it saves 53%. At 50, 70%. The more complex the project, the larger the advantage — because the infrastructure cost is fixed and the per-workflow cost is 5.3x lower.
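The break-even claim follows from the measured averages. A worked check in Python, using only the numbers stated above:

```python
INFRA = 20_000       # TPMN core skills, loaded once as shared infrastructure
TPMN_PER = 672       # average TPMN workplan contract
BESPOKE_PER = 3_583  # average bespoke skill

def tpmn_total(n):    return INFRA + TPMN_PER * n
def bespoke_total(n): return BESPOKE_PER * n

# First workflow count at which TPMN's total cost drops below bespoke's.
break_even = next(n for n in range(1, 100) if tpmn_total(n) < bespoke_total(n))
assert break_even == 7

assert round(1 - tpmn_total(20) / bespoke_total(20), 2) == 0.53  # 53% at 20
assert round(1 - tpmn_total(50) / bespoke_total(50), 2) == 0.70  # 70% at 50
```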


Notation principles

After building skills in this notation, I have distilled four principles:

Symbols are unambiguous at any depth. Whether you read 𝔹 at the top of a contract or nested inside Seq(Record[status: 𝔹]), it means boolean. No context-dependent interpretation.

Structure is preserved under compaction. TPMN blocks survive as atomic code blocks. Prose instructions get summarized and lose constraints.

NL complements, it does not replace. The (* ... *) comment syntax carries subjective meaning. The structure constrains; the NL explains.

Types are the disambiguator. Two fields named status in different skills are not the same field if one is 𝕊 and the other is {PENDING, IN_PROGRESS, COMPLETED}. The type makes them distinguishable.

This notation is not complex. It is precise. Those are different things. Complexity hides meaning. Precision reveals it.


Try it

The TPMN Skill Standard v4 is MIT-licensed. Install the core skills into any project:

```
npx @gem_squared/tpmn-skill-install
```

Full spec: TPMN Skill Standard v4 on GitHub


David Seo — GEM².AI
