I've been building AI Skills, running them, watching them break, fixing them, and breaking them again. Through that cycle, I arrived at a set of clear design principles.
A "Skill" here means a defined procedure you give to an AI agent — a reusable instruction set that tells the AI what to do and how to verify it. Think of it as a prompt template with structure and control flow.
Here's what I learned.
Skills Are Not Batch Files
First, a premise.
A Skill is not a batch file. There is no guarantee that steps execute in the order they're written. There is no guarantee that instructions are followed. There is no state between calls.
An LLM is not a command executor. It is a probabilistic model.
That changes everything about how to approach Skill design. Writing a Skill as a "command" is a path to eventual failure. A Skill is not a directive — it is a control structure for raising expected output quality.
With that premise in place, here are the five principles.
Principle 1: List the steps before starting
Put the task list outside the model first. Having the AI verbalize "what to do" before acting is the starting point of control.
Don't let execution begin immediately.
The first thing to do is have the AI enumerate the tasks for the session: what needs to happen, at what granularity, with what checkable units. Make it output an explicit list.
This looks like basic planning. But the real point is different.
AI operates entirely within context. By creating a checklist externally, the state of the work becomes visible inside that context. This is the act of creating external state to control the AI.
Rather than handing everything to the Skill, visualize the work structure before executing. That one step makes a significant difference in stability.
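The shape of this principle can be sketched in code. This is a minimal, hypothetical sketch: `call_model` stands in for a real LLM call and is stubbed here so the example runs standalone.

```python
def call_model(prompt: str) -> str:
    # Stand-in for a real LLM call; returns a fixed plan for illustration.
    return "1. Read the input file\n2. Validate the schema\n3. Write the report"

def plan_tasks(goal: str) -> list[str]:
    # Step 1: make the model enumerate the tasks BEFORE any execution.
    prompt = (
        f"Goal: {goal}\n"
        "Before doing anything, list the tasks as a numbered checklist, "
        "one checkable unit per line. Do not start executing."
    )
    raw = call_model(prompt)
    # Step 2: parse the checklist into external state, held outside the model,
    # so later steps can be verified against it.
    return [line.split(". ", 1)[1] for line in raw.splitlines() if ". " in line]

checklist = plan_tasks("Generate a validation report")
print(checklist)  # ['Read the input file', 'Validate the schema', 'Write the report']
```

The point is that the checklist lives in your code, not only in the model's context window, so it survives to drive the verification loop later.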
Principle 2: Always review at the end
Design with the assumption that things will be missed. A Skill that skips the verification loop will silently break at some point.
Don't treat a single pass as complete.
Even when the AI judges that the task is done, don't accept that as final. Have it re-check the checklist, ask whether anything was skipped, and re-run if anything is missing.
The structure is: plan → execute → verify → re-execute if incomplete.
Use Skills with the assumption that they break. "Things get missed" is the starting fact — build that assumption into the design. Don't expect a single run to be perfect. Instead, build a structure that can recover when it fails.
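The plan → execute → verify → re-execute structure can be sketched as a loop over the external checklist. `execute_step` and `verify_step` are hypothetical stand-ins for agent calls, stubbed here so the sketch runs standalone.

```python
def run_with_verification(checklist, execute_step, verify_step, max_rounds=3):
    done = {step: False for step in checklist}
    for _ in range(max_rounds):
        # Execute every item not yet verified as complete.
        for step, finished in done.items():
            if not finished:
                execute_step(step)
        # Verify against the external checklist; never trust a single pass.
        done = {step: verify_step(step) for step in checklist}
        if all(done.values()):
            return True  # every item verified complete
    return False  # still incomplete after max_rounds; escalate to a human

# Stubs simulating a step that only succeeds on the second attempt.
attempts = {}
def execute_step(step):
    attempts[step] = attempts.get(step, 0) + 1
def verify_step(step):
    return attempts[step] >= (2 if step == "flaky step" else 1)

ok = run_with_verification(["easy step", "flaky step"], execute_step, verify_step)
print(ok)  # True: the loop caught the miss and re-ran only the flaky step
```

Note the bounded `max_rounds` and the explicit failure path: the structure recovers from a miss, but it also knows when to stop and hand the problem back.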
Principle 3: Humans write intent. Let AI write the Skill.
Humans write the goal and constraints. The AI expands the procedure. A Skill should be the result of AI understanding your intent.
This is the biggest shift in thinking.
When a human tries to write out the complete procedure as a Skill, bias creeps in. Tacit knowledge stays hidden. Abstraction level drifts. Coverage breaks down. The person writing can't see what they're leaving out.
AI, on the other hand, is good at taking stated intent and expanding it into structured, explicit steps.
So divide the roles.
The human's job: define the goal, state constraints, specify the success condition, provide judgment criteria. Write "what you want," "how far to go," and "what counts as done."
The AI's job: expand into steps, generate checklists, calibrate granularity, format the output. Write "how to structure it."
The human then reviews the generated Skill for gaps in the AI's understanding of the intent. If gaps exist, revise the intent, not the steps.
That cycle is what grows a Skill.
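The role split can be made concrete. In this sketch, the human owns only the `intent` structure; `expand_skill` stands in for an LLM call that expands intent into a procedure, stubbed here so the example runs standalone. All names and values are illustrative.

```python
intent = {
    # Human's job: what you want, how far to go, what counts as done.
    "goal": "Review pull requests for this repository",
    "constraints": ["Do not approve without running the test suite"],
    "success_condition": "Every changed file has a recorded review note",
}

def expand_skill(intent: dict) -> list[str]:
    # Stand-in for an LLM call; a real version would prompt the model to
    # expand the intent into steps, checklists, and verification hooks.
    return [
        f"Plan: enumerate tasks needed for '{intent['goal']}'",
        *[f"Constraint check: {c}" for c in intent["constraints"]],
        f"Verify: {intent['success_condition']}",
    ]

for step in expand_skill(intent):
    print(step)
```

When review finds a gap, the fix is an edit to the `intent` dict, after which the steps are regenerated; the generated steps are never hand-patched.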
Principle 4: Don't embed domain knowledge in Skills
A Skill holds structure only. The moment it holds domain knowledge, it inherits the same problems as copy-pasted code.
Skills multiply fast once you start using them. Review Skills, validation Skills, generation Skills — a new Skill emerges for each purpose. That's natural.
The problem comes next. As Skills multiply, domain knowledge gets embedded in each one: business rules, specific constraints, project-specific assumptions. Each is correct at the time of writing. But when the domain knowledge changes, you can't chase down every Skill. Update misses happen.
This is the same structure as copy-pasted code. Knowledge scatters and quietly rots.
A Skill must not be a container for knowledge. A Skill should hold only three things: the shape of the procedure, the verification method, and the control structure.
Centralize knowledge externally. Have Skills reference it. When a change is needed, there is only one place to update.
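One way to sketch this separation: the Skill template holds only structure and control flow, and domain values are injected by reference at run time. The dict here is a stand-in for whatever single source you actually use (a file, a service, a wiki page), and the rule names are hypothetical.

```python
DOMAIN_KNOWLEDGE = {
    # The single place to update when a business rule changes.
    "max_retries": 3,
    "required_reviewers": 2,
}

def build_skill_prompt(knowledge: dict) -> str:
    # The Skill holds the shape of the procedure, the verification method,
    # and the control structure. No domain facts are hard-coded.
    return (
        "1. Plan the tasks as a checklist.\n"
        f"2. Execute, retrying at most {knowledge['max_retries']} times.\n"
        f"3. Verify that {knowledge['required_reviewers']} reviewers signed off.\n"
        "4. Re-run any unchecked item."
    )

print(build_skill_prompt(DOMAIN_KNOWLEDGE))
```

If the retry policy changes, `DOMAIN_KNOWLEDGE` is edited once and every Skill that references it picks up the new value.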
Principle 5: Don't over-specify with examples
Specific examples narrow the AI's field of view. Write judgment criteria, not a permission list.
Too many specific examples in a Skill create a whitelist effect.
Write "check A, check B, check C" and the AI will check A, B, and C. D and E won't enter its view. Sensitivity to unexpected cases drops.
Examples are powerful. But that power functions as a permission list — and it damages coverage.
What to write instead: rules. "Verify completeness." "Enumerate the scope of impact." "Consider unexpected cases too." Constrain with abstract rules, not concrete enumerations. That keeps the AI's judgment range wide.
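A toy illustration of the whitelist effect, using made-up issue names. An enumerated checklist only ever catches the items it names; a rule-based criterion is applied to everything in scope.

```python
# Hypothetical review findings: two problems, D and E, that the Skill's
# author did not anticipate when writing the examples.
findings = {"A": "ok", "B": "ok", "C": "ok", "D": "missing", "E": "stale"}

# Enumerated: "check A, check B, check C" -- D and E never enter the view.
enumerated = [name for name in ("A", "B", "C") if findings[name] != "ok"]

# Rule-based: "verify completeness" -- every item in scope is examined.
rule_based = [name for name, state in findings.items() if state != "ok"]

print(enumerated)  # []
print(rule_based)  # ['D', 'E']
```

The enumeration reports a clean pass while two real problems sit outside its field of view; the abstract rule surfaces both.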
Summary
Looking across the five principles, a single line runs through them.
The more you treat the AI as a controllable execution engine, the more the design fails. Everything leads back to the opening premise, that LLMs are probabilistic models, and to the design philosophy it implies: support with structure, maintain with separation, correct with loops.
- Visualize the work by listing it before you start
- Run a verification loop at the end
- Let AI generate the Skill from human intent, not the other way around
- Separate knowledge externally; keep structure in the Skill
- Constrain with abstract rules, not enumerated examples
When moving from the phase of "building" Skills to the phase of "growing" them, these five principles form the skeleton of the design.