mufeng

Posted on Jul 4

The Real AI Productivity Hack Is Not a Better Prompt

#ai #claude #skills #productivity

I used to think the next jump in AI productivity would come from writing better prompts.

Longer prompts. More precise prompts. Prompts with role definitions, tone rules, examples, constraints, and output formats.

After reading a book on Agent Skills, I think that framing is too small.

The real bottleneck is not that I fail to explain a task once. The real bottleneck is that I keep explaining the same class of task again and again: how I want an article structured, how I review code, how I prepare App Store release notes, how I generate visuals, how I check a draft before publishing.

At some point, “using AI” quietly turns into “managing AI manually.”

The book’s most useful idea is simple:

AI productivity does not come from making every prompt longer. It comes from turning repeated work into executable, maintainable, testable skills.

That changed how I think about AI work.

A Skill Is Not a Prompt
A prompt is a temporary instruction inside one conversation.

A skill is a reusable operating manual for an agent.

That difference sounds small until you use AI every day. A prompt tells the model what you want right now. A skill tells the agent how a category of work should be done every time:

when to activate
what input to read
what steps to follow
which tools or scripts to call
what output to produce
what must never happen
where the agent should stop and ask for human judgment

That last part matters.

The goal is not to remove the human from the work. The goal is to stop spending human attention on the same low-level instructions.

For me, the most obvious candidates are not exotic:

a writing style skill
a code review skill
an iOS release checklist skill
an App Store release notes skill
a book notes skill
a weekly review skill

These are not tasks I cannot do. They are tasks where I keep repeating the same standards, preferences, caveats, and checks.

That repetition is the real cost.

The Useful Split: Judgment, Mechanics, and Workflow
One of the cleanest distinctions in the book is this:

prompts handle semantic judgment
scripts handle deterministic mechanics
skills orchestrate the whole workflow

This sounds obvious, but many AI workflows fail because they give the model the wrong job.

For example, asking a model to decide where an article needs illustrations is reasonable. Asking it to reliably rename files, validate image dimensions, split long documents, or calculate table values is usually a mistake.

Those are deterministic jobs. They should be handled by scripts or strict tools.
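
As a rough illustration of one of those mechanical jobs, here is a minimal Python sketch for splitting a long document. The function name `split_document`, the `max_chars` limit, and the choice to break on blank lines are my assumptions for the example, not something the book prescribes.

```python
def split_document(text: str, max_chars: int = 2000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking on blank lines."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in paragraphs:
        # Try to add the paragraph to the chunk being built.
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= max_chars:
            current = candidate
            continue
        # The paragraph does not fit: close the current chunk first.
        if current:
            chunks.append(current)
        # A single oversized paragraph is hard-cut at max_chars.
        while len(paragraph) > max_chars:
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

The same input always gives the same chunks, which is the property a model cannot promise for this kind of job.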

The model is better used for judgment:

choosing the angle of an essay
identifying the weak part of a draft
comparing two architecture options
explaining a tradeoff
turning rough material into clear language

The skill sits above both. It says: when this kind of task appears, use the model for the judgment parts, use scripts for the mechanical parts, and preserve the checkpoints where a human decision is required.

That is a much more durable pattern than trying to put everything into one giant prompt.

Context Is a Workbench, Not a Warehouse
Large context windows make it tempting to dump everything into the conversation.

Style guides. Prior chats. Examples. Templates. API docs. Drafts. Personal preferences. All of it.

The book argues for the opposite discipline: load the right material at the right time.

That is how skills should be designed. The main SKILL.md should not become a warehouse. It should contain the core workflow:

trigger conditions
inputs and outputs
main steps
hard constraints
failure modes
references to load only when needed

Long templates, examples, API notes, and style samples belong in separate reference files.

This is not just about token savings. It is about attention. The more unrelated material you push into context, the easier it becomes for the model to miss the one rule that actually matters.

Context should feel like a workbench: only the tools needed for the current job should be on it.
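
To make the workbench idea concrete, here is a small Python sketch that assembles context from the core SKILL.md plus only the reference files a task asks for. The `references/` folder name, the `build_context` function, and the separator format are assumptions for illustration; real agent runtimes may load references differently.

```python
from pathlib import Path


def build_context(skill_dir: Path, needed: list[str]) -> str:
    """Return SKILL.md plus only the reference files named in `needed`."""
    # The core workflow is always loaded.
    parts = [(skill_dir / "SKILL.md").read_text(encoding="utf-8")]
    for name in needed:
        # Hypothetical layout: reference files live in a references/ folder.
        ref = skill_dir / "references" / name
        if not ref.is_file():
            raise FileNotFoundError(f"reference not found: {name}")
        parts.append(f"--- {name} ---\n" + ref.read_text(encoding="utf-8"))
    return "\n\n".join(parts)
```

A task that needs only the style samples would call `build_context(skill_dir, ["style.md"])`, and the API notes would stay off the workbench.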

Good Workflows Are Not Fully Automatic
The dangerous version of AI automation is the one that looks efficient because it removes every pause.

Give the agent source material. Let it choose the angle. Let it write the draft. Let it polish the draft. Let it generate images. Let it publish.

That looks like a productivity win. Often it is just a way to outsource the most important decisions.

The better workflow is more selective.

For writing, I want AI to:

analyze source material
propose several angles
stop
let me choose the angle
draft from that angle
revise against my standards
prepare platform-specific versions

The pause is not friction. It is the point.
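
The first five steps of that writing workflow can be sketched in Python as follows. This is only a sketch under assumptions: `run_writing_workflow` and its three callables are hypothetical names, with the model calls and the human choice passed in as plain functions so the checkpoint stays visible in the code.

```python
from typing import Callable


def run_writing_workflow(
    source: str,
    propose_angles: Callable[[str], list[str]],
    choose_angle: Callable[[list[str]], str],
    write_draft: Callable[[str, str], str],
) -> str:
    """Analyze, propose, stop for a human choice, then draft."""
    angles = propose_angles(source)   # model: analysis and proposals
    chosen = choose_angle(angles)     # human: the deliberate pause
    if not chosen:
        # No decision means no draft; the workflow never guesses an angle.
        raise RuntimeError("no angle chosen; stopping before the draft")
    return write_draft(source, chosen)  # model: draft from the chosen angle
```

In interactive use, `choose_angle` could simply print the proposals and read the choice from the keyboard. The design point is that drafting cannot start until that function returns something.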

The same applies to development. AI can propose implementation plans, write tests, scan for regressions, and generate release notes. But architecture decisions, product tradeoffs, and publish decisions still need human ownership.

AI can do the prep work. It should not silently take over the judgment.

Skills Need Engineering, Not Decoration
A useful skill should be treated more like a small software product than a clever note.

That means it has a lifecycle:

define the real problem
build the smallest usable version
run it on real tasks
record failure modes
add tests or examples
refactor when the file becomes too large
keep improving it as the work changes

The most useful part of a skill is often not the elegant workflow. It is the “gotchas” section.

That is where you record the failures that keep happening:

the agent forgot to read the reference template
the output sounded too generic
the script handled the wrong file path
the model rewrote sections it should have preserved
the task needed a human checkpoint before publishing

This is where personal experience becomes operational memory.

If the same mistake happens twice, it probably belongs in the skill. If the same task happens three times, it is probably a candidate for a skill.
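
The two thresholds are easy to apply to a simple log. The Python sketch below assumes you keep two plain lists of labels, one for mistakes and one for tasks; `suggest_actions` and the label strings are hypothetical, and the numbers 2 and 3 are just the rule of thumb above.

```python
from collections import Counter


def suggest_actions(mistakes: list[str], tasks: list[str]) -> dict[str, list[str]]:
    """Apply the rule of thumb: mistake twice -> gotcha, task three times -> skill."""
    mistake_counts = Counter(mistakes)
    task_counts = Counter(tasks)
    return {
        # A mistake seen at least twice probably belongs in the skill.
        "add_to_gotchas": sorted(m for m, n in mistake_counts.items() if n >= 2),
        # A task seen at least three times is probably a skill candidate.
        "skill_candidates": sorted(t for t, n in task_counts.items() if n >= 3),
    }
```

The labels only need to be consistent enough that the same failure is counted under the same name.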

The Security Boundary Is Part of the Design
Skills become more serious when they can read files, write files, call scripts, access the network, or publish content.

At that point, they are not just prompts. They are operational tools.

So the safety rules need to be designed in from the beginning:

limit where the skill can read and write
avoid destructive actions without confirmation
back up before overwriting important files
test publishing workflows with fake data first
remove local paths, secrets, and personal assumptions before sharing a skill publicly
inspect third-party skills before running their scripts

This is not paranoia. It is basic engineering hygiene.
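
Three of those rules (limit where the skill can write, confirm before overwriting, back up first) fit in a few lines of Python. This is a minimal sketch, not a complete sandbox: `resolve_inside`, `safe_overwrite`, the `confirmed` flag, and the `.bak` suffix are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def resolve_inside(root: Path, relative: str) -> Path:
    """Resolve a path and refuse anything that escapes the allowed root."""
    root = root.resolve()
    target = (root / relative).resolve()
    # Path.is_relative_to requires Python 3.9 or newer.
    if not target.is_relative_to(root):
        raise PermissionError(f"{relative} is outside the allowed directory")
    return target


def safe_overwrite(root: Path, relative: str, content: str, confirmed: bool) -> Path:
    """Write a file inside root; overwriting needs confirmation and makes a backup."""
    target = resolve_inside(root, relative)
    if target.exists():
        if not confirmed:
            raise PermissionError("overwrite needs explicit confirmation")
        # Keep the previous version next to the file (hypothetical .bak suffix).
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    target.write_text(content, encoding="utf-8")
    return target
```

A script that only writes through `safe_overwrite` cannot touch files outside its folder by accident, and an overwrite always leaves the previous version behind.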

The more capable the agent becomes, the more explicit the boundaries must be.

What I Am Going to Try First
The book made the idea feel concrete enough that I can turn it into a weekly habit.

This week, I will start with three small steps.

First: a writing style skill.

Not a giant manifesto. Just a role, three style principles, a short banned-phrase list, and a few examples of what “good” looks like.

Second: an iOS or app release checklist skill.

The first version only needs to cover version number, release notes, screenshots, privacy text, and a final manual confirmation before submission.

Third: a gotchas section for existing skills.

Take the last three AI outputs that were disappointing. Convert each failure into a specific rule. Do not patch for one example. Capture the pattern.

There is also one experiment worth running immediately:

Take a piece of material you want to turn into an article. Do not ask AI to write the article. Ask it to do only two things: analyze the material and propose three angles. Then stop and choose the angle yourself.

If the final article improves, the human checkpoint paid for itself.

The Shift
The book did not make me want to use AI more.

It made me want to manage AI less manually.

That is the real shift: from temporary instruction to reusable workflow; from prompt accumulation to experience engineering; from asking AI to remember my preferences to writing those preferences into a system that can be maintained.

Better prompts still matter.

But the real compounding return comes when the prompt stops being a one-off instruction and becomes part of a skill.

Disclosure: this essay was adapted from my Chinese reading notes and drafted with AI assistance.

Top comments (1)

Skillselion • Jul 4

The "if the same task happens three times, it's a candidate for a skill" line is the right trigger, and calling the gotchas section the most useful part is underrated - that's where a skill stops being a clever prompt and becomes a tool.

One thing I'd add: there's a second reusability curve past "reusable for me." The gotchas that make a skill work for you - your file paths, your house style, your stack assumptions - are exactly what breaks it for anyone else. The skills that actually travel tend to be narrow and single-responsibility: one job, a tight trigger, few hard constraints. The big orchestrator that owns a whole workflow is usually the most valuable skill for its author and the least portable.

Your judgment/mechanics/workflow split names the axis most write-ups miss. The failure mode I'd watch for is the mirror of the one you named: not just handing the model deterministic work, but burying a human-judgment checkpoint inside a script because automating it was easier than pausing. The pause being the point is exactly right.