
The problem isn't that AI coding agents write bad code.
The problem is that they skip steps.
Ask an agent to fix a bug—it reads a few files, guesses a cause, patches the code. Ask it to add a feature—it starts writing before anyone's agreed on what the feature actually does. Ask it to refactor—it touches unrelated files, reformats half the codebase, and hands you a diff too large to review.
None of this is stupidity. It's the absence of process discipline.
Software development has always required workflow constraints: clarify before implementing, plan before coding, test before shipping, debug root causes not symptoms, verify before declaring done. The question is whether your AI agent follows them—or bypasses them entirely.
Superpowers is a plugin framework for Claude Code and Codex that encodes those constraints as loadable, composable agent workflows. This is what it is, when to use it, and how to get started.
What "Skills" Actually Are
The word "skill" is overloaded in AI contexts. Here it means something specific: a workflow protocol that loads into an agent session and constrains how the agent approaches a category of task.
Not "be more careful." Not a style guide. A specific sequence of steps with defined inputs, outputs, and verification gates.
The analogy is a checklist for a surgeon or a pilot—not because either lacks expertise, but because cognitive discipline under pressure requires procedural anchors.
The core Superpowers Skills cover the major failure modes in AI-assisted development:
| Skill | Failure Mode It Prevents | What It Produces |
|---|---|---|
| brainstorming | Implementing the wrong thing | Clarified scope with edge cases surfaced |
| writing-plans | Drifting mid-implementation | Executable task list: file scope + verification per step |
| test-driven-development | "Works on my machine" guesswork | RED-GREEN-REFACTOR cycles that lock behavior first |
| systematic-debugging | Shotgun-patching symptoms | Root cause hypotheses, evidence-based elimination, minimal fix |
| verification-before-completion | "Should be done" claims | Actual test runs, browser paths, or device checks |
| requesting-code-review | Merging unreviewed code | Severity-ranked risk list before merge |
| using-git-worktrees | Task bleed across workstreams | Isolated workspaces with clean baseline |
These aren't independent tips—they chain into a complete development pipeline:
Vague requirement
→ brainstorming (scope + edge cases)
→ writing-plans (executable task list)
→ test-driven-development (behavior locked by tests)
→ requesting-code-review (risks surfaced)
→ verification-before-completion (actually verified)
The Key Insight: Process Errors vs. Code Errors
AI agents will get better at writing correct code over time. They won't automatically get better at following process—unless process is encoded somewhere.
The failures Superpowers Skills prevents aren't syntax errors or logic bugs. They're process failures:
- Building the wrong feature because nobody asked the right clarifying questions
- Writing code that "looks complete" but has zero coverage on the edge cases that matter
- Patching a symptom while the root cause persists
- Refactoring that expands scope until the diff is unmergeable
- Shipping because the agent said "done" without running anything
A more capable model doesn't fix these. A faster agent arguably makes them worse—more code written in the wrong direction before anyone catches it.
A Real Example: Adding Invoice Export
Imagine you tell an agent: "Add a billing export feature."
Without workflow constraints, it will probably find the billing service, write an endpoint, add a download button, and report completion. Whether that implementation handles empty data, unauthorized requests, large datasets, or export format edge cases depends entirely on whether the model guessed right.
With Superpowers Skills, the flow looks like this:
Step 1: brainstorming
Before touching any files, the agent surfaces questions:
- Export format: PDF, CSV, or Excel?
- Date range limits?
- Permission checks required?
- Sync download or async background job?
- What does the user see on failure?
This isn't bureaucracy. This is the list of decisions that will otherwise get made silently—by the model, in the wrong direction.
Step 2: writing-plans
A compliant plan doesn't say "implement invoice export." It says:
1. Add exportInvoiceCsv(userId, range) to billing service.
   Verify: unit tests covering empty data, normal data, unauthorized access.
2. Wire export endpoint in API routes.
   Verify: 403 on missing permissions, valid text/csv response on success.
3. Add download button to billing page.
   Verify: file downloads on click, loading and error states render correctly.
Every task has a file scope and a verification gate. That's what makes it executable instead of aspirational.
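One way to make "file scope plus verification gate" checkable is to treat the plan as data. The sketch below is illustrative only: the `PlanTask` shape, the `findIncompleteTasks` helper, and the file paths are assumptions made for this example, not part of the writing-plans skill.

```typescript
// Hypothetical shape for one plan step. Field names are illustrative and
// are not taken from the writing-plans skill itself.
interface PlanTask {
  description: string;
  files: string[];        // file scope the task is allowed to touch
  verification: string[]; // checks that must pass before the task is done
}

// Returns the descriptions of tasks that lack a file scope or a verification gate.
function findIncompleteTasks(plan: PlanTask[]): string[] {
  return plan
    .filter((task) => task.files.length === 0 || task.verification.length === 0)
    .map((task) => task.description);
}

// The three-step plan from above. All file paths are assumed for illustration.
const invoiceExportPlan: PlanTask[] = [
  {
    description: "Add exportInvoiceCsv(userId, range) to billing service",
    files: ["src/billing/service.ts"],
    verification: ["unit tests: empty data, normal data, unauthorized access"],
  },
  {
    description: "Wire export endpoint in API routes",
    files: ["src/api/routes.ts"],
    verification: ["403 on missing permissions", "valid text/csv response on success"],
  },
  {
    description: "Add download button to billing page",
    files: ["src/pages/billing.tsx"],
    verification: ["file downloads on click", "loading and error states render"],
  },
];
```

Under this shape, a plan that only says "implement invoice export" with no files and no checks would be flagged by `findIncompleteTasks`, which is the difference between an executable plan and an aspirational one.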
Step 3: test-driven-development
Tests first. Not as documentation—as behavior contracts:
    describe("exportInvoiceCsv", () => {
      it("exports invoices as csv rows", () => {
        const csv = exportInvoiceCsv([
          { id: "inv_001", amount: 1999, currency: "USD" },
          { id: "inv_002", amount: 2999, currency: "USD" },
        ]);
        expect(csv).toContain("id,amount,currency");
        expect(csv).toContain("inv_001,1999,USD");
        expect(csv).toContain("inv_002,2999,USD");
      });
    });
Write the failing test. Confirm it fails. Implement the minimum to pass. Confirm it passes. Then refactor. The order matters.
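For the GREEN step, a minimal implementation that would satisfy the test above might look like the sketch below. It follows the test's call shape (an array of invoices) rather than the `(userId, range)` signature in the plan; a wrapper that loads invoices for a user and date range is assumed to live elsewhere. The `Invoice` interface is inferred from the fields the test uses and is an assumption.

```typescript
// Hypothetical invoice shape, inferred from the fields used in the test.
interface Invoice {
  id: string;
  amount: number; // assumed to be in minor units (for example, cents)
  currency: string;
}

// Minimal implementation: a header row followed by one row per invoice.
// It does no quoting or escaping, so it is only safe for values that
// contain no commas, quotes, or newlines.
function exportInvoiceCsv(invoices: Invoice[]): string {
  const header = "id,amount,currency";
  const rows = invoices.map((inv) => `${inv.id},${inv.amount},${inv.currency}`);
  return [header].concat(rows).join("\n");
}
```

This is deliberately the least code that passes. Escaping, empty-data messaging, and authorization would each get their own failing test first, in line with the plan's verification gates.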
Step 4: requesting-code-review
Before merge, the review targets:
- Does this match the agreed plan?
- Any authorization gaps?
- Large dataset edge cases?
- Unhandled error states?
- Files changed outside the agreed scope?
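The last review question can be checked mechanically. The helper below is a hypothetical sketch, not something the requesting-code-review skill is known to ship: it compares the files a change touched against the file scope agreed in the plan.

```typescript
// Returns the changed files that fall outside the plan's agreed file scope.
// Both lists are assumed to hold repository-relative paths in the same format.
function filesOutsideScope(changedFiles: string[], agreedScope: string[]): string[] {
  return changedFiles.filter((file) => agreedScope.indexOf(file) === -1);
}
```

The changed-file list could come from `git diff --name-only` against the base branch. Any non-empty result is a prompt for the reviewer to ask why those files changed.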
Step 5: verification-before-completion
Depending on project type:
| Project Type | Verification Method |
|---|---|
| Web app | Start dev server, walk the critical path in browser |
| Backend service | Run tests, type check, hit the endpoint |
| CLI tool | Run the command, check actual output |
| iOS app | Test on real device (especially IAP, StoreKit, permissions) |
| SDK / Library | Unit tests + integration tests + example project |
The principle: evidence over claims. "I think it's done" is not verification.
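"Evidence over claims" can be read as a data requirement: a completion report must list what was run and what happened. The sketch below shows one possible shape. `CheckResult` and `isVerified` are hypothetical names for this example, not part of the verification-before-completion skill.

```typescript
// Hypothetical record of one verification step: what was run and what happened.
interface CheckResult {
  command: string;  // the command or manual path that was exercised
  exitCode: number; // 0 is treated as success
  summary: string;  // short description of the observed output
}

// A task counts as verified only if at least one check ran and every check passed.
// An empty list is a claim without evidence, so it is rejected.
function isVerified(checks: CheckResult[]): boolean {
  return checks.length > 0 && checks.every((check) => check.exitCode === 0);
}
```

The empty-list case is the important one: "I think it's done" corresponds to zero recorded checks, and this rule treats that as not verified.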
How to Install
Claude Code
/plugin install superpowers@claude-plugins-official
Or via the Superpowers marketplace:
/plugin marketplace add obra/superpowers-marketplace
/plugin install superpowers@superpowers-marketplace
Codex CLI
/plugins
Search superpowers, select Install Plugin.
Codex App
Sidebar → Plugins → Coding category → Superpowers → +
When to Use vs. Skip
Not every task needs a full workflow. A typo fix doesn't need a plan. A one-liner doesn't need TDD.
The right mental model is risk-proportional discipline:
| Task | Recommended Approach |
|---|---|
| Typo fix, config lookup | Direct action—just verify the output |
| Single-file small change | Optional workflow; at minimum verify |
| Bug with unclear root cause | systematic-debugging required |
| New feature | brainstorming + writing-plans + TDD |
| Cross-module refactor | Plan + verification strongly recommended |
| Pre-merge / pre-deploy | requesting-code-review + verification-before-completion |
Skills should add friction proportional to the blast radius of getting it wrong.
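The table above can be expressed as a lookup from task kind to the skills it calls for. This is only a sketch: the `TaskKind` labels are invented for the example, and mapping "Plan + verification" and "at minimum verify" to specific skill names is an interpretation of the table rather than something it states.

```typescript
// Hypothetical task categories mirroring the rows of the table.
type TaskKind =
  | "typo-fix"
  | "small-change"
  | "unclear-bug"
  | "new-feature"
  | "cross-module-refactor"
  | "pre-merge";

// Skills to load per task kind. An empty list means direct action,
// followed by a manual check of the output.
const requiredSkills: Record<TaskKind, string[]> = {
  "typo-fix": [],
  "small-change": ["verification-before-completion"],
  "unclear-bug": ["systematic-debugging"],
  "new-feature": ["brainstorming", "writing-plans", "test-driven-development"],
  "cross-module-refactor": ["writing-plans", "verification-before-completion"],
  "pre-merge": ["requesting-code-review", "verification-before-completion"],
};

function skillsFor(kind: TaskKind): string[] {
  return requiredSkills[kind];
}
```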
Three Skills to Start With
If you're integrating Superpowers into an existing project, don't try to use everything at once. Start with three:
1. systematic-debugging
Tell the agent:
"Use systematic-debugging. Do not modify any code yet. List your root cause hypotheses first, then we'll validate them one by one."
This stops the shotgun-patch reflex before it starts.
2. writing-plans
Before any non-trivial feature or change:
"Use writing-plans. Produce an executable plan first. I'll confirm before you implement anything."
This surfaces scope creep before it happens, not after you're reviewing a 500-line diff.
3. verification-before-completion
Add this to your project's CLAUDE.md or AGENTS.md:
"Before declaring any task complete, use verification-before-completion. Run tests, verify in browser or device, report exactly what you checked and what the result was."
This closes the gap between "I think it works" and "I confirmed it works."
The Broader Pattern: Startup Superpowers
Startup Superpowers—a companion project that applies the same framework to startup validation—illustrates why this pattern generalizes beyond coding.
It applies the same idea (codify a professional workflow into loadable agent protocols) to hypothesis tracking, competitor research, customer interviews, and MVP scoping. Available slash commands:
| Command | Purpose |
|---|---|
| /whats-next | Assess current stage, recommend next action |
| /competitors | Map direct and indirect competitors |
| /market-research | Research customers, pricing, and trends |
| /hypotheses | Write testable hypotheses with evidence tracking |
| /interviews | Design scripts and analyze transcripts |
| /surveys | Design surveys and manage responses |
| /mvp | Design the minimum testable product |
Everything is stored as Markdown in a startup/ directory—version-controllable, agent-readable, no SaaS dependency.
That's the actual pattern: take a repeatable professional workflow, encode it as agent steps with defined inputs and outputs, make it loadable in any session, and store all state in files the agent can read and write. The AI doesn't get smarter. The process gets stable.
Summary
Superpowers Skills solves a specific problem: AI coding agents that know how to write code but don't know how to do software development.
The six questions it forces an agent to answer before declaring a task complete:
- Did you clarify the requirements before implementing?
- Did you make a verifiable plan before writing code?
- Did you write tests before the implementation?
- Did you find the root cause before patching?
- Did you get a review before merging?
- Did you actually verify—not just assume—that it works?
Without workflow constraints, developers have to ask these questions themselves, every session, every task. With Superpowers, the constraints are stable, loadable, and consistent across sessions, developers, and projects.
If you're using AI coding agents in real projects today, start with three skills: systematic-debugging, writing-plans, and verification-before-completion. They won't make development magical. They'll make your agent behave like a collaborator with engineering discipline instead of one without it.
Superpowers: github.com/obra/superpowers
Startup Superpowers: github.com/SergeiGorbatiuk/startup-superpowers