Your AI coding agent is winging it. Here's how to stop that.

Sharp Dev Eye — Fri, 10 Apr 2026 15:22:39 +0000

I spent months watching AI coding agents make the same mistakes across every project I threw at them:

Unstructured wall-of-text prompts
Context windows stuffed until they overflow
15+ tools exposed with vague one-line descriptions
Zero error handling — happy path only
Multi-agent orchestration for tasks a single agent handles fine
"It seems to work" as the entire evaluation strategy

I call this workflow slop. And every AI coding tool ships with it by default.

So I built Maestro — 21 skills and 20 commands that inject workflow discipline into any AI coding agent. One install. Works with Cursor, Claude Code, Gemini CLI, Copilot, Codex, and 5 more.

What Does "Workflow Slop" Actually Look Like?

Run /diagnose on any project. You'll get a scored audit across 5 dimensions:

╔══════════════════════════════════════╗
║          MAESTRO DIAGNOSTIC         ║
╠══════════════════════════════════════╣
║ Prompt Quality       ████░  4/5     ║
║ Context Efficiency   ███░░  3/5     ║
║ Tool Health          ██░░░  2/5     ║
║ Architecture         ████░  4/5     ║
║ Safety & Reliability ██░░░  2/5     ║
╠══════════════════════════════════════╣
║ Overall Score:       15/25          ║
╚══════════════════════════════════════╝

Every finding maps to a specific remediation command:

| Score | Meaning | Auto-prescribed action |
|---|---|---|
| 5 | Excellent | No action needed |
| 4 | Minor gaps | `/refine` for polish |
| 3 | Functional but risky | `/fortify` or `/streamline` |
| 2 | Significant issues | `/fortify` + `/guard` immediately |
| 1 | Broken | `/onboard-agent` (rebuild) |

No generic advice. No "consider adding tests." The agent tells you exactly which command to run next.
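The score-to-action mapping is simple enough to write down as a lookup. The sketch below is an illustration only, not Maestro's actual implementation: `prescribe`, `overall`, and the `Dimension` type are hypothetical names. It reproduces the sample audit shown earlier.

```typescript
// Illustrative only: hypothetical names, not taken from the Maestro source.
type Score = 1 | 2 | 3 | 4 | 5;

interface Dimension {
  label: string;
  score: Score;
}

// Score-to-action table from the article.
const ACTIONS: Record<Score, string[]> = {
  5: [],                          // Excellent: no action needed
  4: ["/refine"],                 // Minor gaps
  3: ["/fortify", "/streamline"], // Functional but risky (one or the other)
  2: ["/fortify", "/guard"],      // Significant issues (both, immediately)
  1: ["/onboard-agent"],          // Broken: rebuild
};

function prescribe(score: Score): string[] {
  return ACTIONS[score];
}

// Sums the dimension scores against the maximum of 5 points each.
function overall(dimensions: Dimension[]): string {
  const total = dimensions.reduce((sum, d) => sum + d.score, 0);
  return `${total}/${dimensions.length * 5}`;
}

// The sample audit from the diagnostic box.
const audit: Dimension[] = [
  { label: "Prompt Quality", score: 4 },
  { label: "Context Efficiency", score: 3 },
  { label: "Tool Health", score: 2 },
  { label: "Architecture", score: 4 },
  { label: "Safety & Reliability", score: 2 },
];

const summary = overall(audit); // "15/25"
const toolHealthFix = prescribe(2); // ["/fortify", "/guard"]
```

The two dimensions scored 2/5 are the ones that pull in `/fortify` and `/guard`, which matches the "immediately" row of the table.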

The 20 Commands

Every command is a structured skill file with explicit instructions, checklists, anti-patterns, and a recommended next step so the agent never leaves you hanging.

Analysis — find the problems:

/diagnose — Full workflow health audit with scored dimensions
/evaluate — Test workflow quality against realistic scenarios

Fix & Improve — targeted repairs:

/fortify — Add error handling, retries, fallbacks
/streamline — Remove over-engineering and complexity
/calibrate — Align naming, formatting, conventions
/refine — Final quality pass before shipping
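As a rough picture of the kind of hardening /fortify is meant to push an agent toward, here is a minimal retry-with-fallback wrapper. It is a sketch under stated assumptions: `withRetry` is a hypothetical helper, not something Maestro ships, and it is synchronous with no backoff delay to keep the example short.

```typescript
// Hypothetical helper for illustration; not part of Maestro.
// Synchronous and without backoff delays to keep the sketch short.
function withRetry<T>(operation: () => T, maxAttempts: number, fallback: () => T): T {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation();
    } catch {
      // Ignore the failure and try again until attempts run out.
      // Real code would log the error and wait between attempts.
    }
  }
  // Every attempt failed: degrade gracefully instead of crashing.
  return fallback();
}

// Usage: an operation that fails twice, then succeeds.
let calls = 0;
const flaky = (): string => {
  calls += 1;
  if (calls < 3) throw new Error("transient failure");
  return "ok";
};

const recovered = withRetry(flaky, 3, () => "fallback"); // "ok" on the third attempt

// Usage: an operation that never succeeds falls through to the fallback.
const degraded = withRetry(
  (): string => { throw new Error("always down"); },
  2,
  () => "fallback",
); // "fallback"
```

The point is the shape, not the helper: the happy path is no longer the only path, and the failure mode is a decision rather than an accident.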

Enhancement — add new capabilities:

/amplify /compose /enrich /accelerate /chain /guard /iterate /temper /turbocharge

Utility — setup and adaptation:

/teach-maestro /onboard-agent /specialize /adapt-workflow /extract-pattern

Install in 30 Seconds

Option A: Skill Files (any provider)

npx skills add sharpdeveye/maestro

Works with Cursor, Claude Code, Gemini CLI, Codex CLI, VS Code Copilot / Antigravity, Kiro, Trae, OpenCode, and Pi.

Option B: MCP Server (any MCP client)

{
  "mcpServers": {
    "maestro": {
      "command": "npx",
      "args": ["-y", "maestro-workflow-mcp"]
    }
  }
}

Drop that in your MCP config. Done. 20 prompts, 4 tools, 8 knowledge resources — instantly available.
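If your MCP config already lists other servers, the snippet has to be merged in rather than pasted over the file. Here is a minimal sketch of that merge, assuming the config has the `mcpServers` shape shown above. The `addMaestro` function and the `some-other-server` entry are hypothetical and only there for illustration.

```typescript
// Sketch: merge the Maestro entry into an existing MCP config object.
// Assumes the `mcpServers` shape shown above; names here are illustrative.
interface McpServer {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServer>;
}

// Returns a new config; the input object is left untouched.
function addMaestro(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...config.mcpServers, // keep whatever servers are already configured
      maestro: { command: "npx", args: ["-y", "maestro-workflow-mcp"] },
    },
  };
}

// Hypothetical existing config with one unrelated server.
const existing: McpConfig = {
  mcpServers: {
    "some-other-server": { command: "node", args: ["server.js"] },
  },
};

const merged = addMaestro(existing);
const serverNames = Object.keys(merged.mcpServers); // ["some-other-server", "maestro"]
```

Where the config file lives depends on the MCP client, so check your client's documentation before writing the result back.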

Why This Isn't Just Another Prompt Collection

Most "AI skill" repos are prompt dumps. Maestro is an ecosystem:

| Feature | Prompt dumps | Maestro |
|---|---|---|
| Structure | Random .md files | YAML frontmatter + versioned skills |
| Flow | Dead ends | Every command recommends the next step |
| Anti-patterns | None | Explicit "NEVER do X" in every skill |
| Context | Hope the AI figures it out | `.maestro.md` project context protocol |
| Delivery | Copy-paste files | `npx skills add` + MCP server + 10 providers |
| Evaluation | None | `/diagnose` scores 5 dimensions on a 1-5 scale |

The ecosystem forms a loop:

/teach-maestro → /diagnose → /fortify → /evaluate → /refine
       ↑                                                  |
       └──────────────── continuous improvement ──────────┘
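One way to picture the "every command recommends the next step" idea is as a small cyclic lookup. This is only an illustration of the loop as drawn above, with the arrow wrapping from /refine back to the start. It is not how Maestro stores its recommendations, and `nextCommand` is a hypothetical name.

```typescript
// Illustration of the loop above; not Maestro's actual data structure.
const LOOP = ["/teach-maestro", "/diagnose", "/fortify", "/evaluate", "/refine"] as const;
type LoopCommand = (typeof LOOP)[number];

function nextCommand(current: LoopCommand): LoopCommand {
  const index = LOOP.indexOf(current);
  // Wrap around so the last step feeds back into the first.
  return LOOP[(index + 1) % LOOP.length];
}

const afterDiagnose = nextCommand("/diagnose"); // "/fortify"
const afterRefine = nextCommand("/refine"); // "/teach-maestro"
```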

Real Example: What `/diagnose` Found in My Project

I ran /diagnose on my production app. It found:

Wallet service handling real money with zero test coverage. Idempotency keys were implemented, but no tests verified they actually prevent double-credits. Score: Safety 2/5.
Two services using DB transactions without try/catch. If a deadlock occurs, the exception bubbles up unhandled and the user gets a raw 500 error.
Frontend deploying to Cloudflare Pages without running tsc --noEmit first. Type errors were reaching production undetected.

Each finding came with a specific command: /fortify WalletService, /guard financial-flows, /fortify frontend-build.

That's the difference between "you should probably add tests" and "Run /guard on your wallet service because your financial operations have zero test coverage and idempotency keys are unverified."
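For a sense of what "verifying idempotency keys" means in practice, here is a minimal sketch of such a check. The `Wallet` class is a made-up in-memory stand-in, not the production service described above; a real test would go through the actual service and its database.

```typescript
// In-memory stand-in for illustration; not the production wallet service.
class Wallet {
  private balance = 0;
  private seenKeys = new Set<string>();

  // Credits the wallet once per idempotency key; repeats are ignored.
  credit(amount: number, idempotencyKey: string): boolean {
    if (this.seenKeys.has(idempotencyKey)) {
      return false; // duplicate request: do not credit again
    }
    this.seenKeys.add(idempotencyKey);
    this.balance += amount;
    return true;
  }

  getBalance(): number {
    return this.balance;
  }
}

// The missing check: replaying a request must not double-credit.
const wallet = new Wallet();
const first = wallet.credit(100, "payment-123");
const replay = wallet.credit(100, "payment-123"); // same key, e.g. a client retry
const other = wallet.credit(50, "payment-456");

if (wallet.getBalance() !== 150) {
  throw new Error(`double-credit detected: balance is ${wallet.getBalance()}`);
}
```

Without a test like this, the idempotency logic can regress silently, and the first sign of trouble is a customer with money they should not have.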

The Architecture

source/skills/           ← 21 skill definitions (source of truth)
├── agent-workflow/      ← Core skill + 7 reference docs
│   └── reference/       ← Prompt engineering, context mgmt, etc.
├── diagnose/            ← Analysis commands
├── fortify/             ← Fix commands
├── amplify/             ← Enhancement commands
└── teach-maestro/       ← Utility commands

scripts/
├── build.js             ← Copies to 10 provider directories
├── bundle-skills.js     ← Bundles into MCP server
└── validate.js          ← Validates frontmatter + references

mcp-server/              ← npm package: maestro-workflow-mcp
├── tools.ts             ← 4 MCP tools with template resolution
├── prompts.ts           ← 20 MCP prompts
└── resources.ts         ← 8 read-only knowledge resources

One source. 10 providers. One MCP server. Everything validated, bundled, and versioned.
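To make "validated" concrete, here is a rough sketch of a frontmatter check in the spirit of validate.js. It does not reproduce the real script: the required fields (`name`, `description`) and the simplistic `key: value` parsing are assumptions for illustration, and a real validator would use a proper YAML parser.

```typescript
// Illustrative only: field names and parsing rules are assumptions,
// not the actual checks performed by scripts/validate.js.
const REQUIRED_FIELDS = ["name", "description"];

// Extracts flat `key: value` pairs from a leading `---` block.
function parseFrontmatter(markdown: string): Map<string, string> | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (match === null) return null;
  const fields = new Map<string, string>();
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue; // skip lines that are not key: value
    fields.set(line.slice(0, separator).trim(), line.slice(separator + 1).trim());
  }
  return fields;
}

// Returns a list of problems; an empty list means the file passes.
function validateSkill(markdown: string): string[] {
  const fields = parseFrontmatter(markdown);
  if (fields === null) return ["missing frontmatter block"];
  return REQUIRED_FIELDS
    .filter((field) => !fields.get(field))
    .map((field) => `missing required field: ${field}`);
}

// A skill file that declares a name but no description.
const sample = "---\nname: diagnose\n---\n# Diagnose\n";
const problems = validateSkill(sample); // ["missing required field: description"]
```

Returning a list of problems instead of throwing on the first one lets a build script report every broken skill file in a single run.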

What's Next

More references — domain-specific guides for testing, deployment, observability
Scoring trends — track /diagnose scores over time
Community skills — contribute your own commands via PR

Try It

# Install skills
npx skills add sharpdeveye/maestro

# Or add the MCP server
# → add to your mcp config: npx -y maestro-workflow-mcp

# Then run your first diagnostic
/diagnose

Will it find workflow slop? It will.

GitHub: github.com/sharpdeveye/maestro
npm: maestro-workflow-mcp
License: MIT

If this saved you from one more "it seems to work" deployment, consider dropping a ⭐ on the repo. It helps more than you think.

DEV Community: Sharp Dev Eye