I spent months watching AI coding agents make the same mistakes across every project I threw at them:
- Unstructured wall-of-text prompts
- Context windows stuffed until they overflow
- 15+ tools exposed with vague one-line descriptions
- Zero error handling — happy path only
- Multi-agent orchestration for tasks a single agent handles fine
- "It seems to work" as the entire evaluation strategy
I call this workflow slop. And every AI coding tool ships with it by default.
So I built Maestro — 21 skills and 20 commands that inject workflow discipline into any AI coding agent. One install. Works with Cursor, Claude Code, Gemini CLI, Copilot, Codex, and 5 more.
## What Does "Workflow Slop" Actually Look Like?
Run `/diagnose` on any project. You'll get a scored audit across 5 dimensions:
```
╔══════════════════════════════════════╗
║          MAESTRO DIAGNOSTIC          ║
╠══════════════════════════════════════╣
║ Prompt Quality        ████░  4/5     ║
║ Context Efficiency    ███░░  3/5     ║
║ Tool Health           ██░░░  2/5     ║
║ Architecture          ████░  4/5     ║
║ Safety & Reliability  ██░░░  2/5     ║
╠══════════════════════════════════════╣
║ Overall Score: 15/25                 ║
╚══════════════════════════════════════╝
```
Every finding maps to a specific remediation command:
| Score | Meaning | Auto-prescribed action |
|---|---|---|
| 5 | Excellent | No action needed |
| 4 | Minor gaps | `/refine` for polish |
| 3 | Functional but risky | `/fortify` or `/streamline` |
| 2 | Significant issues | `/fortify` + `/guard` immediately |
| 1 | Broken | `/onboard-agent` — rebuild |
No generic advice. No "consider adding tests." The agent tells you exactly which command to run next.
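That score-to-command mapping is simple enough to express as a lookup. Here's a minimal sketch in TypeScript with hypothetical names — this is my illustration, not Maestro's actual API:

```typescript
// Hypothetical sketch of the score-to-command mapping described above.
// Function and type names are illustrative, not part of Maestro.
type Remedy = string | null;

function prescribe(score: number): Remedy {
  switch (score) {
    case 5: return null;                      // Excellent: no action needed
    case 4: return "/refine";                 // Minor gaps: polish
    case 3: return "/fortify or /streamline"; // Functional but risky
    case 2: return "/fortify + /guard";       // Significant issues: fix now
    case 1: return "/onboard-agent";          // Broken: rebuild the workflow
    default: throw new RangeError(`score must be 1-5, got ${score}`);
  }
}
```

The point of the pattern is that every score resolves to a concrete next action, never to generic advice.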
## The 20 Commands
Every command is a structured skill file with explicit instructions, checklists, anti-patterns, and a recommended next step so the agent never leaves you hanging.
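For illustration, a skill file's frontmatter might look like the following. The field names and values here are assumptions made for the example, not Maestro's actual schema:

```yaml
---
name: fortify
version: 1.2.0
description: Add error handling, retries, and fallbacks to fragile code paths.
anti_patterns:
  - Swallowing exceptions without logging them
next: evaluate
---
```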
**Analysis — find the problems:**

- `/diagnose` — Full workflow health audit with scored dimensions
- `/evaluate` — Test workflow quality against realistic scenarios
**Fix & Improve — targeted repairs:**

- `/fortify` — Add error handling, retries, fallbacks
- `/streamline` — Remove over-engineering and complexity
- `/calibrate` — Align naming, formatting, conventions
- `/refine` — Final quality pass before shipping
**Enhancement — add new capabilities:**

- `/amplify`, `/compose`, `/enrich`, `/accelerate`, `/chain`, `/guard`, `/iterate`, `/temper`, `/turbocharge`
**Utility — setup and adaptation:**

- `/teach-maestro`, `/onboard-agent`, `/specialize`, `/adapt-workflow`, `/extract-pattern`
## Install in 30 Seconds
### Option A: Skill Files (any provider)

```bash
npx skills add sharpdeveye/maestro
```
Works with Cursor, Claude Code, Gemini CLI, Codex CLI, VS Code Copilot / Antigravity, Kiro, Trae, OpenCode, and Pi.
### Option B: MCP Server (any MCP client)

```json
{
  "mcpServers": {
    "maestro": {
      "command": "npx",
      "args": ["-y", "maestro-workflow-mcp"]
    }
  }
}
```
Drop that in your MCP config. Done. 20 prompts, 4 tools, 8 knowledge resources — instantly available.
## Why This Isn't Just Another Prompt Collection
Most "AI skill" repos are prompt dumps. Maestro is an ecosystem:
| Feature | Prompt dumps | Maestro |
|---|---|---|
| Structure | Random .md files | YAML frontmatter + versioned skills |
| Flow | Dead ends | Every command recommends the next step |
| Anti-patterns | None | Explicit "NEVER do X" in every skill |
| Context | Hope the AI figures it out | `.maestro.md` project context protocol |
| Delivery | Copy-paste files | `npx` install + MCP server + 10 providers |
| Evaluation | None | `/diagnose` scores 5 dimensions 1-5 |
The ecosystem forms a loop:
```
/teach-maestro → /diagnose → /fortify → /evaluate → /refine
       ↑                                               │
       └─────────────── continuous improvement ────────┘
```
## Real Example: What `/diagnose` Found in My Project
I ran `/diagnose` on my production app. It found:
1. **Wallet service handling real money with zero test coverage.** Idempotency keys were implemented, but no tests verified they actually prevent double-credits. Score: Safety 2/5.
2. **Two services using DB transactions without try/catch.** If a deadlock occurs, the exception bubbles up unhandled and the user gets a raw 500 error.
3. **Frontend deploying to Cloudflare Pages without `tsc --noEmit`.** Type errors were reaching production undetected.
Each finding came with a specific command: `/fortify WalletService`, `/guard financial-flows`, `/fortify frontend-build`.
That's the difference between "you should probably add tests" and "Run /guard on your wallet service because your financial operations have zero test coverage and idempotency keys are unverified."
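To illustrate the kind of fix `/fortify` points at in finding #2, a deadlock-aware transaction wrapper might look like this. It's a sketch under assumptions, not Maestro's actual output — the names are hypothetical, and the error-code check assumes Postgres, which reports deadlocks as SQLSTATE `40P01`:

```typescript
// Sketch: retry a DB transaction a few times on deadlock instead of letting
// the exception bubble up to the user as a raw 500. Names are hypothetical.
async function withDeadlockRetry<T>(
  run: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await run(); // e.g. db.transaction(async (tx) => { ... })
    } catch (err: any) {
      const isDeadlock = err?.code === "40P01"; // Postgres deadlock_detected
      if (!isDeadlock || attempt >= maxAttempts) throw err;
      // Brief linear backoff before retrying the whole transaction.
      await new Promise((resolve) => setTimeout(resolve, 50 * attempt));
    }
  }
}
```

A caller wraps the entire transaction body, so a retried attempt re-runs every statement rather than resuming mid-transaction.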
## The Architecture
```
source/skills/            ← 21 skill definitions (source of truth)
├── agent-workflow/       ← Core skill + 7 reference docs
│   └── reference/        ← Prompt engineering, context mgmt, etc.
├── diagnose/             ← Analysis commands
├── fortify/              ← Fix commands
├── amplify/              ← Enhancement commands
└── teach-maestro/        ← Utility commands

scripts/
├── build.js              ← Copies to 10 provider directories
├── bundle-skills.js      ← Bundles into MCP server
└── validate.js           ← Validates frontmatter + references

mcp-server/               ← npm package: maestro-workflow-mcp
├── tools.ts              ← 4 MCP tools with template resolution
├── prompts.ts            ← 20 MCP prompts
└── resources.ts          ← 8 read-only knowledge resources
```
One source. 10 providers. One MCP server. Everything validated, bundled, and versioned.
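The kind of check a validate step performs can be sketched like this. The required field names below are my assumption for illustration, not the actual schema enforced by `validate.js`:

```typescript
// Sketch: verify a skill file carries the frontmatter fields the ecosystem
// relies on. The required-field list here is an assumption, not the real one.
const REQUIRED_FIELDS = ["name", "description", "version"];

function validateFrontmatter(markdown: string): string[] {
  const errors: string[] = [];
  const match = markdown.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return ["missing YAML frontmatter block"];
  // Collect top-level keys from the frontmatter body.
  const keys = match[1]
    .split("\n")
    .map((line) => line.split(":")[0].trim())
    .filter(Boolean);
  for (const field of REQUIRED_FIELDS) {
    if (!keys.includes(field)) errors.push(`missing field: ${field}`);
  }
  return errors;
}
```

Running a step like this in CI is what keeps "one source, 10 providers" honest: a malformed skill fails the build instead of silently shipping to one provider directory.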
## What's Next
- **More references** — domain-specific guides for testing, deployment, observability
- **Scoring trends** — track `/diagnose` scores over time
- **Community skills** — contribute your own commands via PR
## Try It
```bash
# Install skills
npx skills add sharpdeveye/maestro

# Or add the MCP server
# → add to your mcp config: npx -y maestro-workflow-mcp

# Then run your first diagnostic
/diagnose
```
It will find workflow slop. It always does.
GitHub: github.com/sharpdeveye/maestro
npm: maestro-workflow-mcp
License: MIT
If this saved you from one more "it seems to work" deployment, consider dropping a ⭐ on the repo. It helps more than you think.