Sharp Dev Eye
Your AI coding agent is winging it. Here's how to stop that.

I spent months watching AI coding agents make the same mistakes across every project I threw at them:

  • Unstructured wall-of-text prompts
  • Context windows stuffed until they overflow
  • 15+ tools exposed with vague one-line descriptions
  • Zero error handling — happy path only
  • Multi-agent orchestration for tasks a single agent handles fine
  • "It seems to work" as the entire evaluation strategy

I call this workflow slop. And every AI coding tool ships with it by default.

So I built Maestro — 21 skills and 20 commands that inject workflow discipline into any AI coding agent. One install. Works with Cursor, Claude Code, Gemini CLI, Copilot, Codex, and 5 more.


What Does "Workflow Slop" Actually Look Like?

Run /diagnose on any project. You'll get a scored audit across 5 dimensions:

```
╔══════════════════════════════════════╗
║          MAESTRO DIAGNOSTIC         ║
╠══════════════════════════════════════╣
║ Prompt Quality       ████░  4/5     ║
║ Context Efficiency   ███░░  3/5     ║
║ Tool Health          ██░░░  2/5     ║
║ Architecture         ████░  4/5     ║
║ Safety & Reliability ██░░░  2/5     ║
╠══════════════════════════════════════╣
║ Overall Score:       15/25          ║
╚══════════════════════════════════════╝
```

Every finding maps to a specific remediation command:

| Score | Meaning | Auto-prescribed action |
|-------|---------|------------------------|
| 5 | Excellent | No action needed |
| 4 | Minor gaps | /refine for polish |
| 3 | Functional but risky | /fortify or /streamline |
| 2 | Significant issues | /fortify + /guard immediately |
| 1 | Broken | /onboard-agent to rebuild |

No generic advice. No "consider adding tests." The agent tells you exactly which command to run next.


The 20 Commands

Every command is a structured skill file with explicit instructions, checklists, anti-patterns, and a recommended next step so the agent never leaves you hanging.
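For a sense of the format, a skill file looks roughly like this (the contents below are illustrative, not copied from the repo):

```markdown
---
name: fortify
description: Add error handling, retries, and fallbacks to an existing workflow
version: 1.0.0
---

## Instructions
1. Locate unguarded I/O, network calls, and transaction boundaries.
2. Wrap each with error handling plus a retry or fallback strategy.

## Anti-patterns
- NEVER swallow errors silently.
- NEVER retry non-idempotent operations blindly.

## Next step
Run /evaluate to verify the fixes hold up under realistic scenarios.
```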

Analysis — find the problems:

  • /diagnose — Full workflow health audit with scored dimensions
  • /evaluate — Test workflow quality against realistic scenarios

Fix & Improve — targeted repairs:

  • /fortify — Add error handling, retries, fallbacks
  • /streamline — Remove over-engineering and complexity
  • /calibrate — Align naming, formatting, conventions
  • /refine — Final quality pass before shipping

Enhancement — add new capabilities:

  • /amplify, /compose, /enrich, /accelerate, /chain, /guard, /iterate, /temper, /turbocharge

Utility — setup and adaptation:

  • /teach-maestro, /onboard-agent, /specialize, /adapt-workflow, /extract-pattern

Install in 30 Seconds

Option A: Skill Files (any provider)

```bash
npx skills add sharpdeveye/maestro
```

Works with Cursor, Claude Code, Gemini CLI, Codex CLI, VS Code Copilot / Antigravity, Kiro, Trae, OpenCode, and Pi.

Option B: MCP Server (any MCP client)

```json
{
  "mcpServers": {
    "maestro": {
      "command": "npx",
      "args": ["-y", "maestro-workflow-mcp"]
    }
  }
}
```

Drop that in your MCP config. Done. 20 prompts, 4 tools, 8 knowledge resources — instantly available.


Why This Isn't Just Another Prompt Collection

Most "AI skill" repos are prompt dumps. Maestro is an ecosystem:

| Feature | Prompt dumps | Maestro |
|---------|--------------|---------|
| Structure | Random .md files | YAML frontmatter + versioned skills |
| Flow | Dead ends | Every command recommends the next step |
| Anti-patterns | None | Explicit "NEVER do X" in every skill |
| Context | Hope the AI figures it out | .maestro.md project context protocol |
| Delivery | Copy-paste files | npx install + MCP server + 10 providers |
| Evaluation | None | /diagnose scores 5 dimensions 1-5 |

The ecosystem forms a loop:

```
/teach-maestro → /diagnose → /fortify → /evaluate → /refine
       ↑                                                  |
       └──────────────── continuous improvement ──────────┘
```

Real Example: What /diagnose Found in My Project

I ran /diagnose on my production app. It found:

  1. Wallet service handling real money with zero test coverage. Idempotency keys were implemented, but no tests verified they actually prevent double-credits. Score: Safety 2/5.

  2. Two services using DB transactions without try/catch. If a deadlock occurs, the exception bubbles unhandled and the user gets a raw 500 error.

  3. Frontend deploying to Cloudflare Pages without tsc --noEmit. Type errors were reaching production undetected.

Each finding came with a specific command: /fortify WalletService, /guard financial-flows, /fortify frontend-build.

That's the difference between "you should probably add tests" and "Run /guard on your wallet service because your financial operations have zero test coverage and idempotency keys are unverified."
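The second finding is exactly the kind of gap /fortify patches. Here's a minimal sketch of a deadlock-aware transaction wrapper; the names (withDeadlockRetry, the "deadlock" error check) are mine for illustration, not a Maestro API:

```typescript
// Sketch of the guard /fortify would add around a DB transaction:
// catch deadlock errors and retry with backoff instead of letting
// a raw 500 escape to the user.
type TxFn<T> = () => Promise<T>;

async function withDeadlockRetry<T>(tx: TxFn<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await tx();
    } catch (err) {
      lastError = err;
      // Placeholder detection: a real driver exposes a deadlock error code.
      const isDeadlock = err instanceof Error && err.message.includes("deadlock");
      if (!isDeadlock || attempt === maxAttempts) break;
      // Simple linear backoff before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, 50 * attempt));
    }
  }
  throw lastError;
}
```

Non-deadlock errors still propagate immediately, so you're not masking real bugs behind retries.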


The Architecture

```
source/skills/           ← 21 skill definitions (source of truth)
├── agent-workflow/      ← Core skill + 7 reference docs
│   └── reference/       ← Prompt engineering, context mgmt, etc.
├── diagnose/            ← Analysis commands
├── fortify/             ← Fix commands
├── amplify/             ← Enhancement commands
└── teach-maestro/       ← Utility commands

scripts/
├── build.js             ← Copies to 10 provider directories
├── bundle-skills.js     ← Bundles into MCP server
└── validate.js          ← Validates frontmatter + references

mcp-server/              ← npm package: maestro-workflow-mcp
├── tools.ts             ← 4 MCP tools with template resolution
├── prompts.ts           ← 20 MCP prompts
└── resources.ts         ← 8 read-only knowledge resources
```

One source. 10 providers. One MCP server. Everything validated, bundled, and versioned.
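The validation step is the piece most prompt collections skip. A minimal sketch of what frontmatter validation can look like; the required field names here are my assumption for illustration, and the real schema lives in scripts/validate.js:

```typescript
// Minimal frontmatter validator sketch: parse the YAML-style block at
// the top of a skill file and report missing required fields.
const REQUIRED = ["name", "description", "version"];

function parseFrontmatter(md: string): Record<string, string> | null {
  const match = md.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return null;
  const fields: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return fields;
}

function validateSkill(md: string): string[] {
  const fm = parseFrontmatter(md);
  if (!fm) return ["missing frontmatter block"];
  return REQUIRED.filter((key) => !(key in fm)).map((key) => `missing field: ${key}`);
}
```

Run against every file in source/skills/ at build time, a check like this fails fast instead of shipping a half-described skill to 10 providers.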


What's Next

  • More references — domain-specific guides for testing, deployment, observability
  • Scoring trends — track /diagnose scores over time
  • Community skills — contribute your own commands via PR

Try It

```bash
# Install skills
npx skills add sharpdeveye/maestro

# Or add the MCP server
# → add to your mcp config: npx -y maestro-workflow-mcp

# Then run your first diagnostic
/diagnose
```

When it finds workflow slop (and it will), you'll know exactly which command to run next.

GitHub: github.com/sharpdeveye/maestro
npm: maestro-workflow-mcp
License: MIT


If this saved you from one more "it seems to work" deployment, consider dropping a ⭐ on the repo. It helps more than you think.
