oh-my-agent — A Production-Grade Multi AI IDE Agent Harness

#ai #agents #programming #productivity

When you tell an agent to "build a TODO app," it does build something. The problem is that it often builds the wrong thing, drifts out of scope, and repeats the same mistakes.

To address this, structural approaches like AGENTS.md and, more recently, Skills have emerged. But looking at the skills actually being shared, a few recurring problems stand out:

The most critical piece — library version information — is missing.
Role descriptions stop at hollow declarations like "You are a senior engineer."
Content that could be covered by a few keywords gets padded into lengthy prose, wasting tokens.

As a result, these skills are poorly followed by models, burn context for nothing, and over time become dead code that nobody wants to open.

[Approach]

With oh-my-agent, we wanted to solve this through process, not prompts. Instead of simply telling the agent to "redo it" when something goes wrong, we record why it went wrong and feed that back into the next run.

The core mechanism is Clarification Debt (CD) Scoring. When the agent misinterprets a requirement or drifts out of scope, points accumulate:

clarify: +10 — simple confirmation question
correct: +25 — direction change due to misunderstood intent
redo: +40 — rollback and restart due to scope deviation
Starting work without checking the Charter: +15
Modifying files outside the allowed scope: +20
Repeating the same error: ×1.5 multiplier

Above 50 points, a Root Cause Analysis (RCA) is mandatory. Above 80, the session is halted. The extracted lessons accumulate in lessons-learned.md and are applied from the very next session, so the process compensates even when the prompts are simple.
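To make the scoring rules above concrete, here is a minimal Python sketch of a debt tracker. It is an illustration, not the project's actual implementation: the class name, the event keys `skipped_charter` and `out_of_scope_edit`, and the `error_id` parameter are hypothetical, and it assumes the ×1.5 multiplier applies only to the points of an event that repeats an already-seen error.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Point values from the list above. The last two key names are hypothetical.
EVENT_POINTS = {
    "clarify": 10,            # simple confirmation question
    "correct": 25,            # direction change due to misunderstood intent
    "redo": 40,               # rollback and restart due to scope deviation
    "skipped_charter": 15,    # started work without checking the Charter
    "out_of_scope_edit": 20,  # modified files outside the allowed scope
}
REPEAT_MULTIPLIER = 1.5
RCA_THRESHOLD = 50   # above this, an RCA is mandatory
HALT_THRESHOLD = 80  # above this, the session is halted


@dataclass
class DebtTracker:
    score: float = 0.0
    seen_errors: Set[str] = field(default_factory=set)

    def record(self, event: str, error_id: Optional[str] = None) -> float:
        """Add points for one event and return the running score."""
        points = float(EVENT_POINTS[event])
        if error_id is not None:
            # Assumption: the multiplier scales only the repeated event's points.
            if error_id in self.seen_errors:
                points *= REPEAT_MULTIPLIER
            self.seen_errors.add(error_id)
        self.score += points
        return self.score

    def status(self) -> str:
        """Map the running score to the action the thresholds require."""
        if self.score > HALT_THRESHOLD:
            return "halt"
        if self.score > RCA_THRESHOLD:
            return "rca_required"
        return "ok"
```

In this sketch, one clarify followed by the same misunderstanding twice lands at 10 + 25 + 37.5 = 72.5 points, which is past the RCA threshold but still short of a halt.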

Beyond that, several common protocols keep the agent from going rogue:

Clarification Protocol — Requirement ambiguity is classified as LOW / MEDIUM / HIGH. LOW means proceed, MEDIUM means present options, HIGH means stop and clarify first.

Difficulty Guide — Tasks are categorized as Simple / Medium / Complex, adjusting the required protocol depth accordingly.

Context Budget — Token budgets are set per model to reduce unnecessary context consumption.
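The Clarification Protocol described above is essentially a lookup from ambiguity level to next action. A rough Python sketch, with hypothetical enum and function names that are not taken from the project, might look like this:

```python
from enum import Enum


class Ambiguity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Action(Enum):
    PROCEED = "proceed"
    PRESENT_OPTIONS = "present_options"
    STOP_AND_CLARIFY = "stop_and_clarify"


# Mapping as stated in the protocol: LOW proceeds, MEDIUM presents
# options, HIGH stops and clarifies before any work starts.
NEXT_ACTION = {
    Ambiguity.LOW: Action.PROCEED,
    Ambiguity.MEDIUM: Action.PRESENT_OPTIONS,
    Ambiguity.HIGH: Action.STOP_AND_CLARIFY,
}


def next_action(level: Ambiguity) -> Action:
    """Return the action the protocol prescribes for an ambiguity level."""
    return NEXT_ACTION[level]
```

How a requirement gets classified in the first place is left out here, since the post does not describe that step.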

This approach aligns with the Harness Engineering concept discussed by OpenAI. Getting the most out of agents isn't a one-liner prompt problem — it's about what control structure you wrap around them.

[Project Structure]

oh-my-agent manages all of this within the project directory.

.agents/ = SSOT — Skills, workflows, and configurations live under .agents/ as the single source of truth. No dependency on any specific IDE.

Role-based agent team — Core roles include PM, QA, Frontend, Backend, Mobile, and Debug, with DB Agent and TF Infra Agent newly added.

DB Agent: SQL / NoSQL / Vector DB modeling, including ISO 27001 security recommendations
TF Infra Agent: Multi-cloud Terraform, OPA / Sentinel policies, ISO 42000 series control guidance

Workflow-centric orchestration — Planning, review, debug, and parallel execution form the default flow. The newly added /brainstorm workflow explores the design before any code is written: codebase analysis → clarification questions → approach proposal → user approval → saved design document. From there, the flow continues with /plan and then implementation.

[Two Orchestration Modes]

/coordinate is built for speed: iterate fast and fix problems as they surface. The PM breaks down tasks and dispatches agents, then QA runs a single review pass. If CRITICAL/HIGH issues appear, the affected task is re-run. It's a lightweight 7-step loop.

/ultrawork emphasizes quality gates. It's divided into five phases — PLAN → IMPL → VERIFY → REFINE → SHIP — each with a gate that blocks progression until passed. Of the 17 steps, 11 are reviews. The REFINE phase handles file splitting, deduplication, side-effect analysis, and dead code removal.
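The gating idea can be sketched in a few lines of Python. This is only an illustration of "each phase blocks until its gate passes", not the actual /ultrawork workflow: the function name and the callable-based gate interface are assumptions, and the individual 17 steps are not modeled.

```python
from typing import Callable, Dict, List

# Phase order as described above.
PHASES = ["PLAN", "IMPL", "VERIFY", "REFINE", "SHIP"]


def run_phases(gates: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run phases in order and return those completed before the first failed gate."""
    completed: List[str] = []
    for phase in PHASES:
        # Assumption: a phase with no registered gate fails closed.
        gate = gates.get(phase, lambda: False)
        if not gate():
            break  # the gate blocks progression to later phases
        completed.append(phase)
    return completed
```

For example, if the VERIFY gate returns False, `run_phases` returns `["PLAN", "IMPL"]` and neither REFINE nor SHIP is reached.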

It might seem like overkill. But as the level of abstraction in programming climbs from machine language to high-level languages and now to natural language, verification only becomes more critical.

[Expansion Background]

A month ago, this project launched as oh-my-ag, an orchestrator exclusive to Antigravity. Since then, multiple AI IDEs have started adopting .agents/skills/ as the project skill path, so there is no longer a reason to keep it locked to a single IDE. It was therefore expanded into a universal harness format and renamed oh-my-agent.

[Getting Started]

curl -fsSL https://raw.githubusercontent.com/first-fluke/oh-my-agent/refs/heads/main/cli/install.sh | bash

It supports the major AI IDEs, including Antigravity, Claude Code, Codex CLI, and Cursor.

If you're already using an AI IDE, give it a try. At the end of the day, the developer's goal is to hit QCD (Quality, Cost, Delivery) all at once. Agent-driven development is no exception — and that's the mindset behind this project.

🔗 GitHub: first-fluke/oh-my-agent