Most "AI for ops" tools fail in one of two ways. Fully autonomous agents go off the rails and ship the wrong thing. Draft-only assistants do 10% of the job and leave the breakdown, sequencing, and execution on you.
I wanted a third option: agents own decomposition and execution, humans own approval. The substrate is Notion, because that's where the work already lives.
This post is the pattern, not the pitch. The repo is at the bottom.
## The problem with one-model-per-workflow
Direct LLM chat handles a single prompt well. It does not handle "launch the pricing page" well, because that isn't a prompt - it's a project. The model either improvises sub-steps in one long stream of consciousness, or you do the project management by hand and copy-paste between threads.
The fix is to separate three things that get conflated:
- Decomposition - turning one task into a graph of subtasks with dependencies
- Dispatch - choosing the right model for each subtask and running them in the right order
- Approval - the human gate between planning and execution
Once those are separate, you can keep humans in the decision seat without making them babysit the work.
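That separation can be made concrete as data. A minimal sketch, assuming nothing about the repo's actual schema (every name here is illustrative): decomposition produces subtasks, each subtask carries its own model tier and dependency list, and the approval flag is the only thing standing between a plan and dispatch.

```python
from dataclasses import dataclass, field
from enum import Enum

class Tier(Enum):
    HAIKU = "haiku"    # clerical work
    SONNET = "sonnet"  # writing
    OPUS = "opus"      # reasoning / architecture

@dataclass
class Subtask:
    id: str
    description: str
    tier: Tier                                   # dispatch: per-subtask model choice
    depends_on: list[str] = field(default_factory=list)  # decomposition: explicit graph edges

@dataclass
class Plan:
    task_id: str
    subtasks: list[Subtask]
    approved: bool = False  # approval: nothing dispatches until a human flips this
```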
## The loop
Four phases, all anchored in a single Notion task database:
- Submit. Operator creates a task in Notion in plain language.
- Plan. A planner agent reads the task and emits a subtask graph. Each subtask gets a description, a list of dependencies, and an explicit model tier.
- Approve. The plan lands back in Notion as child pages. Operator reviews, edits, or rejects. Nothing dispatches without sign-off.
- Execute. On approval, an orchestrator dispatches subtasks in parallel where the graph allows, sequentially where it doesn't. Each subtask runs on its assigned model. Outputs land back in the same Notion task tree.
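The execute phase above can be sketched as grouping the subtask graph into waves: everything whose dependencies are already done runs in parallel, and waves run one after another. This is a sketch, not the repo's code; the `dependsOn` field name follows the post, the rest is illustrative.

```python
def execution_waves(subtasks):
    """Group subtasks into waves: each wave can run in parallel,
    waves run sequentially in dependency order."""
    pending = {s["id"]: set(s["dependsOn"]) for s in subtasks}
    done, waves = set(), []
    while pending:
        # a subtask is ready once all of its dependencies are done
        ready = [sid for sid, deps in pending.items() if deps <= done]
        if not ready:
            raise ValueError("dependency cycle in plan")
        waves.append(ready)
        done.update(ready)
        for sid in ready:
            del pending[sid]
    return waves
```

For the 21-subtask example below, this yields two waves: the single Opus root, then twenty parallel leaves.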
## A concrete example
A "launch the website" task decomposes into 21 subtasks across three model tiers:
| Tier | Count | Work | Parallelism |
|---|---|---|---|
| Opus | 1 | Information architecture + outline | sequential (root) |
| Sonnet | 4 | Page copy drafts | parallel |
| Haiku | 6 | Asset/image prompts | parallel |
| Haiku | 10 | Directory submission entries | parallel |
One operator approval. Twenty-one outputs back in Notion, ready for review.
## Why per-task model assignment matters
Defaulting to the smartest model for every subtask is the easiest way to burn money on agentic workflows. Directory submissions don't need Opus reasoning. Architecture decisions shouldn't run on Haiku.
In practice, routing clerical work to Haiku and reserving Opus for the reasoning-heavy nodes cuts model spend by roughly an order of magnitude, with no quality loss on the parts that matter.
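A back-of-the-envelope check against the 21-subtask example, using made-up per-subtask costs (these are placeholder numbers purely to show the shape of the arithmetic, not real API pricing):

```python
# Illustrative per-subtask costs -- placeholder numbers, not real pricing
COST = {"opus": 0.75, "sonnet": 0.15, "haiku": 0.01}

# The 21-subtask "launch the website" plan: 1 Opus, 4 Sonnet, 16 Haiku
plan = {"opus": 1, "sonnet": 4, "haiku": 16}

routed = sum(COST[tier] * n for tier, n in plan.items())
all_opus = COST["opus"] * sum(plan.values())
print(f"routed: ${routed:.2f}  all-opus: ${all_opus:.2f}")
```

With these placeholders the routed plan is about a tenth of the all-flagship cost; the exact ratio depends on real pricing and token counts, but the shape holds whenever most nodes are clerical.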
## Implementation notes
A few things that turned out to be load-bearing once I started running this daily:
- The approval gate is doing real work. Without it, the planner occasionally invents subtasks or misjudges scope. With a 30-second human review, those get caught before they consume tokens.
- Force the planner to declare dependencies explicitly. "Run in parallel where possible" only works if the planner outputs `dependsOn: []` for every node. Implicit ordering doesn't survive contact with fan-out.
- Give the planner a short rubric for tier selection. Without it, the planner over-picks the flagship "to be safe." A one-paragraph rubric in the system prompt (Haiku for clerical, Sonnet for writing, Opus for reasoning/architecture) is enough.
- Notion as the substrate is the unlock. It means non-technical operators can drive the workflow, edit plans, and consume outputs without a custom UI.
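One cheap way to enforce the explicit-dependency rule is to validate the planner's output before it ever reaches the approval gate. A sketch, assuming a JSON plan shape with the `dependsOn` and tier fields described above (everything else, including the validation rules, is illustrative):

```python
import json

# Hypothetical planner output -- the field names dependsOn/tier follow the
# post, the overall shape is an assumption
planner_output = json.loads("""
{
  "subtasks": [
    {"id": "ia", "description": "Information architecture + outline",
     "tier": "opus", "dependsOn": []},
    {"id": "copy-home", "description": "Homepage copy draft",
     "tier": "sonnet", "dependsOn": ["ia"]}
  ]
}
""")

def validate(plan):
    """Reject plans with missing dependsOn, unknown deps, or bad tiers."""
    ids = {s["id"] for s in plan["subtasks"]}
    for s in plan["subtasks"]:
        assert "dependsOn" in s, f"{s['id']}: missing explicit dependsOn"
        assert set(s["dependsOn"]) <= ids, f"{s['id']}: unknown dependency"
        assert s["tier"] in {"haiku", "sonnet", "opus"}, f"{s['id']}: bad tier"
```

Rejecting an invalid plan here costs nothing; rejecting it after dispatch costs tokens.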
## Caveats
- Not a fit if your team doesn't already live in Notion.
- The pattern is strongest for parallel fan-out of independent subtasks. Workflows that need iterative refinement between two agents mid-run are weaker.
- Once you fan out 10+ Haiku tasks at once, rate limits start to matter - back-pressure in the orchestrator is non-optional.
## Repo
I've open-sourced the implementation:
github.com/ratamaha-git/agency-os
Happy to answer questions on the planner prompt shape, the dependency-graph schema, or the model-tier rubric.


