How We Chose Which Claude Model for Each Agent (Opus vs Sonnet vs Haiku)

#ai #claudeai #multiagent #devops

We run 13 AI agents. Giving every agent Claude Opus 4 would cost $40/day and slow everything down. Here's the exact decision framework we use.

The Stack We Run

Agent	Model	Reason
Atlas (CEO)	Claude Opus 4	Strategic decisions, cross-agent coordination
Prometheus	Claude Opus 4	Titan-tier peer, creative direction, foresight
Hermes	Claude Sonnet 4.5	Trading logic, fast execution
Athena	Claude Sonnet 4.5	Revenue ops, Stripe workflows
Apollo	Claude Sonnet 4.5	Research, competitive intel
Orpheus	Claude Haiku 4.5	Copy writing, captions
Hephaestus	Claude Haiku 4.5	File rendering, video builds

The Decision Framework

Use Opus when:

The agent makes decisions that affect other agents
Output quality directly impacts revenue
The task requires multi-step reasoning with no retry budget

Use Sonnet when:

Domain-specific execution (trading, billing, research)
Output goes through a review layer before shipping
30-60s latency is acceptable

Use Haiku when:

Volume tasks: captions, file transforms, templated outputs
Speed matters more than nuance
The output is reviewed by a Sonnet/Opus agent before it ships

The Math

A single Opus session on a complex strategic task: ~$0.80.
The same task on Sonnet: ~$0.12.
On Haiku: ~$0.02.

We run 5+ agent sessions per night. Wrong model selection would cost $200+/month in unnecessary API spend.

The Rule We Actually Use

If the agent's output gets reviewed by another agent before shipping, drop one model tier.

This means Haiku can write sleep story drafts because Prometheus reviews them before upload. Haiku can format captions because Hephaestus's output passes through Atlas's quality filter.

The review layer is the unlock. Without it, every agent needs to be Opus-tier because every mistake ships.