DEV Community

정상록
정상록

Posted on

Claude Code Advisor: Opus Steering Sonnet Inside a Single API Call

Claude Code Advisor: Opus Steering Sonnet Inside a Single API Call

Anthropic shipped the Advisor Strategy on April 9, 2026. It's not a new model. It's a pattern: a fast executor (Haiku/Sonnet) does the work, and at decision points it calls a stronger advisor (Opus) for strategic guidance — all inside a single /v1/messages request, handled server-side.

The numbers are the headline.

The Benchmark Story

SWE-bench Multilingual (real coding tasks):

  • Sonnet 4.6 solo: 72.1%
  • Sonnet 4.6 + Opus advisor: 74.8% (+2.7pp)
  • Cost: 11.9% cheaper (yes, higher quality and cheaper)

BrowseComp (web research):

  • Haiku 4.5 solo: 19.7%
  • Haiku 4.5 + Opus advisor: 41.2% (2x+)
  • Cost: 85% cheaper than Sonnet solo

That last one is the jaw-dropper. You get Haiku latency, Opus judgment, and a price point below Sonnet.

How It Works

The Advisor is server-side. From the client's perspective, it's just another tool in the tools array:

from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",  # executor
    max_tokens=4096,
    tools=[
        {
            "type": "advisor_20260301",
            "model": "claude-opus-4-7",  # must be >= executor
            "max_uses": 5,
            "caching": {"type": "ephemeral", "ttl": "5m"}
        }
    ],
    extra_headers={
        "anthropic-beta": "advisor-tool-2026-03-01"
    },
    messages=[{"role": "user", "content": "Refactor the payment module."}]
)

# Per-call token breakdown in usage.iterations[]
for it in response.usage.iterations:
    print(it.model, it.input_tokens, it.output_tokens)
Enter fullscreen mode Exit fullscreen mode

What happens internally:

  1. Executor emits a server_tool_use block (name="advisor", input={})
  2. Anthropic's server forwards current context to the advisor
  3. Advisor (Opus 4.7) runs extended thinking, returns guidance
  4. An advisor_tool_result block lands in the stream
  5. Executor continues

The advisor does not call tools. The advisor does not respond to the user. It only whispers strategy to the executor.

When to Call the Advisor

Anthropic's official prompting guide says four moments:

  1. Before substantive work — before writing, before committing to an interpretation
  2. When the task feels complete — but persist the deliverable first
  3. When stuck — same error recurring, approach not converging
  4. When considering a change of approach

What not to call it for: pure mechanical work, single-turn Q&A, workloads that need top-tier intelligence on every turn.

The Trimming Trick

Add this line to your system prompt:

"The advisor should respond in under 100 words and use enumerated steps, not explanations."

Per Anthropic's own measurements: 35-45% token reduction on advisor calls with no quality degradation. Pair it with max_uses to cap per-request spend.

Claude Code Users: One Command

If you're using Claude Code CLI, skip the SDK entirely:

/advisor
Enter fullscreen mode Exit fullscreen mode

That's it. The toggle flips, and every subsequent agent loop uses the Executor-Advisor pattern.

Error Codes to Know

Code Cause Fix
max_uses_exceeded Per-request cap hit Raise max_uses or tighten executor prompt
overloaded Advisor capacity Retry with backoff, fallback to Opus 4.6
context_mismatch You stripped advisor_tool_result blocks Keep full assistant response in multi-turn
invalid_advisor_model Advisor weaker than executor Always use Opus 4.7 as advisor

Customer Evidence

  • Bolt (Eric Simmons, CEO): "Architectural decisions on complex tasks visibly improved. The plans and trajectories are completely different."
  • Eve Legal (Anuraj Pandey): "Haiku 4.5 with Opus 4.6 advisory hit frontier quality at 5x less cost on structured document extraction."

The Takeaway for Indie Developers

The model-selection dilemma — "Opus quality but at Haiku prices" — just got structurally solved. If you're running long-horizon coding agents, multi-step research pipelines, or Computer Use flows, this pattern changes your cost curve.

Three lines of code to try it. One CLI command if you're in Claude Code.


Reference: Anthropic Blog — The Advisor Strategy

Top comments (0)