Claude Code supports three model tiers, seven aliases, four effort levels, and per-subagent model overrides. Most developers use the default for everything. That means they're paying Opus prices for tasks Haiku handles in half the time.
This guide covers 5 patterns for matching the right model to the right task in Claude Code, based on the official model configuration system as of March 2026.
The Three Tiers and When Each Wins
Claude Code gives you three model families, each with a different speed-cost-quality tradeoff:
| Model | Best For | Speed | Cost |
|---|---|---|---|
| Haiku | File search, simple refactors, quick lookups | Fastest | Lowest |
| Sonnet | Daily coding, implementation, test writing | Balanced | Medium |
| Opus | Architecture decisions, complex debugging, multi-file refactoring | Slowest | Highest |
The default model depends on your subscription. Max and Team Premium plans default to Opus 4.6. Pro and Team Standard default to Sonnet 4.6. Claude Code may automatically fall back to Sonnet if you hit a usage threshold with Opus.
Knowing this matters because many developers on Max plans run Opus for every task, including tasks where Haiku would finish 3x faster with identical results.
Pattern 1: Switch Models Mid-Session With /model
You don't need to restart Claude Code to change models. The /model command switches instantly:
# Planning a complex migration? Switch to Opus
/model opus
# Now implementing the plan? Switch to Sonnet
/model sonnet
# Quick file search or simple rename? Switch to Haiku
/model haiku
You can also start a session with a specific model:
# Start with Haiku for a quick task
claude --model haiku
# Start with Opus for a design session
claude --model opus
Or set it permanently in your settings file (~/.claude/settings.json):
{
"model": "sonnet"
}
The priority order is: /model during session > --model flag > ANTHROPIC_MODEL environment variable > settings file.
This alone saves significant time. If you're in the middle of implementing a feature and need to look up how a module works, switching to Haiku for that exploration means you get answers faster and spend less context on a task that doesn't need deep reasoning.
Pattern 2: Use opusplan for Automatic Routing
The most underused model alias in Claude Code is opusplan. It uses Opus when you're in plan mode and automatically switches to Sonnet for execution:
claude --model opusplan
Here's why this matters. Planning requires the model to reason about architecture, weigh tradeoffs, and consider edge cases. That's where Opus excels. But once the plan is set, implementation is largely mechanical: write the code, follow the patterns, run the tests. Sonnet handles that efficiently.
With opusplan, you get Opus-quality planning and Sonnet-speed execution without manually switching. The transition happens automatically when you exit plan mode.
To enter plan mode during a session, use:
/plan
Claude Code switches to read-only exploration mode. If you're on opusplan, the model upgrades to Opus for this phase. When you approve the plan and exit plan mode, it drops back to Sonnet for implementation.
This is the closest thing to automatic model routing in Claude Code today, and it works well for the most common workflow: think first, then build.
Pattern 3: Control Effort Levels
Model selection isn't the only lever. Effort levels control how much thinking the model does per turn, independent of which model you're using.
Three levels persist across sessions: low, medium, and high. A fourth level, max, is available on Opus 4.6 only and does not persist.
# Quick formatting fix? Low effort
/effort low
# Standard coding work? Medium (default)
/effort medium
# Hard debugging or architecture? High effort
/effort high
# The hardest problem in the codebase? Max (Opus only)
/effort max
You can also set effort at startup:
claude --effort high
Or via environment variable:
export CLAUDE_CODE_EFFORT_LEVEL=high
Or in your settings file:
{
"effortLevel": "medium"
}
The key insight: medium effort is the recommended default for most coding tasks. Higher effort levels can cause the model to overthink routine work, actually making it slower without improving quality.
Reserve high or max for tasks that genuinely benefit from deeper reasoning: tracing a complex bug across multiple files, designing a new system architecture, or debugging a race condition.
For one-off deep reasoning without changing your session setting, include "ultrathink" in your prompt. This triggers high effort for that single turn.
The combination of model tier and effort level gives you a 2D control surface. Haiku at low effort is near-instant for simple lookups. Opus at max effort is the deepest reasoning available, but takes longer and costs more. Matching both dimensions to the task is what separates efficient Claude Code usage from wasteful usage.
Pattern 4: Route Models to Subagents
This is where model routing gets powerful. Claude Code subagents can each run on a different model, set in their definition file.
Here's a practical example. You want a code review agent that runs on Sonnet (good enough for pattern matching and style checks) and a research agent that runs on Haiku (fast lookups, no need for deep reasoning):
Create .claude/agents/code-reviewer.md:
---
name: code-reviewer
description: Reviews code for quality and best practices. Use proactively after code changes.
tools: Read, Grep, Glob, Bash
model: sonnet
---
You are a senior code reviewer. When invoked:
1. Run git diff to see recent changes
2. Focus on modified files
3. Review for clarity, security, error handling, and test coverage
Provide feedback organized by priority:
- Critical issues (must fix)
- Warnings (should fix)
- Suggestions (consider improving)
Create .claude/agents/researcher.md:
---
name: researcher
description: Fast codebase exploration and documentation lookup.
tools: Read, Grep, Glob
model: haiku
---
You are a fast research assistant. Find files, search code,
and answer questions about the codebase structure.
Return concise summaries, not full file contents.
Now your main session can run on Opus for complex work, while delegating reviews to Sonnet and lookups to Haiku. Each subagent runs in its own context window, so the verbose output from exploration doesn't consume your main conversation's context.
Claude Code's built-in subagents already follow this pattern. The Explore agent runs on Haiku for fast, read-only codebase searches. The Plan agent inherits your session model for research during plan mode. The general-purpose agent also inherits your model for complex multi-step tasks.
You can also set a global default for all subagents via environment variable:
export CLAUDE_CODE_SUBAGENT_MODEL=haiku
This overrides the model for every subagent invocation, regardless of what's set in their definition files. Useful for cost control during development.
The model resolution order for subagents is:
-
CLAUDE_CODE_SUBAGENT_MODELenvironment variable - Per-invocation model parameter (set by Claude when delegating)
- Subagent's
modelfrontmatter field - Main conversation's model
Pattern 5: Pin Models for Team Consistency
When multiple developers share a project, model inconsistency becomes a real problem. One developer tests on Opus, another on Haiku. Code that works with Opus-level reasoning might fail when a teammate runs it with Haiku.
Pin models with environment variables to ensure everyone uses the same configuration:
# In your team's .envrc or shell profile
export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-6"
export ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-6"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-4-5-20251001"
For enterprise deployments on AWS Bedrock or Google Vertex AI, this is critical. Without pinning, Claude Code uses aliases that resolve to the latest version. When Anthropic releases a new model, users whose accounts don't have the new version enabled will break silently.
You can also restrict which models your team can select:
{
"availableModels": ["sonnet", "haiku"]
}
This prevents developers from switching to models outside the approved list via /model, --model, or environment variables. The default model for their subscription tier always remains available.
For full control, combine both settings:
{
"model": "sonnet",
"availableModels": ["sonnet", "haiku"]
}
This ensures everyone starts on Sonnet and can only switch between Sonnet and Haiku. No surprise Opus bills. No inconsistent behavior across the team.
The Decision Framework
Here's how to decide which model and effort to use for common tasks:
| Task | Model | Effort | Why |
|---|---|---|---|
| Find a file or function | Haiku | Low | Pure lookup, no reasoning needed |
| Rename a variable across files | Haiku | Low | Mechanical, well-defined scope |
| Write a new function | Sonnet | Medium | Needs context awareness, not deep reasoning |
| Write tests for existing code | Sonnet | Medium | Pattern matching against existing code |
| Debug a failing test | Sonnet | High | Needs reasoning about causality |
| Design a new module architecture | Opus | High | Multi-factor tradeoff analysis |
| Refactor across 10+ files | Opus | High | Needs to hold full dependency graph in context |
| Plan then implement a feature | opusplan | Auto | Opus for the plan, Sonnet for the code |
The 1M context window option (opus[1m] or sonnet[1m]) is available for long sessions with large codebases. On Max, Team, and Enterprise plans, Opus automatically gets 1M context. For Sonnet or other plans, it requires extra usage billing.
# Explicitly request 1M context
/model opus[1m]
/model sonnet[1m]
Start Here
If you're currently using the default model for everything, start with one change: switch to opusplan. It gives you the best of both tiers with zero manual switching.
Then add effort levels. Most of your work is /effort medium. When you hit a hard problem, /effort high gives you deeper reasoning without changing models.
Finally, create subagent definitions for your most common delegated tasks. A Haiku-powered explorer and a Sonnet-powered reviewer cover 80% of subagent use cases.
The goal isn't to use the cheapest model everywhere. It's to use the right model for each task, so you get faster answers on simple work and deeper reasoning on hard problems.
Follow @klement_gunndu for more Claude Code content. We're building in public.
Top comments (6)
nice write-up
Glad it landed — if you've been using model routing in practice, curious whether you default to Haiku for most tasks or find yourself reaching for Sonnet more than expected.
I'm going to be honest with you. My company pays my Claude and I tend to default to Opus most of the time :D
Still on Cursor/Composer and happy with it. v2 is Opus-level.
Cursor v2 is solid — the agent loop improvements really closed the gap. Where Claude Code pulls ahead for me is the terminal-native workflow and CLAUDE.md project memory that persists across sessions.
Cursor can have project memory files as well. We keep a separate "wiki" repository with plans and decision-making logs. Plus, Cursor can search past discussions and adjust the behaviour accordingly, even if something was not noted down in a wiki file.