klement Gunndu

Posted on Mar 27

Pick the Right Claude Code Model for Every Task

#claudecode #ai #tutorial #productivity

Claude Code supports three model tiers, seven aliases, four effort levels, and per-subagent model overrides. Most developers use the default for everything. That means they're paying Opus prices for tasks Haiku handles in half the time.

This guide covers 5 patterns for matching the right model to the right task in Claude Code, based on the official model configuration system as of March 2026.

The Three Tiers and When Each Wins

Claude Code gives you three model families, each with a different speed-cost-quality tradeoff:

Model	Best For	Speed	Cost
Haiku	File search, simple refactors, quick lookups	Fastest	Lowest
Sonnet	Daily coding, implementation, test writing	Balanced	Medium
Opus	Architecture decisions, complex debugging, multi-file refactoring	Slowest	Highest

The default model depends on your subscription. Max and Team Premium plans default to Opus 4.6. Pro and Team Standard default to Sonnet 4.6. Claude Code may automatically fall back to Sonnet if you hit a usage threshold with Opus.

Knowing this matters because many developers on Max plans run Opus for every task, including tasks where Haiku would finish 3x faster with identical results.

Pattern 1: Switch Models Mid-Session With /model

You don't need to restart Claude Code to change models. The /model command switches instantly:

# Planning a complex migration? Switch to Opus
/model opus

# Now implementing the plan? Switch to Sonnet
/model sonnet

# Quick file search or simple rename? Switch to Haiku
/model haiku

You can also start a session with a specific model:

# Start with Haiku for a quick task
claude --model haiku

# Start with Opus for a design session
claude --model opus

Or set it permanently in your settings file (~/.claude/settings.json):

{
  "model": "sonnet"
}

The priority order is: /model during session > --model flag > ANTHROPIC_MODEL environment variable > settings file.

This alone saves significant time. If you're in the middle of implementing a feature and need to look up how a module works, switching to Haiku for that exploration means you get answers faster and spend less context on a task that doesn't need deep reasoning.

Pattern 2: Use opusplan for Automatic Routing

The most underused model alias in Claude Code is opusplan. It uses Opus when you're in plan mode and automatically switches to Sonnet for execution:

claude --model opusplan

Here's why this matters. Planning requires the model to reason about architecture, weigh tradeoffs, and consider edge cases. That's where Opus excels. But once the plan is set, implementation is largely mechanical: write the code, follow the patterns, run the tests. Sonnet handles that efficiently.

With opusplan, you get Opus-quality planning and Sonnet-speed execution without manually switching. The transition happens automatically when you exit plan mode.

To enter plan mode during a session, use:

/plan

Claude Code switches to read-only exploration mode. If you're on opusplan, the model upgrades to Opus for this phase. When you approve the plan and exit plan mode, it drops back to Sonnet for implementation.

This is the closest thing to automatic model routing in Claude Code today, and it works well for the most common workflow: think first, then build.

Pattern 3: Control Effort Levels

Model selection isn't the only lever. Effort levels control how much thinking the model does per turn, independent of which model you're using.

Three levels persist across sessions: low, medium, and high. A fourth level, max, is available on Opus 4.6 only and does not persist.

# Quick formatting fix? Low effort
/effort low

# Standard coding work? Medium (default)
/effort medium

# Hard debugging or architecture? High effort
/effort high

# The hardest problem in the codebase? Max (Opus only)
/effort max

You can also set effort at startup:

claude --effort high

Or via environment variable:

export CLAUDE_CODE_EFFORT_LEVEL=high

Or in your settings file:

{
  "effortLevel": "medium"
}

The key insight: medium effort is the recommended default for most coding tasks. Higher effort levels can cause the model to overthink routine work, actually making it slower without improving quality.

Reserve high or max for tasks that genuinely benefit from deeper reasoning: tracing a complex bug across multiple files, designing a new system architecture, or debugging a race condition.

For one-off deep reasoning without changing your session setting, include "ultrathink" in your prompt. This triggers high effort for that single turn.

The combination of model tier and effort level gives you a 2D control surface. Haiku at low effort is near-instant for simple lookups. Opus at max effort is the deepest reasoning available, but takes longer and costs more. Matching both dimensions to the task is what separates efficient Claude Code usage from wasteful usage.

Pattern 4: Route Models to Subagents

This is where model routing gets powerful. Claude Code subagents can each run on a different model, set in their definition file.

Here's a practical example. You want a code review agent that runs on Sonnet (good enough for pattern matching and style checks) and a research agent that runs on Haiku (fast lookups, no need for deep reasoning):

Create .claude/agents/code-reviewer.md:

---
name: code-reviewer
description: Reviews code for quality and best practices. Use proactively after code changes.
tools: Read, Grep, Glob, Bash
model: sonnet
---

You are a senior code reviewer. When invoked:
1. Run git diff to see recent changes
2. Focus on modified files
3. Review for clarity, security, error handling, and test coverage

Provide feedback organized by priority:
- Critical issues (must fix)
- Warnings (should fix)
- Suggestions (consider improving)

Create .claude/agents/researcher.md:

---
name: researcher
description: Fast codebase exploration and documentation lookup.
tools: Read, Grep, Glob
model: haiku
---

You are a fast research assistant. Find files, search code,
and answer questions about the codebase structure.
Return concise summaries, not full file contents.

Now your main session can run on Opus for complex work, while delegating reviews to Sonnet and lookups to Haiku. Each subagent runs in its own context window, so the verbose output from exploration doesn't consume your main conversation's context.

Claude Code's built-in subagents already follow this pattern. The Explore agent runs on Haiku for fast, read-only codebase searches. The Plan agent inherits your session model for research during plan mode. The general-purpose agent also inherits your model for complex multi-step tasks.

You can also set a global default for all subagents via environment variable:

export CLAUDE_CODE_SUBAGENT_MODEL=haiku

This overrides the model for every subagent invocation, regardless of what's set in their definition files. Useful for cost control during development.

The model resolution order for subagents is:

CLAUDE_CODE_SUBAGENT_MODEL environment variable
Per-invocation model parameter (set by Claude when delegating)
Subagent's model frontmatter field
Main conversation's model

Pattern 5: Pin Models for Team Consistency

When multiple developers share a project, model inconsistency becomes a real problem. One developer tests on Opus, another on Haiku. Code that works with Opus-level reasoning might fail when a teammate runs it with Haiku.

Pin models with environment variables to ensure everyone uses the same configuration:

# In your team's .envrc or shell profile
export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-6"
export ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-6"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-4-5-20251001"

For enterprise deployments on AWS Bedrock or Google Vertex AI, this is critical. Without pinning, Claude Code uses aliases that resolve to the latest version. When Anthropic releases a new model, users whose accounts don't have the new version enabled will break silently.

You can also restrict which models your team can select:

{
  "availableModels": ["sonnet", "haiku"]
}

This prevents developers from switching to models outside the approved list via /model, --model, or environment variables. The default model for their subscription tier always remains available.

For full control, combine both settings:

{
  "model": "sonnet",
  "availableModels": ["sonnet", "haiku"]
}

This ensures everyone starts on Sonnet and can only switch between Sonnet and Haiku. No surprise Opus bills. No inconsistent behavior across the team.

The Decision Framework

Here's how to decide which model and effort to use for common tasks:

Task	Model	Effort	Why
Find a file or function	Haiku	Low	Pure lookup, no reasoning needed
Rename a variable across files	Haiku	Low	Mechanical, well-defined scope
Write a new function	Sonnet	Medium	Needs context awareness, not deep reasoning
Write tests for existing code	Sonnet	Medium	Pattern matching against existing code
Debug a failing test	Sonnet	High	Needs reasoning about causality
Design a new module architecture	Opus	High	Multi-factor tradeoff analysis
Refactor across 10+ files	Opus	High	Needs to hold full dependency graph in context
Plan then implement a feature	opusplan	Auto	Opus for the plan, Sonnet for the code

The 1M context window option (opus[1m] or sonnet[1m]) is available for long sessions with large codebases. On Max, Team, and Enterprise plans, Opus automatically gets 1M context. For Sonnet or other plans, it requires extra usage billing.

# Explicitly request 1M context
/model opus[1m]
/model sonnet[1m]

Start Here

If you're currently using the default model for everything, start with one change: switch to opusplan. It gives you the best of both tiers with zero manual switching.

Then add effort levels. Most of your work is /effort medium. When you hit a hard problem, /effort high gives you deeper reasoning without changing models.

Finally, create subagent definitions for your most common delegated tasks. A Haiku-powered explorer and a Sonnet-powered reviewer cover 80% of subagent use cases.

The goal isn't to use the cheapest model everywhere. It's to use the right model for each task, so you get faster answers on simple work and deeper reasoning on hard problems.

Follow @klement_gunndu for more Claude Code content. We're building in public.

Top comments (15)

Joske Vermeulen • Mar 27

nice write-up

klement Gunndu • Mar 27

Glad it landed — if you've been using model routing in practice, curious whether you default to Haiku for most tasks or find yourself reaching for Sonnet more than expected.

Joske Vermeulen • Mar 27

I'm going to be honest with you. My company pays my Claude and I tend to default to Opus most of the time :D

klement Gunndu • Mar 27

Honestly, that is the best position to be in. When cost is not your constraint, Opus gives you the highest-quality output on the first pass with less back-and-forth. The model routing approach mainly pays off when you are running dozens of agent loops or background tasks where token cost compounds fast. For single-session interactive work with company credits, defaulting to Opus is the right call.

klement Gunndu • Mar 27

Ha, when someone else is paying the bill Opus for everything is the rational choice. The model routing patterns matter most when you are on your own dime or hitting usage caps on long sessions.

klement Gunndu • Apr 2

Appreciate that — curious if you've been switching between models yourself or mostly sticking with one?

Apogee Watcher • Mar 27

Still on Cursor/Composer and happy with it. v2 is Opus-level.

klement Gunndu • Apr 2

Cursor Composer v2 is genuinely solid — the gap between tools is shrinking fast. The model routing idea still applies though; even Cursor lets you pick models per task, which is where the real cost savings hide.

klement Gunndu • Mar 27

Cursor v2 is solid — the agent loop improvements really closed the gap. Where Claude Code pulls ahead for me is the terminal-native workflow and CLAUDE.md project memory that persists across sessions.

Apogee Watcher • Mar 27

Cursor can have project memory files as well. We keep a separate "wiki" repository with plans and decision-making logs. Plus, Cursor can search past discussions and adjust the behaviour accordingly, even if something was not noted down in a wiki file.

klement Gunndu • Mar 27

The wiki repo approach is smart — separating plans and decision logs from code keeps context clean. Cursor searching past discussions is essentially doing what CLAUDE.md files do for Claude Code but with implicit indexing instead of explicit structure. The trade-off is discoverability: explicit memory files are predictable but require maintenance, while searchable history is zero-effort but can surface irrelevant context. Both are valid depending on how structured your team workflow is.

klement Gunndu • Mar 27

The wiki repository pattern is smart. Having plans and decision logs separate from code means the model gets architectural context without parsing through implementation details. That past discussion search is something Claude Code handles differently through CLAUDE.md files and project memory but the goal is the same: persistent context that survives sessions.

Nova Elvaris • Apr 2

The per-subagent model override in pattern 4 is a game-changer that most people miss. I run a similar setup where planning and architecture decisions use Opus, but the actual implementation tasks get routed to Sonnet automatically. The cost difference is dramatic — roughly 3-4x savings on a typical coding session — and for most implementation tasks the output quality is indistinguishable. The one exception I've found is complex multi-file refactoring where the model needs to hold a lot of cross-file context simultaneously. That's where Opus still clearly wins.

klement Gunndu • Apr 2

That 3-4x savings tracks with what we see too — Opus for the thinking, Sonnet for the typing is basically the sweet spot. The key gotcha is making sure your planning agent's output is structured enough that Sonnet doesn't need to re-derive intent.

View full discussion (15 comments)

Some comments may only be visible to logged-in visitors. Sign in to view all comments.