klement Gunndu

Pick the Right Claude Code Model for Every Task

Claude Code supports three model tiers, seven aliases, four effort levels, and per-subagent model overrides. Most developers use the default for everything. That means they're paying Opus prices for tasks Haiku handles in half the time.

This guide covers five patterns for matching the right model to the right task in Claude Code, based on the official model configuration system as of March 2026.

The Three Tiers and When Each Wins

Claude Code gives you three model families, each with a different speed-cost-quality tradeoff:

| Model | Best For | Speed | Cost |
|-------|----------|-------|------|
| Haiku | File search, simple refactors, quick lookups | Fastest | Lowest |
| Sonnet | Daily coding, implementation, test writing | Balanced | Medium |
| Opus | Architecture decisions, complex debugging, multi-file refactoring | Slowest | Highest |

The default model depends on your subscription. Max and Team Premium plans default to Opus 4.6. Pro and Team Standard default to Sonnet 4.6. Claude Code may automatically fall back to Sonnet if you hit a usage threshold with Opus.

Knowing this matters because many developers on Max plans run Opus for every task, including tasks where Haiku would finish 3x faster with identical results.

Pattern 1: Switch Models Mid-Session With /model

You don't need to restart Claude Code to change models. The /model command switches instantly:

# Planning a complex migration? Switch to Opus
/model opus

# Now implementing the plan? Switch to Sonnet
/model sonnet

# Quick file search or simple rename? Switch to Haiku
/model haiku

You can also start a session with a specific model:

# Start with Haiku for a quick task
claude --model haiku

# Start with Opus for a design session
claude --model opus

Or set it permanently in your settings file (~/.claude/settings.json):

{
  "model": "sonnet"
}

The priority order is: /model during session > --model flag > ANTHROPIC_MODEL environment variable > settings file.

This alone saves significant time. If you're in the middle of implementing a feature and need to look up how a module works, switching to Haiku for that exploration means you get answers faster and spend less context on a task that doesn't need deep reasoning.

Pattern 2: Use opusplan for Automatic Routing

The most underused model alias in Claude Code is opusplan. It uses Opus when you're in plan mode and automatically switches to Sonnet for execution:

claude --model opusplan

Here's why this matters. Planning requires the model to reason about architecture, weigh tradeoffs, and consider edge cases. That's where Opus excels. But once the plan is set, implementation is largely mechanical: write the code, follow the patterns, run the tests. Sonnet handles that efficiently.

With opusplan, you get Opus-quality planning and Sonnet-speed execution without manually switching. The transition happens automatically when you exit plan mode.

To enter plan mode during a session, use:

/plan

Claude Code switches to read-only exploration mode. If you're on opusplan, the model upgrades to Opus for this phase. When you approve the plan and exit plan mode, it drops back to Sonnet for implementation.
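Under the hood, the routing amounts to a mode check. Here's a toy sketch of that behavior (assuming a simple boolean plan-mode flag; Claude Code's actual implementation is internal):

```python
def opusplan_model(in_plan_mode: bool) -> str:
    """Toy sketch of the opusplan alias: Opus while planning, Sonnet once
    the plan is approved and execution begins. Illustrative only."""
    return "opus" if in_plan_mode else "sonnet"

print(opusplan_model(True))   # opus  (plan mode: deep reasoning)
print(opusplan_model(False))  # sonnet (execution: fast implementation)
```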

This is the closest thing to automatic model routing in Claude Code today, and it works well for the most common workflow: think first, then build.

Pattern 3: Control Effort Levels

Model selection isn't the only lever. Effort levels control how much thinking the model does per turn, independent of which model you're using.

Three levels persist across sessions: low, medium, and high. A fourth level, max, is available on Opus 4.6 only and does not persist.

# Quick formatting fix? Low effort
/effort low

# Standard coding work? Medium (default)
/effort medium

# Hard debugging or architecture? High effort
/effort high

# The hardest problem in the codebase? Max (Opus only)
/effort max

You can also set effort at startup:

claude --effort high

Or via environment variable:

export CLAUDE_CODE_EFFORT_LEVEL=high

Or in your settings file:

{
  "effortLevel": "medium"
}

The key insight: medium effort is the recommended default for most coding tasks. Higher effort levels can cause the model to overthink routine work, actually making it slower without improving quality.

Reserve high or max for tasks that genuinely benefit from deeper reasoning: tracing a complex bug across multiple files, designing a new system architecture, or debugging a race condition.

For one-off deep reasoning without changing your session setting, include "ultrathink" in your prompt. This triggers high effort for that single turn.

The combination of model tier and effort level gives you a 2D control surface. Haiku at low effort is near-instant for simple lookups. Opus at max effort is the deepest reasoning available, but takes longer and costs more. Matching both dimensions to the task is what separates efficient Claude Code usage from wasteful usage.

Pattern 4: Route Models to Subagents

This is where model routing gets powerful. Claude Code subagents can each run on a different model, set in their definition file.

Here's a practical example. You want a code review agent that runs on Sonnet (good enough for pattern matching and style checks) and a research agent that runs on Haiku (fast lookups, no need for deep reasoning):

Create .claude/agents/code-reviewer.md:

---
name: code-reviewer
description: Reviews code for quality and best practices. Use proactively after code changes.
tools: Read, Grep, Glob, Bash
model: sonnet
---

You are a senior code reviewer. When invoked:
1. Run git diff to see recent changes
2. Focus on modified files
3. Review for clarity, security, error handling, and test coverage

Provide feedback organized by priority:
- Critical issues (must fix)
- Warnings (should fix)
- Suggestions (consider improving)

Create .claude/agents/researcher.md:

---
name: researcher
description: Fast codebase exploration and documentation lookup.
tools: Read, Grep, Glob
model: haiku
---

You are a fast research assistant. Find files, search code,
and answer questions about the codebase structure.
Return concise summaries, not full file contents.

Now your main session can run on Opus for complex work, while delegating reviews to Sonnet and lookups to Haiku. Each subagent runs in its own context window, so the verbose output from exploration doesn't consume your main conversation's context.
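You don't need special syntax to use them: Claude delegates automatically based on each agent's description field, or you can ask explicitly in the main session. The phrasing below is just an example:

```
> Use the code-reviewer subagent to check my recent changes
> Have the researcher subagent find where the retry logic lives
```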

Claude Code's built-in subagents already follow this pattern. The Explore agent runs on Haiku for fast, read-only codebase searches. The Plan agent inherits your session model for research during plan mode. The general-purpose agent also inherits your model for complex multi-step tasks.

You can also set a global default for all subagents via environment variable:

export CLAUDE_CODE_SUBAGENT_MODEL=haiku

This overrides the model for every subagent invocation, regardless of what's set in their definition files. Useful for cost control during development.

The model resolution order for subagents is:

  1. CLAUDE_CODE_SUBAGENT_MODEL environment variable
  2. Per-invocation model parameter (set by Claude when delegating)
  3. Subagent's model frontmatter field
  4. Main conversation's model

Pattern 5: Pin Models for Team Consistency

When multiple developers share a project, model inconsistency becomes a real problem. One developer tests on Opus, another on Haiku. Code that works with Opus-level reasoning might fail when a teammate runs it with Haiku.

Pin models with environment variables to ensure everyone uses the same configuration:

# In your team's .envrc or shell profile
export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-6"
export ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-6"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-4-5-20251001"

For enterprise deployments on AWS Bedrock or Google Vertex AI, this is critical. Without pinning, Claude Code uses aliases that resolve to the latest version. When Anthropic releases a new model, the alias shifts with it, and sessions on accounts that don't have the new version enabled fail silently.
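To see why pinning matters, imagine (hypothetically) that an alias is resolved server-side to whatever version is newest, while a full model ID passes through unchanged. The model IDs below are illustrative, not official:

```python
# Hypothetical alias tables before and after a new release ships.
BEFORE_RELEASE = {"sonnet": "claude-sonnet-4-6"}
AFTER_RELEASE = {"sonnet": "claude-sonnet-5-0"}

def resolve(name, alias_table):
    # Aliases drift with the table; pinned full model IDs return as-is.
    return alias_table.get(name, name)

print(resolve("sonnet", BEFORE_RELEASE))            # claude-sonnet-4-6
print(resolve("sonnet", AFTER_RELEASE))             # claude-sonnet-5-0 (drifted)
print(resolve("claude-sonnet-4-6", AFTER_RELEASE))  # claude-sonnet-4-6 (pinned)
```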

You can also restrict which models your team can select:

{
  "availableModels": ["sonnet", "haiku"]
}

This prevents developers from switching to models outside the approved list via /model, --model, or environment variables. The default model for their subscription tier always remains available.

For full control, combine both settings:

{
  "model": "sonnet",
  "availableModels": ["sonnet", "haiku"]
}

This ensures everyone starts on Sonnet and can only switch between Sonnet and Haiku. No surprise Opus bills. No inconsistent behavior across the team.
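A toy model of the gating behavior (assumed semantics based on the description above, not Claude Code's source):

```python
def can_select(requested, available_models, subscription_default):
    """Models on the approved list are selectable; the subscription-tier
    default always remains available regardless of the list."""
    return requested in available_models or requested == subscription_default

print(can_select("haiku", ["sonnet", "haiku"], "sonnet"))  # True
print(can_select("opus", ["sonnet", "haiku"], "sonnet"))   # False: not approved
```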

The Decision Framework

Here's how to decide which model and effort to use for common tasks:

| Task | Model | Effort | Why |
|------|-------|--------|-----|
| Find a file or function | Haiku | Low | Pure lookup, no reasoning needed |
| Rename a variable across files | Haiku | Low | Mechanical, well-defined scope |
| Write a new function | Sonnet | Medium | Needs context awareness, not deep reasoning |
| Write tests for existing code | Sonnet | Medium | Pattern matching against existing code |
| Debug a failing test | Sonnet | High | Needs reasoning about causality |
| Design a new module architecture | Opus | High | Multi-factor tradeoff analysis |
| Refactor across 10+ files | Opus | High | Needs to hold the full dependency graph in context |
| Plan then implement a feature | opusplan | Auto | Opus for the plan, Sonnet for the code |
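If you want this framework as a quick helper for scripts or team docs, it collapses to a lookup. The task labels here are illustrative; extend the mapping with your own categories:

```python
# Illustrative encoding of the decision table above.
ROUTING = {
    "find_file":       ("haiku",    "low"),
    "rename_variable": ("haiku",    "low"),
    "write_function":  ("sonnet",   "medium"),
    "write_tests":     ("sonnet",   "medium"),
    "debug_test":      ("sonnet",   "high"),
    "design_module":   ("opus",     "high"),
    "large_refactor":  ("opus",     "high"),
    "plan_then_build": ("opusplan", "auto"),
}

def route(task):
    # Unclassified tasks fall back to the balanced default.
    return ROUTING.get(task, ("sonnet", "medium"))

print(route("find_file"))     # ('haiku', 'low')
print(route("unknown_task"))  # ('sonnet', 'medium')
```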

The 1M context window option (opus[1m] or sonnet[1m]) is available for long sessions with large codebases. On Max, Team, and Enterprise plans, Opus automatically gets 1M context. For Sonnet or other plans, it requires extra usage billing.

# Explicitly request 1M context
/model opus[1m]
/model sonnet[1m]

Start Here

If you're currently using the default model for everything, start with one change: switch to opusplan. It gives you the best of both tiers with zero manual switching.

Then add effort levels. Most of your work is /effort medium. When you hit a hard problem, /effort high gives you deeper reasoning without changing models.

Finally, create subagent definitions for your most common delegated tasks. A Haiku-powered explorer and a Sonnet-powered reviewer cover 80% of subagent use cases.

The goal isn't to use the cheapest model everywhere. It's to use the right model for each task, so you get faster answers on simple work and deeper reasoning on hard problems.


Follow @klement_gunndu for more Claude Code content. We're building in public.

Top comments (6)

Joske Vermeulen

nice write-up

klement Gunndu

Glad it landed — if you've been using model routing in practice, curious whether you default to Haiku for most tasks or find yourself reaching for Sonnet more than expected.

Joske Vermeulen

I'm going to be honest with you. My company pays my Claude and I tend to default to Opus most of the time :D

Apogee Watcher

Still on Cursor/Composer and happy with it. v2 is Opus-level.

klement Gunndu

Cursor v2 is solid — the agent loop improvements really closed the gap. Where Claude Code pulls ahead for me is the terminal-native workflow and CLAUDE.md project memory that persists across sessions.

Apogee Watcher

Cursor can have project memory files as well. We keep a separate "wiki" repository with plans and decision-making logs. Plus, Cursor can search past discussions and adjust the behaviour accordingly, even if something was not noted down in a wiki file.