RAXXO Studios

Posted on Apr 15 • Originally published at raxxo.shop

Claude Opus 4 and Sonnet 4 Retire June 15


Anthropic deprecated claude-opus-4 and claude-sonnet-4 on April 14 with retirement set for June 15
Both models get replaced by their 4.6 versions with 1M context, adaptive thinking, and higher output limits
Opus 4.6 supports 128K max output tokens versus Opus 4's 32K, and 1M context is included at standard pricing
Sonnet 4.6 scores 79.6% on SWE-bench Verified, nearly matching Opus at 80.8% for most coding tasks
The biggest API change is adaptive thinking replacing manual budget_tokens, which requires code updates

On April 14, Anthropic officially deprecated Claude Opus 4 and Claude Sonnet 4. Both models retire on June 15, 2026. After that date, every API request using claude-opus-4-20250514 or claude-sonnet-4-20250514 returns an error. No fallback. No grace period.

If you have production systems running on either model, you have about two months to migrate. Here is what changes, what breaks, and how to handle the transition without downtime.

What Is Being Deprecated

Two specific model IDs are affected:

| Model | Status | Retirement Date |
|-------|--------|-----------------|
| claude-opus-4-20250514 | Deprecated | June 15, 2026 |
| claude-sonnet-4-20250514 | Deprecated | June 15, 2026 |

The replacements are claude-opus-4-6 and claude-sonnet-4-6, both released in February 2026. These are not aliases or minor updates. They are different model generations with different capabilities, different defaults, and at least one breaking API change.

What You Gain by Migrating

The 4.6 models are not just newer. They improve on their predecessors in context length, output limits, reasoning control, and coding benchmark scores.

1M Context Window at Standard Pricing. Opus 4 and Sonnet 4 required a beta header and long-context pricing to use the 1M token context window. The 4.6 versions include it at standard pricing with no beta header. Requests over 200K tokens just work.

Higher Output Limits. Opus 4.6 supports 128K max output tokens, up from Opus 4's 32K. Sonnet 4.6 supports 64K. For code generation, long-form content, and structured data extraction, this is a 4x improvement on Opus and a 2x improvement on Sonnet.

Adaptive Thinking. Both 4.6 models recommend adaptive thinking (thinking: {type: "adaptive"}), where Claude dynamically decides when and how deeply to think. This replaces the manual budget_tokens approach from Opus 4, which required you to guess how many tokens Claude needed for reasoning.

Better Benchmarks. Sonnet 4.6 scores 79.6% on SWE-bench Verified. Opus 4.6 scores 80.8%. For most coding tasks, the gap between the two is negligible. Both outperform their predecessors by significant margins.

300K Batch Output. The Message Batches API now supports up to 300K output tokens for both 4.6 models with the output-300k-2026-03-24 beta header. Long-form generation at batch pricing makes large-scale content and data processing dramatically cheaper.

What Breaks When You Switch

Not everything is a drop-in replacement. Three changes require code modifications.

Adaptive Thinking Replaces budget_tokens

If you use extended thinking with Opus 4 and set budget_tokens manually, that parameter is deprecated on Opus 4.6. The recommended approach is:


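```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Old (Opus 4): manual thinking budget; max_tokens must exceed budget_tokens
response = client.messages.create(
    model="claude-opus-4-20250514",
    thinking={"type": "enabled", "budget_tokens": 8192},
    max_tokens=16000,
    messages=[{"role": "user", "content": "Analyze this code."}]
)

# New (Opus 4.6): Claude decides whether and how much to think
response = client.messages.create(
    model="claude-opus-4-6",
    thinking={"type": "adaptive"},
    max_tokens=16000,
    messages=[{"role": "user", "content": "Analyze this code."}]
)
```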

Adaptive thinking means Claude decides whether to think and how much. For most workloads, this produces better results with fewer wasted tokens. But if you relied on precise control over thinking budget for cost management, you need to adjust your approach.

The effort parameter (effort: "low" | "medium" | "high") is now the recommended way to control thinking depth instead of raw token budgets.
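The mechanical part of this change can be sketched as a small helper that rewrites a dictionary of request arguments before it reaches `client.messages.create`. This is a minimal sketch, not part of the Anthropic SDK: `migrate_request` and `MODEL_REPLACEMENTS` are hypothetical names, and the helper deliberately leaves the effort setting alone because where you set it depends on your SDK version.

```python
from copy import deepcopy

# Old-to-new model IDs as listed in this article.
MODEL_REPLACEMENTS = {
    "claude-opus-4-20250514": "claude-opus-4-6",
    "claude-sonnet-4-20250514": "claude-sonnet-4-6",
}


def migrate_request(kwargs: dict) -> dict:
    """Return a copy of request kwargs updated for the 4.6 models.

    Hypothetical helper: swaps a deprecated model ID for its replacement
    and turns a manual thinking budget into adaptive thinking.
    """
    migrated = deepcopy(kwargs)

    model = migrated.get("model")
    if model in MODEL_REPLACEMENTS:
        migrated["model"] = MODEL_REPLACEMENTS[model]

    thinking = migrated.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        # budget_tokens is dropped; Claude decides how much to think.
        migrated["thinking"] = {"type": "adaptive"}

    return migrated


old = {
    "model": "claude-opus-4-20250514",
    "thinking": {"type": "enabled", "budget_tokens": 8192},
    "max_tokens": 16000,
    "messages": [{"role": "user", "content": "Analyze this code."}],
}
new = migrate_request(old)
print(new["model"], new["thinking"])
```

Because the helper works on a copy, the original arguments stay available if you want to send the same prompt to both models and compare results during testing.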

No Assistant Message Prefilling

Opus 4.6 does not support prefilling assistant messages. If your application starts Claude's response with specific text to guide the output format, this will not work with Opus 4.6. You need to move that guidance into the system prompt or user message instead.

Sonnet 4.6 still supports prefilling, so this only affects Opus migrations.
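One way to move a prefill into the system prompt is sketched below. It assumes the prefill is the last message in the list, has the `assistant` role, and carries plain string content. The function name and the instruction wording are illustrative choices, not anything the API prescribes, and an instruction is a weaker guarantee than a true prefill, so parsers downstream should tolerate small deviations.

```python
def move_prefill_to_system(system: str, messages: list[dict]) -> tuple[str, list[dict]]:
    """Drop a trailing assistant prefill and restate it as a system instruction.

    Returns the (possibly extended) system prompt and a new message list.
    Inputs without a trailing assistant message are returned unchanged.
    """
    if not messages or messages[-1].get("role") != "assistant":
        return system, list(messages)

    prefill = messages[-1]["content"]  # assumed to be a plain string
    instruction = f"Begin your response with exactly this text: {prefill}"
    new_system = f"{system}\n\n{instruction}" if system else instruction
    return new_system, list(messages[:-1])


system, messages = move_prefill_to_system(
    "You extract data as JSON.",
    [
        {"role": "user", "content": "List the fields in this form."},
        {"role": "assistant", "content": "{"},
    ],
)
print(system)
print(messages)
```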

output_format Moved to output_config.format

If you use structured outputs, the parameter location changed:


```python
# Old
response = client.messages.create(
    output_format={"type": "json", "schema": my_schema},
    ...
)

# New
response = client.messages.create(
    output_config={"format": {"type": "json", "schema": my_schema}},
    ...
)
```

The old location still works during a transition period, but updating now prevents issues later.
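If request arguments are built in many places, a small shim can relocate the parameter in one spot while call sites are updated gradually. The sketch below assumes arguments are passed around as a plain dictionary; `move_output_format` is a hypothetical name, not an SDK function.

```python
def move_output_format(kwargs: dict) -> dict:
    """Return a copy of request kwargs with output_format relocated
    to output_config["format"], leaving every other key untouched."""
    migrated = dict(kwargs)
    if "output_format" not in migrated:
        return migrated

    fmt = migrated.pop("output_format")
    output_config = dict(migrated.get("output_config") or {})
    # Do not overwrite a format that is already set in the new location.
    output_config.setdefault("format", fmt)
    migrated["output_config"] = output_config
    return migrated


my_schema = {"type": "object", "properties": {"title": {"type": "string"}}}
request = move_output_format(
    {"model": "claude-opus-4-6", "output_format": {"type": "json", "schema": my_schema}}
)
print(request)
```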

Migration Checklist

Here is a step-by-step plan for migrating without production incidents.

Week 1: Audit. Find every place your codebase references the old model IDs. Check environment variables, configuration files, CI/CD pipelines, and any hardcoded strings. The Console Usage page has an Export button that shows usage broken down by model, which catches references you might miss in code.


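```bash
# Quick grep for the dated model IDs
grep -rn "claude-opus-4-20250514\|claude-sonnet-4-20250514" .
# Bare "claude-opus-4" / "claude-sonnet-4" references. A plain \b would also
# match claude-opus-4-6, so exclude a following hyphen or digit instead.
grep -rnE 'claude-(opus|sonnet)-4([^-0-9]|$)' . --include="*.py" --include="*.ts" --include="*.yaml"
```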
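Grep covers files on disk but not values that only exist in a running process's environment. A rough sketch of that part of the audit follows. `find_deprecated` and the sample variable names are hypothetical, and the pattern only knows about the two model families this article covers.

```python
import re

# Matches the dated IDs and the bare names, but not IDs that continue
# with another suffix such as claude-opus-4-6.
DEPRECATED_ID = re.compile(r"claude-(?:opus|sonnet)-4(?:-20250514)?(?![-\w])")


def find_deprecated(settings: dict[str, str]) -> dict[str, str]:
    """Return the entries of a settings mapping (for example dict(os.environ))
    whose value mentions a deprecated model ID."""
    return {key: value for key, value in settings.items() if DEPRECATED_ID.search(value)}


sample = {
    "CLAUDE_MODEL": "claude-sonnet-4-20250514",
    "CLAUDE_REASONING_MODEL": "claude-opus-4-6",
    "FALLBACK_MODEL": "claude-opus-4",
    "PATH": "/usr/bin",
}
for key, value in find_deprecated(sample).items():
    print(f"{key} still points at {value}")
```

In a real service you would pass `dict(os.environ)` instead of the sample mapping, ideally at startup so the finding shows up in logs.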

Week 2: Test. Run your evaluation suite against the 4.6 models in a staging environment. Pay attention to:

Output format consistency (especially if you parse structured responses)
Thinking token usage (adaptive vs. fixed budget)
Any prefilled assistant message patterns
Cost per request (4.6 models may use tokens differently)

Week 3: Gradual Rollout. Switch non-critical paths first. Internal tools, development environments, batch processing jobs. Monitor for regressions before touching customer-facing systems.
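A gradual rollout can also be done within a single path by sending a fixed share of requests to the new model. The following is a minimal sketch of deterministic percentage routing. `pick_model`, `request_key`, and `rollout_percent` are illustrative names, and hashing a stable key is just one of several reasonable bucketing schemes.

```python
import hashlib

OLD_MODEL = "claude-sonnet-4-20250514"
NEW_MODEL = "claude-sonnet-4-6"


def pick_model(request_key: str, rollout_percent: int) -> str:
    """Deterministically route a share of traffic to the new model.

    The same request_key always lands in the same bucket, so a given
    user or job does not flip between models from call to call.
    """
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # value in 0..99
    return NEW_MODEL if bucket < rollout_percent else OLD_MODEL


# Start small, watch for regressions, then raise the percentage.
for key in ("nightly-batch", "internal-dashboard", "customer-chat"):
    print(key, "->", pick_model(key, rollout_percent=25))
```

Raising `rollout_percent` only moves additional keys to the new model; keys that were already routed there stay there, which keeps before/after comparisons clean.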

Week 4: Production. Update production model references. Keep the old model ID in a commented-out fallback for one week in case you need to diagnose issues by comparing outputs.

Before June 1: Clean Up. Remove all references to deprecated model IDs. Update documentation. Notify downstream consumers if you expose model selection to users.

Cost Implications

Pricing between the generations is identical for standard requests:

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|-------|-----------------------|------------------------|
| Opus 4.6 | 15 USD | 75 USD |
| Sonnet 4.6 | 3 USD | 15 USD |

The cost difference comes from behavior changes. Adaptive thinking may use more or fewer thinking tokens depending on task complexity. If you had a tight budget_tokens cap that kept Opus 4 costs predictable, adaptive thinking might increase costs on complex tasks while decreasing them on simple ones.

Monitor your usage closely during the first week after migration. The Console Usage page shows token breakdowns that help you spot unexpected changes.
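To put rough numbers on how a change in thinking-token usage affects spend, a back-of-the-envelope calculator is enough. The sketch below uses the prices from the table above and assumes thinking tokens are counted in the output token total. The token counts in the example are made up for illustration, not measurements.

```python
# USD per 1M tokens, copied from the pricing table in this article.
PRICES = {
    "claude-opus-4-6": {"input": 15.0, "output": 75.0},
    "claude-sonnet-4-6": {"input": 3.0, "output": 15.0},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    output_tokens is assumed to include any thinking tokens.
    """
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Same prompt, two hypothetical amounts of thinking plus answer output.
light = request_cost("claude-opus-4-6", input_tokens=10_000, output_tokens=2_000)
heavy = request_cost("claude-opus-4-6", input_tokens=10_000, output_tokens=8_000)
print(f"light: {light:.2f} USD, heavy: {heavy:.2f} USD")
```

Feeding real token counts from your usage export into a function like this, before and after the switch, makes cost drift visible per request type rather than only in the monthly total.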

The 1M context window moving to standard pricing is a net cost reduction for anyone who was paying the long-context premium. If you were using the beta header with Sonnet 4 or Sonnet 4.5 for 1M context, switching to Sonnet 4.6 eliminates that surcharge.

The Bigger Pattern

This is the sixth deprecation cycle Anthropic has run since September 2024. The cadence is clear: new models launch, old models get 60 days notice, then they shut down. Anthropic has published a commitment to long-term preservation of model weights, but operational API access ends on the retirement date.

For production systems, this means model migration is not a one-time task. It is a recurring maintenance item. The teams that handle it best are the ones who abstract their model selection behind a configuration layer rather than hardcoding model IDs throughout the codebase.


```python
# config.py
import os

# Model IDs come from the environment, with the 4.6 models as defaults
CLAUDE_MODEL = os.environ.get("CLAUDE_MODEL", "claude-sonnet-4-6")
CLAUDE_REASONING_MODEL = os.environ.get("CLAUDE_REASONING_MODEL", "claude-opus-4-6")
```

One environment variable change instead of a codebase-wide find-and-replace. This is not over-engineering. It is preparation for the next deprecation cycle, which will come in roughly 6 months based on Anthropic's current pace.
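A configuration layer can also refuse deprecated IDs outright, so a stale environment variable fails at startup instead of at the first API call after retirement. This is a sketch under the assumption that failing fast is acceptable for your service. `load_model` and `DEPRECATED_MODELS` are hypothetical names, and the set has to be maintained by hand.

```python
import os

# Deprecated model IDs named in this article; update as new notices arrive.
DEPRECATED_MODELS = {
    "claude-opus-4-20250514",
    "claude-sonnet-4-20250514",
    "claude-3-haiku-20240307",
}


def load_model(var: str, default: str, env=None) -> str:
    """Read a model ID from the environment and reject deprecated IDs."""
    env = os.environ if env is None else env
    model = env.get(var, default)
    if model in DEPRECATED_MODELS:
        raise ValueError(f"{var}={model} is deprecated; update it before the retirement date")
    return model


# Demonstration with an explicit mapping instead of the real environment.
sample_env = {"CLAUDE_REASONING_MODEL": "claude-opus-4-6"}
CLAUDE_MODEL = load_model("CLAUDE_MODEL", "claude-sonnet-4-6", env=sample_env)
CLAUDE_REASONING_MODEL = load_model("CLAUDE_REASONING_MODEL", "claude-opus-4-6", env=sample_env)
print(CLAUDE_MODEL, CLAUDE_REASONING_MODEL)
```

Omitting the `env` argument makes the function read `os.environ`, which is what you would use in an actual config module. A softer variant could log a warning instead of raising.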

Also Retiring: 1M Context for Older Sonnet Models

A related change that is easy to miss: the 1M token context window beta for Claude Sonnet 4.5 and Claude Sonnet 4 retires on April 30, 2026. After that date, the context-1m-2025-08-07 beta header will have no effect on these models. Requests exceeding 200K tokens will return an error.

If you rely on long-context processing with Sonnet 4 or Sonnet 4.5, you need to migrate to Sonnet 4.6 before April 30, not June 15. That is two weeks away, not two months.

Sonnet 4.6 includes 1M context at standard pricing with no beta header. This is actually a better deal than the previous arrangement, where long-context requests incurred premium pricing. The migration removes a cost surcharge while giving you a more capable model.

Haiku 3 Is Also on the Clock

While the headline is Opus 4 and Sonnet 4, Claude Haiku 3 (claude-3-haiku-20240307) is deprecated with retirement on April 20, 2026. That is 5 days from now. If you have any systems still running Haiku 3, that migration is more urgent than the June 15 deadline.

The replacement is Haiku 4.5 (claude-haiku-4-5-20251001), which is faster, more capable, and priced comparably. The upgrade path is a model ID swap with no API changes.

Timeline

| Date | Event |
|------|-------|
| April 14, 2026 | Opus 4 and Sonnet 4 deprecation announced |
| April 20, 2026 | Haiku 3 retired (API requests fail) |
| April 30, 2026 | 1M context beta ends for Sonnet 4 and 4.5 |
| June 1, 2026 | Recommended migration deadline for Opus 4 and Sonnet 4 |
| June 15, 2026 | Opus 4 and Sonnet 4 retired (API requests fail) |

The clock is running. Haiku 3 users have days, not weeks. Sonnet long-context users have two weeks. Opus 4 and Sonnet 4 users have two months. Regardless of which bucket you fall into, start your audit this week. The migration is straightforward for most applications, but testing takes time. Do not wait until the deadline.
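For a dashboard or a CI reminder, the remaining time can be computed directly from the dates in the timeline. The sketch below hardcodes those dates and uses this post's publication date (April 15, 2026) as the reference day. `days_remaining` is an illustrative name.

```python
from datetime import date

# Hard deadlines taken from the timeline table in this article.
DEADLINES = {
    "Haiku 3 retired": date(2026, 4, 20),
    "1M context beta ends for Sonnet 4 and 4.5": date(2026, 4, 30),
    "Opus 4 and Sonnet 4 retired": date(2026, 6, 15),
}


def days_remaining(today: date) -> dict[str, int]:
    """Days from `today` until each deadline (negative once it has passed)."""
    return {event: (deadline - today).days for event, deadline in DEADLINES.items()}


for event, days in days_remaining(date(2026, 4, 15)).items():
    print(f"{event}: {days} days")
```

Passing `date.today()` instead of the fixed date turns this into a live countdown.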
