Anthropic just made their entire 1M token context window generally available on Claude Opus 4.6 and Sonnet 4.6 — and they did it without charging a premium. No special pricing tier. No beta header. No asterisks.
That's not a typo. A 900,000-token request costs the same per-token rate as a 9,000-token one.
If you've been working around context window limits — chunking documents, summarizing intermediate results, losing critical details to compaction — this changes your workflow. Not in a vague "AI is getting better" way. In a "you can now feed your entire codebase into a single prompt" way.
This guide breaks down what 1M tokens actually means in practice, why the pricing matters more than the number, where this is genuinely useful (and where it's overkill), and how Claude stacks up against the competition.
What 1M Tokens Actually Means
Numbers without context are meaningless. Here's what a million tokens translates to in the real world.
In pages: Roughly 3,000 pages of standard text. That's about twelve 250-page books loaded into a single conversation.
In code: Approximately 50,000–75,000 lines of code, depending on the language. That's a substantial production codebase — not a toy project, but the kind of repo where you've got multiple services, shared libraries, and configuration files that all interact.
In documents: A full set of legal contracts for a mid-size acquisition. The complete documentation for a major open-source project. Every SEC filing a company has made in the last five years.
In conversation: Hours of agent interaction — tool calls, observations, intermediate reasoning, results — all kept intact without compaction throwing away details you'll need later.
The previous 200K limit was already impressive, but it forced real tradeoffs. You could analyze parts of a codebase, sections of a legal document, portions of a research corpus. Now you can load the whole thing.
The Pricing Move Nobody's Talking About
Here's where it gets interesting. Anthropic didn't just increase the context window — they eliminated the long-context premium entirely.
Claude Opus 4.6: $5 per million input tokens, $25 per million output tokens. At every context length.
Claude Sonnet 4.6: $3 per million input tokens, $15 per million output tokens. At every context length.
That means filling a 900K-token context window with Sonnet costs you $2.70 in input tokens. With Opus, it's $4.50. That's it. No 2x multiplier for crossing 128K. No special "extended context" pricing tier. No gotchas.
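The arithmetic is worth sanity-checking. A minimal sketch using the rates quoted above (the function and constant names are just for illustration):

```python
def input_cost_usd(tokens: int, rate_per_mtok: float) -> float:
    """Flat per-token input cost: no long-context multiplier at any length."""
    return tokens / 1_000_000 * rate_per_mtok

# Rates quoted above, in USD per million input tokens.
SONNET_46_INPUT = 3.00
OPUS_46_INPUT = 5.00

print(f"900K tokens on Sonnet: ${input_cost_usd(900_000, SONNET_46_INPUT):.2f}")  # $2.70
print(f"900K tokens on Opus:   ${input_cost_usd(900_000, OPUS_46_INPUT):.2f}")    # $4.50
```

Because the rate is flat, doubling the context exactly doubles the input cost, which makes budgeting trivial compared with tiered pricing.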
For comparison, many providers have historically charged premiums for long-context requests — sometimes doubling the per-token rate once you cross certain thresholds. Anthropic is signaling that long context is a commodity, not a luxury feature.
The media limits got a major bump too: up to 600 images or PDF pages per request, a 6x increase from the previous 100-page limit. If you work with document-heavy workflows — legal review, financial analysis, research synthesis — that limit was a real bottleneck. Now it's mostly gone.
For teams using Claude Code, 1M context is included automatically for Max, Team, and Enterprise users running Opus 4.6. No extra usage charges.
The Benchmark That Actually Matters
Context length is a vanity metric if the model can't actually use the context it's given. We've all seen models that technically accept 128K tokens but start hallucinating or losing details well before they hit the limit.
Anthropic published results on MRCR v2 (Multi-round Co-reference Resolution), a long-context retrieval benchmark, and the numbers tell a compelling story.
Claude Opus 4.6 scored 78.3% retrieval accuracy at the full 1M token context length.
For comparison, Gemini 3.1 scored 25.9% on the same benchmark at that context length. That's not a marginal difference — it's a 3x gap in the model's ability to find and reason about specific information buried deep in a massive context.
More importantly, Anthropic reports that the degradation curve is linear, not a cliff. Most models hit a point where accuracy falls off sharply — maybe they're fine at 100K but unusable at 500K. Opus 4.6 degrades gradually, which means you can actually predict and work with its limitations rather than hitting a sudden wall.
This was the dominant topic on Hacker News when it launched, pulling over 1,100 points and nearly 500 comments.
5 Use Cases Where 1M Context Actually Matters
Long context isn't universally useful. For a quick question-and-answer, 1M tokens is overkill. But for these workflows, it's transformative.
1. Full Codebase Analysis and Migration
Load your entire repository — source files, tests, configuration, documentation — into a single context. Ask Claude to find every place a deprecated API is used, trace data flow across services, or plan a migration from one framework to another with full awareness of every file that needs to change.
Before 1M context, this required chunking the codebase into pieces, which meant the model never had the full picture. Cross-file dependencies got missed. Migration plans had gaps. Now you can do it in a single pass.
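A minimal sketch of what "single pass" looks like in practice: walk the repo, concatenate files with explicit separators, and estimate whether the result fits. The extension filter and the 4-characters-per-token rule are rough assumptions, not exact tokenizer counts:

```python
from pathlib import Path

def load_repo_as_prompt(root: str, extensions=(".py", ".md", ".toml")) -> str:
    """Concatenate a repo's text files into one prompt body, with a
    per-file separator so the model can cite paths accurately."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in extensions:
            rel = path.relative_to(root)
            parts.append(f"--- File: {rel} ---\n{path.read_text(errors='replace')}")
    return "\n\n".join(parts)

def rough_token_estimate(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English and code."""
    return len(text) // 4
```

If `rough_token_estimate` comes back comfortably under 1,000,000, the whole repo fits in one request; otherwise you're back to choosing which services to include.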
2. Legal Document Processing
Law firms and legal tech companies are some of the earliest adopters of long context. A single acquisition can generate thousands of pages of contracts, due diligence documents, and correspondence. Previously, reviewing these meant either summarizing sections (losing nuance) or processing them in chunks (losing cross-document references).
With 1M tokens and 600-page PDF support, you can load an entire deal room into a single conversation. Ask Claude to find conflicting terms across agreements, identify unusual clauses, or build a complete summary that references specific page numbers.
3. Agent Workflows That Run for Hours
If you're building AI agents, the context window is your biggest constraint. An agent that searches databases, reads documentation, makes tool calls, and iterates on solutions can burn through 100K tokens before it's halfway done. Then compaction kicks in, and the agent forgets what it learned.
With 1M context, agents can run longer, explore more, and maintain full awareness of everything they've done. As one engineer put it: "With 1M context, I search, re-search, aggregate edge cases, and propose fixes — all in one window."
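One practical consequence: you can budget the window explicitly instead of reacting to compaction. A sketch of a turn-level tracker (names are illustrative; in a real agent the token counts would come from the API's usage fields):

```python
class ContextBudget:
    """Track cumulative tokens across agent turns against a context limit."""

    def __init__(self, limit: int = 1_000_000):
        self.limit = limit
        self.used = 0

    def add_turn(self, prompt_tokens: int, completion_tokens: int) -> None:
        """Record one model turn's token usage."""
        self.used += prompt_tokens + completion_tokens

    @property
    def remaining(self) -> int:
        return max(self.limit - self.used, 0)

    def fits(self, next_turn_tokens: int) -> bool:
        """Will the next turn fit without compaction?"""
        return next_turn_tokens <= self.remaining
```

After an 82K-token turn, a fresh 1M budget still leaves 918K of headroom; at the old 200K limit, the same turn would have consumed over 40% of the window.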
4. Research Synthesis Across Hundreds of Papers
Academic researchers and R&D teams can load hundreds of papers, proofs, and datasets into a single session. Instead of asking Claude about one paper at a time — and losing the connections between them — you can ask questions that span the entire corpus.
"Which papers contradict this finding?" "What methodological gaps exist across these 50 studies?" "Synthesize the evidence for and against this hypothesis from everything I've loaded."
5. Repository-Level Documentation Generation
Feed Claude your entire codebase plus existing docs, READMEs, and comments. Ask it to generate comprehensive documentation that's actually consistent with the code — not the hallucinated version you get when the model can only see a few files at a time.
How Claude Compares to the Competition
Context window length is one of the most marketing-inflated specs in AI. Here's how the major players actually stack up.
| Model | Context Window | MRCR v2 Retrieval (1M) | Input Pricing (per MTok) | Long-Context Premium |
|---|---|---|---|---|
| Claude Opus 4.6 | 1M tokens | 78.3% | $5 | None |
| Claude Sonnet 4.6 | 1M tokens | — | $3 | None |
| GPT-5.4 (OpenAI) | 1.05M tokens | 36.6% | $2.50 ($5 above 272K) | 2x input / 1.5x output above 272K |
| Gemini 3.1 (Google) | 2M tokens | 25.9% | $1.25–$5 | Varies |
The honest comparison isn't just the headline number. It's usable context: the amount of information the model can actually find, reference, and reason about. All three frontier providers now offer roughly 1M+ context windows, but by that measure retrieval quality varies dramatically, and Claude's 1M currently leads the field.
Prompting Tips for Long Context
- **Put the question first, then the context.** Models tend to perform better when they know what they're looking for before they start processing the context.
- **Use clear document boundaries.** When loading multiple files or documents, use explicit separators with metadata. Something like `--- Document: contract_v3.pdf (pages 1-47) ---` helps the model organize and reference the content accurately.
- **Be specific about what you want referenced.** "Summarize this" is worse than "Identify the five most significant risks in these contracts, citing specific clause numbers and page references."
- **Don't dump everything just because you can.** More context isn't always better. If your question only requires three files, loading three files will give you better results than loading thirty.
- **Use structured output requests for large analyses.** When analyzing large document sets, ask for structured output: numbered findings, categorized issues, referenced sources.
- **Iterate within the same session.** One of the biggest advantages of long context is that follow-up questions retain the full context. Ask your initial question, then drill down.
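The first two tips can be sketched as a single prompt builder: question first, then each document wrapped in explicit boundaries (the separator format here is just one workable convention, not an official one):

```python
def build_prompt(question: str, documents: dict[str, str]) -> str:
    """Question first, then clearly delimited documents."""
    sections = [question]
    for name, text in documents.items():
        sections.append(f"--- Document: {name} ---\n{text}\n--- End: {name} ---")
    return "\n\n".join(sections)
```

The `End` marker matters at this scale: with dozens of documents in one window, unambiguous boundaries make it much easier for the model to attribute a clause to the right file.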
Getting Started
1M context is available today through the Claude API — no beta header required.
Where it's available:
- Claude Platform (direct)
- Amazon Bedrock
- Google Cloud's Vertex AI
- Microsoft Azure Foundry
- Claude Code (Max, Team, and Enterprise with Opus 4.6)
If you're already using Claude's API, you're done. Requests over 200K tokens now work automatically.
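A request over 200K tokens is assembled like any other Messages API call. The helper below is a stdlib-only sketch that builds the request kwargs; the model ID is a placeholder, so check Anthropic's current model list for the exact string:

```python
def long_context_request(model: str, system: str, user_content: str,
                         max_tokens: int = 4096) -> dict:
    """Assemble kwargs for the Messages API. Requests over 200K input
    tokens need no special beta header or extra parameter."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": system,
        "messages": [{"role": "user", "content": user_content}],
    }

# With the official SDK, the kwargs pass straight through:
# kwargs = long_context_request("<current-opus-model-id>", "You are a code reviewer.", big_prompt)
# client = anthropic.Anthropic()             # reads ANTHROPIC_API_KEY from the environment
# response = client.messages.create(**kwargs)
```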
The Bottom Line
The 1M context window is impressive, but it's not the real story. The real story is the pricing. By eliminating the long-context premium, Anthropic is telling the market that massive context should be a standard feature, not a premium upsell.
That's a competitive move that forces everyone else to respond. OpenAI's GPT-5.4 matches the 1M context length but charges premium pricing above 272K tokens and trails significantly on retrieval accuracy (36.6% vs 78.3%). Google's 2M window needs to answer the retrieval quality question (25.9%).
Whether you're processing legal documents, analyzing codebases, running long-lived agents, or synthesizing research — 1M tokens at standard pricing changes what's practical to build.
The context window arms race isn't over. But for right now, Claude just set the bar.