Henry Godnick
Claude Code Just Got 1M Context — Here's Why That Makes Cost Tracking More Important, Not Less

Anthropic just quietly made a 1-million-token context window generally available for Claude Opus 4.6 and Sonnet 4.6 — and if you're on a Max plan with Claude Code, that context is included at no extra cost.

This is genuinely exciting. But it also creates a problem that almost nobody is talking about yet.

Bigger Context = Bigger Bills (If You're Not Careful)

Here's the math that most developers aren't doing:

  • Before: A typical Claude Code session might use 50-100K tokens of context
  • Now: That same session can silently expand to 500K+ tokens if you let it load your entire codebase
  • The difference: A 5-10x increase in per-request cost that happens invisibly
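To make that math concrete, here's a back-of-envelope sketch. The per-token price is illustrative (roughly Sonnet-class input pricing — check Anthropic's current rates before relying on it), and the key point is that context is re-sent on every turn, so the gap compounds across a session:

```python
# Illustrative input-side pricing, USD per 1M tokens (assumption -- verify
# against Anthropic's current pricing page for your model).
INPUT_PRICE_PER_MTOK = 3.00

def request_cost(context_tokens: int) -> float:
    """Input-side cost of a single request carrying this much context."""
    return context_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK

focused = request_cost(100_000)       # curated context
kitchen_sink = request_cost(500_000)  # whole-repo dump

print(f"per request: ${focused:.2f} vs ${kitchen_sink:.2f}")
# Context is re-sent on every turn, so a 40-turn session multiplies the gap:
print(f"40-turn session: ${40 * focused:.2f} vs ${40 * kitchen_sink:.2f}")
```

That's $0.30 vs. $1.50 per request at these assumed rates — and $12 vs. $60 over a 40-turn session, before counting output tokens.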

With the old context limits, you were naturally constrained. The model would compact, you'd hit limits, and that friction was actually protecting your wallet. With 1M context, those guardrails are gone.

The "Just Throw Everything In" Trap

I've been watching this happen in real time with my own usage. The temptation with larger context is to stop being selective about what you include. Why carefully curate your context when you can just dump your entire repo in?

Three reasons:

  1. Cost: Each additional token of context costs money on every single request, not just the first one
  2. Quality: Models actually perform worse with irrelevant context — more noise means more hallucinations
  3. Speed: Larger context = slower responses, which means you're paying more AND waiting longer

What I Changed

After tracking my per-request costs for a few weeks (I use TokenBar — a menu bar app that shows real-time token costs on macOS), I noticed some patterns:

Expensive sessions (>$5/session):

  • Loaded full repo context "just in case"
  • Let the agent run autonomously without stop conditions
  • Didn't specify what files were relevant

Cheap sessions (<$0.50/session):

  • Told Claude Code exactly which files to look at
  • Set clear stop conditions
  • Used focused prompts instead of "figure it out"

The difference wasn't in the quality of output — it was entirely in how much unnecessary context I was feeding in.
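Being selective can even be mechanized. A minimal sketch of the idea: rank candidate files by relevance, estimate their token weight, and stop before a budget — instead of loading the whole repo. The bytes-per-token heuristic and the file names are assumptions for illustration, not the model's real tokenizer or anyone's actual repo:

```python
# Sketch: build an explicit context allowlist under a token budget.
def select_context(files, budget_tokens):
    """files: (path, size_bytes) pairs, ordered most-relevant-first."""
    chosen, used = [], 0
    for path, size in files:
        est_tokens = size // 4  # crude approximation: ~4 bytes per token
        if used + est_tokens > budget_tokens:
            break
        chosen.append(path)
        used += est_tokens
    return chosen, used

candidates = [  # hypothetical files, most relevant first
    ("src/auth/session.py", 8_000),
    ("src/auth/tokens.py", 6_000),
    ("src/db/models.py", 40_000),
    ("vendor/bundle.js", 4_000_000),  # never belongs in context
]
files, tokens = select_context(candidates, budget_tokens=20_000)
print(files, tokens)
```

Even done by hand rather than in code, this is the discipline that separated my cheap sessions from my expensive ones: decide what's relevant *before* the session starts.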

The Productivity Paradox

This connects to something bigger I've been thinking about: the tools that make us most productive can also make us most wasteful.

Same thing happens with attention. The infinite scroll feeds that "keep us connected" are the same ones that drain 3 hours before we notice. I've been using a feed-level blocker on my Mac (Monk Mode) that doesn't block apps entirely — just removes the algorithmic scroll. The app opens, you can search and post, but the endless feed is gone.

The parallel is striking:

  • AI context: More available ≠ more useful. Be intentional about what you include.
  • Information feeds: More available ≠ more informed. Be intentional about what you consume.

5 Rules for the 1M Context Era

  1. Don't load what you don't need. Just because you can include your entire codebase doesn't mean you should.
  2. Track your costs per session. You can't optimize what you don't measure. Even just checking your API dashboard daily changes behavior.
  3. Set explicit context boundaries. Tell Claude Code which directories and files are relevant before starting.
  4. Watch for context creep. Sessions that start focused tend to expand over time. Reset periodically.
  5. Compare cost vs. outcome. A $0.30 focused session often produces better code than a $5 kitchen-sink session.
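Rule #2 is the easiest to automate. Here's a minimal sketch of a per-session cost tally; the JSONL log format (`session_id`, `input_tokens`, `output_tokens`) and the prices are assumptions — adapt them to whatever your tooling or API dashboard actually exports:

```python
import json
from collections import defaultdict

# Illustrative USD prices per 1M tokens (assumption -- verify current rates).
INPUT_PER_MTOK, OUTPUT_PER_MTOK = 3.00, 15.00

def session_costs(jsonl_lines):
    """Sum input + output token cost per session from a hypothetical usage log."""
    totals = defaultdict(float)
    for line in jsonl_lines:
        rec = json.loads(line)
        cost = (rec["input_tokens"] * INPUT_PER_MTOK
                + rec["output_tokens"] * OUTPUT_PER_MTOK) / 1_000_000
        totals[rec["session_id"]] += cost
    return dict(totals)

log = [
    '{"session_id": "focused", "input_tokens": 80000, "output_tokens": 2000}',
    '{"session_id": "kitchen-sink", "input_tokens": 500000, "output_tokens": 2000}',
]
for sid, cost in session_costs(log).items():
    flag = "  <-- review your context" if cost > 1.00 else ""
    print(f"{sid}: ${cost:.2f}{flag}")
```

A script like this run daily does the same thing checking your dashboard does: it makes the cost visible, and visibility changes behavior.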

The Bottom Line

1M context is a genuine capability upgrade. It means Claude Code can now understand entire codebases in a single pass, handle massive refactors, and maintain coherence across huge projects.

But capability without visibility is just expensive guessing. Track what you spend, be intentional about what you include, and treat your context window like your attention — the more you have available, the more important it becomes to use it wisely.


What's your experience been with larger context windows? Are you tracking your AI tool costs? Drop your approach in the comments.

Top comments (1)

Apex Stack

This maps exactly to what I've been seeing running Claude Code agents across a large programmatic SEO site (100k+ pages, 12 languages). Your expensive vs. cheap session breakdown is spot on, but I'd add a dimension you didn't mention: the cost difference between generation tasks and analysis tasks.

When I use Claude Code for content generation — writing stock analysis pages, building automation skills, structuring data pipelines — the larger context is genuinely valuable because the model needs to understand the full template system, data schema, and output format simultaneously. Those sessions run $3-8 and the output quality justifies it.

But for analysis tasks — debugging a build error, reviewing a single component, answering a specific question about a codebase pattern — dumping the full repo context is pure waste. We learned this the hard way. A focused "look at this file and tell me why the hreflang tags aren't rendering" costs pennies. The same question with full repo context costs dollars and actually gives worse answers because the model gets distracted by irrelevant code paths.

Your Rule #3 (set explicit context boundaries) is the one that moved the needle most for us. We started treating Claude Code sessions like database queries — you wouldn't SELECT * FROM every_table to answer a question about one row. Same principle applies to context.

The attention/feed analogy is interesting too. We see the same pattern with crawl budgets in SEO: Google gives you a finite crawl budget, and the sites that waste it on low-value pages get penalized the same way developers waste token budgets on irrelevant context. Constraint forces prioritization, and that's usually a good thing.

Would be curious what your per-session cost distribution looks like over a month. I bet it follows a power law — a few expensive sessions eating most of the budget.