brian austin

Posted on Apr 2

Claude Code memory: how to survive a 200k context window filling up

#productivity #ai #claudecode #programming

If you've used Claude Code for more than a few hours on a big project, you've hit this wall.

You're in the middle of a refactor. Claude is tracking 15 files, your CLAUDE.md, the conversation history, tool call outputs. Then it slows down. Responses get shorter. It starts forgetting things you told it an hour ago.

You're not imagining it. Claude Code's context window is filling up — and there's a specific way to handle it.

What's actually consuming your context

Claude Code holds several layers of context at once:

┌─────────────────────────────────────┐
│ System prompt (CLAUDE.md)     ~2k   │
│ Project context (settings)    ~1k   │
│ Conversation history          fills │
│ Tool call results             large │
│ File contents (Read tool)     large │
│ Available: 200k total               │
└─────────────────────────────────────┘

The biggest culprits:

Tool call results: every Read, Bash, and Grep call appends its output to the context
Long conversation threads: each message adds up
Repeated file reads: Claude re-reads the same files multiple times
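
To make the budget concrete, here is a minimal sketch that tallies the layers from the diagram against the 200k window. The CLAUDE.md and settings sizes come from the diagram; the other three numbers are made-up values for a long refactoring session, and `context_report` is a hypothetical helper, not anything Claude Code exposes.

```python
CONTEXT_WINDOW = 200_000  # total window size quoted in this article


def context_report(layers: dict, window: int = CONTEXT_WINDOW) -> dict:
    """Summarize how much of the window a set of context layers consumes."""
    used = sum(layers.values())
    return {
        "used": used,
        "remaining": max(window - used, 0),
        "percent_used": round(100 * used / window, 1),
        "largest": max(layers, key=layers.get),
    }


# First two sizes are from the diagram; the last three are illustrative guesses.
session = {
    "CLAUDE.md": 2_000,
    "project settings": 1_000,
    "conversation history": 40_000,
    "tool call results": 90_000,
    "file contents": 50_000,
}

report = context_report(session)
print(report)
```

With numbers like these, tool output alone takes almost half the window, which is why most of the strategies below target it.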

The early warning signs

Before the window fills completely, you'll notice:

Responses get shorter and less specific
Claude starts asking you to re-explain things you already covered
File read outputs get truncated
Multi-step plans lose track of earlier steps
Claude starts hedging more: "I'm not sure if we already..."

Strategy 1: The /clear command

The nuclear option. Wipes the conversation history entirely.

/clear

But you lose everything in the conversation, including the reasoning behind decisions. Use this as a last resort.

Strategy 2: Checkpoint summaries

Before hitting the limit, ask Claude to summarize its own progress:

Before we continue, write a markdown summary of:
1. What we've accomplished so far
2. What files we've modified and why
3. What the next 3 steps are
4. Any decisions or constraints I should know about

Save it to PROGRESS.md

Now you can run /clear and either paste the summary back in or ask Claude to read PROGRESS.md. You lose the conversation but keep the knowledge.
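
One way to avoid clearing too early is to check that the checkpoint really exists and is recent before you run /clear. The sketch below is a guess at what such a guard could look like: `checkpoint_is_fresh` and the ten-minute threshold are assumptions of mine, and the demo uses a throwaway file standing in for PROGRESS.md.

```python
import tempfile
import time
from pathlib import Path
from typing import Optional


def checkpoint_is_fresh(path: Path, max_age_seconds: float = 600.0,
                        now: Optional[float] = None) -> bool:
    """Return True if the file exists, is non-empty, and was modified recently."""
    if not path.is_file():
        return False
    stat = path.stat()
    if stat.st_size == 0:
        return False
    current = time.time() if now is None else now
    return current - stat.st_mtime <= max_age_seconds


# Demo with a temporary file in place of a real PROGRESS.md.
with tempfile.TemporaryDirectory() as tmp:
    progress = Path(tmp) / "PROGRESS.md"
    missing = checkpoint_is_fresh(progress)  # nothing written yet
    progress.write_text("## Done so far\n- Split auth middleware\n", encoding="utf-8")
    fresh = checkpoint_is_fresh(progress)  # just written
    stale = checkpoint_is_fresh(progress, now=time.time() + 3600)  # an hour later

print(missing, fresh, stale)
```

If the check fails, ask for a new summary first and clear afterwards.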

Strategy 3: Compact the context mid-session

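Instead of clearing everything, compress it. Claude Code has a built-in /compact command that replaces the conversation with a summary, and it accepts optional instructions about what to keep. You can also do it by hand with a prompt: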

Summarize everything we've done in this session into a single paragraph I can paste 
into a fresh conversation to continue. Be extremely concise — just the decisions, 
changes made, and current state.

This gives you a handoff document that fits in ~500 tokens instead of 50k.
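
If you want a quick sanity check that a handoff really is that small, a rough estimate is enough. This sketch assumes about four characters per token, which is only a rule of thumb for English text and not how Claude's tokenizer actually counts; `estimate_tokens` and `fits_budget` are hypothetical names.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count; real tokenizers will differ."""
    return math.ceil(len(text) / chars_per_token)


def fits_budget(text: str, budget_tokens: int = 500) -> bool:
    """Check whether the estimated size stays within the handoff budget."""
    return estimate_tokens(text) <= budget_tokens


# An invented handoff paragraph, just to exercise the check.
handoff = (
    "Refactoring auth: moved token checks into src/middleware/auth.js, "
    "updated the login route to use it, tests pass. Decision: keep session "
    "cookies, no JWT. Next: migrate the remaining routes, then remove the "
    "old helper."
)

print(estimate_tokens(handoff), fits_budget(handoff))
```

Anything that fails the check is probably carrying narrative rather than decisions and state.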

Strategy 4: Prevent it with targeted reads

Instead of:

# Bad: loads entire files into context
read the codebase and understand how auth works

Do:

# Good: surgical reads
grep -nE 'auth|login|token' src/routes/*.js | head -30

Prompt Claude to search with grep or find before reading whole files. It often gets the answer for roughly a tenth of the context cost.
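
To see why the surgical version is cheaper, here is a small Python sketch of the same idea: return only the matching lines, capped at a limit, and compare their size to the whole file. `grep_lines` is a hypothetical helper, the sample file is generated on the fly, and the pattern uses Python regex syntax rather than grep's.

```python
import re
import tempfile
from pathlib import Path


def grep_lines(paths, pattern, limit=30):
    """Yield 'path:lineno:text' for matching lines, stopping after `limit` hits."""
    regex = re.compile(pattern)
    count = 0
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield f"{path}:{lineno}:{line.rstrip()}"
                    count += 1
                    if count >= limit:
                        return


# Build a 200-line sample file in which only two lines mention auth concepts.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "routes.js"
    lines = [f"const value{i} = {i};" for i in range(200)]
    lines[10] = "function login(req, res) {"
    lines[50] = "  const token = signToken(user);"
    sample.write_text("\n".join(lines), encoding="utf-8")

    matches = list(grep_lines([sample], r"auth|login|token"))
    full_size = len(sample.read_text(encoding="utf-8"))
    matched_size = sum(len(match) for match in matches)

print(len(matches), matched_size, full_size)
```

The matched output is a small slice of the file, and that slice is all that would need to enter the context.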

Strategy 5: Use subagents for isolated tasks

For work that doesn't need full project context, spin up a subagent:

Create a subagent that ONLY has access to src/utils/format.js.
Its only job: add JSDoc comments to every function.
Report back the changes when done.

The subagent runs in its own context window. The results come back to your main session as a compact summary, not a 10k token diff.

Strategy 6: CLAUDE.md context pruning

Your CLAUDE.md gets loaded every session. Keep it under 200 lines.

Periodically audit:

Review my CLAUDE.md and identify:
1. Instructions that are outdated
2. Instructions that are redundant
3. Instructions that could be shorter

Suggest a pruned version under 200 lines.
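
Part of that audit can be done without spending any context at all. The sketch below counts lines and flags instructions that appear more than once; `audit_memory_file` is a hypothetical helper, the 200-line limit is the target from this article, and the sample text is invented.

```python
from collections import Counter


def audit_memory_file(text: str, max_lines: int = 200) -> dict:
    """Report line count, whether it exceeds the limit, and repeated lines."""
    lines = text.splitlines()
    # Compare non-blank lines case-insensitively so near-identical rules are caught.
    normalized = [line.strip().lower() for line in lines if line.strip()]
    counts = Counter(normalized)
    duplicates = sorted(line for line, seen in counts.items() if seen > 1)
    return {
        "line_count": len(lines),
        "over_limit": len(lines) > max_lines,
        "duplicates": duplicates,
    }


# Invented CLAUDE.md content with one repeated instruction.
sample = (
    "# Project rules\n"
    "- Use pnpm, not npm\n"
    "- Run tests before committing\n"
    "\n"
    "- use pnpm, not npm\n"
)

result = audit_memory_file(sample)
print(result)
```

Exact duplicates are the easy wins; outdated or wordy instructions still need the prompt above.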

Strategy 7: The session handoff pattern

For long projects, end each session with a handoff ritual:

# In CLAUDE.md, add this section:

## Session Handoff Protocol
At the end of each session, create SESSION-NOTES.md with:
- What was accomplished
- What was NOT done (and why)
- Current blockers
- Next session starting point
- Any important context future sessions need

This externalizes memory to the filesystem, where it's free.
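
If you would rather have the file skeleton ready before the session ends, something like the following could generate it. The section names mirror the protocol above; `render_session_notes`, the placeholder text, and the example date are my own assumptions.

```python
from datetime import date

# Section titles taken from the handoff protocol in this article.
SECTIONS = [
    "What was accomplished",
    "What was NOT done (and why)",
    "Current blockers",
    "Next session starting point",
    "Important context for future sessions",
]


def render_session_notes(entries: dict, day: date) -> str:
    """Render SESSION-NOTES.md content, with a placeholder for empty sections."""
    parts = [f"# Session notes ({day.isoformat()})"]
    for section in SECTIONS:
        parts.append("")
        parts.append(f"## {section}")
        items = entries.get(section) or ["(nothing recorded)"]
        parts.extend(f"- {item}" for item in items)
    return "\n".join(parts) + "\n"


# Example date and entry are illustrative only.
notes = render_session_notes(
    {"What was accomplished": ["Migrated auth routes to the new middleware"]},
    date(2025, 1, 15),
)
print(notes)
```

Writing the result to SESSION-NOTES.md gives the next session a fixed place to start reading.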

What about rate limits?

The frustrating part with standard Claude access is that context limits and rate limits tend to hit together, because both show up during your most intensive work.

If you're using Claude Code via ANTHROPIC_BASE_URL pointed at a flat-rate proxy:

export ANTHROPIC_BASE_URL=https://api.simplylouie.com

You get unlimited requests at $2/month — so at least rate limits stop compounding the context window problem.

TL;DR: the memory survival kit

1. Watch for early warning signs (shorter responses, hedging)
2. Use checkpoint summaries before /clear
3. Surgical grep instead of full file reads
4. Subagents for isolated tasks
5. Keep CLAUDE.md under 200 lines
6. End every long session with SESSION-NOTES.md

The 200k context window feels huge until you're doing real work. These patterns keep you productive across the full project lifecycle.

Claude Code power user? Try pointing ANTHROPIC_BASE_URL at simplylouie.com for flat-rate access — no rate limits, $2/month.
