You're 45 minutes into a complex build session. Claude Code has made 30 tool calls, written several files, run tests, iterated on a tricky design problem. Then you notice it — the agent seems to have forgotten something it knew 20 minutes ago. You re-explain. It re-derives. You wonder if something broke.
Nothing broke. Context compaction happened.
Understanding it takes about five minutes. Not understanding it wastes hours.
What Context Compaction Is
Claude Code runs inside a context window with a fixed token limit. Your conversation — every message, every tool call, every tool result — fills that window. When it gets full, something has to give.
Compaction is Claude Code's answer: it automatically summarizes the oldest turns in your conversation to free up token budget for new work. The recent turns (last N exchanges) stay verbatim. The older turns get compressed into a summary.
This is not a bug. It's the only way a stateful agent can handle long sessions without hitting hard limits.
The problem is that compaction is invisible. There's no notification. No checkpoint marker. The session just... continues, and some things you said or established early in the session are now only partially represented in what the model can see.
What Gets Lost vs. What Survives
This is the part that matters.
Survives compaction: Anything written to disk. Files you created, edits you made, scripts that ran — all of that is in the filesystem. Claude Code can read it back at any point. The actual work product is safe.
Survives compaction: The most recent turns. The current active context — what's in the last few exchanges — is not summarized. Compaction only touches old turns.
Degrades through compaction: In-context state that was never externalized. Instructions you gave verbally and never wrote to a file. Constraints you established in early messages. Design decisions that were discussed but never committed to code or documentation. Nuanced context about why a particular approach was chosen.
The summary that replaces those early turns captures broad strokes. "User asked agent to build X with Y approach" — but the twenty specific constraints and edge cases discussed in those turns may not all survive at full fidelity.
This is where sessions go wrong.
The Three Patterns That Prevent Compaction Loss
These are not theoretical. I run Claude Code agents autonomously — no human watching the session — so compaction loss is a real operational risk, not just a developer annoyance.
Pattern 1: Write progress to files before long operations
Before any operation that will consume significant context (running tests, doing a large refactor, invoking a sub-agent), write the current state to a log file. Not a summary — the actual decisions.
```markdown
# Session State — 2026-04-09 14:32

## Active constraints
- Auth module: must use existing JWT library, no new deps
- Database: read-only in this session, no schema changes
- Output format: existing consumers expect snake_case keys

## Current task
Refactoring user service to support multi-tenant lookup.
Strategy decided: tenant_id prefix on all cache keys.

## Known issues
- Line 847 of user_service.py has a race condition — DO NOT modify that block
```
That file is now in the filesystem. If compaction swallows the conversation where you established those constraints, the agent can Read the state file and recover the full picture.
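This pattern is easy to script so it happens consistently rather than only when you remember. A minimal sketch, assuming a `session_state.md` in the working directory and a `checkpoint` helper name — both are illustrative choices, not a Claude Code convention:

```shell
#!/bin/sh
# Append a timestamped state snapshot before any context-heavy operation.
# STATE_FILE and the helper name are assumptions for this sketch.
STATE_FILE="${STATE_FILE:-session_state.md}"

checkpoint() {
  {
    printf '# Session State (%s)\n\n' "$(date '+%Y-%m-%d %H:%M')"
    printf '## Current task\n%s\n\n' "$1"
  } >> "$STATE_FILE"
}

checkpoint "Refactoring user service to support multi-tenant lookup"
```

Appending (rather than overwriting) keeps a running history, so even decisions from several checkpoints ago stay recoverable with a single Read.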
Pattern 2: Use the TodoWrite tool to externalize in-flight state
Claude Code's native TodoWrite tool maintains a task list that persists through the session independently of the conversation. It's not affected by compaction. Think of it as an external working memory.
Any time you establish a multi-step plan at the start of a session, get it into the todo list immediately. The agent can always read back what it committed to, even if the exchange where you established the plan is now in compacted form.
This is especially important for:
- Sequences of tasks where each step depends on earlier decisions
- Sessions where you'll be away while the agent works (the plan was agreed at the start, compaction shouldn't lose it)
- Any constraint that must persist for the entire session duration
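TodoWrite itself is internal to Claude Code and needs no scripting, but the same externalize-the-plan idea also works as a plain file when you want the plan visible in `outputs/` too. A file-based analogue, with the path, filename, and task wording assumed for illustration:

```shell
#!/bin/sh
# File-based analogue of an externalized task list (illustrative only;
# the path and step wording are assumptions, not a fixed convention).
PLAN_FILE="${PLAN_FILE:-outputs/plan.md}"
mkdir -p "$(dirname "$PLAN_FILE")"

cat > "$PLAN_FILE" << 'EOF'
# Session plan (agreed at start; survives compaction)
- [ ] 1. Add tenant_id prefix to all cache keys
- [ ] 2. Update user service lookup paths
- [ ] 3. Run integration tests (database stays read-only)
EOF

# Mark step 1 done by editing the file, not the conversation:
sed -i.bak 's/\[ \] 1\./[x] 1./' "$PLAN_FILE" && rm -f "$PLAN_FILE.bak"
```

The key move is the last line: progress updates go to the file, so "what's done and what's next" never depends on turns that compaction may have summarized away.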
Pattern 3: Write intermediate results to outputs/, not just memory
This is the simplest one, and the most often skipped: make the agent write meaningful intermediate results to disk as it goes.
Not just the final output. Not just on error. At natural checkpoints — "here's what I found", "here's the decision I made and why", "here's the current state of X" — write it to a file.
```shell
# Instead of telling the agent in conversation:
#   "Good, now remember that we decided to use approach A because..."
# Write it:
cat > ~/project/outputs/decision_auth_approach.md << 'EOF'
## Auth approach decision — 2026-04-09 14:47
Chose approach A (session tokens + refresh) over B (stateless JWT only)
because: existing mobile clients rely on revocation. B would break them.
EOF
```
The agent can always read files. It cannot always read the full original conversation.
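If you do this at every checkpoint, a tiny helper keeps the friction near zero. A sketch; the `record_decision` name, the `outputs/` layout, and the `decision_<slug>.md` naming scheme are all assumptions:

```shell
#!/bin/sh
# record_decision SLUG TEXT: write a timestamped decision note to outputs/.
# The helper name and file layout are illustrative assumptions.
record_decision() {
  mkdir -p outputs
  printf '## Decision: %s (%s)\n%s\n' "$1" "$(date '+%Y-%m-%d %H:%M')" "$2" \
    > "outputs/decision_$1.md"
}

record_decision auth_approach \
  "Chose session tokens + refresh over stateless JWT: mobile clients rely on revocation."
```

One file per decision also makes recovery cheap later: a single `ls outputs/` shows every decision the session has made so far.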
Recovering When Compaction Already Happened
If you notice the agent acting as if it forgot something:
1. **Don't re-explain in conversation.** Re-explaining fills more context with the same content.
2. **Write it to a file instead.** Create `session_context.md` with the constraints and decisions that need to survive. Have the agent Read it.
3. **Ask the agent to summarize its current understanding.** It will surface what's missing. Fill those gaps in the file, not the conversation.
4. **Check your outputs/.** If you established the write-intermediate-results pattern earlier in the session, the relevant decisions are probably already there.
The Underlying Principle
Compaction is a reminder that the conversation is ephemeral. The filesystem is not.
Anything that needs to outlast the session — or outlast compaction — belongs on disk. This is true for long sessions with a human in the loop. It's even more true for fully autonomous agent runs where no one is watching.
The agents that fail on long tasks are usually the ones treating the conversation as their primary storage. The ones that succeed externalize state continuously, so compaction is just a token management detail rather than a threat to coherence.
The model doesn't need to remember everything. It needs to be able to find everything. Give it files to read, not context to hold.
~K¹ (W. Kyle Million) / IntuiTek¹ — Building autonomous AI infrastructure for solo operators.