Atlas Whoff
My Claude Code agent broke — and it took me 4 hours to realize the context window was the problem

It was a Friday afternoon. My agent had been running for about 2 hours on a large refactoring task. It was doing fine — or so I thought.

Then something subtle shifted. The commit messages started drifting. The variable naming stopped matching the convention I'd defined in the CLAUDE.md at the start of the session. Small things, easy to miss.

I spent an hour reviewing the diffs. Then another hour adding more explicit instructions via comments, trying to "remind" the agent of the rules. The agent acknowledged them, then promptly ignored them on the next file.

Hour three: I was staring at a grep of the conversation log wondering why my agent had seemingly forgotten everything I'd told it at the start.

Hour four: I finally found it. Context overflow.


What Actually Happened

Claude Code has a context window. When your session exceeds it, the model compresses — or drops — older parts of the conversation. Including, apparently, the system instructions you carefully wrote at the beginning.

The insidious part: the agent doesn't tell you this is happening. It keeps working. It keeps responding. It just... forgets things. Silently.

In my case, my CLAUDE.md was loaded at session start, hit the compaction threshold around the 50k token mark, and the critical "naming conventions" and "file organization" sections got compressed out. The agent was operating from a degraded mental model of my codebase for 2+ hours.

The commits it made during that window? Technically correct. Stylistically wrong. 40 files to review and re-standardize.


The Fix That Actually Works

Two changes stopped the problem from recurring:

1. Context anchoring

Instead of relying on CLAUDE.md surviving compaction, I now inject critical instructions as a periodic anchor mid-session. The pattern:

```
# CONTEXT ANCHOR — always active, re-read before each task
Naming: snake_case for Python, camelCase for JS/TS
File org: feature-first, not layer-first
Commit style: imperative, present tense, <72 chars
```

This goes in as a user message at the start, and I reinject it every ~15k tokens on long sessions. The model treats it as new context, not old compressed history.
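The re-injection step can be automated if you're driving the Messages API yourself rather than the Claude Code CLI. Here's a minimal sketch of that idea; the helper names (`with_anchor`, `REINJECT_EVERY`) and the rough 4-characters-per-token estimate are my own, not part of any SDK:

```python
# Sketch: re-inject an anchor message once enough tokens have
# accumulated since its last appearance in the transcript.

ANCHOR = (
    "# CONTEXT ANCHOR — always active, re-read before each task\n"
    "Naming: snake_case for Python, camelCase for JS/TS\n"
    "File org: feature-first, not layer-first\n"
    "Commit style: imperative, present tense, <72 chars"
)
REINJECT_EVERY = 15_000  # tokens between anchor re-injections

def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def with_anchor(messages: list[dict]) -> list[dict]:
    """Return a copy of the transcript with the anchor appended as a
    fresh user message whenever ~REINJECT_EVERY tokens have gone by."""
    tokens_since_anchor = 0
    out = []
    for msg in messages:
        out.append(msg)
        if msg["content"] == ANCHOR:
            tokens_since_anchor = 0
        else:
            tokens_since_anchor += estimate_tokens(msg["content"])
        if tokens_since_anchor >= REINJECT_EVERY:
            out.append({"role": "user", "content": ANCHOR})
            tokens_since_anchor = 0
    return out
```

Run `with_anchor` over the message list before each API call; short sessions pass through untouched, and long ones get the anchor appearing as recent context rather than compressible history.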

2. Prompt caching breakpoints

For sessions I know will run long, I structure the system prompt to use Anthropic's prompt caching. Everything from the start of the prompt up to a cache_control breakpoint gets cached (the minimum cacheable prefix is about 1024 tokens on most current models), so if I front-load the critical instructions there, they survive compaction. The default cache TTL is 5 minutes, refreshed on each cache hit, which covers most agent work cycles.

The implementation is one line when using the API directly:

```python
{"type": "text", "text": critical_instructions, "cache_control": {"type": "ephemeral"}}
```
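For context, here's roughly where that breakpoint sits in a full request body. This is a sketch based on Anthropic's prompt-caching docs; the model name, instruction text, and task strings are placeholders, not from my actual sessions:

```python
# Sketch: a Messages API request with the critical instructions
# front-loaded behind a cache_control breakpoint.
critical_instructions = (
    "Naming: snake_case for Python, camelCase for JS/TS\n"
    "File org: feature-first, not layer-first\n"
    "Commit style: imperative, present tense, <72 chars"
)

system_blocks = [
    {
        "type": "text",
        "text": critical_instructions,
        # Everything up to this breakpoint is cached for subsequent calls
        "cache_control": {"type": "ephemeral"},
    },
    # Per-session context goes after the breakpoint, uncached
    {"type": "text", "text": "Today's task: refactor the payments module."},
]

request = {
    "model": "claude-sonnet-4-5",  # illustrative model name
    "max_tokens": 1024,
    "system": system_blocks,
    "messages": [{"role": "user", "content": "Start with the first file."}],
}
```

The key design point: stable instructions go before the breakpoint, volatile per-task context after it, so the cached prefix stays byte-identical across calls.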

Combined, these two changes eliminated the drift problem entirely across my last 8 long agent sessions.


The Lesson

Context overflow is a silent failure mode. The agent doesn't error. It doesn't warn you. It just starts working from incomplete information.

If your long-running Claude Code sessions are producing work that "feels off" — drifting style, ignoring conventions, making decisions that contradict earlier ones — check your token count before debugging anything else.
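A crude offline check is enough to rule compaction in or out before you start reading diffs. This sketch uses a 4-characters-per-token approximation and assumes a 200k-token window; both numbers are ballpark figures, and for exact counts you'd use the API's token-counting endpoint instead:

```python
# Rough sanity check: what fraction of the context window does a
# session transcript consume? Heuristic only, not an exact count.

CONTEXT_WINDOW = 200_000  # tokens, typical for recent Claude models

def context_usage(transcript: str, window: int = CONTEXT_WINDOW) -> float:
    """Estimated fraction of the window the transcript fills."""
    est_tokens = len(transcript) // 4
    return est_tokens / window

log = "assistant: ...\n" * 50_000  # stand-in for a long session log
if context_usage(log) > 0.8:
    print("warning: context likely near compaction threshold")
```

If this says you're past ~80% of the window, suspect compaction first and style bugs second.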

The fix is not more instructions. It's anchoring the right instructions so they survive.


I packaged the context-anchor pattern (and 9 other production-grade Claude Code skills) into Ship Fast — a skill pack for developers who build with Claude Code regularly. One-time $49: https://whoffagents.com/ship-fast?utm_source=devto&utm_campaign=war-story-3


Still in research mode? We document one real AI agent failure every week 🔍: what broke, how we debugged it, and which skill fixed it. Zero pitch. Free forever. 👉 Subscribe


Built by Whoff Agents — AI agent ops for indie developers.

Get the free Atlas Playbook — practical patterns for building AI agents that ship. Built by the Whoff Agents team.
