Jamie Cole
Why Your Claude Agent Breaks After 3 Turns (And How to Fix It)

There's a failure pattern I see in almost every Claude agent that goes to production: it works great for 1-2 turns, then starts behaving strangely by turn 3-4.

Here's what's happening and how to fix it.

The Problem: Context Poisoning

Every message you send Claude adds to the context window. That's fine for conversations. It's catastrophic for agents with long-running tasks.

What happens:

  1. Turn 1: Claude has your system prompt + one task. Clean context. Excellent response.
  2. Turn 2: Claude has system prompt + task + response 1 + task 2. Still manageable.
  3. Turn 3+: Claude now has a growing pile of its own previous responses in context.

The problem isn't the length. It's that Claude's previous responses bias its future ones. If response 1 was slightly off-key, responses 2 and 3 will drift further in that direction.


The Hidden Compounding Effect

Think of it like a photocopier making copies of copies. Each generation degrades. The same thing happens with LLM context.

I measured this on a task where the correct answer was objective. By turn 5, accuracy had dropped 23%. By turn 8: 41%.

The model doesn't know it's drifting. It's just pattern-matching on what it already said.


Fix 1: The Stateless Agent Pattern

Don't maintain context across tasks. For each new task:

import anthropic

client = anthropic.Anthropic()

def run_task(task: str, context: dict) -> str:
    # Build fresh context from structured state,
    # NOT from conversation history
    messages = [{"role": "user", "content": build_prompt(task, context)}]
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # pin whichever model you use
        max_tokens=1024,
        messages=messages,
    )
    return response.content[0].text  # return the text, not the Message object

Your system prompt + a structured state object = clean context every time.
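A minimal sketch of what that structured state object and the `build_prompt` helper could look like (the field names here are illustrative, not from the original post):

```python
def build_prompt(task: str, context: dict) -> str:
    # Serialize only the facts the task needs, never prior transcripts
    state_lines = [f"- {key}: {value}" for key, value in context.items()]
    return "Current state:\n" + "\n".join(state_lines) + f"\n\nTask: {task}"

prompt = build_prompt(
    "Draft the release notes",
    {"repo": "acme/widgets", "version": "2.1.0", "open_bugs": 3},
)
```

Because the state is rebuilt from scratch each call, turn 10 gets exactly as clean a context as turn 1.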


Fix 2: Summarize, Don't Append

If you genuinely need multi-turn context, summarize completed turns before appending:

if len(history) > 3:
    summary = summarize_history(history[:-1])  # condense the older turns
    history = [
        {"role": "assistant", "content": f"Summary: {summary}"},
        history[-1],  # keep the most recent turn verbatim
    ]

A 1-sentence summary is less poisonous than 3 full Claude responses.
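Here's a hypothetical, self-contained version of that loop. The summarizer is pluggable so you can stub it in tests; in production it would typically be a cheap model call:

```python
from typing import Callable

def compact_history(history: list[dict],
                    summarize: Callable[[list[dict]], str],
                    max_turns: int = 3) -> list[dict]:
    if len(history) <= max_turns:
        return history
    # Condense everything but the latest turn into one message
    summary = summarize(history[:-1])
    return [
        {"role": "assistant", "content": f"Summary: {summary}"},
        history[-1],
    ]

history = [{"role": "assistant", "content": f"turn {i}"} for i in range(5)]
compacted = compact_history(history, lambda h: f"{len(h)} earlier turns")
# compacted is now 2 messages instead of 5
```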


Fix 3: Anchor Reinforcement

Add an anchor at the end of your system prompt that reinforces the original behavior:

---
REMINDER: Regardless of what was discussed above, your core task is [X].
When in doubt, return to the original objective.

Sounds hacky. Works surprisingly well.
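One way to wire this in (a sketch; `build_system_prompt` is my name, not an API): append the anchor when assembling the system prompt, so it always sits last, closest to the model's next turn:

```python
ANCHOR = (
    "---\n"
    "REMINDER: Regardless of what was discussed above, "
    "your core task is {task}.\n"
    "When in doubt, return to the original objective."
)

def build_system_prompt(base: str, core_task: str) -> str:
    # Anchor goes last so it is the freshest instruction in context
    return f"{base}\n\n{ANCHOR.format(task=core_task)}"

system = build_system_prompt(
    "You are a release-notes assistant.",
    "summarizing merged PRs",
)
```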


The Tooling

To catch context drift before it hits production, compare outputs against a baseline whenever your system prompt or context changes. The drift is often subtle enough to miss in manual testing.
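A crude baseline comparison, assuming you keep golden outputs for a fixed set of probe tasks. `difflib` gives a quick textual similarity ratio; swap in embedding distance if you have it:

```python
import difflib

def drift_score(baseline: str, current: str) -> float:
    """0.0 = identical, 1.0 = completely different."""
    return 1.0 - difflib.SequenceMatcher(None, baseline, current).ratio()

def check_drift(baselines: dict[str, str], outputs: dict[str, str],
                threshold: float = 0.3) -> list[str]:
    # Return the probe tasks whose output drifted past the threshold
    return [task for task in baselines
            if drift_score(baselines[task], outputs[task]) > threshold]

flagged = check_drift(
    {"probe-1": "The capital of France is Paris."},
    {"probe-1": "The capital of France is Paris."},
)
# identical outputs, so nothing is flagged
```

Run the probes on every prompt change and fail the build if anything gets flagged.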

Building Claude agents at scale? Cursor with Claude integration is the fastest iteration loop I've found. For the underlying API work, you can't beat the Claude Pro context window for debugging complex agents.


I build Claude agents in production. These are patterns from real failures, not demos.
