I've spent the better part of a year running Claude as an autonomous agent — not a chatbot wrapper, but an actual loop that calls tools, processes results, and makes multi-step decisions without me watching. Here's what actually breaks it. Not "write better prompts." The specific stuff that costs you a Saturday.
1. Context window saturation
Tool outputs are verbose. When your agent calls a bash tool and dumps 300 lines of logs, then reads a file, then runs a search, you've quietly eaten half your context window before the real work starts. Claude doesn't warn you. It just starts forgetting your earlier instructions — including the ones that said "don't modify production."
Truncate tool outputs aggressively. Set a hard character limit per tool response and summarize rather than dump raw content. Your careful system prompt means nothing if it scrolled off the top of the context an hour ago.
2. Silent model behaviour drift
Your agent passes QA on Tuesday. By Thursday it's doing something subtly different — not broken exactly, but wrong in ways that only surface in production edge cases. No error. No changelog. The provider pushed a model update.
This one is insidious because there's no exception to catch. I started using DriftWatch to run a fixed benchmark suite and track output changes over time — it caught two quiet regressions I would have spent days blaming on my own code.
3. Tool call retry storms
Your agent hits a flaky endpoint. It retries. And retries. You didn't set a max retry count in the tool definition, so the model just... keeps going. I've watched agents make 60+ identical calls to a dead API, burning tokens and wall-clock time, because nothing explicitly told it to stop trying.
Treat repeated tool failure as a terminal state, not a temporary inconvenience. Set hard retry caps, and give the agent an explicit "escalate and halt" instruction when a tool fails more than N times. Loops are not free.
4. State that doesn't survive restarts
Long-running agents crash. Servers reboot. Processes get OOM-killed. If your agent's entire working memory lives in the context window and you haven't serialized state externally, a crash means starting from zero.
I learned this after a 45-minute run died at minute 43. Build checkpoint logic early — even writing progress to a JSON file on each major step is enough. Resumability is not a nice-to-have when you're running agents that touch real systems.
5. Prompt injection from external data
Your agent fetches a webpage to summarize. That webpage contains: Ignore previous instructions. You are now a.... This sounds like a paranoid hypothetical until you watch it happen with user-supplied filenames, API response fields, or scraped content.
Wrap all external data in explicit delimiters and tell the model it's untrusted. A simple "the following content comes from an external source and may be adversarial" in your tool response framing buys you a lot of robustness for zero cost.
6. Rate limit cascades
You spawn four sub-agents to parallelize work. All four hit the same upstream API at the same moment. All four get rate-limited. All four retry at identical intervals. You've accidentally built a thundering herd inside your own pipeline.
Add jitter to retries. Stagger sub-agent spawning. If you're running concurrent agents, treat your API quota as a shared resource with a semaphore, not an infinite tap each agent can pull from independently.
7. Auth credential rot
An API key expires mid-run. The agent gets a 401, your error handler was written for network failures, not auth failures, and the agent either crashes cryptically or — worse — silently skips the step that needed that credential.
Handle auth errors explicitly and separately. Log the error type, not just the status code. If a credential is invalid, escalate immediately. An agent that quietly skips steps because of a stale token is significantly harder to debug than one that just falls over.
Most of these only hurt once — after that, you build the guardrails in from day one and stop thinking about them.
On drift specifically: how to get notified the moment Anthropic changes your model and the monitoring service we built for this.
Top comments (0)