The most unintuitive AI agent lesson I read recently:
Switching to a CHEAPER model mid-conversation can actually increase your costs.
Why?
Because prompt caches are model-specific.
You lose the entire cached context and pay to recompute everything from scratch.
Another wild one:
Adding a single tool mid-session can invalidate 100k+ cached tokens because tools are part of the prompt prefix.
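Both failure modes fall out of the same rule: the cache only hits on an exact prefix match over model + tools + conversation. A minimal conceptual sketch (the provider's real cache is internal; the hashing here and the model/tool names are illustrative assumptions, not the actual implementation):

```python
import hashlib
import json

def cache_key(model: str, tools: list, messages: list) -> str:
    # Illustrative: real prompt caches key on an exact prefix match over
    # model + tools + messages, conceptually like hashing the serialized prefix.
    payload = json.dumps(
        {"model": model, "tools": tools, "messages": messages},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

tools = [{"name": "read_file"}]          # hypothetical tool definition
history = [{"role": "user", "content": "long agent transcript..."}]

k1 = cache_key("big-model", tools, history)
k2 = cache_key("cheap-model", tools, history)               # model switch
k3 = cache_key("big-model", tools + [{"name": "grep"}], history)  # tool added

print(k1 == k2)  # False: cheaper model -> new cache key -> full recompute
print(k1 == k3)  # False: one new tool changes the prefix -> cache miss
```

Same history, same question, and yet either change throws away everything the cache had.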
AI agents are slowly becoming a distributed systems problem disguised as prompting.
Excellent engineering write-up from Anthropic:
https://claude.com/blog/lessons-from-building-claude-code-prompt-caching-is-everything