After using Cursor and Claude Code daily, I’ve noticed that when an AI coding agent drifts or forgets constraints, we tend to assume it’s a model limitation.
In many cases, it’s really a context-management problem.
A few observations:
- Tokens are not just a hard limit. Every token competes for the model’s attention.
- Attention dilution sets in well before you hit the window limit.
- Coding tasks degrade faster than chat because of dependency density and multi-representation juggling (diffs, logs, tests).
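To make the "attention competition" point concrete, here's a back-of-envelope accounting of what a single coding turn can drag into the window. The window size and per-artifact token counts are made up purely for illustration:

```python
# Rough sketch of why coding sessions fill the window quickly: each task
# pulls in several representations at once. All numbers are illustrative.
CONTEXT_WINDOW = 200_000  # tokens; assumed window size

artifacts = {
    "system prompt + contract": 2_000,
    "relevant source files": 40_000,
    "latest diff": 8_000,
    "failing test output": 15_000,
    "build/run logs": 25_000,
    "conversation history": 30_000,
}

used = sum(artifacts.values())
print(f"{used:,} / {CONTEXT_WINDOW:,} tokens ({used / CONTEXT_WINDOW:.0%})")
```

Even with generous headroom left, more than half the window is already spoken for before the model writes a line of code, and everything in it is competing for attention.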
I started managing context deliberately:
- Always write a contract
- Chunk sessions by intent
- Snapshot state and restart
- Prefer on-demand CLI calls over preloading large MCP responses
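The last point can be sketched in a few lines. This is a hypothetical helper, not a real agent API: run a CLI command only when the agent actually needs the result, and cap how much of the output enters the context instead of preloading everything up front:

```python
import subprocess

def fetch_on_demand(cmd: list[str], max_chars: int = 4_000) -> str:
    """Run a CLI command lazily and truncate its output to a budget,
    rather than preloading a large tool response into the context."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    output = result.stdout
    if len(output) > max_chars:
        # Keep the head and tail; the middle of long logs is usually
        # the least informative part.
        half = max_chars // 2
        output = output[:half] + "\n...[truncated]...\n" + output[-half:]
    return output
```

The budget (`max_chars`) and the head/tail truncation strategy are assumptions; the point is that retrieval happens at the moment of need, sized to fit.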
Managing context this way dramatically improved the agent’s stability.
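"Snapshot state and restart" can be as simple as persisting a compact summary and seeding a fresh session with it. A minimal sketch, with hypothetical function names and fields:

```python
import datetime
import json

def snapshot_state(path: str, goal: str, decisions: list[str],
                   open_items: list[str]) -> None:
    """Persist a compact session summary so a fresh context can be
    seeded without replaying the full history."""
    state = {
        "saved_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "goal": goal,
        "decisions": decisions,
        "open_items": open_items,
    }
    with open(path, "w") as f:
        json.dump(state, f, indent=2)

def restore_prompt(path: str) -> str:
    """Turn a saved snapshot into a short prompt for a new session."""
    with open(path) as f:
        state = json.load(f)
    return (
        f"Resuming task: {state['goal']}\n"
        f"Decisions so far: {'; '.join(state['decisions'])}\n"
        f"Open items: {'; '.join(state['open_items'])}"
    )
```

A few hundred tokens of distilled state usually carries the session forward better than tens of thousands of tokens of raw history.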
Curious how others are handling context optimization.
I also wrote a detailed breakdown of:
- How tokens and context windows actually affect stability
- Why coding degrades faster
- A practical context stack model
- Why on-demand CLI retrieval is often more context-efficient
Top comments (1)
Hey, stumbled upon your post while doing research on this topic.
I'm observing it more and more: context engineering is where we need to put the effort to make agents efficient. And it works! But it's a very manual and somewhat "magic" process.
I'm thinking about a way to have something more dynamic that incorporates both the temporal aspect of context change and a sort of confidence gradient for the data in the context.
Wrote about it here if you're interested medium.com/@a.mandyev/the-missing-...