For a while, I thought AI memory was basically just…a smarter grep.
Search some files, grab context, send it to the model. That's it.
And to be fair, that works at the beginning. But once your agent starts doing anything even slightly complex, things get weird. It forgets what it just did, repeats mistakes, or confidently breaks something it had already fixed five minutes ago.
At some point it hits you: it's not that the model is bad, it's that the memory model is wrong.
We recorded a super short clip about this if you want the quick version.
The thing that changed how I think about this is realizing that not all “memory” should behave the same way. Most setups just dump everything into context like it's one big pool, but that's exactly what creates the problem. You end up with noisy, expensive context and an agent that still acts like it has amnesia.
What's been working better (at least for me) is thinking of memory more like two separate systems.
On one side, you have something closer to a library, your cached context. Docs, system rules, known structures…things that don't change much. This is the stuff you want pre-loaded and reused efficiently.
On the other side, there's something more like a journal. Not what the agent knows, but what it just did. The last decisions it made, the changes it applied, the mistakes it shouldn't repeat. That's the piece that actually makes the agent feel consistent over time.
Mix those two, and everything gets blurry. Separate them, and suddenly the behavior starts making more sense.
The biggest shift for me was to stop asking:
“How do I give the agent more context?”
And replacing it with:
“What should be remembered, and what should just be reloaded?”
Curious how others are handling this, especially in longer-running agents.
Are you structuring memory already, or still kind of piping everything into context and hoping for the best?