Your AI coding agent has amnesia. Here's how I fixed it across every tool.

#ai #mcp #programming #productivity

Coding agents forget everything between sessions and share nothing across tools. Here's the pattern that fixes it: a memory layer over MCP.

You open Cursor on Monday. It has no clue what you decided on Friday.
So you paste the architecture again. You explain, again, why you're not using that library. You watch it make the exact mistake you corrected last week, fully confident about it. Then you switch
over to Claude Code for some terminal job and do the whole dance again, because Cursor and Claude Code don't share a single thing they know about your project.

Every session starts from zero. Every agent walks in blind.
The model isn't the problem. The models are great. They just have no memory. And the context window isn't memory; it's scratch paper. It gets wiped when the session ends, and you pay tokens
to fill it back up every single time.

"Just use a bigger context window"

That's the reflex. Bigger context, longer system prompt, paste the whole repo in and pray.

It doesn't hold up:
• It resets. Close the session; it's gone.
• It doesn't travel. What Cursor knows and what Claude Code knows are two separate universes.
• It's expensive, and it's lossy. You're paying to re-send the same stuff, and the model still has to dig for the one line that mattered.

Context is what the model is looking at right now. Memory is what it knows over time. Two different things. Agents ship with the first one and act like that's enough.

The actual fix: a memory layer the agent talks to over MCP

Here's the move. Stop shoving memory into the prompt. Put it behind something the agent can read from and write to, and expose that over a protocol the agents already speak: MCP, the Model Context Protocol.

If you've touched Cursor, Claude Code, Cline or Windsurf, you've already used MCP, or you're one config block away from it. It's the open standard these tools use to reach outside data. A memory server is just an MCP server with a couple of tools on it. One to save something, one to pull the right bits back later.

Now the loop flips. Before the agent answers, it grabs what it already knows about your project.

After it does something that matters, it writes that back. The memory sits outside any one tool, so every agent you run pulls from the same place.

Memory isn't one thing

This is the bit most people get wrong. They treat memory as one big pile of text.

It's three things, and you want all three:
• Semantic. What's true about your project? The decisions, the constraints, how the pieces fit
together. A graph, not a transcript.
• Episodic. What happened and when? The timestamped trail. This is the one that answers "wait,
why is it built like this" six weeks later.
• Procedural. How things get done around here. Your conventions, the steps that actually worked, the setup nobody remembers by Friday.

A flat pile of notes gives you none of that cleanly. Splitting it into three is what turns "search my old chats" into something the agent can reason over.

Wiring it up

One config block. Grab a key from platform.octamem.com and point your client at the server.
Cursor, or anything that reads an mcpServers config, in.cursor/mcp.json:

*JSON · .cursor/mcp.json
*{
"mcpServers": {
"octamem": {
"url": "https://mcp.octamem.com",
"headers": {
"Authorization": "Bearer ${OCTAMEM_API_KEY}"
}
}
}
}

Claude Code, one line:

BASH

hosted server, so HTTP transport

claude mcp add --transport http octamem https://mcp.octamem.com \
--header "Authorization: Bearer $OCTAMEM_API_KEY"

add --scope user if you want it in every project, not just this one

claude mcp add --scope user --transport http octamem https://mcp.octamem.com \
--header "Authorization: Bearer $OCTAMEM_API_KEY"

Restart the client, run /mcp to check it is connected, done. The agent's got store and retrieve tools now. After that, it's just prompting. It decides when to write something down and when to pull it back.

That endpoint is OctaMem, the thing I'm building. More on that at the bottom. The pattern works with any MCP memory server, including a scrappy one you write yourself in an afternoon.

The hard parts, because someone in the comments will ask

I'm not going to act like this is solved and easy. The real problems start after it works.

Forgetting. Memory that never forgets is a liability. If you decided on something three months ago and reversed it last week, the agent can't keep quoting the dead version back at you. It has to decay and update, not just hoard.

Trusting what it pulled. When the agent retrieves something, you want to see what it grabbed and why. That gets serious fast the closer you are to a regulated codebase, where "the AI did it" is not
a sentence you want to say to an auditor. Retrieval has to be something you can actually look at.

Access. The moment memory is shared across tools and people, you've got to decide who reads what. A shared brain is great right up until it drops the wrong context into the wrong session.

That last one is most of what I work on.

Where I'm coming from

I'm building OctaMem. Model-agnostic memory layer, talks over MCP, the three memory types and the audit trail baked in.

So yeah, I've got skin in the game, not pretending otherwise.
But the point isn't "use my thing." It's that memory is its own layer. Not the model, not the editor, its own thing. The second you treat it that way, the whole setup calms down. One brain, every tool, holding across time.

So how are you doing this right now? Scripts, a notes file you keep pasting in, an actual MCP server, nothing at all? Curious what's working for people. Tell me in the comments.