Last week, I watched an AI coding agent make the exact same mistake for the third time.
It reintroduced a bug we’d already fixed, ignored a project convention we’d explained twice, and confidently suggested an architecture decision we had already rejected. None of this was because the model was “bad.” It was because every new session started with amnesia.
If you’re using Claude Code, Cursor, Windsurf, Cline, Aider, or Roo Code, you’ve probably felt this too:
- you restate the same rules every session
- your agent rediscovers old gotchas
- useful fixes stay trapped in chat history
- knowledge from Project A never helps in Project B
That’s the real problem: AI agents are good at reasoning, but terrible at remembering over time.
So we stopped treating memory like “just more prompt context” and gave the agent a persistent knowledge layer through MCP.
The pattern that finally clicked
Instead of stuffing more instructions into a giant system prompt, we split the job in two:
- The agent thinks
- A memory server stores what matters
That means the LLM doesn’t need to permanently “remember” your architecture decisions, bug fixes, coding patterns, or weird deployment gotchas. It just needs a reliable place to retrieve them.
Here’s the basic idea:
┌────────────────┐    MCP tools     ┌────────────────────┐
│ AI Coding      │ ───────────────▶ │ Memory Server      │
│ Agent          │                  │ (persistent graph) │
│ (Claude/Cursor │ ◀─────────────── │ decisions, fixes,  │
│ /Cline/etc.)   │ relevant context │ patterns, gotchas  │
└────────────────┘                  └────────────────────┘
Once we started doing this, the workflow changed:
- after solving a bug, store the fix
- after choosing a pattern, store the reasoning
- before making changes, retrieve related knowledge
- when switching projects, reuse what still applies
That’s what solved the amnesia.
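As a rough illustration of that loop, here is a minimal in-memory sketch in plain JavaScript. The `MemoryStore` class and its `remember` and `recall` methods are hypothetical names made up for this example, not PeKG’s API, and the naive keyword matching only stands in for whatever retrieval a real memory server performs.

```javascript
// Hypothetical sketch: a toy stand-in for a persistent memory server.
// A real server would persist entries and use better retrieval than this.
class MemoryStore {
  constructor() {
    this.entries = [];
  }

  // Store a fix, decision, or gotcha right after it happens.
  remember(kind, text, tags = []) {
    const entry = { id: this.entries.length + 1, kind, text, tags };
    this.entries.push(entry);
    return entry;
  }

  // Return entries whose text or tags mention any word in the query.
  recall(query) {
    const words = query.toLowerCase().split(/\s+/).filter(Boolean);
    return this.entries.filter((entry) => {
      const haystack = [entry.text, ...entry.tags].join(" ").toLowerCase();
      return words.some((word) => haystack.includes(word));
    });
  }
}

const memory = new MemoryStore();
memory.remember("fix", "Token refresh must run before the route guard", ["auth"]);
memory.remember("decision", "Chose BullMQ over raw queues for retry visibility", ["queues"]);

// Before touching auth code, pull back what is already known.
const related = memory.recall("auth middleware");
console.log(related.map((entry) => entry.text));
```

The point of the sketch is the shape of the workflow: writes happen at the moment something is learned, and reads happen before the next change.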
Why MCP is the right place to do this
If your agent supports MCP, memory becomes a tool instead of a hack.
That matters because memory should be:
- searchable
- structured
- reusable across sessions
- available across projects
- separate from any one model vendor
This is also why I like the BYOLLM (bring your own LLM) approach: your agent does the reasoning, while the memory system handles storage and retrieval. You’re not locked into one model just to keep your accumulated knowledge.
If a simpler setup works for you, use it. For some teams, a well-maintained AGENTS.md, CLAUDE.md, or project wiki is enough. But once you’re juggling multiple repos, repeated bugs, and long-running architecture decisions, plain text docs start to break down.
What we ended up using
We built this around PeKG, a personal knowledge graph for MCP-compatible agents.
It stores things like:
- implementation decisions
- codebase patterns
- bug fixes
- “don’t do this again” gotchas
- architecture knowledge
- relationships between concepts, like depends_on, replaces, and conflicts_with
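To make the relationship types concrete, here is a small sketch of typed edges between knowledge entries. Only the three relation names come from the list above; the `KnowledgeGraph` class, its methods, and the entry names are assumptions invented for illustration and say nothing about how PeKG stores its graph.

```javascript
// Hypothetical sketch of typed relationships between knowledge entries.
const RELATIONS = new Set(["depends_on", "replaces", "conflicts_with"]);

class KnowledgeGraph {
  constructor() {
    this.edges = []; // each edge: { from, relation, to }
  }

  // Add a directed, typed edge; reject relation names we do not know.
  link(from, relation, to) {
    if (!RELATIONS.has(relation)) {
      throw new Error(`Unknown relation: ${relation}`);
    }
    this.edges.push({ from, relation, to });
  }

  // Everything that `node` points at through the given relation.
  related(node, relation) {
    return this.edges
      .filter((edge) => edge.from === node && edge.relation === relation)
      .map((edge) => edge.to);
  }

  // True if some other entry has replaced this one.
  isSuperseded(node) {
    return this.edges.some(
      (edge) => edge.relation === "replaces" && edge.to === node
    );
  }
}

const graph = new KnowledgeGraph();
graph.link("bullmq-queue", "replaces", "raw-redis-queue");
graph.link("bullmq-queue", "depends_on", "redis");
graph.link("route-guard", "depends_on", "token-refresh");

console.log(graph.related("bullmq-queue", "depends_on"));
console.log(graph.isSuperseded("raw-redis-queue"));
```

Typed edges are what let an agent ask questions a flat note cannot answer, such as whether a pattern it is about to suggest has already been replaced.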
The useful part isn’t just storage. It’s that the knowledge gets compiled into something the agent can actually use later.
So instead of raw notes like:
“Auth middleware broke because token refresh runs after route guard”
…you end up with structured, searchable knowledge the agent can pull back when it’s working on auth again next week.
It also supports cross-project synthesis, which is more valuable than I expected. If your agent learns a useful retry pattern in one Node service, it can reuse that idea in another project instead of rediscovering it from scratch.
A minimal MCP setup example
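Here’s a simple Node example that connects to the memory server over MCP and lists the tools it exposes. Install the SDK first:

```shell
npm install @modelcontextprotocol/sdk
```

```javascript
// Save as an ES module (for example demo.mjs) so top-level await works.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Launch the memory server as a child process and talk to it over stdio.
const transport = new StdioClientTransport({
  command: "pekg",
  args: ["mcp"],
});

const client = new Client({ name: "memory-demo", version: "1.0.0" });
await client.connect(transport);

// List the tools the server exposes.
const { tools } = await client.listTools();
console.log(tools.map((tool) => tool.name));

// Close the connection when done.
await client.close();
```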
Once connected, your agent can use MCP tools to ingest knowledge, search it, retrieve relevant context, and query relationships in the graph.
PeKG exposes 11 MCP tools for this, including ingestion, graph queries, context retrieval, and deep scans of source files.
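Calling a tool follows the same pattern as listing them. The sketch below is hedged on purpose: `callFirstMatchingTool` is a hypothetical helper, the tool names in the stub are made up rather than taken from PeKG, and the `{ query }` argument shape is an assumption. The only SDK behavior it relies on is that `listTools()` resolves to an object with a `tools` array and that `callTool()` accepts `{ name, arguments }`. A stub client is used so the example runs without a server.

```javascript
// Hypothetical helper: find a tool by keyword and call it.
// `client` is any object with the MCP client's listTools() and callTool() methods.
async function callFirstMatchingTool(client, keyword, args) {
  const { tools } = await client.listTools();
  const tool = tools.find((t) => t.name.toLowerCase().includes(keyword));
  if (!tool) {
    const names = tools.map((t) => t.name).join(", ");
    throw new Error(`No tool matching "${keyword}"; available: ${names}`);
  }
  // callTool takes the tool name plus a plain object of arguments.
  return client.callTool({ name: tool.name, arguments: args });
}

// Stub client so the sketch runs without a server; these tool names are invented.
const stubClient = {
  async listTools() {
    return { tools: [{ name: "ingest_note" }, { name: "search_knowledge" }] };
  },
  async callTool({ name, arguments: args }) {
    const text = `${name} called with ${JSON.stringify(args)}`;
    return { content: [{ type: "text", text }] };
  },
};

const resultPromise = callFirstMatchingTool(stubClient, "search", {
  query: "auth middleware",
});
resultPromise.then((result) => console.log(result.content[0].text));
```

With a real connection, you would pass the connected `client` from the setup example instead of `stubClient`, and check the server’s tool list for the actual names and input schemas before calling anything.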
What made the biggest difference
Three things mattered most:
1. Store decisions, not just facts
The highest-value memory isn’t “Redis is installed.” It’s “We chose BullMQ over raw queues because we needed retry visibility.”
2. Capture gotchas immediately
If you wait until later, the weird details disappear. The best memory entry is the one created right after the issue is solved.
3. Let knowledge compound
The real win is not one saved prompt. It’s when your agent stops repeating old mistakes across dozens of sessions.
That’s where graph-based memory starts outperforming ad hoc notes.
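One way to enforce the first two points is to make the reasoning a required field and to timestamp the entry when it is written. This is only a sketch: `makeDecisionEntry`, its field names, and the `orders-service` project name are hypothetical, chosen to mirror the BullMQ example above.

```javascript
// Hypothetical shape for a decision entry; the field names are illustrative.
function makeDecisionEntry({ chosen, rejected = [], reason, project }) {
  if (!reason || !reason.trim()) {
    // Without its reasoning, a decision is just another fact.
    throw new Error("A decision entry needs a reason");
  }
  const over = rejected.length ? ` over ${rejected.join(", ")}` : "";
  return {
    kind: "decision",
    project,
    title: `Chose ${chosen}${over}`,
    reason: reason.trim(),
    // Timestamp at creation so the entry is captured while details are fresh.
    recordedAt: new Date().toISOString(),
  };
}

const decision = makeDecisionEntry({
  chosen: "BullMQ",
  rejected: ["raw queues"],
  reason: "We needed retry visibility.",
  project: "orders-service",
});
console.log(decision.title);
```

Rejecting entries without a reason is a small constraint, but it keeps the store from filling up with facts the agent could have read from the codebase anyway.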
Try it yourself
If you’re already using an MCP-compatible agent and you’re tired of re-explaining your codebase every session, this is worth testing.
- Check out https://pekg.ai/docs for MCP setup
- See https://pekg.ai/hints.txt for 115 practical tips on capturing and organizing useful knowledge
- Try https://app.pekg.ai — free tier available with 100 articles and 1 project
Free is enough to see whether persistent memory actually changes how your agent works.
Final thought
The models are getting better fast. But better reasoning doesn’t fix missing memory.
If your agent forgets every lesson the moment the session ends, you’re paying an intelligence tax over and over again.
Persistent memory doesn’t make an agent smarter. It makes it stop starting from zero.
How are you handling agent memory today: giant prompts, repo docs, custom RAG, or something else? Drop your approach below.
-- PeKG team
This post was created with AI assistance.