Last week, a coding agent in a test repo did something weird: it opened the right files, referenced the wrong API version, and confidently wrote code for a migration we had already rolled back.
Nothing was “broken” in the usual sense. The prompts were fine. The tools were available. The model was good.
The problem was memory drift.
If you’ve built anything with long-running agents, you’ve probably seen it too: the agent starts strong, then gradually retrieves stale facts, outdated decisions, or half-relevant chunks from old work. Over time, its “memory” turns into a confidence amplifier for bad context.
A lot of teams try to solve this with a bigger vector store. That helps… until it doesn’t.
The real issue: vector stores decay quietly
Vector stores are great for fuzzy retrieval. If your agent needs “something similar to this design doc” or “the auth code near this endpoint,” embeddings are useful.
But agent memory is not just similarity search.
It’s often:
- what changed
- what supersedes what
- who approved a decision
- which fact is still valid
- what depends on what
- what should never be forgotten
That’s where vector-only memory starts to decay.
A simple example
Suppose your agent stores these facts over time:
- JWT auth is used for internal APIs
- Moved to mTLS for service-to-service auth
- JWT still used for browser sessions
- Deprecated auth middleware in v3
- Hotfix restored old middleware for admin routes
A vector store can retrieve “similar auth-related stuff,” but it won’t naturally answer:
- which statement is the latest truth?
- which fact overrides another?
- which context applies only to admin routes?
- which decision was temporary?
That’s not an embedding problem. That’s a relationship problem.
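To make that concrete, here's a plain-Node sketch of those same five facts once the relationships are explicit. The fact IDs, the `scope` values, and the `supersededBy` field are all invented for illustration, not a prescribed schema:

```javascript
// Each fact carries a scope, a timestamp, and an explicit pointer to the
// fact that replaced it, instead of living as free text in an index.
const facts = [
  { id: "f1", text: "JWT auth is used for internal APIs", scope: "internal", ts: 1, supersededBy: "f2" },
  { id: "f2", text: "Moved to mTLS for service-to-service auth", scope: "internal", ts: 2, supersededBy: null },
  { id: "f3", text: "JWT still used for browser sessions", scope: "browser", ts: 2, supersededBy: null },
  { id: "f4", text: "Deprecated auth middleware in v3", scope: "internal", ts: 3, supersededBy: "f5" },
  { id: "f5", text: "Hotfix restored old middleware for admin routes", scope: "admin", ts: 4, supersededBy: null },
];

// "Latest truth" for a scope = facts in that scope that nothing supersedes.
function latestTruth(scope) {
  return facts.filter((f) => f.scope === scope && f.supersededBy === null);
}

console.log(latestTruth("internal").map((f) => f.text));
// => [ 'Moved to mTLS for service-to-service auth' ]
```

A similarity search over the raw sentences can't produce that answer, because "latest" lives in the links, not in the text.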
Knowledge graphs don’t replace vectors — they constrain them
The best pattern I’ve seen is:
- vector store for recall
- knowledge graph for truth maintenance
Think of it like this:
```
User query
    |
    v
[Vector Search] ---> finds possibly relevant notes/docs/chunks
    |
    v
[Knowledge Graph] ---> resolves relationships:
                       - supersedes
                       - depends_on
                       - approved_by
                       - valid_for
                       - expires_at
    |
    v
[LLM Context] ---> smaller, fresher, less contradictory
```
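The pipeline above can be sketched in a few lines. Here `vectorSearch` is a stub standing in for a real vector index, and the `supersedes` map stands in for the graph store; the IDs are made up:

```javascript
// Stub: pretend the vector index returned these chunk IDs, ranked by similarity.
function vectorSearch(query) {
  return ["auth_v1", "auth_v2", "old_migration_note"];
}

// Graph knowledge: which fact supersedes which.
const supersedes = new Map([["auth_v2", "auth_v1"]]);

// Stage 2: let the graph veto what similarity search recalled.
function buildContext(query) {
  const hits = vectorSearch(query);
  const replaced = new Set(supersedes.values());
  // Drop anything the graph says has been superseded.
  return hits.filter((id) => !replaced.has(id));
}

console.log(buildContext("internal auth"));
// => [ 'auth_v2', 'old_migration_note' ]
```

The point is the division of labor: the vector side is allowed to over-recall, and the graph side is responsible for removing contradictions before anything reaches the model.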
A knowledge graph gives your system structure around memory:
- entities: services, APIs, users, incidents, tasks
- edges: `supersedes`, `blocked_by`, `owned_by`, `approved_by`
- timestamps: when a fact became true
- scope: where that fact applies
- confidence: whether it’s canonical or provisional
Instead of asking “what text looks similar?”, you can ask:
- “What is the current auth method for internal APIs?”
- “What decision replaced this one?”
- “Which open task depends on this migration?”
- “What facts are stale after last deploy?”
That’s how you stop memory from becoming a junk drawer.
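A sketch of two of those queries over a toy edge list. The node names (`auth_v1`, `task_42`, `mtls_migration`) and edge shape are invented for illustration:

```javascript
// Hypothetical mini-graph: edges stored as { from, type, to } triples.
const edges = [
  { from: "auth_v2", type: "supersedes", to: "auth_v1" },
  { from: "task_42", type: "depends_on", to: "mtls_migration" },
  { from: "auth_v2", type: "approved_by", to: "alice" },
];

// "What decision replaced this one?"
function replacedBy(factId) {
  const e = edges.find((e) => e.type === "supersedes" && e.to === factId);
  return e ? e.from : null;
}

// "Which open tasks depend on this migration?"
function dependents(nodeId) {
  return edges
    .filter((e) => e.type === "depends_on" && e.to === nodeId)
    .map((e) => e.from);
}

console.log(replacedBy("auth_v1"));       // => auth_v2
console.log(dependents("mtls_migration")); // => [ 'task_42' ]
```

These are exact lookups over typed edges, so the answers are deterministic, which is exactly what similarity scores can't give you.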
A practical rule of thumb
Use a vector store when you need:
- semantic search
- fuzzy recall
- document retrieval
- broad context gathering
Use a knowledge graph when you need:
- state over time
- versioned truth
- explicit dependencies
- conflict resolution
- auditable memory
If you only use vectors, your agent will eventually retrieve both the old answer and the new answer and act like they’re equally valid.
A tiny runnable example
Here’s a minimal Node example using a graph to resolve the “latest truth” for a fact.
```shell
npm install graphology
```

```javascript
const Graph = require("graphology");

const graph = new Graph();

// Each node is a fact; a "supersedes" edge points from the newer fact
// to the fact it replaces.
graph.addNode("auth_v1", { value: "JWT for internal APIs", ts: 1 });
graph.addNode("auth_v2", { value: "mTLS for internal APIs", ts: 2 });
graph.addDirectedEdge("auth_v2", "auth_v1", { type: "supersedes" });

// A fact is current if nothing supersedes it, i.e. it has no incoming edge.
function currentFact(nodes) {
  return nodes
    .filter((n) => graph.inDegree(n) === 0)
    .map((n) => graph.getNodeAttribute(n, "value"));
}

console.log(currentFact(["auth_v1", "auth_v2"]));
// => [ 'mTLS for internal APIs' ]
```
Obviously, real systems need more than this. But the core idea matters: memory should encode replacement, not just storage.
What this looks like in production
A useful pattern is:
- Store raw docs, chats, and artifacts in a vector index
- Extract durable facts into a graph
- Mark facts with:
- source
- timestamp
- scope
- confidence
- supersession links
- Retrieve from both systems
- Let the graph filter or rank what the LLM actually sees
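A minimal sketch of what such a durable-fact record and the final graph-side filter might look like. The field names and the `ADR-0042` source ID are assumptions for illustration, not a prescribed schema:

```javascript
// One extracted durable fact, carrying the metadata listed above.
const fact = {
  id: "auth_v2",
  text: "mTLS for service-to-service auth",
  source: "ADR-0042",        // hypothetical source doc ID
  ts: Date.parse("2024-03-01"),
  scope: "internal-apis",
  confidence: "canonical",   // vs "provisional"
  supersedes: ["auth_v1"],
};

// The graph layer decides what the LLM actually sees:
// keep only canonical facts that nothing else supersedes.
function visibleToLLM(facts) {
  const replaced = new Set(facts.flatMap((f) => f.supersedes ?? []));
  return facts.filter((f) => f.confidence === "canonical" && !replaced.has(f.id));
}

console.log(visibleToLLM([fact]).map((f) => f.id));
// => [ 'auth_v2' ]
```

Ranking instead of hard filtering works too; the important part is that freshness and supersession are computed from metadata, not inferred by the model.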
If you already have a policy engine like OPA in your stack, this is also a good place to enforce rules like:
- only approved memories can be treated as canonical
- expired decisions should not be retrieved
- temporary incident workarounds should not leak into normal planning
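In a real deployment those rules would live in Rego and be evaluated by OPA; this is just a plain-JS sketch of the same three checks, with a hypothetical memory shape:

```javascript
// Can this memory be retrieved at all, given the current mode and time?
function retrievable(memory, { mode, now }) {
  // Expired decisions should not be retrieved.
  if (memory.expiresAt != null && memory.expiresAt <= now) return false;
  // Temporary incident workarounds should not leak into normal planning.
  if (memory.kind === "incident-workaround" && mode === "planning") return false;
  return true;
}

// Only approved memories can be treated as canonical.
function canonical(memory) {
  return memory.approvedBy != null;
}
```

Keeping these as explicit policy checks, rather than prompt instructions, means they're enforced the same way on every retrieval and can be audited later.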
That’s usually a better answer than trying to prompt-engineer your way out of stale context.
The trap nobody mentions
The biggest mistake isn’t “using vectors.”
It’s treating all memory as text.
Some memory is text.
Some memory is state.
Some memory is policy.
Some memory is provenance.
If you flatten all of that into embeddings, your agent can retrieve context — but it can’t reliably reason about whether that context is still true.
That’s where drift starts.
Try it yourself
If you’re building agents and want to pressure-test the surrounding security and tooling:
- Want to check your MCP server? Try https://tools.authora.dev
- Run `npx @authora/agent-audit` to scan your codebase
- Add a verified badge to your agent: https://passport.authora.dev
- Check out https://github.com/authora-dev/awesome-agent-security for more resources
My take
Vector stores are still the right tool for retrieval.
But if you want long-lived agents that don’t slowly poison themselves with stale context, you need something that models truth over time.
Usually that means adding a knowledge graph, or at least graph-like relationships, on top of your retrieval layer.
How are you handling agent memory today: pure RAG, graph-backed memory, or something else? Drop your approach below.
-- Authora team
This post was created with AI assistance.