If you've built anything with AI agents, you've hit this wall.
Your agent has a great conversation. It learns the user's preferences, picks up context, and starts to feel like it actually knows the user. Then the session ends. Next time? Blank slate. It asks the same onboarding questions. It forgot the user prefers dark mode. It forgot the decision you made last Tuesday.
This isn't a bug — it's how LLMs work. But it doesn't have to be how your agent works.
The Problem With "Just Use Context"
The first instinct is to dump everything into the context window. Just pass in the conversation history, right?
This breaks down fast:
- Context windows are expensive. Sending 50k tokens of history with every request adds up.
- They have limits. Even 200k tokens runs out, and an agent's history accumulated across sessions will eventually outgrow any window.
- More context ≠ better recall. Models often struggle to pick a single relevant fact out of a very long prompt (the "needle in a haystack" problem), so information buried in a long context often gets missed.
- They don't persist. Context is ephemeral by definition. When the session ends, it's gone.
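To put rough numbers on the cost point, here is a back-of-the-envelope sketch. The price is a placeholder assumption, not any provider's actual rate, and `historyCost` is a hypothetical helper; plug in your own figures.

```typescript
// Rough cost of resending the same history on every request.
// ASSUMPTION: pricePerMillionTokens is a placeholder, not a real provider rate.
function historyCost(
  historyTokens: number,
  requests: number,
  pricePerMillionTokens: number,
): number {
  const totalTokens = historyTokens * requests;
  return (totalTokens / 1_000_000) * pricePerMillionTokens;
}

// 50k tokens of history, resent on 1,000 requests,
// at a hypothetical $3 per million input tokens.
const cost = historyCost(50_000, 1_000, 3);
console.log(`$${cost.toFixed(2)}`); // prints "$150.00"
```

That is the cost of the history alone, before the model has read a single new message.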
What you need isn't more context. You need memory.
Memory vs. Context: What's the Difference?
Context is what the model can see right now. Memory is what the agent retains across sessions.
Real memory has properties that raw context doesn't:
- Semantic retrieval — find related memories by meaning, not just keyword match
- Importance weighting — not all information is equally worth remembering
- Persistence — survives session resets
- Agent-scoped — each agent has its own memory space
This is what we built @cartisien/engram for.
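To make "retrieval by meaning" concrete, here is a minimal sketch of the general technique: ranking stored items by cosine similarity between embedding vectors. This is not Engram's implementation. The three-dimensional vectors are hand-made stand-ins for what a real embedding model would produce, and the names `StoredMemory` and `searchByMeaning` are hypothetical.

```typescript
interface StoredMemory {
  content: string;
  embedding: number[]; // in practice this comes from an embedding model
}

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every memory against the query vector and keep the best `limit`.
function searchByMeaning(query: number[], memories: StoredMemory[], limit: number) {
  return memories
    .map((memory) => ({ memory, score: cosineSimilarity(query, memory.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

// Toy vectors: axis 0 ~ "UI preferences", axis 1 ~ "billing", axis 2 ~ "schedule".
const memories: StoredMemory[] = [
  { content: "The user prefers dark mode", embedding: [0.9, 0.1, 0.1] },
  { content: "The user asked about invoice dates", embedding: [0.1, 0.9, 0.2] },
  { content: "The user works late at night", embedding: [0.2, 0.1, 0.9] },
];

// A query like "user interface preferences" would embed close to axis 0.
const hits = searchByMeaning([1, 0, 0], memories, 2);
hits.forEach(({ memory, score }) => console.log(score.toFixed(3), memory.content));
```

Note that the query shares no keywords with "dark mode", yet that memory ranks first because the vectors point the same way.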
How Engram Works
Engram gives your agent a persistent memory store with semantic search. The API is intentionally simple:
    import { Engram } from '@cartisien/engram'

    const mem = new Engram({
      adapter: 'memory',
      agentId: 'my-agent',
    })

    await mem.wake()

    // Store something worth remembering
    await mem.store({
      content: 'The user prefers dark mode and works late at night',
      metadata: { source: 'observation', confidence: 0.9 },
      importance: 0.7,
    })

    // Later: semantic search, not keyword search
    const results = await mem.search('user interface preferences', { limit: 5 })
    results.forEach(({ memory, score }) => {
      console.log(score.toFixed(3), memory.content)
    })

    await mem.sleep()
The wake() / sleep() lifecycle mirrors how agents actually work — they come online, do work, and go dormant. Memory initializes on wake and persists on sleep.
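The lifecycle idea can be sketched in a few lines. What follows is an illustration of the pattern, not Engram's internals: `SessionMemory` and the `Disk` interface are hypothetical names, the methods are synchronous for brevity (Engram's are awaited), and the "disk" is an in-memory stand-in so the sketch runs anywhere.

```typescript
// Hypothetical persistence backend; a real one might wrap a file or a database.
interface Disk {
  read(): string | null;
  write(data: string): void;
}

class SessionMemory {
  private items: string[] = [];
  private awake = false;
  private disk: Disk;

  constructor(disk: Disk) {
    this.disk = disk;
  }

  // Load whatever a previous session persisted.
  wake(): void {
    const saved = this.disk.read();
    this.items = saved ? (JSON.parse(saved) as string[]) : [];
    this.awake = true;
  }

  store(content: string): void {
    if (!this.awake) throw new Error("call wake() before store()");
    this.items.push(content);
  }

  all(): string[] {
    return [...this.items];
  }

  // Flush in-memory state so the next session can pick it up.
  sleep(): void {
    this.disk.write(JSON.stringify(this.items));
    this.awake = false;
  }
}

// In-memory stand-in for a disk.
let blob: string | null = null;
const fakeDisk: Disk = {
  read: () => blob,
  write: (data) => {
    blob = data;
  },
};

// Session 1: learn something, then go dormant.
const first = new SessionMemory(fakeDisk);
first.wake();
first.store("The user prefers dark mode");
first.sleep();

// Session 2: a fresh instance sees what session 1 stored.
const second = new SessionMemory(fakeDisk);
second.wake();
console.log(second.all());
```

The second instance never saw the `store()` call, yet it starts with the memory already in place. That is the whole point of pairing wake with sleep.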
The importance Field Actually Matters
One thing that separates this from just "storing strings in a database" is the importance score.
Not all memories are equal. "User mentioned they like coffee" is less important than "User said they're about to cancel their subscription." When you retrieve memories, importance influences what surfaces first.
This is closer to how human memory works — emotionally significant or practically important information is retained more reliably than background noise.
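One simple way importance could influence ranking is to blend it with the similarity score. The sketch below is an assumption for illustration, not Engram's documented scoring formula; the `importanceWeight` parameter, the linear blend, and the example scores are all hypothetical.

```typescript
interface ScoredCandidate {
  content: string;
  similarity: number; // 0..1, from semantic search
  importance: number; // 0..1, set when the memory was stored
}

// Linear blend: importanceWeight = 0 ranks purely by similarity,
// importanceWeight = 1 ranks purely by importance.
function rank(candidates: ScoredCandidate[], importanceWeight = 0.3): ScoredCandidate[] {
  const blended = (c: ScoredCandidate) =>
    (1 - importanceWeight) * c.similarity + importanceWeight * c.importance;
  // Copy before sorting so the caller's array is left untouched.
  return [...candidates].sort((a, b) => blended(b) - blended(a));
}

const candidates: ScoredCandidate[] = [
  { content: "User mentioned they like coffee", similarity: 0.62, importance: 0.2 },
  {
    content: "User said they're about to cancel their subscription",
    similarity: 0.58,
    importance: 0.95,
  },
];

const ranked = rank(candidates);
console.log(ranked[0].content);
```

With these numbers the coffee memory is the slightly closer semantic match, but the cancellation memory surfaces first once importance is blended in. Setting `importanceWeight` to 0 reverses the order.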
Multiple Adapters, Same API
    adapter: 'memory'    // In-process, great for testing
    adapter: 'sqlite'    // Local file, no server needed
    adapter: 'postgres'  // Production scale with pgvector
Same Engram interface regardless of where you're storing. Swap adapters without changing your agent code.
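The general pattern behind swappable backends is a shared interface that agent code depends on. The sketch below illustrates that pattern only; `StorageAdapter`, `InProcessAdapter`, `SerializedAdapter`, and `remember` are hypothetical names and do not reflect Engram's actual adapter contract.

```typescript
interface MemoryRecord {
  id: number;
  content: string;
}

// Hypothetical storage contract: every backend implements the same two methods.
interface StorageAdapter {
  save(record: MemoryRecord): void;
  list(): MemoryRecord[];
}

// Backend 1: a plain in-process array.
class InProcessAdapter implements StorageAdapter {
  private records: MemoryRecord[] = [];

  save(record: MemoryRecord): void {
    this.records.push(record);
  }

  list(): MemoryRecord[] {
    return [...this.records];
  }
}

// Backend 2: everything round-trips through a JSON string,
// mimicking a store that lives outside the process.
class SerializedAdapter implements StorageAdapter {
  private blob = "[]";

  save(record: MemoryRecord): void {
    const all = JSON.parse(this.blob) as MemoryRecord[];
    all.push(record);
    this.blob = JSON.stringify(all);
  }

  list(): MemoryRecord[] {
    return JSON.parse(this.blob) as MemoryRecord[];
  }
}

// Agent code depends only on the interface, never on a concrete backend.
function remember(adapter: StorageAdapter, content: string): number {
  const id = adapter.list().length + 1;
  adapter.save({ id, content });
  return id;
}

const backends: StorageAdapter[] = [new InProcessAdapter(), new SerializedAdapter()];
for (const backend of backends) {
  remember(backend, "The user prefers dark mode");
  console.log(backend.list());
}
```

Because `remember` only knows about `StorageAdapter`, moving from a test backend to a production one changes the constructor call and nothing else.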
Where This Fits in the Stack
Engram sits in the middle of the Cartisien memory stack:
    Cogito  ←→  Engram  ←→  Extensa
    identity    memory      vectors
Cogito handles agent identity and lifecycle. Extensa handles the vector infrastructure and embeddings layer. Engram is the bridge — the part your agent actually talks to.
You don't need the whole stack. Engram works standalone.
Install
    npm install @cartisien/engram
Docs and source: github.com/cartisien/engram
If you're building agents that need to remember things across sessions, give it a try. And if you're hitting memory architecture questions that aren't covered here — drop them in the comments. This is a problem worth solving properly.