Jeff Witters

Posted on Mar 11

Why AI Agents Forget Everything (And How to Fix It)

#ai #agents #typescript #machinelearning

If you've built anything with AI agents, you've hit this wall.

Your agent has a great conversation. It learns the user's preferences, picks up context, and starts to feel like it actually knows the user. Then the session ends. Next time? Blank slate. It asks the same onboarding questions. It forgot that the user prefers dark mode. It forgot the decision you made last Tuesday.

This isn't a bug — it's how LLMs work. But it doesn't have to be how your agent works.

The Problem With "Just Use Context"

The first instinct is to dump everything into the context window. Just pass in the conversation history, right?

This breaks down fast:

Context windows are expensive. Sending 50k tokens of history every request adds up.
They have limits. Even a 200k-token window is finite, and a long-running agent's history will eventually outgrow it.
More context ≠ better recall. LLMs often struggle with needle-in-a-haystack retrieval: relevant information buried in a long context frequently gets missed.
They don't persist. Context is ephemeral by definition. When the session ends, it's gone.
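
To put a rough number on the cost point above, here is a back-of-the-envelope sketch. The helper name `cumulativeInputTokens` and the figures (100 turns, 500 tokens per turn) are assumptions for illustration, not measurements from any real agent.

```typescript
// Hypothetical helper: total input tokens sent over a conversation
// when every request resends the full history so far.
function cumulativeInputTokens(turns: number, tokensPerTurn: number): number {
  // Turn k resends k turns of history, so the total is
  // tokensPerTurn * (1 + 2 + ... + turns) = tokensPerTurn * turns * (turns + 1) / 2
  return (tokensPerTurn * turns * (turns + 1)) / 2;
}

// Assumed numbers, for illustration only
const turns = 100;
const tokensPerTurn = 500;

const uniqueTokens = turns * tokensPerTurn; // what was actually said
const resentTokens = cumulativeInputTokens(turns, tokensPerTurn); // what was billed as input

console.log(uniqueTokens); // 50000
console.log(resentTokens); // 2525000
```

Under these assumptions, 50k tokens of actual conversation turn into roughly 2.5M input tokens, because the cost of resending history grows quadratically with the number of turns.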

What you need isn't more context. You need memory.

Memory vs. Context: What's the Difference?

Context is what the model can see right now. Memory is what the agent retains across sessions.

Real memory has properties that raw context doesn't:

Semantic retrieval — find related memories by meaning, not just keyword match
Importance weighting — not all information is equally worth remembering
Persistence — survives session resets
Agent-scoped — each agent has its own memory space
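
Semantic retrieval and agent scoping can be sketched in a few lines. This is a toy, not how Engram is implemented: the `MemoryRecord` type, the `searchMemories` function, and the hand-written 3-dimensional embeddings are all assumptions. A real system would get its vectors from an embedding model.

```typescript
// Hypothetical record shape: each memory belongs to one agent and carries a vector
type MemoryRecord = { agentId: string; content: string; embedding: number[] };

// Cosine similarity: 1 means same direction, 0 means unrelated
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Filter to one agent's memories, score by meaning, return the best matches
function searchMemories(
  records: MemoryRecord[],
  agentId: string,
  queryEmbedding: number[],
  limit: number,
): { memory: MemoryRecord; score: number }[] {
  return records
    .filter((r) => r.agentId === agentId)
    .map((r) => ({ memory: r, score: cosineSimilarity(r.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

// Toy vectors, made up by hand for illustration
const records: MemoryRecord[] = [
  { agentId: 'agent-a', content: 'Prefers dark mode', embedding: [0.9, 0.1, 0.0] },
  { agentId: 'agent-a', content: 'Likes coffee', embedding: [0.0, 0.2, 0.9] },
  { agentId: 'agent-b', content: 'Prefers light mode', embedding: [0.9, 0.1, 0.0] },
];

// Pretend this vector is the embedding of "user interface preferences"
const hits = searchMemories(records, 'agent-a', [1, 0, 0], 1);
console.log(hits[0].memory.content); // Prefers dark mode
```

Note that the query never mentions "dark mode". The match comes from vector similarity, and agent-b's memory is never considered because of the scope filter.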

This is what we built @cartisien/engram for.

How Engram Works

Engram gives your agent a persistent memory store with semantic search. The API is intentionally simple:

import { Engram } from '@cartisien/engram'

const mem = new Engram({
  adapter: 'memory',
  agentId: 'my-agent',
})

await mem.wake()

// Store something worth remembering
await mem.store({
  content: 'The user prefers dark mode and works late at night',
  metadata: { source: 'observation', confidence: 0.9 },
  importance: 0.7,
})

// Later — semantic search, not keyword search
const results = await mem.search('user interface preferences', { limit: 5 })
results.forEach(({ memory, score }) => {
  console.log(score.toFixed(3), memory.content)
})

await mem.sleep()

The wake() / sleep() lifecycle mirrors how agents actually work — they come online, do work, and go dormant. Memory initializes on wake and persists on sleep.
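
If the lifecycle idea is unfamiliar, the following sketch shows the general pattern: load on wake, flush on sleep. It is not Engram's code. The `SessionMemory` class and the `disk` map (a stand-in for a file or database) are hypothetical, and the methods are synchronous here only to keep the example short.

```typescript
// Stand-in for durable storage; a real system would use a file or a database
const disk = new Map<string, string>();

class SessionMemory {
  private agentId: string;
  private items: string[] = [];
  private awake = false;

  constructor(agentId: string) {
    this.agentId = agentId;
  }

  // Load whatever a previous session left behind
  wake(): void {
    const saved = disk.get(this.agentId);
    this.items = saved ? (JSON.parse(saved) as string[]) : [];
    this.awake = true;
  }

  store(content: string): void {
    if (!this.awake) throw new Error('call wake() before store()');
    this.items.push(content);
  }

  all(): string[] {
    return [...this.items];
  }

  // Flush to durable storage and drop the in-process copy
  sleep(): void {
    disk.set(this.agentId, JSON.stringify(this.items));
    this.items = [];
    this.awake = false;
  }
}

// Session one: learn something, then go dormant
const firstSession = new SessionMemory('my-agent');
firstSession.wake();
firstSession.store('The user prefers dark mode');
firstSession.sleep();

// Session two: a fresh object, but the memory comes back on wake
const secondSession = new SessionMemory('my-agent');
secondSession.wake();
console.log(secondSession.all());
```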

The `importance` Field Actually Matters

One thing that separates this from just "storing strings in a database" is the importance score.

Not all memories are equal. "User mentioned they like coffee" is less important than "User said they're about to cancel their subscription." When you retrieve memories, importance influences what surfaces first.

This is closer to how human memory works — emotionally significant or practically important information is retained more reliably than background noise.
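
One way to picture importance-weighted retrieval is a blended score. To be clear, the formula below is an assumption for illustration. The article does not specify how Engram combines the two signals, and the `rank` function, the `Candidate` type, and the 0.3 weight are all made up.

```typescript
// Hypothetical candidate: a memory with a similarity score for the current
// query and the importance it was stored with (both in the range 0..1)
type Candidate = { content: string; similarity: number; importance: number };

// Blend the two signals linearly; importanceWeight = 0 means pure similarity
function rank(candidates: Candidate[], importanceWeight = 0.3): Candidate[] {
  const blended = (c: Candidate): number =>
    (1 - importanceWeight) * c.similarity + importanceWeight * c.importance;
  // Copy before sorting so the caller's array is left untouched
  return [...candidates].sort((a, b) => blended(b) - blended(a));
}

const candidates: Candidate[] = [
  { content: 'User mentioned they like coffee', similarity: 0.8, importance: 0.2 },
  { content: "User said they're about to cancel their subscription", similarity: 0.7, importance: 0.95 },
];

// With importance in the mix, the cancellation risk surfaces first
console.log(rank(candidates)[0].content);

// With pure similarity, the coffee remark would have won
console.log(rank(candidates, 0)[0].content);
```

With these numbers the coffee memory scores 0.62 and the cancellation memory scores 0.775, so a slightly less similar but far more important memory ends up on top.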

Multiple Adapters, Same API

adapter: 'memory'    // In-process, great for testing
adapter: 'sqlite'    // Local file, no server needed
adapter: 'postgres'  // Production scale with pgvector

Same Engram interface regardless of where you're storing. Swap adapters without changing your agent code.
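
The adapter idea is a standard pattern: agent code depends on an interface, and the storage backend is chosen in one place. Here is a minimal sketch of that pattern. The `StorageAdapter` interface, both adapter classes, and `createAdapter` are hypothetical names, not Engram's internals, and the "serialized" adapter only imitates a file-backed store.

```typescript
// Hypothetical contract that every backend must satisfy
interface StorageAdapter {
  append(agentId: string, content: string): void;
  list(agentId: string): string[];
}

// Keeps live arrays in process, similar in spirit to a testing backend
class InProcessAdapter implements StorageAdapter {
  private byAgent = new Map<string, string[]>();

  append(agentId: string, content: string): void {
    const existing = this.byAgent.get(agentId) ?? [];
    existing.push(content);
    this.byAgent.set(agentId, existing);
  }

  list(agentId: string): string[] {
    return [...(this.byAgent.get(agentId) ?? [])];
  }
}

// Stores JSON strings, imitating a backend that serializes to a file or table
class SerializedAdapter implements StorageAdapter {
  private blobs = new Map<string, string>();

  append(agentId: string, content: string): void {
    const items = this.list(agentId);
    items.push(content);
    this.blobs.set(agentId, JSON.stringify(items));
  }

  list(agentId: string): string[] {
    const blob = this.blobs.get(agentId);
    return blob ? (JSON.parse(blob) as string[]) : [];
  }
}

type AdapterName = 'in-process' | 'serialized';

// The only place that knows which backend is in use
function createAdapter(name: AdapterName): StorageAdapter {
  return name === 'in-process' ? new InProcessAdapter() : new SerializedAdapter();
}

// Agent code sees only the interface, so it never changes when the backend does
function runAgent(adapter: StorageAdapter): string[] {
  adapter.append('my-agent', 'The user prefers dark mode');
  adapter.append('my-agent', 'The user works late at night');
  return adapter.list('my-agent');
}

console.log(runAgent(createAdapter('in-process')));
console.log(runAgent(createAdapter('serialized')));
```

Both calls produce the same list, which is the property that makes swapping backends safe.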

Where This Fits in the Stack

Engram sits in the middle of the Cartisien memory stack:

Cogito  ←→  Engram  ←→  Extensa
identity    memory      vectors

Cogito handles agent identity and lifecycle. Extensa handles the vector infrastructure and embeddings layer. Engram is the bridge — the part your agent actually talks to.

You don't need the whole stack. Engram works standalone.

Install

npm install @cartisien/engram

Docs and source: github.com/cartisien/engram

If you're building agents that need to remember things across sessions, give it a try. And if you're hitting memory architecture questions that aren't covered here — drop them in the comments. This is a problem worth solving properly.

DEV Community
