Building Persistent Memory for Autonomous AI Agents
How we built knowledge graph-powered memory for Claude Code that actually persists
The Problem Nobody's Talking About
Ralph Wiggum changed everything for Claude Code users.
For the first time, you could set Claude loose on your codebase and let it work autonomously for hours. Teams shipped entire repositories overnight. One team famously ran a 3-month loop that built a complete programming language.
But there's a problem.
Every morning, Ralph forgets everything.
The Groundhog Day Problem
Picture this: You run Ralph for 8 hours. He refactors your entire API layer, discovers patterns, makes architectural decisions, and learns your preferences.
You close the terminal.
The next morning, you start a new session.
Ralph: "Hi! What should we work on?"
You: "Remember yesterday when you—"
Ralph: "Yesterday?"
This is the Groundhog Day Problem of autonomous AI agents. They can run forever, but they can't remember anything.
Why Existing Solutions Don't Work
You might think: "Just save the conversation history!"
We tried that. It doesn't work. Here's why:
1. Context Window Limits
Claude's context window is large (200K tokens), but not infinite. An 8-hour Ralph session generates millions of tokens. You can't fit that in a context window.
Even if you could, it's inefficient. Claude doesn't need to re-read every line of every conversation. It needs structured knowledge.
2. Text Embeddings Aren't Enough
Many memory solutions use vector embeddings:
- Chunk the conversation
- Embed each chunk
- Retrieve similar chunks when needed
This works for semantic search, but fails for:
- Temporal reasoning: "What changed between yesterday and today?"
- Relationship queries: "How does Module A relate to Module B?"
- Decision tracking: "Why did we choose approach X over Y?"
3. The Lost Context Problem
Even with embeddings, you lose:
- Causal relationships
- Decision rationale
- Pattern evolution over time
- Cross-project learning
The Solution: Knowledge Graphs + Skills
Lisa takes a different approach. Instead of saving everything, we build structured knowledge using Graphiti's knowledge graph engine.
Architecture Overview
┌──────────────────────────────────────────────────────────┐
│                       Claude Code                        │
│                      (Ralph Wiggum)                      │
└──────────────┬───────────────────────────────────────────┘
               │
               │ Session events (hooks)
               │
┌──────────────▼───────────────────────────────────────────┐
│                   Lisa (Skills Layer)                    │
│   ┌──────────┐   ┌──────────┐   ┌──────────┐             │
│   │  Memory  │   │  Tasks   │   │  Prompt  │   ...       │
│   │  Skill   │   │  Skill   │   │  Skill   │             │
│   └──────────┘   └──────────┘   └──────────┘             │
└──────────────┬───────────────────────────────────────────┘
               │
               │ MCP Protocol
               │
┌──────────────▼───────────────────────────────────────────┐
│                  Graphiti MCP Server                     │
│               (Knowledge Graph Engine)                   │
│                                                          │
│  ┌────────────────────────────────────────────────────┐  │
│  │  Nodes: Entities (files, modules, decisions)       │  │
│  │  Edges: Relationships (modified, depends_on)       │  │
│  │  Facts: Timestamped knowledge                      │  │
│  │  Episodes: Conversation context                    │  │
│  └────────────────────────────────────────────────────┘  │
└──────────────┬───────────────────────────────────────────┘
               │
               │ Storage
               │
┌──────────────▼───────────────────────────────────────────┐
│            Neo4j (Local Docker or Zep Cloud)             │
└──────────────────────────────────────────────────────────┘
How It Works
1. Event Capture (Hooks)
Lisa uses Claude Code's hook system to capture events:
// session-start.ts
export async function onSessionStart(context) {
  // Load relevant memories for this project
  const memories = await mcpClient.searchFacts({
    group_id: context.projectId,
    query: context.projectContext
  });

  return {
    systemPrompt: `You have these memories: ${memories}`
  };
}

// user-prompt-submit.ts
export async function onPromptSubmit(prompt, context) {
  // Store the prompt as an episode
  await mcpClient.addEpisode({
    content: prompt,
    timestamp: Date.now(),
    group_id: context.projectId
  });
}

// session-stop.ts
export async function onSessionStop(context) {
  // Extract and store learnings from this session
  const learnings = extractLearnings(context.transcript);
  await mcpClient.addFacts(learnings);
}
2. Knowledge Extraction (Graphiti)
When you close a session, Lisa sends the transcript to Graphiti, which:
- Extracts entities: files, functions, modules, people, concepts
- Identifies relationships: modified, depends_on, implements, etc.
- Creates facts: timestamped assertions ("File X was modified to fix bug Y")
- Links episodes: connects conversations to facts
Example output:
{
  "nodes": [
    {"uuid": "...", "name": "src/api/users.ts", "type": "file"},
    {"uuid": "...", "name": "Tony", "type": "person"},
    {"uuid": "...", "name": "RESTful architecture", "type": "concept"}
  ],
  "edges": [
    {
      "source": "Tony",
      "target": "src/api/users.ts",
      "type": "MODIFIED",
      "fact": "Tony refactored users.ts to follow RESTful patterns",
      "timestamp": "2026-01-11T10:30:00Z"
    }
  ]
}
3. Queryable Memory (Skills)
Next session, Claude can query this knowledge:
> lisa, what did we decide about the API architecture?
👧 Recent decision: You refactored the API to follow RESTful patterns.
Specifically:
- users.ts now uses resource-based endpoints
- Moved from RPC-style to REST
- Rationale: Better caching, clearer semantics
Related files: users.ts, posts.ts, comments.ts
This isn't search. It's reasoning over a knowledge graph.
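For illustration, here's a rough sketch of what answering that question looks like in code: a scoped fact search followed by summarization, rather than replaying transcripts. The summarize helper is hypothetical, and the object-style call mirrors the hook snippets above.

// Sketch only: recall a decision by querying facts, not transcripts.
async function recallDecision(topic: string, projectId: string) {
  // Pull only the facts relevant to this topic and project
  const facts = await mcpClient.searchFacts({
    query: `decisions about ${topic}`,
    group_id: projectId,
    max_facts: 10
  });

  // Facts are timestamped assertions linked to entities, so the answer
  // can cite rationale and related files directly.
  return summarize(facts); // hypothetical formatting helper
}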
Skills: Extensible Memory Modules
Lisa uses a "skills" architecture. Each skill is a specialized memory module:
.agents/skills/
├── memory/                 # Core knowledge graph persistence
│   ├── scripts/memory.js
│   └── SKILL.md
├── tasks/                  # Task tracking across sessions
│   ├── scripts/tasks.js
│   └── SKILL.md
├── prompt/                 # Prompt history and patterns
│   ├── scripts/prompt.js
│   └── SKILL.md
└── lisa/                   # Meta-skill for orchestration
    ├── scripts/storage.js
    └── SKILL.md
Each skill:
- Handles a specific type of memory
- Uses Graphiti MCP under the hood
- Can be invoked with natural language
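To make that shape concrete, here's a minimal sketch of what a skill module might look like. The Skill interface, the triggers field, and the formatTaskList helper are illustrative, not Lisa's actual API.

// Hypothetical skill shape -- each skill declares what it responds to
// and talks to Graphiti through the shared MCP client.
interface Skill {
  name: string;
  triggers: string[]; // e.g. ["add task", "what tasks"]
  handle(input: string, ctx: { projectId: string }): Promise<string>;
}

const tasksSkill: Skill = {
  name: 'tasks',
  triggers: ['add task', 'what tasks'],
  async handle(input, ctx) {
    // Query task-related facts scoped to the current project
    const facts = await mcpClient.searchFacts({
      query: input,
      group_id: ctx.projectId
    });
    return formatTaskList(facts); // hypothetical formatter
  }
};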
Example: The tasks skill
> lisa, add task: refactor authentication
👧 Task added: "Refactor authentication"
Status: todo
Added to group: my-project
Later:
> lisa, what tasks are pending?
👧 Tasks:
1. [todo] Refactor authentication
2. [doing] Update API docs
3. [done] Fix login bug
Tasks persist across sessions, projects, and even months.
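Under the hood, a task can persist the same way everything else does: as a timestamped entry in the graph, scoped by group. A sketch reusing the MCP calls shown earlier; the status convention in the content string is illustrative.

// Store the task as an episode tied to the project
await mcpClient.addEpisode({
  content: 'Task: Refactor authentication (status: todo)',
  timestamp: Date.now(),
  group_id: 'my-project'
});

// Next session, a search over the same group brings it back
const pending = await mcpClient.searchFacts({
  query: 'pending tasks',
  group_id: 'my-project'
});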
The Ralph + Lisa Stack
Here's where it gets powerful. Combine Ralph (autonomous execution) with Lisa (persistent memory):
Day 1: Ralph Learns
# Start Ralph with a goal
ralph --goal "Refactor the API layer to follow RESTful principles"
# Ralph runs for 6 hours:
# - Analyzes current code
# - Identifies non-RESTful patterns
# - Refactors endpoints
# - Updates tests
# - Makes 47 commits
# Meanwhile, Lisa is watching:
# - Captures architectural decisions
# - Stores refactoring patterns
# - Tracks file relationships
# - Records test updates
Day 2: Ralph Remembers
# Start a new Ralph session
ralph --goal "Add new /products endpoint"
# Lisa loads memories:
# "Yesterday you refactored to RESTful patterns.
# Users endpoint uses: GET /users, POST /users, GET /users/:id
# You preferred Joi for validation.
# Tests follow AAA pattern."
# Ralph uses this context:
# - Creates /products with same pattern
# - Uses Joi validation (without being told)
# - Follows AAA test pattern
# - Stays consistent with Day 1
The result: Ralph doesn't just loop. Ralph learns.
Real-World Use Cases
1. Multi-Day Features
Break a large feature across multiple sessions:
Day 1: "Design the authentication system"
Day 2: "Implement JWT tokens"
Day 3: "Add refresh token rotation"
Day 4: "Write integration tests"
Each day builds on previous decisions without re-explanation.
2. Team Memory
Share knowledge across team members:
# Alice's session
alice> "We decided to use PostgreSQL for user data"
# Bob's session (next day)
bob> "What database should I use for orders?"
lisa> "You're using PostgreSQL for user data. Use the same for orders."
3. Project Templates
Lisa learns your preferences:
# After several projects:
> "Start a new API project"
lisa> "Creating Express project with:
- TypeScript (you always use this)
- Joi validation (your preference)
- Jest for tests (95% coverage target)
- PostgreSQL (your standard)
Based on your last 3 projects."
4. Bug Pattern Recognition
# Week 1: Fix a null pointer bug
> "Fixed NPE in users.ts by adding null check"
# Week 3: Similar code, similar bug
lisa> "Warning: This looks like the null pointer pattern from users.ts.
Consider adding the same null check."
Technical Deep Dive: Graphiti Integration
Why Graphiti?
We evaluated several memory backends:
| Solution | Pros | Cons |
|---|---|---|
| Plain text files | Simple | No querying, no relationships |
| Vector embeddings | Good for similarity | No temporal reasoning |
| SQL database | Queryable | Rigid schema, poor for graphs |
| Graphiti | Knowledge graphs, temporal reasoning, MCP protocol | Requires Neo4j |
Graphiti won because:
- Knowledge Graphs: Natural fit for code relationships
- Temporal Reasoning: Tracks changes over time
- MCP Protocol: Standard integration with Claude Code
- Enterprise Ready: Battle-tested by Zep
MCP Integration
Lisa communicates with Graphiti via the Model Context Protocol (MCP):
// lib/mcp.ts - Simplified
export class McpClient {
  async searchFacts(query: string, groupId: string) {
    return this.callTool('search_facts', {
      query,
      group_id: groupId,
      max_facts: 10
    });
  }

  async addEpisode(content: string, groupId: string) {
    return this.callTool('add_episode', {
      content,
      group_id: groupId,
      timestamp: Date.now()
    });
  }
}
This gives us:
- Standard protocol (not vendor-locked)
- Multiple backend support (local Docker, Zep Cloud)
- Future-proof architecture
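Usage of the simplified client looks something like this (the group ID and query strings are made up for the example):

const mcp = new McpClient();

// Record something worth remembering
await mcp.addEpisode('Refactored users.ts to RESTful endpoints', 'my-project');

// Later, pull it back out
const facts = await mcp.searchFacts('API architecture decisions', 'my-project');
console.log(facts);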
Storage Modes
Lisa supports two storage modes:
Local Mode (Docker)
# docker-compose.graphiti.yml
services:
  neo4j:
    image: neo4j:latest
    ports:
      - "7474:7474"
      - "7687:7687"
  graphiti:
    image: zepai/knowledge-graph-mcp:standalone
    environment:
      - NEO4J_URI=bolt://neo4j:7687
Pros:
- Full control
- Private data
- No API costs
Cons:
- Requires Docker
- Manual updates
Zep Cloud Mode
# .agents/.env
GRAPHITI_ENDPOINT=https://api.getzep.com
GRAPHITI_API_KEY=your-key-here
Pros:
- Zero setup
- Managed updates
- Team sharing
Cons:
- Requires account
- Data leaves localhost
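Mode selection can hang off those two environment variables. A sketch, where the local endpoint default is an assumption rather than Lisa's actual configuration logic:

// If an API key is present, talk to Zep Cloud; otherwise assume local Docker.
const endpoint = process.env.GRAPHITI_ENDPOINT ?? 'http://localhost:8000'; // assumed local default
const apiKey = process.env.GRAPHITI_API_KEY;

const mode = apiKey ? 'zep-cloud' : 'local-docker';
console.log(`Graphiti backend: ${mode} (${endpoint})`);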
Cache Fallback
If Graphiti is unavailable, Lisa falls back to local cache:
try {
  const facts = await mcpClient.searchFacts(query);
  // Cache the result
  await writeCache('memory.log', facts);
  return facts;
} catch (error) {
  // Graphiti unreachable: return the cached result
  return readCache('memory.log');
}
This ensures Lisa works even offline.
Performance Considerations
Token Efficiency
Instead of stuffing the full conversation history into every prompt, Lisa:
- Queries relevant facts: ~500 tokens
- Injects into system prompt
- Claude reasons over knowledge, not raw logs
Result: 10x reduction in context usage
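In practice, that means turning retrieved facts into a compact context block. A sketch, using the fact fields from the example output earlier; the exact prompt wording is illustrative.

function buildMemoryContext(facts: { fact: string; timestamp: string }[]): string {
  const lines = facts
    .slice(0, 20)                              // cap how many facts get injected
    .map(f => `- [${f.timestamp}] ${f.fact}`); // one short line per fact
  return `Relevant project memory:\n${lines.join('\n')}`;
}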
Query Speed
Graphiti queries are fast:
- Simple fact lookup: <50ms
- Complex graph traversal: <200ms
- Full project search: <500ms
Fast enough for real-time usage.
Storage Scaling
Neo4j scales to billions of nodes. Unless you're storing every keystroke, you'll never hit limits.
Typical usage:
- 1 session ≈ 10-50 facts
- 100 sessions ≈ 1,000-5,000 facts
- Storage: <1 MB
Lessons Learned
1. Not Everything Deserves Memory
Early versions tried to remember everything. Bad idea.
Noise drowns signal. Now we filter:
- ✅ Architectural decisions
- ✅ Bug patterns
- ✅ User preferences
- ❌ Routine operations
- ❌ Transient state
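The filter itself can stay simple. Here's an illustrative keyword-based sketch; the real entity and fact extraction is done by Graphiti, not a regex list.

// Illustrative heuristics for what deserves a place in memory
const REMEMBER = [/decided|chose|architecture/i, /bug|fix|regression/i, /prefer/i];
const IGNORE = [/^ls |^cd |^cat /, /installing dependencies/i];

function worthRemembering(event: string): boolean {
  if (IGNORE.some(p => p.test(event))) return false; // routine / transient noise
  return REMEMBER.some(p => p.test(event));          // decisions, bugs, preferences
}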
2. Timestamps Are Critical
Without timestamps, you can't answer:
- "What changed since yesterday?"
- "When did we decide X?"
- "What was the sequence?"
Every fact needs valid_at and created_at.
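With both timestamps in place, temporal questions reduce to filters. A sketch (the field names follow this post; the rest of the shape is an assumption):

interface Fact {
  fact: string;
  valid_at: string;   // when the statement became true
  created_at: string; // when Lisa learned it
}

// "What changed since yesterday?" becomes a date filter over facts
const since = Date.now() - 24 * 60 * 60 * 1000;
const changedSinceYesterday = (facts: Fact[]) =>
  facts.filter(f => new Date(f.valid_at).getTime() >= since);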
3. Group IDs Enable Multi-Project
Use group_id to separate projects:
group_id: "project-alpha" // Work project
group_id: "project-beta" // Side project
group_id: "shared" // Team knowledge
This prevents cross-contamination.
4. Skills Beat Monoliths
The skills architecture allows:
- Independent development
- Community contributions
- Gradual learning curve
Much better than one giant "memory" blob.
The Future: What's Next
Week 2-3: Ralph Integration Plugin
Official plugin for Ralph Wiggum:
npm install @lisa/ralph-integration
# Automatic memory capture during Ralph loops
# Zero configuration needed
Month 2: Team Features
- Shared memory pools
- Permission controls
- Audit logs
Month 3: Advanced Query
Natural language queries over knowledge:
> "Show me all API endpoints modified in the last week
that depend on the auth module and have failing tests"
lisa> [Graphical output with relationships]
Long-term: Cross-Agent Memory
Imagine:
- Cursor stores code patterns
- Claude Code stores decisions
- GitHub Copilot stores preferences
- Lisa unifies them all
One memory graph. All your tools.
Try It Yourself
Quick Start
# Install Lisa
npx lisa init
# Choose local or cloud
? Storage mode: Local (Docker)
# Lisa sets up everything
✓ Docker Compose configured
✓ Graphiti MCP installed
✓ Skills deployed
✓ Hooks configured
# Start using it
npx lisa
> lisa, remember that I prefer TypeScript over JavaScript
👧 Got it! I'll remember you prefer TypeScript.
> lisa, what do you know about me?
👧 Recent memories:
- You prefer TypeScript over JavaScript
- You're working on the 'lisa' project
Integration with Ralph
# Install both
npx lisa init
npx @anthropic-ai/claude-code install ralph-wiggum
# Run Ralph
ralph --goal "Your goal here"
# Lisa captures everything automatically
# Next session: Ralph remembers
Conclusion
Ralph Wiggum proved that autonomous AI agents can code for hours.
Lisa proves they can remember what they learned.
Together, they're not just tools. They're a platform for AI agents that actually get smarter over time.
The Simpsons accidentally gave us the perfect metaphor:
- Ralph runs wild and breaks things (autonomous loops)
- Lisa remembers everything and makes it better (persistent memory)
Stop repeating yourself. Lisa's listening.
Try Lisa: https://github.com/TonyCasey/lisa
Discussions: https://github.com/TonyCasey/lisa/discussions
Contributing: See CONTRIBUTING.md
Built with:
- Graphiti - Knowledge graph engine
- Claude Code - AI coding assistant
- Neo4j - Graph database
- MCP - Model Context Protocol
If you found this useful, star the repo and share your experience. What would you build with an AI agent that never forgets?