Building Persistent Memory for AI Agents: A 4-Layer File-Based Architecture
As AI agents become more sophisticated, one persistent challenge remains: memory. Traditional LLM interactions are stateless—each query starts fresh, with no recall of previous conversations. For agents to be truly useful, they need persistent memory across sessions. That’s why I built a 4-layer file-based memory architecture that works with ChatGPT, Claude, Agent Zero, and even local LLMs.
This system isn’t just theoretical—it’s battle-tested in production, handling everything from long-running workflows to multi-agent collaboration. Let’s break it down.
The Problem: Stateless Agents Are Limited
Imagine asking an AI agent to manage your project. Without memory, it loses all context as soon as a session ends. You’d have to re-explain dependencies, priorities, and decisions every time, which defeats the purpose of automation.
I needed a solution that:
- Persists across sessions
- Scales with agent complexity
- Works with any LLM backend
- Is human-readable for debugging
After experimenting with vector databases and key-value stores, I settled on a file-based approach for its simplicity and portability.
The 4-Layer Architecture
The system organizes memory into four hierarchical layers:
memory/
├── agents/ # Layer 1: Agent metadata
├── conversations/ # Layer 2: Session transcripts
├── knowledge/ # Layer 3: Extracted facts
└── graph/ # Layer 4: Relationships
Each layer serves a distinct purpose. Together they form a pyramid of recall that moves from agent identity and raw transcripts up to extracted facts and semantic relationships.
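As a rough illustration, the layout above could be bootstrapped with a few lines of Python. This is a minimal sketch, not the original implementation: the function name `init_memory` and the choice of a `root` argument are assumptions made for the example, and only the directory names come from the tree above.

```python
import tempfile
from pathlib import Path

# Layer directories, named exactly as in the tree above.
LAYERS = ("agents", "conversations", "knowledge", "graph")


def init_memory(root: Path) -> Path:
    """Create memory/ and its four layer directories if they are missing.

    `init_memory` is a hypothetical helper, not part of the original system.
    """
    memory = root / "memory"
    for layer in LAYERS:
        # exist_ok makes the call safe to repeat on every agent start-up.
        (memory / layer).mkdir(parents=True, exist_ok=True)
    return memory


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        memory = init_memory(Path(tmp))
        print(sorted(p.name for p in memory.iterdir()))
```

Because the call is idempotent, an agent can run it at the start of every session without checking whether the store already exists.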
Layer 1: Agent Metadata (agents/)
This is the foundation. Each agent gets a JSON file defining its identity:
{
  "agent_id": "project_manager",
  "name": "PM Agent",
  "description": "Handles project planning and tracking",
  "created_at": "2024-05-20",
  "last_active": "2024-05-22",
  "capabilities": ["task_creation", "dependency_mapping"]
}
Why separate this? It lets agents introduce themselves to new collaborators without re-transmitting identity details.
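One way to read and maintain such a profile is sketched below. It assumes each profile is stored as `agents/<agent_id>.json`, which the article does not state explicitly, and the helper names `save_agent` and `touch_agent` are invented for this example.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def save_agent(agents_dir: Path, profile: dict) -> Path:
    """Write a profile to <agent_id>.json (the file naming is an assumption)."""
    path = agents_dir / f"{profile['agent_id']}.json"
    path.write_text(json.dumps(profile, indent=2), encoding="utf-8")
    return path


def touch_agent(agents_dir: Path, agent_id: str, today: date) -> dict:
    """Load a profile, set last_active to `today`, and write it back."""
    path = agents_dir / f"{agent_id}.json"
    profile = json.loads(path.read_text(encoding="utf-8"))
    profile["last_active"] = today.isoformat()  # same YYYY-MM-DD form as above
    path.write_text(json.dumps(profile, indent=2), encoding="utf-8")
    return profile


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        agents_dir = Path(tmp)
        save_agent(agents_dir, {
            "agent_id": "project_manager",
            "name": "PM Agent",
            "last_active": "2024-05-22",
        })
        print(touch_agent(agents_dir, "project_manager", date.today()))
```

Keeping the profile small means it can be loaded at the start of every session at negligible cost.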
Layer 2: Conversation Transcripts (conversations/)
Raw interaction history lives here. Each session gets a timestamped JSON file:
{
  "session_id": "proj_20240522",
  "agent": "project_manager",
  "user": "dev@example.com",
  "messages": [
    {"role": "user", "content": "What's the status of feature X?", "timestamp": "2024-05-22T10:15:23Z"},
    {"role": "assistant", "content": "Feature X is 80% complete...", "timestamp": "2024-05-22T10:15:28Z"}
  ]
}
This is the "source of truth" for what actually happened.
Layer 3: Extracted Knowledge (knowledge/)
Not all details need to be remembered verbatim. We extract key facts into a structured format:
knowledge/
├── projects/
│   └── feature_x.json
└── tasks/
    └── task_123.json
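Writing a fact into this layer could be sketched as a small upsert: load the entity file if it exists, merge the new attributes, and write it back. The `entity_id`, `type`, and `attributes` keys follow the feature_x.json example in this section. The `TYPE_DIRS` mapping from entity type to subdirectory and the helper name `upsert_entity` are assumptions, since the article does not describe how files are routed.

```python
import json
import tempfile
from pathlib import Path

# Assumed mapping from an entity type to its subdirectory under knowledge/.
TYPE_DIRS = {"project": "projects", "task": "tasks"}


def upsert_entity(knowledge_dir: Path, entity_type: str, entity_id: str,
                  attributes: dict) -> dict:
    """Merge `attributes` into the entity file, creating it if needed."""
    folder = knowledge_dir / TYPE_DIRS[entity_type]
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{entity_id}.json"
    if path.exists():
        entity = json.loads(path.read_text(encoding="utf-8"))
    else:
        entity = {"entity_id": entity_id, "type": entity_type, "attributes": {}}
    # Newer facts overwrite older values for the same attribute key.
    entity["attributes"].update(attributes)
    path.write_text(json.dumps(entity, indent=2), encoding="utf-8")
    return entity


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        knowledge = Path(tmp)
        upsert_entity(knowledge, "project", "feature_x", {"status": "in_progress"})
        print(upsert_entity(knowledge, "project", "feature_x", {"progress": 0.8}))
```

A last-write-wins merge like this keeps only the current value of each attribute. The full history still lives in the conversation transcripts.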
Example feature_x.json:
{
  "entity_id": "feature_x",
  "type": "project",
  "attributes": {
    "status": "in_progress",
    "progress": 0.8,
    "blockers": ["awaiting_api_access"],