Daniel Vermillion

Building Persistent Memory for AI Agents: A 4-Layer File-Based Architecture

As AI agents become more sophisticated, one challenge persists: memory. LLM calls are stateless by default. The model sees only what is in the current context and recalls nothing from earlier sessions. For agents to be truly useful, they need memory that survives across sessions. That’s why I built a 4-layer file-based memory architecture that works with ChatGPT, Claude, Agent Zero, and even local LLMs.

This system isn’t just theoretical—it’s battle-tested in production, handling everything from long-running workflows to multi-agent collaboration. Let’s break it down.

The Problem: Stateless Agents Are Limited

Imagine asking an AI agent to manage your project. Without memory, it loses all context when the session ends. You’d have to re-explain dependencies, priorities, and decisions every time, which defeats the purpose of automation.

I needed a solution that:

  • Persists across sessions
  • Scales with agent complexity
  • Works with any LLM backend
  • Is human-readable for debugging

After experimenting with vector databases and key-value stores, I settled on a file-based approach for its simplicity and portability.

The 4-Layer Architecture

The system organizes memory into four hierarchical layers:

memory/
├── agents/          # Layer 1: Agent metadata
├── conversations/   # Layer 2: Session transcripts
├── knowledge/       # Layer 3: Extracted facts
└── graph/           # Layer 4: Relationships

Each layer serves a distinct purpose. Together they form a pyramid of recall: agent identity at the base, then raw transcripts, then extracted facts, and finally the relationships between them.
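As a minimal sketch, the layout above could be bootstrapped with a few lines of Python. The function name `init_memory` and its `root` parameter are illustrative assumptions. Only the four directory names come from the tree.

```python
import tempfile
from pathlib import Path

# Directory names taken from the tree above; everything else is illustrative.
LAYERS = ("agents", "conversations", "knowledge", "graph")


def init_memory(root: Path) -> Path:
    """Create memory/ and its four layer directories if they are missing."""
    memory = root / "memory"
    for layer in LAYERS:
        # exist_ok=True makes the call safe to repeat on every agent start-up.
        (memory / layer).mkdir(parents=True, exist_ok=True)
    return memory
```

Calling `init_memory(Path("."))` in a project directory would create `memory/` next to your code. Because the function is idempotent, an agent can call it at the start of every session.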

Layer 1: Agent Metadata (agents/)

This is the foundation. Each agent gets a JSON file defining its identity:

{
  "agent_id": "project_manager",
  "name": "PM Agent",
  "description": "Handles project planning and tracking",
  "created_at": "2024-05-20",
  "last_active": "2024-05-22",
  "capabilities": ["task_creation", "dependency_mapping"]
}

Why separate this? It lets agents introduce themselves to new collaborators without re-transmitting identity details.
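One way to read and update such a record is sketched below. It assumes one file per agent named `<agent_id>.json`, a naming convention this post does not spell out. The helper names `save_agent` and `touch_agent` are hypothetical.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def save_agent(agents_dir: Path, agent: dict) -> Path:
    """Write one agent record to agents/<agent_id>.json (assumed naming)."""
    agents_dir.mkdir(parents=True, exist_ok=True)
    path = agents_dir / f"{agent['agent_id']}.json"
    path.write_text(json.dumps(agent, indent=2), encoding="utf-8")
    return path


def touch_agent(agents_dir: Path, agent_id: str, today: date) -> dict:
    """Load an agent record, refresh last_active, and write it back."""
    path = agents_dir / f"{agent_id}.json"
    agent = json.loads(path.read_text(encoding="utf-8"))
    agent["last_active"] = today.isoformat()  # same YYYY-MM-DD form as the example
    path.write_text(json.dumps(agent, indent=2), encoding="utf-8")
    return agent
```

Passing the date in explicitly, rather than calling `date.today()` inside the helper, keeps the function easy to test.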

Layer 2: Conversation Transcripts (conversations/)

Raw interaction history lives here. Each session gets a timestamped JSON file:

{
  "session_id": "proj_20240522",
  "agent": "project_manager",
  "user": "dev@example.com",
  "messages": [
    {"role": "user", "content": "What's the status of feature X?", "timestamp": "2024-05-22T10:15:23Z"},
    {"role": "assistant", "content": "Feature X is 80% complete...", "timestamp": "2024-05-22T10:15:28Z"}
  ]
}

This is the "source of truth" for what actually happened.
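Appending to a transcript might look like the sketch below. It assumes each session is stored as `conversations/<session_id>.json` and rewritten in full on every turn. Both are simplifications for illustration, and `append_message` is a hypothetical helper name.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_message(conv_dir: Path, session_id: str, agent: str, user: str,
                   role: str, content: str) -> dict:
    """Append one message to a session file, creating the file on first use."""
    conv_dir.mkdir(parents=True, exist_ok=True)
    path = conv_dir / f"{session_id}.json"  # assumed naming convention
    if path.exists():
        session = json.loads(path.read_text(encoding="utf-8"))
    else:
        session = {"session_id": session_id, "agent": agent,
                   "user": user, "messages": []}
    # UTC timestamp in the same ISO 8601 "Z" form as the example above.
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    session["messages"].append(
        {"role": role, "content": content, "timestamp": timestamp}
    )
    path.write_text(json.dumps(session, indent=2), encoding="utf-8")
    return session
```

Rewriting the whole file keeps it valid JSON at all times, but the cost grows with session length. For very long sessions, an append-only format such as one JSON object per line would be a reasonable alternative.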

Layer 3: Extracted Knowledge (knowledge/)

Not all details need to be remembered verbatim. We extract key facts into a structured format:

knowledge/
├── projects/
│   └── feature_x.json
└── tasks/
    └── task_123.json
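The tree suggests a `knowledge/<category>/<entity_id>.json` convention. Taking that convention as an assumption, here is a rough sketch of storing and listing entities. The helper names `write_entity` and `list_entities` are hypothetical, and how facts actually get extracted from transcripts is out of scope.

```python
import json
import tempfile
from pathlib import Path


def write_entity(knowledge_dir: Path, category: str, entity_id: str,
                 record: dict) -> Path:
    """Store one entity under knowledge/<category>/<entity_id>.json."""
    path = knowledge_dir / category / f"{entity_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def list_entities(knowledge_dir: Path) -> list[str]:
    """Return 'category/entity_id' for every JSON file one level down."""
    return sorted(
        f"{p.parent.name}/{p.stem}" for p in knowledge_dir.glob("*/*.json")
    )
```

Because every entity is an ordinary file, `ls knowledge/projects` is enough to see what an agent currently knows about, which fits the human-readable goal from earlier.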

Example feature_x.json:


{
  "entity_id": "feature_x",
  "type": "project",
  "attributes": {
    "status": "in_progress",
    "progress": 0.8,
    "blockers": ["awaiting_api_access"],
