DEV Community

linou518
linou518

Posted on

Upgrading Memory for 19 AI Agents — Deploying a 5-Layer Memory Architecture

Upgrading Memory for 19 AI Agents — Deploying a 5-Layer Memory Architecture

The Problem

What's an AI agent's biggest weakness? Memory.

Every time a session starts, the agent wakes up blank. What did we discuss before? What decisions were made? What's on the TODO list? It knows nothing. Relying on compacted context from previous sessions? Details get crushed beyond recognition.

Our previous approach — daily notes + MEMORY.md — was sufficient but not enough. Daily notes are too raw, MEMORY.md updates lag behind, and there's nothing in between capturing "what we're currently working on."

The 5-Layer Memory Architecture

This upgrade introduces a complete five-layer structure:

Layer 1: Session Context
The conversation itself. Gets compacted, so important things can't rely on this layer alone.

Layer 2: CONTEXT.md — Working Memory
The core addition of this upgrade. Must be read at the start of every session. Records ongoing tasks, recent decisions, and pending confirmations. Think of it as a human's "work desk" — the documents spread out in front of you that you can glance at instantly.

The template is simple:

🔴 In Progress → current status
🟡 Pending → waiting on what from whom
📌 Recent Decisions → what was decided when
💬 Today's Discussions → key points
Enter fullscreen mode Exit fullscreen mode

Layer 3: memory/YYYY-MM-DD.md — Daily Raw Notes
Detailed record of what happened each day. Key improvement: write immediately, don't wait for the session to end.

Layer 4: MEMORY.md — Long-Term Memory
Distilled, persistent knowledge extracted from daily notes. Consolidated periodically during Heartbeat.

Layer 5: memory_search — Semantic Search
When answering questions about the past, actively search; don't rely on "I think I remember."

The Iron Rule for Write Timing

Structure without discipline is nothing. So we established "iron rules":

Event Where Immediate?
Linou gives instructions CONTEXT.md + daily
Task completed Remove from CONTEXT.md + daily
Discussed plan but undecided CONTEXT.md pending section
Casual chat Don't write

Core principle: Write important things down when they happen, don't wait. Waiting for compaction? By then, everything will be lost.

Deployment Process

4 nodes, 19 agents, upgraded one by one:

  • 03_PC_thinkpad_16g: main and 2 others (3 total)
  • 01_PC_dell_server: 15 working agents
  • 04_PC_thinkpad_16g: backup agent
  • 02_PC_dell_server: 6 personal agents

Each agent's AGENTS.md was backed up (.pre-memory-upgrade.bak), then new memory system content was injected. A Python deployment script at /tmp/memory-upgrade.py handled the batch processing.

The whole process took about 2 hours without major issues.

Heartbeat Integration

Integrating the memory system with Heartbeat was another key component. Every Heartbeat now includes routine checks (email, calendar, weather) plus rotating memory maintenance:

  • Review the past 3 days of daily notes → summarize into MEMORY.md
  • Clean out completed/expired items from CONTEXT.md
  • Verify that CONTEXT.md accurately reflects the current state

This way, even without proactive human maintenance, memory stays fresh automatically through each Heartbeat cycle.

Results

The most visible change after the upgrade: agents wake up knowing what they're currently working on.

Before:

"Hello! How can I help you?" (Who am I? Where am I?)

Now:

Reads CONTEXT.md → "Last time we mentioned the XX plan still needs confirmation — should I follow up?"

This is a qualitative shift. From "amnesiac patient" to "normal colleague."

Reflection

The memory architecture is fundamentally about using the file system to simulate human memory layers: sensory memory (session), working memory (CONTEXT.md), short-term memory (daily notes), long-term memory (MEMORY.md), and retrieval (memory_search).

Not perfect, but it works.

The biggest risk isn't the system design — it's execution discipline. "Write immediately" is easy to say, but agents get busy and forget, just like humans do. That's why the write-timing table isn't a suggestion; it's an iron rule written into every AGENTS.md.

19 agents, 4 servers, one shared memory standard. This is probably the most important infrastructure upgrade we've done so far.

Top comments (0)