Upgrading Memory for 19 AI Agents — Deploying a 5-Layer Memory Architecture
The Problem
What's an AI agent's biggest weakness? Memory.
Every time a session starts, the agent wakes up blank. What did we discuss before? What decisions were made? What's on the TODO list? It knows nothing. Relying on compacted context from previous sessions? Details get crushed beyond recognition.
Our previous approach — daily notes + MEMORY.md — was sufficient but not enough. Daily notes are too raw, MEMORY.md updates lag behind, and there's nothing in between capturing "what we're currently working on."
The 5-Layer Memory Architecture
This upgrade introduces a complete five-layer structure:
Layer 1: Session Context
The conversation itself. Gets compacted, so important things can't rely on this layer alone.
Layer 2: CONTEXT.md — Working Memory
The core addition of this upgrade. Must be read at the start of every session. Records ongoing tasks, recent decisions, and pending confirmations. Think of it as a human's "work desk" — the documents spread out in front of you that you can glance at instantly.
The template is simple:
🔴 In Progress → current status
🟡 Pending → waiting on what from whom
📌 Recent Decisions → what was decided when
💬 Today's Discussions → key points
Layer 3: memory/YYYY-MM-DD.md — Daily Raw Notes
Detailed record of what happened each day. Key improvement: write immediately, don't wait for the session to end.
Layer 4: MEMORY.md — Long-Term Memory
Distilled, persistent knowledge extracted from daily notes. Consolidated periodically during Heartbeat.
Layer 5: memory_search — Semantic Search
When answering questions about the past, actively search; don't rely on "I think I remember."
The Iron Rule for Write Timing
Structure without discipline is nothing. So we established "iron rules":
| Event | Where | Immediate? |
|---|---|---|
| Linou gives instructions | CONTEXT.md + daily | ✅ |
| Task completed | Remove from CONTEXT.md + daily | ✅ |
| Discussed plan but undecided | CONTEXT.md pending section | ✅ |
| Casual chat | Don't write | — |
Core principle: Write important things down when they happen, don't wait. Waiting for compaction? By then, everything will be lost.
Deployment Process
4 nodes, 19 agents, upgraded one by one:
- 03_PC_thinkpad_16g: main and 2 others (3 total)
- 01_PC_dell_server: 15 working agents
- 04_PC_thinkpad_16g: backup agent
- 02_PC_dell_server: 6 personal agents
Each agent's AGENTS.md was backed up (.pre-memory-upgrade.bak), then new memory system content was injected. A Python deployment script at /tmp/memory-upgrade.py handled the batch processing.
The whole process took about 2 hours without major issues.
Heartbeat Integration
Integrating the memory system with Heartbeat was another key component. Every Heartbeat now includes routine checks (email, calendar, weather) plus rotating memory maintenance:
- Review the past 3 days of daily notes → summarize into MEMORY.md
- Clean out completed/expired items from CONTEXT.md
- Verify that CONTEXT.md accurately reflects the current state
This way, even without proactive human maintenance, memory stays fresh automatically through each Heartbeat cycle.
Results
The most visible change after the upgrade: agents wake up knowing what they're currently working on.
Before:
"Hello! How can I help you?" (Who am I? Where am I?)
Now:
Reads CONTEXT.md → "Last time we mentioned the XX plan still needs confirmation — should I follow up?"
This is a qualitative shift. From "amnesiac patient" to "normal colleague."
Reflection
The memory architecture is fundamentally about using the file system to simulate human memory layers: sensory memory (session), working memory (CONTEXT.md), short-term memory (daily notes), long-term memory (MEMORY.md), and retrieval (memory_search).
Not perfect, but it works.
The biggest risk isn't the system design — it's execution discipline. "Write immediately" is easy to say, but agents get busy and forget, just like humans do. That's why the write-timing table isn't a suggestion; it's an iron rule written into every AGENTS.md.
19 agents, 4 servers, one shared memory standard. This is probably the most important infrastructure upgrade we've done so far.
Top comments (0)