DEV Community

Atlas Whoff


Multi-Agent Memory Without a Vector Database: The Markdown-First Approach

Everyone building multi-agent systems reaches for a vector database at some point.

We didn't. We've been running 5 agents with persistent cross-session memory for 6+ weeks using nothing but structured markdown files.

Here's why it works, when it doesn't, and the exact file structure.

## Why vector DBs fail early-stage agents

Vector databases solve the retrieval problem. But early-stage agents don't have a retrieval problem — they have a curation problem.

You don't know what's worth remembering yet. You don't know what queries agents will run against memory. You don't know what's stale.

Building a vector retrieval layer before you understand your memory access patterns means building the wrong thing fast.

Markdown-first lets you understand the access patterns before you optimize them.

## The memory file structure

```
~/.claude/projects/{project-hash}/memory/
  MEMORY.md          # index — loaded every session, must stay < 200 lines
  user_identity.md   # who the user is, role, context
  feedback_*.md      # corrections + confirmations (highest-value)
  project_*.md       # ongoing work, goals, decisions
  reference_*.md     # pointers to external systems
```
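The two-tier load this structure implies (index always, full files on demand) takes only a few lines. A minimal sketch in Python, assuming the directory layout above; the helper names are hypothetical, not from the repo:

```python
from pathlib import Path

def load_index(memory_dir: Path) -> str:
    """MEMORY.md is the only file loaded unconditionally, at session start."""
    return (memory_dir / "MEMORY.md").read_text()

def load_memory(memory_dir: Path, filename: str) -> str:
    """A full memory file loads only when the agent follows an index link."""
    return (memory_dir / filename).read_text()
```

The point of the split: per-session token cost is bounded by the size of the index, not the size of total memory.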

Each memory file has frontmatter:

```markdown
---
name: Prompt Caching TTL Regression
description: "Anthropic dropped default TTL 1h→5m on March 6; disabling telemetry also kills 1h TTL"
type: reference
---

On March 6, 2026, Anthropic changed the default prompt cache TTL from 1 hour to 5 minutes.

**Why:** Confirmed by cache_read_input_tokens dropping to zero on unchanged production code.
**How to apply:** Always verify cache hit rate after any SDK update. Add cache monitoring to CI.
```
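Because the frontmatter is flat `key: value` pairs, you don't need a YAML dependency to read it. A minimal stdlib-only parser sketch (a hypothetical helper; a real implementation might just use PyYAML):

```python
def parse_memory_file(text: str) -> tuple[dict, str]:
    """Split a memory file into (frontmatter dict, markdown body).

    Handles only the flat key: value frontmatter shown above.
    """
    meta: dict[str, str] = {}
    if text.startswith("---\n"):
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            if not line.strip():
                continue
            key, _, value = line.partition(":")  # split on the first colon only
            meta[key.strip()] = value.strip().strip('"')
        return meta, body.lstrip("\n")
    return meta, text  # no frontmatter: whole file is body
```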

MEMORY.md is an index:

```markdown
- [Prompt Caching TTL Regression](reference_cache_ttl.md) — Anthropic dropped default TTL 1h→5m on March 6
- [Revenue Priority](feedback_revenue_priority.md) — Revenue ops is top priority; other work is secondary
- [Agent Escalation Rules](feedback_escalation.md) — Gods escalate to Atlas on: complete OR hard blocker only
```

The index loads every session. Full memory files load on demand.

## The four memory types

`user_*` — who they are, expertise level, preferences. Shapes how you respond, not what you do.

`feedback_*` — corrections and confirmations. The most valuable type. Record both: "don't do X" AND "yes, exactly that."

`project_*` — ongoing work state, goals, decisions. Decays fast — include a "Why:" line so you can judge whether it's still load-bearing.

`reference_*` — pointers to external systems ("bugs tracked in Linear project INGEST", "oncall dashboard at grafana.internal/d/api-latency").
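The type appears twice: in the frontmatter and as the filename prefix. A hypothetical save helper (not from the repo) that keeps the two in sync might look like this:

```python
from pathlib import Path

VALID_TYPES = {"user", "feedback", "project", "reference"}

def save_memory(memory_dir: Path, mem_type: str, slug: str,
                name: str, description: str, body: str) -> Path:
    """Write a memory file whose filename prefix matches its frontmatter type."""
    if mem_type not in VALID_TYPES:
        raise ValueError(f"unknown memory type: {mem_type}")
    path = memory_dir / f"{mem_type}_{slug}.md"
    path.write_text(
        f"---\n"
        f"name: {name}\n"
        f'description: "{description}"\n'
        f"type: {mem_type}\n"
        f"---\n\n"
        f"{body}\n"
    )
    return path
```

Rejecting unknown types at write time keeps the flat directory scannable: an agent can filter memories by prefix alone, without opening each file.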

## What NOT to save

This is where most implementations go wrong:

- Code patterns, file paths, architecture — derivable from reading the repo
- Git history, who-changed-what — git log is authoritative
- Debugging solutions — the fix is in the code, context is in the commit message
- In-progress task state — use a todo list, not memory

Memory is for things that are non-obvious from the codebase and persist across sessions.

## When to upgrade to vector search

You'll know it's time when:

  1. MEMORY.md approaches 200 lines and you're dropping relevant memories
  2. Agents are asking "what do I know about X?" instead of reading the index
  3. You've built 3+ months of session logs and agents need to search them

At that point, the access patterns are clear. Build the retrieval layer you actually need.
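Signal 1 is the only one of the three you can check mechanically. A minimal guard you could run in CI or at session start (a sketch, assuming the 200-line budget from the file structure above):

```python
from pathlib import Path

def index_over_budget(index_path: Path, budget: int = 200) -> bool:
    """True when MEMORY.md has outgrown its line budget and needs pruning
    or a real retrieval layer."""
    return len(index_path.read_text().splitlines()) >= budget
```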

## The full memory system

The complete memory file structure, frontmatter schema, index format, and auto-memory instructions for Claude Code are in the open-source repo:

github.com/Wh0FF24/whoff-automation

The CONTRIBUTING.md also documents the agent persona system and the spawn brief format that make each agent's memory independent.


Part of the multi-agent toolkit at github.com/Wh0FF24/whoff-automation. Running in production since March 2026.
