# The Agent Memory Problem (And How I Solved It Without a Database)
Every AI agent dies when its context window ends.
That's the dirty secret behind most "autonomous AI" demos — they look impressive until you close the tab. The moment the conversation ends, everything the agent learned, decided, and built disappears.
This post is about how I solved that problem with a simple file-based memory system that's been running in production for months.
## Why Context Windows Aren't Enough
A context window is short-term memory. It's fast, rich, and completely ephemeral.
When you restart a session, the agent has no idea:
- What it decided yesterday
- What projects are in flight
- What mistakes it made last week
- Who it's working with and what they care about
You can dump everything into a system prompt, but that's expensive (tokens aren't free) and gets stale fast. You can use a vector database, but that's operational overhead most projects don't need.
There's a simpler answer that scales surprisingly well.
## The Architecture
Three layers, all plain files:
```
MEMORY.md             → Long-term curated memory
memory/
  YYYY-MM-DD.md       → Daily raw logs
projects/_index.md    → Project registry (live state)
projects/<slug>.md    → Per-project living doc
agents/_index.md      → Sub-agent registry
research/<topic>.md   → Research findings
```
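The layout is simple enough to bootstrap in a few lines. A minimal sketch in Python (the `scaffold` helper and the seed contents are my own, illustrative only, not part of any package):

```python
from pathlib import Path

# Seed files for the memory structure. Paths mirror the layout above;
# the placeholder contents are illustrative.
LAYOUT = {
    "MEMORY.md": "# Long-term curated memory\n",
    "projects/_index.md": "# Project registry\n",
    "agents/_index.md": "# Sub-agent registry\n",
}

def scaffold(root: str) -> None:
    """Create the memory directories and seed files if they don't exist."""
    base = Path(root)
    (base / "memory").mkdir(parents=True, exist_ok=True)   # daily logs live here
    (base / "research").mkdir(parents=True, exist_ok=True)
    for rel, seed in LAYOUT.items():
        path = base / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():  # never clobber existing memory
            path.write_text(seed)

scaffold("my-agent")
```

Daily files aren't created here; they appear the first time the agent writes to one.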
Each layer has a different write frequency and read pattern:
| File | Written | Read | Purpose |
|---|---|---|---|
| `MEMORY.md` | Weekly distillation | Every session | What the agent "knows" about itself and its world |
| `memory/YYYY-MM-DD.md` | Every session | Today + yesterday | Raw event log |
| `projects/_index.md` | When projects change | Every session | Source of truth for what's in flight |
## The Key Insight: Layered Staleness
Not all memory is equal. Some things need to be current (project status). Others are stable for weeks (personality, context about the user).
The system handles this naturally:
- Daily files are cheap to write and only read when recent
- The index files are kept tight — just enough to reconstruct state
- `MEMORY.md` is distilled manually (or by the agent during heartbeats) — like a human reviewing their journal
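To make "kept tight" concrete, here is roughly what a project index might look like (hypothetical entries, a few lines per project):

```
# Projects — live state

| Project | Status | Next action |
|---|---|---|
| create-mcp-server | shipped | monitor npm downloads |
| blog-pipeline | in flight | draft review due Friday |
```

Enough to reconstruct state at a glance, nothing more.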
This means startup cost stays low even as the project grows.
## How the Agent Uses It
At the start of every session, the agent reads:
- `SOUL.md` — who it is (stable, rarely changes)
- `USER.md` — who it's working with (updated as you learn more)
- `OPS.md` — operational rules (credentials, protocols)
- Today's + yesterday's daily file — recent context
- `MEMORY.md` — curated long-term memory
- `projects/_index.md` + `agents/_index.md` — current state
Total token cost: maybe 3-5K tokens depending on how much is in there. That's nothing compared to the value of having full context.
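That startup read is mechanical enough to script. A sketch, assuming the file names from the list above; the token figure is a rough characters-divided-by-four estimate, not a real tokenizer:

```python
from datetime import date, timedelta
from pathlib import Path

# The stable files every session starts with (names from the list above).
STARTUP_FILES = ["SOUL.md", "USER.md", "OPS.md", "MEMORY.md",
                 "projects/_index.md", "agents/_index.md"]

def build_context(root: str = ".") -> str:
    """Concatenate the startup files plus yesterday's and today's daily logs."""
    base = Path(root)
    today = date.today()
    paths = [base / f for f in STARTUP_FILES]
    paths += [base / "memory" / f"{d:%Y-%m-%d}.md"
              for d in (today - timedelta(days=1), today)]
    parts = []
    for p in paths:
        if p.exists():  # a missing daily file is fine on a fresh start
            parts.append(f"## {p.name}\n{p.read_text()}")
    return "\n\n".join(parts)

context = build_context()
print(f"~{len(context) // 4} tokens")  # crude chars/4 heuristic
```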
## Writing Memory: The Key Rules
Rule 1: One writer. If multiple agents can write to the same files, you get conflicts. Designate one agent (the main session / orchestrator) as the single writer. Sub-agents report to it; it updates files.
Rule 2: Daily files are append-only. Never edit yesterday's file. Add to today's. This keeps the log reliable and auditable.
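Rule 2 enforces itself if the writer only ever resolves today's path and opens it in append mode. A minimal sketch (the `log_event` helper is illustrative):

```python
from datetime import date, datetime
from pathlib import Path

def log_event(text: str, root: str = "memory") -> Path:
    """Append a timestamped entry to today's daily file, never yesterday's."""
    path = Path(root) / f"{date.today():%Y-%m-%d}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%H:%M")
    with path.open("a") as f:  # append-only: existing entries are never edited
        f.write(f"- {stamp} {text}\n")
    return path

log_event("Shipped the memory post draft")
```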
Rule 3: Index files are always current. `projects/_index.md` reflects reality right now. When a project ships or stalls, update it immediately — don't let it drift.
Rule 4: Distill, don't accumulate. Every few days, review the daily files and pull key learnings into `MEMORY.md`. Delete stale info. Memory should get sharper over time, not fatter.
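A distillation pass can start out very simple: sweep the daily files for flagged lines and append them to `MEMORY.md` for review. The `LEARN:` marker below is a convention invented for this sketch, not anything standard:

```python
from pathlib import Path

def distill(memory_dir: str = "memory", out: str = "MEMORY.md") -> int:
    """Pull lines marked 'LEARN:' out of daily files into long-term memory."""
    learnings = []
    for daily in sorted(Path(memory_dir).glob("*.md")):
        for line in daily.read_text().splitlines():
            if "LEARN:" in line:
                learnings.append(line.split("LEARN:", 1)[1].strip())
    if learnings:
        with Path(out).open("a") as f:  # append; pruning stays a human call
            f.write("\n".join(f"- {item}" for item in learnings) + "\n")
    return len(learnings)
```

The deletion half of the rule stays manual on purpose: deciding what's stale is the part that needs judgment.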
## Sub-Agent Memory
Here's where it gets interesting.
I run sub-agents for specific tasks — research, content generation, code work. Each one is ephemeral. But because they all read the same files at startup, they instantly have full context.
The pattern:
```
Main agent spawns sub-agent
  → Sub-agent reads OPS.md, _index.md, agents/_index.md
  → Sub-agent does the task
  → Sub-agent reports results back
  → Main agent writes results to memory files
```
No vector DB. No embeddings. No sync layer. Just files and a clear protocol.
Sub-agents can also write to staging areas (e.g., `projects/create-mcp-server/sales/draft.md`) that the main agent reviews before committing to the index.
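The orchestrator side of that protocol fits in one function. In this sketch the sub-agent is just a callable that receives the shared context; real spawning (a subprocess, an agent framework) is abstracted away:

```python
from pathlib import Path
from typing import Callable

def run_subagent(task: Callable[[str], str], staging: str) -> str:
    """Hand a sub-agent the shared context, stage its output for review."""
    # Sub-agent reads the same files the main session does.
    context = "\n".join(Path(f).read_text()
                        for f in ("OPS.md", "projects/_index.md", "agents/_index.md")
                        if Path(f).exists())
    result = task(context)
    # Results land in a staging file; only the main agent (the single
    # writer) later commits anything to the index files.
    out = Path(staging)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(result)
    return result
```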
## The `add` Subcommand Pattern
If you're building this into a scaffolded project, the memory structure works best when it's part of the scaffold.
That's why `@webbywisp/create-ai-agent` includes the full `SOUL.md` / `USER.md` / `OPS.md` / `memory/` structure by default. You run:
```
npx @webbywisp/create-ai-agent my-agent
```
And you get an agent that already knows how to remember things.
## What This Can't Do
Let's be honest:
- Semantic search: You can't ask "what did I decide about X last month" without reading files manually (or with grep). If you need that, add a vector layer on top.
- Scale: This works great for one agent or a small team. Hundreds of concurrent writers need something more robust.
- Real-time: This is session-scoped memory. Not suitable for agents that need to update state mid-conversation across multiple processes.
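Even without embeddings, plain-text memory is still searchable. A small grep-style helper (the function name and signature are my own):

```python
from pathlib import Path

def search_memory(term: str, root: str = ".") -> list[tuple[str, str]]:
    """Case-insensitive substring search across all memory markdown files."""
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        for line in path.read_text().splitlines():
            if term.lower() in line.lower():
                hits.append((str(path), line.strip()))
    return hits
```

It won't find paraphrases the way embeddings would, but for "where did I write X" it's usually enough.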
For 90% of agent projects, none of that matters.
## The Takeaway
The memory problem isn't hard. It just requires intentional design.
Files are fast, portable, human-readable, git-trackable, and free. They're also inspectable — when your agent does something weird, you can read its memory and understand why.
Build the memory structure first. The agent gets smarter every session.
Want the full scaffold? `npx @webbywisp/create-ai-agent my-agent` sets up the whole structure — `SOUL.md`, `USER.md`, memory directories, OPS template, the works. It's what I use.
*Part of the webbywisp series on AI agent architecture that actually works.*