Webby Wisp

The Agent Memory Problem (And How I Solved It Without a Database)

Every AI agent dies when its context window ends.

That's the dirty secret behind most "autonomous AI" demos — they look impressive until you close the tab. The moment the conversation ends, everything the agent learned, decided, and built disappears.

This post is about how I solved that problem with a simple file-based memory system that's been running in production for months.


Why Context Windows Aren't Enough

A context window is short-term memory. It's fast, rich, and completely ephemeral.

When you restart a session, the agent has no idea:

  • What it decided yesterday
  • What projects are in flight
  • What mistakes it made last week
  • Who it's working with and what they care about

You can dump everything into a system prompt, but that's expensive (tokens aren't free) and gets stale fast. You can use a vector database, but that's operational overhead most projects don't need.

There's a simpler answer that scales surprisingly well.


The Architecture

Three layers, all plain files:

MEMORY.md                     → Long-term curated memory
memory/
  YYYY-MM-DD.md               → Daily raw logs
  projects/_index.md          → Project registry (live state)
  projects/<slug>.md          → Per-project living doc
  agents/_index.md            → Sub-agent registry
  research/<topic>.md         → Research findings
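Bootstrapping this layout on disk is a few lines of Node. The sketch below mirrors the tree above using only the built-in fs/path modules; `scaffoldMemory` and the seed contents are illustrative, not the scaffold's actual code.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Create the three-layer memory layout. Existing files are never
// overwritten, so re-running this on a live agent is safe.
function scaffoldMemory(root: string): void {
  const dirs = ["memory", "memory/projects", "memory/agents", "memory/research"];
  for (const d of dirs) fs.mkdirSync(path.join(root, d), { recursive: true });

  const today = new Date().toISOString().slice(0, 10); // YYYY-MM-DD
  const seeds: Record<string, string> = {
    "MEMORY.md": "# Long-term memory\n",
    [`memory/${today}.md`]: `# Daily log ${today}\n`,
    "memory/projects/_index.md": "# Projects\n",
    "memory/agents/_index.md": "# Agents\n",
  };
  for (const [rel, content] of Object.entries(seeds)) {
    const p = path.join(root, rel);
    if (!fs.existsSync(p)) fs.writeFileSync(p, content); // never clobber memory
  }
}
```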

Each layer has a different write frequency and read pattern:

| File | Written | Read | Purpose |
| --- | --- | --- | --- |
| MEMORY.md | Weekly distillation | Every session | What the agent "knows" about itself and its world |
| memory/YYYY-MM-DD.md | Every session | Today + yesterday | Raw event log |
| projects/_index.md | When projects change | Every session | Source of truth for what's in flight |

The Key Insight: Layered Staleness

Not all memory is equal. Some things need to be current (project status). Others are stable for weeks (personality, context about the user).

The system handles this naturally:

  • Daily files are cheap to write and only read when recent
  • The index files are kept tight — just enough to reconstruct state
  • MEMORY.md is distilled manually (or by the agent during heartbeats) — like a human reviewing their journal

This means startup cost stays low even as the project grows.
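The "only read when recent" rule is simple to mechanize: compute today's and yesterday's dates and load just those daily files. A sketch, assuming the daily-file naming from the tree above (`recentDailyFiles` is an illustrative name):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Return the daily-log paths worth loading at startup: today and yesterday.
// Older logs stay on disk until the next distillation pass into MEMORY.md.
function recentDailyFiles(memoryDir: string, now: Date = new Date()): string[] {
  return [0, 1]
    .map((offset) => {
      const d = new Date(now.getTime() - offset * 24 * 60 * 60 * 1000);
      return d.toISOString().slice(0, 10); // YYYY-MM-DD (UTC)
    })
    .map((day) => path.join(memoryDir, `${day}.md`))
    .filter((p) => fs.existsSync(p));
}
```

Note this uses UTC dates for simplicity; an agent that cares about local midnight would format dates in its own timezone instead.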


How the Agent Uses It

At the start of every session, the agent reads:

  1. SOUL.md — who it is (stable, rarely changes)
  2. USER.md — who it's working with (updated as you learn more)
  3. OPS.md — operational rules (credentials, protocols)
  4. Today's + yesterday's daily file — recent context
  5. MEMORY.md — curated long-term memory
  6. projects/_index.md + agents/_index.md — current state
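The startup read above can be sketched as a straight concatenation. `loadContext` and the `##` heading format are illustrative choices, not the exact implementation; the daily files would be appended the same way and are omitted here for brevity.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Bootstrap files in priority order. Missing files are skipped
// rather than treated as errors, so a fresh agent still starts.
const BOOT_FILES = [
  "SOUL.md",
  "USER.md",
  "OPS.md",
  "MEMORY.md",
  "memory/projects/_index.md",
  "memory/agents/_index.md",
];

// Build the agent's startup context as one string, with a heading
// per file so the model can tell the sources apart.
function loadContext(root: string): string {
  return BOOT_FILES
    .map((rel) => path.join(root, rel))
    .filter((p) => fs.existsSync(p))
    .map((p) => `## ${path.basename(p)}\n${fs.readFileSync(p, "utf8")}`)
    .join("\n\n");
}
```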

Total token cost: roughly 3-5K tokens, depending on how much you've accumulated. That's trivial compared to the value of starting every session with full context.


Writing Memory: The Key Rules

Rule 1: One writer. If multiple agents can write to the same files, you get conflicts. Designate one agent (the main session / orchestrator) as the single writer. Sub-agents report to it; it updates files.

Rule 2: Daily files are append-only. Never edit yesterday's file. Add to today's. This keeps the log reliable and auditable.

Rule 3: Index files are always current. projects/_index.md reflects reality right now. When a project ships or stalls, update it immediately — don't let it drift.

Rule 4: Distill, don't accumulate. Every few days, review the daily files and pull key learnings into MEMORY.md. Delete stale info. Memory should get sharper over time, not fatter.
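Rule 2 can be enforced mechanically: expose only an append API that always targets today's file, with no way to touch past days. A minimal sketch (`logEvent` is an illustrative name, not part of the scaffold):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Append-only daily log: writes always go to *today's* file,
// timestamped, so the history stays reliable and auditable.
function logEvent(memoryDir: string, event: string): void {
  const iso = new Date().toISOString();
  const today = iso.slice(0, 10);  // YYYY-MM-DD
  const stamp = iso.slice(11, 19); // HH:MM:SS
  fs.mkdirSync(memoryDir, { recursive: true });
  fs.appendFileSync(path.join(memoryDir, `${today}.md`), `- ${stamp} ${event}\n`);
}
```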


Sub-Agent Memory

Here's where it gets interesting.

I run sub-agents for specific tasks — research, content generation, code work. Each one is ephemeral. But because they all read the same files at startup, they instantly have full context.

The pattern:

Main agent spawns sub-agent:
  → Sub-agent reads OPS.md, _index.md, agents/_index.md
  → Sub-agent does the task
  → Sub-agent reports results back
  → Main agent writes results to memory files

No vector DB. No embeddings. No sync layer. Just files and a clear protocol.
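The single-writer protocol fits in a few lines. In this sketch, `runSubAgent` is a stub standing in for whatever actually executes the task (an LLM call, a spawned process); the point is the shape of the flow, not the implementation.

```typescript
import * as fs from "node:fs";

type SubAgentReport = { task: string; result: string };

// Sub-agents get the shared context read-only and never touch the files.
function runSubAgent(task: string, sharedContext: string): SubAgentReport {
  return { task, result: `completed with ${sharedContext.length} chars of context` };
}

// Rule 1 in code: only the orchestrator writes. Sub-agent results
// are collected and appended to today's daily log in one place.
function orchestrate(memoryDir: string, tasks: string[]): void {
  const context = "OPS.md + _index.md contents"; // stand-in for the real startup read
  const reports = tasks.map((t) => runSubAgent(t, context));

  const today = new Date().toISOString().slice(0, 10);
  fs.mkdirSync(memoryDir, { recursive: true });
  for (const r of reports) {
    fs.appendFileSync(`${memoryDir}/${today}.md`, `- [${r.task}] ${r.result}\n`);
  }
}
```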

Sub-agents can also write to staging areas (e.g., projects/create-mcp-server/sales/draft.md) that the main agent reviews before committing to the index.


Baking It Into the Scaffold

If you're building this into a scaffolded project, the memory structure works best when it's part of the scaffold.

That's why @webbywisp/create-ai-agent includes the full SOUL.md / USER.md / OPS.md / memory/ structure by default. You run:

npx @webbywisp/create-ai-agent my-agent

And you get an agent that already knows how to remember things.


What This Can't Do

Let's be honest:

  • Semantic search: You can't ask "what did I decide about X last month" without reading files manually (or with grep). If you need that, add a vector layer on top.
  • Scale: This works great for one agent or a small team. Hundreds of concurrent writers need something more robust.
  • Real-time: This is session-scoped memory. Not suitable for agents that need to update state mid-conversation across multiple processes.

For 90% of agent projects, none of that matters.


The Takeaway

The memory problem isn't hard. It just requires intentional design.

Files are fast, portable, human-readable, git-trackable, and free. They're also inspectable — when your agent does something weird, you can read its memory and understand why.

Build the memory structure first. The agent gets smarter every session.


Want the full scaffold? npx @webbywisp/create-ai-agent my-agent sets up the whole structure — SOUL.md, USER.md, memory directories, OPS template, the works. It's what I use.


Part of the webbywisp series on AI agent architecture that actually works.
