Webby Wisp

Posted on Mar 20

How I Build Persistent Memory for AI Agents (No Vector DB Required)

#webdev #ai #productivity #agents

Building a persistent memory system for AI agents isn't glamorous work — but it's the difference between an agent that's useful once and one that actually gets better over time.

I've been running autonomous AI agents in production for a while now, and the single biggest unlock was treating memory as a first-class concern instead of an afterthought. Here's the system I landed on.

Why Most Agent Memory Fails

The naive approach is "just use a vector database." Embed everything, retrieve semantically, done. And for pure RAG use cases, that works fine.

But agents have different needs:

Temporal context: What happened today matters more than what happened last month
Working state: What are we currently building? What decisions were made?
Identity persistence: Who is this agent? What's its role?
Operational context: What credentials exist? What's the deployment environment?

A vector database handles semantic search well, but it gives you no structure for any of this. You end up with a flat pool of embeddings and no built-in way to rank entries by recency or importance.

The File-Based Memory Stack

The system I use is embarrassingly simple: a hierarchy of plain markdown files.

workspace/
├── MEMORY.md               # Curated long-term memory (like a human's)
├── SOUL.md                 # Identity, persona, values
├── USER.md                 # About the human you work with
├── OPS.md                  # How to operate, credentials, protocols
└── memory/
    ├── 2026-03-20.md       # Today's raw log
    ├── 2026-03-19.md       # Yesterday
    ├── projects/_index.md  # Active project registry
    ├── projects/<slug>.md  # Per-project living docs
    └── agents/_index.md    # Sub-agent registry
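If you'd rather see the layout as code, here's a minimal sketch that creates it by hand. This is not what the scaffolding tool mentioned later runs; the `scaffold` function, the `TOP_LEVEL` table, and the placeholder file contents are all my own assumptions for illustration.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Top-level files and their one-line purposes, copied from the tree above.
TOP_LEVEL = {
    "MEMORY.md": "Curated long-term memory",
    "SOUL.md": "Identity, persona, values",
    "USER.md": "About the human you work with",
    "OPS.md": "How to operate, credentials, protocols",
}


def scaffold(root: Path, today: Optional[date] = None) -> list:
    """Create the memory layout under `root` without overwriting anything.

    Returns the list of files that were actually created.
    """
    today = today or date.today()
    memory = root / "memory"

    # Map each target path to hypothetical starter content.
    targets = {
        root / name: f"# {name}\n\n{purpose}\n"
        for name, purpose in TOP_LEVEL.items()
    }
    targets[memory / f"{today.isoformat()}.md"] = f"# {today.isoformat()}\n"
    targets[memory / "projects" / "_index.md"] = "# Active projects\n"
    targets[memory / "agents" / "_index.md"] = "# Sub-agents\n"

    created = []
    for path, content in targets.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():  # existing memory is never clobbered
            path.write_text(content, encoding="utf-8")
            created.append(path)
    return created
```

Calling `scaffold(Path("workspace"))` a second time returns an empty list, so it is safe to run against a workspace that already holds real memory.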

Each file serves a specific purpose in the memory hierarchy:

SOUL.md — Identity

This is who the agent is. Stable. Rarely changes. Sets personality, values, decision-making style. If you're building a specialized agent, this is where you encode that specialization deeply.

USER.md — Context About the Human

Timezone, preferences, communication style, ongoing concerns. Agents that remember who they're working with feel fundamentally different from ones that start fresh every session.

OPS.md — Operational State

Credentials, deployment info, protocols, which models to use when. This is the agent's "cheat sheet" for operating in your specific environment. Sub-agents read this before doing anything.

MEMORY.md — Curated Long-Term Memory

This is the distilled essence of everything that matters. Like a human's long-term memory vs. raw diary entries — selective, organized, actionable.

Not everything goes here. Only significant decisions, lessons learned, important facts about the project trajectory.

Daily Files — Raw Logs

Append-only event logs. What happened today. No editing, just logging. These are the raw material that feeds into MEMORY.md during periodic review.
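A rough sketch of what append-only logging can look like. The `log_event` helper and the `- HH:MM text` entry format are assumptions of mine; the only thing the system itself prescribes is one file per day that is appended to and never edited.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def log_event(memory_dir: Path, text: str, now: Optional[datetime] = None) -> Path:
    """Append one timestamped line to the daily file memory/YYYY-MM-DD.md."""
    now = now or datetime.now()
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"{now:%Y-%m-%d}.md"
    # Mode "a" only ever adds to the end, so earlier entries are never rewritten.
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {now:%H:%M} {text.strip()}\n")
    return path
```

Because the file name is derived from the timestamp, the log rolls over to a new file at midnight with no extra bookkeeping.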

Project Files — Living Docs

One file per active project. Status, current goals, blockers, decisions, next steps. The agent reads these to understand where things stand without re-deriving it from scratch.

The Memory Maintenance Loop

Here's the part most people skip: memory maintenance. Without it, your files get stale, contradictory, and eventually useless.

The loop I run:

1. Write daily: log events as they happen to memory/YYYY-MM-DD.md
2. Update project files: after any significant work, update the relevant project doc
3. Review periodically: every few days, read recent daily files and distill insights into MEMORY.md
4. Prune stale entries: remove things from MEMORY.md that are no longer relevant

This is the same way humans maintain useful mental models. Raw experience → daily notes → curated knowledge. The distillation step is what makes memory useful rather than just big.
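The review step is mostly a matter of gathering the right raw material and handing it to the agent. A minimal sketch, assuming a three-day window and a prompt wording I made up; `recent_dailies` and `review_prompt` are hypothetical helpers, and the distillation itself is still done by the agent, not by this code.

```python
from datetime import date, timedelta
from pathlib import Path


def recent_dailies(memory_dir: Path, today: date, days: int = 3) -> list:
    """Return (date, text) pairs for the last `days` daily files, oldest first.

    Days with no log file are skipped.
    """
    entries = []
    for offset in range(days - 1, -1, -1):
        day = today - timedelta(days=offset)
        path = memory_dir / f"{day.isoformat()}.md"
        if path.exists():
            entries.append((day, path.read_text(encoding="utf-8")))
    return entries


def review_prompt(entries: list) -> str:
    """Build the text handed to the agent for the periodic review."""
    sections = [f"### {day.isoformat()}\n{text.strip()}" for day, text in entries]
    return (
        "Distill the significant decisions and lessons from these logs "
        "into updates for MEMORY.md, and list stale entries to prune.\n\n"
        + "\n\n".join(sections)
    )
```

Keeping the window small is deliberate: the review reads a few days of logs, and anything older should already be in MEMORY.md or was not worth keeping.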

Loading Memory Efficiently

The constraint is the context window budget: you can't load every file every session.

My loading strategy:

Session startup:
1. SOUL.md        — always (identity)
2. USER.md        — always (who you're working with)
3. OPS.md         — always (operational state)
4. MEMORY.md      — main session only
5. Today + yesterday dailies — always
6. projects/_index.md — always
7. agents/_index.md — always

Project-specific files load on demand when working on that project. Research files load when relevant. This keeps baseline context manageable.
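Here's how that startup order might look in code. It's a sketch under two assumptions that are mine, not part of the system: the budget is counted in characters as a crude stand-in for tokens, and files earlier in the list win when the budget runs out. `startup_files` and `load_context` are hypothetical names.

```python
from datetime import date, timedelta
from pathlib import Path


def startup_files(root: Path, today: date, main_session: bool) -> list:
    """Return the session-startup files in the priority order listed above."""
    memory = root / "memory"
    paths = [root / "SOUL.md", root / "USER.md", root / "OPS.md"]
    if main_session:
        paths.append(root / "MEMORY.md")  # main session only
    yesterday = today - timedelta(days=1)
    paths += [
        memory / f"{today.isoformat()}.md",
        memory / f"{yesterday.isoformat()}.md",
        memory / "projects" / "_index.md",
        memory / "agents" / "_index.md",
    ]
    return paths


def load_context(paths: list, max_chars: int = 40_000) -> str:
    """Concatenate existing files until the character budget is spent.

    The budget counts file bodies only, not the headers added here.
    """
    parts, used = [], 0
    for path in paths:
        if not path.exists():
            continue  # e.g. no daily log was written yesterday
        remaining = max_chars - used
        if remaining <= 0:
            break
        text = path.read_text(encoding="utf-8")[:remaining]
        parts.append(f"## {path.name}\n{text}")
        used += len(text)
    return "\n\n".join(parts)
```

Project files stay out of `startup_files` on purpose; they would be appended to the list only when the session turns to that project.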

One Writer Rule

Critical for multi-agent setups: only the main agent writes to memory files. Sub-agents report results back to the main agent, which updates the files.

This avoids conflicts, keeps a single source of truth, and makes the memory system reliable even when you're running parallel workloads.

Sub-agents get a snapshot of relevant state when they spawn, do their work, and return results. They don't write. The orchestrator writes.
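To make the rule concrete, here's a small sketch. Everything in it is illustrative: `SubAgentResult`, `run_sub_agent`, `Orchestrator`, and `fan_out` are hypothetical names, and the stand-in sub-agent just reports how much state it was given. The point is the shape: sub-agents run in parallel and return values, and a single object does all the writing, one result at a time.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SubAgentResult:
    agent: str
    project: str  # project slug, as in projects/<slug>.md
    summary: str


def run_sub_agent(name: str, project: str, snapshot: str) -> SubAgentResult:
    """Stand-in for real sub-agent work: reads the snapshot, writes nothing."""
    return SubAgentResult(name, project, f"reviewed {len(snapshot)} chars of state")


class Orchestrator:
    """The only component that touches memory files."""

    def __init__(self, memory_dir: Path):
        self.memory_dir = memory_dir

    def record(self, result: SubAgentResult) -> Path:
        path = self.memory_dir / "projects" / f"{result.project}.md"
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as f:
            f.write(f"- [{result.agent}] {result.summary}\n")
        return path


def fan_out(orch: Orchestrator, agents: list, project: str, snapshot: str) -> Path:
    """Run sub-agents in parallel, then write their results serially."""
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in the same order as `agents`.
        results = list(pool.map(lambda n: run_sub_agent(n, project, snapshot), agents))
    path = orch.memory_dir / "projects" / f"{project}.md"
    for result in results:  # single writer: no locks needed
        path = orch.record(result)
    return path
```

Since only `Orchestrator.record` opens files for writing, parallel sub-agents cannot interleave partial writes, which is the conflict the rule exists to prevent.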

Bootstrapping a New Agent

This whole structure can feel like a lot to set up manually. I built @webbywisp/create-ai-agent to scaffold it in seconds:

npx @webbywisp/create-ai-agent my-agent

It creates the full directory structure — SOUL.md, USER.md, OPS.md, memory/ layout, AGENTS.md with operational playbook — ready for you to customize. No boilerplate to hand-write.

The interesting part of the scaffolded structure is that it's designed to work with any AI assistant that can read files and follow instructions, not just specific frameworks. It's a convention, not a dependency.

What This Gets You

Agents running this system:

Resume context instantly — no re-explaining what we were doing
Accumulate institutional knowledge — lessons from failures persist
Maintain consistent behavior — identity and protocols loaded every session
Handle multi-agent coordination — shared state, clear ownership

The result is an agent that feels like a persistent team member rather than a stateless tool you boot up each time.

Memory is infrastructure. Build it properly from the start and your agents compound in value over time. Ignore it and you'll be re-explaining context forever.

The files are on GitHub if you want to dig into the scaffolding structure. And npx @webbywisp/create-ai-agent will get you running in under a minute.

DEV Community