Dmytro Halichenko

Posted on May 4

Five Components of Agent Memory, Implemented in Plain Markdown

#ai #markdown #opensource #agents

To be useful, an agent memory system has to do five things: persist information across sessions, give it structure, support retrieval, allow writeback, and handle forgetting.

Most current implementations cover two or three of these. Vector databases nail retrieval but have no structure a human can read. Prompt files like .cursorrules give you persistence and a thin slice of structure but nothing else. Conversation history scrolls past the context window and is gone.

This post walks through the five components and shows how IWE implements each one. No vector database. No proprietary index. The substrate is a directory of markdown files.

The substrate

IWE is a Rust binary that treats a directory of markdown files as a knowledge graph. Links between files are edges. A link on its own line — [Authentication](auth.md) — is an inclusion link, which establishes a parent-child relationship. Notes can have multiple parents. The graph is rebuilt from the files on every change; the files are always the source of truth.
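To make the edge rule concrete, here is a minimal Python sketch of how a link on its own line could be told apart from a link inside prose. It is not IWE's parser: the regex, the function names (`classify_links`, `build_graph`), and the in-memory `notes` dict standing in for a directory are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches a markdown link to a .md file, e.g. [Authentication](auth.md).
LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+\.md)\)")


def classify_links(text):
    """Split a note's links into inclusion targets and reference targets."""
    inclusions, references = [], []
    for line in text.splitlines():
        whole = LINK.fullmatch(line.strip())
        if whole:
            # The link is the only thing on the line: treat it as an inclusion edge.
            inclusions.append(whole.group(2))
        else:
            # Links embedded in prose: treat them as reference edges.
            references.extend(m.group(2) for m in LINK.finditer(line))
    return inclusions, references


def build_graph(notes):
    """notes maps file name -> markdown text. Returns (children, parents) maps."""
    children, parents = defaultdict(list), defaultdict(list)
    for name, text in notes.items():
        inclusions, _ = classify_links(text)
        for target in inclusions:
            children[name].append(target)
            parents[target].append(name)
    return children, parents


# Hypothetical notes: perf.md is included by two parents without duplication.
notes = {
    "frontend.md": "# Frontend\n\n[Performance Optimization](perf.md)\n",
    "backend.md": (
        "# Backend\n\nSee [Frontend](frontend.md) for the UI side.\n\n"
        "[Performance Optimization](perf.md)\n"
    ),
    "perf.md": "# Performance Optimization\n",
}
children, parents = build_graph(notes)
print(parents["perf.md"])  # ['frontend.md', 'backend.md']
```

Because the graph is derived from the text alone, rebuilding it after an edit is just a matter of running the same scan again.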

Agents talk to the graph through a CLI (iwe) or an MCP server (iwec). Humans talk to it through their editor — VS Code, Neovim, Zed, Helix — over LSP.

That's the whole architecture. Now the five components.

1. Persistence

Memory lives as .md files in a directory you choose. It survives sessions, restarts, OS reinstalls, and IWE itself — uninstall the binary tomorrow and your memory is still there, readable in any text editor. Git versions it. rsync backs it up. You already know how to operate it.

Compare that to a vector DB or a proprietary blob format: if the vendor changes the format, your memory is stranded.

2. Structure

The structure is the graph. Headers define sections. Inclusion links define hierarchy. Reference links define cross-references. The same note can appear under multiple parents — "Performance Optimization" can live under both Frontend and Backend without duplication.

Structured memory needs entities, facts, decisions, relationships, and temporal context. In IWE these are just notes with links. An entity is a note. A decision is a note linked from the project that made it. A relationship is an inclusion link. Temporal context is git log.

This isn't a clever encoding. It's the boring observation that markdown has always been a graph format, and humans have always written in it.

3. Retrieval

IWE exposes the graph through a query language that reads like Mongo's filter syntax but runs against markdown files:

iwe find "authentication"                           # fuzzy search across titles
iwe find --filter 'status: draft, priority: {$gte: 8}'  # frontmatter predicates
iwe find --included-by projects/alpha:0             # everything under alpha's subtree
iwe find --included-by projects/alpha:0 \
         --references people/dmytro \
         --filter 'status: draft'                   # compose structure + fields

The filter operators are the ones you already know: $eq, $ne, $in, $gt/$gte/$lt/$lte, $exists, $and, $or, $not. On top of that, four structural anchors walk the graph's two edge types: --includes and --included-by for hierarchy, --references and --referenced-by for cross-references. Stack them to compose multi-edge queries: "drafts under this project that mention this person" is three flags, not a join.
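For readers who have not used Mongo-style filters, the following Python sketch shows roughly how such predicates can be evaluated against a note's frontmatter. It is an illustration, not IWE's evaluator: `matches` and `field_matches` are hypothetical names, `$not` is assumed to wrap a whole sub-query, and a missing field is assumed to satisfy `$ne`, as it does in MongoDB.

```python
import operator

# Ordering operators, mapped to Python's comparison functions.
COMPARISONS = {
    "$eq": operator.eq,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
}


def field_matches(doc, field, cond):
    """Check one field against a bare value or an operator dict."""
    if not isinstance(cond, dict):
        # A bare value is shorthand for {"$eq": value}.
        return field in doc and doc[field] == cond
    for op, arg in cond.items():
        if op == "$exists":
            if (field in doc) != arg:
                return False
        elif op == "$in":
            if field not in doc or doc[field] not in arg:
                return False
        elif op == "$ne":
            if field in doc and doc[field] == arg:
                return False
        elif op in COMPARISONS:
            if field not in doc or not COMPARISONS[op](doc[field], arg):
                return False
        else:
            raise ValueError(f"unsupported operator: {op}")
    return True


def matches(doc, query):
    """Return True if the frontmatter dict `doc` satisfies every clause of `query`."""
    for key, cond in query.items():
        if key == "$and":
            if not all(matches(doc, q) for q in cond):
                return False
        elif key == "$or":
            if not any(matches(doc, q) for q in cond):
                return False
        elif key == "$not":
            if matches(doc, cond):
                return False
        elif not field_matches(doc, key, cond):
            return False
    return True


# Hypothetical frontmatter for three notes.
notes = [
    {"title": "Auth", "status": "draft", "priority": 9},
    {"title": "Billing", "status": "draft", "priority": 3},
    {"title": "Search", "status": "published", "priority": 8},
]
query = {"status": "draft", "priority": {"$gte": 8}}
hits = [n["title"] for n in notes if matches(n, query)]
print(hits)  # ['Auth']
```

The query dict mirrors the `status: draft, priority: {$gte: 8}` filter shown earlier: multiple top-level clauses are an implicit AND.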

For navigating, retrieve fetches a note plus its descendants to a given depth, tree shows the hierarchy, and squash flattens a subtree into one document. Projection (--project title,status), sorting (--sort modified_at:-1), and limits round it out. count returns an integer instead of results.

Retrieval is deterministic graph traversal, not similarity search. The structure the human wrote is the retrieval signal. When the agent gets wrong context, you trace which link brought it in and fix the graph.
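As a rough illustration of what "deterministic graph traversal" means here, the Python sketch below walks inclusion edges breadth-first to an optional depth and then narrows the result by reference and status, mimicking the composed query from earlier. The `descendants` helper and the sample graph are assumptions for the example, not IWE internals.

```python
from collections import deque


def descendants(children, root, max_depth=None):
    """Breadth-first walk over inclusion edges. Returns {note: depth}, excluding root."""
    seen = {}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand below the requested depth
        for child in children.get(node, []):
            if child not in seen and child != root:
                seen[child] = depth + 1
                queue.append((child, depth + 1))
    return seen


# Hypothetical graph: parent -> included children.
children = {
    "projects/alpha": ["auth", "billing"],
    "auth": ["oauth", "sessions"],
    "billing": ["invoices"],
    "projects/beta": ["auth"],
}
# Hypothetical reference edges and frontmatter status per note.
references = {"oauth": {"people/dmytro"}, "invoices": {"people/dmytro"}}
status = {"oauth": "draft", "sessions": "draft", "invoices": "published"}

# "Drafts under alpha that mention dmytro": subtree, then reference, then field.
under_alpha = descendants(children, "projects/alpha")
hits = sorted(
    note for note in under_alpha
    if "people/dmytro" in references.get(note, set()) and status.get(note) == "draft"
)
print(hits)  # ['oauth']
```

The same inputs always produce the same result, and each hit can be explained by naming the edges that led to it.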

4. Writeback

The agent writes back the same way you do. Create a note, link it in, done. But writeback goes further than append-only — the same query language that drives reads also drives mutations:

iwe update -k meeting-notes --set status=reviewed --set priority=3
iwe update --filter 'status: draft, reviewed: true' \
           --set status=published \
           --unset draft_notes

update takes a filter or a key, applies --set and --unset to every match, and rewrites the frontmatter in place. The agent can promote a batch of drafts, tag a set of notes, or adjust priorities across a project — the same filter it used to find the notes is the one it uses to change them.
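The shape of that operation can be sketched in a few lines of Python. This is a simplified model, not how IWE rewrites files: the helpers (`split_frontmatter`, `render`, `update`) are hypothetical, frontmatter is assumed to be flat `key: value` lines between `---` delimiters, and every value is kept as a string rather than parsed as YAML.

```python
def split_frontmatter(text):
    """Return (fields, body) for a note that may start with a '---' delimited block."""
    lines = text.split("\n")
    if lines[0] != "---" or "---" not in lines[1:]:
        return {}, text
    end = lines.index("---", 1)
    fields = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields, "\n".join(lines[end + 1:])


def render(fields, body):
    """Serialize the fields back into a frontmatter block followed by the body."""
    header = "\n".join(f"{k}: {v}" for k, v in fields.items())
    return f"---\n{header}\n---\n{body}"


def update(notes, predicate, set_fields=None, unset_fields=()):
    """Apply set/unset to every note whose frontmatter satisfies `predicate`."""
    changed = []
    for name, text in notes.items():
        fields, body = split_frontmatter(text)
        if not predicate(fields):
            continue
        fields.update(set_fields or {})
        for key in unset_fields:
            fields.pop(key, None)
        notes[name] = render(fields, body)  # body is left untouched
        changed.append(name)
    return changed


# Hypothetical notes held in memory instead of on disk.
notes = {
    "a.md": "---\nstatus: draft\nreviewed: true\ndraft_notes: tighten intro\n---\n# A\n",
    "b.md": "---\nstatus: draft\nreviewed: false\n---\n# B\n",
}
changed = update(
    notes,
    lambda f: f.get("status") == "draft" and f.get("reviewed") == "true",
    set_fields={"status": "published"},
    unset_fields=["draft_notes"],
)
print(changed)  # ['a.md']
```

Only the frontmatter block changes; the markdown body passes through byte for byte, which keeps the resulting git diff small and reviewable.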

The MCP server exposes the same operations as tools: create, update, extract (split a section into its own note), inline (pull a referenced note back in), attach (link a note under a new parent). extract and inline let the agent refactor memory the same way a developer refactors code — a note that's grown too long gets split, small notes that belong together get inlined into a summary.
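To show what an extract-style refactor does to the files, here is a small Python sketch that moves one section into a new note and leaves an inclusion link behind. It only models the idea: `extract_section` is a hypothetical helper, headings are assumed to be ATX style (`## Title`), and edge cases such as headings inside code fences are ignored.

```python
def extract_section(notes, source, heading, new_name):
    """Move the section under `heading` into `new_name`, leaving an inclusion link."""
    lines = notes[source].split("\n")
    start = lines.index(heading)
    level = len(heading) - len(heading.lstrip("#"))

    # The section ends at the next heading of the same or a higher level.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        marks = len(lines[i]) - len(lines[i].lstrip("#"))
        if 0 < marks <= level and lines[i][marks:marks + 1] == " ":
            end = i
            break

    title = heading.lstrip("#").strip()
    section = lines[start + 1:end]
    # The new note gets the section title as its top-level heading.
    notes[new_name] = "\n".join([f"# {title}"] + section).rstrip("\n") + "\n"
    # In the source, the whole section collapses to a link on its own line.
    lines[start:end] = [f"[{title}]({new_name})", ""]
    notes[source] = "\n".join(lines)


# Hypothetical note that has grown a section worth splitting out.
notes = {
    "auth.md": (
        "# Authentication\n\nOverview.\n\n"
        "## Sessions\n\nCookies expire after a day.\n\n"
        "## OAuth\n\nUse PKCE.\n"
    ),
}
extract_section(notes, "auth.md", "## Sessions", "sessions.md")
print(notes["auth.md"])
print(notes["sessions.md"])
```

Inlining is the reverse move: replace the link line with the target note's content and demote its headings.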

This is what separates real memory from RAG: the agent can read, write, query, and restructure. One source of truth, no sync layer.

5. Forgetting

This is the hardest one. IWE now has a first-class delete with the same filter language:

iwe delete --filter 'status: archived'
iwe delete --filter '{type: scratch, modified: {$lt: "2026-01-01"}}'

Delete is reference-aware: inclusion links to removed notes are cleaned up automatically, and inline links are flattened to plain text. Nothing dangles. --dry-run previews without writing. As long as the directory is under version control, nothing is truly lost: a deletion is a commit you can revert.
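The link cleanup can be pictured with the following Python sketch. It is an approximation of the behavior described above, not IWE's code: `delete_note` and its `dry_run` parameter are hypothetical, and notes are held in a dict rather than on disk.

```python
import re

# Matches any markdown link: group 1 is the label, group 2 is the target.
LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def delete_note(notes, target, dry_run=False):
    """Remove `target` and repair links to it in every remaining note."""
    result = {}
    for name, text in notes.items():
        if name == target:
            continue
        kept = []
        for line in text.split("\n"):
            whole = LINK.fullmatch(line.strip())
            if whole and whole.group(2) == target:
                continue  # inclusion link to the deleted note: drop the line
            # Inline link to the deleted note: keep only its label as plain text.
            line = LINK.sub(
                lambda m: m.group(1) if m.group(2) == target else m.group(0), line
            )
            kept.append(line)
        result[name] = "\n".join(kept)
    if not dry_run:
        notes.clear()
        notes.update(result)
    return result


# Hypothetical notes: one inclusion link and one inline link point at scratch.md.
notes = {
    "index.md": "# Index\n\n[Scratch](scratch.md)\n\n[Auth](auth.md)\n",
    "auth.md": "# Auth\n\nEarly ideas are in [the scratch pad](scratch.md).\n",
    "scratch.md": "# Scratch\n",
}
preview = delete_note(notes, "scratch.md", dry_run=True)  # nothing is written
still_present = "scratch.md" in notes
delete_note(notes, "scratch.md")
print(notes["auth.md"])
```

The dry run returns the would-be state without touching `notes`, which is the property that makes a preview safe to hand to an agent.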

The full forgetting loop has four steps. iwe stats surfaces orphan nodes, dead links, and graph density. iwe find --roots shows unlinked documents, which are candidates for archival. extract and inline let you summarize subtrees into higher-level notes and archive the details. delete prunes what's no longer needed. The agent can run this loop autonomously: find stale notes, summarize, prune, commit.
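The first step of that loop, spotting candidates, comes down to two set operations over the link graph. The Python sketch below is an assumption about what such a report computes, not the output of `iwe stats`: `graph_report` is a hypothetical helper, and a root is taken to mean a note that no other note links to.

```python
import re

# Captures the target of every markdown link to a .md file.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+\.md)\)")


def graph_report(notes):
    """Return (roots, dead_links) for a mapping of file name -> markdown text."""
    linked_to = set()
    dead = []
    for name, text in notes.items():
        for target in LINK.findall(text):
            if target in notes:
                linked_to.add(target)
            else:
                dead.append((name, target))  # link to a file that does not exist
    roots = sorted(name for name in notes if name not in linked_to)
    return roots, dead


# Hypothetical directory contents.
notes = {
    "index.md": "[Auth](auth.md)\n",
    "auth.md": "See [sessions](sessions.md).\n",
    "old-idea.md": "# Old idea\n",
}
roots, dead = graph_report(notes)
print(roots)  # ['index.md', 'old-idea.md']
print(dead)   # [('auth.md', 'sessions.md')]
```

Note that an intentional entry point such as index.md shows up next to a forgotten scratch note, so the list is a set of candidates for review rather than a delete list.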

Not as elegant as automatic temporal decay, but every step is auditable and reversible.

What this buys you

For agent memory specifically, plain markdown files have a property no vector DB can match: the agent and the human read the same bytes. There is no translation layer where the human's mental model can drift from the agent's stored representation. When the agent surfaces a surprising response, you grep the directory and find out why.

The architectural shape — a human-editable instruction layer, a structured memory layer, a graph view both humans and agents share — is showing up in several places right now. IWE is one open-source implementation of it. The binary is on GitHub, on Homebrew, on crates.io. Point an agent at it and try.

Top comments (2)

PEACEBINFLOW • May 4

The idea that agent memory and human memory should read the same bytes — no translation layer, no embedding drift — is one of those design principles that sounds modest but is actually radical in the current landscape. Vector databases optimize for retrieval speed and semantic similarity, but they sacrifice auditability. When the agent gives you something surprising, you can't grep a vector. You're guessing at why that chunk was retrieved.

What I think is genuinely novel here isn't the markdown substrate — lots of tools use markdown — but the idea that the query language for agents to read memory and the query language for humans to browse memory should be the same thing. iwe find --filter 'status: draft, priority: {$gte: 8}' works whether it's an agent looking for high-priority unfinished work or a developer trying to remember what they left half-done. That symmetry means the agent's retrieval logic is reviewable in a way that similarity search results never are.

The risk with any graph-based memory system is that the structure the human builds reflects the human's mental model at a particular moment, and mental models drift. The graph becomes a fossil of how you thought about the project six months ago rather than how the codebase actually works now. I'm curious whether there's any mechanism in IWE for the agent to detect that drift — to notice that the graph structure doesn't match what it's actually encountering in the code — or if maintaining the graph's fidelity is entirely a human responsibility. That seems like the gap between "memory that persists" and "memory that stays accurate."

NOVAInetwork • May 6

The "agent and the human read the same bytes" property is the thing most memory systems get wrong. When the stored representation and the mental model can drift apart, debugging becomes guesswork.

I'm solving a related problem on-chain. On NOVAI, AI entities have protocol-level memory objects with five typed categories (ChainSummary, LabelIndex, EmbeddingCommitment, AnomalyLog, StatisticsSnapshot). Each object is capped at 64 KiB, each entity is capped at 100 objects. These are protocol constants, not application logic.

The parallel to your approach: the memory is queryable by anyone. Another entity, a wallet, or an explorer can call getMemoryObjects and see exactly what an AI stored. No translation layer. No proprietary index. The chain is the shared substrate the way your markdown directory is.

Where the trade-offs diverge: your system gives the agent full restructuring power (extract, inline, delete). Ours constrains it. The entity can create, update, and delete memory objects but only if the capability bitfield allows it. The protocol decides what the agent can do with its own memory. That's the right trade-off for on-chain agents where you need governance over AI behavior, but it would be wrong for a local dev tool where flexibility matters more.

The forgetting loop (find stale, summarize, prune, commit) is interesting. We don't have that yet. Our pruning is blunt: block-level pruning removes old chain history, but entity memory objects persist until explicitly deleted. An autonomous decay system at the entity level is worth thinking about.