BlazOrbit

Posted on May 8

AutoDream Validated What I Was Already Building: An Agent Memory System That Works

#ai #programming #architecture #productivity

A few days ago, the documentation for Claude Code AutoDream began circulating. AutoDream is Anthropic's new feature that consolidates Claude's automatic memory across sessions. The technical community received it as an innovation. I read it with a different feeling: validation.

Not because AutoDream is minor — it's a major step toward making AI agents maintain long-term coherence — but because a few months ago, Claude and I co-designed an equivalent system (and in some respects a more structured one) for the BlazOrbit project.

A note on origins: It wasn't just me drawing the memory architecture on paper and imposing it on the agent. It was Claude herself who, in an early session, proposed an incremental work system when I asked her to design a protocol for long development sessions. Her first version was fragmented: tasks/, notes/, ideas/ directories, scattered state files. I made her rethink it. I asked for something simpler — a single living document that would act as a continuous session. From that conversation, .exclude/plans/next-session.md was born: a rolling file with no done/, no archive/, no ideas/ — just candidates, deferred items with an aging counter, and loose notes. Everything else (AGENTS.md as index, .exclude/agents/ as depth, WORKFLOW.md as contract) settled in over subsequent sessions, but the seed was an iteration between human and AI, not a unilateral decree.

AutoDream didn't teach me what to do; it gave me the vocabulary to explain why a system born from that iteration works.

This post is that exercise: mapping AutoDream's phases against my current protocol, showing where they converge, and explaining why certain design decisions are what really matter when an agent's memory grows beyond a couple of sessions.

To avoid staying in theory, I audited Claude's actual memory from my project — the files Claude has been automatically writing in ~/.claude/projects/<project>/memory/. What I found confirms the thesis and adds nuances that only become visible when you contrast the agent's memory against the system we co-designed.

My Memory Architecture in One Sentence

AGENTS.md is the startup index; .exclude/agents/ are the topic files; .exclude/WORKFLOW.md is the orchestration protocol; .exclude/plans/next-session.md is the accumulated signal transcript; and .agents/skills/ are the modular exportable capabilities.

Translated into AutoDream's language:

| AutoDream Layer | Equivalent in My System | Role |
| --- | --- | --- |
| `MEMORY.md` (index, <200 lines) | `AGENTS.md` (index, soft cap 350 lines) | What the agent reads on wake-up |
| Topic files (`debugging.md`, `api-conventions.md`…) | `.exclude/agents/<area>.md` (`css-architecture.md`, `testing.md`, `state-axes.md`…) | Deep knowledge by domain |
| Session transcripts (JSONL) | `.exclude/plans/next-session.md` + `.exclude/tasks/active/` | Signal accumulated between sessions |
| Dream system prompt | `.exclude/WORKFLOW.md` | Consolidation protocol |
| N/A | `.agents/skills/blazorbit-user/` + `docs/ARCHITECTURE.md` | Reusable capabilities + public face |

Real Audit: What's in Claude's Memory for My Project

Before continuing with theory, let's look at data. This is what Claude has automatically accumulated in memory/ for BlazOrbit:

memory/
├── MEMORY.md                           (6 lines, 1,039 bytes)
├── startup_protocol.md                 (20 lines, 2,518 bytes)
├── reference_workflow_docs.md          (18 lines, 1,407 bytes)
├── user_role.md                        (11 lines, 837 bytes)
├── reference_rs0041_autogen.md         (12 lines, 1,516 bytes)
├── changelog_policy.md                 (9 lines, 784 bytes)
└── bobinitializer_static_ssr.md        (28 lines, 1,881 bytes)

The index (MEMORY.md) is impeccable: 6 lines, 6 pointers, not a word wasted. The topic files have structured metadata (name, description, type, originSessionId). There's no random-notes.md, no contradictions, no relative dates like "yesterday".
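An audit like this is easy to repeat. The following is a minimal sketch, not the tool used for the numbers above: it reports line and byte counts for every markdown file in a memory directory and flags relative-date words. The function name `audit_memory` and the word list in `RELATIVE_DATE` are assumptions made for illustration.

```python
import re
from pathlib import Path

# Hypothetical list of relative-date words worth flagging; extend as needed.
RELATIVE_DATE = re.compile(
    r"\b(yesterday|today|tomorrow|last week|next week)\b", re.IGNORECASE
)

def audit_memory(memory_dir):
    """Return one record per markdown file: name, lines, bytes, relative-date hits."""
    records = []
    for path in sorted(Path(memory_dir).glob("*.md")):
        raw = path.read_bytes()
        text = raw.decode("utf-8", errors="replace")
        records.append({
            "name": path.name,
            "lines": len(text.splitlines()),
            "bytes": len(raw),
            "relative_dates": RELATIVE_DATE.findall(text),
        })
    return records
```

Pointing `audit_memory` at the project's memory/ folder yields the same kind of listing as the tree above, plus a non-empty `relative_dates` list for any file that still says "yesterday".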

But there's something more interesting: Claude didn't invent this structure ex nihilo. She learned it from the system we co-designed.

Compare the startup_protocol.md from Claude's memory with .exclude/WORKFLOW.md from the repo. They say practically the same thing: read AGENTS.md, then WORKFLOW.md, then next-session.md, then tasks/active/. The agent's memory is a mirror of the protocol we settled on across sessions.

This is the first important lesson: when a human and an agent co-design a clear structure, automatic memory converges toward it. Without that structure, the agent's memory generates noise. With it, it generates a reflection.

What Claude's Memory Captured Well (That Isn't in AGENTS.md)

Automatic memory shines at capturing one-off decisions that arose in a session but that the human never formalized in a permanent doc:

bobinitializer_static_ssr.md: The exact workaround to prevent BOBInitializer from hiding ChildContent in static SSR, with the Razor snippet and explanation of why. This isn't in AGENTS.md or docs/ARCHITECTURE.md. It surfaced in a debugging session and Claude caught it.
reference_rs0041_autogen.md: The detail that PublicAPI.Unshipped.txt isn't enough for Razor; you also need GlobalSuppressions.cs with one attribute per component, and that there's an autogen project that reconciles this. A toolchain detail I mentioned in passing.
changelog_policy.md: The rule that [Unreleased] stays empty until release 1.0.0. A process decision I made but never wrote down anywhere committed.

These three files are pure value. They're things I know but didn't formally document because they were "obvious" or recent. Automatic memory acts here as a conversational context backup.

What Claude's Memory Unnecessarily Duplicates

Now the problematic part:

startup_protocol.md duplicates .exclude/WORKFLOW.md (the "Session start" section).
reference_workflow_docs.md duplicates .exclude/agents/INDEX.md and AGENTS.md itself (the "Project layout" section).
user_role.md duplicates the "Language policy" section of AGENTS.md and the "I don't commit" rule from WORKFLOW.md.

These three files add no new information; they rephrase it. They're not contradictory — Claude is too good for that — but they are redundant. They consume context tokens that could be used for real work.

This is where the AutoDream parallel comes in: if I let this accumulate for 30 more sessions, those 3 files would keep growing in granularity and eventually diverge from the repo's source of truth. My system invalidates them by design (verify against live code before claiming), but AutoDream would be the one having to prune them if I didn't have that protocol.
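Redundancy of this kind can be estimated mechanically. The sketch below is an assumption-laden illustration, not part of my protocol or of AutoDream: it compares each memory file with each repo doc word by word using Python's `difflib.SequenceMatcher` and reports pairs above a threshold. The names `redundancy` and `flag_duplicates` and the 0.6 threshold are hypothetical. A word-level diff only catches near-verbatim overlap; heavily rephrased duplicates would need a semantic comparison.

```python
from difflib import SequenceMatcher

def redundancy(memory_text, source_text):
    """Similarity in [0, 1] between two texts, compared as lowercase word lists."""
    a = memory_text.lower().split()
    b = source_text.lower().split()
    return SequenceMatcher(None, a, b).ratio()

def flag_duplicates(memory_files, repo_docs, threshold=0.6):
    """Return (memory_name, doc_name, score) for every pair at or above threshold.

    Both arguments are dicts mapping a file name to its text content.
    """
    hits = []
    for m_name, m_text in memory_files.items():
        for d_name, d_text in repo_docs.items():
            score = redundancy(m_text, d_text)
            if score >= threshold:
                hits.append((m_name, d_name, round(score, 2)))
    return hits
```

A hit is a candidate for deletion from memory/, since the repo doc remains the source of truth.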

What Claude's Memory Doesn't Know (Because My System Handles It First)

Today, during this session, we created a complete plan for BlazOrbit.Charts in .exclude/plans/bob-charts.md (237 lines, 7-phase structure). Claude's memory doesn't know about it. Her last write was May 7, 2026; the plan is from May 8.

Why doesn't it matter? Because my system doesn't depend on Claude remembering. The plan lives in the repo, in a file the agent reaches at session start: next-session.md points to it, or the agent opens the plan directly if it's the active one. The agent's memory is a cache; the repo is the database. If the cache falls behind, the startup protocol refreshes it.

Phase by Phase: The Same Cycle, Named Differently

AutoDream describes four phases. My protocol doesn't name them that way, but it executes the same operations in the same order.

Phase 1: Orientation

AutoDream: ls of the memory directory, read MEMORY.md, skim existing topic files to avoid duplicates.

My system: At the start of each session, the agent executes sequentially:

1. git branch --show-current (branch context).
2. Read AGENTS.md root.
3. Read .exclude/plans/next-session.md.
4. Glance at .exclude/tasks/active/.
5. Attend to the user.

The subtle but important difference: my orientation includes version control state as the first step. Memory doesn't live in a vacuum; it lives in a repo with a branch policy. If the agent doesn't know what branch it's on before editing, architectural memory becomes dangerous.
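The orientation sequence above can be sketched as a small function. This is a minimal illustration under stated assumptions: the real protocol is prose that the agent follows, not a script, and the names `orient`, `current_branch`, `STARTUP_FILES` and `ACTIVE_TASKS_DIR` are hypothetical. The only external command is `git branch --show-current`, taken from the list itself.

```python
import subprocess
from pathlib import Path

# Read order taken from the steps above; paths are relative to the repo root.
STARTUP_FILES = ["AGENTS.md", ".exclude/plans/next-session.md"]
ACTIVE_TASKS_DIR = ".exclude/tasks/active"

def current_branch(root):
    """Ask git for the current branch; the result is '' on a detached HEAD."""
    result = subprocess.run(
        ["git", "branch", "--show-current"],
        cwd=root, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

def orient(root, get_branch=current_branch):
    """Collect startup context in protocol order: branch, then docs, then tasks."""
    root = Path(root)
    context = {"branch": get_branch(root), "docs": {}, "active_tasks": []}
    for rel in STARTUP_FILES:
        path = root / rel
        if path.is_file():
            context["docs"][rel] = path.read_text(encoding="utf-8")
    tasks = root / ACTIVE_TASKS_DIR
    if tasks.is_dir():
        context["active_tasks"] = sorted(p.name for p in tasks.iterdir())
    return context
```

Putting the branch lookup first mirrors the point made above: nothing else is read until the version control state is known. The `get_branch` parameter exists only so the function can be exercised without a git checkout.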

Audit evidence: Claude's memory (startup_protocol.md) has internalized this exact list. The agent learned the orientation I imposed and now repeats it as its own memory. It's a perfect example of how a well-designed human protocol becomes the agent's "dream."

Phase 2: Gather Signal

AutoDream: Search session transcripts for specific patterns: user corrections, "remember this," architectural decisions, recurring themes. Uses narrow grep, not exhaustive reading.

My system: I don't search transcripts for signal (I don't store them as JSONL; Claude does, but I don't depend on them). Signal lives in two places:

.exclude/plans/next-session.md: a rolling backlog with sections Candidate streams, Deferred (with a deferred: N counter), and Loose notes / ideas. It's the equivalent of the "daily logs" AutoDream mentions as priority #1.
The live code itself: the protocol's hard rule is "I verify before claiming." If a memory assertion is going to guide an action, the agent must validate it against the code using Glob/Grep/Read before acting.
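The "verify before claiming" rule can be sketched as a gate in front of any memory-driven action. This is an illustration only: in practice the agent uses its own Glob/Grep/Read tools, and the names `verify_claim` and `act_on_memory`, the status strings, and the default `**/*.cs` glob are assumptions.

```python
import re
from pathlib import Path

def verify_claim(root, pattern, glob="**/*.cs"):
    """Return files under root whose content matches pattern.

    An empty list means the live code no longer backs the claim.
    """
    regex = re.compile(pattern)
    matches = []
    for path in sorted(Path(root).glob(glob)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        if regex.search(text):
            matches.append(path.relative_to(root).as_posix())
    return matches

def act_on_memory(root, claim, pattern, glob="**/*.cs"):
    """Trust a memory assertion only if the code still contains evidence for it."""
    evidence = verify_claim(root, pattern, glob)
    status = "verified" if evidence else "stale: re-read before acting"
    return {"claim": claim, "status": status, "evidence": evidence}
```

The design choice is that memory never authorizes an action by itself; it only tells the agent where to look.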

Audit evidence: The three memory files that deliver the most value (bobinitializer_static_ssr.md, reference_rs0041_autogen.md, changelog_policy.md) are exactly the kind of signal that surfaces in session and that a human doesn't formalize. My next-session.md has a "Loose notes / ideas" section for this, but Claude's memory does complementary work: it converts those loose notes into permanent topic files. On a project without my protocol, those three details would have been lost. In my project, both systems coexist: the human jots in next-session.md, the agent consolidates in memory/.

Phase 3: Consolidation

AutoDream: Merge new signal into existing topic files, convert relative dates to absolute ones, remove contradicted facts, prune obsolete memories, merge overlapping entries.

My system: Consolidation is not a periodic batch; it's a continuous rule:

"After modifying code: if the change touches a contract documented in AGENTS.md or .exclude/agents/, update the corresponding section in the same turn."

There's no sleep phase. Every time code changes, memory updates in place. If I discover drift between doc and code, there are three options:

If trivial (typo, renamed item): fix doc in the turn.
If structural: fix what I can and flag the rest as an item in next-session.md.
If a .exclude/ file becomes obsolete: delete it, or strip its discarded and completed items. There's no "history" file. Memory is a working cache, not a log.

AutoDream converts "yesterday" into "2026-03-15." My system does something equivalent but distributed: .exclude/agents/INDEX.md marks each area with a verification date (stable (2026-05-05)). When a section is re-confirmed against the code, the date updates. When a date is months old, the agent knows to re-read that area before trusting it.
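The verification-date idea can be sketched as follows. The row layout `<area> | <status> (<YYYY-MM-DD>)` is a hypothetical stand-in for however INDEX.md is actually formatted, and the 60-day window is an assumption; only the `stable (2026-05-05)` marker comes from the text above.

```python
import re
from datetime import date

# Hypothetical INDEX.md row layout: "<area> | <status> (<YYYY-MM-DD>)".
ROW = re.compile(
    r"^(?P<area>[\w.-]+)\s*\|\s*(?P<status>\w+)\s*\((?P<date>\d{4}-\d{2}-\d{2})\)"
)

def areas_to_revalidate(index_text, today, max_age_days=60):
    """Return (area, status, age_days) for areas marked drift or verified too long ago."""
    stale = []
    for line in index_text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # headings, blank lines, anything that is not an area row
        age = (today - date.fromisoformat(m["date"])).days
        if m["status"] == "drift" or age > max_age_days:
            stale.append((m["area"], m["status"], age))
    return stale
```

With `today = date(2026, 5, 8)`, an area last verified on 2026-05-05 is left alone, while one last verified on 2026-01-10 comes back with an age of 118 days and is re-read before being trusted.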

Audit evidence: Claude's memory has no relative dates; it already uses absolute ones (2026-05-02, 2026-05-03). That's good. But it also has no "stability date" mechanism like my INDEX.md. If bobinitializer_static_ssr.md becomes obsolete because I change the framework, Claude has no convention to mark it as drift. My system does: the area would be marked as drift in INDEX.md and the agent would revalidate it in the next session.

Phase 4: Prune and Index

AutoDream: Keep MEMORY.md under 200 lines, remove pointers to dead files, reorder by relevance, resolve contradictions between index and content.

My system: This is where my protocol is more structured than AutoDream. There's not a single index with one cap; there are cascading structural caps:

| File | Soft Cap | Action When Exceeded |
| --- | --- | --- |
| `AGENTS.md` (root) | 350 lines | Move section to `.exclude/agents/<area>.md` |
| `.exclude/WORKFLOW.md` | 250 lines | Move heuristics to `.exclude/agents/conventions.md` |
| `.exclude/agents/<area>.md` | 500 lines | Subdivide by sub-area |
| `.exclude/agents/INDEX.md` | 100 lines | Re-organize table by domain |
| `.exclude/plans/*` | 200 lines | Split into sub-plans |

When a single section of the root AGENTS.md grows beyond 150 lines, that section moves into .exclude/agents/<area>.md and leaves a summary of at most 20 lines plus a pointer in the root. The index (INDEX.md) is a living map that promotes or demotes material according to its stability.
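The cascading caps from the table can be expressed as data plus one check. This is a sketch under assumptions: in my protocol the caps are enforced by the agent following WORKFLOW.md, not by a script, and the names `SOFT_CAPS` and `check_caps` are hypothetical. The caps and actions themselves are copied from the table.

```python
from fnmatch import fnmatchcase

# Caps and actions copied from the table above. Patterns are tried in order and
# the first match wins, so INDEX.md must come before the generic agents pattern.
SOFT_CAPS = [
    ("AGENTS.md", 350, "Move section to .exclude/agents/<area>.md"),
    (".exclude/WORKFLOW.md", 250, "Move heuristics to .exclude/agents/conventions.md"),
    (".exclude/agents/INDEX.md", 100, "Re-organize table by domain"),
    (".exclude/agents/*.md", 500, "Subdivide by sub-area"),
    (".exclude/plans/*", 200, "Split into sub-plans"),
]

def check_caps(line_counts):
    """Given {relative_path: line_count}, list every file that exceeds its soft cap.

    Each result is (path, lines, cap, action).
    """
    over = []
    for path, lines in sorted(line_counts.items()):
        for pattern, cap, action in SOFT_CAPS:
            if fnmatchcase(path, pattern):
                if lines > cap:
                    over.append((path, lines, cap, action))
                break  # first matching rule decides
    return over
```

For example, `check_caps({".exclude/plans/bob-charts.md": 237})` reports that file against the 200-line plan cap with the action "Split into sub-plans".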

AutoDream deletes random-notes.md. My system doesn't even allow it to be created: if a note doesn't fit into an existing area, it either becomes an ad-hoc plan with "done" criteria, or it gets discarded.

Audit evidence: Claude's MEMORY.md has 6 lines (excellent), but the pruning pressure is external: I impose it through the protocol. Without WORKFLOW.md reminding the agent not to create loose files, Claude's memory would have 20 files instead of 6. The structure of Claude's index is an effect, not a cause. The cause is the system.

Where My System Goes Beyond AutoDream

Anthropic's architecture solves the memory decay problem for a generic agent. My system solves the same problem but adds layers that only make sense when a human and an agent collaborate for months on a real software project.

1. Deliberate Bilingualism

AutoDream operates in a single language (English). Claude's memory for my project is also in English, even though I prefer Spanish in conversation. My system intentionally separates:

English: code, XML docs, root AGENTS.md, commits, public artifacts.
Spanish: .exclude/WORKFLOW.md, conversation with the user, planning notes.

Why? Because AGENTS.md is read by the agent at every session, and the agent reasons better in English for technical identifiers. But the protocol is designed and reviewed by the human, and my working language is Spanish. AutoDream doesn't have this separation because Claude doesn't converse with a human in their native language within the dream cycle. Mine does.

2. Public / Private Split

AutoDream doesn't distinguish between memory the project needs to publish and memory that's just for the agent. My system has an explicit boundary:

docs/ARCHITECTURE.md: the public face. Contains the same rules as .exclude/agents/ but without internal references. It's what I read when I need to remember how the build pipeline works.
.exclude/agents/: the private face. Includes operational heuristics like "when in doubt, if breaking it isn't caught by CI, it's a soft rule."

This separation means I can open the repo publicly without leaking my agent interaction protocol. AutoDream, living in ~/.claude/, is inherently private, but it has no mechanism for selective promotion to the project.

3. Modular Exportable Skills

The .agents/skills/blazorbit-user/ directory is a reusable knowledge package that doesn't belong to a single repo. It contains:

SKILL.md: the mental model and golden rules.
references/components.md, references/variants.md, references/icons.md: auto-regenerated catalogs built by reflection over the compiled assembly.

This is a level of abstraction AutoDream doesn't have: the ability to export a complete knowledge domain and reuse it in other contexts. If tomorrow I start a project that consumes BlazOrbit, I don't copy .exclude/agents/; I activate the blazorbit-user skill and the agent already knows how to use the library without me explaining anything.

4. Deferred Counter with Forced Decision Threshold

AutoDream triggers once 24 hours and 5 sessions have accumulated. My system has a more aggressive aging mechanism:

"Items with deferred: N indicate how many sessions they've been deferred; upon reaching 3, a decision is mandatory: (a) promote to top-priority, (b) downgrade to loose ideas, (c) delete."

There's no sleep cycle waiting. If something has gone untouched for three sessions, the protocol forces a decision in the fourth. This prevents the backlog from becoming a graveyard of "maybe someday."
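The aging rule can be sketched in a few lines. The bullet layout `- item (deferred: N)` and the idea of bumping every counter once per session are assumptions for illustration; the threshold of 3 comes from the rule quoted above. The name `age_deferred` is hypothetical.

```python
import re

DEFERRED = re.compile(r"deferred:\s*(\d+)")
THRESHOLD = 3  # from the rule quoted above

def age_deferred(lines):
    """Bump each 'deferred: N' counter by one session.

    Returns (aged_lines, must_decide), where must_decide holds the items that
    reached the threshold and now need promotion, downgrade, or deletion.
    """
    aged, must_decide = [], []
    for line in lines:
        m = DEFERRED.search(line)
        if m:
            n = int(m.group(1)) + 1
            line = DEFERRED.sub(f"deferred: {n}", line, count=1)
            if n >= THRESHOLD:
                must_decide.append(line)
        aged.append(line)
    return aged, must_decide
```

Lines without a counter, such as loose notes, pass through unchanged.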

5. Branch Policy as Part of Memory

AutoDream is read-only with respect to code during sleep. My system goes further: memory includes the rule of when not to edit. The protocol requires:

If on develop, suggest creating a derived branch before editing.
If on master or a tag, stop and ask.
Short-lived PRs (1–3 days).

Architectural memory is useless if the agent applies it on the wrong branch. AutoDream assumes the execution environment is safe; my system doesn't trust that premise.
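The branch rules above reduce to a small decision function. This is a sketch, not the protocol text: the name `edit_policy` and the returned labels are hypothetical, and treating every other branch as safe to edit is an assumption. The empty string stands for a detached HEAD, which is what `git branch --show-current` prints when a tag is checked out.

```python
def edit_policy(branch):
    """Map the current branch name to the action required before any edit."""
    if branch in ("master", ""):
        # '' means detached HEAD, for example a checked-out tag.
        return "stop-and-ask"
    if branch == "develop":
        return "suggest-derived-branch"
    # Assumption: any other branch is a short-lived working branch.
    return "edit"
```

Feeding this function the branch name gathered during orientation keeps the "when not to edit" rule ahead of any architectural memory.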

The Four Invariants That Make This Work

AutoDream and my system look different in implementation, but they share the same design invariants. These are what really matter:

Invariant 1: The Startup Index Has a Line Cap

AutoDream: 200 lines for MEMORY.md.

My system: a soft cap of 350 lines for AGENTS.md, with migration pressure once any single section passes 150 lines.

Why it works: If the agent reads too much on startup, it consumes context window it needs for real work. A short index forces deep knowledge to be referenced, not included.

Evidence: Claude's MEMORY.md for my project has exactly 6 lines. It's not Anthropic magic; it's that the system only generates 6 topic files worth remembering, because everything else is in the repo or was discarded.

Invariant 2: Deep Memory Is Partitioned by Domain

AutoDream: topic files (debugging.md, api-conventions.md).

My system: .exclude/agents/<area>.md (css-architecture.md, component-lifecycle.md, testing.md).

Why it works: A monolithic memory file forces the agent to load debugging patterns when it just wants to know how the CSS pipeline works. Partitioning allows loading on demand.

Invariant 3: Memory Is a Cache, Not a Log

AutoDream: removes contradicted entries, prunes obsolete ones, deletes random-notes.md.

My system: deletes obsolete files from .exclude/; there is no done/ and no archive/.

Why it works: If you keep everything, the agent wastes time reasoning about facts that are no longer true. Memory must be aggressively destructive. The value isn't in what you remembered, but in what you decided not to remember.

Evidence: Claude's memory has no bob-charts.md. The plan exists in the repo (.exclude/plans/bob-charts.md, 237 lines), but since it's a temporary plan rather than a project invariant, Claude didn't memorize it. Correct. If the plan went to memory/, it would be noise tomorrow once it's implemented.

Invariant 4: The Final Source of Truth Is the Code, Not the Memory

AutoDream: converts relative dates to absolute ones to avoid temporal confusion.

My system: "Verify before claiming" — the agent must re-read the code before acting on a memory.

Why it works: Notes can lie. Comments can be outdated. The only artifact that doesn't lie is the executable code. Memory that isn't grounded in the reality of the repo is hallucination with a timestamp.

Evidence: startup_protocol.md in Claude's memory duplicates .exclude/WORKFLOW.md. If Claude acted on startup_protocol.md without re-reading WORKFLOW.md, she could be working from a stale version. My protocol prevents this: the agent reads the live file before the memory cache.

Conclusion: AI Memory Isn't Magic, It's Information Discipline

AutoDream is an excellent implementation of a pattern that, until now, every Claude Code user had to build by hand or suffer the consequences. Anthropic has done the work of systematizing it, and that's valuable.

But what has worked for me over months in BlazOrbit isn't the automation of sleep; it's the waking protocol. It's the decision that memory lives in the repo, not in ~/.claude/. It's the separation between index and depth. It's the rule that doc and code update in the same turn. It's the line cap that forces restructuring before memory becomes noise.

The audit of Claude's actual memory confirms something I suspected: my system doesn't replace automatic memory; it feeds and disciplines it. Without AGENTS.md and .exclude/WORKFLOW.md, Claude's memory would have generated 20 disorganized files. With them, it generated 6 clean files that faithfully mirror the human protocol. But even so, the agent's memory duplicates information already in the repo and falls behind recent plans. Those are the gaps that only a human protocol, reaffirmed session by session, can close.

If you're building a project with AI agents, don't wait for AutoDream to arrive. You can have an equivalent system today with four markdown files and a couple of golden rules:

A short index you read at startup.
Topic files by domain, with size caps.
A rolling backlog that ages aggressively.
And above all: memory is reviewed against the code, never the other way around.

The rest is implementation details.

Memory system and continuous session audit for BlazOrbit. Audited and written by Kimi K2.6 as a technical comparison with AutoDream. Reviewed and corrected by the author.
