I ran OpenClaw Dreaming for a full week on top of my existing memory stack to answer one question: does Dreaming actually improve memory quality, or does it just inflate memory volume?
Both. It surfaced real signal I would have lost. It also dumped enough boilerplate into the promotion stream to prove structured memory still has to be the foundation. If you want the official feature overview first, OpenClaw's Dreaming docs are here: Dreaming. After a week, Dreaming stays on, but as a supporting layer. Not the system.
## The baseline was already working
This trial did not start from zero. The stack was already in daily use:
- Daily logs in `memory/YYYY-MM-DD.md` for raw continuity
- Atomic knowledge cards in `memory/cards/*.md` for durable facts and lessons
- A slim `MEMORY.md` acting as an index, not a data landfill
- Semantic retrieval over cards using local embeddings
- A Memory Sweep cron for review and promotion discipline
That architecture exists because monolithic memory eventually collapses under its own weight. Retrieval gets noisy, cost climbs, and the agent starts missing things that are technically "in memory" but practically unrecoverable. Structured memory fixes that by treating memory as a retrieval system instead of a dump file.
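To make "retrieval system instead of a dump file" concrete, here is a minimal sketch of semantic retrieval over atomic cards. The `embed` function is a stand-in for a local embedding model (the real stack presumably uses proper dense embeddings); the ranking logic is the part that matters.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a local embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search_cards(query: str, cards: dict, top_k: int = 3) -> list:
    """Rank atomic memory cards by similarity to the query."""
    q = embed(query)
    ranked = sorted(cards.items(), key=lambda kv: cosine(q, embed(kv[1])), reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Hypothetical card names and contents, for illustration only.
cards = {
    "discord-threads.md": "discord thread creation fails on nested turns",
    "grocery-tracking.md": "grocery receipts feed a durable tracking workflow",
}
print(search_cards("why do discord threads fail", cards, top_k=1))
```

The point of the card-per-fact layout is that each retrieval hit is already atomic: no re-chunking a monolithic file, no dragging unrelated context into the prompt.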
## Trial config
Dreaming was enabled on 2026-04-06 as a one-week trial with nightly cadence:
```
dreaming.enabled=true
dreaming.frequency="0 3 * * *"
```
Documented as a trial, not a migration, with explicit concern about noisy promotions. A review cron was scheduled for 2026-04-14 to evaluate impact after one full week of live usage.
Nothing else in the memory pipeline was touched. Cards, logs, retrieval, and sweep all stayed active so Dreaming could be evaluated as a pure additive layer.
## What Dreaming actually does
From observed behavior, Dreaming runs a nightly retrospective pass over short-term recall:
- grounded REM/backfill flow
- diary-style processing
- candidate durable-fact extraction
- promotion hooks into long-term memory surfaces
In plain terms, it is a second-pass recall mechanism. It can rescue durable information that never got manually promoted during the day. That is real value in long, messy sessions.
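A second-pass recall mechanism can be sketched in a few lines. This is an assumed heuristic, not OpenClaw's actual extractor: scan the day's log for lines that read like durable operating rules rather than chatter.

```python
import re

def extract_candidates(daily_log: str) -> list:
    """Second-pass recall sketch: pull log lines that look like durable
    facts. The marker list is a hypothetical heuristic, not the real flow."""
    durable_markers = re.compile(r"\b(always|never|must|fails when|works reliably)\b", re.I)
    candidates = []
    for line in daily_log.splitlines():
        line = line.strip("- ").strip()
        if durable_markers.search(line):
            candidates.append(line)
    return candidates

log = """\
- chatted about lunch
- thread creation works reliably from a fresh inbound turn
- HEARTBEAT_OK
"""
print(extract_candidates(log))
```

Even this toy version shows the value and the risk at once: it rescues the thread-creation rule, but everything rides on how good the "looks durable" filter is.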
## One-week health check
Core memory infrastructure came out clean:
- Main memory healthy
- Embeddings ready
- Vector search ready
- FTS ready
- Recall store active
- Dreaming cron active
- `memory_search` returning relevant results
Operationally, nothing regressed. Cards stayed structurally normal, daily logs kept writing, semantic retrieval kept working.
So "did Dreaming break memory" was never the question. It did not. The question was quality.
## Memory Sweep is the comparison that matters
Sweep is the reference point because it has been doing the same job Dreaming now claims to do, just more conservatively.
And to be fair to Sweep, the cron reports show it was not sitting there idle. Over the same week, it was reviewing real sessions and persisting useful state with pretty solid discipline.
A few examples from the sweep channel:
- April 9: reviewed non-cron sessions and updated a durable agent-workflow card.
- April 10: turned grocery receipts into a durable tracking workflow.
- April 11: handled a security incident conservatively, logging what mattered without duplicating existing cards.
- April 13 to April 15: kept the Lazarus Group research card current while threat-assessment sections and supporting research were still moving.
- April 16: created an xMCP service-ops card and logged the operational follow-up cleanly.
That is not a cron doing nothing. That is a curation layer doing triage. Sweep evaluates session material, skips cron, heartbeat, and helper noise, checks whether the durable information already exists, and only writes when something actually changes or deserves promotion.
That restraint matters. Some nights the correct outcome really is "no new cards." But across the week, Sweep still created or updated cards for grocery tracking, agent-workflow rules, Lazarus Group research, blog publishing rules, xMCP service operations, and malware-response documentation. It also kept daily logs current without flooding memory with duplicate fragments.
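Sweep's triage discipline can be sketched as a simple filter chain. The session shapes and field names here are assumptions for illustration, not Sweep's real data model, but the ordering is the point: drop automated noise first, dedupe against existing cards second, and only then persist.

```python
def sweep_triage(sessions: list, existing_cards: set) -> list:
    """Sweep-style triage sketch (assumed logic, not the real cron):
    skip automated noise, dedupe against existing cards, persist the rest."""
    NOISE_KINDS = {"cron", "heartbeat", "helper"}
    to_promote = []
    for session in sessions:
        if session["kind"] in NOISE_KINDS:
            continue  # automated sessions carry no new durable facts
        fact = session["durable_fact"]
        if fact in existing_cards:
            continue  # already persisted; writing again would duplicate
        to_promote.append(fact)
    return to_promote

sessions = [
    {"kind": "heartbeat", "durable_fact": "HEARTBEAT_OK"},
    {"kind": "chat", "durable_fact": "grocery receipts feed a tracking workflow"},
    {"kind": "chat", "durable_fact": "blog posts need a publishing checklist"},
]
print(sweep_triage(sessions, existing_cards={"blog posts need a publishing checklist"}))
```

Note that an empty return value is a legitimate outcome here, which is exactly the "some nights the correct outcome is no new cards" behavior.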
Dreaming, over the same window, promoted a handful of genuinely useful durable facts and a lot of transcript residue. Same job, different discipline. Sweep's default is "persist carefully after review." Dreaming's default is closer to "surface candidates broadly and let cleanup happen later." That difference is the whole story.
## Dreaming quality: real signal, real noise
### The good
Useful promotions did show up, and they were more specific than "Dreaming found something interesting."
A few examples of what it actually added:
- It recovered a real ACP constraint: Discord thread creation works reliably from a fresh inbound turn, but nested or yielded turns can collapse into `webchat` and fail.
- It helped move an agent lane from "probably working" to a verified workflow, turning that discovery into a durable card instead of leaving it buried in chat history.
That part matters. Those are real operating rules that affect how I route agent work and catch workflow hiccups, not just vague themes Dreaming happened to notice.
### The bad
Staged recall also contained a lot of debris:
- heartbeat boilerplate (`HEARTBEAT_OK`)
- silent sentinel text (`NO_REPLY`)
- tiny one-line chat fragments
- metadata-heavy transcript sludge
This is the limitation. Without strict filtering, Dreaming will keep surfacing things that are technically recallable and semantically worthless.
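The fix for "recallable but worthless" is a hard gate in front of promotion. A minimal sketch, assuming a blocklist plus a token-count floor (the threshold of 6 is an arbitrary illustration, not a tuned value):

```python
BOILERPLATE = {"HEARTBEAT_OK", "NO_REPLY"}
MIN_TOKENS = 6  # assumed threshold; tune per workload

def keep_for_promotion(fragment: str) -> bool:
    """Drop recallable-but-worthless fragments before they reach promotion."""
    text = fragment.strip()
    if text in BOILERPLATE:
        return False  # sentinel/heartbeat boilerplate
    if len(text.split()) < MIN_TOKENS:
        return False  # tiny one-line chat fragments
    return True

staged = ["HEARTBEAT_OK", "ok", "nested turns can collapse into webchat and fail"]
print([f for f in staged if keep_for_promotion(f)])
```

A gate like this is cheap, deterministic, and runs before any semantic scoring, so the expensive steps never even see the debris.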
## Where Dreaming writes
During the trial, Dreaming artifacts showed up in:
- `DREAMS.md` and workspace equivalents
- `memory/.dreams/*` (session corpus and short-term recall JSON)
- `MEMORY.md` promoted sections tagged with `openclaw-memory-promotion`
- daily logs with Light and REM candidate traces
Cards and daily logs stayed intact. The promotion stream is what needs quality controls.
## Complement, not core
After one week the architecture answer is obvious:
- Structured memory (cards, slim index, retrieval, Sweep) is the core
- Dreaming is a useful second-pass promotion layer
Dreaming catches what day-of workflows miss. It is not trustworthy enough yet to be the primary curation mechanism. That is not a failure, it is a role.
## What I am keeping, what I am tuning
Keeping Dreaming enabled. Tightening the promotion side:
- heavier penalties for boilerplate tokens
- stricter filtering against low-information one-liners
- lower promotion likelihood for metadata-only fragments
- human-curated cards stay the authoritative path
The goal is not maximal recall. The goal is durable, retrievable memory that stays useful under load.
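The tuning list above amounts to a scoring function with penalties. Here is one illustrative shape (the weights, threshold, and metadata heuristic are all made up for the sketch; OpenClaw's real scorer is not public to me):

```python
def promotion_score(fragment: str) -> float:
    """Score a staged fragment for promotion. Illustrative weights only."""
    tokens = fragment.split()
    score = min(len(tokens) / 20, 1.0)  # longer fragments carry more signal, capped
    if any(t in {"HEARTBEAT_OK", "NO_REPLY"} for t in tokens):
        score -= 1.0   # heavier penalty for boilerplate tokens
    if len(tokens) < 6:
        score -= 0.5   # stricter filtering of low-information one-liners
    if tokens and all(":" in t or "=" in t for t in tokens):
        score -= 0.5   # lower likelihood for metadata-only fragments
    return score

THRESHOLD = 0.3  # assumed cutoff
candidates = [
    "HEARTBEAT_OK",
    "status=ok kind=cron",
    "thread creation fails on nested yielded turns",
]
print([c for c in candidates if promotion_score(c) >= THRESHOLD])
```

Human-curated cards stay outside this scoring path entirely, which is what "authoritative path" means in practice: the scorer only gates automated promotions.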
## Verdict
Dreaming is useful. Structured memory is better. 🦞
That is not a contradiction, it is the right layering. Use Dreaming to recover signal from transcript residue. Use structured memory to decide what deserves to live long-term. Blend the roles correctly and you get better continuity without turning your memory system into a junk drawer.
Originally published at solomonneas.dev/blog/dreaming-useful-structured-memory-better. Licensed under CC BY-NC-ND 4.0 - attribution required, no commercial use, no derivatives.