<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: 이종국 (barw)</title>
    <description>The latest articles on DEV Community by 이종국 (barw) (@_barw_bc0ab3c2937660).</description>
    <link>https://dev.to/_barw_bc0ab3c2937660</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3934807%2Ffd64f668-405a-4ca6-89a4-fa42b253774f.png</url>
      <title>DEV Community: 이종국 (barw)</title>
      <link>https://dev.to/_barw_bc0ab3c2937660</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/_barw_bc0ab3c2937660"/>
    <language>en</language>
    <item>
      <title>Claudelab: a manually-gated alternative to Anthropic's agent 'Dreaming'</title>
      <dc:creator>이종국 (barw)</dc:creator>
      <pubDate>Sat, 16 May 2026 12:04:28 +0000</pubDate>
      <link>https://dev.to/_barw_bc0ab3c2937660/claudelab-a-manually-gated-alternative-to-anthropics-agent-dreaming-59l2</link>
      <guid>https://dev.to/_barw_bc0ab3c2937660/claudelab-a-manually-gated-alternative-to-anthropics-agent-dreaming-59l2</guid>
      <description>&lt;p&gt;A disclaimer before anything else: this is not a framework, and not a&lt;br&gt;
product. It's a handful of markdown conventions plus one small Python script —&lt;br&gt;
a personal habit for keeping my own Claude Code setup from rotting. Later in&lt;br&gt;
this post I put it next to Anthropic's "Dreaming"; I want to be clear up front&lt;br&gt;
that the comparison is about a shared &lt;em&gt;problem&lt;/em&gt;, not a shared &lt;em&gt;scale&lt;/em&gt;. If "a few&lt;br&gt;
markdown files" sounds too small to be worth your time, that's a fair call —&lt;br&gt;
feel free to bail here.&lt;/p&gt;

&lt;p&gt;If you run long-lived projects through Claude Code, you've probably noticed its&lt;br&gt;
memory system (a &lt;code&gt;MEMORY.md&lt;/code&gt; index plus topic files) is genuinely useful — and&lt;br&gt;
also that it rots.&lt;/p&gt;

&lt;p&gt;After a few months of daily use across several projects, I kept hitting the same&lt;br&gt;
three failure modes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Audit inflation. You start logging decisions ("tried X, didn't work"),&lt;br&gt;
encoding rules ("always typecheck before push"), recording context. Then one day&lt;br&gt;
you notice you spend more time documenting decisions than making them. Memory&lt;br&gt;
files multiply. Cross-references break.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Project pollution. A throwaway meta-experiment — "let me check if this MCP&lt;br&gt;
server works" — leaks into a real project's commit history and workspace. Six&lt;br&gt;
months later you can't tell signal from scratchpad.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Rule drift. You write a governance rule on Monday. On Tuesday you bypass it&lt;br&gt;
with "this time is different." The rule was never wrong; you just stopped&lt;br&gt;
enforcing it.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I built a small workbench pattern to fight all three. It's called claudelab,&lt;br&gt;
and it's deliberately boring: a few markdown conventions and one Python script.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Audit inflation cap — an explicit per-session limit on new memory files.
Hit it, and only justified exceptions (a real incident, an external audit) let
you add more. It forces you to stop documenting and go build.&lt;/li&gt;
&lt;li&gt;REJECTED archive — every killed idea is logged with a &lt;em&gt;reason&lt;/em&gt; and a
&lt;em&gt;reversal trigger&lt;/em&gt;. You don't get to re-litigate a dead idea without new evidence.&lt;/li&gt;
&lt;li&gt;Closed-loop self-audit — a health-check script finds broken cross-references,
orphan files, and stale entries. The system stays consistent without a human
scanning it.&lt;/li&gt;
&lt;li&gt;Track-separated governance — real projects, meta-experiments, and research
each live under different rules. Strict where it matters, loose where it doesn't.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's the part that made me write this up.&lt;/p&gt;

&lt;p&gt;While building claudelab, I didn't know Anthropic had just shipped Dreaming —&lt;br&gt;
a managed feature, announced in May 2026, that does automated &lt;em&gt;between-session&lt;/em&gt;&lt;br&gt;
memory consolidation: it reviews past sessions, prunes stale notes, merges&lt;br&gt;
duplicates, resolves contradictions, and writes structured playbooks for future&lt;br&gt;
runs. The legal-AI company Harvey reportedly saw task completion rates jump ~6x&lt;br&gt;
after adopting it.&lt;/p&gt;

&lt;p&gt;When I found out, my first reaction was "well, that's the same problem." And it&lt;br&gt;
is — but the two solutions sit at opposite ends:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Anthropic Dreaming&lt;/th&gt;
&lt;th&gt;claudelab&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Memory consolidation&lt;/td&gt;
&lt;td&gt;automatic, between sessions&lt;/td&gt;
&lt;td&gt;manual, explicitly gated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Where it runs&lt;/td&gt;
&lt;td&gt;managed / hosted&lt;/td&gt;
&lt;td&gt;your local files, your repo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recurring-mistake surfacing&lt;/td&gt;
&lt;td&gt;learned, implicit&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;REJECTED.md&lt;/code&gt; — explicit reason + reversal trigger&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stops over-documentation&lt;/td&gt;
&lt;td&gt;not a stated goal&lt;/td&gt;
&lt;td&gt;audit inflation cap is the &lt;em&gt;core&lt;/em&gt; constraint&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auditable&lt;/td&gt;
&lt;td&gt;yes (plain-text playbooks)&lt;/td&gt;
&lt;td&gt;yes (plain-text, in your git)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;To be clear about the timeline: Dreaming was &lt;em&gt;announced before&lt;/em&gt; I'd finished&lt;br&gt;
claudelab's tooling. This isn't a priority claim — it's independent convergence.&lt;br&gt;
The interesting thing isn't who was first; it's that the problem is real enough&lt;br&gt;
that two people hit it and picked opposite trade-offs.&lt;/p&gt;

&lt;p&gt;Dreaming optimizes for scale and autonomy — agents that get better&lt;br&gt;
unsupervised, the longer they run. claudelab optimizes for one human staying&lt;br&gt;
in the loop without drowning in audit overhead. If you run fleets of managed&lt;br&gt;
agents, Dreaming is the answer. If you want a transparent, self-hosted workbench&lt;br&gt;
where every memory change is something you consciously approved, that's&lt;br&gt;
claudelab — and the two aren't mutually exclusive.&lt;/p&gt;

&lt;p&gt;One deliberate non-feature: claudelab does &lt;em&gt;not&lt;/em&gt; automate memory updates with&lt;br&gt;
hooks. That sounds backwards for a "self-evolving" system, but auto-creating&lt;br&gt;
memory is just audit inflation with extra steps — rule drift happens faster than&lt;br&gt;
a human notices. Manual gating &lt;em&gt;is&lt;/em&gt; the feature.&lt;/p&gt;

&lt;p&gt;It's MIT, it's alpha, and it's a transferable pattern more than a product. If you&lt;br&gt;
run agents long-term, I'd genuinely like to know whether the audit-inflation cap&lt;br&gt;
resonates — or whether I've built an elaborate solution to a problem only I have.&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/whdrnr2583-cmd/claudelab" rel="noopener noreferrer"&gt;https://github.com/whdrnr2583-cmd/claudelab&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>opensource</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
