<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: jimovonz</title>
    <description>The latest articles on DEV Community by jimovonz (@jimovonz).</description>
    <link>https://dev.to/jimovonz</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3839682%2F69ad9059-1204-483e-a47c-fc5fd8c2b4d9.png</url>
      <title>DEV Community: jimovonz</title>
      <link>https://dev.to/jimovonz</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jimovonz"/>
    <language>en</language>
    <item>
      <title>I built "Invisible" Persistent Memory for Claude Code</title>
      <dc:creator>jimovonz</dc:creator>
      <pubDate>Mon, 23 Mar 2026 09:20:44 +0000</pubDate>
      <link>https://dev.to/jimovonz/i-built-invisible-persistent-memory-for-claude-code-5emi</link>
      <guid>https://dev.to/jimovonz/i-built-invisible-persistent-memory-for-claude-code-5emi</guid>
      <description>&lt;h2&gt;
  
  
  If you use Claude Code heavily, you've hit the wall.
&lt;/h2&gt;

&lt;p&gt;You spend an hour explaining your project's architecture, your library preferences, and the specific quirks of your legacy code. Then the session ends. The next time you boot up &lt;code&gt;claude&lt;/code&gt;, it starts from zero. You have to re-explain everything. Even with massive context windows, every new session is a fresh case of amnesia.&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;Cairn&lt;/strong&gt; to fix this. It's an open-source persistent memory system that lives entirely on your machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  The "Exploit": The Invisible Control Plane
&lt;/h2&gt;

&lt;p&gt;I discovered a loophole in how Claude Code renders output.&lt;/p&gt;

&lt;p&gt;Claude Code automatically strips XML-style tags (like &lt;code&gt;&amp;lt;example&amp;gt;...&amp;lt;/example&amp;gt;&lt;/code&gt;) from its final display. If Claude writes a tag in its response, the user never sees it, but the raw response still contains it.&lt;/p&gt;

&lt;p&gt;Claude Code's hook system has access to that raw response &lt;em&gt;before&lt;/em&gt; it's stripped. I realized this gap—between what Claude writes and what you see—is an &lt;strong&gt;invisible control plane&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Cairn Works
&lt;/h2&gt;

&lt;p&gt;At the end of every response, Cairn has Claude append a hidden &lt;code&gt;&amp;lt;memory&amp;gt;&lt;/code&gt; block:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;Here's the refactored authentication module...

&lt;span class="nt"&gt;&amp;lt;memory&amp;gt;&lt;/span&gt;
- type: decision
- topic: auth-approach
- content: Use JWT for stateless auth — rejected session cookies
  because the API is consumed by mobile clients.
- complete: true
- context: sufficient
&lt;span class="nt"&gt;&amp;lt;/memory&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;You see:&lt;/strong&gt; &lt;em&gt;"Here's the refactored authentication module..."&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;The Hook sees:&lt;/strong&gt; The hidden structured data.&lt;/p&gt;

&lt;p&gt;Cairn's parser captures that block, generates a semantic embedding using a local model, and stores it in a SQLite database. In your next session—even days later or in a different directory—Cairn searches the database and injects relevant "memories" back into the prompt.&lt;/p&gt;

&lt;p&gt;Claude "remembers" your decisions because it wrote them down for itself.&lt;/p&gt;
&lt;h2&gt;
  
  
  Stopping the "I'll do that now..." Lie
&lt;/h2&gt;

&lt;p&gt;We've all seen it: Claude says, &lt;em&gt;"I'll check the logs for that error,"&lt;/em&gt; and then the terminal prompt just returns. It didn't check the logs. It just... stopped.&lt;/p&gt;

&lt;p&gt;Cairn solves this using &lt;strong&gt;Mechanical Enforcement&lt;/strong&gt;. It acts as an agentic supervisor through a &lt;strong&gt;Dual-Gate Guard&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Structural Gate (&lt;code&gt;complete: false&lt;/code&gt;):&lt;/strong&gt; Inside the hidden memory block, Claude must report its own state. If Claude marks the task as &lt;code&gt;complete: false&lt;/code&gt;, Cairn blocks the response from reaching you. It silently re-prompts: &lt;em&gt;"You reported this is incomplete. Continue."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Semantic Gate (Trailing Intent):&lt;/strong&gt; Cairn analyzes the end of every response. If Claude states an intent to act (e.g., &lt;em&gt;"I will run the tests"&lt;/em&gt;) but the memory block claims completion, Cairn detects the contradiction. It knows Claude is "ghosting" the task and forces it to actually execute the command.&lt;/p&gt;
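&lt;p&gt;A rough sketch of how two such gates could be combined is shown below. It is an assumption-laden illustration rather than Cairn's implementation: the function name &lt;code&gt;guard&lt;/code&gt;, the return strings, and the list of intent phrases are all hypothetical, and the real semantic check may work quite differently.&lt;/p&gt;

```python
import re

# Hypothetical phrases that announce an action. The pattern only matches
# when the phrase sits in the final sentence of the visible response.
INTENT_PATTERN = re.compile(
    r"\b(I'll|I will|Let me|I'm going to)\b[^.!?]*[.!?]?\s*$",
    re.IGNORECASE,
)


def guard(visible_text: str, fields: dict) -> str:
    """Return 'allow' or a 'block: ...' reason for one response."""
    complete = fields.get("complete", "").strip().lower()

    # Structural gate: the model itself reported unfinished work.
    if complete == "false":
        return "block: structural gate"

    # Semantic gate: the block claims completion, but the response
    # ends by promising an action that has not happened yet.
    tail = visible_text.strip()[-200:]
    if complete == "true" and INTENT_PATTERN.search(tail):
        return "block: semantic gate"

    return "allow"


result_incomplete = guard("Done.", {"complete": "false"})
result_ghosted = guard("I'll check the logs for that error.", {"complete": "true"})
result_ok = guard("I will run the tests. Done, all 12 passed.", {"complete": "true"})
```

&lt;p&gt;In this sketch a blocked response would be the hook's cue to re-prompt Claude instead of returning control to the user.&lt;/p&gt;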
&lt;h2&gt;
  
  
  The "Pūkeko" Test
&lt;/h2&gt;

&lt;p&gt;Here is how I knew the retrieval was working.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session 1&lt;/strong&gt; (in &lt;code&gt;~/temp&lt;/code&gt;): I mentioned I saw a "big blue bird with a red beak" on my lawn. Claude identified it as a Pūkeko. I didn't save any files or notes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session 2&lt;/strong&gt; (three days later, in &lt;code&gt;~/projects&lt;/code&gt;): I simply asked, &lt;em&gt;"What was on my lawn?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Result:&lt;/strong&gt; Claude answered instantly: &lt;em&gt;"A pūkeko — NZ Purple Swamphen."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It didn't search my files. It didn't look at chat logs. It retrieved its own distilled memory from the SQLite database.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why this is different from "Standard RAG"
&lt;/h2&gt;

&lt;p&gt;Most LLM memory systems just dump your chat logs into a vector DB. Cairn is built for the high-pressure environment of a coding agent:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The LLM is the Memory Author:&lt;/strong&gt; Cairn doesn't store raw transcripts. It forces the LLM to distill facts. A 50,000-token session becomes 10 precise, high-value memories.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Epistemic Traceability:&lt;/strong&gt; One-line summaries keep the context window small, but they drop the "why." Every memory is therefore also a pointer: with &lt;code&gt;--context &amp;lt;id&amp;gt;&lt;/code&gt;, you can recover the original multi-turn conversation from your local logs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;9 Quality Gates:&lt;/strong&gt; This isn't just "top-k" search. Cairn uses a sophisticated retrieval pipeline including &lt;em&gt;Saturating Confidence&lt;/em&gt; (important memories stay relevant; "noise" fades) and &lt;em&gt;Adaptive Thresholds&lt;/em&gt; (it learns your project's noise floor and tightens retrieval automatically).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;100% Local &amp;amp; Fast:&lt;/strong&gt; No extra API keys. No data leaves your machine. Cairn runs a lightweight Python daemon in the background to keep the embedding model (~80 MB) in RAM, so memory injection happens in milliseconds.&lt;/p&gt;
&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;

&lt;p&gt;It takes about a minute to install. The installer sets up a Python venv, initializes the database, deploys the hooks globally, and starts the background daemon.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/jimovonz/cairn.git ~/cairn
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/cairn
./install.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restart Claude Code. Cairn is now active in every session. You'll also get access to new slash commands:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/cairn search &amp;lt;query&amp;gt;&lt;/code&gt;: See what Claude knows about a topic.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/cairn audit&lt;/code&gt;: Review, edit, or delete stored memories.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/cairn recent&lt;/code&gt;: See what was captured in the last hour.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Limitations (Honestly)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code only:&lt;/strong&gt; This relies on the specific hook system and tag-stripping behavior of the &lt;code&gt;claude&lt;/code&gt; CLI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Early stage:&lt;/strong&gt; I've tested this extensively on Ubuntu. It's stable, but edge cases with permissions or venv conflicts might exist.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM Cooperation:&lt;/strong&gt; While mechanical enforcement catches most failures, Claude is still non-deterministic. Sometimes it needs a nudge to write a good memory.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it out
&lt;/h2&gt;

&lt;p&gt;I built this because I wanted a coding partner that actually learns my habits and project history. If you're a heavy Claude Code user, I'd love your feedback.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub Repo:&lt;/strong&gt; &lt;a href="https://github.com/jimovonz/cairn" rel="noopener noreferrer"&gt;https://github.com/jimovonz/cairn&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Let's Talk
&lt;/h2&gt;

&lt;p&gt;I’m curious: what is the one thing you are tired of re-explaining to Claude every single session?&lt;/p&gt;

&lt;p&gt;Whether it's a specific naming convention, a complex architectural quirk, or just how you like your imports organized—I’d love to hear what "amnesia" moments you think a system like Cairn should solve first.&lt;/p&gt;

&lt;p&gt;Also, if you’re an agent developer or a tool-builder, I’d love your take on the &lt;strong&gt;Invisible Control Plane&lt;/strong&gt; approach. Is hiding metadata in stripped tags a "clever hack" or a "dangerous precedent"? Does your favorite AI tool (Cursor, Windsurf, etc.) have a hook system that could do this better?&lt;/p&gt;

&lt;p&gt;I'll be hanging out in the comments to answer questions about the local-first setup, the embedding daemon, or anything else in the &lt;a href="https://github.com/jimovonz/cairn" rel="noopener noreferrer"&gt;repo&lt;/a&gt;!&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Cairn — a mound of stones built as a trail marker, placed one at a time by those who pass, so that those who follow can find their way.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>showdev</category>
      <category>productivity</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
