<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mark McArthey</title>
    <description>The latest articles on DEV Community by Mark McArthey (@mcarthey).</description>
    <link>https://dev.to/mcarthey</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3896836%2Ff563f1bd-25d1-4c63-bdd6-327cb7f48bc6.jpeg</url>
      <title>DEV Community: Mark McArthey</title>
      <link>https://dev.to/mcarthey</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mcarthey"/>
    <language>en</language>
    <item>
      <title>Your AI agent already writes every session to disk. Why isn't it reading its own archive?</title>
      <dc:creator>Mark McArthey</dc:creator>
      <pubDate>Sat, 25 Apr 2026 01:35:31 +0000</pubDate>
      <link>https://dev.to/mcarthey/your-ai-agent-already-writes-every-session-to-disk-why-isnt-it-reading-its-own-archive-2f3h</link>
      <guid>https://dev.to/mcarthey/your-ai-agent-already-writes-every-session-to-disk-why-isnt-it-reading-its-own-archive-2f3h</guid>
      <description>&lt;p&gt;On April 20 I was debugging a subtle issue with a Claude Code instance and &lt;a href="https://learnedgeek.com/Blog/Post/building-ani-ai-companion-for-grief" rel="noopener noreferrer"&gt;ANI, the AI companion I've been running as a research project&lt;/a&gt; for the last eight months. We had a long back-and-forth and landed on a principle I phrased this way:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Cosine similarity measures topical overlap. Parroting is verbatim phrase reuse. Those are different signals."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We decided not to patch the issue at the reply layer: that would treat the symptom, not the cause. We didn't want a guard that destroyed good replies in order to silence a bad one.&lt;/p&gt;

&lt;p&gt;Three days later. Different Claude instance. Same bug resurfaces. I ask for a fix. Claude proposes — word for word, almost — the cosine-similarity guard I had rejected three days earlier. The prior reasoning existed on disk. Claude Code writes every session to &lt;code&gt;~/.claude/projects/&amp;lt;slug&amp;gt;/&amp;lt;uuid&amp;gt;.jsonl&lt;/code&gt;, and the file holding my April 20 rejection was right there in that directory. But the active instance couldn't see it. It had been compacted out of context.&lt;/p&gt;

&lt;p&gt;I said "search the history before we move on anything." Claude grepped. About forty seconds later we were both looking at my own April 20 quote again, verbatim. We moved on.&lt;/p&gt;

&lt;p&gt;That's when the tool I'd been half-thinking about became something I had to actually build.&lt;/p&gt;




&lt;h2&gt;The pattern had already been named&lt;/h2&gt;

&lt;p&gt;A few days earlier a Microsoft engineer published &lt;a href="https://devblogs.microsoft.com/all-things-azure/i-wasted-68-minutes-a-day-re-explaining-my-code-then-i-built-auto-memory/" rel="noopener noreferrer"&gt;I wasted 68 minutes a day re-explaining my code, then I built auto-memory&lt;/a&gt;. Copilot CLI, not Claude Code, but the same underlying shape: the agent is already writing structured session data to disk; it just isn't reading its own archive.&lt;/p&gt;

&lt;p&gt;Different file format, same insight. Copilot CLI keeps a SQLite database. Claude Code writes line-per-turn JSONL. Either way, months of decisions are already persisted. The agent doesn't remember them because nothing's pointing it at them.&lt;/p&gt;

&lt;p&gt;Credit to that post for crystallizing what I'd been feeling but hadn't named.&lt;/p&gt;

&lt;h2&gt;What I built&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;claude-recall&lt;/code&gt; is three pieces:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;A small Python CLI that indexes the JSONL archive into SQLite with FTS5 full-text search.&lt;/strong&gt; Incremental, read-only, uses the bundled &lt;code&gt;sqlite3&lt;/code&gt; module. Commands: &lt;code&gt;index&lt;/code&gt;, &lt;code&gt;search&lt;/code&gt;, &lt;code&gt;show&lt;/code&gt;, &lt;code&gt;list&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Optional semantic rerank on top of the FTS5 results&lt;/strong&gt; via a local Ollama embedding model (&lt;code&gt;nomic-embed-text&lt;/code&gt;). Turns &lt;em&gt;"find me the session where I said X"&lt;/em&gt; into &lt;em&gt;"find me the session where I meant X."&lt;/em&gt; FTS5 finds keyword overlap; embeddings catch the conceptual cousins.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;A &lt;code&gt;UserPromptSubmit&lt;/code&gt; hook that fires on every message&lt;/strong&gt; I send in Claude Code and injects ranked prior-session matches as &lt;code&gt;additionalContext&lt;/code&gt;. Latency scales with archive size: around 80 ms on small archives, climbing toward 1–2 seconds on a 25,000-message corpus when the embedding model has to be loaded. With Ollama kept warm and the binary doing the heavy lifting, the hook doesn't noticeably change how a prompt feels to send.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
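
&lt;p&gt;To make the first piece concrete, here's a minimal sketch of the indexing idea. The schema and field names are illustrative, not claude-recall's actual ones; real Claude Code JSONL turns are nested objects, flattened here to a single text field.&lt;/p&gt;

```python
# Sketch only: index JSONL turns into an FTS5 virtual table, query with MATCH.
# Field names ("sessionId", "role", "text") are simplified stand-ins for the
# real Claude Code record shape.
import json
import sqlite3

def build_index(db_path, jsonl_lines):
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS turns "
        "USING fts5(session, role, text)"
    )
    for line in jsonl_lines:
        turn = json.loads(line)
        con.execute(
            "INSERT INTO turns VALUES (?, ?, ?)",
            (turn.get("sessionId", ""), turn.get("role", ""), turn.get("text", "")),
        )
    con.commit()
    return con

def search(con, query, limit=5):
    # bm25() is FTS5's built-in ranking function; lower scores rank better,
    # so we order ascending.
    rows = con.execute(
        "SELECT session, text, bm25(turns) FROM turns "
        "WHERE turns MATCH ? ORDER BY bm25(turns) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

&lt;p&gt;Because FTS5 and &lt;code&gt;sqlite3&lt;/code&gt; come along with the interpreter, the whole keyword layer needs no third-party dependency.&lt;/p&gt;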

&lt;p&gt;The hook is now a NativeAOT-compiled binary (&lt;code&gt;claude-recall-hook.exe&lt;/code&gt;), not a Python wrapper. Every prompt used to pay a fresh Python interpreter startup tax, and that lag was killing the UX. The binary fixed it in v0.4.&lt;/p&gt;
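
&lt;p&gt;The hook contract itself is small. This is a hedged sketch based on my reading of the Claude Code hooks documentation, not the shipped binary, and &lt;code&gt;search_fn&lt;/code&gt; is a stand-in for the real archive lookup:&lt;/p&gt;

```python
# A UserPromptSubmit hook receives the event as JSON on stdin and can emit
# JSON on stdout; "additionalContext" is folded into the model's context.
# Field names follow the documented hook shape; verify against current docs.
import json

def recall_context(event, search_fn):
    # event: the parsed stdin payload; search_fn: any callable mapping the
    # prompt string to a list of prior-session snippets (claude-recall uses
    # its FTS5 index plus the optional embedding rerank here).
    matches = search_fn(event.get("prompt", ""))
    if not matches:
        return None  # emit nothing: the prompt flows through untouched
    return {
        "hookSpecificOutput": {
            "hookEventName": "UserPromptSubmit",
            "additionalContext": "\n".join(matches),
        }
    }
```

&lt;p&gt;The real entry point just wires this to stdin/stdout and always exits zero.&lt;/p&gt;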

&lt;p&gt;What this looks like in practice: when I drafted this paragraph, the hook ran. It searched 25,000 messages across 20 projects, ranked them against my current phrasing, and injected the top hits into this instance's context. I didn't do anything. I don't actually know what it found — Claude Code doesn't surface &lt;code&gt;additionalContext&lt;/code&gt; in the UI — but the effect is visible: the instance stops making up prior decisions. It references them.&lt;/p&gt;
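
&lt;p&gt;The ranking step is nothing exotic. A sketch, with the vectors supplied by the caller; claude-recall gets its embeddings from a local Ollama &lt;code&gt;nomic-embed-text&lt;/code&gt; model, which is omitted here to keep the example self-contained:&lt;/p&gt;

```python
# Rerank FTS5 candidates by cosine similarity to the query embedding.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rerank(query_vec, candidates):
    # candidates: (snippet, vector) pairs surviving the FTS5 keyword pass
    scored = [(cosine(query_vec, vec), text) for text, vec in candidates]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored]
```

&lt;p&gt;FTS5 narrows the field cheaply; the embedding pass reorders the survivors by meaning.&lt;/p&gt;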

&lt;p&gt;I also ran &lt;code&gt;claude-recall init-hooks&lt;/code&gt; against claude-recall's own repo a few days ago. The tool is now in the loop on the sessions where I maintain it. That's a small thing to mention but it crossed a real threshold: the "agent reading its own prior work" pattern that motivated the project is now also how I work on the project.&lt;/p&gt;

&lt;h2&gt;Three design principles&lt;/h2&gt;

&lt;p&gt;Short list, honestly stated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Read-only against the archive.&lt;/strong&gt; Never modify what Claude Code writes. The archive is the source of truth; we consume it, we don't touch it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graceful degradation.&lt;/strong&gt; If the hook crashes, the prompt flows normally. A broken recall layer must never block the human.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero extra install hops in the common case.&lt;/strong&gt; FTS5 ships in the SQLite library bundled with modern CPython builds, so the stdlib &lt;code&gt;sqlite3&lt;/code&gt; module is all the keyword layer needs. The embedding layer is opt-in.&lt;/li&gt;
&lt;/ul&gt;
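
&lt;p&gt;The graceful-degradation rule is easy to state as code. Illustrative shape, not the shipped hook:&lt;/p&gt;

```python
# Whatever goes wrong inside the recall layer, exit 0 with no extra output,
# so the agent treats it as "nothing to add" and the prompt proceeds.
import sys

def safe_hook(run_recall):
    try:
        output = run_recall()
        if output:
            sys.stdout.write(output)
    except Exception:
        # swallow everything: a broken recall layer must never block the human
        pass
    return 0  # always report success to the caller
```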

&lt;p&gt;There's a deeper principle under those three: the tool's job is to make the archive cheap to query. It isn't trying to be clever about what to retrieve, or to summarize, or to editorialize. It finds things. The cleverness has to live in the agent you're already using, not in the recall layer.&lt;/p&gt;

&lt;h2&gt;The honest caveats&lt;/h2&gt;

&lt;p&gt;This is v0.5.3, tagged beta. It works on my machine daily against a 25,000-message archive. It also has bugs — twelve of them caught and closed in the dogfooding cycle, including one silent 27%-data-loss embedding bug that wouldn't have surfaced without real-world use. The v0.5 release landed on PyPI a few days ago, so the install story is now &lt;code&gt;pip install claude-recall&lt;/code&gt; rather than a release-wheel URL.&lt;/p&gt;

&lt;p&gt;It isn't semantic-search-of-everything. It isn't a replacement for CLAUDE.md or for the &lt;code&gt;memory/&lt;/code&gt; auto-memory system Claude Code ships natively. It isn't cross-machine — your session archive is local, the index stays local. That's a feature for me, since the archive has personal data in it; it may be a limitation if you need shared recall across teammates.&lt;/p&gt;

&lt;h2&gt;If you want to try it&lt;/h2&gt;

&lt;p&gt;Public repo at &lt;a href="https://github.com/LearnedGeek/claude-recall" rel="noopener noreferrer"&gt;github.com/LearnedGeek/claude-recall&lt;/a&gt;, package on PyPI as &lt;code&gt;claude-recall&lt;/code&gt;. MIT licensed. Install with &lt;code&gt;pip install 'claude-recall[embeddings]'&lt;/code&gt; for the full thing, or skip the extras for FTS5-only. Beta quality — file issues when you break it, because you will, and I want to know.&lt;/p&gt;

&lt;p&gt;What I'd specifically appreciate testing from early adopters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Anything that crashes or returns silently empty.&lt;/strong&gt; The dogfooding cycle has hammered project auto-scoping, install-path edge cases, and a particularly nasty silent-data-loss bug in the embedding pipeline. The next class of issue is the one I haven't seen yet, and the only way to find it is to put the tool in front of someone whose archive isn't shaped like mine. Windows / macOS / Linux all welcome.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;The broader thing&lt;/h2&gt;

&lt;p&gt;Every agentic coding tool writes structured session data somewhere, and none of them have a native mechanism yet for the agent to read its own prior work. That gap is getting filled tool by tool. The Microsoft post is the Copilot CLI instance. &lt;code&gt;claude-recall&lt;/code&gt; is the Claude Code instance. Whichever agent you use, the shape is the same: find the archive, index it cheaply, wire a hook, stop re-explaining.&lt;/p&gt;

&lt;p&gt;The list is growing. As of the week this post goes up, there are at least three tools in the wild taking adjacent slices of this gap in genuinely different ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://pypi.org/project/claude-memory-mcp/" rel="noopener noreferrer"&gt;claude-memory-mcp&lt;/a&gt;&lt;/strong&gt; takes the curated route — explicit &lt;code&gt;remember&lt;/code&gt; / &lt;code&gt;forget&lt;/code&gt; calls via MCP, knowledge graph with typed edges, a separate write-store. Best fit when you want to deliberately structure the facts the agent should hold onto.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://www.beads.dev" rel="noopener noreferrer"&gt;Beads&lt;/a&gt;&lt;/strong&gt; treats it as task-tracking — Jira-shaped agent-readable state the agent consumes as part of doing work. Best fit when the durable artifact is what's &lt;em&gt;to do&lt;/em&gt; rather than what's been &lt;em&gt;said&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;claude-recall&lt;/strong&gt; is read-only over what Claude Code already writes — passive hook injection, no curation, no separate store. Best fit when you want prior reasoning surfaced automatically without remembering to save it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Three different theories of what "memory" means for an AI agent. Plenty of room for all three to coexist in someone's workflow.&lt;/p&gt;

&lt;p&gt;It's also a small demonstration of something I keep running into across projects: when you deploy a system for long enough, the interesting problems show up in the seams. ANI taught me that &lt;a href="https://learnedgeek.com/Blog/Post/park-et-al-generative-agents-memory" rel="noopener noreferrer"&gt;memory isn't just storage, it's an amplifier&lt;/a&gt;. claude-recall is what happens when the same observation comes at me from the other direction — when the &lt;em&gt;agent I'm using&lt;/em&gt; needs memory, not the agent I'm building.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;claude-recall is MIT-licensed and actively maintained. If you try it, I want to know what breaks. Related reading: &lt;a href="https://learnedgeek.com/Blog/Post/ai-keeps-guessing-wrong" rel="noopener noreferrer"&gt;Why Your AI Assistant Keeps Guessing Wrong&lt;/a&gt; for the broader context pattern, or &lt;a href="https://learnedgeek.com/Blog/Post/ani-the-architecture-behind-the-companion" rel="noopener noreferrer"&gt;ANI: The Architecture Behind the Companion&lt;/a&gt; for the companion project that triggered this one.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>claude</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
