<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ani</title>
    <description>The latest articles on DEV Community by Ani (@thebnbrkr).</description>
    <link>https://dev.to/thebnbrkr</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3843443%2F6c639cff-d7b3-4f1d-af7f-0bfc4023a227.png</url>
      <title>DEV Community: Ani</title>
      <link>https://dev.to/thebnbrkr</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/thebnbrkr"/>
    <language>en</language>
    <item>
      <title>I Built a tool to give AI coding agents persistent memory and a way smaller token footprint</title>
      <dc:creator>Ani</dc:creator>
      <pubDate>Wed, 25 Mar 2026 16:09:12 +0000</pubDate>
      <link>https://dev.to/thebnbrkr/i-built-a-tool-to-give-ai-coding-agents-persistent-memory-and-a-way-smaller-token-footprint-4p4</link>
      <guid>https://dev.to/thebnbrkr/i-built-a-tool-to-give-ai-coding-agents-persistent-memory-and-a-way-smaller-token-footprint-4p4</guid>
      <description>&lt;p&gt;I've been building with AI coding agents (Claude Code, Cursor, Antigravity) for a while now, and two things kept annoying me enough that I finally built something to fix them.&lt;/p&gt;




&lt;h2&gt;The two problems&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Problem 1: Your agent reads a 1,000-line file and burns 8,000 tokens doing it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's before it's done anything useful. Large codebases eat context fast, and once the window fills up, you're either compressing (lossy) or starting over. Neither is great.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem 2: Every new session, your agent starts from zero.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It doesn't remember that the API rate limit is 100 req/min. It doesn't remember the weird edge case in the auth module you spent two hours debugging last week. It doesn't remember &lt;em&gt;anything&lt;/em&gt;. You either re-explain everything, or watch it rediscover the same gotchas.&lt;/p&gt;

&lt;p&gt;These aren't niche complaints — if you're using AI agents to work on real codebases, you've hit both of these.&lt;/p&gt;




&lt;h2&gt;What I built&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/thebnbrkr/agora-code" rel="noopener noreferrer"&gt;agora-code&lt;/a&gt;&lt;/strong&gt; — persistent memory and context reduction for AI coding agents. Works with Claude Code, Cursor, and Gemini CLI. Survives context resets, new conversations, and agent restarts.&lt;/p&gt;

&lt;p&gt;It's early. It works. I want people to try it.&lt;/p&gt;




&lt;h2&gt;How it handles token bloat&lt;/h2&gt;

&lt;p&gt;Rather than letting the agent read raw source files, agora-code intercepts file reads and serves an AST summary instead. With the Claude Code hooks, this kicks in for files over 100 lines.&lt;/p&gt;

&lt;p&gt;Real example: &lt;code&gt;summarizer.py&lt;/code&gt; is 885 lines. Raw read = 8,436 tokens. Summarized = 542 tokens. That's a &lt;strong&gt;93.6% reduction&lt;/strong&gt; — and the agent still gets all the signal: class names, function signatures, docstrings, line numbers.&lt;/p&gt;
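
&lt;p&gt;To make the idea concrete, here is a minimal sketch of AST-based summarization using Python's standard &lt;code&gt;ast&lt;/code&gt; module. It is not agora-code's actual implementation: the function name &lt;code&gt;summarize_python&lt;/code&gt;, the sample source, and the output format are all assumptions for illustration.&lt;/p&gt;

```python
import ast


def summarize_python(source: str) -> list[str]:
    """Summarize a Python module as one line per class or function."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            label = f"class {node.name}"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ast.unparse (Python 3.9+) renders the parameter list with annotations.
            label = f"def {node.name}({ast.unparse(node.args)})"
        else:
            continue
        doc = ast.get_docstring(node) or ""
        first_line = doc.splitlines()[0] if doc else ""
        entries.append((node.lineno, label, first_line))
    # ast.walk is breadth-first, so sort to get source order back.
    entries.sort()
    return [
        f"L{lineno}: {label}" + (f" - {doc}" if doc else "")
        for lineno, label, doc in entries
    ]


# Hypothetical input, only here to exercise the function.
SAMPLE = '''
class Cache:
    """Stores summaries."""

    def get(self, key: str, default=None):
        """Look up a cached summary."""
        return default


async def refresh(path):
    pass
'''

for line in summarize_python(SAMPLE):
    print(line)
```

&lt;p&gt;Function bodies never reach the output, which is where most of the token savings would come from in an approach like this.&lt;/p&gt;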

&lt;p&gt;It works across languages too:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File type&lt;/th&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;What you get&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;stdlib AST&lt;/td&gt;
&lt;td&gt;Classes, functions, signatures, docstrings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JS, TS, Go, Rust, Java + 160 more&lt;/td&gt;
&lt;td&gt;tree-sitter&lt;/td&gt;
&lt;td&gt;Same — exact line numbers, parameter types&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSON / YAML&lt;/td&gt;
&lt;td&gt;Structure parser&lt;/td&gt;
&lt;td&gt;Top-level keys + shape&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Markdown&lt;/td&gt;
&lt;td&gt;Heading extractor&lt;/td&gt;
&lt;td&gt;Headings + opening paragraph&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
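
&lt;p&gt;For the JSON row, "top-level keys + shape" could look something like the sketch below. This is a guess at the general idea, not the tool's real structure parser: the &lt;code&gt;shape&lt;/code&gt; helper, its &lt;code&gt;max_depth&lt;/code&gt; parameter, and the sample config are hypothetical.&lt;/p&gt;

```python
import json


def shape(value, depth: int = 0, max_depth: int = 2):
    """Replace values with type names, keeping keys down to max_depth."""
    if isinstance(value, dict):
        if depth == max_depth:
            return f"object({len(value)} keys)"
        return {key: shape(item, depth + 1, max_depth) for key, item in value.items()}
    if isinstance(value, list):
        return f"array({len(value)} items)"
    return type(value).__name__


# Hypothetical config file contents.
config = json.loads(
    '{"name": "demo", "port": 8080,'
    ' "db": {"host": "x", "pool": {"size": 5}},'
    ' "tags": ["a", "b"]}'
)
print(json.dumps(shape(config), indent=2))
```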

&lt;p&gt;Summaries are cached in SQLite, so re-reads on the same branch are instant.&lt;/p&gt;
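
&lt;p&gt;A rough sketch of what a SQLite-backed summary cache can look like, using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The table layout, the content-hash key, and the names &lt;code&gt;open_cache&lt;/code&gt; and &lt;code&gt;cached_summary&lt;/code&gt; are assumptions made for this example; the real schema may differ.&lt;/p&gt;

```python
import hashlib
import sqlite3


def open_cache(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the cache database. Hypothetical schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS summaries ("
        " branch TEXT, path TEXT, content_hash TEXT, summary TEXT,"
        " PRIMARY KEY (branch, path, content_hash))"
    )
    return conn


def cached_summary(conn, branch, path, content, summarize):
    """Return the cached summary for this content, computing it on a miss."""
    # Hashing the content means an edited file never gets a stale summary.
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT summary FROM summaries"
        " WHERE branch = ? AND path = ? AND content_hash = ?",
        (branch, path, digest),
    ).fetchone()
    if row is not None:
        return row[0]
    summary = summarize(content)
    conn.execute(
        "INSERT INTO summaries VALUES (?, ?, ?, ?)",
        (branch, path, digest, summary),
    )
    conn.commit()
    return summary


calls = []


def fake_summarize(text):
    """Stand-in summarizer that records how often it is invoked."""
    calls.append(text)
    return f"{len(text.splitlines())} lines"


conn = open_cache()
first = cached_summary(conn, "main", "app.py", "a = 1\nb = 2\n", fake_summarize)
second = cached_summary(conn, "main", "app.py", "a = 1\nb = 2\n", fake_summarize)
print(first, second, len(calls))
```

&lt;p&gt;The second call is served from the database, so the summarizer only runs once per unique file content.&lt;/p&gt;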




&lt;h2&gt;How it handles memory loss&lt;/h2&gt;

&lt;p&gt;When a session ends, agora-code parses the transcript and extracts a structured checkpoint: what the goal was, what changed, what non-obvious things you found, and what's next.&lt;/p&gt;

&lt;p&gt;At the start of the next session, the relevant parts are injected automatically — last checkpoint, top learnings from recent commits on the branch, git state, symbol index for dirty files.&lt;/p&gt;
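
&lt;p&gt;One way to picture a checkpoint is as a small record with those four parts, rendered back to text when the next session starts. The sketch below is illustrative only: the &lt;code&gt;Checkpoint&lt;/code&gt; class, its field names, and the rendered format are assumptions, not the format agora-code stores.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical checkpoint record: goal, changes, findings, next steps."""
    goal: str
    changes: list[str] = field(default_factory=list)
    findings: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)


def render_context(cp: Checkpoint) -> str:
    """Format a checkpoint as plain text to prepend to a new session."""
    sections = [
        ("Changed", cp.changes),
        ("Findings", cp.findings),
        ("Next", cp.next_steps),
    ]
    lines = [f"Goal: {cp.goal}"]
    for title, items in sections:
        if items:  # skip empty sections to keep the injected text short
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in items)
    return "\n".join(lines)


checkpoint = Checkpoint(
    goal="Fix email validation on POST /users",
    findings=["POST /users rejects + in emails"],
    next_steps=["Add a regression test"],
)
print(render_context(checkpoint))
```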

&lt;p&gt;You can also manually store findings:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agora-code learn &lt;span class="s2"&gt;"POST /users rejects + in emails"&lt;/span&gt; &lt;span class="nt"&gt;--tags&lt;/span&gt; email,validation
agora-code learn &lt;span class="s2"&gt;"Rate limit is 100 req/min"&lt;/span&gt; &lt;span class="nt"&gt;--confidence&lt;/span&gt; confirmed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And recall them later (keyword search by default, semantic search if you wire up embeddings):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agora-code recall &lt;span class="s2"&gt;"email validation"&lt;/span&gt;
agora-code recall &lt;span class="s2"&gt;"rate limit"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Storage is three layers: an active session file (project-local, gitignored), a global SQLite DB scoped per project via git remote URL, and search (FTS5/BM25 always on, optional vector search).&lt;/p&gt;
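
&lt;p&gt;For anyone who hasn't used FTS5, here is a small sketch of BM25-ranked keyword search scoped to one project, using Python's &lt;code&gt;sqlite3&lt;/code&gt; module. The table name, the columns, the project key, and the &lt;code&gt;recall&lt;/code&gt; function are made up for this example and are not agora-code's schema. It also assumes your SQLite build ships with FTS5, which most CPython builds do.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# project is UNINDEXED so it can be filtered on but is never keyword-matched.
conn.execute("CREATE VIRTUAL TABLE learnings USING fts5(text, tags, project UNINDEXED)")
conn.executemany(
    "INSERT INTO learnings VALUES (?, ?, ?)",
    [
        # Hypothetical project key standing in for a git remote URL.
        ("POST /users rejects + in emails", "email,validation", "github.com/example/app"),
        ("Rate limit is 100 req/min", "", "github.com/example/app"),
        ("Rate limit is 20 req/min", "", "github.com/example/other"),
    ],
)


def recall(query: str, project: str, limit: int = 5) -> list[str]:
    """Keyword search over stored learnings, best BM25 match first."""
    # Quote each word so punctuation in the query is not parsed as FTS5 syntax.
    match = " OR ".join('"' + w.replace('"', '""') + '"' for w in query.split())
    rows = conn.execute(
        "SELECT text FROM learnings"
        " WHERE learnings MATCH ? AND project = ?"
        " ORDER BY bm25(learnings) LIMIT ?",
        (match, project, limit),
    ).fetchall()
    return [text for (text,) in rows]


print(recall("rate limit", "github.com/example/app"))
print(recall("email validation", "github.com/example/app"))
```

&lt;p&gt;Plain keyword matching has no stemming here, so the second query hits through the tags column rather than the word "emails" in the text.&lt;/p&gt;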




&lt;h2&gt;What happens automatically (Claude Code)&lt;/h2&gt;

&lt;p&gt;Once hooks are installed, you don't have to think about most of this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;When you…&lt;/th&gt;
&lt;th&gt;agora-code automatically…&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Start a session&lt;/td&gt;
&lt;td&gt;Injects last checkpoint + relevant learnings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Submit a prompt&lt;/td&gt;
&lt;td&gt;Recalls relevant past findings, sets session goal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Read a file &amp;gt; 100 lines&lt;/td&gt;
&lt;td&gt;Summarizes via AST — serves summary instead&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edit a file&lt;/td&gt;
&lt;td&gt;Tracks the diff, re-indexes symbols&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Run &lt;code&gt;git commit&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Derives learnings from the commit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context window compresses&lt;/td&gt;
&lt;td&gt;Checkpoints before, re-injects after&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;End a session&lt;/td&gt;
&lt;td&gt;Parses transcript → structured checkpoint in DB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;Getting started&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;git+https://github.com/thebnbrkr/agora-code.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then in your project:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;your-project
agora-code install-hooks &lt;span class="nt"&gt;--claude-code&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Cursor and Gemini CLI, you copy a config directory into your project root — full instructions in the README.&lt;/p&gt;

&lt;p&gt;At the start of every Claude Code session, run &lt;code&gt;/agora-code&lt;/code&gt; to load the skill. That's the bit that tells the agent when to summarize, when to inject context, when to save progress.&lt;/p&gt;




&lt;h2&gt;It's early&lt;/h2&gt;

&lt;p&gt;APIs may change. Things might break. I'm actively working on it: semantic search is in progress, and automated hook setup for Cursor and Gemini CLI is on the roadmap.&lt;/p&gt;

&lt;p&gt;If you try it and hit something weird, open an issue. If you want to add hook support for a different editor, the pattern is consistent across &lt;code&gt;.claude/hooks/&lt;/code&gt; and &lt;code&gt;.cursor/hooks/&lt;/code&gt; — PRs welcome.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/thebnbrkr/agora-code" rel="noopener noreferrer"&gt;https://github.com/thebnbrkr/agora-code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Screenshot: &lt;a href="https://imgur.com/a/APaiNnl" rel="noopener noreferrer"&gt;https://imgur.com/a/APaiNnl&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Would love to hear if this solves the same pain points for others, or if you're handling token bloat / memory loss differently. Drop a comment.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>aiops</category>
      <category>agentskills</category>
    </item>
  </channel>
</rss>
