<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rahil Pirani</title>
    <description>The latest articles on DEV Community by Rahil Pirani (@rahil_pirani_c48446facc8c).</description>
    <link>https://dev.to/rahil_pirani_c48446facc8c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3922851%2F562316ad-6070-4ae8-9c4c-772035064295.png</url>
      <title>DEV Community: Rahil Pirani</title>
      <link>https://dev.to/rahil_pirani_c48446facc8c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rahil_pirani_c48446facc8c"/>
    <language>en</language>
    <item>
      <title>AI memory has a contradiction problem nobody is talking about</title>
      <dc:creator>Rahil Pirani</dc:creator>
      <pubDate>Mon, 25 May 2026 06:53:45 +0000</pubDate>
      <link>https://dev.to/rahil_pirani_c48446facc8c/ai-memory-has-a-contradiction-problem-nobody-is-talking-about-8me</link>
      <guid>https://dev.to/rahil_pirani_c48446facc8c/ai-memory-has-a-contradiction-problem-nobody-is-talking-about-8me</guid>
      <description>&lt;p&gt;Most discussions about AI memory focus on a few concerns: whether it persists across sessions, how quickly it retrieves information, and whether it scales. These are important questions. However, there’s a simpler issue that often gets overlooked, and it slowly degrades memory systems over time.&lt;/p&gt;

&lt;p&gt;What happens when two stored memories conflict?&lt;/p&gt;

&lt;p&gt;You tell your AI assistant that you prefer short, direct answers. A month later, you mention wanting more detailed explanations with examples. Both preferences get stored. Now every recall brings up both. The system tries to accommodate both, and the blended result is not what you actually want at that moment.&lt;/p&gt;

&lt;p&gt;This isn’t just a hypothetical situation. It happens with any memory system that only adds information over time. Your preferences change. Your situation evolves. But the earlier version of you is still there, pushing in the opposite direction.&lt;/p&gt;




&lt;p&gt;The usual default is a review inbox: the system flags conflicts and lets the user decide. It sounds good in theory but is frustrating in practice.&lt;/p&gt;

&lt;p&gt;No one wants to manage their AI's memory manually. The goal is for it to work in the background. A review inbox turns memory management into a task that often gets ignored, leading to a buildup of contradictions anyway.&lt;/p&gt;

&lt;p&gt;Another common method is timestamp-based overwriting: when new information comes in, the system looks for a similar existing memory and replaces it. But similarity doesn’t equal contradiction. "I work best in the mornings" and "I do my best thinking late at night" contradict each other, yet their embeddings can score low on similarity. A vector search won’t catch this. Both get stored and recalled.&lt;/p&gt;




&lt;p&gt;The right question isn’t "how do we find similarities?" It should be "how do we identify logical incompatibility?"&lt;/p&gt;

&lt;p&gt;This is a semantic reasoning challenge, not just a retrieval one. Two memories might not seem similar, yet can still contradict each other. Recognizing this takes a model that can reason about meaning, such as a language model; distance metrics alone won’t do it.&lt;/p&gt;

&lt;p&gt;When we integrated contradiction detection into &lt;a href="https://github.com/rahilp/second-brain-cloudflare" rel="noopener noreferrer"&gt;second-brain&lt;/a&gt;, our key design choice was to use a large language model (LLM) to check whether a new memory contradicts any of the most recently recalled ones. We ask not only "is this similar?" but "can both of these be true at the same time?"&lt;/p&gt;

&lt;p&gt;When a conflict arises, the new memory prevails. The old one gets deleted entirely, from both storage and the vector index. It's gone. The new memory is the only version that exists.&lt;/p&gt;
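
&lt;p&gt;A minimal Python sketch of that flow, not the project’s actual code: the &lt;code&gt;contradicts&lt;/code&gt; callable stands in for the LLM check, and the function names, the dict-based store, and the prompt wording are all assumptions made for illustration.&lt;/p&gt;

```python
import uuid

# Hypothetical prompt wording; only the question itself comes from the text above.
PROMPT_TEMPLATE = (
    "Memory A: {old}\n"
    "Memory B: {new}\n"
    "Can both of these be true at the same time? Answer YES or NO."
)


def build_prompt(old_text, new_text):
    """Fill the prompt that an LLM-backed `contradicts` callable might send."""
    return PROMPT_TEMPLATE.format(old=old_text, new=new_text)


def remember_with_conflict_check(new_text, store, candidate_ids, contradicts):
    """Store new_text after deleting every candidate memory that conflicts with it.

    store: dict of memory id to text (a stand-in for storage plus the vector index).
    candidate_ids: ids of the existing memories to compare against.
    contradicts: callable(old_text, new_text) returning True when both cannot hold.
    Returns the new memory id and the list of deleted ids.
    """
    removed = []
    for memory_id in candidate_ids:
        old_text = store.get(memory_id)
        if old_text is not None and contradicts(old_text, new_text):
            del store[memory_id]  # the new memory wins; the old one is deleted
            removed.append(memory_id)
    new_id = uuid.uuid4().hex
    store[new_id] = new_text
    return new_id, removed
```

&lt;p&gt;In a real deployment, &lt;code&gt;contradicts&lt;/code&gt; would send &lt;code&gt;build_prompt(old_text, new_text)&lt;/code&gt; to a model and return true when the answer is NO.&lt;/p&gt;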




&lt;p&gt;There’s a real trade-off worth noting. Conditional preferences can be tricky.&lt;/p&gt;

&lt;p&gt;For example, "I want short responses when I’m coding, long ones when I’m strategizing" isn’t a contradiction. Those statements can coexist. An unsophisticated LLM check might flag them as conflicting. To get this right, enough context needs to be passed through the check so the model can distinguish between real conflicts and situational variations.&lt;/p&gt;

&lt;p&gt;This is a more complex issue, and the current implementation doesn't address it entirely. It handles clear cases well: factual contradictions, changes in preferences, updated decisions. The conditional cases represent a known gap.&lt;/p&gt;

&lt;p&gt;However, catching the clear cases already makes a significant difference. A memory system that sometimes overlooks nuanced conditions is still better than one that continuously accumulates contradictions without end.&lt;/p&gt;




&lt;p&gt;Storage is the easy part of AI memory; everyone can provide it. What matters for long-term usefulness is coherence over time, not the sheer volume of what gets stored. To achieve coherence, contradictions must be treated as a primary issue, not just a task to clean up later.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>discuss</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I built persistent AI memory for Claude on Cloudflare's free tier</title>
      <dc:creator>Rahil Pirani</dc:creator>
      <pubDate>Wed, 20 May 2026 04:45:51 +0000</pubDate>
      <link>https://dev.to/rahil_pirani_c48446facc8c/i-built-persistent-ai-memory-for-claude-on-cloudflares-free-tier-12kc</link>
      <guid>https://dev.to/rahil_pirani_c48446facc8c/i-built-persistent-ai-memory-for-claude-on-cloudflares-free-tier-12kc</guid>
      <description>&lt;p&gt;Every Claude session starts fresh. You copy context, explain your setup, reintroduce your project, and then do it all over again the next day. I got tired of this and created a solution.&lt;/p&gt;

&lt;p&gt;second-brain-cloudflare is a self-hosted MCP server that provides Claude, ChatGPT, Cursor, and any MCP-compatible client with persistent memory across sessions. It operates entirely on Cloudflare's free tier. Here’s how it works.&lt;/p&gt;

&lt;h2&gt;
  
  
  The stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cloudflare Workers&lt;/strong&gt;: MCP server, REST API, and web UI, all from one &lt;code&gt;wrangler deploy&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;D1 (SQLite)&lt;/strong&gt;: stores entry content, tags, source, timestamps, and vector chunk IDs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vectorize&lt;/strong&gt;: the vector index (bge-small-en-v1.5, 384 dimensions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workers AI&lt;/strong&gt;: &lt;code&gt;bge-small-en-v1.5&lt;/code&gt; for embeddings,
&lt;code&gt;@cf/meta/llama-4-scout-17b-16e-instruct&lt;/code&gt; for web UI synthesis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One deployment. No external databases. No API keys needed beyond your Cloudflare account token.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tag-based time-decay reranking
&lt;/h2&gt;

&lt;p&gt;Pure vector similarity has a drawback. A memory from three months ago can outrank something you saved yesterday if it’s semantically closer. The solution is to fetch three times as many candidates as needed (topK=5 pulls 15), then score each using a tag-aware half-life:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tasks: 7-day half-life&lt;/li&gt;
&lt;li&gt;Work: 3-month half-life&lt;/li&gt;
&lt;li&gt;Context: 6-month half-life&lt;/li&gt;
&lt;li&gt;Default: 30-day half-life&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;adjusted_score = cosine_similarity × e^(-age_in_days / half_life)&lt;/code&gt;&lt;/p&gt;
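
&lt;p&gt;As a rough Python sketch of that scoring step (the function names, the candidate dict keys, and the 90- and 180-day month lengths are assumptions, not taken from the project). Note that with the formula exactly as written, a score falls to 1/e (about 37%) of its original value when the age equals the half-life; a strict half-life would use &lt;code&gt;2^(-age_in_days / half_life)&lt;/code&gt; instead.&lt;/p&gt;

```python
import math

# Half-lives in days per tag, from the list above.
# Treating a month as 30 days is an assumption.
HALF_LIFE_DAYS = {"tasks": 7, "work": 90, "context": 180}
DEFAULT_HALF_LIFE_DAYS = 30


def adjusted_score(similarity, age_in_days, tag=None):
    """Apply adjusted_score = cosine_similarity * e^(-age_in_days / half_life)."""
    half_life = HALF_LIFE_DAYS.get(tag, DEFAULT_HALF_LIFE_DAYS)
    return similarity * math.exp(-age_in_days / half_life)


def rerank(candidates, top_k=5):
    """Sort over-fetched candidates by decayed score and keep the best top_k.

    Each candidate is a dict with hypothetical keys:
    id, similarity, age_in_days, and optionally tag.
    """
    ranked = sorted(
        candidates,
        key=lambda c: adjusted_score(c["similarity"], c["age_in_days"], c.get("tag")),
        reverse=True,
    )
    return ranked[:top_k]
```

&lt;p&gt;With the default 30-day setting, a three-month-old note at 0.9 similarity ends up ranked below a one-day-old note at 0.8.&lt;/p&gt;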

&lt;h2&gt;
  
  
  Duplicate detection
&lt;/h2&gt;

&lt;p&gt;Before storing anything, embed the incoming content and query Vectorize for its nearest neighbor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Score ≥ 95%: block&lt;/li&gt;
&lt;li&gt;Score 85–94%: store with &lt;code&gt;duplicate-candidate&lt;/code&gt; tag&lt;/li&gt;
&lt;li&gt;Score &amp;lt; 85%: store normally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this step, Claude creates 20–30 nearly identical entries for the same decision.&lt;/p&gt;
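
&lt;p&gt;The thresholds reduce to a small decision function. This Python sketch assumes scores are normalized to the range 0 to 1 and treats everything from 0.85 up to (but not including) 0.95 as a duplicate candidate; the function name and return labels are made up for illustration.&lt;/p&gt;

```python
def classify_duplicate(nearest_score):
    """Map the nearest neighbor's similarity score to a storage decision."""
    if nearest_score is None:  # empty index: nothing to compare against
        return "store"
    if nearest_score >= 0.95:
        return "block"
    if nearest_score >= 0.85:
        return "store-with-duplicate-candidate-tag"
    return "store"
```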

&lt;h2&gt;
  
  
  Smart chunking
&lt;/h2&gt;

&lt;p&gt;Long notes are split at sentence boundaries, with a 200-character overlap between consecutive chunks. Each chunk receives its own vector. Chunk IDs are stored in D1, so &lt;code&gt;forget()&lt;/code&gt; reliably removes all related vectors.&lt;/p&gt;
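
&lt;p&gt;One way such a splitter could look in Python. This is a sketch, not the project’s code: the 1,000-character chunk limit is an assumed value, and the sentence detection is deliberately naive (it breaks on every period, so decimals and abbreviations would split too).&lt;/p&gt;

```python
import re


def chunk_text(text, max_chars=1000, overlap=200):
    """Split text at sentence ends into chunks of roughly max_chars.

    Every chunk after the first starts with the last `overlap` characters
    of the previous chunk. A single sentence longer than max_chars is kept
    whole, so a chunk can exceed the limit.
    """
    # Naive sentence split: a run of text followed by its end punctuation.
    sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]*", text)]
    sentences = [s for s in sentences if s]

    chunks = []
    current = ""
    for sentence in sentences:
        candidate = (current + " " + sentence).strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            # Carry the tail of the finished chunk into the next one.
            tail = current[-overlap:] if overlap else ""
            current = (tail + " " + sentence).strip()
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```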

&lt;h2&gt;
  
  
  Temporal recall (v1.2.0)
&lt;/h2&gt;

&lt;p&gt;Queries now support time limits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;recall("API decisions", after="7 days ago")&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;recall("standup notes", after="2026-05-12")&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Supported values: "today", "yesterday", "last week", "this month", ISO dates, and epoch timestamps.&lt;/p&gt;
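
&lt;p&gt;To illustrate how an &lt;code&gt;after&lt;/code&gt; value might be turned into a cutoff, here is a Python sketch covering a subset of the inputs. The &lt;code&gt;parse_after&lt;/code&gt; name, the use of UTC, and the reading of epoch values as seconds are assumptions; "last week" and "this month" are left out because their exact boundaries aren’t specified here.&lt;/p&gt;

```python
import re
from datetime import datetime, timedelta, timezone


def parse_after(value, now=None):
    """Turn an `after` filter into a UTC datetime, or raise ValueError."""
    now = now or datetime.now(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    raw = str(value).strip()
    text = raw.lower()

    if text == "today":
        return midnight
    if text == "yesterday":
        return midnight - timedelta(days=1)

    relative = re.fullmatch(r"(\d+) days? ago", text)
    if relative:
        return now - timedelta(days=int(relative.group(1)))

    if text.isdigit():  # assumed to be epoch seconds
        return datetime.fromtimestamp(int(text), tz=timezone.utc)

    parsed = datetime.fromisoformat(raw)  # raises ValueError on anything else
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
```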

&lt;h2&gt;
  
  
  AI synthesis in the web UI
&lt;/h2&gt;

&lt;p&gt;Queries flow through &lt;code&gt;@cf/meta/llama-4-scout-17b-16e-instruct&lt;/code&gt; before being rendered. Answers stream in real time, with source memories that can be collapsed underneath. You’ll find Append and Forget buttons. This runs on your own Cloudflare account.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the free tier works
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;D1: 5GB storage, 5 million row reads per day&lt;/li&gt;
&lt;li&gt;Vectorize: 5 million stored vector dimensions (about 13,000 vectors at 384 dimensions), 30 million queried vector dimensions per month (tight at team scale but fine for personal use)&lt;/li&gt;
&lt;li&gt;Workers AI: 10,000 Neurons per day&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Deploy: &lt;a href="https://thesecondbrain.dev" rel="noopener noreferrer"&gt;https://thesecondbrain.dev&lt;/a&gt;&lt;br&gt;
GitHub: &lt;a href="https://github.com/rahilp/second-brain-cloudflare" rel="noopener noreferrer"&gt;https://github.com/rahilp/second-brain-cloudflare&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If this was helpful, please give it a star.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>cloudflare</category>
      <category>productivity</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I gave Claude a persistent memory for $0/month using Cloudflare</title>
      <dc:creator>Rahil Pirani</dc:creator>
      <pubDate>Sun, 10 May 2026 05:35:41 +0000</pubDate>
      <link>https://dev.to/rahil_pirani_c48446facc8c/i-gave-claude-a-persistent-memory-for-0month-using-cloudflare-2e5a</link>
      <guid>https://dev.to/rahil_pirani_c48446facc8c/i-gave-claude-a-persistent-memory-for-0month-using-cloudflare-2e5a</guid>
      <description>&lt;h1&gt;
  
  
  I gave Claude a persistent memory for $0/month using Cloudflare
&lt;/h1&gt;

&lt;p&gt;Claude is great. But every time you start a new conversation, it forgets everything. Your projects, your preferences, what you decided last week — gone.&lt;/p&gt;

&lt;p&gt;The official memory feature exists, but it's vague and you can't really control it. You can't query it, tag it, or search it semantically. It's a black box that occasionally surfaces something useful.&lt;/p&gt;

&lt;p&gt;So I built my own.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it is
&lt;/h2&gt;

&lt;p&gt;It's a self-hosted MCP server that runs on Cloudflare Workers. Four tools: &lt;code&gt;remember&lt;/code&gt;, &lt;code&gt;recall&lt;/code&gt;, &lt;code&gt;list_recent&lt;/code&gt;, &lt;code&gt;forget&lt;/code&gt;. Claude calls them automatically. You never think about it.&lt;/p&gt;

&lt;p&gt;The interesting part is how recall works — it's not keyword search. Every note gets embedded as a 384-dimensional vector using &lt;code&gt;bge-small-en-v1.5&lt;/code&gt; on Workers AI. When you ask Claude something, it searches by &lt;em&gt;meaning&lt;/em&gt;, not exact words.&lt;/p&gt;

&lt;p&gt;Store: &lt;em&gt;"users drop off at the payment step."&lt;/em&gt;&lt;br&gt;&lt;br&gt;
Query: &lt;em&gt;"onboarding problems."&lt;/em&gt;&lt;br&gt;&lt;br&gt;
It finds it. No keyword overlap needed.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why Cloudflare
&lt;/h2&gt;

&lt;p&gt;Honestly, cost. The whole stack — Workers, D1 (SQLite), Vectorize, Workers AI embeddings — runs on Cloudflare's free tier at personal scale. You don't even need a credit card to get started.&lt;/p&gt;

&lt;p&gt;The other reason is deployment. There's a one-click deploy button that provisions everything automatically. It takes about 3 minutes to go from zero to a running second brain connected to Claude Desktop.&lt;/p&gt;
&lt;h2&gt;
  
  
  How to set it up
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Deploy&lt;/strong&gt; — click the button in the repo, Cloudflare provisions D1 + Vectorize and deploys the Worker.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Run the schema&lt;/strong&gt; — one SQL snippet in the Cloudflare dashboard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Set your auth token&lt;/strong&gt; — one command with wrangler.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Connect Claude Desktop&lt;/strong&gt; — add a few lines to your config JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"second-brain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"mcp-remote"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://&amp;lt;your-worker-url&amp;gt;/mcp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Claude now has persistent memory across every conversation.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I actually use it for
&lt;/h2&gt;

&lt;p&gt;I have Claude set up to call &lt;code&gt;recall&lt;/code&gt; at the start of every conversation, before it says anything. So when I open a new chat and say "continue the onboarding work from last week," it already knows what that means.&lt;/p&gt;

&lt;p&gt;I also capture from everywhere — there's a browser bookmarklet that saves any highlighted text or page with one click, and iOS Shortcuts for voice capture on the go. "Hey Siri, brain dump" and I can dictate a note that shows up in Claude's memory immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it doesn't do (yet)
&lt;/h2&gt;

&lt;p&gt;There's no UI for browsing your memory. You can hit the &lt;code&gt;/list&lt;/code&gt; endpoint, but it's raw JSON. I want to build a proper dashboard eventually — something that shows your memory visually, lets you edit or delete entries, maybe shows what Claude has recalled most often.&lt;/p&gt;

&lt;p&gt;Also, the local dev experience is slightly annoying because Vectorize and Workers AI don't run locally — you end up pointing at remote resources for real testing. Not a dealbreaker, but worth knowing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The repo
&lt;/h2&gt;

&lt;p&gt;Everything is open source under MIT. One-click deploy, manual setup instructions, iOS Shortcuts templates, bookmarklet source — it's all there.&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://github.com/rahilp/second-brain-cloudflare" rel="noopener noreferrer"&gt;github.com/rahilp/second-brain-cloudflare&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you use it, I'd genuinely like to know what you end up storing in it. That's the part I'm most curious about — what people actually find worth remembering.&lt;/p&gt;

</description>
      <category>claude</category>
      <category>cloudflare</category>
      <category>mcp</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
