<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Hugh</title>
    <description>The latest articles on DEV Community by Hugh (@hugh_eee41d2a9ad5313a87d4).</description>
    <link>https://dev.to/hugh_eee41d2a9ad5313a87d4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3859111%2F2497066f-a762-4b0b-b4a5-f786ca8b50d8.jpeg</url>
      <title>DEV Community: Hugh</title>
      <link>https://dev.to/hugh_eee41d2a9ad5313a87d4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hugh_eee41d2a9ad5313a87d4"/>
    <language>en</language>
    <item>
      <title>I Built a Chrome Extension That Gives AI Chats Permanent Memory Using a Local LLM</title>
      <dc:creator>Hugh</dc:creator>
      <pubDate>Fri, 03 Apr 2026 08:59:35 +0000</pubDate>
      <link>https://dev.to/hugh_eee41d2a9ad5313a87d4/i-built-a-chrome-extension-that-gives-ai-chats-permanent-memory-using-a-local-llm-eld</link>
      <guid>https://dev.to/hugh_eee41d2a9ad5313a87d4/i-built-a-chrome-extension-that-gives-ai-chats-permanent-memory-using-a-local-llm-eld</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;120+ AI chat sessions. "Which session had that D1 schema decision?" 30 minutes of scrolling. Sound familiar?&lt;/p&gt;

&lt;p&gt;Existing tools like Pactify, Chat Memo, and SaveAIChats save raw transcripts. But raw text without structure is unsearchable at scale.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjfortdm4ivtanj5v7oa.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjfortdm4ivtanj5v7oa.jpg" alt=" " width="800" height="747"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://aikorea24.kr/aikeep24/" rel="noopener noreferrer"&gt;AIKeep24&lt;/a&gt;&lt;/strong&gt; — a Chrome extension that detects conversation turns in real-time and uses a local LLM to auto-summarize and tag every conversation.&lt;/p&gt;

&lt;p&gt;No data leaves your machine. Zero cloud API costs.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/aikorea24/aikeep24" rel="noopener noreferrer"&gt;github.com/aikorea24/aikeep24&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;Browser (ChatGPT / Claude / Genspark) → Chrome Extension detects new turns → Ollama (EXAONE 3.5 7.8B) summarizes locally → Cloudflare Worker → D1 + Vectorize → Semantic search + context injection&lt;/p&gt;

&lt;p&gt;The extension monitors DOM changes via MutationObserver, splits conversations into 20-turn chunks, and sends each chunk to a local Ollama instance for summarization.&lt;/p&gt;
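
&lt;p&gt;The chunking step can be sketched as a small pure function. Names and the turn shape here are illustrative, not AIKeep24's actual code; only the 20-turn chunk size comes from the description above.&lt;/p&gt;

```javascript
// Illustrative sketch: split captured conversation turns into fixed-size
// chunks before handing each chunk to the local LLM for summarization.
const CHUNK_SIZE = 20; // matches the 20-turn chunks described in the post

function chunkTurns(turns, size = CHUNK_SIZE) {
  const chunks = [];
  let rest = turns.slice(); // copy so the caller's array is untouched
  while (rest.length > 0) {
    chunks.push(rest.slice(0, size));
    rest = rest.slice(size);
  }
  return chunks;
}
```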

&lt;p&gt;Each summary includes: &lt;strong&gt;topics, key decisions, unresolved items, and tech stack&lt;/strong&gt; — all auto-extracted by the LLM.&lt;/p&gt;
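
&lt;p&gt;The local summarization call can be sketched against Ollama's standard &lt;code&gt;/api/generate&lt;/code&gt; endpoint. The prompt wording and function names are hypothetical; the model tag and the four summary fields come from the post.&lt;/p&gt;

```javascript
// Sketch of one chunk's round-trip to a local Ollama instance.
function buildSummaryPrompt(chunkText) {
  return (
    "Summarize this AI chat chunk as JSON with keys " +
    '"topics", "decisions", "unresolved", "tech_stack".\n\n' + chunkText
  );
}

async function summarizeChunk(chunkText) {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "exaone3.5:7.8b",
      prompt: buildSummaryPrompt(chunkText),
      stream: false,
      format: "json", // ask Ollama to constrain output to valid JSON
    }),
  });
  const data = await res.json();
  return JSON.parse(data.response);
}
```

&lt;p&gt;The &lt;code&gt;format: "json"&lt;/code&gt; option asks Ollama to emit valid JSON, which keeps the downstream parse reliable.&lt;/p&gt;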

&lt;h2&gt;
  
  
  The Killer Feature: Context Injection (INJ)
&lt;/h2&gt;

&lt;p&gt;This is what makes AIKeep24 different from every other conversation saver.&lt;/p&gt;

&lt;p&gt;Press the &lt;strong&gt;INJ button&lt;/strong&gt; and it copies structured context from your last 5 sessions into your clipboard. Paste it into a new chat session, and the AI understands your entire project history.&lt;/p&gt;

&lt;p&gt;No more "let me catch you up on what we discussed before."&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Short press INJ&lt;/td&gt;
&lt;td&gt;Light mode — latest checkpoint + decisions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long press INJ&lt;/td&gt;
&lt;td&gt;Full mode — merged context from last 5 sessions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
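
&lt;p&gt;A minimal sketch of the two INJ modes, assuming each stored session carries a title, summary, and decision list (the field names are assumptions, not AIKeep24's schema):&lt;/p&gt;

```javascript
// Light mode: newest checkpoint only. Full mode: merge the last 5 sessions.
function buildInjectionContext(sessions, mode) {
  const recent = sessions.slice(-5);
  if (mode === "light") {
    const last = recent[recent.length - 1];
    return "Checkpoint: " + last.summary +
      "\nDecisions: " + last.decisions.join("; ");
  }
  // Full mode: one line per session, oldest first, so the AI reads the
  // project history in chronological order.
  return recent
    .map((s) => "[" + s.title + "] " + s.summary +
      " | Decisions: " + s.decisions.join("; "))
    .join("\n");
}
```

&lt;p&gt;In the extension, the returned string would then go to &lt;code&gt;navigator.clipboard.writeText()&lt;/code&gt; for pasting into a fresh chat.&lt;/p&gt;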

&lt;h2&gt;
  
  
  5 Buttons, That's It
&lt;/h2&gt;

&lt;p&gt;The extension adds 5 buttons to the bottom of your chat interface:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Button&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Toggle extension on/off per tab&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RUN&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manually trigger summarization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INJ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Copy context to clipboard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SNAP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Copy last 10 turns as raw text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BRW&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Browse past sessions + semantic search&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Auto-Save Without Interruption
&lt;/h2&gt;

&lt;p&gt;AIKeep24 never interrupts your conversation. It waits &lt;strong&gt;5 minutes after your last message&lt;/strong&gt; before auto-saving. Burst detection prevents unnecessary saves when you're rapidly iterating.&lt;/p&gt;
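
&lt;p&gt;The timing logic above amounts to a resettable idle timer. The 5-minute window is from the post; the rest of this sketch is illustrative:&lt;/p&gt;

```javascript
const IDLE_MS = 5 * 60 * 1000; // save 5 minutes after the last message

// Pure helper: is the session quiet enough to save?
function isIdle(lastTurnAt, now, idleMs = IDLE_MS) {
  return now - lastTurnAt >= idleMs;
}

// Browser wiring: every detected turn resets the timer, so a burst of
// rapid messages collapses into a single save once you stop typing.
function createAutoSaver(save, idleMs = IDLE_MS) {
  let timer = null;
  return function onTurn() {
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(save, idleMs);
  };
}
```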

&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Extension&lt;/td&gt;
&lt;td&gt;Chrome MV3, 8 modular scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local LLM&lt;/td&gt;
&lt;td&gt;Ollama + EXAONE 3.5 7.8B (4.7GB)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector Search&lt;/td&gt;
&lt;td&gt;Cloudflare Vectorize + bge-m3 (1024d)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend&lt;/td&gt;
&lt;td&gt;Cloudflare Workers (6 modules)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;Cloudflare D1 (SQLite-compatible)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tests&lt;/td&gt;
&lt;td&gt;30 pytest tests, GitHub Actions CI&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Total monthly cost: &lt;strong&gt;$0&lt;/strong&gt; (Cloudflare free tier + local inference)&lt;/p&gt;
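
&lt;p&gt;The search half of the stack can be sketched as a Worker handler: embed the query with bge-m3 on Workers AI, then query Vectorize. The binding names (&lt;code&gt;env.AI&lt;/code&gt;, &lt;code&gt;env.VECTORIZE&lt;/code&gt;) and response shapes are assumptions to check against the Cloudflare docs, not AIKeep24's code:&lt;/p&gt;

```javascript
// Rank matches by score and keep the top k. Vectorize already returns
// ranked matches; re-sorting just makes the sketch self-contained.
function topMatchIds(matches, k) {
  return matches
    .slice()
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((m) => m.id);
}

// Worker-side search path (assumed bindings: AI = Workers AI,
// VECTORIZE = the session-summary vector index).
async function searchSessions(env, query) {
  const emb = await env.AI.run("@cf/baai/bge-m3", { text: [query] });
  const result = await env.VECTORIZE.query(emb.data[0], { topK: 5 });
  return topMatchIds(result.matches, 5);
}
```

&lt;p&gt;The returned IDs would then be hydrated from D1 to show full session summaries.&lt;/p&gt;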

&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
bash
git clone https://github.com/aikorea24/aikeep24.git &amp;amp;&amp;amp; cd aikeep24
OLLAMA_ORIGINS='*' ollama serve &amp;amp; ollama pull exaone3.5:7.8b
Load extension/ folder in chrome://extensions with Developer Mode on.

Requirements: Apple Silicon Mac + 16GB RAM

![ ](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xknu0whd8oe7fn4y77cp.jpg)
Current Scale
120+ sessions, 12,500+ turns in daily use
Semantic search finds past decisions in seconds
Modular codebase: content.js split into 8 modules, worker.js into 6
100% docstring coverage, 30 pytest tests, CI/CD via GitHub Actions
Why Local LLM?
AI conversations often contain proprietary code, architecture decisions, API keys, and business logic. Sending all of that to a cloud summarization service defeats the purpose.

EXAONE 3.5 7.8B runs comfortably on 16GB Apple Silicon and produces structured JSON summaries in ~3 seconds per chunk.

![ ](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6fj8x6f18sc2kmo866nb.jpg)

What's Next
Markdown/JSON export (Obsidian, Notion)
Session delete/edit from search UI
Per-project cumulative knowledge docs
Links
GitHub: github.com/aikorea24/aikeep24
License: AGPL-3.0
Built by: AI Korea 24
If you're drowning in AI chat sessions and can't find anything, give it a try. Feedback and PRs welcome.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
