
Kristoffer Nordström

Originally published at blog.northerntest.se

I Gave My AI a Memory

I had a dream.

Not the inspirational kind. A practical, slightly obsessive dream that had been nagging me for years. I wanted a system that remembers what I know. Not a notes app, not a wiki, not another
tool that promises "second brain" and delivers a folder structure. An actual memory. Something I could ask "what did we discuss about the Zoom integration last month?" and get a real
answer.

I tried building it. A lot. Every attempt died the same way: too much manual work. The technology wasn't there. LLMs could sort of understand text, but they couldn't search it reliably,
couldn't integrate with my existing workflow, couldn't run locally without eating all my VRAM.

Then, over Christmas 2025, it worked.

The pain that started it

The immediate trigger was copy-paste. I'd be in a Claude Code session, working on something, and I'd need context from an email. So I'd switch to Gmail, find the thread, copy the relevant
bits, paste them into the conversation. Five minutes later I'd need a Slack message. Switch, search, copy, paste. Then a calendar event. Then something from a ChatGPT conversation I'd had
two weeks ago.

Every context switch cost me time and focus. And each one added a little mental resistance, a small reason to not bother looking something up, to work from memory instead of checking.
Death by a thousand paper cuts on my mental discipline. And every new Claude session started from zero. No memory of what we discussed yesterday. No knowledge of my preferences, my
projects, my testing philosophy. Just a blank slate with good language skills.

I'd been living with that friction for long enough.

The cornerstones

Before writing a line of code, I knew three things that weren't negotiable.

Local and private. My emails, my calendar, my notes, my conversations. All of it stays on my machine. No cloud service gets to index my life. The embeddings run on my own GPUs. If a
company goes bankrupt or changes their terms of service, nothing happens to my data.

Plain text. Everything stored as org-mode files, the same format I've used in Emacs for years for my task management. Editable in any text editor, forever. No proprietary format, no
database you can't inspect. If the entire system disappeared tomorrow, my knowledge would still be there as readable text files on my hard drive.
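For a sense of what "plain text, forever" means in practice, here is an illustrative note in the style the system stores. The heading, property names, and values are made up for this example; the point is that it's just org-mode syntax any editor (or grep) can read:

```org
* Zoom integration notes from sync call
:PROPERTIES:
:SOURCE:  gmail
:DATE:    [2025-11-14]
:TOPICS:  zoom, integration, webinar
:END:
- Decision: use the server-to-server OAuth app.
- Open question: webhook retry behavior under rate limiting.
```

Nothing here needs special software. If the index dies, the knowledge doesn't.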

Human ownership. The system retrieves, I decide. It doesn't auto-organize my notes. It doesn't delete things it thinks are irrelevant. It doesn't act on my behalf. I curate what matters.

Three principles. They sound obvious. They shaped every technical decision that followed.

What I actually built

The knowledge base indexes nine data sources into a single searchable system: Gmail, Google Calendar, Google Drive, Slack threads, WhatsApp messages, ChatGPT conversation exports, GitHub
issues, handwritten org-mode notes, and captured Claude Code session context. Over forty-four thousand files of my digital life back to 2012, a hundred thousand indexed chunks, all
available through a set of tools that plug directly into Claude Code or any other MCP capable LLM via the Model Context Protocol.

Of course, that sounds clean on paper. In practice, each piece exists because something broke or something was missing.

I started with ChromaDB for the vector database. It worked fine for a while. Then a memory corruption bug silently wiped out part of my index. I only noticed because search results for a
topic I knew I'd written about came back empty. Memory corruption in a memory system. I migrated to LanceDB the same week. It gave me hybrid search (BM25 keyword matching plus vector
similarity) and a Rust engine I actually trust. The migration wasn't about features. It was a trust decision.

Search was the next problem. Pure vector search is great for "find me things about testing philosophy" but terrible for "find me that email from Fredrik Haard." Names, project codes,
technical terms. They get scattered across the vector space in ways that make exact matches unreliable. BM25 handles those perfectly. So the system runs both in parallel and combines the
results. You search for "Fredrik", you get Fredrik. You search for "that conversation about exploratory testing approaches", you get the right conversations even if they never use those
exact words.
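The post doesn't spell out how the two ranked lists get merged, but a common scheme for this kind of hybrid search is Reciprocal Rank Fusion: each list votes by rank, and documents near the top of either list (or both) float up. A minimal sketch, with `rrf_fuse` as a hypothetical name, assuming each search returns document IDs in rank order:

```python
def rrf_fuse(bm25_hits, vector_hits, k=60):
    """Combine two ranked result lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank + 1) per document; documents
    found by both searches accumulate both contributions.
    """
    scores = {}
    for hits in (bm25_hits, vector_hits):
        for rank, doc_id in enumerate(hits):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)
```

A document that BM25 finds by exact name and the vector index finds by topic gets two score contributions, so it outranks anything found by only one signal. That's the behavior you want for "find me that email from Fredrik."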

Then there was the context window problem. I run local LLMs for evaluation testing (a whole other story), and they have tiny context windows. You can't load fifty full documents into 4K
tokens and expect useful answers. So I built two-stage retrieval. First pass: scan compressed doc-card summaries (title, key facts, topics), which cost a fraction of the tokens, to find
what's relevant. Second pass: load the full document only for the hits that matter. Think of a librarian who pulls twenty books off the shelf by topic, then carefully reads each one to
decide which three actually answer your question.
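The librarian idea fits in a few lines. This is a simplified sketch, not the real implementation: the card fields are reduced to a topic list, and stage one scores by naive topic overlap where the actual system presumably reuses its hybrid search.

```python
from dataclasses import dataclass

@dataclass
class DocCard:
    """Compressed summary of a document; cheap in tokens."""
    doc_id: str
    title: str
    topics: list

def two_stage_retrieve(query_terms, cards, load_full_doc, budget=3):
    """Stage 1: rank cheap doc-cards by topic overlap with the query.
    Stage 2: load full text only for the top `budget` hits."""
    scored = []
    for card in cards:
        overlap = len(set(query_terms) & set(card.topics))
        if overlap:
            scored.append((overlap, card.doc_id))
    scored.sort(reverse=True)
    return {doc_id: load_full_doc(doc_id) for _, doc_id in scored[:budget]}
```

Only `budget` full documents ever enter the context window, no matter how many cards were scanned, which is what makes a 4K-token local model usable at all.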

The task management integration was the easy part, actually. I've been running a Getting Things Done system in Emacs org-mode for years. The knowledge base just symlinks to those files.
Tasks live in one place, searchable from everywhere. Not a new system. An extension of one I already trust.

And the whole thing is organized using a PARA structure (Projects, Areas, Resources, Archive) because it maps to how I already think about my life. Work stuff in one place, personal areas
in another, reference material separate from active work. The structure exists so the retrieval stays sane.
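Concretely, the layout might look something like this. The four PARA buckets come from the post; everything inside them, and the symlink target, is illustrative:

```text
knowledge-base/
├── projects/          # active work with an end date (conference-webinar/, ...)
├── areas/             # ongoing responsibilities (work/, family/, ...)
├── resources/         # reference material (testing-heuristics.org, ...)
├── archive/           # finished projects, still searchable
└── tasks -> ~/org/    # symlink to the existing org-mode GTD files
```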

How it changed my work

The best way to describe it: Claude actually knows me now.

Last week I was preparing for a meeting about an upcoming conference webinar. Instead of hunting through Gmail and Slack, I asked Claude to pull up everything related to the planning. Back
came the email thread, my notes from our last call, the talk abstract, and the calendar invite. All in one response, from four different sources.

[Screenshot: knowledge base search pulling context from email, calendar, notes, and a past session]

That used to be four apps and fifteen minutes of tab-switching, with a context switch for every single one.

When I start a morning planning session, the system pulls in my calendar, recent emails, overnight Slack messages, and pending tasks without me having to explain any of it. When I'm about
to test a pull request at work, it already has the project documentation, the domain model, related Slack discussions, and my own testing heuristics loaded. I walk into the exploratory
testing session with context I used to spend twenty minutes assembling by hand. The sapient part (the judgment, the risk sense, the "what if I try this?") is still entirely mine. But now I
arrive prepared instead of cold. That preparation is what makes tools like ettool effective. The context feeds the judgment.

Every Claude session used to start with five minutes of "let me explain the background." Now the background is just there.

And the loop closes. After a session I run a single command that extracts the key decisions and new information from our conversation and files them into the right places in the knowledge
base. Next time anyone (me or the AI) asks about that topic, the latest context is already there.

[Screenshot: capturing session context back into the knowledge base]

But the real change was subtler. I stopped losing things. Ideas I captured in a ChatGPT conversation six weeks ago are findable. An email from a colleague about a technical approach
surfaces when I'm working on the related project. Meeting notes from three months ago appear when the topic comes up again.

And it's not just work. I used the system to plan Christmas gifts last year. Tracking what my nephews are into, what my wife mentioned wanting months ago, coordinating several 40106 synth
modules and a banana synth build with my daughter. It sounds silly. But that's the point. It's not a developer productivity tool. It's a memory system for your actual life.

I'd spent years building habits around "put the right thing in the right folder." Now the right thing finds me when I need it. That shift was bigger than I expected.

I'm not automating myself away. I'm removing the cost of loading and saving context.

The trap nobody warns you about

And then something happened that I didn't anticipate. I started working faster. A lot faster. Context switches disappeared. Research that used to take an hour took minutes. Writing with
full context instead of half-remembered details meant fewer rewrites.

Sounds great.

It wasn't entirely.

A former employer nicknamed me the Duracell bunny. Just keep going, one more thing, you're on a roll. Removing friction turned out to be a trap I didn't see coming. When everything becomes
frictionless, the limiting factor isn't productivity. It's knowing when to stop.

I'm not great at that. More on this in a future post.

Why this matters beyond me

Context infrastructure is what makes AI collaboration work. Not better models. Not cleverer prompts. The thing that changed my daily work wasn't switching from GPT-4 to Claude or tweaking
my prompting style. It was giving the AI access to what I know, what I've done, and how I think. And with local AI, it all stays within my control, secure and private.

Every indexed document, every curated fact is a piece of context I'll never have to explain again. It's a way of systematically moving intelligence from runtime to build-time, while
keeping human judgment as the final authority.

That's the architecture nobody sees. And it matters more than the model.


This is the first in a series about what happened when I gave my AI a memory. Next up: what happens when you remove all the friction from your work, and why I had to deliberately put some
back.
