DEV Community: Fex Beck

I accidentally built Karpathy's LLM Wiki — with 5,420 memories, 6 AI agents, and a self-healing knowledge graph

Fex Beck — Thu, 16 Apr 2026 19:29:44 +0000

When Andrej Karpathy published his LLM Wiki pattern on April 4, 2026, I had a strange feeling of déjà vu. I'd been building exactly this, a persistent, compounding knowledge system maintained by AI agents, for about a month on my homelab server in Bavaria. Only mine had evolved into something he didn't describe: a multi-agent cognitive engine that fact-checks itself at 2 AM.

What Karpathy Described

Karpathy's core insight is elegant: RAG is stateless. Every query starts from scratch, searching raw documents. A wiki compounds knowledge — each query refines and connects what the system knows.

He proposes three layers (raw sources, wiki entries, schema) and three operations:

Ingest — extract structured knowledge from raw sources, cross-reference with existing entries
Query — search the wiki, synthesize an answer, optionally write back
Lint — periodically health-check for staleness, contradictions, gaps

It's a beautiful pattern. And I'd been living it for a month before he published it.

What I Built (Before Reading His Post)

BrainDB started in March 2026 as a dumb JSON store so my Claude Code sessions could remember things between conversations. "Just save the SSH password somewhere," I thought.

Five weeks later, it had grown into this:

5,420+ memories across 11 types: credential, service, project, feedback, lesson, issue, reference, user, note, research, decision
Hybrid search — SQLite FTS5 + semantic embeddings (Ollama on a local GPU) + Reciprocal Rank Fusion
Knowledge graph — 551 relations with automatic entity extraction
Multi-agent coordination — advisory locks, heartbeats, session handovers
105+ API endpoints, 40+ MCP tools wired directly into Claude Code

The architecture wasn't planned. It was grown, one annoying problem at a time. "Why does Claude keep forgetting how my firewall works?" led to memory types. "Why did two agents just edit the same file?" led to coordination. "Why is this credential from last week wrong?" led to contradiction detection.

The Mapping — His Pattern, My Implementation

When I read Karpathy's gist, I started mapping concepts:

Karpathy's Concept	BrainDB's Implementation
Raw Sources -> Wiki -> Schema	`research` memories -> `recall`/`search` -> `CLAUDE.md` bootstrap
Ingest (extract + cross-reference)	`/learn` + `/wiki/ingest` (entity extraction, relation creation)
Query (search wiki + synthesize)	`/ask` RAG pipeline + `/wiki/synthesize`
Lint (health check)	`/wiki/lint` + autoDream + contradiction detection
Index file	`/recall` + `/hybrid-search` (FTS5 + vectors)
Log file	`/wiki/log` (append-only activity chronicle)
"The LLM maintains everything"	6 AI agents, each with a specialty

The overlap was uncanny. But the differences were more interesting.

Where BrainDB Goes Further

Karpathy describes a single LLM maintaining a wiki. BrainDB does that, but it also does things his pattern doesn't cover:

1. Inception Knowledge — The 2 AM Fact-Checker

Every night at 2:30 AM, a cron job kicks off a dream cycle:

Wake GPU PC via Wake-on-LAN
-> Query SearXNG for facts to verify
-> Pre-gate with local 14B model (free)
-> Fact-check survivors with Mistral API (free tier)
-> Store validated findings back in BrainDB
-> Shut down GPU PC

Last week it caught that a Docker image I'd pinned had a CVE published two days prior. I woke up to a memory tagged type: issue with the CVE number and a suggested fix. Self-healing knowledge isn't a feature — it's the whole point.
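The ordering in the dream cycle (a free local pre-gate first, the rate-limited API only for survivors) can be sketched as a two-stage filter. This is a simplified illustration, not BrainDB's code: `runDreamCycle`, `preGate`, and `factCheck` are hypothetical names, and the real stages would be asynchronous calls to a local model and the Mistral API rather than plain functions.

```typescript
type Verdict = "confirmed" | "contradiction";

interface Finding {
  claim: string;
  verdict: Verdict;
}

// Stage 1 (cheap) decides whether a claim is worth checking at all;
// stage 2 (expensive) runs only on the claims that survive stage 1.
function runDreamCycle(
  claims: string[],
  preGate: (claim: string) => boolean,
  factCheck: (claim: string) => Verdict,
): Finding[] {
  return claims
    .filter(preGate)
    .map((claim) => ({ claim, verdict: factCheck(claim) }));
}

// Usage with stand-in stages (both lambdas are placeholders for model calls).
let expensiveCalls = 0;
const findings = runDreamCycle(
  ["PostgreSQL default port is 5433", "I like coffee"],
  (claim) => /port|version|CVE/i.test(claim),
  (claim) => {
    expensiveCalls++;
    return claim.includes("5433") ? "contradiction" : "confirmed";
  },
);
console.log(findings.length, expensiveCalls); // 1 1
```

The point of the shape is cost control: the expensive checker is called once per surviving claim, never once per stored memory.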

2. Contradiction Detection

Five automated strategies scan for conflicts: port collisions, status mismatches, decision reversals, credential drift, and temporal impossibilities. When two memories disagree, the system flags it and optionally routes to Mistral for arbitration.

{
  "strategy": "port_conflict",
  "memory_a": "Grafana runs on port 3000",
  "memory_b": "Brain dashboard runs on port 3000",
  "severity": "high",
  "auto_resolved": false
}

This is Karpathy's "lint" on steroids. His lint is a periodic health check for staleness, contradictions, and gaps. Mine runs dedicated strategies that test whether the knowledge is internally consistent.
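To make the `port_conflict` strategy concrete, here is a minimal sketch of how such a check could work. It is an assumption about the approach, not BrainDB's implementation: `StoredMemory`, `extractPort`, and `findPortConflicts` are hypothetical names, and the regex only catches the literal phrase "port <number>".

```typescript
interface StoredMemory {
  id: string;
  content: string;
}

interface PortConflict {
  strategy: "port_conflict";
  port: number;
  memory_a: string;
  memory_b: string;
}

// Naive extraction: the first "port <number>" mention in the text, or null.
function extractPort(content: string): number | null {
  const match = /\bport\s+(\d{1,5})\b/i.exec(content);
  return match ? Number(match[1]) : null;
}

// Flags every memory whose port was already claimed by an earlier memory.
function findPortConflicts(memories: StoredMemory[]): PortConflict[] {
  const firstSeen = new Map<number, StoredMemory>();
  const conflicts: PortConflict[] = [];
  for (const memory of memories) {
    const port = extractPort(memory.content);
    if (port === null) continue;
    const earlier = firstSeen.get(port);
    if (earlier) {
      conflicts.push({
        strategy: "port_conflict",
        port,
        memory_a: earlier.content,
        memory_b: memory.content,
      });
    } else {
      firstSeen.set(port, memory);
    }
  }
  return conflicts;
}

const portConflicts = findPortConflicts([
  { id: "1", content: "Grafana runs on port 3000" },
  { id: "2", content: "Brain dashboard runs on port 3000" },
]);
console.log(portConflicts.length); // 1
```

A check this naive also flags two memories that describe the same service, so a real strategy would need some notion of which entity each memory is about.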

3. Multi-Agent Coordination

Not one LLM — six, each with a role:

Agent	Cost	Role
Claude Code (Opus)	expensive	Orchestration, complex multi-step tasks
Mistral	free	Strategy, analysis, fact-checking
Codestral	free	Code review, refactoring, security
Codex	subscription	Autonomous multi-file coding
Local 14B (Ollama)	free	Batch processing, embeddings, offline fallback
Vibe	free	Quick prototyping

They coordinate through BrainDB: advisory locks prevent two agents from editing the same project, heartbeats track who's alive, and handover records let a morning session pick up exactly where the 2 AM dream cycle left off.
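The lock-plus-heartbeat idea can be sketched as a claim table with expiry. This is an in-memory illustration under stated assumptions: the class name `AdvisoryLocks`, the 60-second default TTL, and the method names are all hypothetical, and a shared system would keep the claims in the database rather than in process memory.

```typescript
interface Claim {
  agent: string;
  expiresAt: number; // epoch milliseconds
}

class AdvisoryLocks {
  private claims = new Map<string, Claim>();
  private ttlMs: number;

  constructor(ttlMs: number = 60000) {
    this.ttlMs = ttlMs;
  }

  // Returns true if `agent` now holds the claim on `resource`.
  // A claim held by another agent blocks only while it has not expired.
  claim(resource: string, agent: string, now: number = Date.now()): boolean {
    const current = this.claims.get(resource);
    if (current && current.agent !== agent && current.expiresAt > now) {
      return false;
    }
    this.claims.set(resource, { agent, expiresAt: now + this.ttlMs });
    return true;
  }

  // A heartbeat extends the agent's own live claim; it never takes over another agent's.
  heartbeat(resource: string, agent: string, now: number = Date.now()): boolean {
    const current = this.claims.get(resource);
    if (!current || current.agent !== agent || current.expiresAt <= now) {
      return false;
    }
    current.expiresAt = now + this.ttlMs;
    return true;
  }

  release(resource: string, agent: string): void {
    const current = this.claims.get(resource);
    if (current && current.agent === agent) {
      this.claims.delete(resource);
    }
  }
}

const lockTable = new AdvisoryLocks(1000);
console.log(lockTable.claim("project-x", "claude", 0));   // true
console.log(lockTable.claim("project-x", "codex", 500));  // false, claim still live
console.log(lockTable.claim("project-x", "codex", 1500)); // true, claim expired
```

Tying the claim to a TTL means an agent that crashes without releasing cannot block a project forever; its claim simply lapses when the heartbeats stop.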

4. Temporal Decay

Every memory has a freshness score with a 30-day half-life. A credential stored yesterday scores close to 1.0. The same credential scores 0.5 after 30 days and 0.25 after 60. Search results factor in freshness, so you naturally get current information first.

Karpathy's wiki doesn't age. Mine does. Because facts spoil.
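The decay curve is plain exponential half-life arithmetic. A minimal sketch, assuming the score depends only on age in days (the function name `freshness` is illustrative, not BrainDB's API):

```typescript
// Exponential decay: the score halves every `halfLifeDays`.
// The 30-day default is the half-life quoted in the article.
function freshness(ageDays: number, halfLifeDays: number = 30): number {
  if (ageDays <= 0) return 1; // brand-new memories count as fully fresh
  return Math.pow(0.5, ageDays / halfLifeDays);
}

console.log(freshness(1).toFixed(3)); // 0.977
console.log(freshness(30));           // 0.5
console.log(freshness(60));           // 0.25
```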

5. Context Budget API

Ask for 4,000 tokens of context about "firewall rules" and you get the most relevant memories, scored, ranked, and packed to fit that budget:

curl http://braindb:3197/compact \
  -H 'Content-Type: application/json' \
  -d '{"query": "firewall rules", "budget": 4000}'

This turns BrainDB from a search engine into a context engine. The LLM doesn't drown in irrelevant results — it gets a curated briefing, every time.
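One simple way to fill a token budget is greedy packing by score. The sketch below is an assumption about the general technique, not the `/compact` endpoint's actual algorithm: `RankedMemory`, `estimateTokens`, and `packContext` are hypothetical names, and the 4-characters-per-token estimate is a rough stand-in for a real tokenizer.

```typescript
interface RankedMemory {
  id: string;
  text: string;
  score: number; // higher means more relevant
}

// Rough token estimate: about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedy packing: walk memories in descending score order and keep each one
// that still fits. Memories that would overflow the budget are skipped, so a
// smaller, lower-ranked memory can still use the remaining space.
function packContext(memories: RankedMemory[], budget: number): RankedMemory[] {
  const sorted = [...memories].sort((a, b) => b.score - a.score);
  const picked: RankedMemory[] = [];
  let used = 0;
  for (const memory of sorted) {
    const cost = estimateTokens(memory.text);
    if (used + cost > budget) continue;
    picked.push(memory);
    used += cost;
  }
  return picked;
}

const briefing = packContext(
  [
    { id: "a", text: "x".repeat(40), score: 0.9 }, // about 10 tokens
    { id: "b", text: "x".repeat(80), score: 0.8 }, // about 20 tokens
    { id: "c", text: "x".repeat(8), score: 0.1 },  // about 2 tokens
  ],
  12,
);
console.log(briefing.map((m) => m.id)); // a and c fit; b does not
```

Greedy packing never exceeds the budget but does not guarantee the best possible fill. For a context briefing that trade-off is usually acceptable.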

The Wiki Integration (After Reading His Post)

Credit where it's due: Karpathy's framing gave me four endpoints I was missing.

/wiki/synthesize — Takes related memories and asks Mistral to produce a synthesis. Connections I hadn't seen emerged.
/wiki/lint — A proper health score: orphan detection, staleness audit, coverage gaps, contradiction count.
/wiki/ingest — Cascading ingest with entity extraction and auto-linking to the knowledge graph.
/wiki/log — An append-only chronicle of every knowledge mutation.

The lint run was humbling. The first pass flagged 20,647 contradictions. On review, 99.7% turned out to be false positives, mostly memories that described the same thing at different points in time. After dismissing those, the health score went from 0 to 69. There's work to do.

Lessons Learned

Knowledge compounds, but only if maintenance cost is zero. This is Karpathy's core insight and it's exactly right. BrainDB's maintenance is automated — dream cycles, contradiction detection, temporal decay. If I had to manually curate 5,420 memories, I'd have stopped at 50.

A single LLM is not enough. Different models have different strengths and cost profiles. Claude orchestrates. Mistral analyzes. Codestral reviews code. The local model handles bulk work for free. The ensemble is smarter and cheaper than any single model.

Contradiction detection is the killer feature nobody talks about. Everyone builds RAG. Nobody builds systems that notice when their own knowledge is wrong. This is where the real reliability comes from.

The wiki pattern works better as an API than as markdown files. Karpathy describes a folder of markdown files. That works for a prototype. But once you need search, scoring, freshness, coordination, and budget-aware retrieval, you need a database with an API.

Try It Yourself

Karpathy's LLM Wiki pattern is a great mental model. But the real magic happens when you add agents, contradictions, and a system that learns while you sleep. BrainDB started as a hack to save SSH passwords and became the nervous system for everything I build.

If you want to try the lightweight version: BrainDB Light+ is open source — single Docker container, SQLite-backed, under 500 KB, with hybrid search and the full wiki pattern built in.

What does your AI remember between sessions? And more importantly — does it know when it's wrong?

This is the second article in my BrainDB series. The first one, "I Built an AI Memory That Fact-Checks Itself While You Sleep", covers the Inception Knowledge system in detail.

How to give Claude Code persistent memory with 51 MCP tools

Fex Beck — Wed, 08 Apr 2026 20:40:47 +0000

The Problem

Claude Code uses MEMORY.md files for persistence. 200-line limit, no search, no validation, no multi-agent support. After months of fighting this, I built BrainDB.

Setup (2 minutes)

Clone, configure, start:

git clone https://github.com/beckfexx/BrainDB.git
cd BrainDB && cp .env.example .env
bun install && bun run start

Add to your MCP config (Claude Code settings):

{
  "mcpServers": {
    "braindb": {
      "command": "bun",
      "args": ["run", "src/mcp-client.ts"],
      "env": { "BRAINDB_URL": "http://localhost:3197" }
    }
  }
}

What you get (51 tools)

Smart Context — Claude loads relevant memories based on CWD and git branch.

Decisions that stick — Authoritative memories that supersede conflicting old ones.

Session Handovers — Pass context to the next session seamlessly.

Contradiction Detection — Flags conflicting stored facts automatically.

Nightly Self-Learning — Validates memories against web sources overnight.

MEMORY.md vs BrainDB

Feature	MEMORY.md	BrainDB
Search	None	FTS5 + embeddings
Capacity	200 lines	Unlimited
Validation	None	Nightly fact-checking
Multi-agent	File conflicts	Claims + handovers
Contradictions	Manual	Auto-detected

The 51 Tools grouped

Memory (15): remember, recall, forget, decide, context, validate, learn...
Search (5): semantic-search, hybrid-search, embeddings...
Agents (8): heartbeat, claim, release, handover, inbox...
Graph (10): entities, relations, contradictions...
System (8): health, stats, backup, capabilities...
Inception (5): dream, decay, inception findings...

Every tool Claude Code needs for real persistent memory. No more MEMORY.md limitations.

GitHub: beckfexx/BrainDB — AGPL-3.0, TypeScript.

Why SQLite+FTS5 beats Vector DBs for AI Agent Memory

Fex Beck — Wed, 08 Apr 2026 20:40:09 +0000

The conventional wisdom is wrong

Everyone says you need a vector database for AI memory. Pinecone, Weaviate, Qdrant. They all need a server, an API key, and a monthly bill.

I went a different way: SQLite + FTS5. One file. Zero dependencies. Better results.

How it works

BrainDB stores 4,300+ memories in a single SQLite file with three search modes:

1. Full-text search (FTS5) — Sub-millisecond keyword search with BM25 ranking.

2. Embedding similarity — 384-dim vectors stored as BLOBs, cosine similarity computed in TypeScript.

3. Hybrid search — Reciprocal Rank Fusion combines both for best-of-both-worlds retrieval.
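Reciprocal Rank Fusion itself is only a few lines. The sketch below shows the standard formula, where each document scores the sum of 1 / (k + rank) over every list it appears in. The function name is illustrative, and k = 60 is the value commonly used in the RRF literature. The article does not state BrainDB's constant.

```typescript
// Reciprocal Rank Fusion over any number of ranked ID lists.
// Ranks start at 1; documents missing from a list contribute nothing for it.
function reciprocalRankFusion(rankings: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map((entry) => entry[0]);
}

const ftsHits = ["m1", "m2", "m3"];    // keyword (FTS5) order
const vectorHits = ["m3", "m1", "m4"]; // embedding-similarity order
console.log(reciprocalRankFusion([ftsHits, vectorHits])); // order: m1, m3, m2, m4
```

RRF needs only ranks, not raw scores, so BM25 values and cosine similarities never have to be normalized onto a common scale.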

Custom relevance scoring

SQLite custom functions run type-aware ranking inside the database:

Decisions get +0.3 boost (authoritative)
Issues get -0.1 (often resolved)
Superseded memories return 0
Exponential time decay with type-specific half-lives
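The rules above can be combined into one scoring function. This is a sketch with loud assumptions: the article gives the boosts (+0.3, -0.1) and the superseded rule, but not the half-life values or how boost and decay are combined, so the numbers in `HALF_LIFE_DAYS` and the "boost first, then decay" order are placeholders. All names are hypothetical.

```typescript
type MemoryType = "decision" | "issue" | "credential" | "note";

interface ScoredInput {
  type: MemoryType;
  baseScore: number; // e.g. a normalized search score in [0, 1]
  ageDays: number;
  superseded: boolean;
}

// Placeholder half-lives: the article says they are type-specific but gives no values.
const HALF_LIFE_DAYS: Record<MemoryType, number> = {
  decision: 180,
  issue: 14,
  credential: 30,
  note: 60,
};

// Boosts taken from the article: decisions +0.3, issues -0.1.
const TYPE_BOOST: Partial<Record<MemoryType, number>> = {
  decision: 0.3,
  issue: -0.1,
};

function relevance(m: ScoredInput): number {
  if (m.superseded) return 0; // replaced memories never rank
  const decay = Math.pow(0.5, Math.max(0, m.ageDays) / HALF_LIFE_DAYS[m.type]);
  const boosted = m.baseScore + (TYPE_BOOST[m.type] ?? 0);
  return Math.max(0, boosted * decay);
}

console.log(relevance({ type: "decision", baseScore: 0.9, ageDays: 0, superseded: true })); // 0
console.log(relevance({ type: "note", baseScore: 0.5, ageDays: 60, superseded: false }));   // 0.25
```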

The numbers

Metric	SQLite+FTS5	Pinecone Free
Latency	<1ms	50-200ms
Setup	0 minutes	15 minutes
Cost	$0	$0-70/mo
Backup	cp file.db backup.db	API call
Offline	Yes	No

For most AI agent memory use cases, SQLite is the right choice.

Try it: github.com/beckfexx/BrainDB — AGPL-3.0, TypeScript, Bun.

I built an AI memory that fact-checks itself while you sleep

Fex Beck — Wed, 08 Apr 2026 19:45:46 +0000

The Problem

AI agents forget everything between sessions. Claude Code uses MEMORY.md files — a 200-line limit, no search, no validation. After months of manually maintaining memory files, I built something better.

What is BrainDB?

BrainDB is a local-first AI memory system built on SQLite. No cloud, no vector database, no subscriptions. One SQLite file, 110 REST endpoints, 51 MCP tools.

Why SQLite instead of Pinecone/Weaviate?

SQLite with FTS5 gives you sub-millisecond full-text search with BM25 ranking. Combined with embedding vectors (stored as BLOBs), you get hybrid search — keyword precision + semantic understanding. For <100k memories, this beats any managed vector DB in latency and simplicity.
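For the embedding half, storing vectors as BLOBs means decoding raw bytes and computing cosine similarity in application code. A minimal sketch, assuming the BLOB holds packed float32 values in the platform's native byte order (the article does not document BrainDB's storage layout, and both function names are illustrative):

```typescript
// Decode a BLOB (raw bytes) into float32 values.
function blobToVector(blob: Uint8Array): Float32Array {
  // Copy first so the float view starts at offset 0 and is 4-byte aligned,
  // whatever offset the source bytes had inside their original buffer.
  const copy = new Uint8Array(blob);
  return new Float32Array(copy.buffer, 0, Math.floor(copy.byteLength / 4));
}

function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Round trip: a vector written as bytes, read back, and compared with a query.
const storedBlob = new Uint8Array(new Float32Array([1, 0, 0]).buffer);
const queryVector = new Float32Array([1, 0, 0]);
console.log(cosineSimilarity(blobToVector(storedBlob), queryVector)); // 1
```

A brute-force scan like this is linear in the number of memories, which fits the article's point that the approach targets collections well under 100k entries.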

The killer features nobody else has

1. Inception — Nightly fact-checking

Every night, BrainDB picks high-importance memories, generates search queries, searches the web via SearXNG, and fact-checks them with an LLM. Last run found 26 outdated facts automatically.

[inception] Processing: "PostgreSQL default port is 5433"
[inception] Web search: 3 results confirm port 5432
[inception] Result: CONTRADICTION — stored fact is wrong

2. autoDream — Knowledge consolidation

While you sleep, BrainDB merges duplicate memories, archives stale ones, and adjusts importance scores. Like defragmenting your AI's brain.

3. Multi-agent coordination

Multiple AI agents can share the same memory with pessimistic locking (claims), context handovers, heartbeat monitoring, and a signal protocol. No conflicts, no data races.

Architecture

MCP Client (Claude/Cursor) → MCP Server → REST API → SQLite+FTS5
                                              ↓
                                    Ollama (local embeddings)

7 route modules: Memory, Search, Agents, Graph, Orchestrator, System, Inception.

Relevance Scoring

Custom SQLite function with type-aware ranking:

Decisions get +0.3 boost (authoritative)
Issues get -0.1 (often resolved)
Superseded memories return 0 (replaced by newer version)
Exponential time decay with type-specific half-lives

Quick Start

git clone https://github.com/beckfexx/BrainDB.git
cd BrainDB && cp .env.example .env
bun install && bun run start
# → http://localhost:3197/health

Or with Docker:

docker compose up -d

Numbers

4,300+ memories in production
110 REST endpoints across 7 domain modules
51 MCP tools for Claude Code / Cursor
0 TypeScript errors
AGPL-3.0 — free for self-hosting

Compared to alternatives

Feature	BrainDB	MemPalace	LangChain Memory
Hybrid search (FTS5+embeddings)	✅	❌	❌
Self-learning (Inception)	✅	❌	❌
Multi-agent coordination	✅	❌	⚠️ Basic
Local-first (no cloud)	✅	✅	❌
Knowledge graph	✅	❌	❌
Production-tested	✅ 4.3k memories	Demo only	Framework

GitHub: beckfexx/BrainDB

Feedback welcome. This is a solo project — every star, issue, or PR helps.