DEV Community

Anushka Shukla

LLM Wiki: A Smarter Alternative to RAG

Every developer I know has the same problem.
You read a great article. You save it. You take notes. You bookmark three more links. A month later, you need that knowledge again and you're starting from scratch, re-reading the same things, rediscovering what you already knew.
So you do what every sensible developer does: you build a RAG pipeline.
You chunk your documents. You embed them. You set up a vector database. You wire up a retrieval layer. Three days later, you have a system that answers questions from your documents, sort of. Sometimes the chunks are too small. Sometimes the wrong document gets retrieved. Sometimes the answer is technically correct but misses the point entirely.
And the worst part? The system learns nothing. Every query starts from zero.
There's a better way. Andrej Karpathy, co-founder of OpenAI and former Director of AI at Tesla, published a deceptively simple idea in April 2026 that has been quietly spreading through the developer community ever since.
He calls it the LLM Wiki.

What's Wrong with RAG?
Before I explain LLM Wiki, let me be fair to RAG. It's not bad. For large enterprise document stores (thousands of PDFs, legal contracts, customer records), RAG is the right tool. It scales in ways that other approaches can't.
But most developers aren't building enterprise search engines. They're trying to get an AI to work intelligently with their knowledge: their notes, their research, their team's documentation.
And for that use case, RAG has a fundamental flaw: it's stateless.
Every time you ask a question, the system re-reads your raw documents from scratch. It retrieves chunks. It answers. It forgets. The next question, same thing. There's no accumulation. No synthesis. No memory of what it figured out last time.
It's like hiring a brilliant researcher who reads your entire library before answering every single question, and then immediately forgets everything when you ask the next one.
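To make the statelessness concrete, here is a minimal sketch of a retrieval step. It is a toy under stated assumptions: a bag-of-words cosine similarity stands in for a real embedding model, and the names `tokenize`, `cosine`, and `retrieve` are mine, not from any RAG library. The point is only that the function takes the query and the raw chunks and writes nothing back.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query.

    The only inputs are the query and the raw chunks, and nothing is
    stored afterwards, so a second call knows nothing about the first.
    """
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


chunks = [
    "Embeddings map text to vectors that capture meaning.",
    "A vector database stores embeddings for fast lookup.",
    "Obsidian is a Markdown note-taking app.",
]

# Asking the same question twice repeats exactly the same work.
first = retrieve("What do embeddings capture?", chunks)
second = retrieve("What do embeddings capture?", chunks)
```

Both calls rank every chunk again and return the same result. Whatever the system worked out the first time is not kept anywhere.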

The LLM Wiki Pattern
Karpathy's insight is almost embarrassingly simple once you hear it.
Instead of querying raw documents over and over, you have an LLM compile your sources into a structured wiki of plain Markdown files, once. From then on, you query the compiled wiki, not the raw sources.
Think of it like this:

"Obsidian is the IDE, the LLM is the programmer, the wiki is the codebase."
— Andrej Karpathy

Your raw documents are source material. The LLM reads them and produces clean, structured, interlinked pages, one per concept. Like Wikipedia, but for your personal knowledge.
When you have a new source, you add it and run the compilation again. The LLM updates the relevant pages, adds new ones, and notes any contradictions. The wiki grows. The knowledge compounds.
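The update step could be sketched roughly like this. Everything here is an assumption for illustration: `ingest` is a hypothetical helper, `llm` is any function you supply that turns a prompt into Markdown, and the caller names the concept, whereas in the pattern as described the LLM itself decides which pages to touch.

```python
from pathlib import Path
from typing import Callable

# Hypothetical interface: any function that maps a prompt to Markdown text.
LLM = Callable[[str], str]


def ingest(source: Path, concept: str, wiki_dir: Path, llm: LLM) -> Path:
    """Fold one source into the wiki page for `concept`, creating it if needed."""
    wiki_dir.mkdir(parents=True, exist_ok=True)
    page = wiki_dir / f"{concept}.md"

    # The existing page goes into the prompt, so earlier work is kept
    # and revised rather than rebuilt from the raw sources.
    existing = page.read_text(encoding="utf-8") if page.exists() else ""
    prompt = (
        f"Update the wiki page for '{concept}'.\n\n"
        f"CURRENT PAGE:\n{existing or '(none yet)'}\n\n"
        f"NEW SOURCE ({source.name}):\n{source.read_text(encoding='utf-8')}\n\n"
        "Return the full revised page in Markdown and note any contradictions."
    )
    page.write_text(llm(prompt), encoding="utf-8")
    return page
```

The design choice that matters is that the current page is part of the input. That is what lets the wiki accumulate instead of starting from zero on every run.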

What a Wiki Page Looks Like
Here's an example of what the LLM produces: a page for the concept "vector embeddings".
```markdown
# Vector Embeddings

Type: Concept

Related: [[RAG]], [[Semantic Search]], [[Neural Networks]]

## Summary

Vector embeddings are numerical representations of text that capture semantic
meaning. Similar concepts produce vectors that are close together in space.

## How They Work

Text is passed through a model, which outputs a list of numbers: the embedding.
These numbers encode the meaning of the text in a way that allows mathematical
comparison.

## Where They're Used

- RAG pipelines (retrieving relevant document chunks)
- Semantic search engines
- Recommendation systems

## Limitations

- Embeddings from different models aren't compatible
- Quality degrades on very long texts
- Require a vector database to query at scale

## Sources

- [[Attention Is All You Need - Vaswani et al.]]
- [[OpenAI Embeddings Documentation]]
```

Clean. Structured. Cross-referenced. Written once, queryable forever.
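Because the cross-references are plain `[[wikilinks]]`, a few lines of code can read them back out. The sketch below is one possible way to do it; `extract_links` and `backlinks` are hypothetical helpers, and the pattern also accepts the `[[Page|alias]]` and `[[Page#Heading]]` forms that Obsidian supports.

```python
import re
from collections import defaultdict

# Captures the page name in [[Page]], [[Page|alias]] and [[Page#Heading]].
WIKILINK = re.compile(r"\[\[([^\[\]|#\n]+)(?:[|#][^\[\]\n]*)?\]\]")


def extract_links(markdown: str) -> list[str]:
    """Return wikilink targets in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in WIKILINK.finditer(markdown):
        seen.setdefault(match.group(1).strip(), None)
    return list(seen)


def backlinks(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each link target to the titles of the pages that link to it."""
    index: dict[str, list[str]] = defaultdict(list)
    for title in sorted(pages):
        for target in extract_links(pages[title]):
            index[target].append(title)
    return dict(index)


pages = {
    "Vector Embeddings": "Related: [[RAG]], [[Semantic Search]], [[Neural Networks]]",
    "RAG": "Retrieval relies on [[Vector Embeddings|embeddings]].",
}
links = extract_links(pages["Vector Embeddings"])
index = backlinks(pages)
```

A backlink index like this is a cheap way to spot concepts that are linked often but have no page of their own yet.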

RAG vs LLM Wiki: A Direct Comparison
| | RAG | LLM Wiki |
|---|---|---|
| Learns over time? | No | Yes |
| Infrastructure needed | Vector DB + embeddings + retrieval pipeline | A text editor |
| Query cost | High (re-reads docs every time) | Low (reads structured wiki) |
| Cross-source synthesis | Weak | Strong |
| Best for | Millions of documents | Personal / team knowledge |

How to Try It in 30 Minutes
You don't need to write any code to try this. Here's the simplest possible version:
Step 1 — Gather your sources
Create a folder. Drop in 3–5 Markdown or text files: articles you've saved, notes you've taken, documentation you reference often.
Step 2 — Compile the wiki
Open Claude.ai, upload your files, and paste this prompt:
Read these documents and create structured Markdown wiki pages for the key
concepts. For each concept, write a summary, list related concepts using
[[wikilinks]], and note the source. Output each page as a separate code block.
Step 3 — Save and query
Copy the generated pages into a folder. Open it in Obsidian (free). Now ask questions against your wiki instead of your raw documents.
That's it. No vector database. No embedding model. No retrieval pipeline. Just Markdown files and an LLM.
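If you get tired of copying pages by hand in Step 3, a small script can do the saving. This is a sketch under assumptions: it expects the response to follow the prompt above (one fenced code block per page) and each page to start with a `# Title` heading. `save_pages` is a hypothetical helper, not part of any tool.

```python
import re
from pathlib import Path

# The fence marker is built indirectly so that this snippet can itself
# be pasted inside a fenced block without ending it early.
FENCE = "`" * 3
BLOCK = re.compile(
    re.escape(FENCE) + r"[^\n]*\n(.*?)\n" + re.escape(FENCE),
    re.DOTALL,
)


def save_pages(response: str, wiki_dir: Path) -> list[Path]:
    """Write each fenced block in an LLM response to '<title>.md' in wiki_dir.

    The title comes from the block's first '# ' heading. Blocks without
    one are skipped rather than guessed at.
    """
    wiki_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for body in BLOCK.findall(response):
        heading = re.search(r"^# +(.+)$", body, re.MULTILINE)
        if heading is None:
            continue
        # Replace characters that are unsafe in file names.
        name = re.sub(r'[\\/:*?"<>|]', "-", heading.group(1).strip())
        path = wiki_dir / f"{name}.md"
        path.write_text(body.strip() + "\n", encoding="utf-8")
        written.append(path)
    return written
```

Paste the model's full reply into a string, call `save_pages(reply, Path("wiki"))`, and open the `wiki` folder in Obsidian.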

When to Stick with RAG
LLM Wiki isn't a replacement for RAG in every situation. Use RAG when:

  • You have thousands of documents: the wiki would become too large for a context window
  • Your documents update constantly: real-time data needs real-time retrieval
  • You need exact citations: RAG can point to the precise chunk; a wiki synthesizes across sources

For everything else (personal research, team knowledge bases, project documentation, learning notes), the LLM Wiki pattern is simpler, cheaper, and produces better answers.
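The first criterion, whether the wiki still fits in a context window, can be checked roughly before you commit either way. The sketch below rests on loud assumptions: four characters per token is only a rule of thumb for English text, not a real tokenizer, and `context_tokens` is whatever limit your model documents. Both function names are hypothetical.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough rule of thumb for English text, not a tokenizer


def estimate_tokens(wiki_dir: Path) -> int:
    """Approximate the token count of every Markdown file under wiki_dir."""
    total_chars = sum(
        len(p.read_text(encoding="utf-8")) for p in wiki_dir.rglob("*.md")
    )
    return total_chars // CHARS_PER_TOKEN


def fits_in_context(wiki_dir: Path, context_tokens: int, reserve: float = 0.25) -> bool:
    """True if the whole wiki fits, leaving `reserve` of the window free
    for the question and the answer."""
    budget = int(context_tokens * (1 - reserve))
    return estimate_tokens(wiki_dir) <= budget
```

If this starts returning `False`, that is a sign to query a subset of pages at a time, or to reach for retrieval after all.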

The Bigger Idea
What Karpathy identified isn't just a technical pattern. It's a shift in how we think about AI and knowledge.
Most AI tools treat your documents as static inputs: things to be searched. The LLM Wiki treats your knowledge as something alive, something that grows, connects, and compounds every time you add to it.
RAG answers questions. LLM Wiki builds understanding.
And honestly? That's the kind of AI tool I actually want to use.

Have you tried building a personal knowledge base with LLMs? Drop a comment below. I'd love to hear what's working for you.

Want a deeper dive?
I wrote a more detailed version of this on Medium, covering the full history behind the idea (including Vannevar Bush's 1945 Memex concept) and a step-by-step walkthrough for three different implementation paths.

Read the full article on Medium

If you find it useful, a follow there helps me write more content like this.
