AI coding assistants still have a fundamental limitation: they forget everything between sessions.
Every time you open a new conversation, you re-explain your project structure, conventions, and past decisions. I wanted to fix that — locally, without external dependencies — using Model Context Protocol (MCP).
Most people assume this requires vector embeddings. After building and testing both approaches, I found that’s not always true.
⸻
The default assumption: embeddings everywhere
The obvious solution for semantic memory is vector embeddings.
Convert notes to vectors, store them in a vector database, and retrieve relevant context via cosine similarity.
This works well — but it comes with tradeoffs:
Requires an API key
Introduces cost per operation
Adds network latency
Sends your data off-machine
For developer memory that lives next to your editor, those tradeoffs matter.
⸻
The alternative: TF-IDF + cosine similarity
TF-IDF (Term Frequency–Inverse Document Frequency) is an old algorithm. It’s not trendy. But it’s extremely effective for focused, specialized vocabularies.
The idea is simple:
Terms that appear often in one document but rarely across all documents are likely important.
By converting notes into TF-IDF vectors and comparing them using cosine similarity, you get surprisingly strong results — fully offline.
Basic formula:
TF(term, doc) = count(term in doc) / total terms in doc
IDF(term) = log(total documents / documents containing term)
Score(term, doc) = TF(term, doc) × IDF(term)
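These formulas translate almost line for line into JavaScript. A minimal sketch, assuming each note has already been tokenized into an array of terms:

```javascript
// TF(term, doc) = count(term in doc) / total terms in doc
function tf(term, docTokens) {
  return docTokens.filter((t) => t === term).length / docTokens.length;
}

// IDF(term) = log(total documents / documents containing term)
// (a real implementation should guard against a term appearing in zero documents)
function idf(term, allDocs) {
  const df = allDocs.filter((doc) => doc.includes(term)).length;
  return Math.log(allDocs.length / df);
}

// Score = TF × IDF
function tfidf(term, docTokens, allDocs) {
  return tf(term, docTokens) * idf(term, allDocs);
}
```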
⸻
Why TF-IDF works well for developer memory
Developer notes have a unique property: consistent terminology.
When you write about React hooks, you repeatedly use terms like useEffect, cleanup, dependency array.
When you write about Docker, you consistently use container, volume, compose.
TF-IDF performs best when:
The domain is narrow
Vocabulary is specialized
Search queries use similar wording to stored notes
That’s exactly how developer knowledge behaves.
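A quick way to see this with hypothetical notes: a specialized term like useEffect appears in few documents and earns a high IDF, while a filler word shared by every note gets an IDF of zero and carries no signal.

```javascript
// Count how many documents contain a term, then apply the basic IDF formula.
// Unsmoothed, so it assumes the term appears in at least one document.
function idfOf(term, docs) {
  const df = docs.filter((d) => d.includes(term)).length;
  return Math.log(docs.length / df);
}

const notes = [
  ["the", "useEffect", "cleanup", "runs", "on", "unmount"],
  ["the", "docker", "volume", "persists", "data"],
  ["the", "compose", "file", "defines", "services"],
];

console.log(idfOf("the", notes));       // 0: appears everywhere, no signal
console.log(idfOf("useEffect", notes)); // ~1.10: rare, highly discriminative
```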
⸻
What I observed in practice
On a corpus of a few hundred developer notes:
TF-IDF retrieved the same context as the embedding-based approach roughly 85% of the time.
The remaining ~15% was mostly:
Synonyms
Very abstract phrasing
Cross-language queries
For most day-to-day coding memory, that gap didn’t matter.
⸻
Where TF-IDF falls short
It’s important to be honest about limitations:
It doesn’t understand synonyms
It doesn’t work well across languages
It struggles with vague or emotional queries
Precision drops on large, highly diverse corpora
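The synonym limitation is easy to demonstrate with a toy example. TF-IDF ultimately relies on exact token overlap, so a query that shares no tokens with a relevant note scores zero, no matter how close the meaning (the sharedTerms helper here is illustrative, not part of the implementation):

```javascript
// Exact-token overlap: the matching signal TF-IDF depends on.
function sharedTerms(query, note) {
  const queryTokens = new Set(query.toLowerCase().split(/\s+/));
  return note.toLowerCase().split(/\s+/).filter((t) => queryTokens.has(t));
}

const note = "retrieve records from the database";
console.log(sharedTerms("fetch rows", note));        // [] despite similar meaning
console.log(sharedTerms("retrieve records", note));  // ["retrieve", "records"]
```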
If you need any of those, embeddings are the right choice.
⸻
Implementation notes
The implementation is deliberately minimal: pure JavaScript, no dependencies.
Tokenization, TF-IDF calculation, and cosine similarity can all be implemented in a few hundred lines of code.
Everything runs locally. Data stays on disk as JSON files.
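A sketch of what that core can look like. The tokenizer and the IDF smoothing here are my own assumptions, not the exact implementation; the shape of the pipeline (tokenize, vectorize, compare, rank) is the point:

```javascript
// Lowercase and split on non-alphanumeric runs; deliberately simple.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Build a sparse term -> TF-IDF weight map for one document.
function vectorize(tokens, allDocs) {
  const vec = {};
  for (const term of new Set(tokens)) {
    const tf = tokens.filter((t) => t === term).length / tokens.length;
    const df = allDocs.filter((d) => d.includes(term)).length;
    const idf = Math.log(allDocs.length / (1 + df)) + 1; // smoothed variant
    vec[term] = tf * idf;
  }
  return vec;
}

// Cosine similarity over sparse term-weight maps.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (const [term, w] of Object.entries(a)) {
    na += w * w;
    if (term in b) dot += w * b[term];
  }
  for (const w of Object.values(b)) nb += w * w;
  return na && nb ? dot / (Math.sqrt(na) * Math.sqrt(nb)) : 0;
}

// Rank stored notes against a query, highest similarity first.
function search(query, notes) {
  const allDocs = notes.map(tokenize);
  const vectors = allDocs.map((d) => vectorize(d, allDocs));
  const queryVec = vectorize(tokenize(query), allDocs);
  return notes
    .map((note, i) => ({ note, score: cosine(queryVec, vectors[i]) }))
    .sort((a, b) => b.score - a.score);
}
```

Persisting the note texts (or the precomputed vectors) as JSON files is all the storage layer this needs.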
⸻
When to choose each approach
Choose TF-IDF if:
Privacy matters
Zero setup is important
Your knowledge base is focused
You want zero ongoing costs
Choose embeddings if:
You need cross-language search
Your data spans many domains
Synonym matching is critical
You already rely on external APIs
⸻
Final thought
Newer isn’t always better.
For certain problems — especially local, developer-focused ones — simple, well-understood algorithms still punch far above their weight.
Sometimes the best solution is the one that removes friction instead of adding intelligence.
⸻