## The Problem
AI coding assistants forget everything between sessions. Every conversation starts from zero. You explain your architecture, your patterns, your preferences — and next time, it's gone.
I built Cuba-Memorys to fix this.
## What is it?
An MCP server that gives AI agents persistent long-term memory using a knowledge graph backed by PostgreSQL. 12 tools with Cuban-themed names.
## Key Features
### 🧠 Knowledge Graph
Entities, observations, and typed relations that persist across sessions. Your AI remembers projects, decisions, patterns, and connections between them.
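Conceptually, the graph model boils down to three small records. The sketch below is illustrative (field and class names are my own, not the actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    # A named node in the knowledge graph, e.g. a project or a decision.
    name: str
    entity_type: str
    observations: list[str] = field(default_factory=list)  # free-text facts

@dataclass
class Relation:
    # A typed, directed edge between two entities.
    source: str
    target: str
    relation_type: str

# Illustrative usage: a project node with one observation and one edge.
api = Entity("billing-api", "project")
api.observations.append("uses cursor-based pagination")
edge = Relation("billing-api", "postgresql", "depends_on")
```

Observations attach to entities rather than living as free-floating notes, which is what lets relations between entities carry the connective context across sessions.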
### ⚡ Hebbian Learning
"Neurons that fire together wire together" — memories strengthen with use (Oja's rule) and fade adaptively using FSRS spaced repetition (the algorithm behind Anki).
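As a rough scalar sketch of the two mechanisms (simplified constants, not the project's actual update rule): an Oja-style bounded strengthening on each recall, plus an FSRS-style power-law forgetting curve. With unit input and output, Oja's rule reduces to growth that saturates at 1.0 instead of running away:

```python
def strengthen(w, lr=0.1):
    # Oja-style bounded update: the weight converges toward 1.0
    # rather than growing without limit (classic Hebb's problem).
    return w + lr * (1.0 - w)

def retrievability(days, stability):
    # FSRS-style power forgetting curve: R = (1 + t / (9 S))^-1.
    # By construction R = 0.9 exactly when t == stability.
    return (1.0 + days / (9.0 * stability)) ** -1

w = 0.5
for _ in range(3):   # three recalls strengthen the memory
    w = strengthen(w)
print(round(w, 4))

print(retrievability(10, 10))   # → 0.9
```

The power curve is what distinguishes FSRS from naive exponential decay: rarely used memories fade, but they fade slowly in the long tail rather than vanishing.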
### 🔍 4-Signal RRF Fusion Search
Not just keyword matching. Cuba-Memorys fuses 4 search signals:
- TF-IDF similarity
- PostgreSQL full-text search
- Trigram matching
- Optional pgvector HNSW embeddings
Results are ranked using Reciprocal Rank Fusion for maximum recall.
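Reciprocal Rank Fusion itself is a small algorithm: each signal contributes `1 / (k + rank)` per document, so a document that ranks moderately well across several signals beats one that tops a single signal. A minimal sketch (plain RRF; the per-signal weights mentioned in the comments below are not modeled here):

```python
def rrf_fuse(rankings, k=60):
    # Reciprocal Rank Fusion (Cormack et al., 2009):
    # each ranked list contributes 1 / (k + rank) for every doc it returns.
    scores = {}
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

tfidf    = ["a", "b", "c"]
fulltext = ["b", "a", "d"]
trigram  = ["b", "c", "a"]
print(rrf_fuse([tfidf, fulltext, trigram]))   # → ['b', 'a', 'c', 'd']
```

Note that `b` wins despite never being the sole strong match: consistent mid-to-high placement across signals outweighs one first-place finish, which is exactly the recall behavior RRF is chosen for.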
### 🕸️ GraphRAG Enrichment
Top search results are automatically enriched with degree-1 graph neighbors, giving your AI topological context — not just text matches.
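Degree-1 enrichment is cheap to illustrate: for each search hit, attach every entity one edge away. A toy sketch (adjacency structure and names are illustrative, not the actual implementation):

```python
def enrich_with_neighbors(hits, edges):
    # Degree-1 GraphRAG enrichment: attach every entity directly
    # connected to a search hit, so results carry graph context.
    neighbors = {}
    for src, rel, dst in edges:
        neighbors.setdefault(src, []).append((rel, dst))
        neighbors.setdefault(dst, []).append((rel, src))
    return {hit: neighbors.get(hit, []) for hit in hits}

edges = [("billing-api", "depends_on", "postgresql"),
         ("billing-api", "owned_by", "platform-team")]
context = enrich_with_neighbors(["billing-api"], edges)
```

A text match on "billing-api" thus also surfaces `postgresql` and `platform-team`, which a pure keyword or embedding search would miss.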
### 🛡️ Anti-Hallucination Grounding
Verify claims against stored knowledge with graduated confidence scoring:
verified → partial → weak → unknown. Your AI can fact-check itself before answering.

### 😴 REM Sleep Consolidation

After 15 minutes of inactivity, the system automatically runs maintenance: adaptive decay, deduplication, and memory consolidation. Like biological sleep for your AI's memory.

### 📊 Graph Analytics

- PageRank — Find the most important entities
- Louvain communities — Discover knowledge clusters
- Betweenness centrality — Identify bridge concepts
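PageRank, the first of these, can be sketched with plain power iteration over the entity graph (a generic textbook implementation, not Cuba-Memorys' actual analytics code):

```python
def pagerank(edges, damping=0.85, iters=50):
    # Plain power-iteration PageRank over a directed edge list.
    nodes = {n for e in edges for n in e}
    out = {n: [d for s, d in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n] or list(nodes)   # dangling nodes spread evenly
            share = damping * rank[n] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank

# Three entities all point at one hub concept; the hub ranks highest.
edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "a")]
ranks = pagerank(edges)
assert max(ranks, key=ranks.get) == "hub"
```

In a knowledge graph, a high-PageRank entity is one that many memories reference, directly or transitively: a good candidate to surface first when context is limited.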
## The 12 Tools
| Tool | Purpose |
|------|---------|
| `cuba_alma` | CRUD knowledge graph entities |
| `cuba_cronica` | Attach observations to entities |
| `cuba_puente` | Create/traverse relations |
| `cuba_faro` | Search with anti-hallucination |
| `cuba_alarma` | Report and track errors |
| `cuba_expediente` | Search past errors (anti-repetition) |
| `cuba_remedio` | Mark errors as resolved |
| `cuba_eco` | RLHF feedback loop |
| `cuba_decreto` | Record architecture decisions |
| `cuba_jornada` | Session management |
| `cuba_vigia` | Graph analytics |
| `cuba_zafra` | Memory maintenance |

## Quick Start
```json
{
  "mcpServers": {
    "cuba-memorys": {
      "command": "python",
      "args": ["-m", "cuba_memorys"],
      "env": {
        "DATABASE_URL": "postgresql://user:pass@localhost:5432/brain"
      }
    }
  }
}
```
The database auto-provisions via Docker if needed. Zero manual setup.
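If you'd rather provision PostgreSQL yourself, something along these lines gives the server a database matching the `DATABASE_URL` above (image tag and credentials are illustrative, not requirements):

```shell
# Plain PostgreSQL matching the DATABASE_URL in the Quick Start config.
# If you enable the optional embedding signal, use an image that ships
# the pgvector extension instead, e.g. pgvector/pgvector:pg16.
docker run -d --name cuba-brain \
  -e POSTGRES_USER=user \
  -e POSTGRES_PASSWORD=pass \
  -e POSTGRES_DB=brain \
  -p 5432:5432 \
  postgres:16
```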
## Why "Cuba"?
The tools are named after Cuban Spanish words — alma (soul), crónica (chronicle), faro (lighthouse), remedio (remedy). It started as a personal project and the naming stuck.
## Links
- GitHub: github.com/lENADRO1910/cuba-memorys
- PyPI: `pip install cuba-memorys`
- License: CC BY-NC 4.0
If you find it useful, a ⭐ on GitHub would mean a lot. Open to feedback and contributions!
## Top comments (2)
The 4-signal RRF fusion search is interesting — how does it perform when the knowledge graph gets past ~10K entities? Curious if the trigram matching or pgvector path dominates at that scale.
Great question — the scaling behavior depends on which signals are active.
At the PostgreSQL level, the 4-signal fusion uses two index-backed paths per query:
- **GIN tsvector** — `ts_rank_cd` with `plainto_tsquery('simple', ...)` for full-text search. GIN indexes scale sub-linearly — PostgreSQL's GIN uses compressed posting lists, so even at 100K+ observations the index scan stays in the low milliseconds. This signal carries 35% weight.
- **GIN pg_trgm** — `similarity(name, $1) > 0.3` as the filter predicate. Trigram GIN indexes are O(n) in the trigram set size, not the row count. At ~10K entities with typical name lengths (5-50 chars), the trigram signature is ~15 trigrams per entry, so the index stays compact. This signal carries 30% weight.
- **Importance + freshness** — pure scalar columns with no index scan needed (computed inline). These contribute 25% and 10% respectively.
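To see why the per-entry trigram signature stays small, here is a rough approximation of how pg_trgm extracts trigrams (lowercase each word, pad with two leading spaces and one trailing space, take every 3-character window) — an illustration, not pg_trgm's exact normalization:

```python
def trigrams(text):
    # Rough approximation of pg_trgm's show_trgm(): lowercase,
    # pad each word with two leading and one trailing space,
    # then collect every 3-character window.
    grams = set()
    for word in text.lower().split():
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams

print(len(trigrams("faro")))   # → 5
```

A word of length L yields about L + 2 trigrams, so entity names of 5-50 characters keep the GIN posting lists short regardless of how many rows the table holds.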
The pgvector path (`embedding <=> $2::vector`) is only activated in verify mode via a separate query (`SEARCH_VECTOR_SQL`). It uses an exact cosine-distance scan — `<=>` is pgvector's cosine-distance operator — with no HNSW/IVFFlat index currently, so this would be the bottleneck at >50K embeddings. For the standard hybrid search mode, pgvector is not in the hot path at all.
So to directly answer your question: at ~10K entities, trigram dominates latency (~2-5 ms for the GIN scan), since tsvector is faster while trigram has to compute per-candidate trigram-overlap similarities. The RRF fusion itself (Cormack 2009, k=60) is pure Python dict merging — O(n) where n is the result count (capped at `LIMIT`, typically 10-50) — so it's negligible.
At >50K scale with embeddings enabled, I'd add an HNSW index (`CREATE INDEX ON brain_observations USING hnsw (embedding vector_cosine_ops)`) to keep the vector path under 10ms. That's on the roadmap but hasn't been necessary yet — the current deployment handles ~2K entities with sub-20ms total query time.
Appreciate the interest — scaling the vector path is definitely the next optimization target. 🤘