How do you make an AI that actually remembers?
Not just RAG over chunks. Not vector search. Real memory — the kind humans have.
That question grabbed me and wouldn't let go. So I read about the hippocampus, Ebbinghaus forgetting curves, complementary learning systems, slow-wave sleep replay. And then I built it.
## The Architecture
┌──────────────────────────────────────────────────────────────┐
│ 5-TIER MEMORY ARCHITECTURE │
├──────────────────────────────────────────────────────────────┤
│ │
│ TIER 1+2 EPISODIC BUFFER Brain: Hippocampus │
│ ═══════════════════════════════ Speed: <1ms │
│ 64 working + 256 episodic items │
│ Ebbinghaus decay: n^0.3 · e^(-λt) · importance │
│ Forget threshold: 0.05 | Promote threshold: 0.65 │
│ Access-based reinforcement on every read │
│ ↓ │
│ TIER 3 SEMANTIC STORE Brain: Neocortex │
│ ═══════════════════════════════ Speed: ~50ms │
│ ChromaDB v2 · all-mpnet-base-v2 (768-dim) │
│ Hybrid search: dense + BM25 → Reciprocal Rank Fusion │
│ ↓ │
│ TIER 4 KNOWLEDGE GRAPH Brain: Association Cortex │
│ ═══════════════════════════════ Speed: ~100ms │
│ spaCy NER + 30 keyword patterns │
│ NetworkX + SQLite · Multi-hop reasoning │
│ Auto-relation inference: uses/works_on/depends_on │
│ ↓ │
│ TIER 5 COLD ARCHIVE Brain: Distributed Cortex │
│ ═══════════════════════════════ Speed: async │
│ Filesystem JSON · YYYY/MM organization │
│ Full-text search · Thaw to active · Compact summaries │
│ │
├──────────────────────────────────────────────────────────────┤
│ CONSOLIDATION PIPELINE (Sleep Analog) │
│ ═══════════════════════════════════ │
│ Decay → Cluster → Merge(LLM) → Rescore → Promote → │
│ FindRelations(LLM) → Archive → Neurogenesis │
│ Quick: 60ms (every 5min) | Full: ~3s (on idle) │
├──────────────────────────────────────────────────────────────┤
│ STANDBY NEURON AGENTS │
│ ═══════════════════════════════════ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Personal │ │ Tech │ │ Projects │ ...N agents │
│ │ Agent │ │ Agent │ │ Agent │ │
│ │ │ │ │ │ │ │
│ │ DEEP 💤 │ │ LIGHT 🟡 │ │ DEEP 💤 │ │
│ │ 0 RAM │ │ ~3KB RAM │ │ 0 RAM │ │
│ │ 0 tokens │ │ ready │ │ 0 tokens │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ Wake: trigger patterns + centroid similarity │
│ Vote: all agents score → top K form consensus panel │
│ Sleep: return to idle after task (zero token consumption) │
│ Spawn: Neurogenesis creates agents from memory clusters │
│ Prune: inactive agents auto-removed after 30 days │
└──────────────────────────────────────────────────────────────┘
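The decay line in the diagram reads as `score = n^0.3 · e^(-λt) · importance`, with a forget threshold of 0.05 and a promote threshold of 0.65. Here is a minimal sketch of that scoring, not the project's actual code. It assumes `n` is the access count (starting at 1), `t` is hours since the last access, and λ is 0.1 per hour. None of those three are stated in the post, and the function names are hypothetical.

```python
import math

FORGET_THRESHOLD = 0.05   # from the diagram
PROMOTE_THRESHOLD = 0.65  # from the diagram
DECAY_RATE = 0.1          # assumed lambda, per hour; the post gives no value


def retention(access_count: int, hours_since_access: float,
              importance: float, decay_rate: float = DECAY_RATE) -> float:
    """Score = n^0.3 * e^(-lambda * t) * importance.

    access_count is assumed to start at 1 (the initial write), so a
    freshly stored memory scores exactly its importance.
    """
    reinforcement = access_count ** 0.3
    forgetting = math.exp(-decay_rate * hours_since_access)
    return reinforcement * forgetting * importance


def triage(score: float) -> str:
    """Map a retention score to an action using the two thresholds."""
    if score < FORGET_THRESHOLD:
        return "forget"
    if score >= PROMOTE_THRESHOLD:
        return "promote"
    return "keep"


# A memory read four times just now outranks one read once and left alone.
fresh = retention(access_count=4, hours_since_access=0, importance=0.5)
stale = retention(access_count=1, hours_since_access=100, importance=0.5)
print(triage(fresh), triage(stale))
```

The sub-linear exponent (0.3) means each extra read helps less than the one before, while the exponential term dominates once a memory goes untouched for long enough.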
## The Two Novel Things
### 1. Standby Neuron Agents
Here's what hit me: biological neurons don't all fire at once. Only ~2% are active at any moment. The rest wait, silent, consuming almost nothing.
So I built agents that work the same way:
- DEEP_SLEEP — JSON file on disk. 0 RAM. 0 tokens. Just a prompt template and trigger patterns. Most agents live here.
- LIGHT_SLEEP — Centroid vector loaded (~3KB RAM). Agent checks: "Does this query match my domain?"
- ACTIVE — Woken by the Router when relevance score exceeds threshold. Does the job. Returns to sleep immediately after.
They wake on trigger-pattern matching plus embedding similarity, vote in consensus panels (all agents score the query, and the top K are activated), communicate via a sparse blackboard, and return to sleep.
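The wake-and-vote cycle described above can be sketched roughly as follows. This is an illustration under assumptions, not the repository's implementation. The `StandbyAgent` class, the way trigger hits and cosine similarity are blended (a fixed 0.5 bonus), the threshold of 0.6, and the toy 2-dimensional vectors are all hypothetical.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class StandbyAgent:
    name: str
    triggers: list[str]      # regex trigger patterns
    centroid: list[float]    # domain centroid embedding
    state: str = "LIGHT_SLEEP"


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def relevance(agent: StandbyAgent, query: str, query_vec: list[float]) -> float:
    # Assumed blend: a trigger hit adds a fixed bonus to cosine similarity.
    hit = any(re.search(p, query, re.IGNORECASE) for p in agent.triggers)
    return cosine(agent.centroid, query_vec) + (0.5 if hit else 0.0)


def wake_panel(agents, query, query_vec, k=2, threshold=0.6):
    """Every agent scores the query; the top k above threshold wake up."""
    scored = sorted(((relevance(a, query, query_vec), a) for a in agents),
                    key=lambda pair: pair[0], reverse=True)
    panel = [agent for score, agent in scored[:k] if score > threshold]
    for agent in panel:
        agent.state = "ACTIVE"
    return panel


def sleep_panel(panel):
    """Return woken agents to idle once the task is done."""
    for agent in panel:
        agent.state = "LIGHT_SLEEP"


agents = [
    StandbyAgent("Personal", [r"birthday"], [0.0, 1.0]),
    StandbyAgent("Tech", [r"python"], [1.0, 0.0]),
    StandbyAgent("Projects", [r"deadline"], [0.7, 0.7]),
]
panel = wake_panel(agents, "python bug in the parser", [1.0, 0.0])
print([a.name for a in panel])
```

Scoring is cheap (a regex pass and one dot product per agent), so every agent can vote without any of them consuming LLM tokens until it is actually woken.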
### 2. Neurogenesis
When the system notices a cluster of memories forming around a new topic — say, 6+ memories about Minecraft — it automatically spawns a new specialized agent for that domain.
If an agent hasn't been woken in 30 days? It gets pruned.
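The spawn and prune rules reduce to two small checks. The sketch below assumes memories have already been assigned a topic label by some clustering step (the post does not say how clusters are formed), and both function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

SPAWN_MIN_MEMORIES = 6             # "6+ memories" from the post
PRUNE_AFTER = timedelta(days=30)   # from the post


def topics_to_spawn(memory_topics, existing_agents):
    """Topics with enough memories and no agent covering them yet."""
    counts = Counter(memory_topics)
    return sorted(topic for topic, n in counts.items()
                  if n >= SPAWN_MIN_MEMORIES and topic not in existing_agents)


def agents_to_prune(last_woken, now):
    """Agents whose last wake-up is more than 30 days old."""
    return sorted(name for name, ts in last_woken.items()
                  if now - ts > PRUNE_AFTER)


topics = ["minecraft"] * 6 + ["cooking"] * 2 + ["tech"] * 7
print(topics_to_spawn(topics, existing_agents={"tech"}))

now = datetime(2025, 3, 1)
print(agents_to_prune({"tech": datetime(2025, 2, 20),
                       "old_hobby": datetime(2025, 1, 1)}, now))
```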
As far as I can tell, no other open-source memory system does either of these.
## Sleep as a Feature
The consolidation pipeline runs like biological sleep — reorganizing, merging duplicates, strengthening important memories, letting the rest decay:
- Quick mode: 60ms, zero LLM calls, runs every 5 minutes
- Full mode: ~3 seconds, LLM-powered merge + relation discovery, triggers on idle detection (15+ minutes of inactivity)
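One way to picture the two modes is a scheduler that picks a mode from the timings above and then filters the pipeline stages. This is a sketch under assumptions. The post only says that quick mode makes zero LLM calls, so treating quick mode as "every stage except the two LLM-tagged ones" is a guess, as are the function names and the once-per-idle-period rule for full mode.

```python
QUICK_INTERVAL_S = 5 * 60    # quick pass every 5 minutes
IDLE_TRIGGER_S = 15 * 60     # full pass after 15+ minutes of inactivity

# (stage name, needs an LLM call), in the order shown in the diagram
PIPELINE = [
    ("decay", False), ("cluster", False), ("merge", True),
    ("rescore", False), ("promote", False), ("find_relations", True),
    ("archive", False), ("neurogenesis", False),
]


def choose_mode(seconds_since_activity, seconds_since_quick, full_done_this_idle):
    """Return "full", "quick", or None if nothing is due yet."""
    if seconds_since_activity >= IDLE_TRIGGER_S and not full_done_this_idle:
        return "full"
    if seconds_since_quick >= QUICK_INTERVAL_S:
        return "quick"
    return None


def stages_for(mode):
    """Quick mode is assumed to skip every stage that needs an LLM."""
    return [name for name, needs_llm in PIPELINE
            if mode == "full" or not needs_llm]


print(choose_mode(20 * 60, 30, full_done_this_idle=False))
print(stages_for("quick"))
```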
## Does It Work?
Episodic Buffer 41/41 ✅
Memory Integration 32/32 ✅
Semantic Store 26/26 ✅
Knowledge Graph 37/37 ✅
Consolidation 31/31 ✅
Standby Agents 42/42 ✅
Cold Archive 27/27 ✅
Cross-Tier (E2E) 88/88 ✅
────────────────────────────
TOTAL 324/324 ✅
## Why I'm Sharing This
I'm 18, from Slovakia. This started as a random vibecoding project, a voice assistant. But the memory problem took over.
The long-term thing that drives me: I believe better memory for AI could eventually help with conditions like Alzheimer's. Computational memory prosthesis. That's the direction I want to explore.
GitHub: github.com/FogyXT/JARVIS
License: AGPL-3.0
Thanks for reading.