The previous article introduced engram-rs's three-layer memory architecture and design motivation. This one tackles a more specific question: how do you keep retrieval quality from degrading as memories accumulate?
The answer lives in the scoring algorithms. Here's a visual breakdown of five core mechanisms.
1. Use It or Lose It
Left panel: a memory that's never recalled after storage. Importance decays smoothly, sinking to the bottom layer.
Right panel: a memory that gets periodically recalled. Each retrieval triggers an activation boost (yellow dots), pushing importance back up. The red dashed line shows the unrecalled trajectory for comparison.
This isn't a feature — it's the system's first principle: a memory's survival is determined by how often it's used. Retrieval isn't just a read operation — it's also a vote telling the system this memory still matters.
The result? After hundreds of consolidation epochs, frequently-used knowledge stays prominent, stale noise naturally sinks, and retrieval quality doesn't degrade as total memory count grows.
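The dynamic is easy to simulate. Here's a minimal Rust sketch (not the engram-rs source — `DECAY`, `BOOST`, and the 0.8 starting importance are invented values for illustration; only the 0.01 floor comes from the system):

```rust
// Sketch of "use it or lose it": two memories start at the same
// importance; one is recalled every 10 epochs, one never is.
const DECAY: f64 = 0.98; // per-epoch multiplicative decay (illustrative)
const BOOST: f64 = 0.15; // activation boost on recall (illustrative)
const FLOOR: f64 = 0.01; // importance never reaches zero

fn simulate(epochs: u32, recall_every: Option<u32>) -> f64 {
    let mut importance: f64 = 0.8;
    for epoch in 1..=epochs {
        importance = (importance * DECAY).max(FLOOR);
        if recall_every.map_or(false, |n| epoch % n == 0) {
            // retrieval is also a vote: boost importance back up
            importance = (importance + BOOST).min(1.0);
        }
    }
    importance
}

fn main() {
    println!("recalled:   {:.3}", simulate(200, Some(10)));
    println!("unrecalled: {:.3}", simulate(200, None));
}
```

The recalled memory settles into a stable band near the top; the unrecalled one sinks toward the floor.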
2. Exponential Decay, Not Linear
The previous article used importance × e^(-decay_rate × idle_hours / 168) for retrieval-time recency weighting. But how does importance itself decay? That's what actually determines whether a memory lives or dies.
Three curves show the decay trajectories for each memory kind:
| Kind | Half-life | Why |
|---|---|---|
| episodic | ~35 epochs | "Yesterday's debug log" — should fade if unused |
| semantic | ~58 epochs | "Auth uses OAuth2" — knowledge decays slower |
| procedural | ~173 epochs | "Deploy steps" — procedures should almost never fade |
The floor is 0.01. Memories never truly reach zero — given a precise enough query, a sunken memory can still be retrieved. This mirrors a human memory property: you think you've forgotten, but the right cue pulls it back.
Why exponential instead of linear? Linear decay has a fatal flaw: the cliff. The moment importance linearly decrements to zero, the memory is permanently lost with no chance of recovery. Exponential decay never reaches zero — it just gets closer and closer, leaving an infinitely long tail.
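The per-kind curves follow directly from the half-lives in the table: the per-epoch rate is ln(2) / half_life. A sketch (function name is mine, not the crate's API):

```rust
// Exponential decay with a hard floor. Half-lives match the table above.
const FLOOR: f64 = 0.01;

fn decayed(initial: f64, half_life: f64, epochs: f64) -> f64 {
    let rate = std::f64::consts::LN_2 / half_life;
    (initial * (-rate * epochs).exp()).max(FLOOR)
}

fn main() {
    for (kind, half_life) in [("episodic", 35.0), ("semantic", 58.0), ("procedural", 173.0)] {
        // after 100 epochs, episodic has faded far more than procedural
        println!("{kind:>10} after 100 epochs: {:.3}", decayed(1.0, half_life, 100.0));
    }
}
```

Note the infinitely long tail in code form: no matter how many epochs pass, the `.max(FLOOR)` keeps a sunken memory retrievable.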
3. Logarithmic Saturation for Reinforcement
When a memory is stored repeatedly or recalled multiple times, its weight increases. But the growth curve is logarithmic, not linear.
rep_bonus = 0.17 × ln(1 + repetition_count), cap 0.7
access_bonus = 0.12 × ln(1 + access_count), cap 0.55
Why logarithmic?
Consider a counterexample: if rep_bonus were linear (say, 0.1 × count, cap 0.5), then a memory stored 5 times would max out its bonus. The 6th, 50th, and 500th submission — all identical in effect. You can't distinguish "mentioned a few times" from "repeatedly emphasized."
Logarithmic growth pushes the saturation point out to roughly 60 reps and 100 accesses (where the caps kick in, per the formulas above). The first few interactions matter most; returns then diminish while still contributing. This matches human learning research: spaced repetition works, but each additional review yields less marginal benefit.
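Both bonuses fit in a few lines. A sketch using the constants above (function names are mine, not the crate's API):

```rust
// Logarithmic reinforcement bonuses with hard caps.
fn rep_bonus(count: u64) -> f64 {
    (0.17 * (1.0 + count as f64).ln()).min(0.7)
}

fn access_bonus(count: u64) -> f64 {
    (0.12 * (1.0 + count as f64).ln()).min(0.55)
}

fn main() {
    // early interactions move the needle most; later ones still count
    for n in [1, 5, 30, 100, 500] {
        println!("count={n:>3}  rep_bonus={:.3}  access_bonus={:.3}",
                 rep_bonus(n), access_bonus(n));
    }
}
```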
4. Additive Biases Instead of Multiplicative
A memory's final weight is also influenced by its kind and layer. The chart shows the weight effect for all nine combinations (3 kinds × 3 layers):
- procedural + core ranks highest (+0.15 + 0.1 = +0.25)
- episodic + buffer ranks lowest (-0.1 - 0.1 = -0.2)
- semantic + working is the baseline (0)
Why emphasize "additive"?
An earlier version used multiplication: procedural memories ×1.3, core layer ×1.2. Sounds reasonable, but 1.3 × 1.2 = 1.56, while episodic × buffer = 0.8 × 0.8 = 0.64. The gap between the highest and lowest is 2.4× — procedural + core would systematically crush everything else, regardless of how relevant the content actually is.
Additive biases compress this ratio to under 1.6×. Kind and layer still influence ranking, but not enough to override the semantic relevance signal itself.
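The nine combinations reduce to two small lookup functions. A sketch with the bias values from the bullets above (the enum and function names are mine):

```rust
// Additive kind/layer biases; semantic + working is the 0.0 baseline.
#[derive(Clone, Copy)]
enum Kind { Episodic, Semantic, Procedural }

#[derive(Clone, Copy)]
enum Layer { Buffer, Working, Core }

fn kind_bias(k: Kind) -> f64 {
    match k { Kind::Procedural => 0.15, Kind::Semantic => 0.0, Kind::Episodic => -0.1 }
}

fn layer_bias(l: Layer) -> f64 {
    match l { Layer::Core => 0.1, Layer::Working => 0.0, Layer::Buffer => -0.1 }
}

fn main() {
    println!("procedural+core:  {:+.2}", kind_bias(Kind::Procedural) + layer_bias(Layer::Core));
    println!("semantic+working: {:+.2}", kind_bias(Kind::Semantic) + layer_bias(Layer::Working));
    println!("episodic+buffer:  {:+.2}", kind_bias(Kind::Episodic) + layer_bias(Layer::Buffer));
}
```

Because the biases are added rather than multiplied, the worst-case spread stays a bounded offset (+0.25 to -0.2) instead of a compounding ratio.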
5. Sigmoid Score Compression
The final ranking score combines semantic relevance, memory weight, and time decay. This raw score is mapped through a sigmoid to the 0–1 range:
score = 2 / (1 + e^(-2x)) - 1
Why not just clamp at 1.0?
Because clamping destroys information. Say two memories have raw scores of 1.3 and 2.1: after clamping, both become 1.0, and the system treats them as equally good. The sigmoid approaches 1.0 asymptotically but never reaches it, preserving discrimination in the high-score region.
The shaded area in the chart represents the ranking information that sigmoid preserves — the differences that a hard clamp would flatten.
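The compression function is tiny; algebraically, 2 / (1 + e^(-2x)) - 1 is exactly tanh(x). A sketch (function name is mine):

```rust
// Sigmoid score compression: monotonic, bounded above by 1.0,
// never actually reaching it. Equivalent to tanh(x).
fn squash(raw: f64) -> f64 {
    2.0 / (1.0 + (-2.0 * raw).exp()) - 1.0
}

fn main() {
    // a hard clamp would map both of these raw scores to 1.0;
    // the sigmoid keeps them distinguishable
    println!("raw 1.3 -> {:.4}", squash(1.3));
    println!("raw 2.1 -> {:.4}", squash(2.1));
}
```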
The Full Scoring Formula
Putting all five mechanisms together, a memory's final retrieval score is:
weight = importance + rep_bonus + access_bonus + kind_bias + layer_bias
raw = relevance × (1 + 0.4 × weight + 0.2 × recency)
score = sigmoid(raw)
Where relevance comes from a hybrid of semantic embeddings and BM25 keyword search, recency is time-based exponential decay, and importance is the value after per-epoch exponential decay (counteracted by activation boosts on recall).
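Assembled in one place, the pipeline looks roughly like this (a sketch, not the engram-rs source; all inputs are assumed precomputed upstream, and the function names are mine):

```rust
// The full retrieval score, combining all five mechanisms.
fn rep_bonus(reps: u64) -> f64 { (0.17 * (1.0 + reps as f64).ln()).min(0.7) }
fn access_bonus(acc: u64) -> f64 { (0.12 * (1.0 + acc as f64).ln()).min(0.55) }

fn score(relevance: f64, importance: f64, reps: u64, accesses: u64,
         kind_bias: f64, layer_bias: f64, recency: f64) -> f64 {
    // weight = importance + rep_bonus + access_bonus + kind_bias + layer_bias
    let weight = importance + rep_bonus(reps) + access_bonus(accesses)
               + kind_bias + layer_bias;
    // raw = relevance × (1 + 0.4 × weight + 0.2 × recency)
    let raw = relevance * (1.0 + 0.4 * weight + 0.2 * recency);
    // score = sigmoid(raw)
    2.0 / (1.0 + (-2.0 * raw).exp()) - 1.0
}

fn main() {
    // a well-used procedural/core memory vs a stale episodic/buffer one,
    // at equal semantic relevance (example inputs are invented)
    let strong = score(0.7, 0.8, 30, 100, 0.15, 0.1, 0.9);
    let weak = score(0.7, 0.05, 1, 1, -0.1, -0.1, 0.1);
    println!("strong: {strong:.3}, weak: {weak:.3}");
}
```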
No magic numbers — every coefficient maps to an explainable cognitive mechanism.
Specs
| Spec | Detail |
|---|---|
| Language | Rust, single binary, zero external dependencies |
| Memory | ~100 MB RSS in production |
| Storage | SQLite, one .db file |
| Search | Semantic embeddings + BM25 (with CJK tokenization) |
| Platforms | Linux, macOS, Windows |
GitHub: github.com/kael-bit/engram-rs