Code on GitHub. Paper on arXiv.
More posts at radarlog.kr.
Part 1 covered MemoryBank's architecture and how to use it. This part cracks open the engine. One formula. R = e^(−t/S). How it works, what happens when S changes, where to set the threshold — all in numbers.
Breaking Down the Formula
R = e^(−t/S)
Three variables. R is memory retention (between 0 and 1), t is time elapsed since learning, S is memory strength. e is Euler's number, approximately 2.71828.
This is an exponential decay model. Same structure as the formula for radioactive decay in physics. Values drop dramatically at first, then taper off gradually.
If you're a game developer, this pattern is everywhere. Alpha decay on particle effects, sound fade-outs, damage-over-time tick reduction. All exponential decay. The forgetting curve is the same math. Just applied to memories instead of damage.
Here's the key insight. What really matters in this formula is the t/S ratio. Not the individual values of t and S, but how they relate. When t/S equals 1, R is about 0.368 (36.8%). At t/S = 2, R drops to about 0.135 (13.5%). At t/S = 3, it's roughly 0.050 (5.0%). Each unit increase in the ratio multiplies retention by 1/e — every step keeps only about 36.8% of what was left.
```python
import math

# Retention by t/S ratio
for ratio in [0, 0.5, 1, 2, 3, 5]:
    R = math.exp(-ratio)
    print(f"t/S = {ratio:.1f} → R = {R:.4f} ({R*100:.1f}%)")

# t/S = 0.0 → R = 1.0000 (100.0%)
# t/S = 0.5 → R = 0.6065 (60.7%)
# t/S = 1.0 → R = 0.3679 (36.8%)
# t/S = 2.0 → R = 0.1353 (13.5%)
# t/S = 3.0 → R = 0.0498 (5.0%)
# t/S = 5.0 → R = 0.0067 (0.7%)
```
The implication is clear. A larger S means the t/S ratio stays small for longer, so R stays high. S flattens the curve.
What Happens When S Increases
In MemoryBank, S is an integer. First mention: 1. One recall: 2. Another recall: 3. Simple. Let's see exactly what this simple change does to the curve.
Using days as the time unit for t. What's the retention of an S=1 memory after one day?
```python
# Retention by S value and elapsed days
for S in [1, 2, 3, 5]:
    print(f"\n--- S = {S} ---")
    for t_days in [0, 0.5, 1, 2, 3, 5, 7]:
        R = math.exp(-t_days / S)
        print(f"  t={t_days}d → R = {R:.4f} ({R*100:.1f}%)")
```
Here's the picture:
S=1: Day 0 100% → Day 1 36.8% → Day 2 13.5% → Day 3 5.0% → Day 7 0.1%
S=2: Day 0 100% → Day 1 60.7% → Day 2 36.8% → Day 3 22.3% → Day 7 3.0%
S=3: Day 0 100% → Day 1 71.7% → Day 2 51.3% → Day 3 36.8% → Day 7 9.7%
S=5: Day 0 100% → Day 1 81.9% → Day 2 67.0% → Day 3 54.9% → Day 7 24.7%
An S=1 memory crashes to 36.8% after one day. After a week, 0.1%. Effectively dead.
At S=2, 60.7% survives after the same day. At S=5, 81.9% after a day. Even after a full week, 24.7% remains.
Since S goes up by 1 with each recall, a memory recalled 3 times (S=4) retains 17.4% after a week. Recalled 5 times (S=6) retains 31.1% after a week. The curve visibly flattens.
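That flattening is easy to verify with a quick sweep: hold the gap fixed at 7 days and let S grow one recall at a time.

```python
import math

# Retention after the same 7-day gap, as S grows with each recall
for S in range(1, 7):
    R = math.exp(-7 / S)
    print(f"S={S} → 7-day retention = {R*100:.1f}%")

# S=1 → 7-day retention = 0.1%
# S=2 → 7-day retention = 3.0%
# S=3 → 7-day retention = 9.7%
# S=4 → 7-day retention = 17.4%
# S=5 → 7-day retention = 24.7%
# S=6 → 7-day retention = 31.1%
```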
In game terms, think "buff duration increase from stacking." Apply the same buff multiple times, duration gets longer each time. More stacks, longer effect. MemoryBank's S works exactly like that.
The Mathematical Impact of Resetting t
The S increase matters, but resetting t to 0 actually creates the more dramatic effect.
Picture this scenario. A memory sits at S=1, and 3 days have passed. R equals e^(−3/1) = 0.050, or 5.0%. Nearly gone.
At this moment, the memory gets recalled in conversation. MemoryBank bumps S to 2 and resets t to 0. Instantly, R becomes e^(−0/2) = 1.000 — back to 100%.
Before recall: S=1, t=3 days → R = 5.0% (nearly dead)
After recall: S=2, t=0 days → R = 100% (fully revived)
5% to 100%. That jump is everything.
One more thing. The revived memory is stronger than before. S went from 1 to 2, so the next time 3 days pass, R won't be 5.0% — it'll be 22.3%. In its first life, 3 days was almost fatal. In its second life, 3 days is perfectly survivable.
Repeat this pattern:
```python
# Recall simulation
# Memory recalled every 3 days
memory = {'S': 1, 't': 0}
for cycle in range(5):
    # 3 days pass
    memory['t'] = 3
    R_before = math.exp(-memory['t'] / memory['S'])
    # Recall occurs: strengthen and reset
    memory['S'] += 1
    memory['t'] = 0
    print(f"Cycle {cycle+1}: Pre-recall R={R_before:.4f} ({R_before*100:.1f}%)"
          f" → Post-recall S={memory['S']}, R=100%")

# Cycle 1: Pre-recall R=0.0498 (5.0%) → Post-recall S=2, R=100%
# Cycle 2: Pre-recall R=0.2231 (22.3%) → Post-recall S=3, R=100%
# Cycle 3: Pre-recall R=0.3679 (36.8%) → Post-recall S=4, R=100%
# Cycle 4: Pre-recall R=0.4724 (47.2%) → Post-recall S=5, R=100%
# Cycle 5: Pre-recall R=0.5488 (54.9%) → Post-recall S=6, R=100%
```
The "pre-recall retention" for a memory recalled every 3 days climbs from 5.0% → 22.3% → 36.8% → 47.2% → 54.9%. Same 3-day gap, but accumulated recalls make the memory increasingly resilient.
This is the spacing effect that Ebbinghaus discovered, naturally embedded in the formula. Repeated review flattens the forgetting curve — that principle, encoded in math.
Where to Set the Threshold
The MemoryBank paper doesn't specify an exact threshold number. But to implement this, you need to decide: "At what R do we delete a memory?"
Reframe the question first. "How many days without recall should it take for a memory to be forgotten?"
The baseline is an S=1 memory (never recalled). How long it survives depends on the threshold.
```python
# Time to forget for S=1, by threshold
# R = e^(-t/S) → t = -S * ln(R)
for threshold in [0.5, 0.3, 0.1, 0.05, 0.01]:
    t_forget = -1 * math.log(threshold)
    print(f"Threshold {threshold} → S=1 memory dies after {t_forget:.2f} days")

# Threshold 0.5 → S=1 memory dies after 0.69 days (~17 hours)
# Threshold 0.3 → S=1 memory dies after 1.20 days (~29 hours)
# Threshold 0.1 → S=1 memory dies after 2.30 days
# Threshold 0.05 → S=1 memory dies after 3.00 days
# Threshold 0.01 → S=1 memory dies after 4.61 days
```
Threshold 0.5 means memories vanish in 17 hours. Too aggressive. Yesterday's conversation is already gone today.
Threshold 0.01 means 4.6 days of survival. Too loose. Meaningless chatter lingers for almost 5 days, wasting memory space.
The practical sweet spot is 0.05 to 0.1. S=1 memories naturally disappear within 2–3 days, while an S=3 memory (recalled twice) lasts roughly a week — 6.9 days at threshold 0.1, 9.0 days at 0.05.
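The same inversion, t = −S·ln(threshold), generalizes to any S/threshold pair — a quick way to sanity-check a candidate threshold against how long you expect memories to live.

```python
import math

# Days until a memory falls below the deletion threshold: t = -S * ln(threshold)
for threshold in [0.1, 0.05]:
    for S in [1, 2, 3, 5]:
        lifetime = -S * math.log(threshold)
        print(f"threshold={threshold}, S={S} → forgotten after {lifetime:.1f} days")

# threshold=0.1, S=1 → forgotten after 2.3 days
# threshold=0.1, S=2 → forgotten after 4.6 days
# threshold=0.1, S=3 → forgotten after 6.9 days
# threshold=0.1, S=5 → forgotten after 11.5 days
# threshold=0.05, S=1 → forgotten after 3.0 days
# threshold=0.05, S=2 → forgotten after 6.0 days
# threshold=0.05, S=3 → forgotten after 9.0 days
# threshold=0.05, S=5 → forgotten after 15.0 days
```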
You can also work backwards. If "important memories must survive at least 7 days" is a requirement, you can reverse-engineer S and the threshold.
```python
# "What minimum S is needed to survive 7 days?"
target_days = 7
threshold = 0.1
# R = e^(-t/S) ≥ threshold  →  S ≥ -t / ln(threshold)
S_min = -target_days / math.log(threshold)
print(f"To have R ≥ 0.1 after 7 days, need S ≥ {S_min:.2f}")

# To have R ≥ 0.1 after 7 days, need S ≥ 3.04
```
S needs to be at least 4 (recalled 3 times) to stay above the 0.1 threshold after 7 days. Tune these parameters to fit your service's characteristics.
Thinking in Half-Lives
The most intuitive metric for exponential decay is the half-life — how long until retention drops to 50%.
```python
# Half-life: e^(-t/S) = 0.5 → t_half = S * ln(2) ≈ S * 0.693
for S in [1, 2, 3, 5, 10]:
    t_half = S * math.log(2)
    print(f"S={S:2d} → half-life = {t_half:.2f} days")

# S= 1 → half-life = 0.69 days (~17 hours)
# S= 2 → half-life = 1.39 days (~33 hours)
# S= 3 → half-life = 2.08 days (~50 hours)
# S= 5 → half-life = 3.47 days
# S=10 → half-life = 6.93 days (~1 week)
```
Half-life is directly proportional to S. Double S, double the half-life. Linear relationship.
A never-recalled memory (S=1) has a half-life of 17 hours. Half gone before the day is over. A twice-recalled memory (S=3) has a half-life of 2 days. Nine recalls (S=10) gets you nearly a week.
These numbers make MemoryBank's design intent crystal clear. Topics that come up frequently are remembered longer. One-off mentions fade fast. Same pattern as human memory.
The game analogy here is aggro decay. When a player stops dealing damage to a boss, aggro decays over time. Keep hitting, and aggro stays high and builds. Damage dealt is "recall," aggro value is "retention." Same exponential dynamics.
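The mapping is literal enough to write down. A toy sketch (hypothetical aggro model, not from any particular engine): threat decays exponentially with idle time since the last hit, and each hit resets the clock — exactly recall resetting t.

```python
import math

def aggro(base_threat, secs_since_hit, decay_time):
    """Hypothetical threat model: exponential decay since the last hit."""
    return base_threat * math.exp(-secs_since_hit / decay_time)

print(aggro(100, 0, 5))   # just hit → full threat (100.0)
print(aggro(100, 5, 5))   # one decay-time idle → ~36.8
print(aggro(100, 15, 5))  # three decay-times idle → ~5.0, boss loses interest
```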
Simulation: The Fate of 5 Memories Over 10 Days
Let's run a real scenario. Five memories, different recall patterns, 10 days.
```python
import math

THRESHOLD = 0.1

memories = {
    "job_change":   {"S": 1, "t": 0, "born": 0, "recalls": []},
    "lunch_menu":   {"S": 1, "t": 0, "born": 0, "recalls": []},
    "UE5_bug":      {"S": 1, "t": 0, "born": 1, "recalls": []},
    "weekend_camp": {"S": 1, "t": 0, "born": 2, "recalls": []},
    "salary_talk":  {"S": 1, "t": 0, "born": 3, "recalls": []},
}

# Recall schedule (by day)
recall_schedule = {
    "job_change":   [1, 3, 5, 8],     # Frequent recalls
    "lunch_menu":   [],               # Never recalled
    "UE5_bug":      [2, 4],           # Occasional recalls
    "weekend_camp": [3],              # Recalled once
    "salary_talk":  [4, 6, 7, 8, 9],  # Very frequent recalls
}

for day in range(11):
    print(f"--- Day {day} ---")
    for name, mem in memories.items():
        if day < mem['born']:
            continue
        last_event = mem['recalls'][-1] if mem['recalls'] else mem['born']
        mem['t'] = day - last_event
        if day in recall_schedule[name]:
            mem['S'] += 1
            mem['recalls'].append(day)
            mem['t'] = 0
        R = math.exp(-mem['t'] / mem['S']) if mem['S'] > 0 else 0
        status = "ALIVE" if R >= THRESHOLD else "DEAD"
        print(f"  {name:14s}: S={mem['S']}, t={mem['t']}d, "
              f"R={R:.3f} ({R*100:.1f}%) [{status}]")
    print()
```
Key takeaways from the results.
"lunch_menu" was never recalled. S=1, R hits 13.5% by Day 2 and 5.0% by Day 3. Dead at the 0.1 threshold on Day 3.
"job_change" was recalled on Days 1, 3, 5, and 8. By Day 10, S=5, 2 days since last recall. R equals e^(−2/5) = 67.0%. Still healthy.
"salary_talk" was recalled almost daily starting Day 4. By Day 10, S=6, 1 day since last recall. R equals e^(−1/6) = 84.6%. The strongest memory.
"weekend_camp" was recalled exactly once on Day 3. S bumped to 2, but then 7 days of silence. By Day 10, R equals e^(−7/2) = 3.0%. Dead.
Same 10 days, completely different fates based on recall patterns. That's the core mechanism of MemoryBank at work.
Mathematical Limitations of the MemoryBank Model
The formula is clean. The gaps with reality are equally clear.
Linear S increase. S goes up by exactly 1 per recall. But real human memory reinforcement is nonlinear. The first review has the biggest impact, and subsequent reviews show diminishing returns. Something like S_new = S_old + 1/log(S_old + 1) would be more realistic than a flat S_new = S_old + 1.
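Here's what that difference looks like, interpreting log as the natural log (the sublinear rule is illustrative, not from the paper):

```python
import math

def grow_linear(S):
    # MemoryBank's rule: every recall adds exactly 1
    return S + 1

def grow_sublinear(S):
    # Illustrative alternative: increments shrink as S grows
    return S + 1 / math.log(S + 1)

S_lin, S_sub = 1.0, 1.0
for recall in range(1, 6):
    S_lin, S_sub = grow_linear(S_lin), grow_sublinear(S_sub)
    print(f"recall {recall}: linear S={S_lin:.2f}, sublinear S={S_sub:.2f}")
```

Under this rule the first recall adds about 1.44 and each later recall adds progressively less, matching the diminishing-returns intuition.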
No emotional weighting. Ebbinghaus's original experiments used meaningless syllables — things like WID and ZOF. He himself acknowledged that meaningful information is forgotten roughly 10 times more slowly. MemoryBank initializes S at 1 for everything — career doubts and lunch choices alike. Emotional significance isn't factored in. An extension could use LLM-based sentiment analysis to assign differential S_init values (say, 1–3) based on emotional intensity.
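A minimal sketch of that extension (the `initial_strength` mapping and the 1–3 range are assumptions, not part of MemoryBank):

```python
def initial_strength(emotion_score):
    """Hypothetical: map a sentiment-intensity score in [0, 1]
    (e.g. from an LLM classifier) to an initial S between 1 and 3."""
    clamped = max(0.0, min(1.0, emotion_score))
    return 1 + 2 * clamped

print(f"{initial_strength(0.05):.2f}")  # lunch choice → 1.10
print(f"{initial_strength(0.90):.2f}")  # career doubts → 2.80
```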
Ambiguous time units. The paper doesn't specify what unit t uses. Days? Hours? Minutes? Conversation sessions? The curve's shape changes entirely depending on the unit. This is the first parameter to lock down when deploying in production.
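A concrete illustration of how much the unit matters — the same 24 hours of real time, with t interpreted in days versus hours (S = 2 in both cases; this assumes S carries the same unit, which is exactly the ambiguity):

```python
import math

S = 2
elapsed_hours = 24

R_days = math.exp(-(elapsed_hours / 24) / S)  # t measured in days → t = 1
R_hours = math.exp(-elapsed_hours / S)        # t measured in hours → t = 24
print(f"t in days:  R = {R_days:.4f}")   # 0.6065 → a healthy memory
print(f"t in hours: R = {R_hours:.6f}")  # 0.000006 → long dead
```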
Recall detection criteria. "This memory was recalled during conversation" is determined by FAISS search results. Does appearing in top-k count as recall? Or does it need to actually influence the response? The answer changes how frequently S gets incremented. If memories that were retrieved but never used in the response still get their S bumped, memory strength gets overestimated.
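One way to pin the criterion down (a hypothetical policy, not what the paper specifies): require both a similarity bar and actual use in the response before bumping S.

```python
def is_recall(similarity, used_in_response, sim_threshold=0.75):
    """Hypothetical recall policy: a top-k retrieval hit alone doesn't
    count; the memory must clear a similarity bar AND appear in the reply."""
    return similarity >= sim_threshold and used_in_response

# Retrieved but never used in the reply → S stays put
print(is_recall(0.80, used_in_response=False))  # False
# Retrieved and woven into the response → genuine recall, bump S
print(is_recall(0.80, used_in_response=True))   # True
```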
These limitations are also expansion directions. The authors themselves call this "an exploratory and highly simplified model" — the formula is a starting point, not the final answer. Layer emotional weighting, nonlinear S growth, and context-aware recall detection on top of the base formula, and you get a far more sophisticated memory system.
R = e^(−t/S). One formula explains the birth, reinforcement, and death of memories. Not a complex memory architecture — a 140-year-old psychological principle, transplanted onto LLMs. Simple but effective. And because it's simple, it's extensible.
"The simpler the formula, the stronger it is. Complexity is easy to implement. Simplicity is easy to extend."