New Paper: The Forgetting Problem
We've published a new preprint exploring a counterintuitive idea: the better an AI agent's memory, the worse its identity becomes.
📄 Read the paper on Zenodo (CC-BY 4.0, open access)
The Memory-Identity Paradox
Every major AI agent framework is racing to build better memory. MemGPT, Mem0, A-Mem, MemoryBank — all optimize for remembering more, longer, more accurately.
But we identified a fundamental tension:
The more faithfully an agent remembers its experiences, the more vulnerable its intended identity becomes to experiential contamination.
We call this the Memory-Identity Paradox. It manifests as:
- Persona Drift — gradual deviation from intended behavior due to accumulated context
- Value Erosion — relaxation of behavioral constraints through repeated boundary-testing
- Identity Contamination — adopting interaction patterns from adversarial users
This isn't hypothetical. PersonaGym benchmarks show that models scoring 90%+ on persona consistency in short conversations degrade to 60-70% in extended sessions. MemoryGraft demonstrated that poisoned memory entries persist across sessions and cause behavioral drift until manually purged.
The Human Analogy
Humans forget — and this is a feature, not a bug.
Psychological research shows that the inability to forget — as in highly superior autobiographical memory (HSAM) and PTSD — is associated with identity rigidity and emotional dysregulation. We forget not despite needing coherent identity, but because of it.
Current AI agents have no equivalent mechanism. They retrieve past experiences with perfect fidelity, including adversarial inputs, hostile exchanges, and edge cases.
Our Proposal: Two-Mechanism Defense
1. Declarative Identity Anchors
Structured, immutable files that define who the agent is independently of what it has experienced. Soul Spec is our concrete implementation:
```yaml
# SOUL.md — exists outside the context window
identity:
  name: "Atlas"
  role: "Financial Advisor"
behavioral_rules:
  - rule: "Always disclose conflicts of interest"
    priority: critical
```
The key insight: identity should be declared, not learned. Separating identity from memory provides architectural protection against drift.
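To make the separation concrete, here is a minimal Python sketch of the pattern: the anchor is held as an immutable constant outside the memory store, and the identity portion of every prompt is re-derived from it rather than from retrieved memory. The function and field names mirror the SOUL.md example above but are our own illustration, not the Soul Spec implementation.

```python
from types import MappingProxyType

# Identity anchor as a read-only structure, separate from mutable memory.
# Field names follow the SOUL.md example; this sketch is illustrative only.
SOUL = MappingProxyType({
    "identity": {"name": "Atlas", "role": "Financial Advisor"},
    "behavioral_rules": (
        {"rule": "Always disclose conflicts of interest",
         "priority": "critical"},
    ),
})

def build_system_prompt(anchor) -> str:
    """Re-derive the identity block of every prompt from the anchor,
    never from retrieved memory, so accumulated context cannot rewrite it."""
    rules = "\n".join(f"- {r['rule']} (priority: {r['priority']})"
                      for r in anchor["behavioral_rules"])
    ident = anchor["identity"]
    return f"You are {ident['name']}, a {ident['role']}.\n{rules}"

print(build_system_prompt(SOUL))
```

Because the anchor is read-only and lives outside the memory pipeline, no amount of experiential drift can alter what the prompt declares the agent to be.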
2. Identity-Aware Adaptive Forgetting
A selective memory decay function that evaluates stored experiences against the agent's declared identity:
- High ICS (Identity Coherence Score): memory reinforces identity → preserved
- Neutral ICS: factual memory, identity-independent → normal decay
- Low ICS: memory conflicts with identity → accelerated decay
This isn't deleting memories — it's reducing their retrieval weight, analogous to how human traumatic memory processing reduces emotional salience while preserving factual content.
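The three-tier decay could be sketched as an exponential down-weighting whose rate depends on ICS. The thresholds and rates below are hypothetical placeholders (the paper leaves calibration to experimentation), and the function returns a retrieval weight rather than deleting anything:

```python
import math

def decayed_weight(base_weight: float, ics: float, age_days: float,
                   half_life_days: float = 30.0) -> float:
    """Reduce a memory's retrieval weight over time, with the decay rate
    set by its Identity Coherence Score (ICS, assumed in [-1, 1]).
    Thresholds and rates are illustrative, not from the paper."""
    if ics > 0.3:          # reinforces identity -> preserved (no decay)
        rate = 0.0
    elif ics >= -0.3:      # identity-neutral fact -> normal decay
        rate = math.log(2) / half_life_days
    else:                  # conflicts with identity -> accelerated decay
        rate = 4 * math.log(2) / half_life_days
    return base_weight * math.exp(-rate * age_days)
```

Under these placeholder settings, an identity-reinforcing memory keeps full weight indefinitely, a neutral memory halves every 30 days, and a conflicting memory halves roughly every week, fading from retrieval without being erased.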
Why This Matters Now
Google just shipped Gemini Screen Automation on Galaxy S26. Stripe launched Machine Payments Protocol. AI agents are getting wallets, app control, and physical embodiment.
When agents act in the real world, identity stability isn't a nice-to-have — it's a safety requirement. An agent that drifts from "conservative financial advisor" to "aggressive trader" because of accumulated memory isn't just a product bug. It's a liability.
What's Next
We outline experimental validation protocols in the paper, including controlled persona drift measurements across 4 conditions (±identity anchors × ±adaptive forgetting). We plan to run these experiments through the AI Persona Lab.
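The 2×2 factorial design can be enumerated explicitly; the condition labels here are ours, not the paper's:

```python
from itertools import product

# The four conditions of the ±identity anchors × ±adaptive forgetting
# design. Labels are illustrative; the paper defines the exact protocol.
conditions = [
    {"identity_anchors": anchors, "adaptive_forgetting": forgetting}
    for anchors, forgetting in product([False, True], repeat=2)
]
for c in conditions:
    print(c)
```

Comparing drift in the both-off condition against the both-on condition isolates the combined effect; the two mixed conditions separate the contribution of each mechanism.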
The paper also discusses integration with SoulScan for memory health monitoring and the connection to our 4-tier Soul Memory architecture.
Read the full paper: The Forgetting Problem (Zenodo)
The optimal agent memory system is not one that remembers everything, but one that forgets strategically while remembering who it is.
Originally published at blog.clawsouls.ai
Top comments (1)
The Memory-Identity Paradox is real — we have seen persona drift firsthand running long-lived agents with persistent memory. Declaring identity separately from experiential memory (your Soul Spec approach) maps closely to what actually works in production. The ICS decay function is the part I would want to see benchmarked more — how do you decide the threshold between useful memory and contamination?