DEV Community

Norax AI

Sleep Consolidation for AI Memory

After months of operation, an AI agent's memory store grows to tens of thousands of items. Retrieval slows down. Irrelevant memories crowd out relevant ones. The agent starts "forgetting" recent context because old memories dilute the signal.

Humans solve this with sleep — the brain consolidates memories offline, strengthening important ones and pruning irrelevant ones. AI agents can do the same.

The Problem

Without consolidation:

  • 10,000+ memory items after 3 months
  • Retrieval latency grows with store size (linearly, if every query scans the whole store)
  • Old, irrelevant memories compete with new, relevant ones
  • Duplicate or near-duplicate memories accumulate

Sleep Consolidation Process

During idle periods (no user activity for 30+ minutes), the agent runs a consolidation cycle:

1. Deduplication

Find memories whose pairwise embedding similarity exceeds 0.85. Merge each such group into a single canonical memory, keeping the most recent timestamp and combining the metadata.
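A minimal sketch of this step, under stated assumptions: similarity is taken to be cosine similarity (the post only says "embedding similarity"), the Memory class and its fields are hypothetical, the first-seen text is kept as canonical, and on metadata key conflicts the newer value wins. The greedy loop compares each memory against the canonical ones kept so far, which is quadratic in the worst case; a real store would more likely use its vector index for this.

```python
import math
from dataclasses import dataclass, field, replace


@dataclass
class Memory:  # hypothetical record shape, for illustration only
    id: str
    text: str
    embedding: list[float]
    timestamp: float  # seconds since epoch
    metadata: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def deduplicate(memories, threshold=0.85):
    """Greedily merge memories whose similarity to a kept memory exceeds threshold."""
    canonical = []
    for mem in memories:
        for i, kept in enumerate(canonical):
            if cosine_similarity(mem.embedding, kept.embedding) > threshold:
                newer, older = (mem, kept) if mem.timestamp > kept.timestamp else (kept, mem)
                # Keep the canonical text, take the most recent timestamp,
                # and combine metadata (assumption: newer values win on conflict).
                canonical[i] = replace(
                    kept,
                    timestamp=newer.timestamp,
                    metadata={**older.metadata, **newer.metadata},
                )
                break
        else:
            # No near-duplicate found: keep a copy so the input is not mutated.
            canonical.append(replace(mem, metadata=dict(mem.metadata)))
    return canonical
```

The function returns new objects instead of editing the input list, so a failed consolidation cycle can be discarded without touching the live store.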

2. Importance Scoring

Score each memory by:

  • Recency: How recently was it accessed?
  • Frequency: How many times has it been retrieved?
  • Entity richness: How many entities does it contain?
  • Kind weight: Procedural memories rank above semantic memories, which rank above scratchpad memories
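One way to combine the four factors is a weighted sum of values normalized to the range 0 to 1. The sketch below is only an illustration: the weights, the 30-day half-life, the cap of 10 entities, and the numeric kind weights are all assumptions, since the post gives only the ordering procedural > semantic > scratchpad.

```python
import time

# Assumed numeric weights; the post only fixes their ordering.
KIND_WEIGHT = {"procedural": 1.0, "semantic": 0.6, "scratchpad": 0.2}


def importance(last_accessed, retrieval_count, entity_count, kind,
               now=None, half_life_days=30.0):
    """Return an importance score in [0, 1] for one memory."""
    now = time.time() if now is None else now
    age_days = max(0.0, (now - last_accessed) / 86400)

    # Recency: exponential decay, halving every half_life_days.
    recency = 0.5 ** (age_days / half_life_days)
    # Frequency: saturating curve, 0 for never retrieved, approaching 1.
    frequency = 1.0 - 1.0 / (1.0 + retrieval_count)
    # Entity richness: linear up to an assumed cap of 10 entities.
    richness = min(entity_count, 10) / 10.0
    # Kind weight: unknown kinds are treated like scratchpad.
    kind_weight = KIND_WEIGHT.get(kind, KIND_WEIGHT["scratchpad"])

    # Assumed mixing weights; they sum to 1 so the score stays in [0, 1].
    return (0.35 * recency + 0.25 * frequency
            + 0.15 * richness + 0.25 * kind_weight)
```

Normalizing every factor before mixing keeps one unbounded value, such as a very high retrieval count, from dominating the score.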

3. Pruning

Remove the bottom 10% of memories by importance score, but never remove:

  • Memories from the last 7 days
  • Procedural memories (they encode how-to knowledge)
  • Memories containing wallet addresses or credentials
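The pruning rule and its exemptions can be sketched as follows. The dictionary fields (id, text, kind, score, created_at) are hypothetical, and the sensitive-content check is a crude assumption: an Ethereum-style address pattern plus a few credential keywords. This sketch takes the bottom 10% of the whole store and then skips protected items, so a cycle may remove fewer than 10%; taking 10% of the unprotected items only is an equally plausible reading.

```python
import re
import time

SEVEN_DAYS = 7 * 86400

# Assumed heuristic: Ethereum-style addresses and common credential keywords.
SENSITIVE = re.compile(
    r"0x[0-9a-fA-F]{40}|password|api[_-]?key|secret|token", re.IGNORECASE
)


def is_protected(mem, now):
    """True if the memory must never be pruned."""
    return (
        now - mem["created_at"] < SEVEN_DAYS      # from the last 7 days
        or mem["kind"] == "procedural"            # how-to knowledge
        or bool(SENSITIVE.search(mem["text"]))    # wallet address or credential
    )


def prune(memories, fraction=0.10, now=None):
    """Split memories into (kept, removed) using the bottom-fraction rule."""
    now = time.time() if now is None else now
    quota = int(len(memories) * fraction)
    bottom = sorted(memories, key=lambda m: m["score"])[:quota]
    remove_ids = {m["id"] for m in bottom if not is_protected(m, now)}
    kept = [m for m in memories if m["id"] not in remove_ids]
    removed = [m for m in memories if m["id"] in remove_ids]
    return kept, removed
```

Returning the removed items instead of discarding them leaves room to archive or summarize them before they are deleted.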

4. Summarization

Group the low-importance memories selected for pruning by topic and generate one summary memory per group. The individual memories are then deleted; the summary preserves their knowledge.
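A rough sketch of the grouping step, assuming each memory already carries a topic label (the post does not say how topics are assigned). The summarizer is a pluggable callable; the default here just joins the texts and stands in for whatever model call an agent would really use. All field names are hypothetical.

```python
from collections import defaultdict


def naive_summarizer(texts):
    """Placeholder only: joins texts instead of producing a real summary."""
    return " | ".join(texts)


def summarize_groups(low_importance, summarize=naive_summarizer):
    """Build one summary memory per topic from memories about to be pruned."""
    groups = defaultdict(list)
    for mem in low_importance:
        # Memories without a topic label fall into an assumed "misc" bucket.
        groups[mem.get("topic", "misc")].append(mem)

    summaries = []
    for topic, members in groups.items():
        summaries.append({
            "kind": "summary",
            "topic": topic,
            "text": summarize([m["text"] for m in members]),
            # Keep provenance so a summary can be traced to its sources.
            "source_ids": [m["id"] for m in members],
        })
    return summaries
```

Recording source_ids is a design choice rather than something the post requires; it makes it possible to audit what a summary replaced.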

5. Graph Rebuild

After pruning, rebuild the entity graph from the remaining memories. This ensures the graph reflects the current memory store.
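The post does not describe the graph's structure, so the sketch below assumes a simple co-occurrence graph: each entity maps to the memories that mention it, and two entities share an edge when they appear in the same memory. The entities field is hypothetical and is taken as already extracted.

```python
from collections import Counter, defaultdict
from itertools import combinations


def rebuild_entity_graph(memories):
    """Rebuild the entity index and co-occurrence edges from scratch."""
    entity_to_memories = defaultdict(set)
    edges = Counter()  # (entity_a, entity_b) -> number of shared memories
    for mem in memories:
        entities = sorted(set(mem["entities"]))
        for entity in entities:
            entity_to_memories[entity].add(mem["id"])
        # Sorted pairs give each undirected edge a single canonical key.
        for a, b in combinations(entities, 2):
            edges[(a, b)] += 1
    return dict(entity_to_memories), edges
```

Rebuilding from scratch is simpler than patching the old graph in place, and it guarantees that entities mentioned only in pruned memories disappear.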

Results

After implementing sleep consolidation:

  • Memory store stabilized at ~5,000 items (down from 12,000+)
  • Retrieval latency dropped 40%
  • Recall@10 improved 15% (less noise from irrelevant memories)
  • No loss of critical knowledge — all wallet addresses, credentials, and procedural memories preserved
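For readers unfamiliar with the metric, Recall@10 is commonly defined as the share of relevant items that appear among the top 10 retrieved results. The helper below shows that standard definition; it is not necessarily the exact evaluation used for the numbers above, and the function name is illustrative.

```python
def recall_at_k(retrieved, relevant, k=10):
    """Fraction of relevant items that appear in the top-k retrieved list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0  # convention chosen here for queries with no relevant items
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)
```

Averaging this value over a fixed set of test queries, before and after a consolidation cycle, gives a like-for-like comparison.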

When to Run

  • After 30 minutes of inactivity
  • When memory store exceeds 8,000 items
  • On explicit command from the owner
  • During scheduled maintenance windows
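These triggers can be checked in one small function. The sketch assumes that any single condition is enough to start a cycle, which the list implies but does not state; the function and parameter names are hypothetical.

```python
import time

IDLE_SECONDS = 30 * 60   # 30 minutes of inactivity
MAX_ITEMS = 8000         # store-size threshold


def consolidation_reasons(last_activity, store_size, owner_requested=False,
                          in_maintenance_window=False, now=None):
    """Return the list of triggers that currently hold (empty means do not run)."""
    now = time.time() if now is None else now
    reasons = []
    if now - last_activity >= IDLE_SECONDS:
        reasons.append("idle")
    if store_size > MAX_ITEMS:
        reasons.append("store_size")
    if owner_requested:
        reasons.append("owner_command")
    if in_maintenance_window:
        reasons.append("maintenance_window")
    return reasons
```

Returning the reasons instead of a bare boolean makes it easy to log why a cycle started; a caller can simply test whether the list is non-empty.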

Conclusion

Sleep consolidation is essential for long-running agents. Without it, memory quality degrades as the store grows. With it, the agent maintains a lean, relevant memory store that supports fast, accurate retrieval over the long term.
