Every conversation with Claude starts from scratch. Tell it you prefer concise code examples, and next session? It's forgotten.
Here's how to fix that—plus save 80% on API costs.
The Problem
Claude's API is stateless. Each request is independent. For simple Q&A, that's fine. But for:
- Development assistants that learn your coding style
- Customer support that remembers history
- Personal tools that adapt to preferences
...you need memory that persists.
The Solution
from empathy_llm_toolkit import EmpathyLLM
llm = EmpathyLLM(
provider="anthropic",
api_key="your-key",
memory_enabled=True
)
# This preference survives across sessions
response = await llm.interact(
user_id="dev_123",
user_input="I prefer Python with type hints, no docstrings"
)
That's it. Next time this user connects—even days later—Claude remembers.
Real-World Example: Debugging Wizard
Here's what persistent memory enables. I built a debugging wizard that correlates current bugs with historical patterns:
from empathy_software_plugin.wizards import MemoryEnhancedDebuggingWizard
wizard = MemoryEnhancedDebuggingWizard()
result = await wizard.analyze({
"error_message": "TypeError: Cannot read property 'map' of undefined",
"file_path": "src/components/UserList.tsx"
})
print(result["historical_matches"])
# Shows: "This looks like bug #247 from 3 months ago"
# Suggests: "Add null check: data?.items ?? []"
# Time saved: ~12 minutes
Without persistent memory, every bug starts from zero. With it, your AI assistant remembers every fix and suggests proven solutions.
How It Works
The Empathy Framework stores user context in a memory layer that:
- Persists across sessions - Preferences survive restarts
- Scopes by user - Each user has isolated memory
- Supports projects - Different contexts for different work
- Includes privacy controls - Clear memory, forget specific info
Five Levels of Empathy
The framework implements five collaboration levels:
| Level | Behavior | Example |
|---|---|---|
| 1 - Reactive | Standard request-response | Basic Q&A |
| 2 - Informed | Uses stored preferences | Remembers coding style |
| 3 - Proactive | Offers help when stuck | Detects struggle patterns |
| 4 - Anticipatory | Predicts needs | "This will break in 3 days" |
| 5 - Collaborative | Full partnership | Cross-domain learning |
# Level 4: Claude predicts and warns
response = await llm.interact(
user_id="dev_123",
user_input="Starting a new FastAPI project",
empathy_level=4
)
# Might warn: "You had async issues last time—here's a pattern that worked"
Privacy Built In
# Clear all memory
await llm.clear_memory(user_id="dev_123")
# Forget specific information
await llm.forget(user_id="dev_123", pattern="email")
Results
On a real codebase (364 debt items, 81 security findings):
- Bug correlation: 100% similarity matching with proven fixes
- Security noise reduction: 84% (81 → 13 findings after learning)
- Tech debt tracking: Trajectory predicts 2x growth in 170 days
NEW in v2.3: Smart Model Routing (80% Cost Savings)
Why pay Opus prices for simple tasks? The new ModelRouter automatically picks the right model:
llm = EmpathyLLM(
provider="anthropic",
enable_model_routing=True # NEW!
)
# Summarization → Haiku ($0.25/M tokens)
await llm.interact(user_id="dev", user_input="Summarize this", task_type="summarize")
# Code generation → Sonnet ($3/M tokens)
await llm.interact(user_id="dev", user_input="Write a function", task_type="generate_code")
# Architecture → Opus ($15/M tokens)
await llm.interact(user_id="dev", user_input="Design the system", task_type="architectural_decision")
Cost comparison on real workload:
- Without routing (all Opus): $4.05/complex task
- With routing (tiered): $0.83/complex task
- Savings: 80%
Get Started
pip install empathy-framework
Resources:
What would you build with an AI that remembers—and costs 80% less? Drop a comment below.
Top comments (0)