
Ashwani Jha

I got tired of agents forgetting everything, so I built a memory layer. No more re-building RAG pipelines every time.

Every AI agent I built had the same problem: it forgot everything the moment the conversation ended.

Not because the LLM was bad, but because there was no memory layer wiring things together. So I'd ship a chatbot, watch users re-explain their context every session, and quietly die inside.

I spent a few months building extremis to fix this.

Here's the part that matters most.

One import change

# Before
import anthropic
client = anthropic.Anthropic(api_key="sk-ant-...")

# After
from extremis.wrap import Anthropic
from extremis import Extremis

client = Anthropic(api_key="sk-ant-...", memory=Extremis())

That's it. Every client.messages.create() call now automatically recalls relevant past context before the LLM call and saves the conversation after it. Your application code doesn't change at all.

Works with OpenAI too:

from extremis.wrap import OpenAI
client = OpenAI(api_key="sk-...", memory=Extremis())
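If you want to picture what a wrapper like this does on each call, here's a minimal conceptual sketch of the recall-then-call-then-save pattern. The class and method names here are hypothetical, not extremis internals:

```python
# Sketch of the recall -> call -> save pattern (hypothetical names;
# this is not extremis's actual wrapper implementation).

class MemoryWrappedClient:
    def __init__(self, llm_client, memory):
        self.llm = llm_client
        self.memory = memory

    def create(self, messages, **kwargs):
        # 1. Recall context relevant to the latest user message.
        query = messages[-1]["content"]
        recalled = self.memory.recall(query)

        # 2. Prepend recalled context so the model sees it.
        preamble = [{"role": "system", "content": "\n".join(recalled)}]
        response = self.llm.create(messages=preamble + messages, **kwargs)

        # 3. Persist the turn so future calls can recall it.
        self.memory.save(query, response)
        return response
```

The point of the pattern is that steps 1 and 3 are invisible to the caller, which is why the import swap is the only change your code needs.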

What makes it different from just storing messages in a database?

Most memory systems are cosine search — the most similar memory wins. That's the wrong metric. Similar ≠ useful.

extremis adds RL scoring. Every recalled memory can receive a +1 or -1 signal. Positive ones rank higher over time. Negative ones fade — penalties carry 1.5× weight, the same asymmetry human threat learning uses.

results = mem.recall("what does the user prefer?")

# After using these memories in your response:
mem.report_outcome([r.memory.id for r in results], success=True)

# Next recall — confirmed-useful memories surface first

Every result also tells you why it ranked there:

"similarity 0.91 · score +4.0 · used 8× · 3 days old"

No black box. Fully debuggable.
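To make the asymmetry concrete, here's a toy version of the outcome update. The exact formula and constants are my assumption about the shape of the rule, not the library's actual implementation:

```python
# Toy version of the asymmetric outcome update (my assumption, not
# extremis's actual formula).
NEGATIVE_WEIGHT = 1.5  # penalties weigh 1.5x rewards

def update_score(score, success):
    """Apply one +1 / -1 outcome signal to a memory's score."""
    return score + 1.0 if success else score - NEGATIVE_WEIGHT
```

Under this kind of rule, one bad outcome costs more than one good outcome earns, so a memory that misleads as often as it helps still drifts downward in the ranking.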

It also has a knowledge graph

Vectors answer "what's related to this topic?" The graph answers "who does Alice work for?":

from extremis.types import EntityType

mem.kg_add_entity("Alice", EntityType.PERSON)
mem.kg_add_relationship("Alice", "Acme Corp", "works_at")
mem.kg_add_attribute("Alice", "timezone", "Asia/Dubai")

result = mem.kg_query("Alice")
# → works_at Acme Corp, timezone: Asia/Dubai
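One way to see why a graph answers this kind of question better than vectors: facts are stored as (subject, predicate, object) triples, so relational questions become exact lookups rather than similarity search. A toy sketch (illustrative only, not extremis internals):

```python
# Toy triple store (illustrative only, not extremis internals).
# Facts live as (subject, predicate, object) triples.
TRIPLES = [
    ("Alice", "works_at", "Acme Corp"),
    ("Alice", "timezone", "Asia/Dubai"),
]

def query(subject):
    """Return every predicate/object pair recorded for a subject."""
    return {pred: obj for subj, pred, obj in TRIPLES if subj == subject}
```

A cosine search over embedded sentences might return "Alice mentioned Acme once" for the same question; the triple lookup returns exactly the recorded relationship or nothing.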

Claude Desktop (zero code)

pip3.11 install "extremis[mcp]"

Add two lines to claude_desktop_config.json, restart Claude Desktop, and you get 10 memory tools automatically. No Python code at all.
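For reference, Claude Desktop registers MCP servers under the `mcpServers` key in that file. The entry below shows the general shape only — the server name and command here are placeholders I've guessed, so use whatever the extremis docs specify:

```json
{
  "mcpServers": {
    "extremis": {
      "command": "extremis-mcp"
    }
  }
}
```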

Try it

pip3.11 install extremis
extremis-demo    # shows everything working in ~20 seconds

Happy to answer questions about the RL scoring design, the knowledge graph, or anything else in the comments.
