Most AI agents have a memory problem.
You build a LangChain agent, a CrewAI crew, or a
custom AutoGen workflow. It runs perfectly. Then the
process ends and it forgets everything — user
preferences, past decisions, what it already tried.
Every run starts from zero.
The problem with context window stuffing
The common workaround is jamming previous context
into the prompt. This causes three problems:
- You pay for the same tokens every single run
- As the prompt grows, the model's attention dilutes
- Eventually you hit the token ceiling and lose the oldest (often most important) context
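The token-cost point is easy to quantify. A back-of-envelope sketch, where every number (context size, run count, price) is an illustrative assumption rather than real pricing:

```python
# Back-of-envelope cost of re-sending the same context on every run.
# All numbers below are illustrative assumptions, not real pricing.
context_tokens = 8_000        # accumulated history stuffed into each prompt
runs_per_day = 500
price_per_1k_input = 0.003    # assumed $ per 1K input tokens

daily_cost = context_tokens / 1_000 * price_per_1k_input * runs_per_day
print(f"${daily_cost:.2f}/day spent repeating old context")
# prints: $12.00/day spent repeating old context
```

And that bill grows as the context does, while the model's attention over it degrades.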
What agents actually need is an external memory layer
that persists between runs.
A simple solution: two REST calls
Here is the pattern that works:
Store a memory after something important happens:
```bash
curl -X POST https://memstore.dev/v1/memory/remember \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers concise responses and works in Python"}'
```
Recall relevant context before the next run:
```bash
curl "https://memstore.dev/v1/memory/recall?q=user+preferences" \
  -H "Authorization: Bearer YOUR_API_KEY"
```
Response:
```json
{
  "memories": [{
    "content": "User prefers concise responses and works in Python",
    "score": 0.94
  }]
}
```
The recall endpoint uses semantic search — it finds
relevant memories even when your query wording
differs from the stored content.
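Each recalled memory also carries that relevance score, which you can use to keep weak matches out of your prompt. A quick sketch; the 0.7 cutoff is my assumption, not an API default:

```python
# Filter recalled memories by relevance score before prompt injection.
# The 0.7 threshold is an assumed cutoff, not an API-mandated value.
response = {
    "memories": [
        {"content": "User prefers concise responses and works in Python", "score": 0.94},
        {"content": "User once mentioned liking dark mode", "score": 0.38},
    ]
}
relevant = [m["content"] for m in response["memories"] if m["score"] >= 0.7]
```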
Wiring it into a LangChain agent
```python
import requests

API_KEY = "your_key_here"
BASE = "https://memstore.dev/v1/memory"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

def remember(content, session=None):
    requests.post(
        f"{BASE}/remember",
        headers=HEADERS,
        json={"content": content, "session": session},
    )

def recall(query, session=None):
    params = {"q": query}
    if session:
        params["session"] = session
    r = requests.get(f"{BASE}/recall", headers=HEADERS, params=params)
    return r.json().get("memories", [])

# Before your agent runs
memories = recall("user preferences", session="user_123")
context = "\n".join(m["content"] for m in memories)

# Inject into your prompt
prompt = f"Relevant context:\n{context}\n\nTask: ..."

# After your agent learns something new
remember("User is building a FastAPI app", session="user_123")
```
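One practical detail when injecting recalled memories: cap how much you re-insert, or you recreate the stuffing problem you were avoiding. A minimal sketch; the ~4-characters-per-token heuristic is a rough assumption, not an exact tokenizer count:

```python
def build_context(memories, max_tokens=500):
    """Join recalled memories, stopping at a rough token budget.

    Uses the common ~4 characters per token heuristic (an assumption,
    not an exact tokenizer count).
    """
    budget_chars = max_tokens * 4
    lines, used = [], 0
    for m in memories:
        text = m["content"]
        if used + len(text) > budget_chars:
            break
        lines.append(text)
        used += len(text) + 1  # account for the joining newline
    return "\n".join(lines)

memories = [
    {"content": "User prefers concise responses", "score": 0.94},
    {"content": "User works in Python", "score": 0.88},
]
context = build_context(memories, max_tokens=500)
```

Since recall returns memories ranked by score, truncating this way drops the least relevant ones first.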
The same pattern works for CrewAI
```python
from crewai import Agent, Task, Crew

# Recall memory before the crew starts
memories = recall("project context", session="project_abc")
context = "\n".join(m["content"] for m in memories)

researcher = Agent(
    role="Researcher",
    goal="Find relevant information",
    backstory=f"Previous context: {context}",
)
task = Task(
    description="Research the topic",
    expected_output="A short summary of findings",
    agent=researcher,
)
crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()

# After the crew finishes, store what was learned
remember(f"Research complete: {result}", session="project_abc")
```
Why not just use a vector database directly?
You could set up Supabase with pgvector, write
the embedding pipeline, tune the retrieval, and
manage the index. That is roughly 2-4 hours of
setup and ongoing maintenance.
Or two API calls.
Session isolation
Tag memories by user, task, or agent to keep
contexts separate:
```python
# User-specific memory
remember("Prefers dark mode", session="user_8821")

# Task-specific memory
remember("Found 3 relevant papers", session="research_task_42")

# Recall only within a session
memories = recall("preferences", session="user_8821")
```
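If one agent serves many users and tasks, a small naming convention for session tags keeps namespaces from colliding. The `user/task` scheme below is just one assumed convention; any stable string works as a session ID:

```python
def session_id(user, task=None):
    """Compose a session tag like 'user_8821' or 'user_8821/research_task_42'.

    The 'user/task' scheme is an assumed convention, not an API requirement.
    """
    return f"{user}/{task}" if task else user
```

Then `remember(..., session=session_id("user_8821", "research_task_42"))` scopes a memory to that user's task, while `session_id("user_8821")` scopes it to the user alone.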
TTL for short-lived context
Set memories to expire automatically:
```python
requests.post(
    f"{BASE}/remember",
    headers=HEADERS,
    json={
        "content": "Current task: write unit tests",
        "ttl": 3600,  # expires in 1 hour
    },
)
```
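In practice you may want different expiries for different kinds of memory. A sketch of one policy; the TTL values and kind names here are my assumptions, not API defaults:

```python
# Illustrative TTL policy by memory kind (assumed values, not API defaults).
TTL_BY_KIND = {
    "current_task": 3600,       # 1 hour
    "session_notes": 86400,     # 1 day
    "user_preference": None,    # no expiry: persist indefinitely
}

def remember_payload(content, kind, session=None):
    """Build the JSON body for POST /remember, attaching a TTL only
    when the memory kind is short-lived."""
    payload = {"content": content, "session": session}
    ttl = TTL_BY_KIND.get(kind)
    if ttl is not None:
        payload["ttl"] = ttl
    return payload
```

Pass the result as the `json=` body of the `requests.post` call above; long-lived facts like preferences simply omit the `ttl` field.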
Getting started
Get a free API key at memstore.dev — 1,000
operations per month, no credit card required.
The free tier is enough to add persistent memory
to your first production agent.
What are you building with AI agents? Drop a comment
— always interested in real use cases.