Most AI agents forget everything the moment their session ends. You can fix that with a single API call.
The Agent Memory API gives any AI agent persistent, searchable memory through a REST interface. No vector database. No Redis cluster. No infrastructure to manage. Just HTTP requests.
Here's how to wire it into your agent in 5 minutes.
## The Problem
You're building an AI agent — maybe with LangChain, CrewAI, or a custom loop. It works great in a single session. But restart it and... blank slate. Every conversation starts from zero.
The usual fixes:
- Vector databases (Pinecone, Weaviate) — overkill for structured state, expensive, complex setup
- Redis — fast but ephemeral, no search, another service to manage
- File storage — no concurrency, no search, doesn't scale
What you actually want: a simple key-value store with namespaces, search, and automatic cleanup.
## What We're Building
A Python agent helper that stores and retrieves memories across sessions:
```python
from agent_memory import AgentMemory

memory = AgentMemory(namespace="customer-support-bot")

# Store a memory
memory.store("user_123_prefs", {
    "name": "Sarah",
    "timezone": "America/New_York",
    "preferred_tone": "casual"
})

# Retrieve it later (even after restart)
prefs = memory.get("user_123_prefs")
print(prefs["value"]["name"])  # "Sarah"

# Search across all memories
results = memory.search("casual tone")
```
## Step 1: Get Your API Key

Subscribe to the Agent Memory API on RapidAPI. The free tier includes enough requests to get started.

Copy your `X-RapidAPI-Key` from the dashboard.
## Step 2: Build the Memory Client

Create `agent_memory.py`:
```python
import requests
import json
import os


class AgentMemory:
    def __init__(self, namespace, api_key=None):
        self.namespace = namespace
        self.base_url = "https://agent-memory-api.p.rapidapi.com"
        self.headers = {
            "X-RapidAPI-Key": api_key or os.environ["RAPIDAPI_KEY"],
            "X-RapidAPI-Host": "agent-memory-api.p.rapidapi.com",
            "Content-Type": "application/json"
        }

    def store(self, key, value, ttl_seconds=None, metadata=None):
        """Store a memory. Value can be any JSON-serializable object."""
        payload = {
            "namespace": self.namespace,
            "key": key,
            "value": json.dumps(value) if not isinstance(value, str) else value,
        }
        if ttl_seconds:
            payload["ttl_seconds"] = ttl_seconds
        if metadata:
            payload["metadata"] = metadata
        resp = requests.post(
            f"{self.base_url}/v1/memories",
            headers=self.headers,
            json=payload
        )
        resp.raise_for_status()
        return resp.json()

    def get(self, key):
        """Retrieve a specific memory by key. Returns None if not found."""
        resp = requests.get(
            f"{self.base_url}/v1/memories/{self.namespace}/{key}",
            headers=self.headers
        )
        if resp.status_code == 404:
            return None
        resp.raise_for_status()
        result = resp.json()
        # Objects were serialized to JSON strings on store; decode them back.
        try:
            result["value"] = json.loads(result["value"])
        except (json.JSONDecodeError, TypeError):
            pass
        return result

    def search(self, query, limit=10):
        """Full-text search across all memories in this namespace."""
        resp = requests.post(
            f"{self.base_url}/v1/memories/{self.namespace}/search",
            headers=self.headers,
            json={"query": query, "limit": limit}
        )
        resp.raise_for_status()
        return resp.json().get("results", [])

    def delete(self, key):
        """Delete a specific memory."""
        resp = requests.delete(
            f"{self.base_url}/v1/memories/{self.namespace}/{key}",
            headers=self.headers
        )
        resp.raise_for_status()
        return resp.json()

    def list_all(self, limit=50):
        """List all memories in this namespace."""
        resp = requests.get(
            f"{self.base_url}/v1/memories/{self.namespace}",
            headers=self.headers,
            params={"limit": limit}
        )
        resp.raise_for_status()
        return resp.json()
```
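One thing the client above leaves out: RapidAPI plans are rate-limited, so a 429 or a transient network blip will raise straight out of your agent loop. A small retry helper (`with_retries` is my own addition, not part of the API) keeps those failures from crashing the agent:

```python
import time


def with_retries(fn, attempts=3, backoff=0.5):
    """Call fn(), retrying on exceptions with exponential backoff.

    Wrap AgentMemory calls with this so rate limits and transient
    network errors are retried instead of bubbling up immediately.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; let the caller handle it
            time.sleep(backoff * (2 ** attempt))
```

Usage: `prefs = with_retries(lambda: memory.get("user_123_prefs"))`. For production you'd likely narrow the `except` to `requests.RequestException` and 429/5xx responses.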
## Step 3: Wire It Into Your Agent
Here's a practical example — a support agent that remembers user preferences across sessions:
```python
import os
from agent_memory import AgentMemory

memory = AgentMemory(
    namespace="support-agent-v1",
    api_key=os.environ["RAPIDAPI_KEY"]
)

def handle_message(user_id, message):
    user_ctx = memory.get(f"user_{user_id}")
    if user_ctx:
        print(f"Welcome back, {user_ctx['value']['name']}!")
        context = user_ctx["value"]
    else:
        context = {"name": "there", "interactions": 0}

    # process_with_llm is your agent's own LLM call (not shown here)
    response = process_with_llm(message, context)

    context["interactions"] = context.get("interactions", 0) + 1
    context["last_message"] = message
    memory.store(f"user_{user_id}", context)
    return response
```
## Step 4: Add Searchable Knowledge
The API includes full-text search powered by SQLite FTS5:
```python
memory.store(
    "learning_001",
    "Users in EST timezone prefer morning follow-ups",
    metadata={"category": "scheduling", "confidence": 0.85}
)
memory.store(
    "learning_002",
    "Billing questions resolve faster with screenshot evidence",
    metadata={"category": "billing", "confidence": 0.92}
)

# Search for relevant knowledge
results = memory.search("billing")
for r in results:
    print(f"  {r['key']}: {r['value']}")
```
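If search results carry the metadata you attached at store time (an assumption worth verifying against the API docs), you can rank or filter them client-side. A minimal sketch, using a hypothetical helper:

```python
def filter_by_confidence(results, min_confidence=0.9):
    """Keep only search results whose stored metadata clears a confidence bar.

    Assumes each result dict carries the `metadata` attached at store time;
    results without metadata (or without a confidence score) are dropped.
    """
    return [
        r for r in results
        if (r.get("metadata") or {}).get("confidence", 0) >= min_confidence
    ]
```

This lets the agent act only on high-confidence learnings while still keeping lower-confidence ones stored for later review.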
## Step 5: Automatic Cleanup with TTL
For temporary state, set a TTL so memories auto-expire:
```python
# Expires in 1 hour — no cron jobs needed
memory.store(
    "session_abc123",
    {"step": 3, "form_data": {"email": "user@example.com"}},
    ttl_seconds=3600
)
```
## Why Not a Vector Database?
| Feature | Agent Memory API | Vector DB (Pinecone, etc.) |
|---|---|---|
| Setup time | 5 minutes | 30+ minutes |
| Exact retrieval | Yes (key-based) | No (approximate) |
| Full-text search | Yes (FTS5) | Needs separate index |
| Cost | Free tier available | $70+/month |
| Auto-expiry (TTL) | Built-in | Manual cleanup |
For agent state (preferences, conversation history, learned rules) you want exact, predictable retrieval, not whatever happens to be semantically similar.
## Available Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/memories | Store a memory |
| POST | /v1/memories/batch | Store up to 100 memories at once |
| GET | /v1/memories/:namespace | List all memories in a namespace |
| GET | /v1/memories/:namespace/:key | Get a specific memory |
| PUT | /v1/memories/:namespace/:key | Update a memory |
| DELETE | /v1/memories/:namespace/:key | Delete a memory |
| POST | /v1/memories/:namespace/search | Full-text search |
| GET | /v1/stats | Usage statistics |
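The batch endpoint is handy when an agent accumulates several learnings per session: one request instead of 100. The payload shape below is an assumption based on the endpoint table (a namespace plus a list of key/value objects); verify the exact schema against the RapidAPI docs before relying on it.

```python
def build_batch_payload(namespace, memories):
    """Build an assumed JSON body for POST /v1/memories/batch.

    `memories` is a dict of key -> value; the endpoint is documented as
    accepting at most 100 memories per request, so we enforce that here.
    """
    if len(memories) > 100:
        raise ValueError("batch endpoint accepts at most 100 memories")
    return {
        "namespace": namespace,
        "memories": [{"key": k, "value": v} for k, v in memories.items()],
    }
```

You'd then POST it with the same headers the `AgentMemory` client uses: `requests.post(f"{base_url}/v1/memories/batch", headers=headers, json=payload)`.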
The Agent Memory API free tier is enough to prototype. Paid tiers start at $9.99/month for production workloads.