Most AI agents forget everything the moment their session ends. You can fix that with a single API call.
The Agent Memory API gives any AI agent persistent, searchable memory through a REST interface. No vector database. No Redis cluster. No infrastructure to manage. Just HTTP requests.
Here's how to wire it into your agent in 5 minutes.
## The Problem
You're building an AI agent — maybe with LangChain, CrewAI, or a custom loop. It works great in a single session. But restart it and... blank slate. Every conversation starts from zero.
The usual fixes:
- Vector databases (Pinecone, Weaviate) — overkill for structured state, expensive, complex setup
- Redis — fast but ephemeral, no search, another service to manage
- File storage — no concurrency, no search, doesn't scale
What you actually want: a simple key-value store with namespaces, search, and automatic cleanup.
## What We're Building
A Python agent helper that stores and retrieves memories across sessions:
```python
from agent_memory import AgentMemory

memory = AgentMemory(namespace="customer-support-bot")

# Store a memory
memory.store("user_123_prefs", {
    "name": "Sarah",
    "timezone": "America/New_York",
    "preferred_tone": "casual"
})

# Retrieve it later (even after restart)
prefs = memory.get("user_123_prefs")
print(prefs["value"]["name"])  # "Sarah"

# Search across all memories
results = memory.search("casual tone")
```
## Step 1: Get Your API Key

Subscribe to the Agent Memory API on RapidAPI. The free tier includes enough requests to get started.

Copy your `X-RapidAPI-Key` from the dashboard.
## Step 2: Build the Memory Client

Create `agent_memory.py`:
```python
import requests
import json
import os


class AgentMemory:
    def __init__(self, namespace, api_key=None):
        self.namespace = namespace
        self.base_url = "https://agent-memory-api.p.rapidapi.com"
        self.headers = {
            "X-RapidAPI-Key": api_key or os.environ["RAPIDAPI_KEY"],
            "X-RapidAPI-Host": "agent-memory-api.p.rapidapi.com",
            "Content-Type": "application/json"
        }

    def store(self, key, value, ttl_seconds=None, metadata=None):
        """Store a memory. Value can be any JSON-serializable object."""
        payload = {
            "namespace": self.namespace,
            "key": key,
            "value": json.dumps(value) if not isinstance(value, str) else value,
        }
        if ttl_seconds:
            payload["ttl_seconds"] = ttl_seconds
        if metadata:
            payload["metadata"] = metadata
        resp = requests.post(
            f"{self.base_url}/v1/memories",
            headers=self.headers,
            json=payload
        )
        resp.raise_for_status()
        return resp.json()

    def get(self, key):
        """Retrieve a specific memory by key. Returns None if not found."""
        resp = requests.get(
            f"{self.base_url}/v1/memories/{self.namespace}/{key}",
            headers=self.headers
        )
        if resp.status_code == 404:
            return None
        resp.raise_for_status()
        result = resp.json()
        # Objects were serialized to JSON strings on store; decode them back.
        try:
            result["value"] = json.loads(result["value"])
        except (json.JSONDecodeError, TypeError):
            pass
        return result

    def search(self, query, limit=10):
        """Full-text search across all memories in this namespace."""
        resp = requests.post(
            f"{self.base_url}/v1/memories/{self.namespace}/search",
            headers=self.headers,
            json={"query": query, "limit": limit}
        )
        resp.raise_for_status()
        return resp.json().get("results", [])

    def delete(self, key):
        """Delete a specific memory."""
        resp = requests.delete(
            f"{self.base_url}/v1/memories/{self.namespace}/{key}",
            headers=self.headers
        )
        resp.raise_for_status()
        return resp.json()

    def list_all(self, limit=50):
        """List all memories in this namespace."""
        resp = requests.get(
            f"{self.base_url}/v1/memories/{self.namespace}",
            headers=self.headers,
            params={"limit": limit}
        )
        resp.raise_for_status()
        return resp.json()
```
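One thing the client above leaves out: RapidAPI plans are rate-limited, so a 429 or a transient network blip will raise straight out of your agent loop. A small retry helper (`with_retries` is my own addition, not part of the API) keeps those failures from crashing the agent:

```python
import time


def with_retries(fn, attempts=3, backoff=0.5):
    """Call fn(), retrying on exceptions with exponential backoff.

    Wrap AgentMemory calls with this so rate limits and transient
    network errors are retried instead of bubbling up immediately.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; let the caller handle it
            time.sleep(backoff * (2 ** attempt))
```

Usage: `prefs = with_retries(lambda: memory.get("user_123_prefs"))`. For production you'd likely narrow the `except` to `requests.RequestException` and 429/5xx responses.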
## Step 3: Wire It Into Your Agent
Here's a practical example — a support agent that remembers user preferences across sessions:
```python
import os
from agent_memory import AgentMemory

memory = AgentMemory(
    namespace="support-agent-v1",
    api_key=os.environ["RAPIDAPI_KEY"]
)

def handle_message(user_id, message):
    user_ctx = memory.get(f"user_{user_id}")
    if user_ctx:
        print(f"Welcome back, {user_ctx['value']['name']}!")
        context = user_ctx["value"]
    else:
        context = {"name": "there", "interactions": 0}

    # process_with_llm is your agent's own LLM call (not shown here)
    response = process_with_llm(message, context)

    context["interactions"] = context.get("interactions", 0) + 1
    context["last_message"] = message
    memory.store(f"user_{user_id}", context)
    return response
```
## Step 4: Add Searchable Knowledge
The API includes full-text search powered by SQLite FTS5:
```python
memory.store(
    "learning_001",
    "Users in EST timezone prefer morning follow-ups",
    metadata={"category": "scheduling", "confidence": 0.85}
)
memory.store(
    "learning_002",
    "Billing questions resolve faster with screenshot evidence",
    metadata={"category": "billing", "confidence": 0.92}
)

# Search for relevant knowledge
results = memory.search("billing")
for r in results:
    print(f"  {r['key']}: {r['value']}")
```
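If search results carry the metadata you attached at store time (an assumption worth verifying against the API docs), you can rank or filter them client-side. A minimal sketch, using a hypothetical helper:

```python
def filter_by_confidence(results, min_confidence=0.9):
    """Keep only search results whose stored metadata clears a confidence bar.

    Assumes each result dict carries the `metadata` attached at store time;
    results without metadata (or without a confidence score) are dropped.
    """
    return [
        r for r in results
        if (r.get("metadata") or {}).get("confidence", 0) >= min_confidence
    ]
```

This lets the agent act only on high-confidence learnings while still keeping lower-confidence ones stored for later review.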
## Step 5: Automatic Cleanup with TTL
For temporary state, set a TTL so memories auto-expire:
```python
# Expires in 1 hour — no cron jobs needed
memory.store(
    "session_abc123",
    {"step": 3, "form_data": {"email": "user@example.com"}},
    ttl_seconds=3600
)
```
## Why Not a Vector Database?
| Feature | Agent Memory API | Vector DB (Pinecone, etc.) |
|---|---|---|
| Setup time | 5 minutes | 30+ minutes |
| Exact retrieval | Yes (key-based) | No (approximate) |
| Full-text search | Yes (FTS5) | Needs separate index |
| Cost | Free tier available | $70+/month |
| Auto-expiry (TTL) | Built-in | Manual cleanup |
For agent state (preferences, conversation history, learned rules) you want exact, predictable retrieval, not whatever happens to be semantically similar.
## Available Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/memories | Store a memory |
| POST | /v1/memories/batch | Store up to 100 memories at once |
| GET | /v1/memories/:namespace | List all memories in a namespace |
| GET | /v1/memories/:namespace/:key | Get a specific memory |
| PUT | /v1/memories/:namespace/:key | Update a memory |
| DELETE | /v1/memories/:namespace/:key | Delete a memory |
| POST | /v1/memories/:namespace/search | Full-text search |
| GET | /v1/stats | Usage statistics |
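The batch endpoint is handy when an agent accumulates several learnings per session: one request instead of 100. The payload shape below is an assumption based on the endpoint table (a namespace plus a list of key/value objects); verify the exact schema against the RapidAPI docs before relying on it.

```python
def build_batch_payload(namespace, memories):
    """Build an assumed JSON body for POST /v1/memories/batch.

    `memories` is a dict of key -> value; the endpoint is documented as
    accepting at most 100 memories per request, so we enforce that here.
    """
    if len(memories) > 100:
        raise ValueError("batch endpoint accepts at most 100 memories")
    return {
        "namespace": namespace,
        "memories": [{"key": k, "value": v} for k, v in memories.items()],
    }
```

You'd then POST it with the same headers the `AgentMemory` client uses: `requests.post(f"{base_url}/v1/memories/batch", headers=headers, json=payload)`.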
The Agent Memory API free tier is enough to prototype. Paid tiers start at $9.99/month for production workloads.