Modern AI agents aren't just chatbots with prompt-in, answer-out. To feel coherent and genuinely helpful over time, they need to remember. Agent memory is the capability that lets an agent retain facts, preferences, conversations, and experiences across turns and sessions so every new interaction benefits from the history that came before it.
With memory, agents can personalize responses, maintain context across multi-step tasks, and learn from feedback. This is core to agentic systems, and it's why memory is a first-class feature in SuperOptiX.
SuperOptiX is a full-stack agentic AI framework designed for context and agent engineering with an evaluation-first, optimization-core philosophy. Explore the platform at the SuperOptiX website. The framework's declarative DSL, SuperSpec, lets you describe what you want and have SuperOptiX build the pipeline; learn more at the SuperSpec page and the SuperSpec documentation.
What is Agent Memory?
Conceptually, memory is how agents build "continuity of self." Concretely, it's a combination of mechanisms that store and retrieve useful information:
- Short-term memory: session-scoped working memory and conversation history—what's happening right now and in the last few turns.
- Long-term memory: durable knowledge that persists across sessions—facts, preferences, and patterns the agent should retain.
- Episodic memory: structured records of interactions and events over time—who asked what, what the agent did, and how it turned out.
- Context manager: a discipline for combining global, session, task, and local state into a just-right context sent to the model.
This layered design balances immediacy (short-term), durability (long-term), chronology (episodic), and precision (context management). The result is an agent that feels consistent, learns from experience, and remains efficient.
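To make the layering concrete, here is a minimal, framework-free sketch (plain Python, not the SuperOptiX API; all names are illustrative) of how the three storage layers might be modeled:

```python
import time
from collections import deque

class LayeredMemory:
    """Toy illustration of short-term, long-term, and episodic layers."""

    def __init__(self, short_term_size=10):
        self.short_term = deque(maxlen=short_term_size)  # rolling window of recent turns
        self.long_term = []                              # durable facts and preferences
        self.episodes = []                               # chronological interaction records

    def remember_short(self, content):
        self.short_term.append({"content": content, "ts": time.time()})

    def remember_long(self, content, tags=()):
        self.long_term.append({"content": content, "tags": list(tags)})

    def log_episode(self, events, outcome):
        self.episodes.append({"events": events, "outcome": outcome})

    def recall_long(self, keyword):
        # naive keyword recall; real systems layer semantic search on top
        return [m for m in self.long_term if keyword.lower() in m["content"].lower()]

memory = LayeredMemory(short_term_size=3)
memory.remember_short("User asked about backends")
memory.remember_long("User prefers TypeScript", tags=["preference"])
memory.log_episode(["user_question", "agent_answer"], outcome="success")
print(memory.recall_long("typescript"))
```

The point of the separation is that each layer has a different eviction policy: the short-term deque forgets automatically, long-term storage persists until cleaned up, and episodes accumulate as an audit trail.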
For a deeper conceptual and practical tour, see the Memory System Guide.
How SuperOptiX Memory Works
SuperOptiX provides a powerful, multi-layer memory model you can use via Python, via DSPy adapters configured with JSON-like configs, or declaratively through SuperSpec (YAML).
- Short-term memory captures rolling conversation context and working notes. Use it for ephemeral state and the last N messages.
- Long-term memory persists knowledge with optional semantic search—store guidance ("always return runnable code"), user preferences, and domain facts. Enable embeddings if you want recall by meaning, not just literal keywords.
- Episodic memory tracks episodes and events—great for analytics and learning (e.g., "episode resolved successfully," "user preferred example-based explanations").
- The context manager merges relevant state across scopes to build clean, bounded prompts for the LLM.
Choosing a Memory Backend
Pick the backend that matches your deployment needs:
- file: portable, zero-ops JSON/pickle storage; great for demos and quick local runs.
- sqlite: reliable embedded database; sensible default for most agents.
- redis: networked, high-throughput in-memory store for production workloads.
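The trade-off between the file and sqlite options comes down to atomicity and concurrent access. The stdlib-only sketch below (not SuperOptiX's actual backend classes; `FileStore` and `SQLiteStore` are hypothetical names) shows the shape of both behind the same save/load interface:

```python
import json
import os
import sqlite3
import tempfile

class FileStore:
    """Toy file backend: one JSON document on disk; simple, but no transactions."""
    def __init__(self, path):
        self.path = path

    def save(self, key, value):
        data = {}
        if os.path.exists(self.path):
            with open(self.path) as f:
                data = json.load(f)
        data[key] = value
        with open(self.path, "w") as f:
            json.dump(data, f)

    def load(self, key):
        with open(self.path) as f:
            return json.load(f).get(key)

class SQLiteStore:
    """Toy sqlite backend: a key-value table with real transactional writes."""
    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")

    def save(self, key, value):
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

    def load(self, key):
        row = self.conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return row[0] if row else None

tmp = tempfile.mkdtemp()
file_store = FileStore(os.path.join(tmp, "memory.json"))
file_store.save("pref", "TypeScript")
db_store = SQLiteStore(os.path.join(tmp, "memory.db"))
db_store.save("pref", "TypeScript")
print(file_store.load("pref"), db_store.load("pref"))
```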
Use Memory from Python (Public API)
Below are usage-only examples for working with memory in your own Python code.
```python
from superoptix.memory import AgentMemory, FileBackend, SQLiteBackend
# RedisBackend is also available if you install and configure redis

# Create an agent memory (defaults to SQLite)
memory = AgentMemory(agent_id="writer-assistant")

# Short-term: store ephemeral context
memory.remember("User prefers TypeScript", memory_type="short", ttl=3600)

# Long-term: store durable knowledge with categories/tags
memory.remember(
    "Always provide runnable code snippets",
    memory_type="long",
    category="authoring_guidelines",
    tags=["writing", "code", "quality"]
)

# Recall (semantic search if embeddings are enabled)
results = memory.recall("runnable code", memory_type="long", limit=5)
for r in results:
    print(r["content"])

# Track an interaction episode with events
episode_id = memory.start_interaction({"user_id": "alice"})
memory.add_interaction_event("user_question", "How to configure memory backends?")
# ... generate your response ...
memory.end_interaction({"success": True})

# Introspection and housekeeping
print(memory.get_memory_summary())
memory.cleanup_memory()

# Explicit backend selection
file_memory = AgentMemory("file-demo", backend=FileBackend(".superoptix/memory"))
sqlite_memory = AgentMemory("sqlite-demo", backend=SQLiteBackend(".superoptix/mem.db"))
```
Configure Memory via DSPy Adapters (JSON)
SuperOptiX integrates memory into DSPy-based agents through adapters. You don't need to wire internals—provide a JSON-like configuration dict (or load it from a `.json` file), and the adapter will:
1) retrieve relevant long-term memories for the query,
2) include recent short-term conversation snippets,
3) manage episodes and events,
4) persist useful insights after responses.
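The four steps above can be sketched as a plain function (a hypothetical illustration, not the adapter's internals; keyword matching stands in for semantic retrieval, and episode bookkeeping is reduced to the history list):

```python
def run_with_memory(query, long_term, history, llm):
    """Illustrative memory lifecycle wrapped around one model call."""
    # 1) retrieve relevant long-term memories (naive keyword match)
    words = query.lower().split()
    relevant = [m for m in long_term if any(w in m.lower() for w in words)]
    # 2) include recent short-term conversation snippets
    recent = history[-4:]
    # 3) build the prompt and call the model
    prompt = "\n".join(["Known facts:", *relevant, "Recent turns:", *recent, f"User: {query}"])
    answer = llm(prompt)
    # 4) persist the exchange so future turns see it
    history.append(f"User: {query}")
    history.append(f"Agent: {answer}")
    return answer

# Stub model so the flow runs end to end without an LLM
fake_llm = lambda prompt: "Set memory.enabled: true in your spec."
history = []
long_term = ["memory backends: file, sqlite, redis"]
answer = run_with_memory("Which memory backends exist?", long_term, history, fake_llm)
print(answer)
```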
See DSPy's adapter documentation for background on the adapter pattern.
How DSPy Adapters Integrate with Memory
The DSPy adapter creates a memory-enhanced agent module that automatically handles the complete memory lifecycle:
Memory Initialization: When you create a `DSPyAdapter`, it automatically instantiates an `AgentMemory` system based on your config. The adapter reads the `memory.enabled` and `memory.enable_embeddings` flags to configure the memory system appropriately.
Memory-Enhanced Agent Module: The adapter generates a custom DSPy module (`MemoryEnhancedAgentModule`) that wraps your agent logic with memory operations. This module:
- Starts an interaction episode when processing begins
- Retrieves relevant memories before generating responses
- Stores conversation history and insights after completion
- Manages the complete interaction lifecycle
Context Building Process: Before sending a query to the LLM, the adapter:
- Searches long-term memory for semantically relevant knowledge
- Retrieves recent conversation context from short-term memory
- Merges persona information, relevant memories, and conversation history
- Builds a clean, bounded context string for the model
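The context-building step can be illustrated with a small, self-contained function (hypothetical, not the adapter's code; it uses a character budget where the real system would count tokens):

```python
def build_context(persona, memories, history, max_chars=500):
    """Merge sources into one bounded context string, dropping oldest history first."""
    fixed = [f"Persona: {persona}"] + [f"Fact: {m}" for m in memories]
    lines = list(fixed)
    kept = list(history)
    # trim from the oldest side until the joined context fits the budget
    while kept and len("\n".join(lines + kept)) > max_chars:
        kept.pop(0)
    return "\n".join(lines + kept)

ctx = build_context(
    persona="Helpful coding assistant",
    memories=["User prefers TypeScript"],
    history=["User: hi", "Agent: hello", "User: show me generics"],
    max_chars=120,
)
print(ctx)
```

Dropping the oldest turns first preserves the persona and retrieved facts, which matter most for answer quality, while keeping the prompt within bounds.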
Memory Persistence: After the LLM generates a response, the adapter:
- Stores the Q&A pair in short-term memory for immediate context
- Adds the interaction to the conversation history
- Logs events (user query, agent response) to the episodic memory
- Ends the interaction episode with success/failure metadata
Example JSON config (save as `agent.config.json`):
```json
{
  "llm": {
    "provider": "ollama",
    "model": "llama3.2:1b",
    "api_base": "http://localhost:11434",
    "temperature": 0.2
  },
  "persona": {
    "name": "MemoryDemo",
    "description": "Demonstrates SuperOptiX layered memory"
  },
  "memory": {
    "enabled": true,
    "enable_embeddings": true
  }
}
```
Advanced Memory Configuration
You can fine-tune memory behavior through additional configuration options:
```json
{
  "llm": {
    "provider": "ollama",
    "model": "llama3.2:1b",
    "api_base": "http://localhost:11434"
  },
  "persona": {
    "name": "AdvancedMemoryBot",
    "description": "Advanced memory configuration example"
  },
  "memory": {
    "enabled": true,
    "enable_embeddings": true,
    "short_term_capacity": 200,
    "memory_retrieval": {
      "max_memories": 5,
      "min_similarity": 0.3,
      "include_conversation_history": true
    },
    "episodic_tracking": {
      "auto_start_episodes": true,
      "event_logging": true,
      "outcome_tracking": true
    }
  }
}
```
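To see what `max_memories` and `min_similarity` control, here is a self-contained sketch of threshold-plus-limit retrieval (illustrative only; token-overlap similarity stands in for the embedding similarity the real system would use):

```python
def retrieve(query, memories, max_memories=5, min_similarity=0.3):
    """Rank stored memories by token-overlap (Jaccard) similarity to the query."""
    q = set(query.lower().split())
    scored = []
    for m in memories:
        t = set(m.lower().split())
        sim = len(q & t) / len(q | t) if q | t else 0.0
        if sim >= min_similarity:       # drop weak matches
            scored.append((sim, m))
    scored.sort(reverse=True)           # strongest matches first
    return [m for _, m in scored[:max_memories]]  # cap the count

memories = [
    "user prefers runnable code snippets",
    "project deadline is friday",
    "user prefers short runnable code",
]
print(retrieve("runnable code snippets", memories, max_memories=2, min_similarity=0.2))
```

Raising `min_similarity` trades recall for precision; lowering `max_memories` keeps the prompt lean at the cost of context richness.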
Run with the DSPy adapter
```python
import json
import asyncio
from superoptix.adapters.dspy_adapter import DSPyAdapter
# Or: from superoptix.adapters.observability_enhanced_dspy_adapter import ObservabilityEnhancedDSPyAdapter

with open("agent.config.json", "r") as f:
    config = json.load(f)

adapter = DSPyAdapter(config)
# adapter = ObservabilityEnhancedDSPyAdapter(config)  # for detailed tracing/debugging

async def main():
    result = await adapter.run({
        "query": "Remind me how to enable memory in SuperSpec.",
        "context": {"user_id": "alice"}  # optional context becomes part of the episode
    })
    print(result["result"])
    print("Memory stats:", result.get("memory_stats") or result.get("observability", {}).get("memory_stats"))

asyncio.run(main())
```
Memory Statistics and Monitoring
The adapter returns comprehensive memory statistics with each response:
```python
# Example response structure
{
    "result": "To enable memory in SuperSpec, add the memory section...",
    "episode_id": "ep_12345",
    "memory_stats": {
        "interactions": 15,
        "short_term_items": 8,
        "long_term_items": 42,
        "active_episodes": 1
    }
}
```
Observability-Enhanced Adapter
For production deployments, use the `ObservabilityEnhancedDSPyAdapter`, which provides:
- Detailed memory operation tracing
- Performance metrics for memory operations
- Debug breakpoints for memory inspection
- Integration with external observability tools (MLflow, Langfuse)
Tip: To extend observability, include:
```json
"observability": {
  "debug_mode": false,
  "trace_memory": true,
  "enable_breakpoints": false
}
```
Configure Memory in SuperSpec (YAML)
SuperSpec is SuperOptiX's declarative DSL. You describe your agent, and SuperOptiX compiles it to a runnable DSPy pipeline. Learn about SuperSpec at the SuperSpec page and browse the full SuperSpec documentation.
```yaml
apiVersion: agent/v1
kind: AgentSpec
metadata:
  name: memory-demo
  id: memory_demo
  namespace: demo
  level: genies
spec:
  language_model:
    location: local
    provider: ollama
    model: llama3.1:8b
    api_base: http://localhost:11434
    temperature: 0.7
    max_tokens: 2048
  memory:
    enabled: true
    short_term:
      enabled: true
      max_tokens: 2000
      window_size: 10
    long_term:
      enabled: true
      storage_type: local  # file | sqlite | redis
      max_entries: 500
      persistence: true
    episodic:
      enabled: true
      max_episodes: 100
    context_manager:
      enabled: true
      max_context_length: 4000
      context_strategy: sliding_window
```
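The `sliding_window` strategy keeps only the most recent messages that fit the context budget. A minimal sketch of the idea (illustrative only; using message length in characters where the real setting counts tokens):

```python
def sliding_window_context(messages, max_context_length=4000):
    """Keep the most recent messages that fit within the length budget."""
    window = []
    total = 0
    for msg in reversed(messages):       # walk newest to oldest
        if total + len(msg) > max_context_length:
            break                        # budget exhausted; older turns are dropped
        window.append(msg)
        total += len(msg)
    return list(reversed(window))        # restore chronological order

msgs = ["turn one " * 10, "turn two " * 10, "turn three " * 10]
print(sliding_window_context(msgs, max_context_length=200))
```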
Compile and run with the Super CLI:
```bash
# Ensure a local model is installed (Ollama is the default backend)
super model install llama3.1:8b

# Compile and run the agent
super agent compile memory_demo
super agent run memory_demo --goal "Show me how memory works in SuperOptiX"
```
For a complete overview of the SuperOptiX platform, visit the SuperOptiX website. For a deep dive into memory systems and examples, check out the Memory System Guide.
Practical Patterns and Tips
- Start with sqlite for persistence; use file for simple portability; use redis for high-throughput services.
- Use short-term memory for rolling conversation context; use long-term memory for durable knowledge with categories and tags.
- Treat episodic memory as your analytics backbone: start episodes around conversations/tasks, log events, and end with outcomes.
- Enable embeddings when you need "by-meaning" recall; leave it off to save compute for keyword-only search.
- Periodically call cleanup APIs for long-running services to keep memory lean.
- Use the observability-enhanced adapter for production deployments to monitor memory performance and debug issues.
- Configure appropriate memory retrieval limits to balance context richness with prompt efficiency.
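The cleanup tip amounts to two pruning passes: expire by TTL, then trim to capacity. A self-contained sketch of that policy (hypothetical helper, not the SuperOptiX `cleanup_memory()` implementation):

```python
import time

def cleanup(items, max_items=100, now=None):
    """Drop expired entries, then trim to capacity keeping the newest."""
    now = time.time() if now is None else now
    alive = [i for i in items if i.get("expires_at") is None or i["expires_at"] > now]
    alive.sort(key=lambda i: i["created_at"])
    return alive[-max_items:]  # keep only the newest max_items entries

items = [
    {"content": "old note", "created_at": 1, "expires_at": 50},
    {"content": "fresh note", "created_at": 2, "expires_at": None},
    {"content": "recent fact", "created_at": 3, "expires_at": 200},
]
print(cleanup(items, max_items=2, now=100))
```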