Memory Sidecar: Persistent, Tiered Memory for AI Agents Without Touching Internals

#ai #memory #opensource

Every AI agent I've used shares a frustrating flaw: no long-term memory. You build context over a session—project details, preferences, recurring issues—and the moment you restart, it's all gone. Start fresh, re-explain everything.

Sure, you can hack around it. Embed a vector database, stuff a summary into the system prompt, or write custom tools that query external storage. But each approach is fragile, agent-specific, and a maintenance burden.

I needed something that worked across Hermes, Claude Code, Cursor, and Codex without modifying their internals. So I built Memory Sidecar.

What It Is

Memory Sidecar runs as a separate process alongside your agent. It watches session outputs, archives important content, and feeds relevant context back when needed. No agent patching—it communicates through a shared data directory and optionally via MCP.

The Three-Layer Architecture

The key insight is that memory has different latency and persistence needs. Memory Sidecar uses three tiers:

Hot Layer — A fast, ephemeral memory tool with a ~5KB cap. The agent calls it directly for immediate context. Low latency, zero persistence.
Warm Layer — A PostgreSQL-backed service (“hindsight”) that stores enriched memories with metadata. Provides medium-term recall with semantic search.
Cold Layer — A knowledge graph combined with SQLite FTS5 full-text search. This is the long-term archive, storing facts, people, projects, and recurring patterns as a graph.

When your agent needs memory, the sidecar merges results from all three layers into a single, coherent context injection. It balances freshness, relevance, and token cost automatically.

Getting Started

Requirements: Python 3.9+ and PostgreSQL (for the warm layer). Clone the repository, run the setup script, and point your agent to the shared data directory. Integrations are ready for Hermes, Claude Code, Cursor, and Codex, with an onboarding guide for other agents.

What v3.1.1 Adds

The latest release brings automatic memory watermark detection (memory_watermark.py) to keep storage lean, periodic snapshot backups (memory_snapshot_backup.py) for safety, hardened daemon implementations, and a cleaner configuration pipeline that removes hardcoded tokens.

I've been using Memory Sidecar daily with Claude Code and Cursor. It remembers my codebase conventions, test patterns, and ongoing discussions across sessions. No more repeating myself.

If your AI agents keep forgetting what you've already told them, give Memory Sidecar a try. It's MIT-licensed and open source.

GitHub: https://github.com/mage0535/hermes-memory-installer

Star if you find it useful — it helps others discover the project.