Persistent Memory for AI Agents: Running a Memory Sidecar Without Touching Internals

#ai #memory #opensource

I've been using AI agents like Claude Code and Cursor for a while, and the biggest headache has been simple: every new session, the agent starts from zero. It forgets the bug we were debugging, the project conventions I set two days ago, and that weird API quirk we already solved. Each conversation is a fresh slate.

Sure, some agents have built-in memory, but it's often limited to the same session or requires patching the agent itself. I didn't want to modify the internals of tools I depend on. What I needed was a separate process that could watch my sessions, build a knowledge base, and feed relevant context back without touching the agent.

That's why I built Memory Sidecar. It's a production memory system that runs alongside your agent (Hermes, Claude Code, Cursor, Codex, any of them) and gives it memory that persists across sessions. It doesn't patch the agent. It's a sidecar: a completely separate process that shares a data directory with the agent.

How it works

The sidecar monitors the agent's session data (state.db and session files). Every time the agent writes a checkpoint, the sidecar picks it up and processes it through three retrieval layers:
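To make the monitoring step concrete, here is a minimal polling sketch in Python. It is not the project's implementation: the function name `changed_files` and the mtime-plus-size signature are assumptions for illustration, and the real sidecar may rely on filesystem events or read state.db directly.

```python
from pathlib import Path


def changed_files(data_dir: Path, seen: dict[str, tuple[int, int]]) -> list[Path]:
    """Return files whose (mtime_ns, size) signature changed since the last scan.

    `seen` is updated in place, so repeated calls report only new or modified files.
    """
    changed = []
    for path in sorted(data_dir.rglob("*")):
        if not path.is_file():
            continue
        stat = path.stat()
        signature = (stat.st_mtime_ns, stat.st_size)
        if seen.get(str(path)) != signature:
            seen[str(path)] = signature
            changed.append(path)
    return changed
```

Calling this in a loop with a short sleep gives a crude watcher. Polling is used here only because it needs nothing beyond the standard library.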

Hot layer: a small memory tool (5KB cap) for recent, high-fidelity context.
Warm layer: Hindsight PostgreSQL stores summarized session knowledge, so past conversations become searchable.
Cold layer: a knowledge graph with full-text search (FTS5) that tracks entities like people, projects, and recurring issues. Each gets a 'dossier' that accumulates over time.
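As an illustration of how a size-capped hot layer could behave, the sketch below keeps recent notes and evicts the oldest once a byte budget is exceeded. The class name `HotLayer` and the oldest-first eviction policy are assumptions. The article only states that the hot layer has a 5KB cap, not how it enforces it.

```python
from collections import deque

HOT_CAP_BYTES = 5 * 1024  # the 5KB cap from the description above


class HotLayer:
    """Holds recent notes; drops the oldest ones when the byte cap is exceeded."""

    def __init__(self, cap_bytes: int = HOT_CAP_BYTES) -> None:
        self.cap_bytes = cap_bytes
        self.notes: deque[str] = deque()
        self.size = 0  # total UTF-8 bytes currently stored

    def add(self, note: str) -> None:
        encoded_len = len(note.encode("utf-8"))
        if encoded_len > self.cap_bytes:
            raise ValueError("note is larger than the whole hot layer")
        self.notes.append(note)
        self.size += encoded_len
        # Evict from the old end until the layer fits its budget again.
        while self.size > self.cap_bytes:
            oldest = self.notes.popleft()
            self.size -= len(oldest.encode("utf-8"))
```

In a layered design like the one described, evicted notes would typically be summarized into the warm layer rather than discarded. That hand-off is left out here.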

When the agent starts a new session, the sidecar injects relevant context from all three layers into the system prompt. The agent doesn't know it's happening. It just has the memory it needs.
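The injection step can be sketched as a function that merges snippets from each layer into the system prompt under a size budget. Everything here is hypothetical: the function name `build_system_prompt`, the section headings, the character budget, and the hot-then-warm-then-cold priority are assumptions, not the sidecar's actual format.

```python
def build_system_prompt(
    base: str, layers: dict[str, list[str]], budget_chars: int = 4000
) -> str:
    """Append recalled snippets to `base`, visiting layers in priority order."""
    sections = []
    remaining = budget_chars
    for name in ("hot", "warm", "cold"):
        kept = []
        for snippet in layers.get(name, []):
            cost = len(snippet) + 3  # "- " prefix plus a newline
            if cost > remaining:
                break  # this layer has used up what is left of the budget
            kept.append(f"- {snippet}")
            remaining -= cost
        if kept:
            sections.append(f"## {name} memory\n" + "\n".join(kept))
    if not sections:
        return base
    return base + "\n\n# Recalled context\n" + "\n\n".join(sections)
```

Visiting the hot layer first means the most recent context survives when the budget is tight, at the cost of older knowledge being dropped first.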

What it does in practice
Three things: archives sessions to permanent knowledge so nothing is lost, recalls what matters via layered retrieval, and tracks important topics automatically. For example, if I keep hitting the same Python import error across sessions, the sidecar builds a dossier on it. Next time, the agent sees that history and can jump straight to the solution.
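The dossier idea can be illustrated with SQLite's FTS5 extension through Python's standard `sqlite3` module. This is a simplified sketch, not the project's schema: the table name `dossier`, its two columns, and the helper names are assumptions, a flat table stands in for the knowledge graph, and it only runs if the SQLite build that Python links against includes FTS5.

```python
import sqlite3


def open_dossiers(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with one full-text-indexed table of dossier notes."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS dossier USING fts5(entity, note)"
    )
    return conn


def record(conn: sqlite3.Connection, entity: str, note: str) -> None:
    """Append one observation to an entity's dossier."""
    conn.execute("INSERT INTO dossier (entity, note) VALUES (?, ?)", (entity, note))
    conn.commit()


def recall(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return the best-matching (entity, note) rows for a plain-word query."""
    rows = conn.execute(
        "SELECT entity, note FROM dossier WHERE dossier MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

Each time the same import error shows up, `record` adds another note under the same entity, and a later `recall` surfaces the accumulated history. Queries should be plain words, because characters such as `:` or `-` have special meaning in FTS5 query syntax.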

Why I like this approach
It's agent-agnostic. I can switch between Hermes and Claude Code, and the memory persists. I don't have to maintain patches for each agent's internals. The setup is minimal: run the sidecar, point it to your session data directory, and it works. The latest release (v3.1.1) adds automatic memory watermark detection and periodic snapshot backups, plus a cleaner onboarding guide for other agents.

Is it for everyone?
Probably not if you only use short, throwaway sessions. But if you have ongoing projects with complex context, or you're building agents that need to learn over time, this sidecar model saves a lot of friction. It complements agent-specific memory rather than replacing it.

I've been running it for a few weeks now, and the difference is noticeable. I don't have to repeat myself, and the agent seems to 'get' my project patterns faster. It's open source under MIT, and you can find the full code and architecture docs here: https://github.com/mage0535/hermes-memory-installer

Feel free to adapt it to your own agents. The architecture is documented, and the three-layer retrieval pattern is straightforward to extend. If you've been wrestling with agent amnesia, give it a try.

DEV Community