Every AI agent I work with—Claude Code, Cursor, Hermes, Codex—starts each session with a clean slate. The context window fills up fast, and once it's gone, the agent has zero recollection of what we discussed last week, last hour, or even five minutes ago. I kept losing architectural decisions, preferred libraries, and debugging context. Frustrating.
I looked into existing memory solutions. Some agents have built-in memory, but it's usually tied to a specific model or platform. Others require patching the agent's code—which breaks with every update. None of them worked across the different agents I use daily.
So I built a sidecar. An external memory process that runs next to any agent, reads its data directory, archives sessions, builds a long-term knowledge graph, and injects relevant recall back into future work. No patching. No lock-in. Just a UNIXy process that hooks into the agent's filesystem and does memory management for it.
Memory Sidecar is agent-agnostic by design. It works by monitoring the agent's AGENT_HOME directory, watching for new sessions and file writes. It then processes those sessions into a layered recall system:
- Hot memory: recent session context (fast, kept in RAM)
- Warm memory: recent archives (indexed on disk, fast recall)
- Cold memory: older, consolidated knowledge (vector search or keyword)
- Curated notes: user-defined knowledge that should always be injected (e.g., project conventions, environment docs)
When a new session starts, the sidecar checks the current project and surfaces the most relevant memory chunks. The agent receives them as injected context—like a team member handing you a summary before a meeting. No manual --memory flags, no API calls outside the agent's existing workflow.
The latest release, v3.5.1, is the operational hardening release for the current public architecture. I cleaned up the install flow, removed any server-specific paths or credentials, and made sure AGENT_HOME detection works reliably across different setups. The repository is now fully public and ready for real-world install feedback.
To try it:
git clone https://github.com/mage0535/hermes-memory-installer
cd hermes-memory-installer
export AGENT_HOME=/path/to/your/agent/workspace
./install.sh
That's it. The sidecar starts, monitors, and begins building memory. For CLI agents like Claude Code or Cursor, you'll notice recalls appearing within a few sessions—suggesting the approach you used last time, remembering which library you chose, even surfacing notes you wrote three weeks ago.
I've been running it for a month across four agents. The biggest wins: no more re-explaining preferred libraries, no more losing debugging history, and a noticeable reduction in context window overflow because the agent doesn't need to keep everything in one prompt.
Is it perfect? No. Memory quality improves over time as the cold index grows. For very large repositories, you might want to tune the injection threshold. And it works best with agents that write persistent session logs or maintain a workspace directory. But for a zero-patch, agent-agnostic memory layer, it's been a game-changer for my daily workflow.
If you're tired of your AI forgetting everything between sessions, give Memory Sidecar a look. It's open source (MIT), written in Python 3.9+, and the README walks through the architecture and customization options.
Top comments (0)