Your agent learned 500 things last month. Then the disk died.
This isn't hypothetical — it happened to me. Months of accumulated corrections, preferences, and domain expertise, gone because I treated agent memory like disposable state instead of valuable data.
The Portability Gap
If you're running AI agents that learn over time, you've probably noticed a gap in the tooling:
- Cloud memory APIs — portable, but your agent's knowledge lives on someone else's servers
- Local-only systems — private, but one hardware failure away from zero
- Protocol-based services — great architecture, but not portable across machines
What I wanted was simple: private, portable, auditable, and free.
A Pattern You Already Know
Every developer intuitively understands this separation:
```
package.json   → syncs via git
node_modules/  → rebuilds locally from package.json
```
Agent memory has the same structure:
```
MEMORY.md  → syncs via git (source of truth)
vectordb/  → rebuilds locally from MEMORY.md (derived artifact)
```
Source files are human-readable text — Markdown and JSON. They diff cleanly, version naturally, and merge without conflicts. Vector indexes are binary blobs that depend on your local embedding model. They should never touch git.
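To make the split concrete, here is a hypothetical layout (paths invented for illustration) showing which side of the boundary each artifact lives on:

```
memory-sync/                # git repo: source of truth
├── .gitignore
└── security-auditor/
    └── MEMORY.md           # synced, human-readable

~/agents/security-auditor/  # local workspace
├── MEMORY.md               # copied in on pull
└── vectordb/               # derived artifact, rebuilt locally, never committed
```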
How It Works
The implementation is a single bash script (~200 lines) that does three things:
Push: Local → Git
- Compare workspace files against sync repo by content hash (skip unchanged)
- Copy changed `.md` and `.json` files
- Scan for secrets — abort if API keys found
- Commit and push
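The content-hash comparison at the heart of the push step can be sketched in a few lines of bash. This is not the script itself; the paths and file names here are invented for illustration:

```shell
#!/usr/bin/env bash
# Sketch of the push step's hash-compare-and-copy loop.
# $workspace and $repo stand in for the real agent workspace and sync repo.
set -euo pipefail

workspace=$(mktemp -d); repo=$(mktemp -d)
echo "prefers tabs" > "$workspace/MEMORY.md"   # locally updated memory
echo "old note"     > "$repo/MEMORY.md"        # stale copy in the sync repo

changed=0
for src in "$workspace"/*.md "$workspace"/*.json; do
  [ -e "$src" ] || continue                    # skip unmatched globs
  dst="$repo/$(basename "$src")"
  # Compare by content hash; skip files that haven't changed
  if [ -f "$dst" ] && [ "$(sha256sum < "$src")" = "$(sha256sum < "$dst")" ]; then
    continue
  fi
  cp "$src" "$dst"
  changed=1
done

echo "changed=$changed"
```

A secret scan and `git commit && git push` would follow only when `changed=1`, so unchanged workspaces produce no commits.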
Pull: Git → Local
- `git pull --ff-only` (fails loudly if diverged — protects local changes)
- Copy files to agent workspaces with `600` permissions
- Rebuild vector indexes for agents that received changes
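The copy-with-permissions step can be sketched with coreutils' `install`, which copies and sets the mode in one call. Paths are hypothetical, and the real `git pull --ff-only` is elided here:

```shell
#!/usr/bin/env bash
# Sketch of the pull step's copy phase (after git pull --ff-only succeeds).
set -euo pipefail

repo=$(mktemp -d); workspace=$(mktemp -d)
echo "synced note" > "$repo/MEMORY.md"

# Copy into the agent workspace with owner-only read/write (600)
install -m 600 "$repo/MEMORY.md" "$workspace/MEMORY.md"

stat -c '%a' "$workspace/MEMORY.md"   # prints 600 on Linux
```

Restrictive permissions matter because memory files can contain corrections and preferences you would not want other local users reading.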
Status
Bidirectional drift check — shows files that exist locally but aren't synced, and vice versa.
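A bidirectional drift check is essentially a set difference in both directions, which `comm(1)` handles directly. A minimal sketch, with invented directories and file names:

```shell
#!/usr/bin/env bash
# Sketch of a bidirectional drift check between workspace and sync repo.
set -euo pipefail

local_dir=$(mktemp -d); repo_dir=$(mktemp -d)
touch "$local_dir/a.md" "$local_dir/b.md"    # a.md exists only locally
touch "$repo_dir/b.md"  "$repo_dir/c.md"     # c.md exists only in the repo

local_list=$(cd "$local_dir" && ls *.md | sort)
repo_list=$(cd "$repo_dir" && ls *.md | sort)

echo "local only:"; comm -23 <(echo "$local_list") <(echo "$repo_list")
echo "repo only:";  comm -13 <(echo "$local_list") <(echo "$repo_list")
```

`comm` requires sorted input, hence the `sort`; `-23` suppresses the repo-only and shared columns, `-13` the local-only and shared ones.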
What Gets Synced (and What Doesn't)
| Synced | Not Synced |
|---|---|
| `*.md` — Memory documents | `*.sqlite` — Vector indexes |
| `*.json` — Structured data | `vectordb/` — Embedding stores |
| `.gitignore` | `*.jsonl` — Session transcripts |
| `sync.sh` | `*.key`, `*.pem`, `*.env` |
The .gitignore does the heavy lifting here. Secrets and binary artifacts stay local. Only human-readable source files travel.
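Based on the table above, such a `.gitignore` might look something like this (a sketch, not the repo's actual file):

```gitignore
# Derived binary artifacts: rebuild locally, never commit
*.sqlite
vectordb/

# Session transcripts stay local
*.jsonl

# Secrets never travel
*.key
*.pem
*.env
```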
Multi-Agent, Multi-Machine
This pattern scales naturally. Each agent gets its own directory in the repo:
```
security-auditor/MEMORY.md
growth-engine/MEMORY.md
content-moderator/MEMORY.md
```
Security knowledge doesn't pollute architecture patterns. And because it's just git, syncing across machines is a push/pull away.
The Secret Scanning Part
The scariest thing about syncing agent memory to git is accidentally pushing API keys. The script runs a regex scan before every commit, looking for common key patterns (`sk-`, `ghp_`, `Bearer`, base64 blobs), and aborts if anything looks suspicious.
Not bulletproof, but it catches the obvious mistakes that would otherwise end up in your git history forever.
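A scan like this boils down to one `grep -E` over the files about to be committed. The patterns below are illustrative; the script's actual regexes may differ:

```shell
#!/usr/bin/env bash
# Sketch of a pre-commit secret scan; patterns are examples, not exhaustive.
set -euo pipefail

staged=$(mktemp)
cat > "$staged" <<'EOF'
The agent prefers concise answers.
api_key = "sk-abc123def456"
EOF

# Common credential shapes: OpenAI-style keys, GitHub tokens, bearer headers
pattern='sk-[A-Za-z0-9]{8,}|ghp_[A-Za-z0-9]{16,}|Bearer [A-Za-z0-9._-]+'

if grep -Eq "$pattern" "$staged"; then
  echo "ABORT: possible secret detected" >&2
  scan_status=1          # the real script would abort the commit here
else
  scan_status=0
fi
echo "scan_status=$scan_status"
```

False positives are cheap here: an aborted commit costs seconds, while a leaked key in git history is effectively permanent.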
Try It
The repo is MIT-licensed: github.com/musecl/musecl-memory
It's a bash script, not a framework. No dependencies beyond git. Works with any agent system that stores memory as files — Claude, GPT, local models, whatever.
If you're running agents that accumulate knowledge, I'm curious: how are you handling persistence today? The tooling landscape is still early, and I think there's a lot of room for different approaches.