musecl

The package.json Pattern for AI Agent Memory

Your agent learned 500 things last month. Then the disk died.

This isn't hypothetical — it happened to me. Months of accumulated corrections, preferences, and domain expertise, gone because I treated agent memory like disposable state instead of valuable data.

The Portability Gap

If you're running AI agents that learn over time, you've probably noticed a gap in the tooling:

  • Cloud memory APIs — portable, but your agent's knowledge lives on someone else's servers
  • Local-only systems — private, but one hardware failure away from zero
  • Protocol-based services — great architecture, but not portable across machines

What I wanted was simple: private, portable, auditable, and free.

A Pattern You Already Know

Every developer intuitively understands this separation:

package.json    → syncs via git
node_modules/   → rebuilds locally from package.json

Agent memory has the same structure:

MEMORY.md       → syncs via git (source of truth)
vectordb/       → rebuilds locally from MEMORY.md (derived artifact)

Source files are human-readable text — Markdown and JSON. They diff cleanly, version naturally, and merge without conflicts. Vector indexes are binary blobs that depend on your local embedding model. They should never touch git.

How It Works

The implementation is a single bash script (~200 lines) that does three things:

Push: Local → Git

  1. Compare workspace files against sync repo by content hash (skip unchanged)
  2. Copy changed .md and .json files
  3. Scan for secrets — abort if API keys found
  4. Commit and push
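The push steps above can be sketched in bash roughly like this (function names, paths, and the abbreviated secret pattern are mine, not the actual script's):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Copy src to dst only when the content hash differs (skip unchanged files)
copy_if_changed() {
  local src="$1" dst="$2"
  if [ ! -f "$dst" ] || [ "$(sha256sum < "$src")" != "$(sha256sum < "$dst")" ]; then
    mkdir -p "$(dirname "$dst")"
    cp "$src" "$dst"
  fi
}

# Push: hash-compare, copy .md/.json, scan for obvious key prefixes, commit
push_memory() {
  local workspace="$1" repo="$2"
  find "$workspace" -type f \( -name '*.md' -o -name '*.json' \) | while read -r src; do
    if grep -qE 'sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36,}' "$src"; then
      echo "possible secret in $src, aborting" >&2
      exit 1
    fi
    copy_if_changed "$src" "$repo/${src#"$workspace"/}"
  done
  git -C "$repo" add -A
  git -C "$repo" commit -m "memory sync $(date -u +%F)" || true  # no-op when clean
  git -C "$repo" push
}
```

The hash comparison keeps the git history quiet: only files whose content actually changed produce a diff.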

Pull: Git → Local

  1. git pull --ff-only (fails loudly if diverged — protects local changes)
  2. Copy files to agent workspaces with 600 permissions
  3. Rebuild vector indexes for agents that received changes
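A rough bash equivalent of the pull path (the helper name and paths are hypothetical; index rebuilding is left as a stub):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Copy synced files into the agent workspace with owner-only permissions
deploy_files() {
  local repo="$1" workspace="$2"
  find "$repo" -type f \( -name '*.md' -o -name '*.json' \) -not -path '*/.git/*' |
  while read -r src; do
    local dst="$workspace/${src#"$repo"/}"
    mkdir -p "$(dirname "$dst")"
    install -m 600 "$src" "$dst"  # copy + chmod 600 in one step
  done
}

pull_memory() {
  local repo="$1" workspace="$2"
  git -C "$repo" pull --ff-only  # fails loudly if histories diverged
  deploy_files "$repo" "$workspace"
  # ...then rebuild vector indexes for agents whose files changed
}
```

The --ff-only flag is the safety catch: a diverged remote stops the pull instead of silently merging over local edits.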

Status

Bidirectional drift check — shows files that exist locally but aren't synced, and vice versa.
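The drift check can be as small as a comm over the two file listings (a sketch; the function name is mine):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Column 1: files only in the workspace; column 2 (indented): only in the repo.
# comm -3 suppresses the third column, i.e. files present on both sides.
status_memory() {
  local workspace="$1" repo="$2"
  comm -3 \
    <(cd "$workspace" && find . -type f \( -name '*.md' -o -name '*.json' \) | sort) \
    <(cd "$repo" && find . -type f \( -name '*.md' -o -name '*.json' \) -not -path './.git/*' | sort)
}
```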

What Gets Synced (and What Doesn't)

Synced:

  • *.md — Memory documents
  • *.json — Structured data
  • .gitignore
  • sync.sh

Not synced:

  • *.sqlite — Vector indexes
  • vectordb/ — Embedding stores
  • *.jsonl — Session transcripts
  • *.key, *.pem, *.env

The .gitignore does the heavy lifting here. Secrets and binary artifacts stay local. Only human-readable source files travel.
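A .gitignore along these lines keeps derived artifacts and secrets local (illustrative, matching the categories above rather than the repo's exact file):

```gitignore
# Derived artifacts: rebuilt locally from the markdown sources
*.sqlite
vectordb/
*.jsonl

# Secrets: never leave the machine
*.key
*.pem
*.env
```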

Multi-Agent, Multi-Machine

This pattern scales naturally. Each agent gets its own directory in the repo:

security-auditor/MEMORY.md
growth-engine/MEMORY.md
content-moderator/MEMORY.md

Security knowledge doesn't pollute architecture patterns. And because it's just git, syncing across machines is a push/pull away.

The Secret Scanning Part

The scariest thing about syncing agent memory to git is accidentally pushing API keys. The script runs a regex scan before every commit: it looks for common key patterns (sk-, ghp_, Bearer, base64 blobs) and aborts if anything looks suspicious.

Not bulletproof, but it catches the obvious mistakes that would otherwise end up in your git history forever.
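As a sketch, that pre-commit scan can boil down to a single grep over the sync repo (the patterns here are illustrative, not the script's exact list):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Returns 0 if the directory looks clean; prints matches and returns 1 otherwise.
# Patterns: OpenAI-style keys, GitHub tokens, Bearer headers, long base64 runs.
scan_for_secrets() {
  local pattern='sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36,}|Bearer [A-Za-z0-9._~+/-]{16,}|[A-Za-z0-9+/]{40,}={0,2}'
  ! grep -rEn --include='*.md' --include='*.json' "$pattern" "$1"
}
```

The base64 pattern is the noisiest of the four; a long hash or minified string will trip it, which is the right failure mode for a scanner whose job is to make you look twice before pushing.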

Try It

The repo is MIT-licensed: github.com/musecl/musecl-memory

It's a bash script, not a framework. No dependencies beyond git. Works with any agent system that stores memory as files — Claude, GPT, local models, whatever.

If you're running agents that accumulate knowledge, I'm curious: how are you handling persistence today? The tooling landscape is still early, and I think there's a lot of room for different approaches.
