MACCHA: Give Claude Code, Antigravity & OpenCode one shared brain — a secure, self-improving assistant with long-term memory

#ai #agents #opensource #productivity

Multi-Agent Continuous Context Harness (MACCHA): give Claude Code, Antigravity and OpenCode one shared brain and turn them into one secure, real-world, self-improving digital assistant with long-term memory.

Every new session, my AI coding agent forgets everything. My stack, my preferences, yesterday's hard-won lessons — gone. I got tired of re-explaining myself ten times a week, so I built a small system that fixes it. It's plain text files, it's MIT-licensed, and it runs on a Chromebook.

I call it MACCHA — Multi-Agent Continuous Context Harness. It just hit v1.0.

The problem

If you use coding agents seriously, you know the tax:

Transient memory. Close the terminal, lose the context.
Repetitive setup. "Use pnpm, not npm." "This repo deploys via Netlify, not git push." Every. Single. Session.
Vendor lock-in. Cloud "memory" features tie your brain to one tool. Switch agents and you start from zero again.

I bounce between three agents — Claude Code, OpenCode, and Antigravity — depending on the task. None of them shared what they knew.

The idea: memory is just files

Instead of a database or a cloud silo, MACCHA stores memory as plain markdown and JSON files in your home directory. Any agent that can read and write files can plug into the same brain.

That's the whole trick. A lesson Antigravity learns on Monday is a file Claude Code reads on Wednesday — in a completely different project.

~/BRAIN/
  AGENTS.md              # project-specific rules
  memanto/               # the memory engine (vector recall + decay)
  learned-lessons/       # curated, categorized lessons
  tms/                   # todo / in-progress / done (shared task state)
  hooks/                 # session-start / session-end automation
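
To make "memory is just files" concrete, here is a minimal sketch of one agent writing a lesson and another reading it back. The `learned-lessons/<category>/<slug>.md` layout and the `write_lesson` / `read_lessons` helpers are assumptions for illustration, not MACCHA's actual code.

```python
from pathlib import Path
import tempfile


def write_lesson(brain: Path, category: str, slug: str, body: str) -> Path:
    """Store one lesson as a markdown file under learned-lessons/<category>/."""
    lesson_dir = brain / "learned-lessons" / category
    lesson_dir.mkdir(parents=True, exist_ok=True)
    path = lesson_dir / f"{slug}.md"
    path.write_text(f"# {slug}\n\n{body}\n", encoding="utf-8")
    return path


def read_lessons(brain: Path, category: str) -> dict[str, str]:
    """Return {slug: file contents} for every lesson in one category."""
    lesson_dir = brain / "learned-lessons" / category
    if not lesson_dir.is_dir():
        return {}
    return {p.stem: p.read_text(encoding="utf-8") for p in sorted(lesson_dir.glob("*.md"))}


if __name__ == "__main__":
    # A temporary directory stands in for ~/BRAIN so the demo touches nothing real.
    with tempfile.TemporaryDirectory() as tmp:
        brain = Path(tmp) / "BRAIN"
        # "Agent A" records the lesson...
        write_lesson(brain, "deploy", "netlify-manual-deploy",
                     "Run `netlify deploy --prod`, then verify the URL with curl.")
        # ...and "agent B" only needs file access to pick it up later.
        for slug, text in read_lessons(brain, "deploy").items():
            print(slug)
            print(text)
```

Nothing here is agent-specific: any tool that can list a directory and read a text file gets the same view.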

Why not an always-on agent?

The popular direction right now is autonomous, 24/7 agents (Hermes, OpenClaw, and friends). They're impressive, but they assume you have compute to burn.

MACCHA goes the other way on purpose:

No background daemon. It only spends compute when you invoke it. Session-start and session-end hooks do the bookkeeping; nothing runs in between.
Weak-hardware friendly. I genuinely run this on a Chromebook (Crostini/Debian). No GPU, no cloud worker.
Human-in-the-loop by default. The agent proposes, you approve. For anything sensitive — financial, personal, irreversible — that's a feature, not a limitation.
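
As a rough sketch of the "hooks, not daemons" idea: two functions that run once each and then exit, sharing task state through a file. The `tms/tasks.json` file name, its shape, and the `session_start` / `session_end` signatures are hypothetical, chosen only to illustrate the pattern.

```python
import json
import tempfile
from pathlib import Path

STATES = ("todo", "in-progress", "done")


def _load(brain: Path) -> dict[str, list[str]]:
    """Read the shared task file, or start with empty lists if it does not exist."""
    path = brain / "tms" / "tasks.json"  # assumed location and format
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {state: [] for state in STATES}


def session_start(brain: Path) -> str:
    """Run once when a session opens: summarize task state for the agent."""
    tasks = _load(brain)
    return "\n".join(f"{s}: {', '.join(tasks[s]) or '(none)'}" for s in STATES)


def session_end(brain: Path, finished: list[str], added: list[str]) -> None:
    """Run once when a session closes: persist changes, then return."""
    tasks = _load(brain)
    for task in finished:
        for state in ("todo", "in-progress"):
            if task in tasks[state]:
                tasks[state].remove(task)
        if task not in tasks["done"]:
            tasks["done"].append(task)
    for task in added:
        if not any(task in tasks[s] for s in STATES):
            tasks["todo"].append(task)
    path = brain / "tms" / "tasks.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(tasks, indent=2), encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        brain = Path(tmp) / "BRAIN"
        session_end(brain, finished=[], added=["fix deploy script"])
        print(session_start(brain))
```

Between the two calls no process is running; the JSON file is the only thing that persists.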

The 7-tier context hierarchy

The risk with "give the agent memory" is context bloat: stuff 50 KB of history into every prompt and the model drowns. MACCHA organizes context into seven isolated tiers so each session loads only what matters, most-critical first:

             ┌─────────────────────────────────────────────────────────┐
             │                      USER REQUEST                       │
             └────────────────────────────┬────────────────────────────┘
                                          │
                                          ▼
                      ┌───────────────────────────────────────┐
                      │    TIER 0: ~/AGENTS.md (Bootstrap)    │
                      └───────────────────┬───────────────────┘
                                          │
                                          ▼
                      ┌───────────────────────────────────────┐
                      │  TIER 1: ~/BRAIN/AGENTS.md (Mandates) │
                      └───────────────────┬───────────────────┘
                                          │
                                          ▼
                      ┌───────────────────────────────────────┐
                      │  TIER 2: ~/.gemini/GEMINI.md (Global) │
                      └───────────────────┬───────────────────┘
                                          │
                                          ▼
        ┌─────────────────────────────────┴─────────────────────────────────┐
        ▼                                 ▼                                 ▼
┌──────────────┐                  ┌──────────────┐                  ┌──────────────┐
│    TIER 3    │                  │    TIER 4    │                  │    TIER 5    │
│   SITUATION  │                  │  LIVE STATE  │                  │ AUTO-IMPROVE │
│    ~/INFO    │                  │ ~/BRAIN/tms  │                  │~/IMPROVEMENT │
└──────────────┘                  └──────────────┘                  └──────────────┘
        │                                 │                                 │
        └─────────────────────────────────┼─────────────────────────────────┘
                                          │
                                          ▼
                      ┌───────────────────────────────────────┐
                      │  TIER 6: ~/BRAIN/learned-lessons/     │
                      └───────────────────┬───────────────────┘
                                          │
                                          ▼
             ┌─────────────────────────────────────────────────────────┐
             │              FULLY CONTEXTUALIZED AI AGENT              │
             └─────────────────────────────────────────────────────────┘

One routing rule keeps it sane: a piece of information lives in exactly one tier. On conflict, the tier nearer the top of the diagram (the lower number) wins. Those top tiers are loaded eagerly; deep knowledge (learned lessons) is indexed and pulled in only when relevant. The agent stays focused instead of re-reading its entire history every turn.
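
The loading rule can be sketched in a few lines. This assumes Tier 0 has the highest priority, and it uses a crude substring match as a stand-in for the real index lookup; the `Tier` class and `build_context` function are hypothetical names, not part of MACCHA.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    number: int                       # 0 is the top of the diagram
    name: str
    facts: dict[str, str] = field(default_factory=dict)
    eager: bool = True                # False: consulted only when relevant


def build_context(tiers: list[Tier], request: str) -> dict[str, str]:
    """Merge tiers top-down; the first tier to define a key keeps it."""
    context: dict[str, str] = {}
    wanted = request.lower()
    for tier in sorted(tiers, key=lambda t: t.number):
        for key, value in tier.facts.items():
            # Deep knowledge is skipped unless the request mentions its key
            # (a placeholder for a proper relevance index).
            if not tier.eager and key.lower() not in wanted:
                continue
            context.setdefault(key, value)  # an earlier (top-most) tier wins
    return context


if __name__ == "__main__":
    tiers = [
        Tier(6, "learned-lessons", {"netlify": "run netlify deploy --prod",
                                    "docker": "prune images weekly"}, eager=False),
        Tier(1, "mandates", {"package_manager": "pnpm"}),
        Tier(2, "global", {"package_manager": "npm", "editor": "vim"}),
    ]
    print(build_context(tiers, "ship the Netlify site"))
```

In the demo, the conflicting `package_manager` entry resolves to the Tier 1 value, and only the lesson the request mentions is pulled in from Tier 6.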

The memory engine: Memanto

The working-memory layer is a small engine called Memanto. It does four things I couldn't get from "just append to a file":

Vector recall — semantic search over past facts and lessons, not keyword grep.
Confidence decay — old, unreinforced memories fade; durable facts stay. Noise doesn't accumulate forever.
Conflict detection — when a new fact contradicts an old one, it flags it instead of silently storing both.
Repo-state-wins verification — if a memory says "the flag is --foo" but the code no longer has it, the code wins. Memory reflects what was true; it gets checked against reality before it's trusted.
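
Three of these behaviors are simple enough to sketch without an embedding model (vector recall is left out). The exponential half-life curve, the 30-day default, and every name below are assumptions for illustration; the post does not specify how Memanto implements any of this.

```python
from dataclasses import dataclass
from pathlib import Path
import tempfile


@dataclass
class Memory:
    value: str
    confidence: float          # 1.0 right after storing or reinforcing
    last_reinforced_day: int   # simple day counter (hypothetical unit)


def decayed_confidence(mem: Memory, today: int, half_life_days: float = 30.0) -> float:
    """Halve confidence every `half_life_days` without reinforcement (assumed curve)."""
    age = max(0, today - mem.last_reinforced_day)
    return mem.confidence * 0.5 ** (age / half_life_days)


def remember(store: dict[str, Memory], key: str, value: str, today: int) -> str:
    """Store, reinforce, or flag a conflict; never keep two values for one key."""
    old = store.get(key)
    if old is None:
        store[key] = Memory(value, 1.0, today)
        return "stored"
    if old.value == value:
        old.confidence, old.last_reinforced_day = 1.0, today
        return "reinforced"
    return "conflict"  # the existing memory is left untouched for a human to resolve


def still_in_repo(token: str, repo: Path) -> bool:
    """Repo-state-wins check: trust a remembered token only if a file still contains it."""
    for path in repo.rglob("*"):
        if path.is_file() and token in path.read_text(encoding="utf-8", errors="ignore"):
            return True
    return False


if __name__ == "__main__":
    store: dict[str, Memory] = {}
    print(remember(store, "cli.flag", "--foo", today=0))
    print(remember(store, "cli.flag", "--bar", today=5))
    print(decayed_confidence(store["cli.flag"], today=30))
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / "cli.py").write_text("parser.add_argument('--bar')\n", encoding="utf-8")
        print(still_in_repo("--foo", repo))
```

Returning `"conflict"` instead of overwriting keeps the human in the loop, and the repo check runs at read time, so a stale memory is caught before it is acted on.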

Show, don't tell

A real example from this week. In one session I taught the agent a deploy quirk: "this site doesn't deploy on git push — you must run netlify deploy --prod and curl-verify the result."

That became one lesson file. Two days later, in a different agent, I asked it to ship a change to that same site. It pulled the lesson, ran the right deploy command, and verified the URL — without me saying a word about Netlify. That's the entire point: learn once, apply everywhere.

Try it

It's MIT-licensed, PII-free, and installs with a single setup script. v1.0 was released this week — fully documented, with hard PII and language gates on the publish pipeline:

👉 https://github.com/KarelTestSpecial/real-agent-setup (release notes)

I'd genuinely love feedback — especially on the hard parts: what do you use for cross-session memory, and how do you handle decay and conflicting facts? That's the part I'm least sure I've gotten right.