We Gave Our AI Agents a Shared Brain. Here's What Happened.
Every AI agent session starts from zero. Claude Code doesn't know what you told it yesterday. Gemini doesn't know what Claude learned. If you switch from Cursor to Claude Code, your new agent has no idea what the old one figured out.
I got tired of repeating myself. So we built a shared memory system and started using it to build mistaike.ai itself. This post is about what actually happened when we tried it — the good, the surprising, and the stuff that broke.
The Setup
mistaike.ai is built by a team of AI agents. Claude coordinates — it plans, writes issues, reviews PRs. Gemini implements — it writes code, runs tests, opens PRs. I brainstorm in Claude Web and do competitive research in ChatGPT.
Before the Memory Vault, each session started with me pasting context. The deploy flow. The git rules. Which model to use for what. The alembic migration rules. The smoke test 2FA pattern. Every. Single. Time.
The context window is big but it's not memory. When the session ends, everything evaporates.
What We Built
The Memory Vault is a persistent, searchable store exposed via MCP. Any agent with access to the mistaike MCP hub can:
- `save_memory` — store a focused piece of knowledge (max 4KB)
- `search_memories` — semantic search via embeddings
- `list_memories` — browse everything (no embeddings needed, works even when Ollama is down)
- `update_memory` / `delete_memory` — maintain the knowledge base
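To make the surface concrete, here's a minimal in-memory sketch of those four operations. This is an illustration, not the real vault: the class name, storage, and error handling are all my own stand-ins, and the real system sits behind an MCP server rather than a local object. Only the 4KB cap comes from the post.

```python
class MemoryVault:
    """In-memory stand-in for the vault's CRUD surface (illustrative only)."""

    MAX_BYTES = 4096  # the 4KB cap on save_memory mentioned above

    def __init__(self):
        self._store = {}
        self._next_id = 0

    def save_memory(self, text):
        # Reject oversized memories before they reach storage.
        if len(text.encode("utf-8")) > self.MAX_BYTES:
            raise ValueError("memory exceeds the 4KB cap")
        self._next_id += 1
        self._store[self._next_id] = text
        return self._next_id

    def list_memories(self):
        # Plain read — no embeddings involved, mirroring list_memories.
        return list(self._store.values())

    def update_memory(self, memory_id, text):
        if memory_id not in self._store:
            raise KeyError(memory_id)
        self._store[memory_id] = text

    def delete_memory(self, memory_id):
        self._store.pop(memory_id)


vault = MemoryVault()
mid = vault.save_memory("RULE: Never hand-write alembic revision IDs")
```

The real `search_memories` adds an embedding lookup on top of this; everything else is ordinary CRUD.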
Every write passes through our DLP pipeline. If an agent tries to save a database password or API key, it gets blocked before it ever hits storage.
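A DLP gate like this can be as simple as a set of deny patterns checked before the write. The patterns below are my own illustrative guesses — the actual pipeline and its rules are not public — but they show the shape: scan, and block on any match.

```python
import re

# Illustrative patterns only; the real DLP rule set is internal.
SECRET_PATTERNS = [
    re.compile(r"postgres(?:ql)?://\S+:\S+@\S+"),      # DB connection string with credentials
    re.compile(r"(?i)api[_-]?key\s*[:=]\s*\S{16,}"),   # generic API-key assignment
    re.compile(r"AKIA[0-9A-Z]{16}"),                   # AWS access-key-ID shape
]

def dlp_scan(text: str) -> bool:
    """Return True if the text looks safe to store, False if blocked."""
    return not any(p.search(text) for p in SECRET_PATTERNS)

assert dlp_scan("PROCESS: deploy via push to main")
assert not dlp_scan("postgresql://app:hunter2@db.internal/prod")
```

Running the scan server-side, before storage, means even a misbehaving agent can't persist a secret.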
How We Actually Use It
Today, we migrated our entire operational knowledge base from local markdown files into the vault. 18 files became 40+ focused memories. Here's the taxonomy:
| Prefix | What goes in | Example |
|---|---|---|
| `RULE:` | Non-negotiable guardrails | "Never hand-write alembic revision IDs" |
| `PROCESS:` | How-to workflows | "Deploy pipeline: push main, build, UAT, smoke, tag, prod" |
| `INFRA:` | Infrastructure facts | "Hetzner server: 65.21.x.x, 4 CI runners, SSH user claude" |
| `BUG-PATTERN:` | Learned from incidents | "PBKDF2 key format: 44-byte b64, never blindly decode to 32" |
| `FEEDBACK:` | User corrections | "Never use --admin on merge, bypasses all CI gates" |
| `BOOTSTRAP:` | Session startup | "On session start, search_memories for guardrails first" |
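A taxonomy like this is easy to enforce at write time. Here's a small sketch of such a check — the function name and the exact enforcement policy are my own; the prefixes and the 500-character rule cap come from the taxonomy above.

```python
VALID_PREFIXES = ("RULE:", "PROCESS:", "INFRA:", "BUG-PATTERN:", "FEEDBACK:", "BOOTSTRAP:")
RULE_CHAR_CAP = 500  # rules stay under 500 characters

def classify(memory: str) -> str:
    """Return the taxonomy category of a memory, or raise if it doesn't conform."""
    for prefix in VALID_PREFIXES:
        if memory.startswith(prefix):
            if prefix == "RULE:" and len(memory) > RULE_CHAR_CAP:
                raise ValueError("RULE memories must stay under 500 characters")
            return prefix.rstrip(":")
    raise ValueError("memory lacks a taxonomy prefix")

print(classify("RULE: Never hand-write alembic revision IDs"))  # → RULE
```

Rejecting unprefixed writes keeps the vault browsable: `list_memories` output groups naturally by category.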
Each memory is focused on one topic, under 500 characters for rules. That's the key insight: small chunks dramatically outperform large documents in semantic search.
We tested this. A 340-character focused rule scored 0.57 similarity for its target query. A 1.5KB multi-topic document containing the same information scored 0.25. The big document contains the answer, but the embedding is diluted by everything else in it.
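The dilution effect falls straight out of how cosine similarity works. Here's a toy sketch with made-up 3-dimensional "embeddings" (one axis per topic) — the numbers are fabricated for illustration and won't match the 0.57/0.25 scores above, but the shape of the result is the same:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings": one axis per topic.
query     = [1.0, 0.0, 0.0]  # asks about topic A
focused   = [0.9, 0.1, 0.0]  # small memory about topic A only
multi_doc = [0.4, 0.5, 0.6]  # big document mixing topics A, B, C

print(round(cosine(query, focused), 2))    # → 0.99
print(round(cosine(query, multi_doc), 2))  # → 0.46
```

Both vectors "contain" topic A, but the multi-topic one points partly at B and C, so it scores lower against an A-shaped query. That's the dilution.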
The Real Test: Four Agents, One Brain
The vault is accessible from:
- Claude Code (my terminal) — coordinates, plans, reviews
- Gemini CLI (the executor) — writes code, opens PRs
- Claude Web (claude.ai) — brainstorming, design iteration
- ChatGPT (developer mode) — competitive intelligence, web research
All four connect to the same MCP hub endpoint. Same user, same memory pool.
When Claude Code saves a memory on Monday morning, Gemini can search for it Monday afternoon. When Gemini discovers a bug pattern during implementation, ChatGPT can find related incidents on the web and save what it finds for everyone.
This sounds obvious. It wasn't, until we built it.
What Surprised Us
Agents don't naturally save. I built the save triggers into CLAUDE.md as a non-negotiable rule: "save_memory immediately when you discover a bug, learn how something works, get corrected by the user, or find something surprising." Without this explicit instruction, the agents would learn things during a session and then let them evaporate.
The DLP pipeline caught real issues. When migrating our operational docs, the DLP scanner blocked a memory that contained a PostgreSQL connection string. It also produced a false positive, flagging the project_id "mcp" as PII. Both are useful data points — one is the system working correctly, the other is a bug we logged.
Embedding indexing is async. A memory saved at 09:21:52 wasn't searchable until ~09:22:00. Not a problem for operational use, but worth knowing if you save-then-immediately-search.
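If you do need save-then-search in one flow, a short poll handles the indexing lag. This is a generic retry sketch, not part of the vault's API — `search_fn` is whatever callable wraps your `search_memories` call:

```python
import time

def search_with_retry(search_fn, query, attempts=5, delay=2.0):
    """Poll a search callable until the freshly saved memory is indexed.

    Indexing is async, so the first attempt or two may come back empty.
    """
    for _ in range(attempts):
        results = search_fn(query)
        if results:
            return results
        time.sleep(delay)  # give the embedding indexer time to catch up
    return []
```

With the ~8-second indexing lag we observed, five attempts at a 2-second delay is comfortable headroom.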
The offline fallback matters. Our embedding service runs on a dedicated server. When it was briefly down (an HTTP-to-HTTPS redirect misconfiguration), search_memories returned 503 but list_memories still worked fine because it's just a database query. We keep critical rules in local config files as a safety net.
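That degradation path can be made automatic. A sketch, assuming `search_fn` and `list_fn` wrap `search_memories` and `list_memories` respectively (the helper name and the naive keyword scan are my own):

```python
def resilient_lookup(search_fn, list_fn, query):
    """Prefer semantic search; degrade to list-and-scan when embeddings are down."""
    try:
        return search_fn(query)
    except Exception:
        # e.g. a 503 from the embedding service: fall back to a plain
        # database listing plus a crude keyword match over the results.
        terms = query.lower().split()
        return [m for m in list_fn() if any(t in m.lower() for t in terms)]
```

It's not semantic search, but for a vault of dozens of memories a keyword scan over `list_memories` is usually good enough to keep a session moving.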
What Changed
Before: CLAUDE.md was 475 lines. Every session loaded the entire operational manual into context. Gemini got a 150-line GEMINI.md per repo. Context was duplicated, stale, and per-agent.
After: CLAUDE.md is 90 lines — just enough to bootstrap. "Search the vault for everything else." GEMINI.md files are 50 lines each. The vault is the brain. Everyone reads from the same source. When something changes, you update one memory and every agent sees it.
The workflow now looks like this:

```text
Session starts
  → search_memories("bootstrap session startup")
  → search_memories("coordinator rules non-negotiable")
  → search_memories for whatever the user's first task is about

During work
  → search_memories before implementing anything
  → save_memory after discovering anything

Session ends
  → Knowledge persists. Next session picks up where this one left off.
```
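The startup half of that flow is simple enough to sketch as a function. `search_memories` here stands for whatever callable the agent uses to hit the vault; the query strings are the ones from the workflow, but the function itself is my illustration, not anything the agents literally run:

```python
def bootstrap_session(search_memories):
    """Gather startup context by running the bootstrap searches in order."""
    startup_queries = (
        "bootstrap session startup",
        "coordinator rules non-negotiable",
    )
    context = []
    for query in startup_queries:
        # Each search pulls a slice of persistent knowledge into this session.
        context.extend(search_memories(query))
    return context
```

The task-specific third search happens after this, once the agent knows what the user actually wants.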
The Honest Part
This is early. We have 40 memories. The search quality depends on how well you chunk your knowledge (small + focused = good, large + multi-topic = bad). There's no automatic staleness detection yet — memories don't expire or self-update. When the embedding service is down, search degrades to list-and-scan, which works but isn't as fast.
But the core insight is working: AI agents are dramatically more useful when they remember things across sessions and across tools. The cost of repeating context is measured in minutes per session, hundreds of sessions per month, across every developer using AI agents. The Memory Vault eliminates that cost.
We're using it to build the product. The product exposes it to you. Same system, same API.
The Memory Vault is part of the mistaike.ai MCP Hub. All statistics in this post reflect our actual usage as of March 2026. The migration described here was performed in a single session.
Originally published on mistaike.ai