Start Here: My AI Memory Research So Far

#ai #machinelearning #agents #discuss

I've published 23 articles over the past month. They're not random. They follow one research arc that started with a simple question and ended somewhere harder.

Here's the arc in plain language.

Stage 1 — Does memory survive a reset?

I started by asking whether AI agent memory could preserve useful context after a session ends. The short answer: yes, but only if the memory is structured right. Summary-only memory collapsed. Layered memory with corrections and explicit state held.

Article: The Zero-Budget AI Memory System That Survives Session Resets

Stage 2 — Does correction memory matter?

The next question: what happens when the agent was wrong and needs to update? I tested whether correction memory — explicitly flagging what changed and why — improved recovery. It did. But it also revealed a new problem.

Article: Three Failures My AI Memory System Tested — And the Flaw It Revealed in Itself

Stage 3 — Retrieval accuracy can diverge from safety

When I added real retrieval to the loop, something broke. The agent started finding the right memory and then acting from the wrong one. Retrieval accuracy went up. Safety results went down. That was the turning point.

Article: Higher Retrieval Accuracy Had the Worse Safety Result

Stage 4 — Relevance is not authority

This is the core finding.

A memory can be a perfect semantic match for a request and still be the wrong memory for the agent to obey. Stale instructions, superseded rules, provisional notes — they can all score higher on relevance than the policy that should actually govern the action.

The fix wasn't better retrieval. It was a separate authority pass: policy and correction memories route before retrieval runs.

Article: In This Memory Test, Relevance Wasn't Authority

Stage 5 — Testing it on real setups

The research moved from controlled scenarios to real-world agent instructions. I ran free memory reliability audits on 3 redacted agent setups — AGENTS.md files, CLAUDE.md files, .cursorrules — and reported what the harness found: stale instructions, conflicting rules, missing verification gates.

Article: Testing an AI Memory Reliability Checklist on 3 Redacted Agent Setups

Stage 6 — Making authority math explicit. Then watching it fail.

The next step was turning authority signals into a scoring formula: relevance + authority weight + scope match + specificity + action type + status validity - conflict risk. A governance-adjusted BM25 scorer.

The stress packet result: 4/6, same as the best prior strategy. No improvement.

The held-out packet falsified the improvement claim entirely — plain BM25 outperformed the full scorer 6/6 vs 5/6. That falsification was published as the lead finding of the article, not buried.

Article: I Tried to Turn Agent Memory Authority Into a Scoring Formula. The Held-Out Test Changed the Claim.

Stage 7 — Finding the sensitive memory made it more dangerous

When sensitive memories are stored without authority signals — no governs field, no priority flag, just content — a retrieval system that finds them accurately is worse than one that misses them.

The agent finds the right memory and answers with full confidence. False-certainty errors. Tested across credential packets, PII packets, and industrial safety packets. Target-accurate retrieval produced false-certainty every time on mislabeled memory.

Article: Retrieval Found the Sensitive Memory. That Made It More Dangerous.

Stage 8 — Stop trusting the memory's self-description

The obvious fix is better metadata. The problem: the memory describes itself. A mislabeled memory will pass any check that only reads its own claim.

The gate moved from "what does this memory say it governs" to "what kind of operation is the agent actually about to perform." That caught what the metadata gate missed.

Article: The Gate Was Reading the Memory's Own Lie. Here's What I Built Instead.

Stage 9 — Stop trusting the query too

A query can describe a sensitive operation vaguely. "Take care of the partner setup" sounds routine. The tool call behind it — credential distribution to an external recipient — is not.

The gate moved again: from query-derived operation context to concrete tool-call parameters checked against an external grant table. Expired grants, recipient mismatches, vague queries hiding sensitive operations: all caught. 7/7. Zero false-certainty errors.

Article: The Query Was Still a Lie. The Tool Call Told the Truth.

Stage 10 — The framework as a deployed product

The research became a product: a multi-agent Memory Authority Auditor. Six Cloud Run agents that take any memory or instruction file and return an authority audit report — stale instructions, conflicting rules, missing verification gates, authority hierarchy.

Live web app: memory-authority-auditor-web-992750435781.us-central1.run.app

Article: I Built a Multi-Agent Authority Auditor for AI Memory Files

Stage 11 — What this means for the space

Every major AI memory framework has gotten retrieval right. LangChain, LlamaIndex, MemGPT/Letta, Zep — the tooling is mature. Retrieval is largely solved.

Authorization is not.

Finding the right memory and being authorized to act on it are different objectives. They diverge under adversarial conditions. Most frameworks have no public harness testing that gap. We do.

23 claims. Pre-registered. Falsifications published before the next article drops. Anyone can challenge it.

Article: Retrieval Is Solved. Why Agent Memory Still Isn't Safe.