<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: gregor</title>
    <description>The latest articles on DEV Community by gregor (@plur9).</description>
    <link>https://dev.to/plur9</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3849560%2Fba4d90a4-9ae4-4131-a919-4af0f270ba30.png</url>
      <title>DEV Community: gregor</title>
      <link>https://dev.to/plur9</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/plur9"/>
    <language>en</language>
    <item>
      <title>Should AI Memory Be Stored as Open Engrams or Baked Into Model Weights?</title>
      <dc:creator>gregor</dc:creator>
      <pubDate>Thu, 02 Jul 2026 19:41:20 +0000</pubDate>
      <link>https://dev.to/plur9/should-ai-memory-be-stored-as-open-engrams-or-baked-into-model-weights-5h04</link>
      <guid>https://dev.to/plur9/should-ai-memory-be-stored-as-open-engrams-or-baked-into-model-weights-5h04</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The short answer:&lt;/strong&gt; AI agent memory should be stored as open, external&lt;br&gt;
engrams — not baked into model weights — whenever the memory must be&lt;br&gt;
inspectable, correctable, deletable, or portable across tools. Parametric&lt;br&gt;
memory (knowledge baked into model weights through fine-tuning or continual&lt;br&gt;
training) is faster at inference and can be more token-efficient, but it&lt;br&gt;
sacrifices auditability: you cannot read what the model knows, you cannot fix&lt;br&gt;
a single wrong fact without retraining, and you cannot prove that deleted&lt;br&gt;
knowledge is actually gone. For agent memory — corrections, preferences,&lt;br&gt;
conventions, procedures — the properties that matter (readability,&lt;br&gt;
reversibility, erasure, portability) are properties that weights cannot&lt;br&gt;
provide.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The problem: agents forget what they learn
&lt;/h2&gt;

&lt;p&gt;Every AI agent starts each session with amnesia. You correct its coding style&lt;br&gt;
on Monday. On Tuesday, it makes the same mistake. You explain your&lt;br&gt;
architecture in Cursor. That night, Claude Code has no idea. The context&lt;br&gt;
window resets. The conversation is gone. The model weights have not changed.&lt;/p&gt;

&lt;p&gt;There are two fundamentally different approaches to solving this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Parametric memory&lt;/strong&gt; — bake the knowledge into the model itself through
fine-tuning or continual training. The model's weights &lt;em&gt;become&lt;/em&gt; the memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-parametric (external) memory&lt;/strong&gt; — store knowledge outside the model in
a structured format (engrams, vectors, knowledge graphs) and retrieve it at
inference time. The model stays unchanged; the memory is a separate layer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is not a new debate. The retrieval-augmented generation (RAG) literature&lt;br&gt;
has explored the tension between parametric knowledge (stored in weights) and&lt;br&gt;
non-parametric knowledge (stored in external databases) since 2020. A 2023&lt;br&gt;
survey of RAG (Gao et al., "Retrieval-Augmented Generation for Large Language&lt;br&gt;
Models: A Survey," &lt;a href="https://arxiv.org/abs/2312.10997" rel="noopener noreferrer"&gt;arXiv:2312.10997&lt;/a&gt;) frames&lt;br&gt;
the distinction clearly: LLMs "showcase impressive capabilities but encounter&lt;br&gt;
challenges like hallucination, outdated knowledge, and non-transparent,&lt;br&gt;
untraceable reasoning processes." RAG addresses this by incorporating&lt;br&gt;
knowledge from external databases, allowing "continuous knowledge updates and&lt;br&gt;
integration of domain-specific information" without retraining.&lt;/p&gt;

&lt;p&gt;Agent memory is the same tradeoff, applied to a harder problem: not just facts,&lt;br&gt;
but corrections, preferences, procedures, and conventions that accumulate over&lt;br&gt;
time and across sessions.&lt;/p&gt;
&lt;h2&gt;
  
  
  Parametric memory: fast but opaque
&lt;/h2&gt;

&lt;p&gt;When you fine-tune a model on domain knowledge — or continually retrain it on&lt;br&gt;
user context (Notion, Slack, GitHub) — the knowledge becomes part of the&lt;br&gt;
model's weights. At inference time, recall is fast: no retrieval step, no&lt;br&gt;
external database, no latency from searching. The model just "knows."&lt;/p&gt;

&lt;p&gt;This approach — sometimes called &lt;strong&gt;model-native memory&lt;/strong&gt; — has real&lt;br&gt;
advantages. Retrieval adds latency and can fail (wrong document retrieved,&lt;br&gt;
irrelevant context injected). A 2024 paper on Corrective RAG (Yan et al.,&lt;br&gt;
&lt;a href="https://arxiv.org/abs/2401.15884" rel="noopener noreferrer"&gt;arXiv:2401.15884&lt;/a&gt;) noted that RAG "relies&lt;br&gt;
heavily on the relevance of retrieved documents, raising concerns about how&lt;br&gt;
the model behaves if retrieval goes wrong." When memory is in the weights,&lt;br&gt;
there is no retrieval step to go wrong.&lt;/p&gt;

&lt;p&gt;But parametric memory has structural problems that fine-tuning cannot solve:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You cannot inspect what the model knows.&lt;/strong&gt; A fine-tuned model is a matrix&lt;br&gt;
of billions of numbers. There is no entry for "the deploy key is at&lt;br&gt;
~/.config/deploy" — that fact is distributed across weights in a way no one&lt;br&gt;
can read, diff, or audit. You cannot open a file and check what the model&lt;br&gt;
remembers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You cannot correct a single wrong fact.&lt;/strong&gt; If the model learned something&lt;br&gt;
wrong during fine-tuning, you cannot edit one entry. You must retrain —&lt;br&gt;
expensive, slow, and itself error-prone. Fine-tuning to &lt;em&gt;remove&lt;/em&gt; a fact&lt;br&gt;
(machine unlearning) is an active research problem with no production-ready&lt;br&gt;
solution.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You cannot prove erasure.&lt;/strong&gt; GDPR's right to erasure (Article 17, the&lt;br&gt;
"right to be forgotten") obliges you to delete personal data on request and to&lt;br&gt;
be able to show that you did. When knowledge is in weights, you cannot show it&lt;br&gt;
is gone. You can retrain from scratch (prohibitively expensive) or attempt&lt;br&gt;
machine unlearning (unproven). With external engrams, deletion is simple:&lt;br&gt;
remove the entry, along with any index, cache, or backup copy of it. The&lt;br&gt;
deletion can be verified because the fact was never in the weights to begin&lt;br&gt;
with.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Catastrophic forgetting.&lt;/strong&gt; Continual training on new knowledge can degrade&lt;br&gt;
older knowledge, the well-documented catastrophic forgetting problem in&lt;br&gt;
neural networks. New updates overwrite weights that encoded what the model&lt;br&gt;
knew before. External memory does not forget unless you tell it to (via&lt;br&gt;
decay functions), and even then the decay is gradual and reversible.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Vendor lock-in.&lt;/strong&gt; Memory baked into a specific model's weights is locked&lt;br&gt;
to that model. Switch from GPT-4 to Claude, and the memory is gone — the&lt;br&gt;
weights do not transfer. External memory is model-agnostic: the same&lt;br&gt;
engrams work with any LLM.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Non-parametric memory: open and inspectable
&lt;/h2&gt;

&lt;p&gt;External memory stores knowledge outside the model in a structured format.&lt;br&gt;
The &lt;strong&gt;open engram format&lt;/strong&gt; (defined in the &lt;a href="https://plur.ai/spec.html" rel="noopener noreferrer"&gt;Engram&lt;br&gt;
Specification&lt;/a&gt;, Apache-2.0) represents each learned&lt;br&gt;
fact as a human-readable YAML entry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ENG-2026-0702-001&lt;/span&gt;
&lt;span class="na"&gt;statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;API&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;rate&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;limit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;is&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;100&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;req/min,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;not&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;1000."&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;behavioral&lt;/span&gt;
&lt;span class="na"&gt;scope&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;project:api-gateway&lt;/span&gt;
&lt;span class="na"&gt;provenance&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;session&lt;/span&gt;
  &lt;span class="na"&gt;observed_at&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2026-07-02&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This format has five properties that parametric memory cannot match:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Inspectable&lt;/strong&gt; — you can read, diff, and version every engram. It is a&lt;br&gt;
file, not a number. An operator can open the file and see exactly what the&lt;br&gt;
agent has learned.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Instantly correctable&lt;/strong&gt; — fix a single fact mid-conversation by editing&lt;br&gt;
one entry. No retraining. The correction takes effect on the next recall.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Provably deletable&lt;/strong&gt; — delete the entry and the memory is gone,&lt;br&gt;
demonstrably. This is the basis for real (not best-effort) erasure — the&lt;br&gt;
foundation of GDPR-grade compliance. You cannot prove erasure from model&lt;br&gt;
weights.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Portable&lt;/strong&gt; — engrams move across agents, tools, and machines. A&lt;br&gt;
correction made in Claude Code is available to Cursor, Hermes, or OpenClaw&lt;br&gt;
the next time the agent starts. Memory follows the operator, not the vendor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Auditable at scale&lt;/strong&gt; — for enterprise and institutional buyers, external&lt;br&gt;
memory can carry a verifiable record of who wrote a fact and who used it.&lt;br&gt;
PLUR Enterprise implements this today as a tamper-evident, hash-chained&lt;br&gt;
audit log (each entry cryptographically linked to the one before it, so&lt;br&gt;
altering history breaks the chain), plus a per-engram view of both&lt;br&gt;
provenance and recall history — who read this fact, when, via which tool.&lt;br&gt;
It is a real foundation for institutional-grade accountability; we will go&lt;br&gt;
deeper on it in a future piece.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
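
&lt;p&gt;To make the hash-chained audit log from point 5 concrete, here is a minimal Python sketch of the general technique. It is not PLUR Enterprise's implementation: the record fields (actor, action, engram_id) and the function names are assumptions chosen for illustration.&lt;/p&gt;

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash, event):
    # Hash the previous record's hash together with a canonical form of the event.
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log, event):
    """Append an event, linking it to the hash of the record before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify_chain(log):
    """Return True only if every record still matches its stored hash and link."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True


# Hypothetical events: one write and one recall of the same engram.
log = []
append_event(log, {"actor": "alice", "action": "write", "engram_id": "ENG-2026-0702-001"})
append_event(log, {"actor": "cursor", "action": "recall", "engram_id": "ENG-2026-0702-001"})
print(verify_chain(log))  # True

log[0]["event"]["actor"] = "mallory"  # rewrite history
print(verify_chain(log))  # False
```

&lt;p&gt;Because each record's hash covers the hash before it, changing any earlier event invalidates that record's own hash, which is what makes the log tamper-evident rather than tamper-proof.&lt;/p&gt;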

&lt;p&gt;MemGPT (Packer et al., 2023, &lt;a href="https://arxiv.org/abs/2310.08560" rel="noopener noreferrer"&gt;arXiv:2310.08560&lt;/a&gt;)&lt;br&gt;
demonstrated a related idea: managing an agent's memory the way an operating&lt;br&gt;
system manages memory tiers, with a main context (the context window, analogous&lt;br&gt;
to RAM) and an external context (recall and archival storage, analogous to&lt;br&gt;
disk). The key insight was that memory management is an infrastructure&lt;br&gt;
problem, not a model problem. But MemGPT's format is Letta-specific. The open&lt;br&gt;
engram format makes the same architectural choice (external, tiered, managed)&lt;br&gt;
but in a format anyone can implement.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to use which
&lt;/h2&gt;

&lt;p&gt;The honest answer is that both approaches have a place — but they solve&lt;br&gt;
different problems.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;
&lt;strong&gt;Open engrams&lt;/strong&gt; (external)&lt;/th&gt;
&lt;th&gt;
&lt;strong&gt;Model weights&lt;/strong&gt; (parametric)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Corrections, preferences, procedures, conventions&lt;/td&gt;
&lt;td&gt;Domain knowledge, language patterns, reasoning skills&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inspect&lt;/td&gt;
&lt;td&gt;Read the file&lt;/td&gt;
&lt;td&gt;Cannot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Correct&lt;/td&gt;
&lt;td&gt;Edit one entry&lt;/td&gt;
&lt;td&gt;Retrain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delete&lt;/td&gt;
&lt;td&gt;Remove entry — provable&lt;/td&gt;
&lt;td&gt;Cannot prove erasure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Portability&lt;/td&gt;
&lt;td&gt;Works across models&lt;/td&gt;
&lt;td&gt;Locked to model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Latency&lt;/td&gt;
&lt;td&gt;Retrieval adds a lookup step (roughly 50-200 ms, depending on the store)&lt;/td&gt;
&lt;td&gt;No retrieval step&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token cost&lt;/td&gt;
&lt;td&gt;Retrieved context uses tokens&lt;/td&gt;
&lt;td&gt;No retrieval tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Update speed&lt;/td&gt;
&lt;td&gt;Instant (write a file)&lt;/td&gt;
&lt;td&gt;Slow (retrain)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GDPR compliance&lt;/td&gt;
&lt;td&gt;Provably deletable&lt;/td&gt;
&lt;td&gt;Not provably deletable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For &lt;strong&gt;agent memory&lt;/strong&gt; — the things an agent learns through interaction that&lt;br&gt;
should persist across sessions and tools — external engrams are the right&lt;br&gt;
choice. The knowledge is personal, contextual, and needs to be correctable.&lt;br&gt;
For &lt;strong&gt;domain expertise&lt;/strong&gt; — deep knowledge of a field that improves the model's&lt;br&gt;
reasoning — fine-tuning or domain-specific models remain valuable. These are&lt;br&gt;
complementary, not competing.&lt;/p&gt;

&lt;p&gt;The relationship runs deeper than "pick one." A typed, labeled, provenance-tagged&lt;br&gt;
engram store is also a clean fine-tuning corpus — the data is already the kind&lt;br&gt;
of curated signal a training run wants. As retraining gets cheaper (LoRA,&lt;br&gt;
distillation, smaller base models), it becomes plausible to periodically fold a&lt;br&gt;
distilled snapshot of stable engrams into weights for speed, while the open&lt;br&gt;
engram store stays the correctable, auditable source of truth behind it. That&lt;br&gt;
is a direction the field is heading, not a shipped pipeline today — but it&lt;br&gt;
reframes the question in this piece's title: not a permanent fork between two&lt;br&gt;
architectures, but engrams as the record of truth that a model can, sometimes,&lt;br&gt;
be periodically retrained from.&lt;/p&gt;

&lt;p&gt;The mistake is using parametric memory for things that should be external.&lt;br&gt;
When a user corrects an agent's behavior, that correction is a fact — not a&lt;br&gt;
weight. When a preference is expressed, it is a configuration — not a&lt;br&gt;
parameter. When a procedure is learned, it is a recipe — not a gradient.&lt;br&gt;
Memory that must be readable, fixable, deletable, and portable should be&lt;br&gt;
stored in a format that is readable, fixable, deletable, and portable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The emerging consensus
&lt;/h2&gt;

&lt;p&gt;The research literature is converging on hybrid approaches. The 2024 survey&lt;br&gt;
of agent memory mechanisms (Zhang et al.,&lt;br&gt;
&lt;a href="https://arxiv.org/abs/2404.13501" rel="noopener noreferrer"&gt;arXiv:2404.13501&lt;/a&gt;) identified multiple&lt;br&gt;
memory architectures — parametric, non-parametric, and hybrid — and noted&lt;br&gt;
that "the key component to support agent-environment interactions is the&lt;br&gt;
memory of the agents," with no single approach dominating. What is clear is&lt;br&gt;
that the memory layer is separating from the model layer: agents need&lt;br&gt;
infrastructure for memory, not just bigger context windows.&lt;/p&gt;

&lt;p&gt;The practical implication: if you are building an agent that learns over time,&lt;br&gt;
store its memory as open, external engrams. If you are training a model for&lt;br&gt;
domain expertise, fine-tune. Do not confuse the two — and do not bake into&lt;br&gt;
weights what you might need to read, fix, or forget.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Should AI memory be stored as engrams or model weights?&lt;/strong&gt; For agent memory&lt;br&gt;
(corrections, preferences, procedures, conventions), store as open external&lt;br&gt;
engrams. For domain expertise and reasoning skills, model weights remain&lt;br&gt;
valuable. The two are complementary — do not bake into weights what you need&lt;br&gt;
to read, fix, or delete.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is parametric memory in AI?&lt;/strong&gt; Knowledge stored in a model's weights&lt;br&gt;
through fine-tuning or continual training. It is fast at inference but cannot&lt;br&gt;
be inspected, individually corrected, or provably deleted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is non-parametric (external) memory?&lt;/strong&gt; Knowledge stored outside the&lt;br&gt;
model in a structured format (engrams, vectors, knowledge graphs) and&lt;br&gt;
retrieved at inference time. It is inspectable, correctable, deletable, and&lt;br&gt;
portable across models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can you prove erasure from model weights?&lt;/strong&gt; No. When knowledge is baked into&lt;br&gt;
weights, there is no reliable way to prove it has been removed. Machine&lt;br&gt;
unlearning is an active research problem. External engrams can be deleted by&lt;br&gt;
removing the entry — the erasure is provable because the knowledge was never&lt;br&gt;
in the weights.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is catastrophic forgetting?&lt;/strong&gt; When a neural network trained on new&lt;br&gt;
knowledge degrades in performance on older knowledge. This is a fundamental&lt;br&gt;
risk of continual training / parametric memory. External memory does not&lt;br&gt;
suffer from catastrophic forgetting — old entries persist unless explicitly&lt;br&gt;
decayed or deleted.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Gao, Y. et al. "Retrieval-Augmented Generation for Large Language Models: A
Survey." arXiv:2312.10997, December 2023.
&lt;a href="https://arxiv.org/abs/2312.10997" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2312.10997&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Yan, S. et al. "Corrective Retrieval Augmented Generation." arXiv:2401.15884,
January 2024. &lt;a href="https://arxiv.org/abs/2401.15884" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2401.15884&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Packer, C. et al. "MemGPT: Towards LLMs as Operating Systems."
arXiv:2310.08560, October 2023.
&lt;a href="https://arxiv.org/abs/2310.08560" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2310.08560&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Zhang, Z. et al. "A Survey on the Memory Mechanism of Large Language Model
based Agents." arXiv:2404.13501, April 2024.
&lt;a href="https://arxiv.org/abs/2404.13501" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2404.13501&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;The Engram Specification, v2.1, March 2026.
&lt;a href="https://plur.ai/spec.html" rel="noopener noreferrer"&gt;https://plur.ai/spec.html&lt;/a&gt; (Apache-2.0)&lt;/li&gt;
&lt;li&gt;PLUR — Open source memory for AI agents. Apache-2.0.
&lt;a href="https://github.com/plur-ai/plur" rel="noopener noreferrer"&gt;https://github.com/plur-ai/plur&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>memory</category>
      <category>agents</category>
    </item>
    <item>
      <title>Is There an Open Standard for AI Agent Memory Engrams?</title>
      <dc:creator>gregor</dc:creator>
      <pubDate>Thu, 02 Jul 2026 19:41:19 +0000</pubDate>
      <link>https://dev.to/plur9/is-there-an-open-standard-for-ai-agent-memory-engrams-gm2</link>
      <guid>https://dev.to/plur9/is-there-an-open-standard-for-ai-agent-memory-engrams-gm2</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The short answer:&lt;/strong&gt; No single RFC-level standard exists for AI agent memory&lt;br&gt;
engrams as of mid-2026. The closest things are the &lt;strong&gt;Model Context Protocol&lt;br&gt;
(MCP)&lt;/strong&gt; — an open protocol from Anthropic that standardizes how applications&lt;br&gt;
expose context to LLMs — and the &lt;strong&gt;Engram Specification&lt;/strong&gt; (Apache-2.0), an&lt;br&gt;
open format published by PLUR that defines the data structure for portable&lt;br&gt;
agent memory. Together they address the transport layer and the data layer,&lt;br&gt;
but neither has achieved IETF-level standardization. The space is still&lt;br&gt;
fragmenting: Mem0, Letta, Zep, Cognee, and a dozen other projects each&lt;br&gt;
define their own memory schemas, and no interoperability standard has merged&lt;br&gt;
them yet.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why the question matters
&lt;/h2&gt;

&lt;p&gt;AI agents are stateless by default. Every session starts from zero — no memory&lt;br&gt;
of corrections, no recall of preferences, no knowledge of what tools exist.&lt;br&gt;
Users repeat themselves. Agents make the same mistakes. The fix is a &lt;strong&gt;memory&lt;br&gt;
layer&lt;/strong&gt;: a system that captures what an agent learns, stores it outside the&lt;br&gt;
model, and recalls the right piece at the right time. But every memory system&lt;br&gt;
today stores knowledge in its own format, behind its own API, locked to its own&lt;br&gt;
runtime. An agent that learns in Claude Code cannot share that memory with&lt;br&gt;
Cursor. A correction made in one tool does not propagate to another. This is&lt;br&gt;
not a technical limitation — it is a standards gap.&lt;/p&gt;

&lt;p&gt;A 2024 survey of LLM-based agent memory mechanisms (Zhang et al., "A Survey on&lt;br&gt;
the Memory Mechanism of Large Language Model based Agents,"&lt;br&gt;
&lt;a href="https://arxiv.org/abs/2404.13501" rel="noopener noreferrer"&gt;arXiv:2404.13501&lt;/a&gt;) catalogued the landscape&lt;br&gt;
and observed that existing memory mechanisms are "scattered in different&lt;br&gt;
papers," with no systematic review comparing them. The survey identified&lt;br&gt;
multiple approaches: parametric memory (fine-tuning), non-parametric memory&lt;br&gt;
(retrieval), and hybrid architectures. In practice, each project implements its&lt;br&gt;
own schema, which makes interoperability impossible without a shared standard.&lt;/p&gt;

&lt;h2&gt;
  
  
  What exists today: two layers, neither complete
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The transport layer: Model Context Protocol (MCP)
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Model Context Protocol&lt;/strong&gt; (&lt;a href="https://modelcontextprotocol.io/specification/2025-11-25" rel="noopener noreferrer"&gt;specification&lt;/a&gt;)&lt;br&gt;
is an open protocol, open-sourced by Anthropic in 2024, that standardizes how&lt;br&gt;
LLM applications connect to external data sources and tools. It defines a&lt;br&gt;
JSON-RPC 2.0 message format for communication between hosts (LLM applications),&lt;br&gt;
clients (connectors), and servers (context providers). MCP takes inspiration&lt;br&gt;
from the Language Server Protocol (LSP), which standardized how editors&lt;br&gt;
communicate with language tools — and in the same way, MCP aims to standardize&lt;br&gt;
how AI applications integrate external context.&lt;/p&gt;

&lt;p&gt;As of the 2025-11-25 specification version, MCP defines three server features:&lt;br&gt;
&lt;strong&gt;Resources&lt;/strong&gt; (context and data), &lt;strong&gt;Prompts&lt;/strong&gt; (templated workflows), and&lt;br&gt;
&lt;strong&gt;Tools&lt;/strong&gt; (functions the AI model can execute). A memory server can expose&lt;br&gt;
stored knowledge as resources or tools — and this is how PLUR's MCP server&lt;br&gt;
makes engrams accessible to Claude Code, Hermes, OpenClaw, and Cursor.&lt;/p&gt;
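
&lt;p&gt;As a rough illustration of what this looks like on the wire, the Python sketch below builds the JSON-RPC 2.0 request an MCP client would send to invoke a server tool through the protocol's tools/call method. The tool name "recall" and its arguments are hypothetical; a real memory server defines its own tool names and input schema.&lt;/p&gt;

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "recall" and its "query"/"scope" arguments are hypothetical names for illustration.
request = build_tool_call(
    1, "recall", {"query": "API rate limit", "scope": "project:api-gateway"}
)
print(json.dumps(request, indent=2))
```

&lt;p&gt;The envelope (jsonrpc, id, method, params) is fixed by the protocol, while everything inside the arguments object is up to the server.&lt;/p&gt;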

&lt;p&gt;But MCP is a transport protocol, not a memory format. It defines &lt;em&gt;how&lt;/em&gt;&lt;br&gt;
applications talk to a memory server — not &lt;em&gt;what&lt;/em&gt; the memory looks like. You&lt;br&gt;
can serve any data structure over MCP. Without a shared data format, every&lt;br&gt;
memory server speaks the protocol but stores knowledge differently. An agent&lt;br&gt;
switching from one MCP-compatible memory tool to another still cannot bring&lt;br&gt;
its memory along.&lt;/p&gt;

&lt;h3&gt;
  
  
  The data layer: the Engram Specification
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Engram Specification&lt;/strong&gt; (&lt;a href="https://plur.ai/spec.html" rel="noopener noreferrer"&gt;plur.ai/spec.html&lt;/a&gt;),&lt;br&gt;
published in March 2026 under Apache-2.0 by the PLUR project, defines an open&lt;br&gt;
format for agent memory. An &lt;strong&gt;engram&lt;/strong&gt; — a term borrowed from cognitive&lt;br&gt;
science, where it means the physical trace a memory leaves — is one atomic&lt;br&gt;
unit of learned knowledge: a single fact, stored as a human-readable YAML&lt;br&gt;
entry outside the model, with provenance, a type classification (procedural,&lt;br&gt;
behavioral, terminological, architectural), a scope (where it applies), and a&lt;br&gt;
retrieval strength that decays over time and is reinforced by feedback.&lt;/p&gt;

&lt;p&gt;The specification defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Core schema fields&lt;/strong&gt;: id, statement, type, scope, status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activation model&lt;/strong&gt;: retrieval strength, last accessed, frequency, with
time-based decay (modeled on the ACT-R cognitive architecture) and
reinforcement on access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feedback loop&lt;/strong&gt;: relevance signals (positive/negative/neutral) that train
injection quality over time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search pipeline&lt;/strong&gt;: hybrid BM25 + embeddings, merged via Reciprocal Rank
Fusion, with optional reranking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimum viable implementation&lt;/strong&gt;: the core schema, activation fields, time
decay, and the four operations (learn, recall, inject, feedback) — everything
else is optional&lt;/li&gt;
&lt;/ul&gt;
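
&lt;p&gt;The fusion step of the search pipeline can be sketched in a few lines of Python. This is the textbook form of Reciprocal Rank Fusion, not code taken from the specification: the constant k=60 is the commonly used default from the original RRF paper, and the engram ids and the two input rankings are made up for the example.&lt;/p&gt;

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of ids; each list contributes 1 / (k + rank) per id."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["ENG-003", "ENG-001", "ENG-007"]       # hypothetical keyword ranking
embedding_hits = ["ENG-001", "ENG-009", "ENG-003"]  # hypothetical semantic ranking
print(reciprocal_rank_fusion([bm25_hits, embedding_hits]))
# ['ENG-001', 'ENG-003', 'ENG-009', 'ENG-007']
```

&lt;p&gt;RRF works on ranks rather than raw scores, so BM25 scores and embedding similarities never need to be normalized against each other.&lt;/p&gt;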

&lt;p&gt;The spec is designed for portability: an engram is a plain-text file you can&lt;br&gt;
open in any editor, put under version control, and carry between machines. Any&lt;br&gt;
agent runtime that can read YAML files or speak to an MCP server can consume&lt;br&gt;
engrams.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why neither alone is sufficient
&lt;/h3&gt;

&lt;p&gt;MCP solves the wire protocol but not the data model. The Engram Specification&lt;br&gt;
solves the data model but not the wire protocol. An agent that uses MCP for&lt;br&gt;
transport and engrams for storage can share memory across tools, but only with&lt;br&gt;
other agents that also adopt both. As of mid-2026, no other memory project has&lt;br&gt;
committed to the engram format as its native storage, and MCP support in a tool&lt;br&gt;
does not by itself let that tool read another tool's memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fragmentation problem
&lt;/h2&gt;

&lt;p&gt;The AI agent memory space is fragmented across at least a dozen open-source&lt;br&gt;
projects, each with its own storage format:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;Memory format&lt;/th&gt;
&lt;th&gt;Interoperability&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mem0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Project-specific API + vector store&lt;/td&gt;
&lt;td&gt;REST API, no shared format&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Letta&lt;/strong&gt; (formerly MemGPT)&lt;/td&gt;
&lt;td&gt;OS-inspired memory tiers (core, archival, recall)&lt;/td&gt;
&lt;td&gt;API-based, Letta-specific&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Zep / Graphiti&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Temporal knowledge graph&lt;/td&gt;
&lt;td&gt;Graph queries, no shared format&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cognee&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Graph + vector + relational&lt;/td&gt;
&lt;td&gt;Own data model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PLUR&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Open engram format (YAML, Apache-2.0 spec)&lt;/td&gt;
&lt;td&gt;MCP server, YAML files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LangChain Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Various module types&lt;/td&gt;
&lt;td&gt;LangChain ecosystem only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;MemGPT (Packer et al., 2023, &lt;a href="https://arxiv.org/abs/2310.08560" rel="noopener noreferrer"&gt;arXiv:2310.08560&lt;/a&gt;)&lt;br&gt;
pioneered the idea of virtual context management: managing an agent's memory&lt;br&gt;
the way an operating system manages memory tiers. But its format is&lt;br&gt;
Letta-specific. A correction stored in Letta's archival memory cannot be read&lt;br&gt;
by Mem0, Zep, or any other system.&lt;/p&gt;

&lt;p&gt;This fragmentation means that &lt;strong&gt;agent memory is not portable&lt;/strong&gt;. When a&lt;br&gt;
developer switches from one agent framework to another, their agent's learned&lt;br&gt;
knowledge does not transfer. This is the gap an open standard would fill.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a real standard would need
&lt;/h2&gt;

&lt;p&gt;For an open standard for AI agent memory to be meaningful, it would need to&lt;br&gt;
address:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A shared data format&lt;/strong&gt; — what a memory entry looks like (the engram
specification attempts this: statement, type, scope, provenance,
activation fields)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A transport protocol&lt;/strong&gt; — how agents read and write memory (MCP addresses
this)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A query model&lt;/strong&gt; — how agents find the right memory at the right time
(hybrid search, activation-based recall)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A lifecycle model&lt;/strong&gt; — how memory is created, reinforced, decayed, and
deleted (ACT-R decay, feedback signals, provenance tracking)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An erasure guarantee&lt;/strong&gt; — proof that deleted memory is actually gone
(impossible with model-native memory baked into weights)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No project or specification covers all five layers today. The MCP + engram&lt;br&gt;
combination covers layers 1, 2, and parts of 3 and 4 — but it has not achieved&lt;br&gt;
the adoption needed to be called a standard.&lt;/p&gt;
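
&lt;p&gt;The lifecycle item in the list above mentions ACT-R decay. As a sketch of what that can mean, the Python below implements the standard ACT-R base-level learning equation, in which activation is the natural log of the summed, power-law-decayed ages of past accesses. Whether the Engram Specification uses exactly this formula is an assumption; the decay value 0.5 is the conventional ACT-R default, and the clamp to one second is a choice made here to keep the result finite.&lt;/p&gt;

```python
import math

DAY = 86400.0  # seconds


def base_level_activation(now, access_times, decay=0.5):
    """ACT-R base-level learning: ln(sum(age ** -decay)) over past accesses."""
    if not access_times:
        return -math.inf  # never accessed: no activation at all
    # Clamp each age to at least one second so a just-accessed entry stays finite.
    ages = [max(now - t, 1.0) for t in access_times]
    return math.log(sum(age ** -decay for age in ages))


now = 30 * DAY
stale = base_level_activation(now, [0.0])                 # one access, 30 days ago
reinforced = base_level_activation(now, [0.0, 29 * DAY])  # plus one access yesterday
print(stale, reinforced)  # the reinforced engram has the higher activation
```

&lt;p&gt;Each access adds a term to the sum, so recall reinforces an engram, and every term shrinks as it ages, so unused engrams fade without being deleted.&lt;/p&gt;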

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is there an open standard for AI agent memory?&lt;/strong&gt; Not yet. The closest are&lt;br&gt;
MCP (an open protocol for connecting tools to LLMs) and the Engram&lt;br&gt;
Specification (an open format for memory data). Neither has achieved&lt;br&gt;
industry-wide adoption as a standard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the Model Context Protocol (MCP)?&lt;/strong&gt; An open protocol (JSON-RPC 2.0)&lt;br&gt;
that standardizes how LLM applications connect to external data sources and&lt;br&gt;
tools. It is the transport layer — it defines how applications talk to a&lt;br&gt;
memory server, but not what the memory looks like.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the Engram Specification?&lt;/strong&gt; An Apache-2.0 open format published by&lt;br&gt;
PLUR that defines agent memory as human-readable YAML entries (engrams) with&lt;br&gt;
provenance, type classification, scope, and activation-weighted recall. It is&lt;br&gt;
the data layer — it defines what memory looks like, but not how it is&lt;br&gt;
transported.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can agent memory be shared between tools?&lt;/strong&gt; In theory, yes — an agent using&lt;br&gt;
MCP for transport and the engram format for storage could share memory with any&lt;br&gt;
other agent that adopts both. In practice, no major memory project has&lt;br&gt;
committed to the engram format yet, so memory remains locked to each tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Will an open standard emerge?&lt;/strong&gt; The pressure is building. As agents move from&lt;br&gt;
single-tool experiments to multi-tool workflows, the cost of non-portable&lt;br&gt;
memory grows. MCP adoption is accelerating. The engram format is published and&lt;br&gt;
implementable. Whether the industry converges on this combination — or waits&lt;br&gt;
for an IETF-style process — is the open question.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Model Context Protocol Specification, version 2025-11-25.
&lt;a href="https://modelcontextprotocol.io/specification/2025-11-25" rel="noopener noreferrer"&gt;https://modelcontextprotocol.io/specification/2025-11-25&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;The Engram Specification, v2.1, March 2026.
&lt;a href="https://plur.ai/spec.html" rel="noopener noreferrer"&gt;https://plur.ai/spec.html&lt;/a&gt; (Apache-2.0)&lt;/li&gt;
&lt;li&gt;Zhang, Z. et al. "A Survey on the Memory Mechanism of Large Language Model
based Agents." arXiv:2404.13501, April 2024.
&lt;a href="https://arxiv.org/abs/2404.13501" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2404.13501&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Packer, C. et al. "MemGPT: Towards LLMs as Operating Systems."
arXiv:2310.08560, October 2023.
&lt;a href="https://arxiv.org/abs/2310.08560" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2310.08560&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Gao, Y. et al. "Retrieval-Augmented Generation for Large Language Models: A
Survey." arXiv:2312.10997, December 2023.
&lt;a href="https://arxiv.org/abs/2312.10997" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2312.10997&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;PLUR — Open source memory for AI agents. Apache-2.0.
&lt;a href="https://github.com/plur-ai/plur" rel="noopener noreferrer"&gt;https://github.com/plur-ai/plur&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>memory</category>
    </item>
  </channel>
</rss>
