Tang Weigang

Posted on Jun 21

Before You Add Memory to an AI Agent, Decide What the Agent Is Allowed to Remember

#ai #agents #neo4j #llmops

Memory is one of those agent features that sounds obvious until it is connected to a real system.

The naive version is simple: save the conversation, retrieve something similar later, and call it memory.

The operational version is harder:

What is short-term conversation context?
What is long-term user or domain knowledge?
What is a reasoning trace?
Who owns a remembered entity?
Which memory can be reused across sessions?
Which memory should expire, be corrected, or stay private?
How do you prove that the agent used the right memory instead of a convenient hallucination?

That is the useful way to read agent-memory, the Neo4j Labs project covered in the Doramagic manual.

It is not just a "vector store for agents". The manual frames it as a graph-native memory layer for AI agents and context graphs, backed by Neo4j, with Python and TypeScript SDK surfaces and a hosted NAMS backend option.

The first useful mental model: three memory tiers

The most important part is the separation of memory types.

The Doramagic manual describes three main tiers:

Tier	What it holds	Why it matters
Short-term memory	Session or conversation message history	Keeps the current turn grounded without pretending every message is permanent knowledge.
Long-term memory	Entities, preferences, relationships	Lets the system remember durable facts, but also creates privacy and correction obligations.
Reasoning memory	Steps, tool calls, traces, similar traces	Makes the agent's behavior reviewable instead of turning memory into an invisible black box.

That split is practical because "remember everything" is not an implementation strategy. It is a data governance problem wearing a friendly product name.

If an AI host is going to use memory, the host should know which tier it is touching.

A user message might belong in short-term memory.

A confirmed customer preference might belong in long-term memory.

A failed tool call and recovery path might belong in reasoning memory.

Those are different records with different risks.
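
One way to make the tier visible in code is to tag every record with it. The sketch below is a plain-Python illustration, not the agent-memory SDK: `MemoryTier`, `MemoryRecord`, and `survives_session` are hypothetical names, and the retention rule is an assumption chosen only to show that tiers can carry different policies.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class MemoryTier(Enum):
    SHORT_TERM = "short_term"  # session or conversation messages
    LONG_TERM = "long_term"    # entities, preferences, relationships
    REASONING = "reasoning"    # steps, tool calls, traces


@dataclass(frozen=True)
class MemoryRecord:
    tier: MemoryTier
    owner_id: str                     # hypothetical: who the record belongs to
    content: str
    session_id: Optional[str] = None  # set for session-scoped records
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def survives_session(record: MemoryRecord) -> bool:
    """Assumed policy: only short-term records are dropped at session end."""
    return record.tier is not MemoryTier.SHORT_TERM


records = [
    MemoryRecord(MemoryTier.SHORT_TERM, "user-1", "Can you rebook my flight?", "s-1"),
    MemoryRecord(MemoryTier.LONG_TERM, "user-1", "prefers aisle seats"),
    MemoryRecord(MemoryTier.REASONING, "user-1", "booking tool timed out; retried once", "s-1"),
]
kept = [r.content for r in records if survives_session(r)]
```

Because the tier is a required field, a write that has not decided which tier it belongs to cannot be constructed at all.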

The graph part is not decoration

agent-memory uses Neo4j as the backing graph. That matters because agent memory is rarely just a bag of text chunks.

Useful memory often has structure:

a person belongs to an organization
a task was requested in a session
a tool call touched an entity
a preference applies to one user but not another
a reasoning trace created or updated a record

The manual highlights POLE+O entity typing: PERSON, ORGANIZATION, LOCATION, EVENT, and OBJECT, plus extension entity types. That gives the memory system a vocabulary for durable knowledge instead of treating every remembered thing as the same kind of note.

The result is not automatically safe or correct. It is just more inspectable.

That is the point.
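
To show what "inspectable" can mean, here is a minimal in-memory sketch of typed entities and relationships. It only borrows the five POLE+O type names from the manual. The classes, the `BELONGS_TO` label, and the sample data are assumptions for illustration and do not reflect how agent-memory or Neo4j stores anything.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    PERSON = "PERSON"
    ORGANIZATION = "ORGANIZATION"
    LOCATION = "LOCATION"
    EVENT = "EVENT"
    OBJECT = "OBJECT"


@dataclass(frozen=True)
class Entity:
    id: str
    type: EntityType
    name: str


@dataclass(frozen=True)
class Relationship:
    source: str  # entity id
    kind: str    # hypothetical relationship label
    target: str  # entity id


class MemoryGraph:
    def __init__(self) -> None:
        self.entities = {}       # entity id -> Entity
        self.relationships = []  # list of Relationship

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.id] = entity

    def relate(self, source: str, kind: str, target: str) -> None:
        # Refuse edges to unknown nodes so every relationship can be traced.
        if source not in self.entities or target not in self.entities:
            raise KeyError("both endpoints must exist before they are related")
        self.relationships.append(Relationship(source, kind, target))

    def neighbors(self, entity_id: str, kind: str) -> list:
        """Return entities reached from entity_id by outgoing edges of this kind."""
        return [
            self.entities[r.target]
            for r in self.relationships
            if r.source == entity_id and r.kind == kind
        ]


graph = MemoryGraph()
graph.add_entity(Entity("p1", EntityType.PERSON, "Ada"))
graph.add_entity(Entity("o1", EntityType.ORGANIZATION, "Example Corp"))
graph.relate("p1", "BELONGS_TO", "o1")
employers = [e.name for e in graph.neighbors("p1", "BELONGS_TO")]
```

The question "which organization does this person belong to?" is answered by following a typed edge, not by hoping a similar text chunk comes back from retrieval.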

Backend choice changes the operating boundary

The manual describes two backend paths:

direct Neo4j through Bolt
hosted NAMS through a REST backend

That is not a small deployment detail. It changes the boundary you need to check.

With a local or self-hosted Neo4j path, you are responsible for database configuration, tenant isolation, backups, and operational access.

With NAMS, you get a hosted memory service path and ontology surfaces, but now the remote service boundary, workspace ownership, and API configuration matter.

The practical first question is not "which one is better?"

The practical first question is:

Where is the memory allowed to live, and who can read it later?

If you cannot answer that, do not let an agent write durable memory yet.
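
That rule can be written down as a gate in front of durable writes. The sketch below is an assumption about how a team might encode it. `MemoryBoundary` and `durable_writes_allowed` are hypothetical names, and the two enum members only mirror the two backend paths described above, not any real configuration option.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    NEO4J_BOLT = "neo4j_bolt"  # direct Neo4j through Bolt
    NAMS_REST = "nams_rest"    # hosted NAMS through a REST backend


@dataclass(frozen=True)
class MemoryBoundary:
    backend: Backend
    data_location: str   # hypothetical label, e.g. "self-hosted-eu"
    readers: frozenset   # roles allowed to read stored memory later
    durable_writes_requested: bool


def durable_writes_allowed(boundary: MemoryBoundary) -> bool:
    """Allow durable writes only when location and readers are both stated."""
    if not boundary.durable_writes_requested:
        return False
    return bool(boundary.data_location.strip()) and len(boundary.readers) > 0


undecided = MemoryBoundary(Backend.NAMS_REST, "", frozenset(), True)
decided = MemoryBoundary(
    Backend.NEO4J_BOLT, "self-hosted-eu", frozenset({"memory-admin"}), True
)
```

Here `durable_writes_allowed(undecided)` is false because neither question has an answer, regardless of which backend was picked.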

Ontology is the part teams will underestimate

The manual calls out a typed, versioned ontology layer in NAMS. This is more important than it sounds.

Without an ontology boundary, agent memory can quietly drift:

the same entity appears under multiple names
preferences become mixed with facts
tool results get treated as user intent
stale knowledge remains in retrieval because nothing marks it as old
private and shared memory end up in the same retrieval pool

An ontology does not solve those problems by itself, but it gives the team a place to define what is valid.

For a first run, I would not start by building a complex domain ontology.

I would start with a deliberately small one:

one user
one session
two entity types
one relationship type
one trace
one correction case

If that cannot be inspected and corrected, scaling the memory system will only make the failure harder to see.
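
A deliberately small ontology like the one above fits in a few lines. This is a sketch under my own assumptions, not the NAMS ontology format: the `Ontology` class, the version string, and the `BELONGS_TO` rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ontology:
    version: str
    entity_types: frozenset
    # relationship label -> (allowed source type, allowed target type)
    relationship_types: dict

    def check_relationship(self, label: str, source_type: str, target_type: str) -> list:
        """Return a list of problems; an empty list means the edge is valid."""
        problems = []
        for entity_type in (source_type, target_type):
            if entity_type not in self.entity_types:
                problems.append(f"unknown entity type: {entity_type}")
        allowed = self.relationship_types.get(label)
        if allowed is None:
            problems.append(f"unknown relationship type: {label}")
        elif allowed != (source_type, target_type):
            problems.append(f"{label} must connect {allowed[0]} to {allowed[1]}")
        return problems


SMALL_ONTOLOGY = Ontology(
    version="0.1",
    entity_types=frozenset({"PERSON", "ORGANIZATION"}),
    relationship_types={"BELONGS_TO": ("PERSON", "ORGANIZATION")},
)

ok = SMALL_ONTOLOGY.check_relationship("BELONGS_TO", "PERSON", "ORGANIZATION")
drift = SMALL_ONTOLOGY.check_relationship("LIKES", "PERSON", "PRODUCT")
```

Returning a list of problems instead of raising on the first one keeps the rejected write reviewable, which is useful when the correction case is part of the test.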

The first safe verification run

Before wiring agent-memory into a serious AI workflow, I would run a small sandbox test.

The test should not require production credentials or real user data.

A good first run looks like this:

Create a temporary test user and session.
Add a small conversation message to short-term memory.
Add one explicit long-term entity, such as a fake preference.
Record one reasoning step or tool call.
Retrieve context on the next turn.
Verify which memory tier produced which returned item.
Correct or delete one memory record.
Confirm the correction is visible in the next retrieval.

The key artifact is not the demo output.

The key artifact is the audit trail:

what was stored
why it was stored
where it lives
how it is retrieved
how it is corrected
what the agent is not allowed to remember
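
The run and its audit trail can be rehearsed before any backend is involved. The sketch below uses an in-memory stand-in so it needs no credentials or real data. `SandboxMemory` and its methods are hypothetical and are not agent-memory calls, so a real run would swap in the SDK against a disposable test database.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Item:
    id: str
    tier: str
    content: str
    reason: str  # why it was stored


@dataclass
class SandboxMemory:
    """In-memory stand-in for a memory backend, for rehearsal only."""
    user_id: str
    session_id: str
    items: dict = field(default_factory=dict)
    audit: list = field(default_factory=list)

    def write(self, tier: str, content: str, reason: str) -> str:
        item = Item(uuid.uuid4().hex, tier, content, reason)
        self.items[item.id] = item
        self.audit.append(("write", item.id, tier, reason))
        return item.id

    def correct(self, item_id: str, new_content: str, reason: str) -> None:
        item = self.items[item_id]
        item.content = new_content
        self.audit.append(("correct", item_id, item.tier, reason))

    def retrieve(self) -> list:
        # Return (tier, content) pairs so each item shows which tier produced it.
        return [(i.tier, i.content) for i in self.items.values()]


def run_sandbox():
    # Step 1: temporary test user and session, no real identifiers.
    mem = SandboxMemory(user_id="test-user", session_id="test-session")
    # Steps 2-4: one write per tier, each with a stated reason.
    mem.write("short_term", "hi, I need a table for two", "conversation message")
    pref = mem.write("long_term", "prefers window tables", "explicit fake preference")
    mem.write("reasoning", "called search_tables(party=2)", "tool call trace")
    # Steps 5-6: retrieve and keep the tier attached to every item.
    before = mem.retrieve()
    # Steps 7-8: correct one record and retrieve again.
    mem.correct(pref, "prefers quiet tables", "user corrected the preference")
    after = mem.retrieve()
    return mem, before, after


mem, before, after = run_sandbox()
```

Comparing `before` with `after` shows whether the correction reached retrieval, and `mem.audit` records what was written, in which tier, and why.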

The main pitfall

The biggest mistake is treating memory as a feature toggle.

"Add memory" sounds like a product improvement.

In practice, it changes the agent's state model.

A stateless agent can be wrong in one run.

A stateful agent can be wrong, remember the wrong thing, and use that wrong memory later with confidence.

That does not mean memory is bad. It means memory needs a smaller first run, clearer permissions, and a visible review path.

A practical adoption checklist

Before giving an AI host access to a memory layer, answer these questions:

What memory tiers are enabled?
Which writes are automatic and which require approval?
Where is the backing store?
Is memory scoped by user, workspace, tenant, or project?
Can a user inspect and correct remembered facts?
Are reasoning traces stored separately from durable user knowledge?
Does retrieval show provenance?
Is there a deletion path?
Is there a sandbox test that proves all of the above?

If any answer is unclear, the next step is not production integration.

The next step is a smaller verification loop.
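
The checklist can also live next to the integration code as data. This is a minimal sketch with hypothetical names (`CHECKLIST`, `unclear_items`, `next_step`). The item labels paraphrase the questions above, and the "checklist answered" outcome is my own placeholder, since the manual does not define what comes after a complete checklist.

```python
CHECKLIST = (
    "memory tiers enabled",
    "automatic vs approved writes",
    "backing store location",
    "scope (user, workspace, tenant, or project)",
    "user inspection and correction",
    "traces separated from user knowledge",
    "retrieval provenance",
    "deletion path",
    "sandbox test",
)


def unclear_items(answers: dict) -> list:
    """Return checklist items with no recorded answer or only a blank one."""
    return [item for item in CHECKLIST if not answers.get(item, "").strip()]


def next_step(answers: dict) -> str:
    # Any unclear item sends the team back to a smaller verification loop.
    return "smaller verification loop" if unclear_items(answers) else "checklist answered"


partial = {"memory tiers enabled": "short-term only", "deletion path": "  "}
complete = {item: "documented in the runbook" for item in CHECKLIST}
```

With `partial`, seven items are missing and one is blank, so `next_step(partial)` returns "smaller verification loop".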

Reference: Doramagic agent-memory project page and manual: https://doramagic.ai/en/projects/agent-memory/manual/

Disclosure: this post is based on an independent Doramagic project pack for neo4j-labs/agent-memory. It is not official Neo4j documentation and does not imply endorsement by the upstream project.