Your AI Agent's Memory Is an Attack Surface — Here's How to Defend It

#ai #security #opensource #python

AI agents are getting persistent memory — vector stores, RAG indexes, conversation histories that carry context across sessions. This is powerful. It's also a brand new attack surface that almost nobody is defending.

The Problem

When an AI agent trusts its own memory on every future run, an attacker who can poison that memory once gains persistent influence over all subsequent agent behavior. This is OWASP's ASI06 — Memory Poisoning.

Real attack scenarios:

A malicious document injected into a RAG pipeline that rewrites the agent's system instructions on every retrieval
A compromised tool output that plants a backdoor instruction in the agent's long-term memory
An adversarial user input that modifies protected memory keys (API endpoints, allowed domains)

These aren't theoretical. Johann Rehberger demonstrated memory poisoning against ChatGPT's memory feature. The attack surface exists in every framework: LangChain, LlamaIndex, CrewAI, AutoGen, Mem0.

The Solution: OWASP Agent Memory Guard

Agent Memory Guard is an open-source Python library that acts as a runtime security layer between your agent and its memory store. Every read and write passes through a configurable detection pipeline.

What it detects:

Threat	How
Out-of-band tampering	SHA-256 integrity baselines on every memory entry
Prompt injection in memory	Pattern + heuristic detection on reads
Secret/PII leakage	Regex + entropy-based scanning on writes
Protected key modification	Policy-defined immutable keys
Size anomalies	Configurable thresholds for suspicious payloads

How it works:

from agent_memory_guard import MemoryGuard

guard = MemoryGuard.from_yaml("policy.yaml")

# Every write is screened
result = guard.write(key="user_preferences", value=untrusted_input)
if result.blocked:
    print(f"Blocked: {result.findings}")

# Every read is verified
data = guard.read(key="system_config")
# Integrity check + injection scan happens automatically

Policy is YAML-configurable:

detectors:
  - prompt_injection:
      action: block
  - secret_leakage:
      action: redact
  - integrity_violation:
      action: quarantine
  - size_anomaly:
      threshold_bytes: 10000
      action: block

protected_keys:
  - system_prompt
  - allowed_domains
  - api_endpoints

Performance

92.5% recall on memory poisoning attacks
100% precision — zero false positives
59 microsecond median latency per operation
Drop-in integrations for LangChain, LlamaIndex, CrewAI, AutoGen, and Mem0

Press Coverage

Help Net Security just published a deep-dive: Stop AI agents from being weaponized through their own memory

Get Started

pip install agent-memory-guard

GitHub: OWASP/www-project-agent-memory-guard
Docs: Full API reference and integration guides in the repo
License: Apache 2.0

If you're building AI agents with persistent memory (and in 2026, who isn't?), you need a security layer between your agent and its memory store. Agent Memory Guard is that layer.

Questions? Drop them in the comments. PRs welcome.

DEV Community