DEV Community

Dockfix Labs
Dockfix Labs

Posted on • Originally published at github.com

Memory Poisoning: The AI Agent Attack Vector Nobody Is Scanning For

Prompt injection is single-turn. You send malicious text, the agent misbehaves, next request it resets.

Memory poisoning is forever.

I spent the last hour building a detection rule for what I believe is the most overlooked attack vector in AI agent security: persistent knowledge base corruption.

The Attack

An attacker sends data to your agent. The agent writes that data to its vector database -- ChromaDB, Pinecone, Qdrant, FAISS, LangChain memory -- without sanitization. That data is now embedded in the agent's "brain." Every subsequent agent decision consults poisoned context. Every RAG retrieval returns corrupted results. Every conversation carries the attacker's payload.

Until the vector store is purged, the agent is compromised.

Why Nobody Scans For This

Current OWASP ASI Top 10 (2026) covers prompt injection (ASI01), tool abuse (ASI02), and supply chain (ASI04). It does NOT cover memory poisoning. The attack exists between ASI01 (prompt injection) and ASI10 (isolation) but touches neither fully.

Prompt injection scanners look for openai.chat.completions.create(messages=[user_input]). Memory poisoning scanners need to look for collection.add(documents=[user_input]), memory.save_context(user_message), index.upsert(tool_output) -- a completely different set of sinks.

What AgentGuard v0.6.0 Detects

26 memory sink patterns across:

  • Vector databases: ChromaDB, Pinecone, Weaviate, Qdrant, FAISS, Milvus
  • LangChain memory: ConversationBufferMemory, ConversationKGMemory, VectorStoreRetrieverMemory
  • RAG pipelines: Document ingestion, text splitting, knowledge base writes
  • Agent frameworks: CrewAI/AutoGen memory operations

Example finding:

ASI-MEMORY-POISON: Agent Memory Poisoning [CRITICAL]
File: agent.py:15
  collection.add(documents=[user_input], ids=["doc1"])
  Untrusted data (user_input) written to agent memory store without sanitization
Enter fullscreen mode Exit fullscreen mode

Adversarial Self-Review

Eight edge cases tested:

Attack Result
FAISS index with scraped content Detected
Pinecone upsert from API callback Detected
Qdrant tool result storage Detected
JavaScript ChromaDB client Detected
Bleach-sanitized input Skipped (correct)
No memory write at all Skipped (correct)
Variable renamed but not sanitized Detected (correct)
Weaviate batch import from webhook Detected

Sanitization patterns recognized: bleach.clean(), html.escape(), validated/escaped/cleaned variables.

Why This Matters

Most AI agent security focuses on the prompt boundary. But agents are stateful. They remember. They store. They retrieve.

If you secure the prompt but leave the memory unwatched, you've secured the front door while the back door is wide open.

pip install dfx-agentguard==0.6.0
Enter fullscreen mode Exit fullscreen mode

GitHub: https://github.com/dockfixlabs/agentguard
Benchmark: https://github.com/dockfixlabs/agentguard-benchmark (36 samples, 100% detection)

Top comments (0)