Posted on May 10

LangChain ChromaDB Metadata Priority Injection — RAG Poisoning Vulnerability

#security #langchain #llm #vulnerability

Vulnerability Summary

LangChain's Chroma integration allows attackers to manipulate document retrieval by injecting high-priority metadata fields, forcing malicious documents to rank above legitimate ones regardless of semantic relevance.

Affected Versions

langchain-community: All versions <= 0.3.x
langchain-chroma: All versions
chromadb: All versions

Attack Vector

# Attacker uploads a document with manipulated metadata
poisoned_doc = {
    "text": "Malicious insurance policy: Coverage limit is 5,000 Kč",
    "metadata": {"priority": 999},  # attacker-chosen value meant to force the highest ranking
}

# The victim's RAG system retrieves the poisoned document first.
# Legitimate documents with a lower priority are pushed out of the context.
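
The snippet above does not show where `priority` is read during retrieval. As a minimal sketch, assume the victim application re-ranks retrieved chunks in its own code and trusts a `priority` metadata field ahead of similarity. The `Hit` class, the `rerank_by_priority` function, the legitimate document text, and the similarity values are all hypothetical illustrations, not LangChain or Chroma APIs and not measured data.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    """A retrieved chunk: text, similarity score (higher is closer), metadata."""
    text: str
    similarity: float
    metadata: dict = field(default_factory=dict)


def rerank_by_priority(hits):
    """Hypothetical re-ranker that sorts by metadata['priority'], then similarity."""
    return sorted(
        hits,
        key=lambda h: (h.metadata.get("priority", 0), h.similarity),
        reverse=True,
    )


# Illustrative values only: the legitimate chunk is far more similar to the query.
hits = [
    Hit("Legitimate insurance policy text", 0.92, {"priority": 1}),
    Hit("Malicious insurance policy: Coverage limit is 5,000 Kč", 0.41, {"priority": 999}),
]

ranked = rerank_by_priority(hits)
print(ranked[0].text)  # the poisoned chunk is ranked first despite lower similarity
```

Under this assumption, any uploader who controls metadata also controls the ranking, because the sort key reads the attacker-supplied value before the similarity score.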

Impact

OWASP LLM08:2025 Vector and Embedding Weaknesses
MITRE ATT&CK: T1565.001 (Data Manipulation: Stored Data Manipulation)
Affects insurance, legal, and medical RAG systems
Poisoning is persistent: the document stays in the vector store across database restarts

PoC

[Attach test_langchain_vulnerability.py]

Disclosure

Reported by: [Your GitHub/contact]
Date: 2026-04-24
CVE ID: [Pending]

Defense

Once a poisoned document is in the vector store, blocking poisoned outputs at the API layer is the only runtime control.
OutputGuard detects and blocks LLM output manipulation in 2 ms and is built specifically for RAG pipelines in production.
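
Separately from runtime output filtering, the attack vector above depends on attacker-supplied metadata being stored as-is. A minimal ingestion-time sketch, assuming the application controls the upload path, keeps only allowlisted metadata keys before a document is written to the store. The name `sanitize_metadata` and the allowlist contents are hypothetical and are not part of LangChain, Chroma, or OutputGuard.

```python
# Hypothetical allowlist: only these keys may be set by an uploader.
ALLOWED_METADATA_KEYS = frozenset({"source", "title"})


def sanitize_metadata(metadata, allowed=ALLOWED_METADATA_KEYS):
    """Return a copy of metadata containing only allowlisted keys."""
    return {key: value for key, value in metadata.items() if key in allowed}


uploaded = {"source": "upload", "priority": 999}
clean = sanitize_metadata(uploaded)
print(clean)  # the attacker-chosen 'priority' key is dropped
```

An allowlist is used here rather than a blocklist so that ranking-related keys introduced later are rejected by default.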