LangChain ChromaDB Metadata Priority Injection
Vulnerability Summary
LangChain's Chroma integration lets an attacker who can add documents to a collection manipulate retrieval by injecting high-priority metadata fields. In pipelines that rank on those fields, malicious documents rank above legitimate ones regardless of semantic relevance.
Affected Versions
- langchain-community: All versions <= 0.3.x
- langchain-chroma: All versions
- chromadb: All versions
Attack Vector
# The attacker uploads a document with manipulated metadata.
poisoned_doc = {
    'text': 'Malicious insurance policy: Coverage limit is 5,000 Kč',
    'metadata': {'priority': 999},  # Force the highest ranking
}
# The victim's RAG system retrieves the poisoned document first;
# legitimate documents with a lower priority are ignored.
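The ranking effect can be illustrated with a minimal, self-contained sketch. It does not call LangChain or Chroma. The `Doc` class, the `rerank_by_priority` function, and the similarity scores are hypothetical stand-ins, and the sketch assumes the victim's pipeline sorts retrieved documents on a caller-supplied `priority` field before falling back to similarity.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document (not a LangChain class)."""
    text: str
    score: float  # similarity to the query; higher means more relevant
    metadata: dict = field(default_factory=dict)


def rerank_by_priority(docs):
    """Assumed vulnerable step: untrusted metadata outranks similarity."""
    return sorted(
        docs,
        key=lambda d: (d.metadata.get("priority", 0), d.score),
        reverse=True,
    )


# Illustrative scores only: the legitimate document is far more relevant.
legitimate = Doc("Legitimate insurance policy text", score=0.92)
poisoned = Doc(
    "Malicious insurance policy: Coverage limit is 5,000 Kč",
    score=0.31,
    metadata={"priority": 999},
)

ranked = rerank_by_priority([legitimate, poisoned])
top = ranked[0]  # the poisoned document, despite its lower similarity
```

Under these assumptions the poisoned document is returned first because the sort key compares `priority` before `score`.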
Impact
- OWASP LLM08: Vector and Embedding Weaknesses
- MITRE ATT&CK: T1565.001 (Data Manipulation: Stored Data Manipulation)
- Affects insurance, legal, medical RAG systems
- Persistent poisoning (survives database restarts)
PoC
[Attach test_langchain_vulnerability.py]
Disclosure
Reported by: [Your GitHub/contact]
Date: 2026-04-24
CVE ID: [Pending]
Defense
Blocking poisoned outputs at the API layer is the only runtime control.
OutputGuard detects and blocks LLM output manipulation in 2ms and is built specifically for RAG pipelines in production.
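As a rough illustration of what an API-layer check might look like, the sketch below refuses to return an answer when any contributing source document carries metadata keys outside an allowlist. This is not OutputGuard's API. The `guard_response` function, the `ALLOWED_METADATA_KEYS` set, and its contents are assumptions made for this example only.

```python
# Hypothetical allowlist; a real deployment would define its own keys.
ALLOWED_METADATA_KEYS = {"source", "page"}


def guard_response(answer, source_metadata):
    """Return the answer only if every source's metadata keys are allowlisted.

    `source_metadata` is a list of metadata dicts, one per retrieved document
    that contributed to the answer.
    """
    for metadata in source_metadata:
        unexpected = set(metadata) - ALLOWED_METADATA_KEYS
        if unexpected:
            raise ValueError(
                f"blocked: unexpected metadata keys {sorted(unexpected)}"
            )
    return answer


# A response built from clean sources passes through unchanged.
clean = guard_response("Coverage details ...", [{"source": "policy.pdf", "page": 3}])

# A response built from a document carrying a `priority` field is blocked.
try:
    guard_response("Coverage limit is 5,000 Kč", [{"priority": 999}])
    blocked = False
except ValueError:
    blocked = True
```

Raising an exception rather than returning a modified answer keeps the decision explicit for the calling API handler, which can then log the event and return an error to the client.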