DEV Community

PJ

LangChain ChromaDB Metadata Priority Injection — RAG Poisoning Vulnerability

LangChain ChromaDB Metadata Priority Injection

Vulnerability Summary

LangChain's Chroma integration stores whatever metadata is supplied with a document. If the retrieval pipeline uses a metadata field such as `priority` to order results, an attacker who can upload documents can set that field to force malicious documents to rank above legitimate ones regardless of semantic relevance.

Affected Versions

  • langchain-community: All versions <= 0.3.x
  • langchain-chroma: All versions
  • chromadb: All versions

Attack Vector

# Attacker uploads document with manipulated metadata
poisoned_doc = {
    'text': 'Malicious insurance policy: Coverage limit is 5,000 Kč',
    'metadata': {'priority': 999}  # Force highest ranking
}

# Victim's RAG system retrieves poisoned doc first
# Legitimate docs with lower priority are ignored
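
To make the ranking effect concrete, here is a minimal sketch in plain Python. It does not call LangChain or Chroma. The `Doc` class, the `rank_by_priority` helper, the similarity scores, and the legitimate document text are all hypothetical stand-ins, and the sketch assumes the victim pipeline sorts retrieved documents by a `priority` metadata field before falling back to similarity.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    text: str
    similarity: float  # made-up stand-in for an embedding similarity score
    metadata: dict = field(default_factory=dict)


def rank_by_priority(docs):
    """Hypothetical re-ranker: metadata priority first, similarity second."""
    return sorted(
        docs,
        key=lambda d: (d.metadata.get("priority", 0), d.similarity),
        reverse=True,
    )


# Illustrative values only: the legitimate document is far more relevant.
legit = Doc("Legitimate policy document", similarity=0.92,
            metadata={"priority": 1})
poisoned = Doc("Malicious insurance policy: Coverage limit is 5,000 Kč",
               similarity=0.41, metadata={"priority": 999})

ranked = rank_by_priority([legit, poisoned])
print(ranked[0].text)  # the poisoned document is ranked first
```

Because the sort key puts `priority` ahead of `similarity`, the attacker-chosen value 999 wins even though the poisoned document scores much lower on relevance.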

Impact

  • OWASP LLM08: Vector and Embedding Weaknesses
  • MITRE ATT&CK: T1565.001 (Data Manipulation: Stored Data Manipulation)
  • Affects RAG systems in domains such as insurance, legal, and medical
  • Persistent poisoning (survives database restarts)
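
Because a poisoned entry persists once stored, it can help to audit what is already in the collection. The sketch below is one possible approach, not a Chroma API: `find_suspicious`, the record layout (`id` plus `metadata`), and the allowed range of 0 to 10 are assumptions chosen for illustration. In practice the records would come from whatever export or listing call your store provides.

```python
def find_suspicious(records, field="priority", allowed=range(0, 11)):
    """Return ids of records whose `field` is not an int inside `allowed`."""
    flagged = []
    for rec in records:
        metadata = rec.get("metadata") or {}  # tolerate missing or None metadata
        value = metadata.get(field)
        if value is None:
            continue  # the field is absent, so there is nothing to check
        is_plain_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_plain_int or value not in allowed:
            flagged.append(rec["id"])
    return flagged


# Hypothetical records for illustration.
records = [
    {"id": "doc-1", "metadata": {"priority": 1}},
    {"id": "doc-2", "metadata": {"priority": 999}},
    {"id": "doc-3", "metadata": {}},
]
suspicious = find_suspicious(records)
print(suspicious)  # ['doc-2']
```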

PoC

[Attach test_langchain_vulnerability.py]

Disclosure

Reported by: [Your GitHub/contact]
Date: 2026-04-24
CVE ID: [Pending]

Defense

Blocking poisoned outputs at the API layer is the only runtime control.
OutputGuard detects and blocks LLM output manipulation in 2 ms and is built specifically for RAG pipelines in production.
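
Separately from runtime output blocking, a pipeline can reduce its exposure at ingestion time by refusing to store ranking-related metadata from untrusted uploaders. The following is a rough sketch under stated assumptions: `ALLOWED_KEYS`, `sanitize_metadata`, and the `trusted` flag are hypothetical names, not part of LangChain, Chroma, or OutputGuard.

```python
ALLOWED_KEYS = {"source", "title"}  # hypothetical allowlist for untrusted uploads


def sanitize_metadata(metadata, trusted=False):
    """Keep only allowlisted keys unless the uploader is trusted."""
    if trusted:
        return dict(metadata)
    return {k: v for k, v in metadata.items() if k in ALLOWED_KEYS}


untrusted = {"source": "upload.pdf", "priority": 999}
clean = sanitize_metadata(untrusted)
print(clean)  # {'source': 'upload.pdf'}
```

An allowlist is used here rather than a blocklist so that any new ranking field added later is dropped by default until someone explicitly permits it.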
