If you're building a RAG application with document-level permissions, you've probably implemented something like this:
1. User makes a query
2. Retrieve top-k documents from vector DB
3. Filter out documents the user shouldn't see
4. Send remaining docs to LLM
The problem? By step 3, unauthorized documents have already been retrieved. They've hit your retrieval layer, been processed, and potentially logged.
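To make the difference concrete, here is a minimal sketch that uses a toy in-memory corpus instead of a real vector DB. Everything in it (the `Doc` class, the `department` attribute, the made-up similarity scores) is an assumption for illustration and not part of RAGGuard. It compares filtering after top-k retrieval with filtering before it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    department: str  # hypothetical permission attribute
    score: float     # pretend similarity to the query


# Toy corpus standing in for a vector DB (illustrative data only)
CORPUS = [
    Doc("hr-1", "hr", 0.95),
    Doc("hr-2", "hr", 0.90),
    Doc("eng-1", "engineering", 0.85),
    Doc("hr-3", "hr", 0.80),
    Doc("eng-2", "engineering", 0.75),
    Doc("eng-3", "engineering", 0.70),
]


def top_k(docs, k):
    """Return the k highest-scoring documents."""
    return sorted(docs, key=lambda d: d.score, reverse=True)[:k]


def post_filter_search(user_department, k):
    """Retrieve first, then drop what the user may not see."""
    retrieved = top_k(CORPUS, k)  # unauthorized docs are fetched here
    allowed = [d for d in retrieved if d.department == user_department]
    return retrieved, allowed


def pre_filter_search(user_department, k):
    """Restrict the candidate set first, then retrieve."""
    candidates = [d for d in CORPUS if d.department == user_department]
    return top_k(candidates, k)


post_retrieved, post_allowed = post_filter_search("engineering", k=3)
pre_allowed = pre_filter_search("engineering", k=3)

# Documents the post-filter path fetched even though the user cannot see them
leaked = [d.doc_id for d in post_retrieved if d.department != "engineering"]

print("post-filter touched unauthorized:", leaked)
print("post-filter returned:", [d.doc_id for d in post_allowed])
print("pre-filter returned:", [d.doc_id for d in pre_allowed])
```

In this toy run the post-filter path fetches two HR documents and returns a single result to the engineering user, while the pre-filter path returns three authorized results and never touches the HR documents. So post-filtering can also leave you with fewer than k usable documents, on top of the exposure problem.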
Enter RAGGuard
I built RAGGuard to fix this. Instead of post-retrieval filtering, it translates permission policies into native vector database filters. Unauthorized documents are never retrieved in the first place.
from ragguard.langchain import SecureRetriever
retriever = SecureRetriever(
    base_retriever=your_retriever,
    policy=your_policy
)
# Filtered at the DB level - zero unauthorized exposure
docs = retriever.get_relevant_documents(query, user_context=user)
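For intuition about what "translating a policy into a native filter" can look like, here is a rough sketch. It is not RAGGuard's internal code: the policy format, the `policy_to_filter` function, and the field names are all hypothetical. The output is a plain dict modeled on the `should` / `match` JSON filter shape that Qdrant accepts, where at least one `should` clause has to match.

```python
def policy_to_filter(policy, user):
    """Turn allow rules into a DB-side filter dict (hypothetical format).

    Each rule says: a document is visible if its `doc_field` equals one of
    the values the user holds in `user_attr`. Rules are OR-ed together.
    Returns None when no rule applies, meaning "do not query at all".
    """
    clauses = []
    for rule in policy["allow"]:
        values = user.get(rule["user_attr"])
        if values is None:
            continue  # user lacks this attribute, so the rule grants nothing
        if not isinstance(values, list):
            values = [values]
        if values:
            clauses.append(
                {"key": rule["doc_field"], "match": {"any": values}}
            )
    if not clauses:
        return None  # deny by default
    return {"should": clauses}


# Hypothetical policy: same department, or shared with one of the user's groups
policy = {
    "allow": [
        {"doc_field": "department", "user_attr": "department"},
        {"doc_field": "shared_with", "user_attr": "groups"},
    ]
}

alice = {"id": "alice", "department": "engineering", "groups": ["platform", "oncall"]}
alice_filter = policy_to_filter(policy, alice)
print(alice_filter)

# A user with no matching attributes gets no filter, so the caller should skip the search
bob_filter = policy_to_filter(policy, {"id": "bob"})
print(bob_filter)
```

The point of the design is that the filter travels with the query, so the database only ever scores documents the user is allowed to see. Returning `None` for a user with no applicable rule keeps the sketch deny-by-default; the caller is expected to return an empty result instead of running an unfiltered search.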
What it supports
- 14 Vector DBs: Qdrant, ChromaDB, Pinecone, pgvector, Weaviate, Milvus, and more
- Any Auth: OPA, Cerbos, OpenFGA, or custom RBAC
- Frameworks: LangChain, LlamaIndex, LangGraph
Why it matters
- Compliance (HIPAA, SOC2, GDPR)
- Multi-tenant isolation
- Blocks 19/19 tested attack patterns
Get started
pip install ragguard
Open source (Apache 2.0). Feedback welcome!