RAGGuard: Filter During Vector Search, Not After Retrieval

#ai #opensource #python #security

If you're building a RAG application with document-level permissions, you've probably implemented something like this:

1. User makes a query
2. Retrieve top-k documents from vector DB
3. Filter out documents the user shouldn't see
4. Send remaining docs to LLM

The problem? By step 3, unauthorized documents have already left the database. They've passed through your retrieval layer, been loaded into application memory, and possibly been written to logs.
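To make the exposure concrete, here is a minimal sketch of that flow using a toy in-memory corpus in place of a real vector DB. Everything in it is hypothetical (the `Doc` class, the scores, the role names); it only compares filtering after top-k with filtering before it.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    score: float        # stand-in for similarity to the query
    allowed_roles: set  # roles permitted to read this document


# Hypothetical corpus; a real system would score documents by embedding similarity.
CORPUS = [
    Doc("salary-report", 0.95, {"hr"}),
    Doc("eng-handbook", 0.90, {"engineering", "hr"}),
    Doc("board-minutes", 0.85, {"exec"}),
    Doc("onboarding", 0.60, {"engineering", "hr", "exec"}),
]


def top_k(docs, k):
    """Return the k highest-scoring documents."""
    return sorted(docs, key=lambda d: d.score, reverse=True)[:k]


def post_filter_search(role, k):
    """Steps 2 and 3 of the common pattern: retrieve first, filter afterwards."""
    retrieved = top_k(CORPUS, k)  # every document is eligible here
    visible = [d for d in retrieved if role in d.allowed_roles]
    return retrieved, visible


def pre_filter_search(role, k):
    """Apply the permission check as part of the search itself."""
    candidates = [d for d in CORPUS if role in d.allowed_roles]
    return top_k(candidates, k)


retrieved, visible = post_filter_search("engineering", k=2)
leaked = [d.doc_id for d in retrieved if "engineering" not in d.allowed_roles]
post_ids = [d.doc_id for d in visible]
pre_ids = [d.doc_id for d in pre_filter_search("engineering", k=2)]

print("touched but unauthorized:", leaked)
print("post-filter results:", post_ids)
print("pre-filter results:", pre_ids)
```

In this toy run, the post-filter path pulls `salary-report` into the application before discarding it, and it returns only one of the two requested documents. The pre-filter path never touches the unauthorized document and still fills both slots.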

Enter RAGGuard

I built RAGGuard to fix this. Instead of post-retrieval filtering, it translates permission policies into native vector database filters. Unauthorized documents are never retrieved in the first place.
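As a rough illustration of what "translating a policy into a native filter" can mean, the sketch below turns a toy policy into a filter dict shaped like Qdrant's JSON filter syntax (`should`, `key`, `match`). The policy format and the `policy_to_filter` helper are assumptions made up for this example, not RAGGuard's actual policy language or internals.

```python
def policy_to_filter(policy, user):
    """Build a Qdrant-style filter dict from a toy policy and a user context.

    Each allow rule becomes one condition; a document matches if ANY
    condition holds (hence "should").
    """
    rules = policy.get("allow", [])
    if not rules:
        # Fail closed: an empty condition list could otherwise match everything.
        raise ValueError("policy has no allow rules; refusing to build a filter")

    conditions = []
    for rule in rules:
        if "equals" in rule:
            # Compare a document field to a constant.
            match = {"value": rule["equals"]}
        else:
            # Compare a document field to an attribute of the user.
            value = user[rule["equals_user"]]
            match = {"any": value} if isinstance(value, list) else {"value": value}
        conditions.append({"key": rule["field"], "match": match})
    return {"should": conditions}


# Hypothetical policy: public documents, or documents from the user's departments.
policy = {
    "allow": [
        {"field": "visibility", "equals": "public"},
        {"field": "department", "equals_user": "departments"},
    ]
}
user = {"id": "alice", "departments": ["engineering", "platform"]}

db_filter = policy_to_filter(policy, user)
print(db_filter)
```

The resulting dict would be passed along with the query vector, so the database only scores documents that already satisfy the policy. Other vector DBs use different filter syntaxes, so a real translator needs one backend per database.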

from ragguard.langchain import SecureRetriever

# your_retriever and your_policy are placeholders for your existing
# retriever and your permission policy
retriever = SecureRetriever(
    base_retriever=your_retriever,
    policy=your_policy
)

# Filtered at the DB level: unauthorized documents are never returned
docs = retriever.get_relevant_documents(query, user_context=user)

What it supports

14 Vector DBs: Qdrant, ChromaDB, Pinecone, pgvector, Weaviate, Milvus, and more
Authorization: OPA, Cerbos, OpenFGA, or custom RBAC
Frameworks: LangChain, LlamaIndex, LangGraph