DEV Community

Cover image for RAG Firewall: The missing retrieval-time security layer for LLMs (v0.4.1)
Tal
Tal

Posted on

RAG Firewall: The missing retrieval-time security layer for LLMs (v0.4.1)

RAG Firewall is a lightweight, client-side layer that scans retrieved chunks before they reach your LLM. It blocks high-risk inputs (prompt injection, secrets, PII, suspicious URLs/encoding) and can re-rank by trust (recency, provenance, relevance). No SaaS, no data leaves your environment.

What’s new in v0.4.1

  • Config validation (optional): firewall.yaml validated via JSON Schema (uses jsonschema if installed)
  • URL hardening: flags IP literals and punycode domains; still applies allow/deny logic
  • Secrets coverage extended: HuggingFace tokens, Databricks tokens, Slack webhooks, Azure-like patterns, generic secret tokens
  • Tests: added coverage for validation, URL hardening, and secrets patterns

Why retrieval-time (vs output guardrails)

  • Output guardrails act after generation, when risky context may have already influenced the model.
  • Retrieval-time enforcement stops prompt injection, secret/PII leaks, and untrusted URLs before they enter the prompt window.
  • Everything stays local: scanning, policy decisions, and audit trail happen in-process.

How it works (at a glance)

  • Your retriever returns candidate chunks.
  • Scanners detect risks (injection, secrets, PII, URLs, encoded blobs, staleness).
  • Policies decide: allow, deny, or re-rank; reasons are attached to metadata.
  • Denied chunks never reach the LLM; allowed chunks can be re-ordered by trust.

Quickstart (LangChain)

from rag_firewall import Firewall, wrap_retriever

# Load config (client-side; no network calls)
fw = Firewall.from_yaml("firewall.yaml")

# Wrap your existing retriever
safe = wrap_retriever(base_retriever, firewall=fw)

# Use as usual
docs = safe.get_relevant_documents("What is our mission?")
for d in docs:
    print(d.metadata.get("_ragfw"))  # { decision, score, reasons, findings }
Enter fullscreen mode Exit fullscreen mode

Config example (firewall.yaml)

scanners:
  - type: regex_injection
  - type: pii
  - type: secrets
  - type: encoded
  - type: url
    allowlist: ["docs.myco.com"]
    denylist: ["evil.example.com"]
  - type: conflict
    stale_days: 120

policies:
  - name: block_high_sensitivity
    match: { metadata.sensitivity: "high" }
    action: deny

  - name: prefer_recent_versions
    action: rerank
    weight: { recency: 0.6, relevance: 0.4 }
Enter fullscreen mode Exit fullscreen mode

Graph retrieval (beta)

  • Works with graph-based pipelines via a wrapper that sanitizes nodes/edges before prompt assembly.
  • Example: NetworkX adapter with per-label text fields.

Repo and examples

Security and privacy

  • Runs entirely client-side; no data leaves your environment.
  • Denies high-severity secrets/prompt-injection by default (policy-tunable).
  • JSONL audit trail for decisions (local file).

What’s next

  • Policy operators (gt/gte/lt/lte/regex/in) and simulate mode.
  • Threat packs and light compliance mapping.
  • More framework adapters and examples.

Feedback welcome

  • Red-team cases you want covered?
  • Patterns we should add to secrets/URL scanners?
  • Issues/PRs welcome on GitHub.

Top comments (0)