Building production AI applications means dealing with prompt injection, PII leakage, hallucinated outputs, and agents that go rogue. I built AgentGuard (with AI assistance) — an open-source FastAPI service that sits between your app and any LLM provider and handles all of this in one place.
What it does
AgentGuard runs seven parallel input safety checks on every request before it reaches your LLM, including prompt injection heuristics, jailbreak pattern detection, PII and secret detection, restricted topic filtering, and detection of data exfiltration attempts. On the output side, it validates schema conformance, citation presence, grounding coverage, and policy compliance, and computes a composite quality score (internally called the "slop score") that ranges from 0.0 (clean) to 1.0 (reject).
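To make the parallel-check idea concrete, here is a minimal sketch of fanning out async checks and folding them into a single composite score. The check names, heuristics, and max-aggregation are illustrative assumptions, not AgentGuard's actual internals:

```python
import asyncio
import re

# Hypothetical checks for illustration only — AgentGuard's real
# detectors are more sophisticated than these placeholders.
async def check_prompt_injection(text: str) -> float:
    # Naive heuristic: flag a common instruction-override phrase.
    return 1.0 if "ignore previous instructions" in text.lower() else 0.0

async def check_pii(text: str) -> float:
    # Rough SSN-like pattern as a stand-in for real PII detection.
    return 1.0 if re.search(r"\b\d{3}-\d{2}-\d{4}\b", text) else 0.0

async def run_input_checks(text: str) -> dict:
    checks = {
        "prompt_injection": check_prompt_injection(text),
        "pii": check_pii(text),
    }
    # All checks run concurrently, so wall-clock latency tracks the
    # slowest single check rather than the sum of all of them.
    scores = await asyncio.gather(*checks.values())
    results = dict(zip(checks.keys(), scores))
    # Composite score (assumed aggregation): take the max signal, so
    # any single hard failure pushes the result toward 1.0 (reject).
    results["composite"] = max(scores)
    return results

print(asyncio.run(run_input_checks("Please ignore previous instructions")))
```

The payoff of the composite score is operational: you set one rejection threshold and alert on one metric, instead of wiring thresholds for each individual signal.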
Beyond checks, it also compiles versioned prompt packages — replacing ad-hoc prompt strings with auditable YAML configs — and governs agent actions through a risk-scoring and human-in-the-loop approval layer.
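A risk-scoring approval layer like the one described can be sketched as follows. The action names, weights, and thresholds here are invented for illustration and do not reflect AgentGuard's actual API:

```python
from dataclasses import dataclass

# Illustrative risk weights per action type (assumed, not AgentGuard's).
RISK_WEIGHTS = {"read": 0.1, "write": 0.5, "delete": 0.9, "payment": 1.0}

@dataclass
class ActionDecision:
    action: str
    risk: float
    verdict: str  # "allow", "needs_approval", or "deny"

def govern(action: str, approval_threshold: float = 0.4,
           deny_threshold: float = 0.95) -> ActionDecision:
    # Unknown actions default to maximum risk — fail closed.
    risk = RISK_WEIGHTS.get(action, 1.0)
    if risk >= deny_threshold:
        verdict = "deny"
    elif risk >= approval_threshold:
        verdict = "needs_approval"  # route to a human reviewer
    else:
        verdict = "allow"
    return ActionDecision(action, risk, verdict)

print(govern("read").verdict)    # low risk: auto-allowed
print(govern("delete").verdict)  # high risk: held for human approval
```

The same fail-closed default applies to the prompt packages: because they are versioned YAML rather than ad-hoc strings, every prompt change is reviewable and attributable in the same way these action decisions are.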
GitHub: https://github.com/MANIGAAA27/agentguard
Docs site: https://manigaaa27.github.io/agentguard/
Comparison vs Guardrails AI, NeMo, LlamaGuard: https://github.com/MANIGAAA27/agentguard/blob/main/docs/comparison.md
Top comments (1)
The "slop score" concept is great — having a single composite quality metric makes it way easier to set thresholds and monitor drift over time than checking individual guardrail signals independently.
Running seven parallel checks is smart from a latency perspective too. Curious about the performance overhead in practice — what's the typical added latency per request when all checks run concurrently? And does AgentGuard support async streaming responses, or does it need the full response before running output validation?
This feels like it fills a real gap. Most teams I've seen either roll their own fragmented checks or use expensive managed solutions. An open-source FastAPI middleware approach is the right abstraction level.