I published a benchmark note on the Armorer site:
https://armorerlabs.com/blog/armorer-guard-inline-prompt-injection-defense
The thing I care about here is not generic moderation. It is the runtime boundary where an agent is about to turn context into memory, output into storage, or MCP tool arguments into action.
If a guard sits there, latency becomes product latency.
In the default-threshold benchmark, Armorer Guard finished 977 cases at 3.4ms average and 4.3ms p95, with no scanner network calls. The output is structured enough for a runtime decision: suspicious, reasons, confidence, scan id, and sanitized text.
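As an added illustration (not code from this post or from the Armorer repository), the sketch below shows a fail-closed gate that consumes a result carrying the five fields listed above. The dictionary keys, boundary names, per-boundary policy and sample values are assumptions constructed for the example; it does not call Armorer Guard.

```python
from dataclasses import dataclass
from typing import Optional

# Key names assumed from the field list in the post; the real schema may differ.
REQUIRED = {"suspicious": bool, "reasons": list, "confidence": (int, float),
            "scan_id": str, "sanitized_text": str}

# Hypothetical policy: memory and storage may take the sanitized text,
# tool arguments are never rewritten and executed, only blocked.
ON_SUSPICIOUS = {"memory_write": "sanitize", "storage_write": "sanitize",
                 "mcp_tool_args": "block"}


@dataclass(frozen=True)
class Decision:
    action: str               # "allow", "sanitize" or "block"
    text: Optional[str]       # what may cross the boundary; None when blocked
    scan_id: Optional[str]    # ties the decision back to the scan
    reasons: tuple
    confidence: Optional[float]


def gate(boundary: str, original: str, result: object) -> Decision:
    """Turn one guard result into a decision for one runtime boundary."""
    if boundary not in ON_SUSPICIOUS:
        return Decision("block", None, None, ("unknown_boundary",), None)
    # A missing or malformed result blocks: no verdict is not a clean verdict.
    if not isinstance(result, dict) or any(
        not isinstance(result.get(key), kind) for key, kind in REQUIRED.items()
    ):
        return Decision("block", None, None, ("invalid_guard_result",), None)
    meta = (result["scan_id"], tuple(result["reasons"]), float(result["confidence"]))
    if not result["suspicious"]:
        return Decision("allow", original, *meta)
    if ON_SUSPICIOUS[boundary] == "sanitize":
        return Decision("sanitize", result["sanitized_text"], *meta)
    return Decision("block", None, *meta)


# Constructed sample results, not real scanner output.
FLAGGED = {"suspicious": True, "reasons": ["instruction_override"],
           "confidence": 0.97, "scan_id": "scan-constructed-001",
           "sanitized_text": "Summarize the attached notes."}
CLEAN = {"suspicious": False, "reasons": [], "confidence": 0.02,
         "scan_id": "scan-constructed-002", "sanitized_text": "hello"}
```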
The open question I am still working through: what evidence should an agent runtime return after a guard decision?
Repo:
https://github.com/ArmorerLabs/Armorer
Canonical write-up:
https://armorerlabs.com/blog/armorer-guard-inline-prompt-injection-defense