Crucible Security
What the OWASP Agentic AI Top 10 actually means for developers — and how to test for every category

crucible-security / crucible

pytest for AI agents - Autonomous red-teaming, behavioral monitoring & security testing for LLM agents

   ██████╗██████╗ ██╗   ██╗ ██████╗██╗██████╗ ██╗     ███████╗
  ██╔════╝██╔══██╗██║   ██║██╔════╝██║██╔══██╗██║     ██╔════╝
  ██║     ██████╔╝██║   ██║██║     ██║██████╔╝██║     █████╗
  ██║     ██╔══██╗██║   ██║██║     ██║██╔══██╗██║     ██╔══╝
  ╚██████╗██║  ██║╚██████╔╝╚██████╗██║██████╔╝███████╗███████╗
   ╚═════╝╚═╝  ╚═╝ ╚═════╝  ╚═════╝╚═╝╚═════╝ ╚══════╝╚══════╝
  
pytest for AI agents -- test, score, and harden before production


Install

pip install crucible-security

Quick Start

crucible init --target https://my-agent.com/api/chat
crucible scan --target https://my-agent.com/api/chat
crucible report crucible-report.json

One command. 90 attacks. Beautiful report.

Why Crucible?

  • Automated red-teaming -- 90 real attack payloads run in under 60 seconds, not weeks of manual testing
  • OWASP-aligned -- maps every attack to the OWASP Top 10 for LLM Applications and OWASP Agentic Top 10
  • CI/CD native -- crucible scan --output json pipes into any pipeline; fail builds on low grades
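
As a rough illustration of the "fail builds on low grades" idea, the sketch below reads a JSON report and decides whether a build should pass. The report schema is not shown in this post, so the top-level `grade` field, the A to F letter scale, and the `passes_gate` and `exit_code_for` helpers are all assumptions made for illustration, not Crucible's documented format or API.

```python
import json

# Assumption: the report carries a letter grade, ordered best to worst.
# The real Crucible report format may differ.
GRADE_ORDER = ["A", "B", "C", "D", "F"]


def passes_gate(report: dict, minimum: str = "B") -> bool:
    """Return True if the report's (hypothetical) grade is at least `minimum`."""
    # Keep only the leading letter so values like "b" or "A+" still compare.
    grade = str(report.get("grade", "")).strip().upper()[:1]
    if grade not in GRADE_ORDER:
        # Missing or unrecognised grade: fail closed rather than pass silently.
        return False
    return GRADE_ORDER.index(grade) <= GRADE_ORDER.index(minimum)


def exit_code_for(report_path: str, minimum: str = "B") -> int:
    """Map a report file to a process exit code: 0 passes, 1 fails the build."""
    with open(report_path, encoding="utf-8") as fh:
        report = json.load(fh)
    return 0 if passes_gate(report, minimum) else 1
```

In a pipeline step, a wrapper script could call `sys.exit(exit_code_for("crucible-report.json"))` after the scan, so that a grade below the chosen threshold stops the build. Failing closed on an unknown grade is a deliberate choice here: a malformed report should not be treated as a pass.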

Modules

Module            Attacks  Status  OWASP Coverage
Prompt Injection  50       Live    LLM01, LLM07
Goal Hijacking    20       Live    Agentic #1
Jailbreaks        20       Live    LLM01, LLM06
Tool Misuse       --       Coming  Agentic #3
Identity Abuse    --       Coming  Agentic #4
Memory Poisoning
