AI security testing is no longer optional. The EU AI Act deadline is August 2, 2026. OWASP published the Agentic AI Top 10 in December 2025. And the most popular open-source LLM testing tool just got acquired by OpenAI.
We needed a vendor-neutral alternative. So we built Tessera — an open-source framework that runs 42 automated OWASP security tests against any AI model or agent.
The Problem
The AI security tool landscape is fragmented:
- Garak: LLM probes only — no CV, no infrastructure, no data governance, no agentic AI
- Promptfoo: Now OpenAI-owned — not vendor-neutral for testing OpenAI models
- HiddenLayer / Protect AI: Proprietary SaaS — not self-hosted, not extensible
None of them cover the full OWASP attack surface. None of them test agentic AI systems. None of them generate EU AI Act compliance reports.
What Tessera Does
42 automated security tests across 5 OWASP categories:
| Category | Tests | What It Covers |
|---|---|---|
| MOD — Model Security | 7 | Adversarial attacks, poisoning, model inversion, alignment |
| APP — Application Security | 14 | Prompt injection, hallucination, bias, toxic output, extraction |
| INF — Infrastructure | 6 | Supply chain, API security, resource exhaustion, GPU isolation |
| DAT — Data Governance | 5 | PII leakage, consent, right to erasure, data minimization |
| AGT — Agentic AI Security | 10 | Goal hijacking, tool misuse, rogue agents, cascading failures |
Every test follows a 3-phase methodology: Attack (simulate the threat) → Measure (quantify with threshold-based scoring) → Defend (validate mitigations).
The Agentic AI Tests
This is where it gets interesting. The OWASP Top 10 for Agentic Applications (ASI 2026) defines 10 risks specific to AI agents — systems that use tools, make decisions, and operate autonomously. Nobody had a complete implementation. Until now.
| Test | What It Does |
|---|---|
| AGT-01 Agent Supply Chain | Tests for malicious tool injection and dependency tampering |
| AGT-02 Tool Misuse | Unauthorized tool invocation and parameter manipulation |
| AGT-03 Goal Hijacking | Objective manipulation and task redirection attacks |
| AGT-04 Memory Poisoning | Context window injection and state manipulation |
| AGT-05 Identity & Privilege Abuse | Identity spoofing and privilege escalation |
| AGT-06 Code Execution | Code injection and sandbox escape attempts |
| AGT-07 Inter-Agent Comms | Message tampering and replay attacks |
| AGT-08 Cascading Failures | Error amplification and retry storms |
| AGT-09 Trust Exploitation | False urgency and authority impersonation |
| AGT-10 Rogue Agents | Covert goals and self-replication detection |
Quick Start
pip install tessera-ai
tessera --init
The --init wizard auto-detects your AI providers (OpenAI, Anthropic, Ollama, vLLM) and gets you scanning in under 60 seconds.
Scan an MCP Server
tessera --scan-mcp https://your-mcp-server.com/v1 --api-key $KEY
Generate EU AI Act Compliance Report
tessera --config config.yaml --format compliance
Maps all 42 tests to specific EU AI Act articles (9, 10, 13, 14, 15).
Benchmark Results
We tested the top 5 AI models against all applicable OWASP tests:
| Model | Score | PASS | WARN | FAIL |
|---|---|---|---|---|
| Anthropic Sonnet 3.5 | 100% | 15 | 0 | 0 |
| GPT-4o | 87% | 11 | 4 | 0 |
| Gemini 1.5 Pro | 87% | 11 | 4 | 0 |
| Mistral Large | 73% | 8 | 7 | 0 |
| Llama 3 70B | 40% | 4 | 8 | 3 |
Architecture
Tessera is more than a CLI tool. It's a full platform:
-
CLI: Zero infrastructure,
pip installand go - API Server: FastAPI with WebSocket scan progress
- Web Dashboard: React 18 + TypeScript + TailwindCSS
- Workers: Celery + Redis for async scans
- Database: PostgreSQL with Alembic migrations
- Kubernetes: Helm chart with HPA
- 14 Connectors: OpenAI, Anthropic, Google, Ollama, vLLM, AWS Bedrock, Azure, HuggingFace, MCP, and more
Why Open Source Matters Here
If you're auditing OpenAI models with an OpenAI-owned tool, that's not independent security testing. AI security testing needs to be:
- Vendor-neutral — not owned by a model provider
- Self-hosted — your security data stays on your infrastructure
- Extensible — you can add tests for your specific use case
- Transparent — you can audit the testing methodology itself
Tessera is Apache 2.0. No call-home. No vendor lock-in. No DRM.
What's Next
- SARIF output for GitHub/GitLab Security tab integration
- RAG pipeline testing (retriever poisoning, context window attacks)
- Multimodal model support
- Plugin architecture for community-contributed tests
Try It
pip install tessera-ai
tessera --init
GitHub: github.com/tessera-ops/tessera
PyPI: pypi.org/project/tessera-ai
Star the repo if this is useful. We're building the vendor-neutral standard for AI security testing.
Top comments (0)