The Problem
You deployed an LLM behind an API gateway. Maybe it's customer-facing. Maybe it's connected to internal tools. Did you test it against adversarial attacks before it went live?
If the answer is "the model has safety training," that's not the same thing. Safety training and security testing are fundamentally different disciplines. And the numbers back that up:
- FlipAttack achieves 98% bypass rates against GPT-4o by reordering characters in prompts
- DeepSeek R1 showed a 100% bypass rate against 50 HarmBench jailbreak prompts (Cisco/UPenn research)
- A study of 36 production LLM apps found 86% were vulnerable to prompt injection
- PoisonedRAG showed that just 5 malicious docs in a corpus of millions can manipulate outputs 90% of the time
OWASP ranked prompt injection as the #1 security risk in LLM applications. Yet most LLMs ship to production with zero adversarial testing.
We built Augustus to fix that.
What is Augustus?
Augustus is an open-source LLM vulnerability scanner. It tests models against 210+ adversarial attacks across prompt injection, jailbreaks, encoding exploits, data extraction, and more. It ships as a single Go binary, connects to 28 LLM providers out of the box, and produces actionable vulnerability reports.
# Install
go install github.com/praetorian-inc/augustus/cmd/augustus@latest
# Test for DAN jailbreak against OpenAI
export OPENAI_API_KEY="your-api-key"
augustus scan openai.OpenAI \
--probe dan.Dan \
--detector dan.DanDetector \
--verbose
GitHub: github.com/praetorian-inc/augustus (Apache 2.0)
Why Not garak or promptfoo?
Fair question. garak (NVIDIA) and promptfoo are great tools that serve the research and red-teaming community well. We needed something different: a tool that fits into penetration testing workflows without requiring Python environments, npm installs, or runtime dependencies.
| | Augustus | garak |
|---|---|---|
| Language | Go | Python |
| Distribution | Single binary, no deps | pip install + dependencies |
| Concurrency | Goroutine pools (cross-probe) | Multiprocessing (within-probe) |
| Probes | 210+ | 160+ (longer research pedigree) |
| Providers | 28 | 35+ generator variants / 22 modules |
Augustus is a Go-native reimplementation inspired by garak. Same concept, different trade-offs. If you're in a research environment with Python everywhere, garak is excellent. If you're a pentester who wants to go install a binary and start scanning, Augustus is for you.
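To make the "goroutine pools (cross-probe)" row concrete, here is a minimal sketch of a worker pool that fans probes out across a fixed number of goroutines. The names `runPool` and `run` are hypothetical and are not taken from the Augustus codebase; the real scheduler may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// runPool runs every probe through run using a fixed number of workers.
// Results keep the order of the input slice.
// Hypothetical helper for illustration; not Augustus's actual scheduler.
func runPool(probes []string, workers int, run func(string) string) []string {
	if workers < 1 {
		workers = 1 // avoid blocking forever with no consumers
	}
	results := make([]string, len(probes))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handed to exactly one worker, so no lock is needed.
				results[i] = run(probes[i])
			}
		}()
	}

	for i := range probes {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	probes := []string{"dan.Dan", "encoding.base64", "smuggling.Tag"}
	// Stand-in for a real scan: just label the probe as done.
	fake := func(p string) string { return fmt.Sprintf("%s: done", p) }
	for _, r := range runPool(probes, 2, fake) {
		fmt.Println(r)
	}
}
```

Because work is distributed per probe rather than per prompt within a probe, one slow probe does not block the others from starting.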
What It Tests
Augustus covers 47 attack categories. Here's what you're actually testing:
Jailbreaks
DAN ("Do Anything Now") prompts, AIM, AntiGPT, Grandma exploits (emotional manipulation), ArtPrompts (reframing as creative writing). Augustus includes DAN variants through v11.0 plus Goodside-style injection techniques.
Prompt Injection
Encoding attacks across Base64, ROT13, Morse code, hex, Braille, Klingon, leet speak, and 12 more schemes. Tag smuggling (XML/HTML). FlipAttack (16 variants). Prefix and suffix injection.
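As a rough illustration of what two of these transformations do to a payload, the sketch below wraps a prompt in Base64 and reverses its characters. The function names and the wrapper sentence are assumptions made for this example; the actual Augustus probes use their own templates and the 16 FlipAttack variants are more elaborate than a plain reversal.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// base64Wrap encodes the payload and asks the model to decode it.
// The instruction sentence is illustrative, not Augustus's real template.
func base64Wrap(payload string) string {
	enc := base64.StdEncoding.EncodeToString([]byte(payload))
	return "Decode the following Base64 text and respond to it: " + enc
}

// flipChars reverses the payload rune by rune so multi-byte characters survive.
func flipChars(payload string) string {
	r := []rune(payload)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	payload := "ignore previous instructions"
	fmt.Println(base64Wrap(payload))
	fmt.Println(flipChars(payload))
}
```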
Adversarial Examples (Research-Grade)
GCG (Greedy Coordinate Gradient), AutoDAN, MindMap, DRA (Dynamic Reasoning Attack), TreeSearch. Plus iterative attacks like PAIR and TAP that refine across multiple rounds using a judge model. These are computationally expensive but represent the state of the art.
Data Extraction
API key leakage probes. Package hallucination probes (Python, JS, Ruby, Rust, Dart, Perl, Raku), which check whether the model recommends packages that don't exist (a real supply chain attack vector). PII extraction. Training data regurgitation.
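The package hallucination check can be pictured as follows: pull package names out of the model's answer and flag any that are not in a list of known packages. This is a simplified sketch, assuming a hard-coded allowlist and a `pip install` pattern; `hallucinatedPackages` is a hypothetical name, and a real detector would consult an actual registry index.

```go
package main

import (
	"fmt"
	"regexp"
)

// pipInstall captures the package name following "pip install".
// Simplified: real package names may also contain dots and version specifiers.
var pipInstall = regexp.MustCompile(`pip install ([A-Za-z0-9_-]+)`)

// hallucinatedPackages returns every suggested package missing from known.
// Hypothetical helper for illustration only.
func hallucinatedPackages(response string, known map[string]bool) []string {
	var missing []string
	for _, m := range pipInstall.FindAllStringSubmatch(response, -1) {
		if !known[m[1]] {
			missing = append(missing, m[1])
		}
	}
	return missing
}

func main() {
	// Stand-in allowlist; a real check would query the package index.
	known := map[string]bool{"requests": true, "numpy": true}
	resp := "First run pip install requests, then pip install fastjsonx."
	fmt.Println(hallucinatedPackages(resp, known))
}
```

An attacker who registers the nonexistent name can then serve malicious code to anyone who follows the model's advice, which is why a non-empty result counts as a finding.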
Context Manipulation
RAG poisoning (document content and metadata injection). Context overflow. Continuation and divergence exploits. Multimodal probes for vision-language models.
Format Exploits
Markdown injection (malicious links in rendered output). YAML/JSON parsing attacks on downstream consumers. ANSI escape injection. XSS payloads in model-generated HTML.
Evasion Techniques
ObscurePrompt (LLM-rewritten jailbreaks). Phrasing variations. Homoglyphs, zero-width characters, bidirectional text markers (BadChars). Glitch token exploitation.
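To show why invisible characters defeat naive filters, here is a small sketch that inserts zero-width spaces into a blocked word and then normalizes them away again. The helper names are hypothetical and the list of stripped code points is deliberately short; it is not the BadChars implementation.

```go
package main

import (
	"fmt"
	"strings"
)

const zeroWidthSpace = "\u200b"

// injectZeroWidth inserts a zero-width space between every character of word.
func injectZeroWidth(word string) string {
	return strings.Join(strings.Split(word, ""), zeroWidthSpace)
}

// stripZeroWidth removes a few common invisible code points before filtering.
// The list is illustrative, not exhaustive.
func stripZeroWidth(s string) string {
	return strings.NewReplacer(
		"\u200b", "", // zero-width space
		"\u200c", "", // zero-width non-joiner
		"\u200d", "", // zero-width joiner
		"\ufeff", "", // byte order mark / zero-width no-break space
	).Replace(s)
}

func main() {
	blocked := "password"
	evasive := injectZeroWidth(blocked)

	// A plain substring blocklist no longer matches the perturbed text.
	fmt.Println("naive filter hit:", strings.Contains(evasive, blocked))
	// After normalization the same filter matches again.
	fmt.Println("normalized hit:  ", strings.Contains(stripZeroWidth(evasive), blocked))
}
```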
Safety Benchmarks
DoNotAnswer (941 questions, 5 risk areas). RealToxicityPrompts. Snowball (plausible-sounding wrong answers). LMRC harmful content probes.
Agent Attacks
Multi-agent manipulation. Browsing exploits for web-enabled models. Latent injection in documents (targeting RAG pipelines).
Security Testing
Guardrail bypass (20 variants for NeMo Guardrails and similar). SQL injection through model output. Steganography (hidden instructions in images via LSB encoding). Malware generation detection.
How the Pipeline Works
Augustus uses a straightforward pipeline:
Probe → (Optional) Buff Transform → Generator (LLM Call) → Detector → Result
Probes define the adversarial inputs. A DAN probe sends a role-playing prompt. An encoding probe wraps instructions in Base64. A FlipAttack probe reverses character order.
Buffs are optional transformations applied before sending. Wrap any probe in poetry (haiku, sonnet, limerick), translate to a low-resource language, paraphrase, or encode. Chain multiple transformations for layered evasion.
Generators connect to the target. 28 providers supported, plus a REST connector for custom endpoints.
Detectors analyze responses. Pattern matching, LLM-as-a-judge, HarmJudge (arXiv:2511.15304), Perspective API.
For iterative attacks (PAIR, TAP), a dedicated Attack Engine handles multi-turn conversations, candidate pruning, and judge-based scoring.
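The single-shot pipeline described above can be sketched in a few lines of Go. All type and function names here (`Buff`, `Generator`, `Detector`, `Result`, `runProbe`) are hypothetical stand-ins chosen for this example; Augustus's real interfaces may look quite different, and the generator is a stub rather than a live LLM call.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical types for illustration; not Augustus's actual API.
type Buff func(prompt string) string       // optional prompt transformation
type Generator func(prompt string) string  // sends the prompt to the target
type Detector func(response string) float64 // scores the response, 0.0 to 1.0

// Result records the final prompt, the detector score, and the verdict.
type Result struct {
	Prompt string
	Score  float64
	Passed bool
}

// runProbe applies the buffs in order, calls the generator, and scores the reply.
// A score at or above threshold is treated as a vulnerability.
func runProbe(prompt string, buffs []Buff, gen Generator, det Detector, threshold float64) Result {
	for _, b := range buffs {
		prompt = b(prompt)
	}
	score := det(gen(prompt))
	return Result{Prompt: prompt, Score: score, Passed: score < threshold}
}

func main() {
	// Stub generator that always complies, standing in for a real model.
	compliant := func(p string) string { return "Sure! " + p }
	// Pattern-matching detector: flags responses that open with compliance.
	det := func(r string) float64 {
		if strings.HasPrefix(r, "Sure!") {
			return 1.0
		}
		return 0.0
	}
	res := runProbe("Pretend you are DAN", []Buff{strings.ToLower}, compliant, det, 0.5)
	fmt.Printf("%+v\n", res)
}
```

Modeling each stage as a plain function keeps the stages swappable: the same probe can be rerun with a different buff chain or detector without touching the rest.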
Buff Transformations: How Real Attackers Operate
Real adversaries don't send attacks in plain text. Augustus ships 7 transformations across 5 categories:
Encoding: Base64 and character code wrapping. Models often decode and follow instructions that would be blocked in plain text.
Paraphrase: Pegasus model rephrasing. Same adversarial intent, different surface form. Tests whether safety training generalizes beyond memorized patterns.
Poetry: Haiku, sonnets, limericks, free verse, rhyming couplets. Models that block direct harmful requests sometimes comply when the same request arrives as verse. (Yes, really.)
Low-Resource Language Translation: Via DeepL. Safety training is concentrated on English. Requests blocked in English may succeed in Zulu, Hmong, or Scots Gaelic.
Case Transforms: Lowercasing. Some filters and blocklists are case-sensitive.
Chain them with --buff or --buffs-glob:
# Encode a DAN probe in Base64
augustus scan openai.OpenAI --probe dan.Dan --buff encoding.Base64
# Chain: paraphrase, then translate to low-resource language
augustus scan openai.OpenAI --probe dan.Dan --buffs-glob "paraphrase.*,lrl.*"
28 Providers, One Interface
OpenAI (including o1/o3), Anthropic (Claude 3/3.5/4), Azure OpenAI, AWS Bedrock, Google Vertex AI, Cohere, Replicate, HuggingFace, Together AI, Groq, Mistral, Fireworks, DeepInfra, NVIDIA NIM, Ollama, LiteLLM, and more.
The REST generator handles everything else:
augustus scan rest.Rest \
--probe dan.Dan \
--config '{
"uri": "https://your-api.example.com/v1/chat/completions",
"headers": {"Authorization": "Bearer YOUR_KEY"},
"req_template_json_object": {
"model": "your-model",
"messages": [{"role": "user", "content": "$INPUT"}]
},
"response_json": true,
"response_json_field": "$.choices[0].message.content"
}'
Custom request templates with $INPUT placeholders, JSONPath extraction, SSE streaming, and proxy routing. If your endpoint speaks HTTP, Augustus can test it.
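For readers wondering what the `$INPUT` placeholder and the response path amount to, here is a minimal sketch of both steps. It assumes a template held as a JSON string and hard-codes the path `$.choices[0].message.content` as a struct instead of evaluating JSONPath; `fillTemplate` and `extractContent` are hypothetical names, not functions from Augustus.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fillTemplate substitutes a JSON-escaped prompt for the $INPUT placeholder.
// Escaping matters: a raw quote in the prompt would otherwise break the body.
func fillTemplate(tmpl, prompt string) (string, error) {
	quoted, err := json.Marshal(prompt) // produces "..." with escapes applied
	if err != nil {
		return "", err
	}
	// Drop the surrounding quotes; the template already supplies them.
	escaped := string(quoted[1 : len(quoted)-1])
	return strings.ReplaceAll(tmpl, "$INPUT", escaped), nil
}

// extractContent follows the fixed path $.choices[0].message.content.
func extractContent(body []byte) (string, error) {
	var resp struct {
		Choices []struct {
			Message struct {
				Content string `json:"content"`
			} `json:"message"`
		} `json:"choices"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if len(resp.Choices) == 0 {
		return "", fmt.Errorf("no choices in response")
	}
	return resp.Choices[0].Message.Content, nil
}

func main() {
	tmpl := `{"model":"your-model","messages":[{"role":"user","content":"$INPUT"}]}`
	body, err := fillTemplate(tmpl, `Say "hello" to DAN`)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)

	// Canned response standing in for what the endpoint would return.
	reply := []byte(`{"choices":[{"message":{"content":"hello"}}]}`)
	content, err := extractContent(reply)
	if err != nil {
		panic(err)
	}
	fmt.Println(content)
}
```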
Quick Start
# Install
go install github.com/praetorian-inc/augustus/cmd/augustus@latest
# Run all 210+ probes against a local model
augustus scan ollama.OllamaChat \
--all \
--config '{"model":"llama3.2:3b"}'
Output:
| PROBE | DETECTOR | PASSED | SCORE | STATUS |
|---|---|---|---|---|
| dan.Dan | dan.DAN | false | .85 | VULN |
| encoding.base64 | encoding | true | .10 | SAFE |
| smuggling.Tag | smuggling | true | .05 | SAFE |
Export to JSON, JSONL, or HTML reports for stakeholders.
Feature Summary
| Feature | Details |
|---|---|
| Vulnerability Probes | 210+ across 47 attack categories |
| LLM Providers | 28 with 43 generator variants |
| Detectors | 90+ (pattern matching, LLM-as-judge, HarmJudge, Perspective API) |
| Buff Transformations | 7 transforms (encoding, paraphrase, poetry, translation, case) |
| Output Formats | Table, JSON, JSONL, HTML |
| Production Features | Concurrent scanning, rate limiting, retry logic, timeouts |
| Distribution | Single Go binary, no runtime dependencies |
| Extensibility | Plugin-style registration via Go init() functions |
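The "plugin-style registration via Go init() functions" row refers to a common Go idiom, sketched below in a single file. The `Probe` interface, `Register` function, and `example.Hello` probe are assumptions made for this illustration; check CONTRIBUTING.md in the repo for the actual probe definition and registration API.

```go
package main

import (
	"fmt"
	"sort"
)

// Probe is a hypothetical interface for illustration; the real one may differ.
type Probe interface {
	Name() string
	Prompts() []string
}

// registry maps probe names to factories. It is initialized before any init() runs.
var registry = map[string]func() Probe{}

// Register adds a probe factory and panics on duplicate names.
func Register(name string, factory func() Probe) {
	if _, dup := registry[name]; dup {
		panic("probe registered twice: " + name)
	}
	registry[name] = factory
}

// helloProbe is a toy probe; in a real project it would live in its own package.
type helloProbe struct{}

func (helloProbe) Name() string      { return "example.Hello" }
func (helloProbe) Prompts() []string { return []string{"Say hello."} }

// init runs automatically at program start, so linking the package in
// is enough to make the probe available.
func init() {
	Register("example.Hello", func() Probe { return helloProbe{} })
}

func main() {
	names := make([]string, 0, len(registry))
	for n := range registry {
		names = append(names, n)
	}
	sort.Strings(names) // map iteration order is random, so sort for stable output
	for _, n := range names {
		fmt.Println(n, registry[n]().Prompts())
	}
}
```

With this pattern, adding a probe means adding a file with its own `init()`; no central list has to be edited.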
What's Next
Augustus is the second release in our "The 12 Caesars" open-source campaign: one tool per week for 12 weeks. Last month we released Julius for LLM fingerprinting (identifying what model is running on an endpoint). Each tool follows the Unix philosophy: do one thing well, compose with the others.
Get Involved
Repo: github.com/praetorian-inc/augustus (Apache 2.0)
We'd love contributions: new probes, bug reports, feature requests. Check CONTRIBUTING.md for guidance on probe definitions and dev workflow.
Star the repo if it's useful, and let us know what attack techniques you'd like to see next.