AI-powered cybersecurity tooling is evolving rapidly. Traditional scanners can detect vulnerabilities, but they often struggle with authentication, exploit chaining, contextual reasoning, and multi-step attack paths.
At the same time, large language models are becoming capable of orchestrating tools, understanding web applications, interacting with APIs, navigating browsers, and coordinating complex workflows.
This is where Pentest-AI (ptai) enters the picture.
Developed by 0xSteph, Pentest-AI is an open-source autonomous penetration testing framework that combines:
- AI agents
- MCP (Model Context Protocol)
- LLM orchestration
- Traditional security tools
- Curated vulnerability probes
- Automated exploit validation
- CI/CD security workflows
Unlike traditional scanners that rely entirely on signatures or templates, Pentest-AI attempts to behave more like a human security operator.
In this detailed guide, we will cover:
- What Pentest-AI is
- How it works
- Architecture and agents
- MCP integrations
- Installation methods
- Running scans without API keys
- Claude Code integration
- Tool ecosystem
- Playbooks
- Benchmarks
- CI/CD integrations
- Local LLM usage with Ollama
- Security considerations
- Real-world use cases
- Limitations
- Comparisons with other tools
What is Pentest-AI?
Pentest-AI is an AI-native penetration testing framework designed to automate offensive security workflows using LLMs and real security tooling.
The project combines:
- Deterministic vulnerability probes
- AI reasoning loops
- Wrapped CLI security tools
- MCP-compatible interfaces
- Reporting and attack-chain correlation
The framework can:
- Enumerate targets
- Authenticate into applications
- Run web security scans
- Test APIs
- Execute external security tools
- Correlate findings
- Validate vulnerabilities with PoCs
- Generate detection rules
- Produce reports automatically
The project is available on GitHub:
Why Pentest-AI is Different
Most security scanners operate in a very linear way.
Typical workflow:
- Crawl target
- Run signatures
- Generate alerts
- Produce findings
The problem is that modern applications are far more dynamic.
Today’s applications include:
- SPAs (Single Page Applications)
- APIs
- JWT authentication
- OAuth flows
- Browser-side rendering
- Dynamic JavaScript routing
- Complex state management
- Cloud-native infrastructure
Traditional scanners often fail because:
- They cannot maintain authenticated sessions properly
- They struggle with dynamic JavaScript applications
- They cannot reason about attack chains
- They generate noisy false positives
- They cannot validate vulnerabilities safely
Pentest-AI attempts to solve this using AI orchestration.
The LLM does not directly detect vulnerabilities.
Instead, the LLM:
- coordinates workflows
- chooses tools
- interprets outputs
- correlates findings
- plans next actions
- builds attack chains
Meanwhile, the actual vulnerability detection comes from:
- deterministic probes
- wrapped security tools
- curated exploit logic
- reproducible validation checks
This separation is important.
The project itself explicitly states:
“The LLM coordinates. The probes detect.”
That is one of the strongest architectural decisions in the project.
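To make the split concrete, here is a minimal sketch of the idea, not Pentest-AI's actual code. The names `missing_csp_probe`, `run_engagement`, and `stub_planner` are hypothetical, and the planner is a plain function standing in for an LLM call. The point is that the planner only chooses which probe runs next, while the verdict always comes from deterministic code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Finding:
    probe: str
    detail: str


def missing_csp_probe(headers: Dict[str, str]) -> Optional[Finding]:
    """Deterministic probe: the same headers always give the same verdict."""
    names = {name.lower() for name in headers}
    if "content-security-policy" not in names:
        return Finding("missing_csp", "No Content-Security-Policy header")
    return None


# Hypothetical registry of probes the planner may choose from.
PROBES: Dict[str, Callable[[Dict[str, str]], Optional[Finding]]] = {
    "missing_csp": missing_csp_probe,
}


def run_engagement(planner: Callable[[List[str]], Optional[str]],
                   headers: Dict[str, str]) -> List[Finding]:
    """The planner picks probe names; the probes decide what is a finding."""
    findings: List[Finding] = []
    done: List[str] = []
    while True:
        choice = planner(done)
        # Stop on "no more work", an unknown probe, or a repeated probe.
        if choice is None or choice not in PROBES or choice in done:
            break
        done.append(choice)
        result = PROBES[choice](headers)
        if result is not None:
            findings.append(result)
    return findings


def stub_planner(done: List[str]) -> Optional[str]:
    """Stand-in for an LLM call: propose the first probe not yet run."""
    remaining = [name for name in PROBES if name not in done]
    return remaining[0] if remaining else None


found = run_engagement(stub_planner, {"Server": "nginx"})
print([f.probe for f in found])
```

Because the planner never produces a finding itself, a hallucinated answer can at worst pick a poor next step; it cannot invent a vulnerability.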
Core Features of Pentest-AI
1. Autonomous AI Agents
Pentest-AI includes multiple specialized agents.
Each agent focuses on a different domain of offensive security.
| Agent | Purpose |
|---|---|
| recon | Enumeration and discovery |
| web | Web application testing |
| api_security | API security assessment |
| browser | Browser and DOM analysis |
| ad | Active Directory testing |
| cloud | Cloud security analysis |
| credential_tester | Credential attacks |
| vuln_scanner | Vulnerability aggregation |
| exploit_chain | Multi-step attack chains |
| poc_validator | Proof-of-concept validation |
| detection | Sigma/SPL/KQL rule generation |
| report | Report generation |
| llm_redteam | LLM security testing |
| mobile | Mobile application testing |
| wireless | Wireless reconnaissance |
This allows Pentest-AI to behave more like a coordinated red-team workflow instead of a simple scanner.
2. MCP (Model Context Protocol) Support
One of the most important features of Pentest-AI is MCP support.
MCP allows AI assistants to directly invoke external tools.
This means AI coding assistants such as:
- Claude Code
- Cursor
- Codex
- Claude Desktop
can directly operate security tooling through natural language.
For example:
“Run an authenticated SQL injection assessment against the login flow.”
The assistant can then:
- choose the correct tools
- execute scans
- analyze outputs
- continue testing automatically
- generate reports
without manually typing commands.
This is a major shift in offensive security workflows.
Instead of:
Human → Tool → Result
You now get:
Human → AI Operator → Multiple Tools → Reasoning Loop
This represents the emerging category of:
- AI-native AppSec
- Agentic cybersecurity
- Autonomous red teaming
- AI-assisted offensive tooling
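Under the hood, an MCP client asks a server to run a tool by sending a JSON-RPC 2.0 message with the method `tools/call`. The sketch below only builds such a message. The tool name `run_scan` and its arguments are hypothetical and are not taken from Pentest-AI's real tool list.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "run_scan" and its arguments are hypothetical, for illustration only.
request = build_tool_call(1, "run_scan", {"target": "http://localhost:3000"})
print(request)
```

The assistant decides which tool to call and with what arguments; the MCP server executes it and returns the result for the next reasoning step.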
3. Wrapped Security Tools
Pentest-AI wraps over 200 security tools.
Examples include:
| Category | Tools |
|---|---|
| Recon | nmap, masscan |
| Fuzzing | ffuf, gobuster |
| Injection | sqlmap, dalfox |
| CMS Testing | wpscan |
| Password Attacks | hydra, hashcat |
| AD Testing | bloodhound, impacket |
| Cloud | prowler, trivy |
| Secrets | gitleaks, trufflehog |
Instead of reinventing the wheel, Pentest-AI orchestrates existing tooling intelligently.
This is one of the reasons the framework is gaining attention.
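Wrapping a CLI tool usually means locating the binary, running it with a timeout, and returning its output in a uniform shape that an orchestrator can interpret. The following is a rough sketch of that pattern, not the framework's own wrapper; `ToolResult` and `run_tool` are hypothetical names, and the Python interpreter stands in for a real scanner so the example runs anywhere.

```python
import shutil
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class ToolResult:
    tool: str
    available: bool
    exit_code: int
    stdout: str


def run_tool(binary: str, args: List[str], timeout: float = 60.0) -> ToolResult:
    """Run an external CLI tool and capture its output in a uniform shape."""
    path = shutil.which(binary)
    if path is None:
        # Tool is not installed or not on PATH.
        return ToolResult(binary, False, -1, "")
    try:
        proc = subprocess.run([path, *args], capture_output=True,
                              text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Tool exists but did not finish in time.
        return ToolResult(binary, True, -1, "")
    return ToolResult(binary, True, proc.returncode, proc.stdout)


# The Python interpreter stands in for a real scanner here.
result = run_tool(sys.executable, ["-c", "print('ok')"])
print(result.available, result.exit_code, result.stdout.strip())
```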
4. Authenticated Scanning
Most scanners lose coverage once they reach a login page.
Pentest-AI supports:
- session handling
- authentication workflows
- credential reuse
- cookie management
- session refresh
This is critical because most modern applications hide important attack surfaces behind authentication.
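Session refresh is the part that most often breaks long scans, so here is a small sketch of one way to handle it. It assumes a bearer-token login that reports a lifetime in seconds; `AuthSession` and `fake_login` are hypothetical names, and the fake clock exists only so the example is deterministic.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# A login callable returns (token, lifetime_in_seconds).
LoginFn = Callable[[], Tuple[str, float]]


class AuthSession:
    """Keeps a token fresh so later requests stay authenticated."""

    def __init__(self, login: LoginFn,
                 clock: Callable[[], float] = time.monotonic,
                 refresh_margin: float = 30.0) -> None:
        self._login = login
        self._clock = clock
        self._margin = refresh_margin
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def headers(self) -> Dict[str, str]:
        # Log in again when there is no token or it is about to expire.
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, lifetime = self._login()
            self._expires_at = self._clock() + lifetime
        return {"Authorization": f"Bearer {self._token}"}


# Demo with a fake clock and a fake login (both hypothetical stand-ins).
now = [0.0]
issued: List[str] = []


def fake_login() -> Tuple[str, float]:
    issued.append(f"token-{len(issued) + 1}")
    return issued[-1], 100.0


session = AuthSession(fake_login, clock=lambda: now[0])
first = session.headers()   # logs in at t=0, token valid until t=100
now[0] = 75.0               # inside the 30-second refresh margin
second = session.headers()  # triggers a second login
```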
5. Exploit Chaining
Traditional scanners generally detect isolated vulnerabilities.
Pentest-AI attempts to correlate findings into attack paths.
Example:
weak authentication → SQL injection → admin access → SSRF → cloud credential exposure
This capability is extremely valuable during real-world engagements.
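One simple way to think about chaining is as a path search over a graph of findings. The sketch below uses breadth-first search over the example chain above. It is an illustration of the concept only; how Pentest-AI correlates findings internally is not described here, and `EDGES` and `find_chain` are hypothetical names.

```python
from collections import deque
from typing import Dict, List, Optional

# Each edge reads: exploiting the key can lead to any of the values.
EDGES: Dict[str, List[str]] = {
    "weak authentication": ["SQL injection"],
    "SQL injection": ["admin access"],
    "admin access": ["SSRF"],
    "SSRF": ["cloud credential exposure"],
}


def find_chain(edges: Dict[str, List[str]], start: str,
               goal: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest path from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


chain = find_chain(EDGES, "weak authentication", "cloud credential exposure")
print(chain)
```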
6. PoC Validation
One of the biggest problems in AppSec is false positives.
Pentest-AI attempts to validate findings using non-destructive proofs-of-concept.
This helps:
- reduce noise
- improve trust
- speed up triage
- provide reproducible evidence
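As an illustration of what a non-destructive check can look like, the sketch below validates a suspected reflection issue by sending a random, inert marker and confirming it comes back unencoded. This is a generic technique, not Pentest-AI's validator; `validate_reflection`, `vulnerable_page`, and `safe_page` are hypothetical, and the two page functions simulate server responses so no network is needed.

```python
import html
import secrets
from typing import Callable


def validate_reflection(fetch: Callable[[str], str]) -> bool:
    """Confirm a suspected reflection finding without a harmful payload.

    A random marker wrapped in angle brackets is sent; the finding counts
    as validated only if the marker is returned without HTML encoding.
    """
    marker = f"<probe{secrets.token_hex(4)}>"
    body = fetch(marker)
    return marker in body


def vulnerable_page(value: str) -> str:
    """Simulated endpoint that echoes input unescaped."""
    return f"<p>You searched for {value}</p>"


def safe_page(value: str) -> str:
    """Simulated endpoint that HTML-escapes input before echoing it."""
    return f"<p>You searched for {html.escape(value)}</p>"


print(validate_reflection(vulnerable_page), validate_reflection(safe_page))
```

The marker carries no script, so the check produces evidence without executing anything in a browser or changing server state.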
7. CI/CD Integration
The framework integrates directly into:
- GitHub Actions
- GitLab CI
- Jenkins
It supports:
- SARIF output
- severity gating
- PR comments
- pipeline failures on critical findings
This makes it useful for DevSecOps workflows.
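Severity gating on SARIF output can be sketched in a few lines. The example below reads the standard `runs[].results[].level` fields and reports which rules would fail a pipeline. It is a generic sketch that assumes nothing about Pentest-AI's own gating flags; `failing_results` and the sample document are hypothetical.

```python
import json
from typing import Any, Dict, List

# SARIF result levels, from least to most severe.
LEVELS = ["none", "note", "warning", "error"]


def failing_results(sarif: Dict[str, Any], fail_on: str = "error") -> List[str]:
    """Return rule IDs of results at or above the `fail_on` level."""
    threshold = LEVELS.index(fail_on)
    failing: List[str] = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A missing or unrecognized level is treated as "warning" here.
            level = result.get("level", "warning")
            if level not in LEVELS:
                level = "warning"
            if LEVELS.index(level) >= threshold:
                failing.append(result.get("ruleId", "<unknown>"))
    return failing


# Hypothetical SARIF document with one error and one note.
SAMPLE = json.loads("""
{
  "version": "2.1.0",
  "runs": [{
    "tool": {"driver": {"name": "example"}},
    "results": [
      {"ruleId": "sqli", "level": "error", "message": {"text": "SQL injection"}},
      {"ruleId": "missing-header", "level": "note", "message": {"text": "No CSP"}}
    ]
  }]
}
""")

exit_code = 1 if failing_results(SAMPLE) else 0
print(failing_results(SAMPLE), exit_code)
```

In a pipeline step, a non-zero exit code from a script like this is what causes the job to fail.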
Installation Guide
Pentest-AI supports multiple installation paths.
Method 1: Basic Installation
Install using pip:
pip install ptai
This installs the core framework.
Method 2: Claude Code MCP Integration (Recommended)
If you already use Claude Code, this is the easiest setup.
Install Pentest-AI:
pip install ptai
Register it as an MCP server:
claude mcp add pentest-ai -- ptai mcp
Restart Claude Code.
You can now issue prompts such as:
Run an authenticated pentest against staging.example.com
No additional API key is required.
Your Claude subscription acts as the LLM backend.
Method 3: Setup for Cursor / Codex / VS Code
Run:
ptai setup --mcp
The framework automatically detects compatible MCP clients and configures them.
Restart the editor afterward.
Method 4: Standalone CLI Mode
You can also run Pentest-AI directly.
Example:
ptai start https://target.com
In standalone mode, you usually need an LLM provider.
Supported providers include:
- Anthropic
- OpenAI
- Ollama
- LiteLLM providers
- Azure OpenAI
- OpenRouter
- Groq
- DeepSeek
- Mistral
Running Pentest-AI Without API Keys
One of the biggest advantages of Pentest-AI is that it can run without API keys.
There are several ways to do this.
For this tutorial, we will use local LLMs through Ollama instead of relying on a Claude subscription. While Pentest-AI integrates seamlessly with Claude Code via MCP, full Claude orchestration still requires an active Anthropic subscription or API-backed access.
One of the most interesting aspects of Pentest-AI is that it can also operate entirely with:
- local models
- offline inference
- deterministic probes
- wrapped security tools
- local orchestration
without depending on cloud-based AI providers.
This means you can build a fully local AI-powered penetration testing environment directly on your own machine.
In this setup, we will use:
- Claude Code (installed locally)
- Ollama
- Local LLMs
- Pentest-AI
- Traditional security tooling
The biggest advantages of this approach are:
- no API costs
- no cloud dependency
- improved privacy
- offline operation
- full local control
Running Pentest-AI with Ollama Local Models
First, verify that Ollama is installed:
ollama --version
Example output:
ollama version is 0.23.3
Next, check the available local models:
ollama list
Example:
NAME ID SIZE
gemma4:e2b 7fbdbf8f5e45 7.2 GB
You can also install additional models that generally perform better for cybersecurity reasoning tasks:
ollama pull qwen2.5-coder
or:
ollama pull deepseek-coder-v2
or:
ollama pull llama3.1
If you are using the Ollama Desktop application, you usually do not need to run ollama serve manually. The desktop app starts and manages the Ollama background service, so the local API endpoint (http://localhost:11434) stays available to tools like Pentest-AI for as long as the app is running. To verify, open the Ollama Desktop interface and check that your models are listed and responding.

With the service running, opening http://localhost:11434 in a browser shows the message "Ollama is running".
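The same check can be scripted. Ollama exposes a `/api/tags` endpoint that lists locally installed models, so a short script can confirm both that the service is reachable and that a model is present. The helper names below are hypothetical; the sketch uses only the standard library.

```python
import json
import urllib.request
from typing import List, Optional


def parse_model_names(payload: str) -> List[str]:
    """Extract model names from an Ollama `/api/tags` response body."""
    data = json.loads(payload)
    return [model["name"] for model in data.get("models", [])]


def list_local_models(host: str = "http://localhost:11434",
                      timeout: float = 3.0) -> Optional[List[str]]:
    """Return installed model names, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as resp:
            return parse_model_names(resp.read().decode("utf-8"))
    except OSError:
        # Covers connection refused, timeouts, and HTTP errors.
        return None
```

Calling `list_local_models()` returns `None` when nothing is listening on the port, which is a quick signal that the Ollama service needs to be started.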
Now configure Pentest-AI to use Ollama:
export OLLAMA_HOST=http://localhost:11434
Optionally specify the model:
export OLLAMA_MODEL=gemma4:e2b
Next, install Pentest-AI:
pip install ptai
Starting the First AI-Powered Scan
After configuring Ollama and Pentest-AI, the first scan can be launched directly from the terminal.
In this setup, Pentest-AI uses a fully local LLM through Ollama, with the same OLLAMA_HOST and OLLAMA_MODEL environment variables exported in the previous section.
Now start the scan:
ptai start https://target.com
During the first execution, Pentest-AI displays an authorization and acceptable-use prompt:
pentest-ai is offensive security tooling. By using it you confirm:
1. You have explicit, written authorization to test every target
2. You will comply with applicable laws
3. You accept the Acceptable Use Policy
4. You accept the Terms of Service
This is an important safeguard because Pentest-AI performs real offensive security operations and should only be used against authorized targets.
After accepting the prompt, Pentest-AI asks which LLM backend should be used:
1 Anthropic API key (Claude direct)
2 OpenAI API key (GPT direct)
3 Ollama (local model)
4 Skip — I use ptai through Claude Code (MCP server)
5 Skip — deterministic only, no AI
In this tutorial, Ollama local models were selected:
Choice [1/2/3/4/5]: 3
Pentest-AI then initializes the local AI workflow engine:
Using Ollama. Make sure it is running on http://localhost:11434.
The framework then begins launching the engagement:
Starting Engagement
pentest-ai v0.14.0
Target: https://target.com
Scope: full
Intensity: normal
The terminal output also shows the orchestration layer initializing:
agent_mode: 274 action handlers registered
This means the framework has loaded:
- AI workflow handlers
- probe orchestration logic
- tool execution layers
- engagement pipelines
- scanning modules
Finally, the scan begins:
Scanning https://target.com...
At this stage, Pentest-AI starts coordinating:
- reconnaissance
- endpoint discovery
- vulnerability probes
- tool orchestration
- workflow reasoning
- attack-chain analysis
all using local LLM inference through Ollama without requiring any external cloud AI provider.
Important Note About Testing
For safety and legal reasons, scans should only be performed against:
- systems you own
- intentionally vulnerable labs
- authorized environments
- bug bounty targets within scope
Good testing environments include:
- OWASP Juice Shop
- DVWA
- Metasploitable
- WebGoat
- PortSwigger Academy Labs
Using intentionally vulnerable applications is strongly recommended while learning AI-assisted offensive security workflows.
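A simple guard against accidental out-of-scope scans is to check every target against an explicit allowlist before launching anything. The sketch below is a generic precaution, not a Pentest-AI feature; `in_scope` is a hypothetical helper, and it matches hosts exactly so that lookalike domains are rejected.

```python
from typing import Iterable
from urllib.parse import urlsplit


def in_scope(target: str, allowed_hosts: Iterable[str]) -> bool:
    """True only if the target URL's host exactly matches an authorized host."""
    host = urlsplit(target).hostname
    if host is None:
        return False
    allowed = {h.lower() for h in allowed_hosts}
    return host.lower() in allowed


# Hypothetical allowlist for a local lab.
AUTHORIZED = ["localhost", "127.0.0.1"]
print(in_scope("http://localhost:3000", AUTHORIZED))
```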
Running Pentest-AI Against OWASP Juice Shop
After configuring Ollama and Pentest-AI, the framework was tested against a local OWASP Juice Shop instance.
First, Juice Shop was launched locally using Docker:
docker run -d -p 3000:3000 bkimminich/juice-shop
Docker returned the container ID:
1b2a25f39d2bdf9249f22c710406243ea8443d289b44c458e25b01a24fe13b93
The application was then accessible locally at:
http://localhost:3000
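Because `docker run -d` returns as soon as the container is created, the application may need a few seconds before it answers requests. A small polling helper, sketched below with hypothetical names, can wait for the first HTTP 200 before a scan is started.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, reset, timed out, or an HTTP error status.
        return False


def wait_until_ready(check: Callable[[], bool], attempts: int = 30,
                     delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `check` until it succeeds or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        sleep(delay)
    return False
```

For the Juice Shop container, the call would be `wait_until_ready(lambda: http_ok("http://localhost:3000"))`.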
Next, Pentest-AI was launched against the target:
ptai start http://localhost:3000
Pentest-AI immediately initialized the engagement workflow:
Starting Engagement
pentest-ai v0.14.0
Target: http://localhost:3000
Scope: full
Intensity: normal
The framework then loaded its orchestration engine:
agent_mode: 274 action handlers registered
This indicates that Pentest-AI successfully initialized:
- AI workflow orchestration
- scanning pipelines
- probe handlers
- attack-chain logic
- external tool coordination
- reporting systems
The scan then began:
Scanning http://localhost:3000...
At this stage, the framework starts:
- endpoint discovery
- route enumeration
- vulnerability probing
- fingerprinting
- tool orchestration
- workflow reasoning
- attack-chain analysis
all powered locally through Ollama without requiring any external cloud-based AI provider.
What Makes This Interesting?
This setup demonstrates one of the most important aspects of Pentest-AI:
Fully local AI-assisted offensive security workflows.
In this environment:
- The LLM runs locally through Ollama
- The target application runs locally through Docker
- Pentest-AI orchestrates scans locally
- No cloud API keys are required
- No external infrastructure is needed
This creates a completely self-hosted AI-powered AppSec lab environment directly on a local machine.
Important Observation
At this stage, Pentest-AI may not immediately display extensive findings if many external offensive tools are missing.
The framework relies heavily on external tools such as:
- nmap
- nuclei
- ffuf
- sqlmap
- gobuster
- and other tooling
As a result, the quality of scans depends heavily on the installed security toolchain.
However, even with minimal tooling installed, the framework still demonstrates:
- orchestration logic
- local AI integration
- probe execution
- engagement workflows
- MCP-style operational design
which is already extremely valuable for AI security research and experimentation.
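Before a scan, it can help to see which of these tools are actually on the PATH. The snippet below is a small standalone check using the standard library; `missing_tools` is a hypothetical helper and is separate from Pentest-AI's own installer.

```python
import shutil
from typing import Iterable, List

# Tool names taken from the list above.
EXPECTED = ["nmap", "nuclei", "ffuf", "sqlmap", "gobuster"]


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


print("Missing:", ", ".join(missing_tools(EXPECTED)) or "none")
```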
Installing Security Tools
Pentest-AI wraps many external tools.
The framework provides several installation strategies.
Automatic Tool Installation
At engagement startup, Pentest-AI predicts which tools are needed.
Example:
ptai start https://target.com
It then prompts you to install missing tools.
Batch Installation Tiers
Install essential tools:
ptai setup --tier core
Recommended tools:
ptai setup --tier recommended
Full installation:
ptai setup --tier full
Per-Tool Installation
Install only specific tools:
ptai setup --per-tool wpscan,dalfox,paramspider
Interactive installer:
ptai setup --wizard
Example Workflow
A typical workflow may look like this:
pip install ptai
ptai setup --tier recommended
claude mcp add pentest-ai -- ptai mcp
Then inside Claude Code:
Run an authenticated OWASP Top 10 assessment against staging.example.com
The system may then:
- Enumerate endpoints
- Authenticate into the application
- Run probes
- Launch external tools
- Correlate findings
- Validate vulnerabilities
- Generate reports
Playbooks
Pentest-AI supports YAML playbooks.
Playbooks encode reusable methodologies.
Example:
name: internal-ad-pentest
phases:
- id: recon
too
Conclusion
Pentest-AI represents one of the most interesting examples of AI-native cybersecurity tooling currently emerging in the open-source security ecosystem. Instead of relying purely on static scanners or template-based detection, the framework combines LLM orchestration, MCP integrations, deterministic probes, wrapped offensive security tools, and attack-chain reasoning into a single workflow engine.
What makes the project especially impressive is its flexibility. Pentest-AI can operate with cloud-based models like Claude or GPT-4, but it can also run entirely offline using local LLMs through Ollama, enabling fully self-hosted AI-powered AppSec labs without API costs or external dependencies.
While fully autonomous penetration testing still has significant limitations and does not replace experienced human security researchers, Pentest-AI offers a strong glimpse into the future of AI-assisted offensive security, automated AppSec workflows, and agentic cybersecurity systems.
For AppSec engineers, AI researchers, bug bounty hunters, and cybersecurity enthusiasts, Pentest-AI is absolutely worth exploring.









