DEV Community

TechLatest

Posted on • Originally published at osintteam.blog on

Pentest-AI: The Complete Guide to AI-Powered Autonomous Penetration Testing in 2026

AI-powered cybersecurity tooling is evolving rapidly. Traditional scanners can detect vulnerabilities, but they often struggle with authentication, exploit chaining, contextual reasoning, and multi-step attack paths.

At the same time, large language models are becoming capable of orchestrating tools, understanding web applications, interacting with APIs, navigating browsers, and coordinating complex workflows.

This is where Pentest-AI (ptai) enters the picture.

Developed by 0xSteph, Pentest-AI is an open-source autonomous penetration testing framework that combines:

  • AI agents
  • MCP (Model Context Protocol)
  • LLM orchestration
  • Traditional security tools
  • Curated vulnerability probes
  • Automated exploit validation
  • CI/CD security workflows

Unlike traditional scanners that rely entirely on signatures or templates, Pentest-AI attempts to behave more like a human security operator.

In this detailed guide, we will cover:

  • What Pentest-AI is
  • How it works
  • Architecture and agents
  • MCP integrations
  • Installation methods
  • Running scans without API keys
  • Claude Code integration
  • Tool ecosystem
  • Playbooks
  • Benchmarks
  • CI/CD integrations
  • Local LLM usage with Ollama
  • Security considerations
  • Real-world use cases
  • Limitations
  • Comparisons with other tools

What is Pentest-AI?

Pentest-AI is an AI-native penetration testing framework designed to automate offensive security workflows using LLMs and real security tooling.

The project combines:

  • Deterministic vulnerability probes
  • AI reasoning loops
  • Wrapped CLI security tools
  • MCP-compatible interfaces
  • Reporting and attack-chain correlation

The framework can:

  • Enumerate targets
  • Authenticate into applications
  • Run web security scans
  • Test APIs
  • Execute external security tools
  • Correlate findings
  • Validate vulnerabilities with PoCs
  • Generate detection rules
  • Produce reports automatically

The project is available on GitHub:

GitHub - 0xSteph/pentest-ai: Offensive-security MCP server with 205 wrapped tools, 17 specialist agents, and 60 SPA-aware probes for OWASP Top 10. CLI + MCP, BYO LLM. No API key needed on MCP path.

Note

BlackArch Linux

We also provide a ready-to-deploy BlackArch Linux VM that can be launched instantly on AWS, GCP, or Azure. No installation, setup, or dependency management is required: just spin it up and start using a full arsenal of penetration testing and security auditing tools in minutes.

Kali GUI Linux

Our Kali GUI Linux VM comes fully pre-configured with a graphical interface, making it easy for both beginners and professionals to get started. Deploy directly on AWS, GCP, or Azure with zero setup and no installation hassles, just immediate access to a complete offensive security toolkit.

Browser-Based Kali Linux

We offer a browser-based Kali Linux environment that runs entirely in the cloud. Simply deploy it on AWS, GCP, or Azure and access it from your browser, with no downloads, no local setup, and no compatibility issues. It is well suited to quick testing, learning, and remote security operations from anywhere.

ParrotOS Linux

Our ParrotOS Linux VM is optimized for security, privacy, and development workflows. Available for instant deployment on AWS, GCP, and Azure, it eliminates the need for manual installation and gives you a secure, ready-to-use environment in just a few clicks.

Why Pentest-AI is Different

Most security scanners operate in a very linear way.

Typical workflow:

  • Crawl target
  • Run signatures
  • Generate alerts
  • Produce findings

The problem is that modern applications are far more dynamic.

Today’s applications include:

  • SPAs (Single Page Applications)
  • APIs
  • JWT authentication
  • OAuth flows
  • Browser-side rendering
  • Dynamic JavaScript routing
  • Complex state management
  • Cloud-native infrastructure

Traditional scanners often fail because:

  • They cannot maintain authenticated sessions properly
  • They struggle with dynamic JavaScript applications
  • They cannot reason about attack chains
  • They generate noisy false positives
  • They cannot validate vulnerabilities safely

Pentest-AI attempts to solve this using AI orchestration.

The LLM does not directly detect vulnerabilities.

Instead, the LLM:

  • coordinates workflows
  • chooses tools
  • interprets outputs
  • correlates findings
  • plans next actions
  • builds attack chains

Meanwhile, the actual vulnerability detection comes from:

  • deterministic probes
  • wrapped security tools
  • curated exploit logic
  • reproducible validation checks

This separation is important.

The project itself explicitly states:

“The LLM coordinates. The probes detect.”

That is one of the strongest architectural decisions in the project.
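
The split between coordination and detection can be pictured with a small sketch. Everything in it is hypothetical and illustrative: `Finding`, `header_probe`, `plan_next`, and `run_engagement` are invented names, not Pentest-AI's API, and the planner is a stub standing in for an LLM call. The point is only that the detection logic stays deterministic while the planner decides what runs next.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Finding:
    probe: str
    detail: str


def header_probe(response_headers: Dict[str, str]) -> List[Finding]:
    """Deterministic check: the same headers always yield the same findings."""
    present = {name.lower() for name in response_headers}
    findings = []
    for required in ("content-security-policy", "x-frame-options"):
        if required not in present:
            findings.append(Finding("header_probe", f"missing {required}"))
    return findings


# Registry of available probes (hypothetical; a real tool would have many).
PROBES: Dict[str, Callable[[Dict[str, str]], List[Finding]]] = {
    "header_probe": header_probe,
}


def plan_next(done: set) -> Optional[str]:
    """Stub planner standing in for the LLM: pick a probe that has not run yet."""
    for name in PROBES:
        if name not in done:
            return name
    return None


def run_engagement(response_headers: Dict[str, str]) -> List[Finding]:
    """The planner chooses; the probes detect."""
    done, findings = set(), []
    while (name := plan_next(done)) is not None:
        findings.extend(PROBES[name](response_headers))
        done.add(name)
    return findings
```

Swapping the stub planner for a real model would change which probes run and in what order, but not what any single probe reports.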

Core Features of Pentest-AI

1. Autonomous AI Agents

Pentest-AI includes multiple specialized agents; the repository description lists 17, of which 15 are shown below.

Each agent focuses on a different domain of offensive security.

| Agent | Purpose |
|---|---|
| recon | Enumeration and discovery |
| web | Web application testing |
| api_security | API security assessment |
| browser | Browser and DOM analysis |
| ad | Active Directory testing |
| cloud | Cloud security analysis |
| credential_tester | Credential attacks |
| vuln_scanner | Vulnerability aggregation |
| exploit_chain | Multi-step attack chains |
| poc_validator | Proof-of-concept validation |
| detection | Sigma/SPL/KQL rule generation |
| report | Report generation |
| llm_redteam | LLM security testing |
| mobile | Mobile application testing |
| wireless | Wireless reconnaissance |

This allows Pentest-AI to behave more like a coordinated red-team workflow instead of a simple scanner.

2. MCP (Model Context Protocol) Support

One of the most important features of Pentest-AI is MCP support.

MCP allows AI assistants to directly invoke external tools.

This means AI coding assistants such as:

  • Claude Code
  • Cursor
  • Codex
  • Claude Desktop

can directly operate security tooling through natural language.

For example:

“Run an authenticated SQL injection assessment against the login flow.”

The assistant can then:

  • choose the correct tools
  • execute scans
  • analyze outputs
  • continue testing automatically
  • generate reports

without manually typing commands.

This is a major shift in offensive security workflows.

Instead of:

Human → Tool → Result

You now get:

Human → AI Operator → Multiple Tools → Reasoning Loop

This represents the emerging category of:

  • AI-native AppSec
  • Agentic cybersecurity
  • Autonomous red teaming
  • AI-assisted offensive tooling

3. Wrapped Security Tools

Pentest-AI wraps more than 200 security tools (the repository description lists 205).

Examples include:

| Category | Tools |
|---|---|
| Recon | nmap, masscan |
| Fuzzing | ffuf, gobuster |
| Injection | sqlmap, dalfox |
| CMS Testing | wpscan |
| Password Attacks | hydra, hashcat |
| AD Testing | bloodhound, impacket |
| Cloud | prowler, trivy |
| Secrets | gitleaks, trufflehog |

Instead of reinventing the wheel, Pentest-AI orchestrates existing tooling intelligently.

This is one of the reasons the framework is gaining attention.

4. Authenticated Scanning

Most scanners stop being useful once they reach a login page.

Pentest-AI supports:

  • session handling
  • authentication workflows
  • credential reuse
  • cookie management
  • session refresh

This is critical because most modern applications hide important attack surfaces behind authentication.

5. Exploit Chaining

Traditional scanners generally detect isolated vulnerabilities.

Pentest-AI attempts to correlate findings into attack paths.

Example:

weak authentication → SQL injection → admin access → SSRF → cloud credential exposure

This capability is extremely valuable during real-world engagements.
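
One way to think about this kind of correlation is as a path search over findings, where each finding turns one capability into another. The sketch below is an illustration under that assumption, not Pentest-AI's actual algorithm; the `FINDINGS` tuples and the `find_chain` helper are hypothetical.

```python
from collections import deque

# Each tuple reads: (capability required, finding, capability gained).
FINDINGS = [
    ("unauthenticated", "weak authentication", "user session"),
    ("user session", "SQL injection", "admin access"),
    ("admin access", "SSRF", "cloud credentials"),
    ("user session", "reflected XSS", "victim session"),
]


def find_chain(findings, start, goal):
    """Breadth-first search for the shortest chain of findings from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for required, name, gained in findings:
            if required == state and gained not in seen:
                seen.add(gained)
                queue.append((gained, path + [name]))
    return None  # no chain connects start to goal
```

Calling `find_chain(FINDINGS, "unauthenticated", "cloud credentials")` walks the same path as the example above, while an unreachable goal returns `None`.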

6. PoC Validation

One of the biggest problems in AppSec is false positives.

Pentest-AI attempts to validate findings using non-destructive proofs-of-concept.

This helps:

  • reduce noise
  • improve trust
  • speed up triage
  • provide reproducible evidence

7. CI/CD Integration

The framework integrates directly into:

  • GitHub Actions
  • GitLab CI
  • Jenkins

It supports:

  • SARIF output
  • severity gating
  • PR comments
  • pipeline failures on critical findings

This makes it useful for DevSecOps workflows.
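
Severity gating on SARIF output can be sketched in a few lines. This is a generic illustration of reading a SARIF 2.1.0 document, not Pentest-AI's own gating code: the `gate` function and the `fail_at` parameter are hypothetical, and results without an explicit `level` are treated as `warning` here as a simplifying assumption.

```python
# SARIF result levels ordered from least to most severe.
LEVEL_RANK = {"none": 0, "note": 1, "warning": 2, "error": 3}


def gate(sarif: dict, fail_at: str = "error") -> list:
    """Return the rule IDs of all results at or above the fail_at level."""
    threshold = LEVEL_RANK[fail_at]
    blocking = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Assumption: a missing level is treated as "warning".
            level = result.get("level", "warning")
            if LEVEL_RANK.get(level, 0) >= threshold:
                blocking.append(result.get("ruleId", "<unknown>"))
    return blocking
```

In a pipeline step, a non-empty return value would typically translate into a non-zero exit code so that the job fails.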

Installation Guide

Pentest-AI supports multiple installation paths.

Method 1: Basic Installation

Install using pip:

pip install ptai

This installs the core framework.

Method 2: Claude Code MCP Integration (Recommended)

If you already use Claude Code, this is the easiest setup.

Install Pentest-AI:

pip install ptai

Register it as an MCP server:

claude mcp add pentest-ai -- ptai mcp

Restart Claude Code.

You can now issue prompts such as:

Run an authenticated pentest against staging.example.com

No additional API key is required.

Your Claude subscription acts as the LLM backend.

Method 3: Setup for Cursor / Codex / VS Code

Run:

ptai setup --mcp

The framework automatically detects compatible MCP clients and configures them.

Restart the editor afterward.

Method 4: Standalone CLI Mode

You can also run Pentest-AI directly.

Example:

ptai start https://target.com

In standalone mode, you usually need an LLM provider.

Supported providers include:

  • Anthropic
  • OpenAI
  • Ollama
  • LiteLLM providers
  • Azure OpenAI
  • OpenRouter
  • Groq
  • DeepSeek
  • Mistral

Running Pentest-AI Without API Keys

One of the biggest advantages of Pentest-AI is that it can run without API keys.

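There are several ways to do this: through the MCP path with an existing Claude Code subscription, through a local model served by Ollama, or in deterministic-only mode with no AI at all.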

For this tutorial, we will use local LLMs through Ollama instead of relying on a Claude subscription. While Pentest-AI integrates seamlessly with Claude Code via MCP, full Claude orchestration still requires an active Anthropic subscription or API-backed access.

One of the most interesting aspects of Pentest-AI is that it can also operate entirely with:

  • local models
  • offline inference
  • deterministic probes
  • wrapped security tools
  • local orchestration

without depending on cloud-based AI providers.

This means you can build a fully local AI-powered penetration testing environment directly on your own machine.

In this setup, we will use:

  • Claude Code (installed locally)
  • Ollama
  • Local LLMs
  • Pentest-AI
  • Traditional security tooling

The biggest advantages of this approach are:

  • no API costs
  • no cloud dependency
  • improved privacy
  • offline operation
  • full local control

Running Pentest-AI with Ollama Local Models

First, verify that Ollama is installed:

ollama --version

Example output:

ollama version is 0.23.3

Next, check the available local models:

ollama list

Example:

NAME          ID              SIZE
gemma4:e2b    7fbdbf8f5e45    7.2 GB

You can also pull additional models, which may handle code-heavy and tool-orchestration reasoning better:

ollama pull qwen2.5-coder

or:

ollama pull deepseek-coder-v2

or:

ollama pull llama3.1

If you use the Ollama Desktop application, you usually do not need to run ollama serve manually. The desktop app starts and manages the Ollama background service, so the local API endpoint (http://localhost:11434) stays available to tools like Pentest-AI for as long as the app is running. To verify, open the Ollama Desktop interface and check that your models are listed and responding.


Ollama listens on http://localhost:11434 by default. Opening that URL in a browser should show the message "Ollama is running".
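
The same check can be scripted. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally installed models, and falls back to the default address when `OLLAMA_HOST` is unset. The helper names `model_names` and `list_local_models` are made up for this example, and the code assumes `OLLAMA_HOST` includes the `http://` scheme, as it does in this guide.

```python
import json
import os
import urllib.request


def model_names(tags_payload: dict) -> list:
    """Extract model names from an Ollama /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_local_models(host: str = "") -> list:
    """Ask the local Ollama server which models are installed.

    Raises urllib.error.URLError if the server is not reachable.
    """
    # Assumption: OLLAMA_HOST, when set, is a full URL such as http://localhost:11434.
    host = host or os.environ.get("OLLAMA_HOST", "http://localhost:11434")
    url = f"{host.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=5) as response:
        return model_names(json.load(response))
```

If `list_local_models()` raises a connection error, the Ollama service is most likely not running yet.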

Now configure Pentest-AI to use Ollama:

export OLLAMA_HOST=http://localhost:11434

Optionally specify the model:

export OLLAMA_MODEL=gemma4:e2b

Next, install Pentest-AI:

pip install ptai

Starting the First AI-Powered Scan

After configuring Ollama and Pentest-AI, the first scan can be launched directly from the terminal.

In this setup, Pentest-AI uses a fully local LLM through Ollama, with OLLAMA_HOST and OLLAMA_MODEL already exported as described in the previous section.

Now start the scan:

ptai start https://target.com

During the first execution, Pentest-AI displays an authorization and acceptable-use prompt:

pentest-ai is offensive security tooling. By using it you confirm:

1. You have explicit, written authorization to test every target
2. You will comply with applicable laws
3. You accept the Acceptable Use Policy
4. You accept the Terms of Service

This is an important safeguard because Pentest-AI performs real offensive security operations and should only be used against authorized targets.

After accepting the prompt, Pentest-AI asks which LLM backend should be used:

1 Anthropic API key (Claude direct)
2 OpenAI API key (GPT direct)
3 Ollama (local model)
4 Skip — I use ptai through Claude Code (MCP server)
5 Skip — deterministic only, no AI

In this tutorial, Ollama local models were selected:

Choice [1/2/3/4/5]: 3

Pentest-AI then initializes the local AI workflow engine:

Using Ollama. Make sure it is running on http://localhost:11434.

The framework then begins launching the engagement:

Starting Engagement
pentest-ai v0.14.0
Target: https://target.com
Scope: full
Intensity: normal

The terminal output also shows the orchestration layer initializing:

agent_mode: 274 action handlers registered

This line suggests the framework has registered its action handlers, which likely cover:

  • AI workflow handlers
  • probe orchestration logic
  • tool execution layers
  • engagement pipelines
  • scanning modules

Finally, the scan begins:

Scanning https://target.com...

At this stage, Pentest-AI starts coordinating:

  • reconnaissance
  • endpoint discovery
  • vulnerability probes
  • tool orchestration
  • workflow reasoning
  • attack-chain analysis

all using local LLM inference through Ollama without requiring any external cloud AI provider.

Important Note About Testing

For safety and legal reasons, scans should only be performed against:

  • systems you own
  • intentionally vulnerable labs
  • authorized environments
  • bug bounty targets within scope

Good testing environments include:

  • OWASP Juice Shop
  • DVWA
  • Metasploitable
  • WebGoat
  • PortSwigger Academy Labs

Using intentionally vulnerable applications is strongly recommended while learning AI-assisted offensive security workflows.

Running Pentest-AI Against OWASP Juice Shop

After configuring Ollama and Pentest-AI, the framework was tested against a local OWASP Juice Shop instance.

First, Juice Shop was launched locally using Docker:

docker run -d -p 3000:3000 bkimminich/juice-shop

Docker returned the container ID:

1b2a25f39d2bdf9249f22c710406243ea8443d289b44c458e25b01a24fe13b93

The application was then accessible locally at:

http://localhost:3000
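
The container can take a few seconds to start listening, so it may help to wait for the port before launching a scan. The following is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, not part of Pentest-AI or Docker.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not up yet; retry shortly
    return False
```

For the setup above, `wait_for_port("localhost", 3000)` should return `True` once Juice Shop accepts connections. An open port only shows that the server is listening, so the application itself may still need a moment to finish loading.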

Next, Pentest-AI was launched against the target:

ptai start http://localhost:3000

Pentest-AI immediately initialized the engagement workflow:

Starting Engagement
pentest-ai v0.14.0
Target: http://localhost:3000
Scope: full
Intensity: normal

The framework then loaded its orchestration engine:

agent_mode: 274 action handlers registered

This indicates that Pentest-AI successfully initialized:

  • AI workflow orchestration
  • scanning pipelines
  • probe handlers
  • attack-chain logic
  • external tool coordination
  • reporting systems

The scan then began:

Scanning http://localhost:3000...

At this stage, the framework starts:

  • endpoint discovery
  • route enumeration
  • vulnerability probing
  • fingerprinting
  • tool orchestration
  • workflow reasoning
  • attack-chain analysis

all powered locally through Ollama without requiring any external cloud-based AI provider.

What Makes This Interesting?

This setup demonstrates one of the most important aspects of Pentest-AI:

Fully local AI-assisted offensive security workflows.

In this environment:

  • The LLM runs locally through Ollama
  • The target application runs locally through Docker
  • Pentest-AI orchestrates scans locally
  • No cloud API keys are required
  • No external infrastructure is needed

This creates a completely self-hosted AI-powered AppSec lab environment directly on a local machine.

Important Observation

At this stage, Pentest-AI may not immediately display extensive findings if many external offensive tools are missing.

The framework relies heavily on external tools such as:

  • nmap
  • nuclei
  • ffuf
  • sqlmap
  • gobuster

As a result, the quality of scans depends heavily on the installed security toolchain.

However, even with minimal tooling installed, the framework still demonstrates:

  • orchestration logic
  • local AI integration
  • probe execution
  • engagement workflows
  • MCP-style operational design

which is already useful for AI security research and experimentation.
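
A quick way to see which of the tools named above are missing is to look them up on `PATH`. This sketch is independent of Pentest-AI's own installer; `EXPECTED_TOOLS` and `missing_tools` are illustrative names, and the list only contains the tools mentioned in this section.

```python
import shutil

# Tool names taken from the list in this section; extend as needed.
EXPECTED_TOOLS = ["nmap", "nuclei", "ffuf", "sqlmap", "gobuster"]


def missing_tools(tools=EXPECTED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty result from `missing_tools()` means all five binaries are available to the current shell environment.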

Installing Security Tools

Pentest-AI wraps many external tools.

The framework provides several installation strategies.

Automatic Tool Installation

At engagement startup, Pentest-AI predicts which tools it will need.

Example:

ptai start https://target.com

It then prompts you to install missing tools.

Batch Installation Tiers

Install essential tools:

ptai setup --tier core

Recommended tools:

ptai setup --tier recommended

Full installation:

ptai setup --tier full

Per-Tool Installation

Install only specific tools:

ptai setup --per-tool wpscan,dalfox,paramspider

Interactive installer:

ptai setup --wizard

Example Workflow

A typical workflow may look like this:

pip install ptai
ptai setup --tier recommended
claude mcp add pentest-ai -- ptai mcp

Then inside Claude Code:

Run an authenticated OWASP Top 10 assessment against staging.example.com

The system may then:

  1. Enumerate endpoints
  2. Authenticate into the application
  3. Run probes
  4. Launch external tools
  5. Correlate findings
  6. Validate vulnerabilities
  7. Generate reports

Playbooks

Pentest-AI supports YAML playbooks.

Playbooks encode reusable methodologies.

Example:

name: internal-ad-pentest

phases:
  - id: recon
    too

Conclusion

Pentest-AI represents one of the most interesting examples of AI-native cybersecurity tooling currently emerging in the open-source security ecosystem. Instead of relying purely on static scanners or template-based detection, the framework combines LLM orchestration, MCP integrations, deterministic probes, wrapped offensive security tools, and attack-chain reasoning into a single workflow engine.

What makes the project especially impressive is its flexibility. Pentest-AI can operate with cloud-based models like Claude or GPT-4, but it can also run entirely offline using local LLMs through Ollama, enabling fully self-hosted AI-powered AppSec labs without API costs or external dependencies.

While fully autonomous penetration testing still has significant limitations and does not replace experienced human security researchers, Pentest-AI offers a strong glimpse into the future of AI-assisted offensive security, automated AppSec workflows, and agentic cybersecurity systems.

For AppSec engineers, AI researchers, bug bounty hunters, and cybersecurity enthusiasts, Pentest-AI is absolutely worth exploring.

Thank you so much for reading

Like | Follow | Subscribe to the newsletter.

Catch us on

Website: https://www.techlatest.net/

Newsletter: https://substack.com/@techlatest

Twitter: https://twitter.com/TechlatestNet

LinkedIn: https://www.linkedin.com/in/techlatest-net/

YouTube: https://www.youtube.com/@techlatest_net/

Blogs: https://medium.com/@techlatest.net

Reddit Community: https://www.reddit.com/user/techlatest_net/

