Most security scanners produce a list of vulnerabilities ranked by severity and leave the remediation work to you. After working on projects where that list grew long and the question "can someone actually exploit this right now?" remained unanswered, I built something different.
The result is Breach Gate, an open source CLI tool that combines static analysis, container scanning, dynamic API testing, and AI-assisted behavioral testing into a single pipeline. It outputs one clear answer: SAFE, UNSAFE, or REVIEW REQUIRED.
## The Core Problem
Traditional scanners answer: "What vulnerabilities exist?"
Breach Gate answers: "Can an attacker actually compromise the system right now?"
The distinction matters in CI pipelines. A list of medium-severity findings does not tell you whether to block a deployment. A confirmed exploit does.
Breach Gate scores every finding using a multiplicative formula:

```text
Risk = Reachability x Exploitability x Impact x Confidence
```
A vulnerability that is hard to reach, has no working proof-of-concept, and low confidence stays at a low risk score. A confirmed exploit with a working payload gets boosted to critical regardless of how the individual factors score.
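A minimal sketch of how a multiplicative score with a confirmed-exploit override might look (the factor ranges and the exact boost rule are my assumptions, not Breach Gate's actual implementation):

```python
def risk_score(reachability: float, exploitability: float,
               impact: float, confidence: float,
               confirmed_exploit: bool = False) -> float:
    """Combine factors in [0, 1] multiplicatively; a confirmed
    exploit pins the score at critical regardless of the factors."""
    if confirmed_exploit:
        return 1.0  # working payload: critical, no further scoring
    return reachability * exploitability * impact * confidence

# Hard to reach, no PoC, low confidence -> stays low
print(risk_score(0.2, 0.1, 0.9, 0.3))   # 0.0054
# Confirmed exploit -> boosted to critical
print(risk_score(0.2, 0.1, 0.9, 0.3, confirmed_exploit=True))  # 1.0
```

The multiplication is the point: any single near-zero factor drags the whole score down, which is exactly what keeps unreachable or unconfirmed findings from blocking a pipeline.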
## What It Tests
### AI-Assisted Behavioral Testing
The scanner generates OWASP-based test cases per endpoint and executes them against your live API. Two mechanisms keep false positives low:
- **Baseline diffing:** a benign request is sent to each endpoint before any attack probes. Response tokens that appear in the baseline are filtered out of the vulnerability indicators, eliminating a large class of false positives where generic words like "error" or "id" triggered matches.
- **Time-based blind injection:** responses delayed more than 3 seconds AND more than 3x the baseline timing are flagged as potential blind SQL or command injection, which cannot be detected from response bodies alone.
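Both mechanisms are simple to express. Here is a rough sketch of the idea under my own assumptions (function names and the substring matching are illustrative, not Breach Gate's code):

```python
def filter_indicators(baseline_body: str, attack_body: str,
                      indicators: list[str]) -> list[str]:
    """Keep only indicators present in the attack response but
    absent from the benign baseline response."""
    base = baseline_body.lower()
    atk = attack_body.lower()
    return [i for i in indicators
            if i.lower() in atk and i.lower() not in base]

def timing_flag(baseline_ms: float, response_ms: float) -> bool:
    """Blind-injection heuristic: flag only if the delay is both
    absolute (> 3 s) and relative (> 3x the baseline timing)."""
    return response_ms > 3000 and response_ms > 3 * baseline_ms

baseline = '{"error": null, "id": 42}'
attacked = '{"error": "SQL syntax error near \'1"}'
# "error" is dropped (also in baseline); "id" is dropped (not reflected)
print(filter_indicators(baseline, attacked, ["error", "sql syntax", "id"]))
# -> ['sql syntax']
```

Requiring both the absolute and the relative timing condition matters: a slow endpoint with a 2-second baseline would otherwise trip the heuristic on every probe.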
Attack categories covered out of the box:
| Category | Detection Method |
|---|---|
| SQL Injection | Response body, error text, blind timing |
| Command Injection | Response body, blind timing |
| XSS | Reflected probe in response |
| Broken Access Control | Status code shift vs baseline |
| SSRF | Cloud metadata endpoint probing |
| Mass Assignment | Privilege field echo in response |
| JWT Attacks | Algorithm confusion, claim tampering, expired token |
| Path Traversal | File content indicators in response |
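To make one of the probe categories concrete, here is a self-contained sketch of the classic `alg: none` algorithm-confusion check (this is the textbook attack, not necessarily how Breach Gate constructs its probes):

```python
import base64
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def forge_alg_none(payload: dict) -> str:
    """Build an unsigned token claiming alg=none; a verifier that
    trusts the header's algorithm field will accept it as valid."""
    header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    return f"{header}.{body}."  # empty signature segment

token = forge_alg_none({"sub": "attacker", "role": "admin"})
print(token)
```

Sending such a token to an authenticated endpoint and comparing the status code against the baseline is enough to confirm or rule out the vulnerability.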
### Static Analysis via Trivy
Scans your source code and dependencies for known CVEs, exposed secrets, and misconfigurations. Results feed into the same scoring pipeline as dynamic findings.
### Container Scanning
Pulls your Docker image and runs Trivy against the filesystem and OS packages. Findings are correlated with the API endpoint they affect where possible.
### GraphQL Security Probing
For GraphQL APIs, Breach Gate runs five dedicated probes: introspection exposure, depth-limit denial of service, field suggestion enumeration, variable injection, and IDOR by ID enumeration.
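Two of those probes are easy to illustrate with plain query construction (the field names below are hypothetical; the shape of the queries is what matters):

```python
def introspection_probe() -> str:
    """Minimal introspection query; __schema data in the response
    means introspection is exposed."""
    return "{ __schema { types { name } } }"

def depth_bomb(depth: int, field: str = "friends") -> str:
    """Deeply nested query used to probe for missing depth limits;
    servers without a limit can be driven into expensive resolution."""
    q = "id"
    for _ in range(depth):
        q = f"{field} {{ {q} }}"
    return f"{{ user(id: 1) {{ {q} }} }}"

print(depth_bomb(3))
# -> { user(id: 1) { friends { friends { friends { id } } } } }
```

A depth-limited server should reject the nested query with an error rather than resolving it; a timeout or a very slow response is the denial-of-service signal.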
### Dynamic Testing via OWASP ZAP
When ZAP is available (local or Docker), the scanner runs an active API scan and merges the results with findings from other scanners.
## The Output

```text
SECURITY VERDICT:
╔════════════════════════════════════════════════════════╗
║                    UNSAFE TO DEPLOY                    ║
╚════════════════════════════════════════════════════════╝
Reason: Confirmed exploitation: SQL Injection, Command Injection.
        Active attacks succeeded during testing.

2 CONFIRMED EXPLOITS:
  SQL Injection on POST /api/data
  Command Injection on POST /api/execute

Attack Surface (by endpoint):
  POST /api/execute
    Risk: 95%
    Command Injection
    Attack chain: Command Injection -> Full System Compromise
  POST /api/data
    Risk: 90%
    SQL Injection
    Attack chain: Injection -> System Compromise
```
Reports are generated in JSON, Markdown, SARIF, and HTML. The HTML report includes a category filter bar and one-click evidence copy.
## CI Integration
Breach Gate is published to the GitHub Marketplace as a composite action:
```yaml
- name: Run Breach Gate
  uses: epten08/breach-gate@v1
  with:
    target: ${{ vars.STAGING_API_URL }}
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    format: json,markdown,sarif
    output: security-reports
```
The action outputs a verdict value (PASS or FAIL) that downstream steps can consume, and the SARIF report integrates directly with GitHub Code Scanning.
For teams not on GitHub, the same scan runs via npm:
```bash
npx breach-gate scan --target https://staging.api.example.com --ci
```
The --ci flag sets a non-zero exit code on UNSAFE verdicts, which blocks the deployment step in any CI system.
## Watch Mode
For continuous environments, a watch command runs scans on a configurable interval and diffs findings between runs:
```bash
breach-gate watch --target http://localhost:3000 --interval 300
```
New findings are logged as warnings. Resolved findings are logged as informational. This is useful for staging environments that receive frequent deployments.
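The diffing step reduces to set arithmetic on finding identifiers. A minimal sketch (the ID format and log labels are assumptions):

```python
def diff_findings(previous: set[str], current: set[str]) -> tuple[list[str], list[str]]:
    """Compare finding IDs between two scan runs: IDs only in the
    current run are new (warnings), IDs only in the previous run
    are resolved (informational)."""
    new = sorted(current - previous)
    resolved = sorted(previous - current)
    return new, resolved

new, resolved = diff_findings({"sqli-1", "xss-2"}, {"sqli-1", "ssrf-3"})
print("WARN new findings:", new)           # ['ssrf-3']
print("INFO resolved findings:", resolved) # ['xss-2']
```

Stable finding IDs are the hard part in practice: if an ID changes between runs because a payload or endpoint string shifted, the same issue shows up as both resolved and new.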
## Suppressing Known Findings
Teams working on legacy APIs often carry accepted issues that are already tracked elsewhere. A .breachgateignore file prevents those from blocking pipelines:
```yaml
suppress:
  - id: "finding-abc123"
    reason: "Tracked in JIRA-456, fix scheduled for next sprint"
    expires: "2026-06-01"
  - pattern: "Missing security header"
    endpoint: "/api/health"
    reason: "Health check endpoint, intentionally minimal headers"
```
Rules with an expires date automatically stop suppressing after that date, which prevents forgotten suppressions from masking real regressions.
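The expiry logic amounts to a date comparison at rule-match time. A sketch of the ID-based form (the pattern/endpoint matching is omitted, and the field names mirror the YAML above but the code is illustrative):

```python
from datetime import date

def is_suppressed(rule: dict, finding_id: str, today: date) -> bool:
    """A suppression rule applies only if it matches the finding
    and its expires date, when present, has not passed."""
    if rule.get("id") != finding_id:
        return False
    expires = rule.get("expires")
    if expires and date.fromisoformat(expires) < today:
        return False  # expired rule: stop masking the finding
    return True

rule = {"id": "finding-abc123", "expires": "2026-06-01"}
print(is_suppressed(rule, "finding-abc123", date(2026, 5, 1)))  # True
print(is_suppressed(rule, "finding-abc123", date(2026, 7, 1)))  # False
```

Evaluating expiry at scan time rather than at file-edit time is what makes forgotten suppressions self-healing: the finding simply reappears in the next run after the date passes.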
## Getting Started
```bash
# Install globally
npm install -g breach-gate

# Run against your API
breach-gate scan --target http://localhost:3000

# Run the built-in demo to see a full vulnerable API scan
git clone https://github.com/epten08/breach-gate
cd breach-gate
npm install
npm run demo   # starts a deliberately vulnerable API
npm run scan   # scans it
```
An OpenAPI spec can be passed to give the scanner full endpoint coverage:
```bash
breach-gate scan --target http://localhost:3000 --openapi ./openapi.yml
```
Without a spec, the scanner infers common endpoint patterns and uses them as a starting point.
## Lessons Learned
Reducing the false positive rate was more challenging than building the detection logic. Early versions flagged nearly everything because words like "error", "id", and "success" appeared in every API response. Combining baseline diffing with restricting body matches to 2xx responses brought the false positive rate to a manageable level.
Prompt design for the Anthropic API also required careful iteration. Prompts using direct offensive language were blocked by content filtering. Reframing the same tests as "authorized penetration testing" and "OWASP-based assessment probes" passed the filter while generating identical test cases.
## Links
- GitHub: https://github.com/epten08/breach-gate
- npm: https://www.npmjs.com/package/breach-gate
- GitHub Marketplace: https://github.com/marketplace/actions/breach-gate
Contributions, bug reports, and false positive reports are welcome. The contributing guide covers how to add new attack categories and scanners.