Travis Felder
Automated Threat Modeling with AI - How Thr8 Works

Every security compliance framework asks the same question: "Where is your threat model?"

And every engineering team gives the same answer: "We'll get to it."

PASTA (Process for Attack Simulation and Threat Analysis) is one of the most thorough threat modeling frameworks. It is risk-centric, covers 7 stages from business objectives through attack simulation, and produces actionable output. But it takes days of manual work per application. Most teams never start.

I built thr8 to automate this. It is a GitHub Action that generates complete PASTA threat models by combining static codebase analysis with AI-powered threat reasoning. This article walks through the architecture, the PASTA methodology, and how it integrates into CI/CD.

The PASTA Framework in 60 Seconds

PASTA has 7 stages. Most threat modeling tools skip half of them. thr8 covers all 7:

| Stage | Name | What thr8 Does |
|-------|------|----------------|
| 1 | Business Objectives | Identifies what the system protects and the impact of a breach |
| 2 | Technical Scope | Detects tech stack, infrastructure, data classification |
| 3 | Application Decomposition | Generates data flow diagrams across trust boundaries |
| 4 | Threat Analysis | Maps attack surfaces and threat vectors |
| 5 | Vulnerability Analysis | Identifies specific weaknesses with severity ratings |
| 6 | Attack Modeling | Creates realistic kill chain scenarios |
| 7 | Risk & Impact Analysis | Scores business risk with tactical recommendations |

The key insight behind PASTA is that threats only matter in the context of business risk. A SQL injection in an internal admin tool is a different risk than a SQL injection in a payment processing API. thr8 captures that context.

Architecture: Static Analysis + AI Reasoning

thr8 runs a 4-stage pipeline. The first stage is deterministic (no AI). The second uses Claude. The third is templating. The fourth is optional GitHub integration.

  Discovery (Static)           Reasoning (Claude AI)         Output            Remediation
+---------------------+      +----------------------+      +----------+      +--------------+
|                     |      | Business Objectives  |      | Markdown |      | GitHub Issues|
|  Codebase Scanner   |----->| Attack Surfaces      |----->| JSON     |----->| Fix PRs      |
|                     |      | Kill Chain Scenarios |      | HTML     |      |              |
| - Tech stack        |      | Risk Analysis        |      | PDF      |      | (optional)   |
| - Infrastructure    |      | Recommendations      |      |          |      |              |
| - API endpoints     |      |                      |      |          |      |              |
| - Data flows        |      | (3 focused API calls)|      |          |      |              |
+---------------------+      +----------------------+      +----------+      +--------------+

Stage 1: Discovery (Static Analysis)

The CodebaseScannerAgent walks the repository tree, prioritizing security-relevant files. It uses a scoring system to read the most important files first:

  • Priority files (always included): package.json, Dockerfile, docker-compose.yml, terraform/*.tf, .env.example
  • High-signal source files (read first): routes, controllers, auth middleware, security configs
  • Infrastructure configs: Terraform HCL, Kubernetes manifests, Docker Compose YAML
  • General source: remaining source files up to a 120K character budget

The scanner does not use AI for this step. It reads files, respects a configurable size budget (8K chars per file, 120K total), skips binary files and lock files, and produces a structured context document.

Here is what the file prioritization looks like:

// Files sorted by security relevance: lower score = higher priority, read first
files.sort((a, b) => {
  const score = (p) => {
    if (/route|controller|handler|endpoint|api/i.test(p)) return 0; // API surface
    if (/auth|security|middleware|guard/i.test(p)) return 1;        // auth & security
    if (/config|setting/i.test(p)) return 2;                        // configuration
    if (/\.tf$|docker|k8s|helm|deploy/i.test(p)) return 3;          // infrastructure
    if (/model|schema|migration|database/i.test(p)) return 4;       // data layer
    if (/service|util|helper|lib/i.test(p)) return 5;               // general logic
    return 6;                                                       // everything else
  };
  return score(a.path) - score(b.path);
});
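The size-budget logic described above can be sketched like this. This is illustrative only -- `buildContext` and its helpers are stand-ins, not thr8's actual code:

```javascript
// Sketch of the character budget: cap each file at 8K chars, stop at 120K total.
const PER_FILE_BUDGET = 8_000;
const TOTAL_BUDGET = 120_000;

function buildContext(files, readFile) {
  const chunks = [];
  let used = 0;
  for (const file of files) { // files are already sorted by priority score
    if (used >= TOTAL_BUDGET) break;
    const content = readFile(file.path).slice(0, PER_FILE_BUDGET);
    const take = Math.min(content.length, TOTAL_BUDGET - used);
    chunks.push(`=== ${file.path} ===\n${content.slice(0, take)}`);
    used += take;
  }
  return chunks.join('\n\n');
}
```

Because the files are pre-sorted, the budget naturally spends its characters on routes, auth code, and configs before anything else.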

The output of this stage is a JSON document containing:

  • system_context: project name, tech stack (languages, frameworks, databases, external services, auth mechanisms, security controls), infrastructure (cloud provider, containerization, services), API surface (endpoints with methods, paths, auth requirements, sensitive data), and sensitive patterns found in code
  • data_flows: traced data movement through the system with steps, protocols, data classification, and trust boundaries
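Abbreviated, the discovery output looks roughly like this (an illustrative sketch -- the values are invented and the field names are my shorthand for the description above, not thr8's exact schema):

```json
{
  "system_context": {
    "project_name": "example-app",
    "tech_stack": {
      "languages": ["TypeScript"],
      "frameworks": ["Express"],
      "databases": ["PostgreSQL"],
      "auth_mechanisms": ["JWT"]
    },
    "infrastructure": { "cloud_provider": "AWS", "containerization": "Docker" },
    "api_surface": [
      { "method": "POST", "path": "/api/auth/login", "auth_required": false, "sensitive_data": true }
    ]
  },
  "data_flows": [
    {
      "name": "auth_flow",
      "steps": ["Browser -> API Gateway -> Auth Service -> User Database"],
      "classification": "credentials",
      "trust_boundaries": ["internet -> internal"]
    }
  ]
}
```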

Stage 2: Reasoning (Claude API)

The ThreatGeneratorAgent sends the discovery context to Claude along with a STRIDE attack pattern database. The pattern database contains ~40 pre-built attack patterns across 4 categories:

  • stride-api.json: API key theft, request body tampering, CORS misconfiguration, rate limiting bypass
  • stride-auth.json: Credential stuffing, session hijacking, JWT forgery, privilege escalation
  • stride-database.json: SQL injection, unencrypted data at rest, connection pool exhaustion
  • stride-storage.json: Unauthorized access, pre-signed URL abuse, missing encryption
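An entry in the pattern database looks roughly like this (again an illustrative sketch, not thr8's exact schema):

```json
{
  "id": "AUTH-CRED-STUFFING",
  "stride_category": "Spoofing",
  "name": "Credential Stuffing",
  "preconditions": ["password login endpoint", "no rate limiting or MFA"],
  "mitigations": ["rate limiting", "MFA", "breached-password checks"]
}
```

Giving Claude these patterns grounds the analysis: it matches the discovered attack surfaces against known techniques rather than inventing threats from scratch.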

Claude performs the PASTA analysis (Stages 1-2 and 4-7) and returns structured JSON covering:

{
  "business_objectives": [...],
  "overall_risk_status": "MEDIUM",
  "attack_surfaces": [
    {
      "name": "Public API Endpoint",
      "vector": "HTTP",
      "weakness": "Missing rate limiting on authentication endpoint",
      "vulnerabilities": [
        {
          "id": "API-AUTH-001",
          "title": "Brute Force Authentication",
          "description": "No rate limiting on /api/auth/login allows...",
          "severity": "High"
        }
      ]
    }
  ],
  "attack_scenarios": [
    {
      "name": "Account Takeover via Credential Stuffing",
      "objective": "Gain access to user accounts",
      "steps": [
        {
          "phase": "Reconnaissance",
          "action": "Enumerate valid usernames via registration endpoint",
          "exploits": ["API-AUTH-002"]
        },
        {
          "phase": "Exploitation",
          "action": "Brute force login with credential lists",
          "exploits": ["API-AUTH-001"]
        }
      ]
    }
  ],
  "risk_analysis": [...],
  "tactical_recommendations": [...]
}

The model is claude-sonnet-4-6. Total token usage is typically 2-5K input and 3-8K output per call (3 calls total). Cost: $0.05-0.15 per run.

Stage 3: Output Generation

The ReporterAgent renders the analysis into multiple formats using Handlebars templates:

Markdown (THREAT_MODEL.md) -- renders natively on GitHub with embedded Mermaid data flow diagrams:

graph LR
    auth_flow_0["Browser<br/><i>external_user</i>"]
    auth_flow_1["API Gateway<br/><i>load_balancer</i>"]
    auth_flow_2["Auth Service<br/><i>application</i>"]
    auth_flow_3["User Database<br/><i>database</i>"]
    auth_flow_0 -->|"HTTPS"| auth_flow_1
    auth_flow_1 -->|"internal"| auth_flow_2
    auth_flow_2 -->|"TLS"| auth_flow_3

JSON (threat-model.json) -- machine-readable for CI/CD integration, custom dashboards, or feeding into other security tools.

HTML (THREAT_MODEL.html) -- professional report with sidebar navigation, executive summary dashboard, color-coded severity levels, and embedded diagrams. Print-friendly for stakeholder distribution.

PDF (THREAT_MODEL.pdf) -- generated from HTML via headless Chrome when available on the runner.

Stage 4: Automated Remediation

The RemediatorAgent is the most interesting part. When you provide a github-token and enable create-issues and auto-fix, the action does not just report findings -- it acts on them.

For each vulnerability found:

Is auto-fix enabled AND severity is in pr-severity list?
  +-- YES --> Generate fix with Claude
  |           +-- High/medium confidence --> Open fix PR
  |           +-- Low confidence --> Fall back to issue
  +-- NO  --> Create GitHub Issue (if create-issues enabled)
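In code, that decision tree looks roughly like this (a sketch -- `remediate`, `generateFix`, `openFixPr`, and `createIssue` are hypothetical stand-ins for thr8's internals):

```javascript
// Sketch of the remediation decision tree. Helpers are hypothetical stand-ins.
async function remediate(vuln, opts) {
  const eligible =
    opts.autoFix && opts.prSeverity.includes(vuln.severity.toLowerCase());

  if (eligible) {
    const fix = await opts.generateFix(vuln); // Claude-generated patch
    if (fix.confidence !== 'low') {
      return opts.openFixPr(vuln, fix);       // high/medium confidence -> fix PR
    }
  }
  // Low confidence or not eligible: fall back to an issue if enabled
  if (opts.createIssues) return opts.createIssue(vuln);
  return null;
}
```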

Fix PRs are created on thr8/fix-{vuln-id} branches. Each PR includes:

  • The minimal code change needed
  • An explanation of what was fixed
  • Risk context (severity, business impact)
  • A list of changed files

Deduplication is built in. Each issue and PR body contains a hidden marker (<!-- thr8:V-001 -->) that prevents duplicates on re-runs.
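The marker check can be sketched like this (`isDuplicate` and `listIssueBodies` are illustrative names, not thr8's actual API):

```javascript
// Marker-based deduplication sketch: skip creating an issue if any existing
// issue body already contains the hidden marker for this vulnerability ID.
const marker = (vulnId) => `<!-- thr8:${vulnId} -->`;

async function isDuplicate(vulnId, listIssueBodies) {
  const bodies = await listIssueBodies(); // stand-in for a GitHub API call
  return bodies.some((body) => body.includes(marker(vulnId)));
}
```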

CI/CD Integration

Basic Setup

name: Threat Model
on:
  push:
    branches: [main]
  pull_request:

jobs:
  threat-model:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Generate Threat Model
        uses: cybrking/thr8@v1
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}

      - name: Upload Report
        uses: actions/upload-artifact@v4
        with:
          name: threat-model
          path: threat-model/

Fail Builds on Critical Findings

- name: Generate Threat Model
  uses: cybrking/thr8@v1
  with:
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    fail-on-high-risk: 'true'

Full Remediation Pipeline

jobs:
  threat-model:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
      issues: write
    steps:
      - uses: actions/checkout@v4

      - name: Generate Threat Model
        id: threat-model
        uses: cybrking/thr8@v1
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
          github-token: ${{ secrets.GITHUB_TOKEN }}
          create-issues: 'true'
          auto-fix: 'true'
          pr-severity: 'critical,high'

      - name: Summary
        run: |
          echo "Threats found: ${{ steps.threat-model.outputs.threats-found }}"
          echo "Critical: ${{ steps.threat-model.outputs.high-risk-count }}"
          echo "Issues created: ${{ steps.threat-model.outputs.issues-created }}"
          echo "Fix PRs created: ${{ steps.threat-model.outputs.prs-created }}"

Post Summary as PR Comment

- name: Comment on PR
  if: github.event_name == 'pull_request'
  uses: actions/github-script@v7
  with:
    script: |
      const fs = require('fs');
      const report = fs.readFileSync('threat-model/THREAT_MODEL.md', 'utf8');
      github.rest.issues.createComment({
        issue_number: context.issue.number,
        owner: context.repo.owner,
        repo: context.repo.repo,
        body: report
      });

Supported Tech Stacks

The codebase scanner automatically detects:

| Category | Examples |
|----------|----------|
| Languages | JavaScript, TypeScript, Python, Go, Java, Ruby |
| Frameworks | Express, Django, Rails, FastAPI, Spring Boot, Next.js |
| Databases | PostgreSQL, MySQL, MongoDB, Redis, DynamoDB |
| Infrastructure | Terraform, Docker, Docker Compose, Kubernetes |
| Auth | JWT, OAuth, session-based, API keys |
| Cloud | AWS, GCP, Azure resource detection |
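Detection of this kind typically boils down to pattern-matching manifests and configs. A minimal sketch for the package.json case (illustrative only -- the real scanner covers many more signals than this):

```javascript
// Map well-known dependency names to framework labels.
const FRAMEWORK_HINTS = {
  express: 'Express',
  next: 'Next.js',
};

// Return the frameworks implied by a parsed package.json object.
function detectFrameworks(pkg) {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  return Object.entries(FRAMEWORK_HINTS)
    .filter(([dep]) => dep in deps)
    .map(([, name]) => name);
}
```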

Cost Transparency

Three Claude API calls per run using claude-sonnet-4-6:

  • Typical input: ~2-5K tokens per call
  • Typical output: ~3-8K tokens per call
  • Estimated cost: $0.05-0.15 per run (analysis only)

With auto-fix enabled, add ~$0.02-0.06 per run for fix generation.

For a team running this on 50 repos with weekly pushes to main -- roughly 200 runs a month -- that is 200 x $0.05 to 200 x $0.15, or about $10-30/month for continuous threat modeling across your entire portfolio.

What It Does Not Do

Transparency matters. thr8 is not:

  • A DAST scanner. It does not make HTTP requests to your running application.
  • A replacement for penetration testing. It identifies architectural threats, not runtime vulnerabilities.
  • A compliance certification tool. It produces documentation that supports compliance, but does not certify anything.
  • Deterministic. Claude's analysis may vary between runs. The static discovery is deterministic, but the threat reasoning is probabilistic.

Try It

The repo is open source (MIT): github.com/cybrking/thr8

GitHub Marketplace: PASTA Threat Model Generator

Add it to a repo, run it, and open an issue with feedback. The whole setup takes less than 5 minutes.
