Vijaya Bollu
How I Built a Local AI Docker Vulnerability Scanner (No API Costs, No Cloud)


The Problem with Trivy Output

Running Trivy gives you a wall of CVE numbers. Most developers copy-paste them into Google and spend 20 minutes figuring out if each one actually matters for their use case.

I built a tool that fixes this.


What I Built

A local AI wrapper around Trivy that:

  • Scans any Docker image
  • Takes the raw CVE output
  • Feeds it to Ollama (local LLM — no API costs)
  • Returns plain English explanations + specific fix recommendations

The Interesting Finding

nginx:1.27-alpine: 14 vulnerabilities
nginx:alpine:       3 vulnerabilities

Same base image family, yet the pinned version had about 4.7× more CVEs (14 vs. 3). The AI caught this pattern and recommended variants to compare automatically.
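At its core, that comparison boils down to counting distinct CVE IDs in each variant's Trivy report. A minimal sketch (the function name `count_unique_cves` is mine, not from the repo; the miniature report just mimics the shape of Trivy's JSON):

```python
def count_unique_cves(scan_data: dict) -> int:
    """Count distinct CVE IDs across all targets in a Trivy JSON report."""
    seen = set()
    for result in scan_data.get("Results", []):
        # "Vulnerabilities" can be missing or null for clean targets
        for vuln in result.get("Vulnerabilities") or []:
            seen.add(vuln.get("VulnerabilityID"))
    return len(seen)

# Miniature report shaped like Trivy's output (real reports are much larger)
report = {
    "Results": [
        {"Vulnerabilities": [
            {"VulnerabilityID": "CVE-2024-0001"},
            {"VulnerabilityID": "CVE-2024-0002"},
        ]},
        {"Vulnerabilities": [
            {"VulnerabilityID": "CVE-2024-0001"},  # same CVE in a second target
        ]},
    ]
}
print(count_unique_cves(report))  # → 2
```

Run this against the JSON for `nginx:1.27-alpine` and `nginx:alpine` and the 14-vs-3 gap falls out directly.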


Tech Stack

  • Python 3.11
  • Trivy (vulnerability scanner)
  • Ollama + Llama 3.2 (local LLM)
  • Zero cloud dependencies

How It Works (Code Walkthrough)

The scanner has three moving parts: Trivy does the heavy lifting of CVE detection, Python orchestrates everything, and Ollama explains what it all means.

Step 1 — Scan with Trivy and parse the JSON:

import json
import subprocess
from typing import Dict, List, Optional

def scan_image(self, image_name: str) -> Optional[Dict]:
    # Run Trivy and capture its JSON report; raises CalledProcessError on failure
    result = subprocess.run(
        ["trivy", "image", "--format", "json", "--severity", "HIGH,CRITICAL", image_name],
        capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)

def extract_vulnerabilities(self, scan_data: Dict) -> List[Dict]:
    vulnerabilities = []
    seen_vulns = set()  # deduplicate by CVE ID across targets

    for result in scan_data.get("Results", []):
        for vuln in result.get("Vulnerabilities", []):
            vuln_id = vuln.get("VulnerabilityID", "N/A")
            if vuln_id in seen_vulns:
                continue
            seen_vulns.add(vuln_id)
            vulnerabilities.append({
                "id": vuln_id,
                "package": vuln.get("PkgName", "N/A"),
                "version": vuln.get("InstalledVersion", "N/A"),
                "severity": vuln.get("Severity", "N/A"),
                "title": vuln.get("Title", ""),
                "fixed_version": vuln.get("FixedVersion", "Not available")
            })
    return vulnerabilities

Step 2 — Send each CVE to Ollama for a plain English explanation:

import requests

def explain_vulnerability(self, vuln: Dict) -> str:
    prompt = f"""You are a security expert explaining vulnerabilities to developers.

Vulnerability Details:
- ID: {vuln['id']}
- Package: {vuln['package']} (version {vuln['version']})
- Severity: {vuln['severity']}
- Title: {vuln['title']}

Explain in 2-3 sentences:
1. What this vulnerability means in simple terms
2. Why it's dangerous
3. How to fix it (fixed version: {vuln['fixed_version']})

Keep it concise and actionable. Use analogies if helpful."""

    response = requests.post(
        f"{self.ollama_host}/api/generate",
        json={"model": "llama3.2", "prompt": prompt, "stream": False},
        timeout=60
    )
    response.raise_for_status()  # surface Ollama errors instead of parsing a bad body
    return response.json().get("response", "")

Step 3 — Generate an overall summary with structured output:

The summary prompt forces Ollama into a key-value format so we can parse it reliably and build a comparison command on the fly — more on that in the next section.
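Concretely, the summary prompt might look something like this (a sketch: only the `SECURITY_POSTURE` and `VARIANTS_TO_TEST` field names come from the project; the exact wording and the `build_summary_prompt` helper are assumptions):

```python
def build_summary_prompt(image: str, vuln_count: int) -> str:
    """Assemble a prompt that demands strict KEY: value output (sketch)."""
    return f"""Summarize the security posture of {image} ({vuln_count} HIGH/CRITICAL CVEs).
Respond ONLY in this exact format, one field per line:
SECURITY_POSTURE: <one of GOOD, MODERATE, POOR>
VARIANTS_TO_TEST: <comma-separated image tags to compare>

Example:
SECURITY_POSTURE: POOR
VARIANTS_TO_TEST: nginx:alpine, nginx:mainline-alpine"""

prompt = build_summary_prompt("nginx:1.27-alpine", 14)
```

Giving the model an explicit example of the output format is what makes the response parseable most of the time.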


The Trickiest Part

Getting Ollama to return structured output consistently was harder than expected. Free-form responses were great for individual CVE explanations, but the security summary needed to be parseable — I needed specific fields like SECURITY_POSTURE and VARIANTS_TO_TEST to programmatically build the comparison command.

The solution was strict prompt formatting: I told the model to respond in `KEY: value` pairs and gave it an explicit example. Then I split each line on `:` and built a dict. When parsing failed, I fell back to a hardcoded comparison command. The other challenge was Llama 3.2 sometimes repeating itself — I solved that with a deduplication pass that checks for repeated section headers (`**1.`, `**Vulnerability`, etc.) and drops them before printing.
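Both tricks fit in a few lines. A minimal sketch under my own helper names (`parse_summary`, `strip_repeats` are hypothetical, not the repo's):

```python
def parse_summary(text: str) -> dict:
    """Split each 'KEY: value' line into a dict; skip lines without a colon.

    partition() splits only on the first colon, so values like image
    tags ('nginx:alpine') survive intact.
    """
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields

def strip_repeats(text: str, markers=("**1.", "**Vulnerability")) -> str:
    """Truncate the response at the point where a section header repeats."""
    seen = set()
    kept = []
    for line in text.splitlines():
        header = next((m for m in markers if line.startswith(m)), None)
        if header:
            if header in seen:
                break  # the model started repeating itself; stop here
            seen.add(header)
        kept.append(line)
    return "\n".join(kept)
```

If `parse_summary` comes back without the fields you need, that is the signal to fall back to the hardcoded comparison command.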


Results

Before — Raw Trivy output:

CVE-2024-1234 (CRITICAL)
Package: openssl 1.1.1k
Description: Use-after-free in X509_verify_cert function

😕 "What does this mean? Do I need to care about this?"

After — AI-enhanced output:

🤖 AI Explanation:
This is like leaving your house key under the doormat.
OpenSSL handles your HTTPS connections, and this bug lets
attackers potentially decrypt traffic. Fix: update your
Dockerfile base image to get openssl 1.1.1w or later.

"Got it, I'll update the base image today."

| Metric | Value |
| --- | --- |
| Avg scan time | 15–30 seconds |
| AI explanation per CVE | ~3 seconds |
| Cloud API cost | $0 |
| Images tested | 50+ (nginx, node, python, ubuntu) |

Manual CVE triage that used to take 20+ minutes per image now takes under a minute for the top 5 vulnerabilities.


Try It Yourself

GitHub: https://github.com/ThinkWithOps/ai-devops-projects/tree/main/01-ai-docker-scanner
Full demo video: https://youtu.be/J6fmU6t9jUU

# Prerequisites: Docker, Trivy, Ollama + llama3.2 pulled
git clone https://github.com/ThinkWithOps/ai-devops-projects.git
cd ai-devops-projects/01-ai-docker-scanner
pip install -r requirements.txt
python src/docker_scanner.py nginx:latest

What's Next

This is Project 1 in my AI+DevOps series. Next I built an AI K8s Pod Debugger — link in my profile.
