Building IntelliHunt: An AI-Agentic Platform for Automated Cyber Threat Intelligence
Cyber Threat Intelligence (CTI) work is relentless. Every day, new CVEs are published, threat actors shift tactics, and defenders are left manually cross-referencing vulnerability databases, reading advisories, and hand-writing detection queries. It's important work — but a significant chunk of it is mechanical enough to automate.
That's the premise behind IntelliHunt: a fully containerized, AI-driven platform that takes a description of your software stack and produces an actionable threat intelligence report — complete with CVE analysis, organizational risk context, and Splunk detection queries — without you having to touch the NVD API or write a single SPL line manually.
IntelliHunt is designed to be run daily or on an ad hoc basis. Kick it off each morning and you get a focused snapshot of the last 24 hours of NVD disclosures relevant to your stack. Trigger it manually after a major vendor advisory or a new repository merge and you get an immediate exposure assessment. The pipeline is optimized to complete fast enough that either pattern is practical.
This post walks through how it's built, what makes it interesting architecturally, and the performance and UX decisions made along the way.
The Problem with Manual CTI
A typical CTI workflow looks something like this:
- Pull a list of software your organization runs
- Check NVD / CISA KEV for relevant CVEs
- Read each CVE advisory and related blogs to understand exploitability
- Figure out whether your specific version and configuration is actually affected
- Write detection logic for your SIEM
- Repeat for the next software component
Step 2 onward is where most of the time goes. Checking dozens of CVEs, reading technical writeups, and translating that into detection rules can take a skilled analyst hours per software component. IntelliHunt automates steps 2 through 5 using an agentic AI pipeline.
Architecture at a Glance
IntelliHunt is split into two containers that communicate over a bridge network:
┌─────────────────────────────────────────────────┐
│                 Docker Compose                  │
│                                                 │
│  ┌──────────────┐    ┌───────────────────┐      │
│  │  Next.js 15  │───▶│ Django + Channels │      │
│  │ (port 3000)  │    │    (port 8000)    │      │
│  └──────────────┘    └─────────┬─────────┘      │
│                                │                │
│                   ┌────────────▼────────────┐   │
│                   │     CrewAI Pipeline     │   │
│                   │ (spawned as subprocess) │   │
│                   └────────────┬────────────┘   │
│                                │                │
│                   ┌────────────▼────────────┐   │
│                   │    NVD / CISA / Web     │   │
│                   └─────────────────────────┘   │
└─────────────────────────────────────────────────┘
- Frontend: Next.js 15 (App Router) with Tailwind v4, Outfit + DM Sans font pair
- Backend: Django with Django Channels for WebSocket support
- AI Pipeline: CrewAI orchestrating two specialized agents
- Data Sources: NIST NVD CPE/CVE APIs, Google Search, web scraping
The Agentic Pipeline
The heart of IntelliHunt is a two-agent CrewAI crew that runs sequentially. Each agent has a defined role, goal, and toolset — and their outputs chain into each other.
Agent 1: The CTI Analyst
ctiAnalyst:
  role: >
    Senior Cyber Threat Intelligence Analyst with expertise in
    vulnerability assessment and threat attribution
  goal: >
    Conduct comprehensive threat intelligence analysis including
    exploitability assessment, threat actor attribution, and
    organization-specific risk evaluation.
This agent receives structured vulnerability data from NVD (CVE IDs, CVSS scores, affected versions) and enriches it. It uses Google Search and a web scraper to pull in recent technical blogs, PoC repositories, and threat actor reporting. It outputs a rich structured object (EnhancedTrendingThreat) that includes:
- Exploitability metrics (attack vector, complexity, privileges required)
- Known threat actors and campaigns associated with the CVE
- Indicators of compromise (IPs, domains, file hashes)
- Organization-specific risk assessment based on the submitted software stack
- Recommended immediate actions
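The actual EnhancedTrendingThreat model lives in the pipeline code; a rough sketch of its shape, using stdlib dataclasses with assumed (not actual) field names, might look like:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the EnhancedTrendingThreat output shape.
# Field names here are assumptions, not the real schema.
@dataclass
class EnhancedTrendingThreat:
    cve_id: str
    cvss_score: float
    attack_vector: str                              # e.g. "NETWORK"
    threat_actors: list = field(default_factory=list)
    iocs: dict = field(default_factory=dict)        # {"ips": [...], "domains": [...], "hashes": [...]}
    org_risk: str = "UNKNOWN"                       # risk relative to the submitted stack
    recommended_actions: list = field(default_factory=list)

threat = EnhancedTrendingThreat(
    cve_id="CVE-2021-44228",
    cvss_score=10.0,
    attack_vector="NETWORK",
    threat_actors=["Lazarus Group"],
)
```

A structured model like this is what lets Agent 2 consume Agent 1's output deterministically instead of parsing free text.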
Agent 2: The CTH Analyst (Threat Hunter)
cthAnalyst:
  role: >
    Senior Cyber Threat Hunter and Detection Engineer with expertise
    in SIEM rule development and threat hunting
  goal: >
    Develop sophisticated detection methods and hunting queries based
    on comprehensive threat intelligence.
  max_iter: 1
This agent takes the enriched threat data from Agent 1 and generates Splunk SPL detection queries. It outputs an EnhancedDetectionMethod that includes the query itself, confidence level, false positive risk, relevant log sourcetypes, and testing instructions. max_iter=1 caps the agent to a single reasoning pass — empirically this produces detection queries of the same quality as multi-pass runs while cutting execution time significantly.
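As with the first agent's output, a hedged dataclass sketch of the EnhancedDetectionMethod shape (field names assumed, not the real schema) gives a feel for what the hunter produces:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the EnhancedDetectionMethod output shape;
# field names are assumptions, not the actual schema.
@dataclass
class EnhancedDetectionMethod:
    cve_id: str
    spl_query: str
    confidence: str            # e.g. "high" / "medium" / "low"
    false_positive_risk: str
    sourcetypes: list = field(default_factory=list)
    testing_instructions: str = ""

method = EnhancedDetectionMethod(
    cve_id="CVE-2021-44228",
    spl_query='index=web sourcetype=access_combined "${jndi:"',
    confidence="high",
    false_positive_risk="low",
    sourcetypes=["access_combined"],
)
```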
Running the Pipeline
The pipeline is triggered asynchronously from Django — a POST to /api/generate/ spawns it in a background thread and returns a task_id immediately. The frontend polls /api/task/<id>/ every two seconds and streams the agent's stdout logs into a live log viewer:
process = subprocess.Popen(
    ['python', '-u', 'intelliHunt/crew/system.py', payload_file_path],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    stdin=subprocess.DEVNULL,  # prevents CrewAI's interactive prompt from blocking
    text=True,
    bufsize=1
)

_log_flush_counter = 0
for line in process.stdout:
    status_data['logs'].append(line.rstrip('\n'))
    task_status[task_id] = status_data
    # Batch disk writes — flush every 10 lines instead of every line
    _log_flush_counter += 1
    if _log_flush_counter % 10 == 0:
        save_task_status(task_id, status_data)
The batched disk write is a small but meaningful optimization: writing the task status JSON file on every log line was causing measurable I/O overhead during verbose agent runs. Writing every 10 lines drops disk writes by 90% with no visible effect on the UI's polling cadence.
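The trigger-and-poll pattern itself can be sketched framework-free. This is an illustrative reduction, not IntelliHunt's actual Django view; `task_status` mirrors the in-memory dict used in the snippet above:

```python
import threading
import uuid

task_status = {}  # task_id -> {"state": ..., "logs": [...]}

def start_task(worker):
    """Spawn `worker` in a background thread and return a task_id immediately.
    The worker receives the mutable status dict and appends logs to it."""
    task_id = str(uuid.uuid4())
    task_status[task_id] = {"state": "running", "logs": []}

    def run():
        try:
            worker(task_status[task_id])
            task_status[task_id]["state"] = "complete"
        except Exception as exc:
            task_status[task_id]["state"] = f"error: {exc}"

    threading.Thread(target=run, daemon=True).start()
    return task_id
```

The client then polls a status endpoint that simply returns `task_status[task_id]`, which is what the every-two-seconds frontend poll reads.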
NVD / CPE Integration
Before the AI agents run, IntelliHunt fetches structured vulnerability data from the NIST National Vulnerability Database using the CPE (Common Platform Enumeration) API.
You describe your software stack in a YAML config:
software_stack:
  operating_systems:
    - vendor: microsoft
      product: windows_server_2022
      version: "21H2"
  applications:
    - vendor: apache
      product: log4j
      version: "2.14.1"
    - vendor: openssl
      product: openssl
      version: "3.0.2"
The NVDCPEClient resolves each entry to its canonical CPE URI, then the CVEFetcher queries the NVD CVE API for all known vulnerabilities matching those CPEs within the last 24 hours. Targeting a single day rather than a rolling week keeps the dataset small and focused on what's actually new, and means the agents don't burn context window chewing through already-reviewed findings.
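Under the hood, the per-CPE query is a single call to the NVD CVE 2.0 endpoint with a CPE name and a publication window. A hedged sketch of the parameter construction (the function name and exact date formatting are assumptions; consult NVD's API documentation for authoritative details):

```python
from datetime import datetime, timedelta, timezone

NVD_CVE_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def build_cve_query(cpe_uri: str, days_back: int = 1) -> dict:
    """Build query params for the NVD CVE 2.0 API: CVEs matching a CPE,
    published within the last `days_back` days (ISO-8601 timestamps)."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days_back)
    return {
        "cpeName": cpe_uri,
        "pubStartDate": start.isoformat(timespec="seconds"),
        "pubEndDate": end.isoformat(timespec="seconds"),
    }

# Usage: e.g. requests.get(NVD_CVE_API, params=build_cve_query(
#     "cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*"))
```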
The YAML config can be:
- Uploaded directly via the UI (a template is provided for download)
- Manually entered through the UI form (vendor/product rows per OS and application)
- Synced from a CMDB via the Settings integrations panel (more on that below)
- Auto-detected via the Repository Scanner
Parallel CVE Fetching
The original implementation fetched CVEs for each CPE sequentially with a 5-second sleep between requests — fine for a handful of components, but noticeably slow with a full software stack. The updated fetcher parallelizes across CPEs:
from concurrent.futures import ThreadPoolExecutor, as_completed

with ThreadPoolExecutor(max_workers=5) as executor:
    futures = {
        executor.submit(self._fetch_cves_for_single_cpe, cpe, days_back): cpe
        for cpe in cpe_list[:max_cpes]
    }
    for future in as_completed(futures):
        results.extend(future.result())
Combined with reducing the inter-request sleep from 5 s to 0.6 s (still polite enough for the NVD rate limit), this brings typical CVE fetch time down from over a minute to under 15 seconds for a medium-sized stack.
CVE Prioritization
Rather than handing every fetched CVE to the agent crew, the pipeline now sorts by CVSS score and caps the input at the top 10:
inputs = sorted(inputs, key=lambda x: x.get('cvss_score') or 0, reverse=True)[:10]
Analysts care most about critical and high findings. Feeding 40 medium-severity CVEs through a full agentic enrichment loop wastes time and API tokens without producing proportionally better output.
Result Caching
NVD responses are cached for 12 hours. For daily runs this means the second run of the day — if someone triggers a re-run a few hours after the morning job — hits the cache instead of re-fetching the same set.
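A minimal sketch of that caching behavior, assuming a simple in-memory TTL store (the real implementation may differ in persistence and keying):

```python
import time

class TTLCache:
    """Minimal in-memory TTL cache sketching the 12-hour NVD response cache."""

    def __init__(self, ttl_seconds: float = 12 * 3600):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
```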
Repository Scanner
One of the more practically useful features is the Repo Scanner. Instead of manually cataloguing your software stack, you point it at a codebase and it figures out what you're running.
It runs two tools in parallel:
OSV-Scanner
Google's OSV-Scanner walks dependency manifests (requirements.txt, package.json, go.sum, pom.xml, Cargo.lock, etc.) and cross-references them against the OSV vulnerability database to produce a list of CVE IDs with affected package versions.
Semgrep
Semgrep runs static analysis across the codebase using its --config=auto rule set, catching code-level security issues beyond what dependency scanning can see — things like SQL injection patterns, hardcoded secrets, and insecure deserialization.
Both scanners are wrapped as CrewAI tools and handed to the Code Security Analyst agent:
@tool("run_security_scans")
def run_security_scans(directory: str) -> str:
    """Runs OSV-Scanner and Semgrep on a directory and returns a combined JSON summary."""
    osv_cmd = ["osv-scanner", "-r", directory, "--json"]
    sem_cmd = ["semgrep", "scan", "--config=auto", directory, "--json"]
    # Both scanners exit non-zero when they find issues, so don't check return codes
    osv = subprocess.run(osv_cmd, capture_output=True, text=True)
    sem = subprocess.run(sem_cmd, capture_output=True, text=True)
    return json.dumps({
        "osv": json.loads(osv.stdout or "{}"),
        "semgrep": json.loads(sem.stdout or "{}"),
    })
After scanning, a Security Researcher agent enriches the found CVEs against CISA KEV (Known Exploited Vulnerabilities) and NVD to produce a prioritized report that highlights which findings are actively exploited in the wild.
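The KEV check itself is conceptually a set-membership test of each finding's CVE ID against CISA's catalog. A hedged sketch, with the catalog passed in rather than fetched, and illustrative (not actual) function and field names:

```python
def flag_kev(findings: list, kev_cve_ids: set) -> list:
    """Annotate each scanner finding with whether its CVE appears in the
    CISA Known Exploited Vulnerabilities catalog. In production the catalog
    would be loaded from CISA's published JSON feed."""
    return [
        {**finding, "known_exploited": finding.get("cve_id") in kev_cve_ids}
        for finding in findings
    ]

# Usage: findings flagged True float to the top of the prioritized report.
kev = {"CVE-2021-44228"}
annotated = flag_kev(
    [{"cve_id": "CVE-2021-44228"}, {"cve_id": "CVE-2020-0001"}], kev
)
```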
The scanner accepts:
- A public or private GitHub URL
- A local .zip archive (uploaded through the UI)
Scan Information
Each completed scan produces a structured markdown report built from the raw agent output. For every detection found, the report includes:
- Detection method — the name and description of the specific vulnerability or code pattern identified
- Detection type — classification of the finding (e.g., dependency vulnerability, secret exposure, insecure code pattern)
- Confidence level — the agent's assessment of how certain the detection is
- False positive risk — likelihood that the finding is a false alarm given the codebase context
- Detection environment — the log sourcetypes, existing SIEM rules, and retention context relevant to hunting for this threat
- Threat indicators — broken down into network indicators (IPs, domains), host indicators (file paths, registry keys), behavioral indicators (process chains, lateral movement patterns), and temporal indicators (timing patterns associated with exploitation)
- References — links to CVE advisories, PoC repositories, and technical writeups sourced during enrichment
- Testing instructions — step-by-step guidance for validating the detection query in your environment
- Maintenance notes — tuning guidance and known edge cases for the generated detection logic
CMDB Integrations
Getting an accurate software stack into the tool has always been a friction point. The Settings panel includes a CMDB integration system that pulls your live asset inventory and writes it directly to the organization_cmdb.yaml config that drives report generation.
Supported integrations:
| Provider | Auth |
|---|---|
| ServiceNow | Basic (username / password) |
| BMC Helix ITSM | JWT login |
| Atlassian Assets (JSM) | Basic (email / API token) |
| Custom API | None, Basic, Bearer, or API key header |
Each integration has three operations:
- Test — validates credentials and connectivity without importing anything
- Import — pulls CI records, maps OS names to vendor/product pairs, and writes the resulting YAML config
- Last synced — records the timestamp of the last successful import so you know how fresh your inventory is
The custom integration type deserves a note: it accepts any REST endpoint that returns a JSON list or object with a predictable shape, which means it works against internal asset databases, homegrown CMDBs, or any inventory system that exposes an API.
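The interesting part of the import operation is the OS-name mapping: free-text CI records become vendor/product pairs for the YAML config. A hedged sketch with a hypothetical mapping table and record shape (not the actual code):

```python
# Hypothetical mapping table: normalized OS-name prefixes -> (vendor, product).
OS_NAME_MAP = {
    "windows server 2022": ("microsoft", "windows_server_2022"),
    "ubuntu": ("canonical", "ubuntu_linux"),
    "red hat enterprise linux": ("redhat", "enterprise_linux"),
}

def map_ci_record(ci: dict):
    """Map a raw CI record's OS name to a vendor/product/version entry,
    or None when no mapping is known."""
    name = ci.get("os_name", "").strip().lower()
    for prefix, (vendor, product) in OS_NAME_MAP.items():
        if name.startswith(prefix):
            return {"vendor": vendor, "product": product,
                    "version": ci.get("os_version", "")}
    return None  # unmapped OS: skipped, or surfaced for manual review
```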
The Dashboard UI
The front-end is a Next.js 15 app with a midnight charcoal-navy theme — deep layered backgrounds (#090c14), electric blue accent (#4a9eff), and jewel-toned severity colors (ruby critical, amber high, sapphire medium, emerald low). Typography uses the Outfit + DM Sans font pair: geometric display weight for headings, clean sans-serif for body text.
The design system is driven entirely by CSS custom properties, so every color decision — accent glows, glass surface backgrounds, focus rings — is controlled from one place in globals.css and propagates automatically.
What Gets Displayed
Metric Cards — four stat cards across the top show: Total CVEs, Critical severity count (CVSS 9.0+), High severity count (CVSS 7.0–8.9), and Medium/Low combined. Each card has a jewel-toned gradient background and a top accent line that goes full-opacity when that severity filter is active.
Severity Distribution — an SVG donut chart (built without a chart library — just hand-rolled SVG arc paths) shows the breakdown of CVEs by severity level with a live legend. The chart data is derived entirely from regex-parsing CVSS scores and severity keywords out of the report markdown.
Threats Ranked by Score — a horizontal bar chart shows the top CVEs ordered by CVSS score, with bars color-coded by severity tier.
Vulnerable Software — a searchable table of software products extracted from the report's "Affected Products" sections, showing CVE count, max score, and a severity badge per product.
Full Report — the complete markdown output from the AI crew is rendered below using react-markdown, with styled headings, code blocks, and tables.
The donut chart, for example, is built entirely in SVG with no third-party dependency:
const slices = entries.map(e => {
  const frac = e.value / total;
  const a0 = angle, a1 = angle + frac * 2 * Math.PI;
  angle = a1;
  const lg = frac > 0.5 ? 1 : 0;
  return {
    ...e,
    color: SEV_COLORS[e.key],
    path: [
      `M ${cx + r * Math.cos(a0)} ${cy + r * Math.sin(a0)}`,
      `A ${r} ${r} 0 ${lg} 1 ${cx + r * Math.cos(a1)} ${cy + r * Math.sin(a1)}`,
      `L ${cx + inner * Math.cos(a1)} ${cy + inner * Math.sin(a1)}`,
      `A ${inner} ${inner} 0 ${lg} 0 ${cx + inner * Math.cos(a0)} ${cy + inner * Math.sin(a0)}`,
      "Z",
    ].join(" "),
  };
});
Staggered Entrance Animations
The stat card grid uses a CSS stagger pattern — each direct child gets a progressively longer animation-delay so the cards cascade in rather than all appearing at once:
.stagger > * { animation: fade-up 0.35s ease both; }
.stagger > *:nth-child(1) { animation-delay: 0ms; }
.stagger > *:nth-child(2) { animation-delay: 60ms; }
.stagger > *:nth-child(3) { animation-delay: 120ms; }
/* ... */
A single .stagger class on the grid container is all that's needed — no JavaScript, no IntersectionObserver. The charts row follows with a 320 ms delay so the cards finish entering before the charts animate in.
All animations are gated behind @media (prefers-reduced-motion: reduce) to respect system accessibility settings.
Performance Summary
The original implementation was oriented around 7-day CVE lookbacks and sequential API calls, which produced long run times that made daily use feel impractical. Here's what changed:
| Area | Before | After |
|---|---|---|
| CVE lookback window | 7 days | 1 day |
| NVD fetch strategy | Sequential | Parallel (ThreadPoolExecutor, 5 workers) |
| Per-request sleep | 5 s | 0.6 s |
| CPE limit per run | 50 | 15 |
| CVEs fed to agents | Unbounded | Top 10 by CVSS score |
| Agent max iterations | Default | max_iter=1 (CTH analyst) |
| Result cache TTL | 6 hours | 12 hours |
| Task status disk writes | Every log line | Every 10 log lines |
The cumulative effect is that a typical run completes in roughly a third of the original time, which is what makes running this daily actually viable rather than aspirational.
Tech Stack Summary
| Layer | Technology |
|---|---|
| Frontend | Next.js 15, React 19, Tailwind v4 |
| Backend | Django 5, Django Channels (WebSockets) |
| AI Orchestration | CrewAI |
| LLM Providers | OpenAI, Groq (configurable via .env) |
| Vulnerability DB | NIST NVD CPE/CVE API |
| Static Analysis | Semgrep (--config=auto) |
| Dependency Scanning | OSV-Scanner |
| CMDB Integrations | ServiceNow, BMC Helix, Atlassian Assets, Custom |
| Container Runtime | Docker + Docker Compose |
Getting Started
Prerequisites
- Docker and Docker Compose
- An API key for OpenAI or Groq
Setup
# 1. Clone the repository
git clone https://github.com/spkatragadda/intelliHunt
cd intellihunt
# 2. Configure your environment
cp env.example .env
# Edit .env — set MODEL, OPENAI_API_KEY or GROQ_KEY, MAX_TOKENS, MAX_RPM
# 3. Build and launch
docker-compose up --build
Navigate to http://localhost:3000. On first load, if no report exists yet, the UI drops you into the Generate tab automatically.
Generating Your First Report
- In the Generate tab, add your operating system(s) and application(s) using the vendor/product input rows — or upload a YAML config file (download the template first for reference), or connect a CMDB integration in Settings and import your inventory.
- Click Run Report.
- Watch the live log stream as the CrewAI agents work through the analysis.
- When complete, the app switches to the Dashboard tab and displays your metrics.
Scanning a Repository
Navigate to Repository Scanner in the sidebar. Paste a GitHub URL or upload a .zip of your repo. The scan runs OSV-Scanner and Semgrep, then enriches the findings against NVD and CISA KEV. The report persists in the scan history — you can re-open it any time by clicking the row.
Design Decisions Worth Noting
Why 1-day CVE window? The original 7-day window was a safe default, but in practice most CTI teams only care about what's new since their last run. A 1-day window means daily runs produce focused, fresh output rather than re-processing a growing backlog of already-reviewed findings. For deeper historical analysis the window can be extended in the config.
Why subprocess for the AI pipeline? The CrewAI crew is resource-intensive and can run for several minutes. Spawning it as a subprocess keeps Django responsive and makes it straightforward to stream stdout logs back to the client without blocking the web server. A proper Celery + Redis setup would be the production answer, but the subprocess approach works well for single-instance deployments.
Why hand-roll the SVG charts? Adding Recharts or Chart.js for two charts would pull in hundreds of kilobytes of JavaScript. The donut and bar chart shapes are simple enough that 30–40 lines of SVG math handles them cleanly.
Why parse the markdown on the client? The report is already being fetched and rendered — running a few regexes over it to extract CVSS scores and severity labels is essentially free and avoids building a separate structured output endpoint.
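As an illustration of that extraction, here is the same idea in Python (the dashboard does this client-side in TypeScript, and the exact patterns are assumptions, not the shipped regexes):

```python
import re

def extract_cve_scores(report_md: str) -> dict:
    """Pull CVE IDs and nearby CVSS scores out of report markdown.
    Patterns are illustrative; the dashboard's client-side regexes may differ."""
    scores = {}
    # Match a CVE ID followed, within ~80 characters, by "CVSS" and a score
    pattern = re.compile(
        r"(CVE-\d{4}-\d{4,7}).{0,80}?CVSS[^\d]{0,10}(\d{1,2}\.\d)",
        re.DOTALL,
    )
    for match in pattern.finditer(report_md):
        scores[match.group(1)] = float(match.group(2))
    return scores
```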
Why localStorage for scan history? Repo scans take minutes. Users navigate away, close tabs, and come back later. Storing the report content client-side survives server restarts and requires no database schema changes. The practical downside — localStorage size limits — isn't a concern for typical markdown reports of a few KB, and history is capped at 20 records. This feature should be updated to store recently run reports in the database.
What's Next
- Local inference support — run against Ollama/llama.cpp models for air-gapped environments where sending vulnerability data to an external API isn't acceptable
- Sigma rule generation — extend detection output beyond Splunk SPL to Sigma rules usable across Elastic, Microsoft Sentinel, and other SIEMs
- OSINT feed integration — pull from MISP, AlienVault OTX, and ISAC feeds to broaden the threat intelligence context
- Scheduled runs — built-in cron scheduler so reports run automatically without manual intervention
- Additional SIEM integrations — Elastic SIEM and Microsoft Sentinel KQL support
- Robust database storage — store generated reports in the database, indexed by type and date
Closing Thoughts
Building IntelliHunt has been an exercise in figuring out where AI genuinely saves time in a security workflow vs. where you still need a human in the loop. The NVD data fetching, CVE enrichment, and initial detection query drafting are all solid candidates for automation — the agents are surprisingly good at turning raw CVE data into structured, context-aware reports. The final human review step (validating the detection logic against your specific environment, tuning false positive thresholds) stays manual, and probably should.
The performance work was a reminder that automation tools only get used if they're fast enough to fit into a daily workflow. A tool that takes 20 minutes to run gets run weekly, if at all. The same tool at 5–7 minutes could get run every morning.
The codebase is open source. Contributions, issues, and ideas are welcome.