DEV Community

Posted by Malika
# Why Wait 90 Days for NVD? I Built a System That Creates CVEs Instantly.

tags: security, ai, cybersecurity, n8n

SOC-CERT: AI-Powered Threat Intelligence — showing data sources (NIST, CERT-FR, BleepingComputer, CISA), CVE enrichment pipeline, Cohere AI agent, and multi-channel output (Gmail, Slack, Sheets) with 99.8% uptime and 0 false positives

I won a dev.to challenge last August by building something I desperately needed: an automated threat intelligence system that doesn't drown your team in noise.

But this article isn't about winning. It's about what I learned building SOC-CERT — and why every developer should think differently about security automation in 2025.


## The Problem Nobody Talks About

Most security advice stops at "keep your dependencies updated, run a static scanner." That's necessary but nowhere near enough.

Here's the real problem: NVD publishes CVEs 30 to 90 days after a vulnerability is discovered. That gap is your exposure window. You're running dependencies with known exploits, and the official database hasn't catalogued them yet.

Meanwhile, your security team — if you even have one — is drowning in alerts from 5 different feeds, all reporting the same CVE with slightly different formatting. Alert fatigue sets in. Things get missed.

I built SOC-CERT to fix exactly this.


## What SOC-CERT Actually Does

SOC-CERT is an automated threat intelligence pipeline built with n8n and Bright Data that:

  • Monitors CISA, NIST NVD, CERT-FR, and BleepingComputer simultaneously
  • Runs every CVE through Cohere Command-R for AI-powered severity scoring
  • Uses hash-based deduplication so the same CVE from 5 different sources generates exactly one alert
  • Delivers structured notifications to Slack, Gmail, and Google Sheets in under 5 minutes from detection

It processes 100+ CVEs daily with 99.8% uptime, at zero cost using free-tier services.

The architecture looks like this:

```
Data Sources (CISA, NIST, CERT-FR, BleepingComputer)
        ↓
Bright Data (proxy rotation, anti-bot bypass)
        ↓
n8n Orchestration Layer
        ↓
Cohere AI Agent (severity scoring + structured output)
        ↓
Hash-based dedup + change detection
        ↓
Multi-channel notifications (Slack / Gmail / Sheets)
```

Full SOC-CERT pipeline diagram — from Cron trigger and Rate Limiter through NIST, CERT-FR, BleepingComputer and CISA data collection, enrichment via CISA KEV, CIRCL and AlienVault OTX, AI agent with Cohere + memory, through to Slack, Gmail and Google Sheets output


## The Technical Decisions That Actually Mattered

### 1. Hash-based deduplication — the most underrated feature

This was the insight that changed everything. Instead of trying to parse and compare CVE IDs across sources (which fails constantly due to formatting differences), I hash the normalized CVE content and track changes.

```javascript
// Simplified dedup logic in an n8n Code node
const crypto = require('crypto');

// seenHashes is a Set persisted across runs (e.g. via workflow static data)
const contentHash = crypto
  .createHash('sha256')
  .update(JSON.stringify({ id: cve.id, description: cve.description }))
  .digest('hex');

if (seenHashes.has(contentHash)) {
  return; // Already processed, skip
}

seenHashes.add(contentHash);
// Process and alert...
```

Result: zero duplicate alerts across 4 sources running 24/7.

n8n workflow in 3 sections — top: data collection from NIST, CERT-FR, BleepingComputer, CISA with Bright Data and CISA KEV enrichment; middle: Normalize → Diff/Hash Check → AI Agent (Cohere + Simple Memory) → AI Output Parser → Structure Alerts; bottom: Structure Alerts branching to Health Check, Critical Alerts, Append to Sheet, Slack Notify and Gmail Alert Message

### 2. Constraining the AI output — not trusting it blindly

The biggest mistake in AI-powered automation is treating the model as a decision-maker. Cohere Command-R is excellent at structured extraction — but only if you constrain it hard.

My system prompt wasn't "analyze this CVE." It was a strict schema definition:

```
Analyze the following CVE data and return ONLY valid JSON with these exact fields:
- severity: CRITICAL | HIGH | MEDIUM | LOW
- cvss_estimated: float between 0-10
- affected_components: array of strings
- exploitation_likelihood: integer 1-5
- summary: max 100 words

Do not include any text outside the JSON object.
```

This prompt, combined with an output parser node in n8n, yields consistent, machine-readable data every time.
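n8n's parser node handles the validation step, but the same guard is easy to write by hand. A hypothetical validator for the schema above (the function name and shape are mine, not the workflow's actual code): reject anything that fails to parse or violates a field constraint, instead of trusting the model.

```javascript
const SEVERITIES = ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW'];

// Returns the parsed object if it matches the schema, null otherwise.
function parseTriage(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // model leaked prose around the JSON → discard
  }
  const valid =
    SEVERITIES.includes(data.severity) &&
    typeof data.cvss_estimated === 'number' &&
    data.cvss_estimated >= 0 && data.cvss_estimated <= 10 &&
    Array.isArray(data.affected_components) &&
    Number.isInteger(data.exploitation_likelihood) &&
    data.exploitation_likelihood >= 1 && data.exploitation_likelihood <= 5 &&
    typeof data.summary === 'string';
  return valid ? data : null;
}
```

A `null` result can be routed to a retry branch or a human-review queue; the point is that malformed model output never reaches the alerting channels.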

### 3. Retry logic and graceful degradation

NIST NVD goes down. CERT-FR rate-limits aggressively. BleepingComputer blocks scrapers regularly.

The pipeline uses a Continue on Error pattern with 3 retry attempts and exponential backoff per source. If a source fails, the pipeline logs the failure, alerts the admin, and continues processing the other sources. Partial data is infinitely better than a crashed pipeline.
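In code, the Continue on Error pattern boils down to a wrapper like this. The sketch below is a generic version, not the n8n node configuration itself: it retries with exponential backoff and resolves to `null` on final failure, so a dead source can never crash the run.

```javascript
// Fetch one source with retries; resolve to null instead of throwing,
// so the caller can skip this source and keep processing the others.
async function fetchWithRetry(fetchFn, { retries = 3, baseDelayMs = 1000 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fetchFn();
    } catch (err) {
      if (attempt === retries) {
        console.error(`source failed after ${retries} retries:`, err.message);
        return null; // continue-on-error: log, alert admin, move on
      }
      // exponential backoff: 1s, 2s, 4s, ...
      await new Promise(r => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```

Each source gets its own wrapper call, so a CERT-FR rate limit or an NVD outage costs you one feed for one cycle, nothing more.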


## The 90-Day Gap Problem — and the Virtual CVE Solution

After winning the n8n challenge, I took this further. Building the Chrome extension version of SOC-CERT revealed something deeper:

Even with a perfect pipeline, you're still reactive. You're tracking vulnerabilities after they've been named.

The solution I built: Virtual CVE Intelligence.

Side-by-side comparison: NVD process (90-day documentation delay, emerging threats invisible, enterprises vulnerable) vs SOC-CERT Guardian protection (2.3-second detection, Virtual CVE intelligence, immediate protection) — showing CVE-2026-148724 detected as Emerging Threat with score 75

```
Normal CVE lifecycle:
Day 0:  Vulnerability discovered
Day 30: Research completed
Day 60: CVE submitted to MITRE
Day 90: Published in NVD
→ 90-day window with zero tracking

Virtual CVE approach:
Second 0:  User visits suspicious URL
Second 2:  Gemini Nano detects threat pattern
Second 5:  Virtual CVE generated (CVE-2026-XXXXX)
Second 10: Alert with remediation steps
→ Immediate tracking from detection moment
```

The Chrome extension combines local Gemini Nano analysis (for speed and privacy) with server-side n8n enrichment against the CISA KEV catalog. Two-stage analysis: fast local result in 2 seconds, enriched result with real CVE correlation shortly after.
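The merge step of that two-stage flow is conceptually simple. Here is a sketch with illustrative field names (`kevMatch`, `riskScore`, and so on are mine, not the extension's actual schema): the fast local verdict is shown immediately, then upgraded in place once server-side enrichment returns.

```javascript
// Stage 1: local Gemini Nano verdict arrives first.
// Stage 2: n8n enrichment (KEV/CVE correlation) may upgrade it.
function mergeVerdicts(localVerdict, enrichment) {
  if (!enrichment) return { ...localVerdict, stage: 'local-only' };
  return {
    ...localVerdict,
    stage: 'enriched',
    // a confirmed KEV match overrides the heuristic score
    riskScore: enrichment.kevMatch
      ? Math.max(localVerdict.riskScore, 90)
      : localVerdict.riskScore,
    relatedCves: enrichment.cves ?? [],
  };
}
```

The design point: the local result is never blocked on the network round-trip, and the enriched result only ever raises the risk score, never lowers it.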

SOC-CERT Guardian Chrome extension panel showing: analyzed URL testphp.vulnweb.com rated Malicious at 90% risk score with 95% confidence, SQL Injection and URL Encoding threat indicators, Gemini Nano AI analysis results, and deep CVE correlation identifying CVE-2020-0618 Critical with remediation recommendations


## What This Means for Any Web Stack

Here's where I want to talk to you directly as a developer.

Your app is probably pulling in dozens of third-party dependencies. Each one is a potential vulnerability surface. Running a dependency scanner in CI is good practice — but it only catches CVEs that are already in the advisory database.

A few patterns from SOC-CERT that translate directly to any stack:

### Pattern 1: Monitor, don't just audit

Virtual CVE Intelligence dashboard showing real-time stats: 224 Virtual CVEs Created, 36 Threats Detected in 24h, 2.3s average detection time, 87% AI confidence — SOC-CERT Guardian v1.0

Set up a lightweight n8n workflow (or even a cron job) that fetches the CISA KEV catalog (a public JSON feed) daily and checks it for any CVE affecting your key dependencies.

```ruby
# Example for Ruby/Rails — adapt the keywords to your own stack
# (Node.js → express, lodash / Python → django, flask / PHP → wordpress, laravel)
task :check_kev => :environment do
  response = HTTParty.get("https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json")
  kev = JSON.parse(response.body)["vulnerabilities"]

  your_stack = ["rails", "rack", "devise", "activerecord"] # ← change this

  relevant = kev.select do |cve|
    your_stack.any? { |dep| cve["product"].downcase.include?(dep) }
  end

  relevant.each do |cve|
    SlackNotifier.alert("🚨 KEV match: #{cve['cveID']} affects #{cve['product']}")
  end
end
```

### Pattern 2: Deduplicate your alerts

If you're piping security alerts to Slack from multiple sources, implement the same hash-based dedup approach. Your team will thank you.

### Pattern 3: Treat AI as a structured data extractor, not an oracle

When using AI to triage security alerts, give it a strict output schema. Vague prompts produce vague outputs. In security, vagueness costs you.


## The Stack — Full Transparency

SOC-CERT uses entirely free tiers:

| Component | Tool | Cost |
| --- | --- | --- |
| Orchestration | n8n (self-hosted) | Free |
| Scraping | Bright Data | Free tier |
| AI scoring | Cohere Command-R | Free tier |
| Notifications | Slack + Gmail | Free |
| Logging | Google Sheets | Free |
| CVE Extension AI | Gemini Nano (Chrome) | Free |

Total infrastructure cost: $0/month.

This matters because one of the biggest barriers to security tooling for small teams is cost. Enterprise threat intelligence platforms cost $50K+/year. SOC-CERT proves you don't need that.


## What I'd Do Differently

Honest retrospective:

I'd implement MITRE ATT&CK mapping from day one. Right now, SOC-CERT tells you a CVE exists and its severity. What it doesn't do well yet is map that to actual attacker TTPs — "here's how they'd exploit this in practice." That's the next level.

I'd add time-window deduplication earlier. Hash-based dedup prevents the same CVE from being alerted twice in the same run. But if a CVE resurfaces 3 weeks later in a new source, it alerts again. Redis-based state tracking with TTL would fix this cleanly.
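That fix is small enough to sketch. In production it would live in Redis (a `SET` with an expiry), but an in-memory version shows the logic, assuming a suppression window passed in as `ttlMs`: re-alerts inside the window are swallowed, and once it expires the CVE alerts again.

```javascript
// Time-window dedup: a CVE re-alerts only after ttlMs has elapsed.
const lastAlerted = new Map(); // cveId → timestamp of last alert

function shouldAlert(cveId, ttlMs, now = Date.now()) {
  const last = lastAlerted.get(cveId);
  if (last !== undefined && now - last < ttlMs) {
    return false; // still inside the suppression window
  }
  lastAlerted.set(cveId, now);
  return true;
}
```

With a three-week TTL, the "same CVE resurfaces in a new source three weeks later" case from above is suppressed inside the window and allowed through after it.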

I'd separate the enrichment pipeline from the alerting pipeline. Currently they're coupled in a single n8n workflow. Decoupling them would allow async enrichment without blocking alerts.


## The Bigger Picture

Three months, three challenges, one ecosystem:

  • August 2025: n8n pipeline — won the AI Agents Challenge
  • September 2025: KendoReact dashboard — enterprise visualization layer
  • October 2025: Chrome AI extension — proactive browser-level detection

SOC-CERT Dashboard built with KendoReact — showing real-time alerts table (Failed login attempt HIGH, Malware activity MEDIUM, Failed login attempt LOW), severity distribution donut chart (45% HIGH, 20% MEDIUM, 35% LOW), and AI Analyst panel

Each product reused the same core intelligence layer. That's the real lesson: build systems, not features. A well-designed automation pipeline compounds in value every time you extend it.


## Resources


If you're a developer or small team thinking about security automation — and you can't afford enterprise tooling — I hope this gives you a concrete starting point.

Drop your questions in the comments. Always happy to dig into implementation details. 🛡️

— Malika | Full-stack Rails dev, Paris | github.com/joupify
