Mr Elite

Posted on May 19 • Originally published at securityelites.com

How AI and LLMs are discovering zero-days faster than human researchers in 2026



In 2024, Google's Big Sleep agent, a collaboration between Project Zero and Google DeepMind, discovered a previously unknown vulnerability in the SQLite database engine. The flaw was an exploitable stack buffer underflow that the project's existing fuzzing had not caught, and it was reported and fixed before it reached an official release. My framing on AI vulnerability discovery: the human researcher is no longer the rate-limiting factor in finding bugs. The rate-limiting factor is now compute and clever prompting. For bug bounty hunters, red teamers, and security researchers, understanding how AI changes the vulnerability discovery pipeline is not optional: it defines the competitive landscape for the next five years.

What You’ll Learn

How AI-assisted vulnerability discovery works — the full pipeline
LLM-assisted code review vs AI-assisted fuzzing — different tools for different contexts
Real documented cases of AI discovering vulnerabilities in production software
How to integrate AI tools into your own vulnerability research workflow
The limitations — where AI fails at vulnerability discovery

⏱️ 35 min read · 3 exercises

AI Vulnerability Discovery 2026

1. The AI Vulnerability Discovery Pipeline
2. LLM-Assisted Code Review
3. AI-Assisted Fuzzing
4. Real Documented AI Discoveries
5. Limitations: Where AI Falls Short

AI vulnerability discovery is the offensive research application of AI that most directly accelerates the penetration testing workflow. The AI Red Teaming Guide covers how to incorporate AI vulnerability discovery into formal assessment methodology. The LLM Fuzzing Techniques article goes deep on one specific sub-technique covered here.

The AI Vulnerability Discovery Pipeline

My framework for AI vulnerability discovery has four stages, each with different AI contribution levels. The stages where AI adds the most value are code triage (identifying which files and functions to audit) and pattern recognition (flagging code patterns known to be vulnerable). The stages where human expertise is still essential are vulnerability confirmation (does this code actually reach a dangerous state in practice?) and exploitation (can the vulnerability be triggered in a real attack?).

AI VULNERABILITY DISCOVERY PIPELINE

Stage 1: Target and scope selection

AI contribution: LOW — human researcher selects target based on strategic value
AI utility: help research attack surface, dependency graphs, previous vuln history

Stage 2: Code triage and prioritisation

AI contribution: HIGH — this is where AI earns its keep
Task: “Given this 200,000 line codebase, which files handle user input near dangerous functions?”
Task: “Flag all uses of strcpy, sprintf, gets, system() in this C codebase”
Task: “Identify all SQL query construction patterns not using prepared statements”
Time saved: hours of grep/manual triage compressed to minutes

Stage 3: Pattern-based vulnerability detection

AI contribution: MEDIUM-HIGH — good at known patterns, poor at novel logic bugs
Task: “Review this function for buffer overflow conditions”
Task: “Does this authentication logic contain a bypass condition?”
Task: “Is this JWT validation complete or does it have any edge case bypasses?”

Stage 4: Exploitation and confirmation

AI contribution: LOW-MEDIUM — AI assists PoC drafting, human confirms exploitability
Task: “Draft a proof-of-concept for this buffer overflow given this function signature”
Reality: AI PoC drafts are starting points, not final exploits
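
The Stage 2 task of flagging strcpy, sprintf, gets and system() calls can be roughed out locally before any code goes to an LLM. The following is a minimal sketch, not the author's tooling: the function names `scan_source` and `rank_files` are hypothetical, and the regex is a plain text match, so it will also flag calls that appear inside comments or string literals.

```python
import re
from collections import Counter
from pathlib import Path

# Calls named in the Stage 2 tasks; extend the tuple for other targets.
DANGEROUS_CALLS = ("strcpy", "sprintf", "gets", "system")

# \b stops "gets" from matching inside "fgets" and "sprintf" inside "vsprintf".
CALL_RE = re.compile(r"\b(" + "|".join(DANGEROUS_CALLS) + r")\s*\(")


def scan_source(text):
    """Return (line_number, call_name) for each flagged call in C source text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in CALL_RE.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


def rank_files(root):
    """Count flagged calls per .c/.h file under root, most hits first."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in {".c", ".h"}:
            hits = scan_source(path.read_text(errors="replace"))
            if hits:
                counts[str(path)] = len(hits)
    return counts.most_common()
```

Calling `rank_files("path/to/checkout")` gives a list of `(file, hit_count)` pairs, which is one way to decide which files to hand to the LLM first. A text match like this only narrows the search; whether any hit is reachable with attacker-controlled input is still a Stage 3 and Stage 4 question.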

LLM-Assisted Code Review

LLM-assisted code review is my most-used AI tool in vulnerability research. The workflow: paste a function or module into the LLM, ask it to identify security issues, and review the flagged items. The LLM acts as a first-pass filter that identifies obvious patterns — I then focus my expert time on the items it flags and the areas it’s likely to miss (complex authentication logic, race conditions, integer overflow edge cases).
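
The paste-and-ask step of that workflow can be made repeatable with a small prompt builder. This is a sketch under assumptions: it reuses the wording of this article's general review prompt, `build_review_prompt` and `number_lines` are hypothetical helper names, and sending the resulting string to a model is left to whichever client you use. Numbering the lines is a design choice, so that the model's "vulnerable line" answers can be checked against the source.

```python
REVIEW_TEMPLATE = (
    "Review this {language} code for security vulnerabilities.\n"
    "Focus on: {focus}.\n"
    "For each issue: 1) Describe the vulnerability "
    "2) Show the vulnerable line 3) Explain exploitability\n\n"
    "{code}"
)

DEFAULT_FOCUS = (
    "injection flaws",
    "buffer overflows",
    "authentication bypasses",
    "insecure deserialization",
    "path traversal",
)


def number_lines(code):
    """Prefix each line with its 1-based number, right-aligned."""
    lines = code.splitlines()
    width = len(str(len(lines)))
    return "\n".join(
        f"{i:>{width}}: {line}" for i, line in enumerate(lines, start=1)
    )


def build_review_prompt(language, code, focus=DEFAULT_FOCUS):
    """Fill the review template with the language, focus list and numbered code."""
    return REVIEW_TEMPLATE.format(
        language=language,
        focus=", ".join(focus),
        code=number_lines(code),
    )
```

For example, `build_review_prompt("C", source_text)` returns one string ready to paste into a chat window. Passing a shorter `focus` tuple narrows the review to the bug classes that matter for the target.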

LLM CODE REVIEW — PROMPTS AND PATTERNS

General vulnerability review prompt

“Review this [language] code for security vulnerabilities.
Focus on: injection flaws, buffer overflows, authentication bypasses,
insecure deserialization, path traversal. For each issue:
1) Describe the vulnerability 2) Show the vulnerable line 3) Explain exploitability”

Targeted SQL injection review

“Identify all SQL query construction in this code.
For each query: is user input concatenated directly? Is parameterisation used?
Rate the SQLi risk of each query: Critical/High/Medium/Low”

Authentication logic review

“Analyse this authentication function for bypass conditions.
Consider: type confusion, null handling, comparison edge cases,
off-by-one errors, integer overflow in length checks”

What LLMs find well vs poorly

GOOD: injection patterns, obvious deserialization, missing auth checks, hardcoded credentials
GOOD: deprecated/dangerous function usage (strcpy, eval, system())
POOR: complex multi-function logic flows where bug requires state understanding
POOR: race conditions (time-of-check-to-time-of-use flaws)
POOR: novel logic bugs with no prior similar pattern in training data
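
Before using the targeted SQL injection prompt, it can help to narrow the input to lines that look like dynamically built queries. The sketch below is a rough heuristic, not a parser: `classify_query_line` and `find_dynamic_queries` are hypothetical names, the patterns only cover string concatenation, PHP-style `$variable` interpolation, Python f-strings and `.format()` calls, and both false positives and misses should be expected.

```python
import re

SQL_KEYWORD_RE = re.compile(r"\b(SELECT|INSERT|UPDATE|DELETE)\b", re.IGNORECASE)

# Rough signs that a value is spliced into the query text instead of bound.
DYNAMIC_RE = re.compile(
    r"""
      ["']\s*[.+]\s*[\w$]       # closing quote, then . or + and a value
    | [\w$\])]\s*[.+]\s*["']    # a value, then . or + and an opening quote
    | ["'][^"'\n]*\$\w          # PHP-style $variable inside a quoted string
    | \bf["']                   # Python f-string prefix
    | \.format\(                # Python str.format call
    """,
    re.VERBOSE,
)


def classify_query_line(line):
    """Return 'dynamic', 'static', or None when the line has no SQL keyword."""
    if not SQL_KEYWORD_RE.search(line):
        return None
    return "dynamic" if DYNAMIC_RE.search(line) else "static"


def find_dynamic_queries(text):
    """Return (line_number, stripped_line) for lines classified as dynamic."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if classify_query_line(line) == "dynamic":
            found.append((lineno, line.strip()))
    return found
```

The lines this returns are candidates to paste into the prompt along with their surrounding function. A "static" result is not a clean bill of health, since queries assembled across several lines or through helper functions will slip past a line-by-line check.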

EXERCISE 1 — THINK LIKE A RESEARCHER (20 MIN)
Use LLM-Assisted Code Review on a Real Target

TARGET: Any open-source project from GitHub that processes user input.

(Good targets: old PHP apps, C utilities, Python web frameworks)

Step 1: Select a target

Go to GitHub and find a PHP or C project with:
– User input handling (web form, CLI argument, file parsing)
– Ideally older code (2010-2018 vintage → more likely to have issues)
– Reasonable size (500–2000 lines)

Step 2: Paste a key function into an LLM

Choose a function that handles user input. Prompt:

“Review this code for security vulnerabilities. Focus on injection flaws, buffer overflows, authentication bypasses. For each issue found: 1) Describe the vulnerability 2) Show the vulnerable line 3) Explain if it is exploitable and how”
