How I Built an AI That Discovers Zero-Days Autonomously
The Problem
Penetration testing is broken. A human pentester can try maybe 15-20 techniques before mental fatigue sets in. They forget lessons between engagements. They repeat mistakes. They cost companies $10,000-$50,000 per test and take weeks to deliver results.
But the real problem is deeper. Every pentesting tool ever built does exactly what it's told. SQLmap does SQL injection. Nmap does port scanning. Metasploit runs pre-written exploits. None of them THINK. None of them LEARN. None of them ADAPT.
The Question
What if you could build a system that wakes up, studies a target, selects the right tools for THAT specific target, remembers what worked and what failed, queries the world's vulnerability databases in real-time, discovers new vulnerabilities through mathematical analysis, and never gives up until it finds a way in?
What if it could do all of this autonomously - no human choosing attacks, no pre-written scripts, no fixed decision tree?
That's what I set out to build.
The Architecture
The system has several key components that work together as a cognitive engine:
The Locksmith: Before attacking, it examines the target and selects ONLY the tools that apply. It doesn't try SQL injection on a server with no web forms. It doesn't try Samba exploits on a Windows machine. It scores every tool against the target's fingerprint - OS, open ports, services, web technologies - and picks the highest-scoring ones. A Metasploitable-style VM with 22 open backdoors gets a different strategy than a hardened router with 3 ports.
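The tool-selection step described above can be sketched as a small scoring function. This is a minimal illustration, not the system's actual code: the `Fingerprint` and `Tool` classes, the field names, and the scoring rule (a hard OS filter, then the fraction of each requirement the target satisfies) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    """Hypothetical summary of what reconnaissance found on a target."""
    os: str
    open_ports: frozenset
    services: frozenset
    web_tech: frozenset = frozenset()


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool description; an empty set means 'no requirement'."""
    name: str
    os: frozenset = frozenset()
    ports: frozenset = frozenset()
    services: frozenset = frozenset()
    web_tech: frozenset = frozenset()


def score(tool, fp):
    """Return 0.0 if the tool cannot apply, otherwise a positive score."""
    # Hard filter: a tool tied to a different OS is ruled out immediately.
    if tool.os and fp.os not in tool.os:
        return 0.0
    total = 0.0
    checks = [
        (tool.ports, fp.open_ports),
        (tool.services, fp.services),
        (tool.web_tech, fp.web_tech),
    ]
    for required, observed in checks:
        if not required:
            continue  # the tool does not care about this dimension
        overlap = len(required & observed)
        if overlap == 0:
            return 0.0  # a stated requirement is completely unmet
        total += overlap / len(required)
    return total


def select_tools(tools, fp, top_n=3):
    """Names of the highest-scoring applicable tools, best first."""
    scored = [(score(t, fp), t.name) for t in tools]
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: (-p[0], p[1]))
    return [name for _, name in ranked[:top_n]]
```

With this rule, a Samba exploit tagged as Linux-only scores zero against a Windows fingerprint even when port 445 is open, and a SQL injection tool that requires web forms scores zero against a target with no web technologies detected.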
The Memory: Every success and failure is stored with the target's fingerprint. When the system encounters a machine it's seen before, it recognizes it instantly and runs the known working exploit in under a second. When it encounters a NEW target that's SIMILAR to one it's cracked, it prioritizes the techniques that worked on the similar target. It also blacklists techniques that repeatedly fail on specific target types - so it stops spending time on attacks that have already proven ineffective there.
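One way to picture the memory described above is a store keyed by fingerprint features, with set similarity for "similar target" lookups and a failure counter for the blacklist. The sketch below is an assumption-laden illustration: the `EngagementMemory` class, the use of Jaccard similarity, the feature strings, and the thresholds are all hypothetical choices, not details taken from the system.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two feature collections (1.0 = identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class EngagementMemory:
    """Hypothetical store of past successes and failures."""

    def __init__(self, blacklist_after=3):
        self.successes = []               # (feature set, technique name)
        self.failures = defaultdict(int)  # (target type, technique) -> count
        self.blacklist_after = blacklist_after

    def record_success(self, features, technique):
        self.successes.append((frozenset(features), technique))

    def record_failure(self, target_type, technique):
        self.failures[(target_type, technique)] += 1

    def is_blacklisted(self, target_type, technique):
        # .get avoids creating empty counters as a side effect of a lookup.
        return self.failures.get((target_type, technique), 0) >= self.blacklist_after

    def suggest(self, features, target_type, min_similarity=0.5):
        """Techniques that worked on similar targets, most similar first."""
        features = frozenset(features)
        best = {}
        for seen, technique in self.successes:
            sim = jaccard(features, seen)
            if sim < min_similarity or self.is_blacklisted(target_type, technique):
                continue
            best[technique] = max(sim, best.get(technique, 0.0))
        return sorted(best, key=lambda t: (-best[t], t))
```

An exact repeat of a known target has similarity 1.0 and sorts first. A target that differs by one extra open port still clears the 0.5 threshold, which is the "similar target" case.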
The Global Intelligence Network: During an attack, the system queries live CVE databases (NVD), Exploit-DB, GitHub Security Advisories, and Packet Storm in real time. It filters results by the target's exact software versions and immediately tries matching exploits. It's not limited to what's installed locally - it can draw on everything those public sources index.
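The version-filtering step mentioned above might look something like the following. This is only a sketch under stated assumptions: the advisory record shape (`product`, `introduced`, `fixed`) is invented for the example and is not the schema of NVD or any other feed, and the parser handles only plain dotted numeric versions (a string like `2.3.4b` would raise `ValueError`).

```python
def parse_version(text, width=4):
    """Turn '2.3.4' into (2, 3, 4, 0) so versions compare numerically.

    Padding with zeros makes '2.3' and '2.3.0' compare as equal.
    """
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def affects(advisory, product, version):
    """True if the advisory covers this product at this exact version.

    Assumes a half-open range: introduced <= version < fixed.
    """
    if advisory["product"] != product:
        return False
    v = parse_version(version)
    return parse_version(advisory["introduced"]) <= v < parse_version(advisory["fixed"])


def matching_advisories(advisories, product, version):
    """Keep only the advisories relevant to the fingerprinted version."""
    return [a for a in advisories if affects(a, product, version)]
```

Comparing parsed tuples rather than raw strings matters here: as strings, `"2.10"` sorts before `"2.9"`, which would silently drop or include the wrong advisories.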
The Oracle: For unknown services, the system builds a mathematical model of the target's behavior. It collects input-output pairs, calculates entropy and response patterns, identifies boundary conditions, and predicts exactly where crashes will occur. It then generates polymorphic shellcode and delivers it at the predicted boundary. No signatures. No databases. Pure mathematical vulnerability prediction.
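Two of the measurements named above, response entropy and boundary conditions, can be illustrated on already-recorded input-output pairs. The sketch below is an assumption about what such analysis could look like, not the Oracle's real model: `shannon_entropy` is the standard byte-level Shannon entropy, while `find_boundary` and its observation format (input length paired with the response bytes, or `None` when the service did not answer) are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty or constant data)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def find_boundary(observations):
    """Locate where the service's behaviour flips between answering and not.

    observations: iterable of (input_length, response), with response set
    to None when no reply was received. Returns the pair of adjacent input
    lengths (last_before, first_after) around the first flip, or None if
    the behaviour never changes.
    """
    ordered = sorted(observations, key=lambda o: o[0])
    for (prev_len, prev_resp), (cur_len, cur_resp) in zip(ordered, ordered[1:]):
        if (prev_resp is None) != (cur_resp is None):
            return prev_len, cur_len
    return None
```

The returned pair only brackets the boundary. Narrowing it down would take further observations between the two lengths, for example by bisection.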
The Synthesis Engine: When all known exploits fail, the system doesn't give up. It analyzes WHY they failed, identifies patterns in the failures, and synthesizes entirely new approaches. It might discover that crashing one service opens another. It might find that resource exhaustion weakens authentication. It creates attack chains no human would think to try.
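The idea of chaining, where one step's side effect becomes the next step's precondition, can be modelled as a search over abstract state facts. The following is a hypothetical sketch of that idea using breadth-first search. The action names, the fact strings, and the `find_chain` function are invented for illustration and say nothing about how the engine is actually implemented.

```python
from collections import deque


def find_chain(actions, start, goal):
    """Shortest sequence of actions that makes `goal` true, or None.

    actions: dict mapping an action name to (requires, produces), both
    frozensets of abstract facts. An action is usable once every fact in
    `requires` holds, and it adds the facts in `produces`.
    """
    start = frozenset(start)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal in state:
            return path
        # Sorted iteration keeps the result deterministic.
        for name, (requires, produces) in sorted(actions.items()):
            if requires <= state:
                nxt = state | produces
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [name]))
    return None
```

In this model, a step that fails to reach the goal directly can still be useful if it produces a fact that unlocks another step, which is the kind of indirect chain the paragraph above describes.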
The Discovery
During testing against a WordPress-based CTF machine (Mr Robot), something unexpected happened.
Every known technique failed. SQL injection bypassed the login but gave no admin access. WordPress brute force couldn't find the password fast enough in a 7-million-word dictionary. The Easter egg hunter found nothing. The Oracle found no boundary conditions on the web server.
Then the system tried a resource exhaustion attack - overwhelming the target with connections, flooding memory, bombing processes. The target went down but recovered. When it came back, something had changed.
The post-crash recovery state accepted admin:admin as valid credentials.
These weren't the real Mr Robot credentials (which are elliot:ER28-0652). This was something different - a state-based authentication bypass induced by controlled chaos. The authentication mechanism, stressed by resource exhaustion, appeared to fall back to a less secure state during recovery.
The system confirmed this three separate times across different runs. Each time, the chaos weakened the target. Each time, admin:admin worked in the recovery window.
This vulnerability isn't in any CVE database. No signature detects it. No human pentester would think to try it - "overwhelm the server, then try admin:admin" is not in any playbook. But the system found it because it doesn't think like a human. It tries everything, observes the results, and finds the gaps.
What Makes This Different
Autonomous decision-making: The system doesn't follow a script. It analyzes the target and decides what to do. On Metasploitable, it goes for backdoors. On Kioptrix, it goes for SQL injection. On Brainpan, it goes for buffer overflows. On WordPress, it goes for brute force. On unknown targets, it synthesizes new approaches.
Cross-target learning: Every engagement makes the system smarter. It remembers what worked on Metasploitable and applies those lessons to similar targets. It remembers what failed on TP-Link routers and never tries those techniques on routers again.
Real-time global intelligence: It doesn't just use what's installed locally. It queries the entire world's vulnerability knowledge during attacks and immediately tries matching exploits.
Zero-day discovery: It found a genuine authentication bypass that no human had documented. Not by being smarter than humans, but by being more persistent. It tried things no human would think to try because no human has the patience to try 50 different attack types in sequence.
The Results
Five completely different targets. Five different attack strategies. Five shells.
| Target Type | Attack Method | Time to Shell |
|---|---|---|
| Metasploitable 2 | Backdoor exploitation | < 1 second |
| Kioptrix 2 | SQL injection → RCE | 4 seconds |
| Brainpan | Buffer overflow | 4 seconds |
| Kioptrix 1 | Samba exploit | Not reported (autonomous run) |
| Mr Robot | Zero-day auth bypass | 30 minutes |
One hardened TP-Link router resisted everything - which suggests the system reports failure on a well-secured target rather than claiming a false success.
Why This Matters
This isn't about building a "hacking tool." It's about proving that autonomous security testing is possible. That a system can think, learn, adapt, and discover without human guidance. That the gap between "running a script" and "conducting a penetration test" can be closed by artificial intelligence.
The implications go beyond offensive security. If we can build systems that autonomously find vulnerabilities, we can also build systems that autonomously patch them. The same cognitive engine that selects exploits could select defenses. The same memory that remembers attacks could remember mitigations.
We're not there yet. But this is a step toward that future.
The author is a security researcher who built this system as a proof of concept for autonomous vulnerability discovery. The code is not publicly available. Research inquiries welcome.