The Problem
Every bug bounty hunter knows the drill: wake up, check new programs, run subdomain enumeration, port scan, screenshot, check for known vulns, and then, finally, start hunting. Those first two hours of every day? Pure toil. I was spending more time running tools than actually finding bugs.
So I decided to build an AI agent that does all of that for me while I drink my coffee.
The Stack
I built the agent in Python using a few key pieces:
- Hermes (the open-source AI agent framework) as the orchestration layer
- subfinder + httpx + nuclei for the recon pipeline
- Playwright for screenshotting and JavaScript-heavy page analysis
- A custom GPT-4o integration that reads recon output and prioritizes targets
The agent wakes up at 6 AM, pulls the latest program list from HackerOne and Bugcrowd, runs the full recon stack against every new and updated program, and delivers a prioritized report to my Slack before I've even finished my first pour-over.
The Architecture
Here's the flow:
[6:00 AM Cron Trigger]
|
v
[Fetch Programs] --> [Deduplicate & Filter]
|
v
[Subdomain Enum] --> [Port Scan] --> [Screenshot]
|
v
[AI Analysis: score each target 1-10]
|
v
[Generate Report] --> [Push to Slack + Notion]
The secret sauce is the AI analysis step. Instead of me staring at 500 screenshots every morning, GPT-4o scans each one, identifies login panels, interesting API endpoints, exposed admin consoles, and gives each target a priority score. Targets scoring 7+ get flagged for immediate manual review.
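As a rough illustration of the triage step, here is a minimal sketch of what could happen once the model has returned its scores. The `ScoredTarget` shape, the `triage` function, and the field names are hypothetical, not taken from the actual agent; only the 1-10 scale and the 7+ threshold come from the description above.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 7  # from the post: targets scoring 7+ go to manual review


@dataclass
class ScoredTarget:
    url: str
    score: int   # 1-10 priority returned by the model
    reason: str  # short justification, e.g. "exposed admin console"


def triage(targets: list[ScoredTarget],
           threshold: int = REVIEW_THRESHOLD) -> list[ScoredTarget]:
    """Return targets at or above the threshold, highest score first."""
    for t in targets:
        # Model output is untrusted input: reject scores outside the scale.
        if not 1 <= t.score <= 10:
            raise ValueError(f"score out of range for {t.url}: {t.score}")
    flagged = [t for t in targets if t.score >= threshold]
    return sorted(flagged, key=lambda t: t.score, reverse=True)
```

Validating the score range before filtering is a small guard worth having, since an LLM can hand back a value outside the scale you asked for.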
The Hard Parts
Rate limiting. Hitting 50 subdomains with httpx in parallel is a great way to get your IP banned. I built an adaptive rate limiter that backs off exponentially when it detects throttling, and rotates through a pool of residential proxies when things get spicy.
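To make the backoff idea concrete, here is a minimal sketch of an adaptive limiter. It is an assumption-laden illustration, not the agent's real code: the class name, the default delays, and the choice to treat HTTP 429 and 503 as throttling signals are all mine, and proxy rotation is left out entirely.

```python
import random


class AdaptiveRateLimiter:
    """Tracks a per-request delay: grows on throttling, shrinks on success."""

    def __init__(self, base_delay: float = 0.5, max_delay: float = 60.0,
                 factor: float = 2.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.factor = factor
        self.delay = base_delay

    def record(self, status_code: int) -> None:
        # Assumption: 429 and 503 are the throttling signals we care about.
        if status_code in (429, 503):
            # Exponential backoff, capped at max_delay.
            self.delay = min(self.delay * self.factor, self.max_delay)
        else:
            # Recover gradually, never dropping below the base delay.
            self.delay = max(self.delay / self.factor, self.base_delay)

    def next_wait(self) -> float:
        # Jitter between half and the full delay so parallel workers
        # do not all retry at the same instant.
        return random.uniform(self.delay / 2, self.delay)
```

A worker would call `time.sleep(limiter.next_wait())` before each request and `limiter.record(response_status)` after it.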
False positives. Nuclei is loud. Really loud. The AI agent cross-references nuclei findings against actual HTTP responses and screenshots, filtering out ~60% of templates that fire on every WordPress site on the planet.
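The AI cross-referencing itself is hard to show in a few lines, but a much simpler heuristic captures part of the same idea: a template that fires on most hosts is probably fingerprinting, not a finding. The sketch below is that swapped-in heuristic, not the agent's actual filter. It assumes findings are dicts parsed from nuclei's JSON output with `template-id` and `host` keys; the function name and the 50% cutoff are hypothetical.

```python
from collections import defaultdict


def drop_noisy_templates(findings: list[dict], total_hosts: int,
                         max_host_fraction: float = 0.5) -> list[dict]:
    """Drop findings from templates that fire on too many hosts."""
    if total_hosts <= 0:
        raise ValueError("total_hosts must be positive")

    # Collect the distinct hosts each template matched.
    hosts_by_template: dict[str, set[str]] = defaultdict(set)
    for f in findings:
        hosts_by_template[f["template-id"]].add(f["host"])

    # A template is "noisy" if it matched more than the allowed fraction.
    noisy = {
        tid for tid, hosts in hosts_by_template.items()
        if len(hosts) / total_hosts > max_host_fraction
    }
    return [f for f in findings if f["template-id"] not in noisy]
```

This only removes the loudest templates; anything that survives would still need the response and screenshot check described above.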
State management. The agent needs to remember what it scanned yesterday so it doesn't re-scan 10,000 subdomains every morning. I built a SQLite-backed state store that tracks scan history per program, with intelligent diffing to only scan what's new or changed.
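Here is a minimal sketch of how a SQLite-backed store with diffing might look. The table layout, the function names, and the notion of a per-subdomain "fingerprint" (for example, a hash of status code and page title) are assumptions for illustration, not the schema the agent actually uses.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the state store. Use a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS scan_state (
               program     TEXT NOT NULL,
               subdomain   TEXT NOT NULL,
               fingerprint TEXT NOT NULL,
               PRIMARY KEY (program, subdomain)
           )"""
    )
    return conn


def diff_and_update(conn: sqlite3.Connection, program: str,
                    current: dict[str, str]) -> list[str]:
    """Return subdomains that are new or changed, then record them.

    `current` maps subdomain -> fingerprint for today's enumeration.
    """
    rows = conn.execute(
        "SELECT subdomain, fingerprint FROM scan_state WHERE program = ?",
        (program,),
    )
    previous = dict(rows.fetchall())

    # New subdomains are missing from `previous`; changed ones differ.
    to_scan = sorted(s for s, fp in current.items() if previous.get(s) != fp)

    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO scan_state VALUES (?, ?, ?)",
            [(program, s, current[s]) for s in to_scan],
        )
    return to_scan
```

Calling `diff_and_update` once per program each morning returns only the hosts worth re-scanning; a subdomain that disappears is simply left in the table, which a fuller version would probably want to handle.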
The Results
After three months of running this every morning:
- Recon time: 2 hours → 0 minutes (fully automated)
- Valid findings submitted: +40% (more time hunting, less time scanning)
- Bounties earned: +65% (focusing on high-value targets the AI surfaces)
- Coffee consumed: unchanged (still a lot)
You Can Build This Too
If you want to skip the months of building and tweaking, I've packaged the entire automation pipeline into a Bug Bounty Automation Kit that you can deploy in about 10 minutes. It includes the full recon stack, the AI scoring engine, pre-configured Slack/Notion integrations, and a one-click deploy script.
👉 Get the Bug Bounty Automation Kit — $15 on LemonSqueezy
For the DIY crowd, I also open-sourced the core agent framework on GitHub. The full code is at github.com/ulnit/agent-store — fork it, break it, improve it, and let me know what you build.
And if you're more into general AI automation, check out the AI Agent Toolkit — it's a deep-dive into building production-grade AI agents that actually ship, not just demo-ware.
What's Next
I'm currently working on v2 of the agent that uses browser-use to actually navigate target applications and perform light reconnaissance interactively—filling out forms, following redirects, testing auth flows. The goal? An agent that can run a full first-pass pentest while I'm still asleep.
This is day 47 of my "build in public" journey. Follow along on GitHub for daily updates.