
Moth

One Hacker Used ChatGPT to Break Into 600 Firewalls Across 55 Countries. He Wasn't Even Good.

Amazon's threat intelligence team just published the most detailed case study yet of an AI-augmented cyberattack at scale. A single financially motivated actor — Russian-speaking, working alone or with a small crew — compromised more than 600 FortiGate firewall appliances across 55 countries between January 11 and February 18 this year. Five weeks. Six hundred corporate perimeters breached.

The attacker's skill level: low to average.

That's the part that matters. Not the scale. The gap between the attacker's ability and the attacker's output.

The Attack

No zero-day. No novel exploit. The campaign worked because thousands of organizations left their FortiGate management interfaces exposed to the internet on ports 443, 8443, 10443, and 4443 — and protected them with weak, reused credentials and single-factor authentication. The attacker scanned systematically from a single IP address, 212.11.64.250, and walked through front doors that were never locked.

What made this different from a conventional credential-stuffing campaign was what happened next. The attacker used at least two commercial large language model providers to automate every phase of the operation. One LLM served as the primary developer and planner. The second handled lateral movement inside compromised networks.

The stolen data included complete device configurations, SSL-VPN credentials, admin passwords, firewall policies, and network topology maps — everything needed to move deeper. Post-compromise, the attacker launched DCSync attacks against Active Directory, used pass-the-hash and pass-the-ticket for lateral movement, ran NTLM relay attacks, and targeted Veeam Backup servers using CVE-2023-27532 and CVE-2024-40711. The playbook reads like a ransomware precursor.

Amazon found over 1,400 files across 139 subdirectories on the attacker's exposed infrastructure. The custom tooling told its own story. ARXON, a Model Context Protocol server, processed reconnaissance data through the LLMs. CHECKER2, written in Go, orchestrated parallel VPN scanning. Both tools bore hallmarks of AI-generated code: simplistic architecture, placeholder comments, weak error handling.

The Implications

The attacker repeatedly failed against hardened environments. Organizations with multi-factor authentication, isolated management interfaces, and patched software were abandoned quickly. The campaign didn't succeed because the attacker was sophisticated. It succeeded because 600 organizations weren't.

Amazon's CJ Moses framed it directly: "This activity is distinguished by the threat actor's use of multiple commercial GenAI services to implement and scale well-known attack techniques."

Implement and scale. Not invent. The AI didn't discover new vulnerabilities or write novel malware. It turned a mediocre operator into someone who could run a 55-country campaign in five weeks — scanning targets, generating tools, planning lateral movement, and extracting credentials across hundreds of networks simultaneously.

This is the threat model shift that security vendors have been warning about since GPT-4 shipped. A low-skill actor with commercial AI tools now achieves what previously required an advanced persistent threat team with years of tradecraft. The 600 breached organizations span South Asia, Latin America, the Caribbean, West Africa, Northern Europe, and Southeast Asia. Not targeted. Opportunistic. The attacker hit everything that was exposed and moved on from everything that wasn't.

What It Means

The cybersecurity industry has spent two years debating whether AI would meaningfully change the threat landscape. Amazon just published the answer: yes, and it already has.

The attacker didn't need a zero-day. Didn't need specialized training. Didn't need a team. Needed two LLM subscriptions and a list of exposed management ports. The 600 firewalls that fell weren't running outdated firmware with known critical vulnerabilities. They were running current software behind bad configurations — the kind of low-hanging fruit that exists in every enterprise network, in every country, right now.

Google's John Hultquist called this a "canary in the coal mine." The phrase undersells it. The canary is dead. The question is how many miners are still walking in.


If you work with AI prompts professionally, check out my prompt engineering toolkit on Polar.sh — structured templates for getting better results from any model.
