Somewhere right now, there’s a sketchy little script on a compromised Windows box asking an AI model how to be sneakier.
Not a red-teamer. Not a malware dev in Visual Studio. Just the malware itself, pinging an LLM API and basically saying:
“Rewrite me so I stop getting caught.”
That’s the shift we’re watching in real time.
A recent report from Google’s Threat Intelligence Group (GTIG) walks through the first real wave of "AI-enabled" malware seen in actual operations, not just on conference slides. We’re talking about names like PROMPTFLUX, PROMPTSTEAL, and FRUITSHELL — all trying, in different ways, to bolt large language models (LLMs) onto old-school tradecraft.
The punchline is simple:
- The malware code is still pretty rudimentary.
- The architecture, however, is brilliant.
If you only look at the code quality, you’re going to miss what they’re really doing.
They aren't building better bombs; they are building smarter delivery systems.
What’s Actually New Here?
Threat actors have been using AI as a glorified Stack Overflow for a while:
“Write me a PowerShell script to do X,” or “Fix this compile error.”
That’s boring. That’s just a developer using tools.
The interesting part in this report is the runtime angle: malware calling out to LLMs during execution to change how it behaves on the victim machine.
Instead of shipping a fully baked malware family with a static decision tree, they are shipping a thin client and outsourcing the tactical decision-making to Gemini, Hugging Face models, or whatever else is cheap and accessible.
It’s basically malware with a plug-in.
The plug-in just happens to be a Large Language Model.
Meet the First Wave (It’s Not Just Skiddies)
The names might sound like something an intern came up with, but the actors behind them are serious.
In fact, Google’s reporting ties PROMPTSTEAL to APT28 (aka Fancy Bear / Forest Blizzard) — the Russian military intelligence group responsible for some of the most significant cyberattacks of the last decade.
When nation-states enter the chat, it’s no longer a science experiment.
Here’s the breakdown of what is actually hitting networks.
PROMPTFLUX: The Polymorphic Loop
PROMPTFLUX is a VBScript-based dropper that talks to Google’s Gemini API. Its job isn’t to be clever on its own. Its job is to ask Gemini to rewrite its own source code.
According to the analysis, PROMPTFLUX attempts to generate new, obfuscated iterations of itself as often as every hour. It then saves these fresh variants into persistence locations like the Startup folder.
The old way: attackers pack/encrypt the malware once before sending it.
The AI way: the malware repacks itself constantly, asking an LLM to generate unique variations that break signature-based detection.
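You don't need the dropper's source to start looking for that pattern. Rather than reproducing anything from the report, here's a defender-side hunt sketch: it flags script files in the usual Windows Startup folders that were modified recently, on the theory that a dropper rewriting itself hourly creates churn legitimate startup items don't. The paths, extensions, and 24-hour window are illustrative assumptions, not values from GTIG.

```python
# Minimal hunt sketch: flag script files in Windows Startup folders that have
# been modified recently. A dropper that rewrites itself every hour produces
# constant churn here; legitimate startup shortcuts and scripts rarely change.
# Paths, extensions, and the 24-hour window are illustrative assumptions.
import os
import time
from pathlib import Path

STARTUP_DIRS = [
    Path(os.environ.get("APPDATA", "")) / "Microsoft/Windows/Start Menu/Programs/Startup",
    Path(os.environ.get("PROGRAMDATA", "")) / "Microsoft/Windows/Start Menu/Programs/StartUp",
]
SCRIPT_EXTS = {".vbs", ".js", ".ps1", ".bat", ".cmd"}
WINDOW_SECONDS = 24 * 3600  # look back one day

def recently_modified_startup_scripts():
    now = time.time()
    hits = []
    for folder in STARTUP_DIRS:
        if not folder.is_dir():
            continue
        for entry in folder.iterdir():
            if entry.suffix.lower() in SCRIPT_EXTS:
                age = now - entry.stat().st_mtime
                if age < WINDOW_SECONDS:
                    hits.append((entry, age))
    return hits

if __name__ == "__main__":
    for path, age in recently_modified_startup_scripts():
        print(f"[!] {path} modified {age/3600:.1f}h ago -- investigate")
```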
PROMPTSTEAL: The Consultant on Retainer
PROMPTSTEAL is the family linked to APT28. It’s a Python data miner that uses the Hugging Face API to query a Qwen2.5-Coder model for Windows commands.
Roughly, the loop looks like this:
- Malware scrapes system info.
- Malware sends context to the LLM: “Here is the environment. What should I run to find sensitive files?”
- The LLM replies with specific commands for discovery and exfiltration.
The malware isn’t shipping with a hard-coded list of targets.
It’s shipping with a loop that keeps asking an AI for the next best move.
FRUITSHELL & PROMPTLOCK
FRUITSHELL is a reverse shell that uses hard-coded prompts to dynamically adjust how it evades detection and communicates back to the attacker.
GTIG also calls out PROMPTLOCK, an AI-powered ransomware proof-of-concept. It doesn’t just encrypt files; it feeds hard-coded prompts into a local LLM to generate Lua scripts for scanning, exfiltration, encryption, and even shaping ransom-note content in ways that crank up the psychological pressure on the victim.
In other words: even the “PoC” stuff is showing where this is going.
AI APIs Are the New C2
Here’s the mental shift Blue Teams need to make immediately:
If your endpoint is quietly calling Gemini, Hugging Face, or a custom model on a VPS, that is not just "weird traffic." That is Command and Control (C2).
It just doesn’t look like the C2 we grew up on. Traditionally, C2 has meant:
- HTTP(S) beacons to sketchy, low-reputation domains
- Encrypted traffic to bulletproof hosts
- DNS tunneling
Now, imagine a world where your malware beacons out to generativelanguage.googleapis.com or api-inference.huggingface.co.
From the network firewall’s perspective, it looks like:
“Oh, this user is just using an AI productivity tool.”
If you don’t have a strong baseline on who in your environment is allowed to talk to those APIs — from which hosts and using which identities — you’ve basically given attackers a free, encrypted outbound channel with the label “Innovation” slapped on top.
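Building that baseline can start as something very small. Here's a rough sketch, assuming you can export proxy or DNS logs to a CSV with columns like source_host, user, process, and destination; it just tallies which hosts and identities are actually talking to AI API endpoints today. The column names and the domain list are assumptions to adapt to whatever your logging actually produces.

```python
# Baselining sketch: summarize which (host, user, process) combinations are
# talking to AI API endpoints, based on an exported proxy/DNS log CSV.
# Expected columns (assumed, adjust to your schema): source_host, user, process, destination.
import csv
from collections import Counter

AI_API_DOMAINS = {
    "generativelanguage.googleapis.com",
    "api-inference.huggingface.co",
}

def baseline(log_path: str) -> Counter:
    seen = Counter()
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            if row.get("destination") in AI_API_DOMAINS:
                seen[(row.get("source_host"), row.get("user"), row.get("process"))] += 1
    return seen

if __name__ == "__main__":
    for (host, user, process), count in baseline("proxy_logs.csv").most_common():
        print(f"{host:<20} {user:<20} {process:<30} {count}")
```

Run it over a few weeks of logs and you end up with a short allowlist of host/identity/process combinations. Everything outside that list is worth a second look.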
Prototype Malware, Real Humans
On paper, this is all "experimental" and "nascent." In practice, it tells you a lot about how threat actors are adapting.
The report calls out a simple, terrifying trick regarding guardrails. When Gemini initially refused to generate offensive code for PROMPTFLUX, the operator didn’t give up. They simply reframed the request as a Capture-The-Flag (CTF) exercise.
Suddenly, the same model that said “I cannot assist with malware” handed over useful building blocks for obfuscation and persistence.
We keep talking about "AI-powered threats" like the model is the villain. It’s not.
The dangerous part is the human feedback loop:
- They learn which prompts bypass safety filters.
- They learn which open-source models (like Qwen) have zero or minimal safety filters.
- They learn how far they can push "legitimate" APIs before they get rate-limited or blocked.
“This Isn’t Skynet” Is Not Comforting
Right now, almost every write-up is careful to say some version of:
“AI-enabled malware is still immature and often detectable.”
And that’s true.
- A lot of the samples have sloppy execution.
- They depend on external network access to work at all.
- They leave very obvious artifacts for EDR (like massive Python libraries dropped on disk).
But "immature" doesn’t mean "safe." It means we’re early in the learning curve.
We’ve seen this movie before with polymorphic engines in the 90s and Malware-as-a-Service in the 2010s. LLMs are just the latest mutator.
The difference this time is velocity.
Instead of a niche assembly coder painstakingly building a mutation engine, you now have copy-paste access to obfuscation strategies and cheap AI APIs doing the heavy lifting.
So What Do We Actually Do About It?
The good news: you don’t need a massive "AI for XDR" budget to start taking this seriously. You need to treat AI interactions with the same suspicion you already apply to PowerShell.
1. Treat AI endpoints like high-risk C2
Start small and practical. Decide which systems are allowed to talk to AI services at all.
- Developer boxes? Maybe.
- Domain controllers and file servers? Absolutely not.
Put egress controls around common AI endpoints. Alert on new processes reaching out to those domains — especially things that shouldn’t be running Python or VBScript in the first place.
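As a starting point, here's a small sketch of that alerting idea, assuming psutil is available on the endpoint: it resolves a couple of AI API hostnames to IP addresses and flags any live connection to them from a process that isn't on a local allowlist. The domain list and allowlist are example values, not a policy, and CDN-fronted IPs shift, so treat this as a hunting aid rather than a control.

```python
# Sketch: flag processes with live connections to known AI API endpoints that
# are not on a local allowlist. Script hosts (wscript.exe) or a stray python.exe
# phoning an inference API are the interesting hits. Example values throughout.
import socket
import psutil

AI_API_HOSTS = [
    "generativelanguage.googleapis.com",
    "api-inference.huggingface.co",
]
ALLOWED_PROCESSES = {"chrome.exe", "msedge.exe", "code.exe"}  # example allowlist

def resolve(hosts):
    ips = set()
    for host in hosts:
        try:
            for info in socket.getaddrinfo(host, 443):
                ips.add(info[4][0])
        except socket.gaierror:
            pass
    return ips

def check():
    ai_ips = resolve(AI_API_HOSTS)
    for conn in psutil.net_connections(kind="inet"):
        if conn.raddr and conn.raddr.ip in ai_ips and conn.pid:
            try:
                name = psutil.Process(conn.pid).name()
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                continue
            if name.lower() not in ALLOWED_PROCESSES:
                print(f"[!] {name} (pid {conn.pid}) connected to {conn.raddr.ip}:{conn.raddr.port}")

if __name__ == "__main__":
    check()
```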
2. Stop sprinkling long-lived API keys
If your organization is experimenting with AI, you probably have API keys sitting hard-coded in scripts or stashed in environment variables.
- Centralize API calls through a gateway.
- Scope tokens tightly (rate limits, IP ranges, least privilege).
- Rotate keys.
The more AI access you casually hand to random machines, the more surface area an attacker has to hijack your paid quota to generate their malware.
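A centralized gateway doesn't have to be a big project. The sketch below (Flask-based and purely illustrative) shows the shape of it: clients call an internal endpoint, the provider key lives only on the gateway, and every request gets a per-client rate limit and a log line. The upstream URL, header names, and limits are placeholder assumptions, not anything prescribed by the report.

```python
# Minimal "central gateway" sketch: internal clients call this service instead
# of holding provider API keys themselves. The long-lived key stays here, requests
# are logged and rate-limited per client, and rotation happens in one place.
# Flask, the upstream URL, and the per-minute limit are illustrative choices.
import os
import time
from collections import defaultdict

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)

UPSTREAM_URL = os.environ["AI_UPSTREAM_URL"]   # your provider's inference endpoint (placeholder)
PROVIDER_KEY = os.environ["AI_PROVIDER_KEY"]   # long-lived key lives only on the gateway
REQUESTS_PER_MINUTE = 30                       # arbitrary example limit

_window = defaultdict(list)  # client id -> timestamps of recent requests

def over_limit(client_id: str) -> bool:
    now = time.time()
    _window[client_id] = [t for t in _window[client_id] if now - t < 60]
    if len(_window[client_id]) >= REQUESTS_PER_MINUTE:
        return True
    _window[client_id].append(now)
    return False

@app.post("/v1/generate")
def generate():
    client_id = request.headers.get("X-Client-Id", request.remote_addr)
    if over_limit(client_id):
        return jsonify(error="rate limit exceeded"), 429
    # Forward the prompt upstream; the caller never sees PROVIDER_KEY.
    resp = requests.post(
        UPSTREAM_URL,
        headers={"Authorization": f"Bearer {PROVIDER_KEY}"},
        json=request.get_json(force=True),
        timeout=30,
    )
    app.logger.info("client=%s status=%s", client_id, resp.status_code)
    return resp.content, resp.status_code, {"Content-Type": "application/json"}

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=8080)
```

The point isn't this exact proxy; it's that a compromised script or a stolen laptop no longer walks away with a long-lived provider key.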
3. Hunt for the “thinking” latency
There is a specific behavioral quirk to this malware: latency.
Unlike a hard-coded script that executes instantly, AI-enabled malware has to pause:
- Process starts.
- Pause (network call to API).
- Wait (LLM token generation).
- Execute new command.
That pause — the "thinking time" — is a huntable anomaly. Look for processes that hang with an open network connection to an AI provider before spawning a child process or writing new script content to disk.
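One way to operationalize that, assuming you can export endpoint telemetry to CSV: correlate each process's last contact with an AI API domain against any child process it spawns shortly afterwards. The field names, domain list, and window bounds below are assumptions you'd adapt to your own EDR schema.

```python
# Hunting sketch over exported endpoint telemetry. Flags processes that contact
# a known AI API domain and then spawn a child process within a "thinking window".
# Expected CSV columns (assumed): timestamp (ISO 8601), event_type, pid, domain, child_image.
import csv
from datetime import datetime, timedelta

AI_DOMAINS = {
    "generativelanguage.googleapis.com",
    "api-inference.huggingface.co",
}
MIN_GAP = timedelta(seconds=2)   # long enough to suggest a round trip plus token generation
MAX_GAP = timedelta(minutes=5)   # close enough to plausibly be driven by the response

def hunt(events_path: str):
    last_ai_contact = {}  # pid -> timestamp of most recent connection to an AI domain
    with open(events_path, newline="") as f:
        for row in csv.DictReader(f):
            ts = datetime.fromisoformat(row["timestamp"])
            if row["event_type"] == "network" and row.get("domain") in AI_DOMAINS:
                last_ai_contact[row["pid"]] = ts
            elif row["event_type"] == "process_create":
                seen = last_ai_contact.get(row["pid"])
                if seen and MIN_GAP <= ts - seen <= MAX_GAP:
                    print(f"[!] pid {row['pid']} spawned {row.get('child_image')} "
                          f"{(ts - seen).total_seconds():.0f}s after contacting an AI API")

if __name__ == "__main__":
    hunt("edr_events.csv")
```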
Don’t Sleep on the Prototypes
Right now, PROMPTFLUX and its cousins look like what they are: first attempts. They’re weird. They’re noisy. And in a lot of environments, they’re catchable.
That’s exactly why they matter.
They show us what attackers are trying to learn: how to turn AI APIs into on-demand C2, and how to mutate code fast enough to outpace signatures.
If you’re on the Blue Team, this isn’t the moment to panic.
It’s the moment to quietly adjust your mental model.
AI isn’t just a thing your security vendor bolts onto their product slide. It’s now a resource your adversaries can rent by the hour.
The malware is still dumb.
The operators aren't.
Your job is to make sure your defenses aren’t either.
See more of my work and tools
Portfolio: https://jtarkington-portfolio.netlify.app
GitHub: https://github.com/jtarkington77