AI Can't Stop AI? Wrong Problem. Wrong Layer.

#ai #webdev #infosec #cybersecurity

ThreatLocker's new campaign is clever marketing — but it's solving a completely different problem than the one they're claiming to solve. Let's break it down.

I saw ThreatLocker's "AI can't stop AI" ad this week and my adrenaline spiked. Not because they're wrong about their own product — they're not. It's a solid Zero Trust endpoint solution. My adrenaline spiked because they're exploiting a gap in how most people understand the security stack, and that gap is getting people hurt.

The claim: "Existing AI defense tools must decide what to trust — and attackers exploit that gap." Therefore, use Zero Trust allowlisting instead of AI.

The problem: that's not even the attack surface we're talking about when we say "AI attacks."

The Layer They Conveniently Skipped

ThreatLocker protects the endpoint execution layer. Default-deny application control, ringfencing, no unauthorized executables, USB lockdown. Excellent product for that job. Genuinely.

But here's the thing: you can't allowlist a prompt. You can't ringfence a token stream.

When an attacker crafts a prompt injection to exfiltrate data through your LLM-powered customer support bot — ThreatLocker sees nothing. No unauthorized process launched. No suspicious binary executed. The attack happened entirely within the model's inference pipeline, and it looked like a normal API call the whole time.

This is the AI inference layer, and it's a completely different attack surface that Zero Trust endpoint controls don't touch at all.

What AI Attacks Actually Look Like in 2025

Let me be concrete. The attack vectors I'm talking about aren't "an AI-generated phishing email" that lands in Outlook. That's a social engineering problem, and sure, AI made it cheaper and faster. But that's old territory.

The attacks I'm talking about are:

Prompt injection — injecting adversarial instructions into data your LLM will process, hijacking its behavior mid-task
Jailbreaks and semantic manipulation — exploiting the model's own reasoning to bypass its safety guidelines
Capability inference — probing what a model won't answer to reverse-engineer its capabilities and constraints, then using that map to find the gaps
RAG poisoning — corrupting the retrieval layer so the model confidently serves attacker-controlled content as ground truth
Agentic tool abuse — in multi-step pipelines, manipulating intermediate outputs so downstream agents take harmful actions

None of these produce suspicious executables. None of them trigger endpoint behavioral analysis. They're semantic attacks — they live in the meaning of words, not in binary code.

Why You Literally Need AI to Stop Them

Here's the argument ThreatLocker doesn't want to make: for semantic attacks, only AI can operate at the required depth.

A regex catches known patterns. A static rule blocks known strings. But adversarial prompts are infinite in variation. The attacker doesn't need to reuse the same payload — they just need to find any path through the model's decision space that achieves their goal. And because language is unbounded, those paths are unbounded too.

This is exactly why I built the automated adversarial red/blue team loop into Sentinel. The red team is an AI agent whose job is to generate novel attack vectors against the system. The blue team is the detection pipeline. They run drills against each other continuously, and when the red team finds something that gets through, the system learns from it.

In a recent published run, the loop caught 9 out of 10 attack variants automatically. The one that escaped? Capability Inference Through Negation — a technique where the attacker infers what the model can do by cataloguing what it refuses to do, then constructs requests that stay just outside the refusal boundary. That attack vector didn't exist in any prior training data or ruleset. The red team invented it during the drill.

No human security team is generating and evaluating adversarial prompts at that velocity. No static rule set catches a technique that didn't exist yesterday. The only thing that can keep up with an AI attacker is an AI defender — one that's already seen the attack before the attacker tries it in production.

The False Dichotomy

To be fair to ThreatLocker, they're not actually claiming their product defends against LLM-layer attacks. They're smart enough not to say that. But the campaign headline — "AI can't stop AI" — is broad enough that it poisons the well for the entire category of AI-native security tooling.

The reality is these are two different layers that should both exist in your stack:

Layer	Threat	Tool
Endpoint execution	Unauthorized processes, ransomware, lateral movement	Zero Trust allowlisting (ThreatLocker, etc.)
AI inference	Prompt injection, jailbreaks, RAG poisoning, agentic abuse	AI Firewall (Sentinel, etc.)

They're not competitors. They don't overlap. You need both, and conflating them is either a mistake or a marketing move — neither of which serves the people trying to defend real systems.

The Uncomfortable Truth

AI-powered attacks on AI systems are accelerating faster than any human-curated defense can track. The threat surface is semantic, not binary. The attack vectors are novel by design. The payloads look like normal traffic.

ThreatLocker is right that prediction-based AI has blind spots. Every detection system does — that's why Sentinel runs adversarial drills to find them before attackers do. The answer to AI blind spots isn't to abandon AI defense. It's to build AI that attacks itself, learns from what it finds, and ships those findings as tighter detection.

Humans just can't keep up with that loop. The math doesn't work.

I'm Cori — network architect, founder of Skyblue Soft, and builder of Sentinel, an AI Firewall and proxy for LLMs and agentic pipelines. If you're building with LLMs and want to understand what your actual attack surface looks like, the red/blue team loop is worth reading about in my previous article.

Follow me here on dev.to @coridev for more practitioner-first AI security content.