Traditional cybersecurity feels concrete. "Close port 22": you run netstat, confirm nothing is listening, and move on. "Patch CVE-2024-1234": you update, verify the version, done. Each action is discrete and verifiable.
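For contrast, here is what that kind of discrete check looks like in code: a minimal Python sketch (the host, port, package name, and version threshold are illustrative, not tied to any specific CVE) that answers a yes/no question you can verify and move past.

```python
import socket
from importlib.metadata import version

def port_is_closed(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return False  # something answered, so the port is open
    except OSError:
        return True  # connection refused or timed out

def package_is_patched(name: str, minimum: str) -> bool:
    """Return True if the installed package version is at least `minimum`."""
    installed = tuple(int(x) for x in version(name).split(".")[:3])
    required = tuple(int(x) for x in minimum.split(".")[:3])
    return installed >= required

# Each check yields a concrete pass/fail result.
print("ssh closed:", port_is_closed("127.0.0.1", 22))
print("requests patched:", package_is_patched("requests", "2.32.0"))
```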
AI agent security feels like the opposite. "Protect against prompt injection" sounds like "defend against bad conversations." How do you even measure that? Do you lock down the LLM until it can't do anything useful?
This perception gap is a problem. Server hardening feels real. Defending against harmful conversations? Impossible.
But AI security becomes more concrete once you realize that many attacks follow the same structured patterns as traditional malware; we just haven't been describing them that way.
In what is becoming a widely cited and influential paper, Ben Nassi, Bruce Schneier, and Oleg Brodt mapped real-world AI security incidents into a framework they call the Promptware Kill Chain.
The kill chain describes these attacks as a sequence of discrete, observable stages.
Luckily, the kill chain can be disrupted, but doing so requires defenders to fundamentally rethink how they approach AI agent security.
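To see why "discrete, observable stages" matters, here is a minimal sketch. The stage names and detection signals below are generic kill-chain-style placeholders, not the paper's exact taxonomy; the point is that each stage becomes a checkpoint you can log and test, and blocking any one stage breaks the chain.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    detected: Callable[[dict], bool]  # True if this stage's activity is observed

# Placeholder stages in the spirit of a kill chain; each maps to something observable.
STAGES = [
    Stage("delivery", lambda evt: evt.get("untrusted_content_ingested", False)),
    Stage("injection", lambda evt: evt.get("instructions_found_in_data", False)),
    Stage("tool_invocation", lambda evt: evt.get("sensitive_tool_called", False)),
    Stage("exfiltration", lambda evt: evt.get("data_left_boundary", False)),
]

def chain_progress(event: dict) -> list[str]:
    """Return the stages observed so far; the attack succeeds only if every stage completes."""
    reached = []
    for stage in STAGES:
        if not stage.detected(event):
            break  # the chain is broken here; later stages cannot occur
        reached.append(stage.name)
    return reached

# Example: injected instructions arrived, but the sensitive tool call was blocked.
print(chain_progress({
    "untrusted_content_ingested": True,
    "instructions_found_in_data": True,
    "sensitive_tool_called": False,
}))  # -> ['delivery', 'injection']
```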