Cato Networks just made the most aggressive architectural bet in the SASE market: NVIDIA GPUs deployed directly inside every one of its 85+ global Points of Presence. The new Neural Edge platform closes the gap between traffic inspection and AI-driven analysis by running both in the same location, in a single pass.
For anyone building or operating cloud security infrastructure, this raises a fundamental architecture question that applies far beyond one vendor: where does your AI actually run?
The Problem: Inspect Here, Analyze There, Enforce Later
Most cloud-delivered security platforms inspect traffic at their PoPs using CPU-based engines — stateful firewalling, URL filtering, signature-based IPS — and that works fine for traditional workloads. But AI-driven security models (semantic DLP, behavioral analytics, LLM-based threat detection) require a fundamentally different compute profile: matrix multiplication and tensor operations that CPUs handle poorly at scale.
The common workaround? Offload AI analysis to a hyperscaler GPU farm. That creates:
- Variable latency from the round-trip to external compute
- A broken enforcement loop — detection happens in one place, policy enforcement in another
- Scaling bottlenecks tied to hyperscaler GPU capacity, not your security edge
As Cato's global field CTO Brian Anderson put it: "Many vendors use AI for detection, but the key architectural question is where the AI runs. That separation introduces additional latency variability, and it breaks the tight loop between analysis and enforcement."
Neural Edge: Single-Pass GPU-Accelerated Inspection
Cato Neural Edge deploys NVIDIA GPUs inside every PoP. The result is a single-pass inspection pipeline where FWaaS, SWG, IPS, CASB, DLP, and AI-driven analysis all execute in the same location:
| Component | Traditional SASE | Cato Neural Edge |
|---|---|---|
| Traffic inspection | CPU-based, in PoP | CPU + GPU, in PoP |
| AI threat analysis | Offloaded to hyperscaler GPU | Inline, same PoP |
| Policy enforcement | In PoP (post-analysis delay) | In PoP (real-time, single pass) |
| Latency variability | High (external round-trip) | Low (collocated compute) |
| Semantic DLP | Limited by CPU capacity | GPU-accelerated classification |
| Model update cycle | External dependency | PoP-native deployment |
Cato SVP Nimmy Reichenberg described the philosophy at RSAC 2026: "We've always believed that by owning our own cloud, we can provide a very resilient service to our customers, and we're just bringing GPUs to our own cloud as opposed to using somebody else's GPUs."
Three Security Problems GPUs Actually Solve
This isn't GPU hype for its own sake. There are specific security workloads where parallel processing changes what's architecturally feasible:
1. Semantic DLP Classification
Traditional DLP uses regex patterns and exact data matching. AI-powered DLP understands context — it can identify intellectual property or sensitive business logic in natural language prompts to AI tools. GPU-powered enforcement enables "deeper semantic inspection, large-scale pattern analysis, and real-time adaptive intelligence inline" according to Cato's technical documentation.
2. AI Prompt and Response Inspection
As enterprises adopt copilots and AI agents, security teams must inspect conversational AI traffic in real time. Prompt injection attacks, data exfiltration via natural language, and jailbreak attempts require inference-level analysis — not pattern matching. GPU acceleration makes this feasible at enterprise scale without degrading user experience.
3. Behavioral Anomaly Detection Across Encrypted Flows
Behavioral models analyzing metadata patterns, session characteristics, and flow telemetry benefit from GPU parallel processing even with TLS 1.3 inspection. The 650 Group analyst report noted that GPU integration enables security services that scale with "the compute intensity of AI workloads."
For zero trust architectures, this changes the economics: continuous verification and adaptive policy enforcement become computationally practical, not just theoretically desirable.
Cato AI Security: Governing the AI Tool Sprawl
Launched alongside Neural Edge, Cato AI Security addresses the governance side of enterprise AI adoption. Built on Cato's acquisition of Aim Security (September 2025), it provides unified controls for three categories of AI risk:
- Shadow AI — visibility and policy enforcement for employee usage of ChatGPT, Claude, Gemini, and other GenAI tools
- Homegrown AI applications — prompt injection protection, output filtering, and API-level security embedded in the network path
- Agentic AI guardrails — as Reichenberg noted: "A year ago, nobody asked us to secure MCP servers because they didn't exist. Nobody asked us to secure agentic browsers because they didn't exist."
The key architectural decision is convergence: AI security runs on the same SASE platform, same console (CMA), same policy engine, and shared data lake. No separate tool, no separate pane of glass.
Notably, Cato AI Security is available standalone — you can deploy AI governance without committing to full SASE transformation.
The Competitive Pressure
Cato's GPU bet forces every SASE vendor to answer: where does your AI run?
- Zscaler runs a massive cloud security platform but relies on CPU-based inspection with AI analysis handled separately
- Palo Alto Networks (Prisma SASE) has deep AI/ML capabilities but processes much of the AI workload in centralized locations rather than at every PoP
- Cisco Secure Connect benefits from hardware expertise but faces the challenge of integrating an appliance-centric model into cloud-native SASE
- Netskope emphasizes real-time data protection but hasn't announced GPU-native PoP infrastructure
The DPU/SmartNIC market adds context: Dell'Oro Group projects 30% CAGR over the next five years, driven by NVIDIA's BlueField platform. Hardware-accelerated network and security processing is becoming fundamental infrastructure, not a niche.
What This Means for Security Engineers
The SASE market is growing at 26% CAGR (Gartner, 2026). Whether you work with Cato or not, the architecture patterns here are worth understanding:
| Traditional Security Concept | GPU-SASE Equivalent |
|---|---|
| Zone-Based Firewall (ZBFW) | Per-PoP inline policy enforcement |
| IPS signature engine | AI-driven threat classifier (GPU) |
| Posture assessment | Continuous zero trust verification |
| TLS inspection | Single-pass encrypted traffic analysis |
| NetFlow/behavioral analytics | GPU-accelerated anomaly detection |
| VPN tunnel security | SD-WAN overlay with integrated SSE |
Practical steps:
- Study SASE architecture patterns. Understand how SD-WAN, SSE (SWG, CASB, ZTNA, FWaaS), and single-pass processing work together
- Learn AI security fundamentals. Prompt injection, model poisoning, data exfiltration through AI tools — these are the new attack vectors. Check out Cato's published research on EchoLeak (zero-click AI vulnerability) and CurXecute (RCE via Cursor MCP)
- Track the DPU/SmartNIC ecosystem. NVIDIA BlueField, AMD Pensando, and Intel IPU are reshaping how network processing happens at the infrastructure level
- Understand AI governance requirements. The EU AI Act and NIST AI RMF will drive security policy requirements that network teams must implement
The convergence of GPU compute, AI inspection, and network security isn't a future trend — it's shipping in production at 85+ global locations today.
Originally published at firstpasslab.com. For more deep dives on network security architecture, follow us there.
🤖 AI Disclosure: This article was adapted from original research with AI assistance. All technical claims are sourced from vendor announcements, analyst reports, and industry coverage linked throughout the article.



Top comments (0)