FirstPassLab

Posted on Mar 28 • Originally published at firstpasslab.com

GPUs in Every PoP: Inside Cato Neural Edge and the Shift to GPU-Accelerated Cloud Security

#security #networking #ai #cloud

Cato Networks just made the most aggressive architectural bet in the SASE market: NVIDIA GPUs deployed directly inside every one of its 85+ global Points of Presence. The new Neural Edge platform closes the gap between traffic inspection and AI-driven analysis by running both in the same location, in a single pass.

For anyone building or operating cloud security infrastructure, this raises a fundamental architecture question that applies far beyond one vendor: where does your AI actually run?

The Problem: Inspect Here, Analyze There, Enforce Later

Most cloud-delivered security platforms inspect traffic at their PoPs using CPU-based engines — stateful firewalling, URL filtering, signature-based IPS — and that works fine for traditional workloads. But AI-driven security models (semantic DLP, behavioral analytics, LLM-based threat detection) require a fundamentally different compute profile: matrix multiplication and tensor operations that CPUs handle poorly at scale.

The common workaround? Offload AI analysis to a hyperscaler GPU farm. That creates:

Variable latency from the round-trip to external compute
A broken enforcement loop — detection happens in one place, policy enforcement in another
Scaling bottlenecks tied to hyperscaler GPU capacity, not your security edge

As Cato's global field CTO Brian Anderson put it: "Many vendors use AI for detection, but the key architectural question is where the AI runs. That separation introduces additional latency variability, and it breaks the tight loop between analysis and enforcement."

Neural Edge: Single-Pass GPU-Accelerated Inspection

Cato Neural Edge deploys NVIDIA GPUs inside every PoP. The result is a single-pass inspection pipeline where FWaaS, SWG, IPS, CASB, DLP, and AI-driven analysis all execute in the same location:

Component	Traditional SASE	Cato Neural Edge
Traffic inspection	CPU-based, in PoP	CPU + GPU, in PoP
AI threat analysis	Offloaded to hyperscaler GPU	Inline, same PoP
Policy enforcement	In PoP (post-analysis delay)	In PoP (real-time, single pass)
Latency variability	High (external round-trip)	Low (collocated compute)
Semantic DLP	Limited by CPU capacity	GPU-accelerated classification
Model update cycle	External dependency	PoP-native deployment

Cato SVP Nimmy Reichenberg described the philosophy at RSAC 2026: "We've always believed that by owning our own cloud, we can provide a very resilient service to our customers, and we're just bringing GPUs to our own cloud as opposed to using somebody else's GPUs."

Three Security Problems GPUs Actually Solve

This isn't GPU hype for its own sake. There are specific security workloads where parallel processing changes what's architecturally feasible:

1. Semantic DLP Classification

Traditional DLP uses regex patterns and exact data matching. AI-powered DLP understands context — it can identify intellectual property or sensitive business logic in natural language prompts to AI tools. GPU-powered enforcement enables "deeper semantic inspection, large-scale pattern analysis, and real-time adaptive intelligence inline" according to Cato's technical documentation.

2. AI Prompt and Response Inspection

As enterprises adopt copilots and AI agents, security teams must inspect conversational AI traffic in real time. Prompt injection attacks, data exfiltration via natural language, and jailbreak attempts require inference-level analysis — not pattern matching. GPU acceleration makes this feasible at enterprise scale without degrading user experience.

3. Behavioral Anomaly Detection Across Encrypted Flows

Behavioral models analyzing metadata patterns, session characteristics, and flow telemetry benefit from GPU parallel processing even with TLS 1.3 inspection. The 650 Group analyst report noted that GPU integration enables security services that scale with "the compute intensity of AI workloads."

For zero trust architectures, this changes the economics: continuous verification and adaptive policy enforcement become computationally practical, not just theoretically desirable.

Cato AI Security: Governing the AI Tool Sprawl

Launched alongside Neural Edge, Cato AI Security addresses the governance side of enterprise AI adoption. Built on Cato's acquisition of Aim Security (September 2025), it provides unified controls for three categories of AI risk:

Shadow AI — visibility and policy enforcement for employee usage of ChatGPT, Claude, Gemini, and other GenAI tools
Homegrown AI applications — prompt injection protection, output filtering, and API-level security embedded in the network path
Agentic AI guardrails — as Reichenberg noted: "A year ago, nobody asked us to secure MCP servers because they didn't exist. Nobody asked us to secure agentic browsers because they didn't exist."

The key architectural decision is convergence: AI security runs on the same SASE platform, same console (CMA), same policy engine, and shared data lake. No separate tool, no separate pane of glass.

Notably, Cato AI Security is available standalone — you can deploy AI governance without committing to full SASE transformation.

The Competitive Pressure

Cato's GPU bet forces every SASE vendor to answer: where does your AI run?

Zscaler runs a massive cloud security platform but relies on CPU-based inspection with AI analysis handled separately
Palo Alto Networks (Prisma SASE) has deep AI/ML capabilities but processes much of the AI workload in centralized locations rather than at every PoP
Cisco Secure Connect benefits from hardware expertise but faces the challenge of integrating an appliance-centric model into cloud-native SASE
Netskope emphasizes real-time data protection but hasn't announced GPU-native PoP infrastructure

The DPU/SmartNIC market adds context: Dell'Oro Group projects 30% CAGR over the next five years, driven by NVIDIA's BlueField platform. Hardware-accelerated network and security processing is becoming fundamental infrastructure, not a niche.

What This Means for Security Engineers

The SASE market is growing at 26% CAGR (Gartner, 2026). Whether you work with Cato or not, the architecture patterns here are worth understanding:

Traditional Security Concept	GPU-SASE Equivalent
Zone-Based Firewall (ZBFW)	Per-PoP inline policy enforcement
IPS signature engine	AI-driven threat classifier (GPU)
Posture assessment	Continuous zero trust verification
TLS inspection	Single-pass encrypted traffic analysis
NetFlow/behavioral analytics	GPU-accelerated anomaly detection
VPN tunnel security	SD-WAN overlay with integrated SSE

Practical steps:

Study SASE architecture patterns. Understand how SD-WAN, SSE (SWG, CASB, ZTNA, FWaaS), and single-pass processing work together
Learn AI security fundamentals. Prompt injection, model poisoning, data exfiltration through AI tools — these are the new attack vectors. Check out Cato's published research on EchoLeak (zero-click AI vulnerability) and CurXecute (RCE via Cursor MCP)
Track the DPU/SmartNIC ecosystem. NVIDIA BlueField, AMD Pensando, and Intel IPU are reshaping how network processing happens at the infrastructure level
Understand AI governance requirements. The EU AI Act and NIST AI RMF will drive security policy requirements that network teams must implement

The convergence of GPU compute, AI inspection, and network security isn't a future trend — it's shipping in production at 85+ global locations today.

Originally published at firstpasslab.com. For more deep dives on network security architecture, follow us there.

🤖 AI Disclosure: This article was adapted from original research with AI assistance. All technical claims are sourced from vendor announcements, analyst reports, and industry coverage linked throughout the article.

DEV Community