DEV Community

NTCTech

Posted on • Originally published at rack2cloud.com

Kubernetes Is Not an LLM Security Boundary

The CNCF flagged it three days ago. Most teams haven't processed what it actually means.

Kubernetes lacks built-in mechanisms to enforce application-level or semantic controls over AI systems. That's not a bug. It's not a misconfiguration. It's a category error in how we're thinking about AI workload security.

Kubernetes isolates containers. It does not isolate decisions.

[Figure: LLM Security Boundary Model — three layers (Infrastructure Boundary, Application Boundary, LLM Boundary) showing where Kubernetes visibility ends]

What Kubernetes Actually Controls

To see the problem clearly, you need to be precise about the scope.

Kubernetes enforces pod isolation, RBAC, network policy, resource limits, and admission control. A well-configured cluster with Cilium, Kyverno, and Falco is genuinely hardened.

All of those controls operate at the infrastructure layer. None of them understand what an LLM is doing inside that boundary.


The Three-Layer Problem

Think of it as three distinct boundaries:

Infrastructure Boundary (Kubernetes): Controls compute, network, identity. Cannot see model behavior, prompts, or outputs.

Application Boundary: Controls API access and service logic. Cannot see model reasoning or semantic intent.

LLM Boundary — the actual risk layer: Controls prompts, outputs, tool usage. This is the layer your current tooling doesn't reach.

Most teams have the first two layers covered. The third is largely unaddressed.


The Failure Mode Kubernetes Will Never Catch

Here's the production scenario that matters:

  1. A user submits a prompt with a hidden injection instruction.
  2. The model retrieves internal context via RAG.
  3. The model outputs sensitive internal data in its response.
  4. The response returns HTTP 200.
  5. No alerts fire. No logs capture what the model decided.

From Kubernetes' perspective: successful request. Pod healthy. RBAC respected. Latency within SLA.

From a security perspective: complete boundary failure.

[Figure: LLM security boundary failure — five-step scenario showing how a prompt injection attack returns 200 OK with no Kubernetes alerts]

This is the observability inversion. Traditional monitoring asks: did it run? was it fast? did it error?

LLM observability needs to ask: was it correct? was it safe? was it allowed?

Infrastructure observability measures execution. LLM observability measures outcomes.
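The inversion is easy to demonstrate. Here is a minimal sketch of the same response evaluated at both layers — all names (`infra_ok`, `semantic_ok`, `SECRET_PATTERNS`) are illustrative, not from any real monitoring tool:

```python
import re

# Patterns a hypothetical egress scanner might flag. Illustrative only.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),    # PEM private key header
]

def infra_ok(status_code: int, latency_ms: float) -> bool:
    """What traditional monitoring asks: did it run, was it fast, did it error?"""
    return status_code == 200 and latency_ms < 2000

def semantic_ok(response_text: str) -> bool:
    """What LLM observability must ask: was the output safe to release?"""
    return not any(p.search(response_text) for p in SECRET_PATTERNS)

# A leaked credential sails straight through the infrastructure check.
leaked = "Sure! The deploy key is AKIAABCDEFGHIJKLMNOP."
print(infra_ok(200, 350))    # True  — pod healthy, 200 OK, latency within SLA
print(semantic_ok(leaked))   # False — boundary failure, invisible to Kubernetes
```

Both checks look at the same request; only the second one sees the failure.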


What the Actual Boundary Requires

Four control layers need to exist above Kubernetes:

Ingress Control — prompt validation and injection filtering before the model sees the request.

Egress Control — output scanning and PII detection before the response leaves the system.

Action Control — for agentic systems with tool access, explicit allow-lists scoped per model and context. RBAC governs which service account can call which API. This governs which model, in which context, is permitted to trigger which action. Not the same constraint.

Audit Control — sovereign, immutable inference logging. If your inference logs live in a vendor's platform, you don't fully own the audit trail.

Emerging implementations like Kong AI Gateway and Portkey are building toward this pattern — but the pattern matters more than the product. These four components need to exist regardless of what implements them.
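As a sketch of how the four components fit together, here is a minimal in-process version, assuming a simple gateway sitting in front of the model. Every name here is hypothetical — real implementations run out of process, as an AI gateway or policy engine:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Naive ingress heuristics; real filters are far more sophisticated.
INJECTION_MARKERS = ("ignore previous instructions", "disregard your system prompt")

@dataclass
class AuditLog:
    """Audit Control: an append-only inference log the operator owns."""
    entries: list = field(default_factory=list)

    def record(self, event: str, detail: str) -> None:
        self.entries.append((datetime.now(timezone.utc).isoformat(), event, detail))

class LLMControlPlane:
    def __init__(self, allowed_tools: set[str]):
        self.allowed_tools = allowed_tools  # Action Control: per-model allow-list
        self.audit = AuditLog()

    def ingress(self, prompt: str) -> bool:
        """Ingress Control: filter injection attempts before the model sees them."""
        if any(m in prompt.lower() for m in INJECTION_MARKERS):
            self.audit.record("ingress_block", prompt[:80])
            return False
        self.audit.record("ingress_pass", prompt[:80])
        return True

    def egress(self, response: str) -> bool:
        """Egress Control: scan output before it leaves the system."""
        if "CONFIDENTIAL" in response:
            self.audit.record("egress_block", response[:80])
            return False
        self.audit.record("egress_pass", response[:80])
        return True

    def authorize_tool(self, tool: str) -> bool:
        """Action Control: RBAC decides which service account may call which API;
        this decides which model, in which context, may trigger which action."""
        allowed = tool in self.allowed_tools
        self.audit.record("tool_allow" if allowed else "tool_deny", tool)
        return allowed
```

The point of the sketch is the shape, not the checks: every decision passes through an enforcement point you control, and every decision leaves an audit entry you own.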

[Figure: LLM Control Plane Pattern — four enforcement components: Ingress Control, Egress Control, Action Control, Audit Control]

When Kubernetes Is Enough

To be honest: there are AI workloads where infrastructure controls are sufficient.

  • Stateless, isolated LLM — no persistent context
  • No tool access — text output only
  • No sensitive context in scope
  • No external system impact

If your workload meets all four conditions, your infrastructure boundary largely holds.

The moment you add RAG retrieval, tool use, memory, or agentic orchestration — any one of them — you're operating at the LLM Boundary layer, and Kubernetes alone isn't sufficient.

Most enterprise AI workloads don't meet those conditions.
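The decision rule above is strict conjunction — one sketch, with illustrative field names, makes that explicit:

```python
from dataclasses import dataclass

@dataclass
class Workload:
    stateless: bool             # no persistent context
    no_tool_access: bool        # text output only
    no_sensitive_context: bool  # nothing confidential in scope
    no_external_impact: bool    # cannot affect external systems

def infra_boundary_sufficient(w: Workload) -> bool:
    """Kubernetes-level controls largely hold only if ALL four conditions are met."""
    return all((w.stateless, w.no_tool_access,
                w.no_sensitive_context, w.no_external_impact))

# Adding RAG, tools, memory, or orchestration flips a flag — and the answer.
chatbot = Workload(True, True, True, True)
rag_agent = Workload(stateless=False, no_tool_access=False,
                     no_sensitive_context=False, no_external_impact=True)
print(infra_boundary_sufficient(chatbot))    # True
print(infra_boundary_sufficient(rag_agent))  # False
```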


The Practical Takeaway

Your Kubernetes security posture is necessary. It is not sufficient for LLM workloads.

The cluster can be hardened. The model is still non-deterministic. Those are two different problems requiring two different control layers.

If you're running LLMs on Kubernetes with only infrastructure-layer controls, you have a boundary problem you haven't measured yet. The absence of alerts isn't evidence of safety — it's evidence that your observability doesn't reach the layer where LLM risk lives.


The full architecture breakdown, including the LLM Security Boundary Model and the LLM Control Plane Pattern framework, is at rack2cloud.com.
