DEV Community

NTCTech

Posted on • Originally published at rack2cloud.com

Kubernetes Is Not an LLM Security Boundary

The CNCF flagged it three days ago. Most teams haven't processed what it actually means.

Kubernetes lacks built-in mechanisms to enforce application-level or semantic controls over AI systems. That's not a bug. It's not a misconfiguration. It's a category error in how we're thinking about AI workload security.

Kubernetes isolates containers. It does not isolate decisions.

[Figure: LLM Security Boundary Model — three layers (Infrastructure Boundary, Application Boundary, LLM Boundary) showing where Kubernetes visibility ends]

What Kubernetes Actually Controls

To see the problem clearly, you need to be precise about the scope.

Kubernetes enforces pod isolation, RBAC, network policy, resource limits, and admission control. A well-configured cluster with Cilium, Kyverno, and Falco is genuinely hardened.

All of those controls operate at the infrastructure layer. None of them understand what an LLM is doing inside that boundary.


The Three-Layer Problem

Think of it as three distinct boundaries:

Infrastructure Boundary (Kubernetes): Controls compute, network, identity. Cannot see model behavior, prompts, or outputs.

Application Boundary: Controls API access and service logic. Cannot see model reasoning or semantic intent.

LLM Boundary — the actual risk layer: Controls prompts, outputs, tool usage. This is the layer your current tooling doesn't reach.

Most teams have the first two layers covered. The third is largely unaddressed.


The Failure Mode Kubernetes Will Never Catch

Here's the production scenario that matters:

  1. A user submits a prompt with a hidden injection instruction.
  2. The model retrieves internal context via RAG.
  3. The model outputs sensitive internal data in its response.
  4. The response returns HTTP 200.
  5. No alerts fire. No logs capture what the model decided.

From Kubernetes' perspective: successful request. Pod healthy. RBAC respected. Latency within SLA.

From a security perspective: complete boundary failure.

[Figure: LLM security boundary failure — five-step scenario showing how a prompt injection attack returns 200 OK with no Kubernetes alerts]

This is the observability inversion. Traditional monitoring asks: did it run? was it fast? did it error?

LLM observability needs to ask: was it correct? was it safe? was it allowed?

Infrastructure observability measures execution. LLM observability measures outcomes.
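The inversion is easy to demonstrate. Here is a minimal sketch of the same response evaluated at both layers — all names (`infra_ok`, `semantic_ok`, `SECRET_PATTERNS`) are illustrative, not from any real monitoring tool:

```python
import re

# Patterns a hypothetical egress scanner might flag. Illustrative only.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),    # PEM private key header
]

def infra_ok(status_code: int, latency_ms: float) -> bool:
    """What traditional monitoring asks: did it run, was it fast, did it error?"""
    return status_code == 200 and latency_ms < 2000

def semantic_ok(response_text: str) -> bool:
    """What LLM observability must ask: was the output safe to release?"""
    return not any(p.search(response_text) for p in SECRET_PATTERNS)

# A leaked credential sails straight through the infrastructure check.
leaked = "Sure! The deploy key is AKIAABCDEFGHIJKLMNOP."
print(infra_ok(200, 350))    # True  — pod healthy, 200 OK, latency within SLA
print(semantic_ok(leaked))   # False — boundary failure, invisible to Kubernetes
```

Both checks look at the same request; only the second one sees the failure.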


What the Actual Boundary Requires

Four control layers need to exist above Kubernetes:

Ingress Control — prompt validation and injection filtering before the model sees the request.

Egress Control — output scanning and PII detection before the response leaves the system.

Action Control — for agentic systems with tool access, explicit allow-lists scoped per model and context. RBAC governs which service account can call which API. This governs which model, in which context, is permitted to trigger which action. Not the same constraint.

Audit Control — sovereign, immutable inference logging. If your inference logs live in a vendor's platform, you don't fully own the audit trail.

Emerging implementations like Kong AI Gateway and Portkey are building toward this pattern — but the pattern matters more than the product. These four components need to exist regardless of what implements them.
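As a sketch of how the four components fit together, here is a minimal in-process version, assuming a simple gateway sitting in front of the model. Every name here is hypothetical — real implementations run out of process, as an AI gateway or policy engine:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Naive ingress heuristics; real filters are far more sophisticated.
INJECTION_MARKERS = ("ignore previous instructions", "disregard your system prompt")

@dataclass
class AuditLog:
    """Audit Control: an append-only inference log the operator owns."""
    entries: list = field(default_factory=list)

    def record(self, event: str, detail: str) -> None:
        self.entries.append((datetime.now(timezone.utc).isoformat(), event, detail))

class LLMControlPlane:
    def __init__(self, allowed_tools: set[str]):
        self.allowed_tools = allowed_tools  # Action Control: per-model allow-list
        self.audit = AuditLog()

    def ingress(self, prompt: str) -> bool:
        """Ingress Control: filter injection attempts before the model sees them."""
        if any(m in prompt.lower() for m in INJECTION_MARKERS):
            self.audit.record("ingress_block", prompt[:80])
            return False
        self.audit.record("ingress_pass", prompt[:80])
        return True

    def egress(self, response: str) -> bool:
        """Egress Control: scan output before it leaves the system."""
        if "CONFIDENTIAL" in response:
            self.audit.record("egress_block", response[:80])
            return False
        self.audit.record("egress_pass", response[:80])
        return True

    def authorize_tool(self, tool: str) -> bool:
        """Action Control: RBAC decides which service account may call which API;
        this decides which model, in which context, may trigger which action."""
        allowed = tool in self.allowed_tools
        self.audit.record("tool_allow" if allowed else "tool_deny", tool)
        return allowed
```

The point of the sketch is the shape, not the checks: every decision passes through an enforcement point you control, and every decision leaves an audit entry you own.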

[Figure: LLM Control Plane Pattern — four enforcement components: Ingress Control, Egress Control, Action Control, Audit Control]

When Kubernetes Is Enough

To be honest: there are AI workloads where infrastructure controls are sufficient.

  • Stateless, isolated LLM — no persistent context
  • No tool access — text output only
  • No sensitive context in scope
  • No external system impact

If your workload meets all four conditions, your infrastructure boundary largely holds.

The moment you add RAG retrieval, tool use, memory, or agentic orchestration — any one of them — you're operating at the LLM Boundary layer, and Kubernetes alone isn't sufficient.

Most enterprise AI workloads don't meet those conditions.
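The decision rule above is strict conjunction — one sketch, with illustrative field names, makes that explicit:

```python
from dataclasses import dataclass

@dataclass
class Workload:
    stateless: bool             # no persistent context
    no_tool_access: bool        # text output only
    no_sensitive_context: bool  # nothing confidential in scope
    no_external_impact: bool    # cannot affect external systems

def infra_boundary_sufficient(w: Workload) -> bool:
    """Kubernetes-level controls largely hold only if ALL four conditions are met."""
    return all((w.stateless, w.no_tool_access,
                w.no_sensitive_context, w.no_external_impact))

# Adding RAG, tools, memory, or orchestration flips a flag — and the answer.
chatbot = Workload(True, True, True, True)
rag_agent = Workload(stateless=False, no_tool_access=False,
                     no_sensitive_context=False, no_external_impact=True)
print(infra_boundary_sufficient(chatbot))    # True
print(infra_boundary_sufficient(rag_agent))  # False
```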


The Practical Takeaway

Your Kubernetes security posture is necessary. It is not sufficient for LLM workloads.

The cluster can be hardened. The model is still non-deterministic. Those are two different problems requiring two different control layers.

If you're running LLMs on Kubernetes with only infrastructure-layer controls, you have a boundary problem you haven't measured yet. The absence of alerts isn't evidence of safety — it's evidence that your observability doesn't reach the layer where LLM risk lives.


The full architecture breakdown, including the LLM Security Boundary Model and the LLM Control Plane Pattern framework, is at rack2cloud.com.
