
Dmitry Protsenko

Originally published at protsenko.dev

Kubernetes Security: Best Practices to Protect Your Cluster

You can find the original version of this article on my website, protsenko.dev.

Hi! In this article, I'm sharing 12 Kubernetes security best practices that help you secure your cluster through securely written Deployments, Services, and other workload manifests. The article is based on my experience creating the Kubernetes Security plugin for IntelliJ IDEA, and the plugin covers every practice described here.

If this project has been helpful to you, please consider giving it a ⭐ on GitHub to help others discover it.

12 Kubernetes Hardening Best Practices

1. Use Non-Root Containers

Always try to run containers as a non-root user. Unless the image or the Kubernetes securityContext specifies otherwise, containers execute as the root user (UID 0). Running as root might seem harmless, but if an attacker breaks out of the container, they land on the host as root. Even without a full breakout, a root process combined with certain misconfigurations (such as added capabilities or host mounts, covered below) can do far more damage to the host. Running as root also forgoes several preventive security controls, which increases the risk of container escape.

The better practice is to run as an unprivileged user. Create a user in your container image (many official images already provide one, such as node or nginx), then either set it as the default in the Dockerfile or request it from Kubernetes. The Kubernetes securityContext has the fields runAsUser and runAsNonRoot, which help enforce this. For example:

securityContext:
  runAsUser: 1000      # UID 1000 (non-root user)
  runAsNonRoot: true   # Ensure the container will not start as root

By specifying runAsNonRoot: true, the kubelet will actually refuse to start the container if it would run as UID 0. This is a guardrail in case someone tries to deploy an image that runs as root – it won’t run unless you explicitly allow root.

If you have an image that must run as root (some older software might assume it, or it needs privileged access), think carefully – can you modify the image or use a different solution? Running as root should be the exception, not the norm.

Tip: Many base images provide a non-root user, but don’t activate it by default. For example, the official Node.js image has a user “node” (UID 1000). You can use that in Kubernetes by doing runAsUser: 1000. For images that lack a user, consider rebuilding the image to add one or switching to an image that supports non-root operation.
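For the Node.js example in the tip above, a pod-level securityContext might look like the following sketch (the pod and container names are illustrative, and the UID/GID values assume the official node image's built-in user):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: node-app            # hypothetical name
spec:
  securityContext:          # pod-level: applies to all containers in the pod
    runAsUser: 1000         # the "node" user in the official image
    runAsGroup: 1000
    runAsNonRoot: true      # kubelet refuses to start the pod as UID 0
  containers:
  - name: app
    image: node:20-alpine
    command: ["node", "server.js"]
```

Setting this at the pod level keeps every container in the pod consistent; individual containers can still override it if needed.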

Running as non-root adds an extra layer to Kubernetes security. Even if an attacker gets code execution in the container, they hit a lower-privileged user boundary, and it’s harder to escalate from there. Combine this with not running privileged and dropping caps, and your container is much less attractive to attackers.

2. Do not use Privileged Containers

Don’t run containers in privileged mode unless absolutely necessary (and in practice, it’s almost never necessary for typical apps). A privileged container (securityContext.privileged: true) has nearly the same access to the host as processes running on the host. Privileged mode lifts most of the restrictions containers normally have: the container can access host devices, and most of the isolation mechanisms that usually confine it (seccomp, AppArmor, capability limits) are disabled. In essence, a privileged container is “just a process on the host with root privileges,” which negates the security benefits of using containers.

Someone could use privileged mode for low-level system tasks (for example, a container that needs to manipulate network interfaces or administer the host). But even in those cases, modern Kubernetes has alternatives (like using specific capabilities, or running as a daemon on the host outside of Kubernetes). Granting full privilege is like handing the keys to your kingdom to that container. If compromised, the attacker will trivially root the node and possibly move laterally in the cluster.

Kubernetes’s Baseline Pod Security Standard forbids privileged containers for general workloads. Admission controllers (and Pod Security Policies in the past) prevent you from deploying privileged pods in most namespaces, and for good reason.

Example to avoid:

securityContext:
  privileged: true

If you see that in a manifest, think twice. Why does the workload need these rights? Could you grant just the specific capability it needs, or run it differently? Privileged mode is meant for cluster infrastructure components, not user applications. And running anything with elevated privileges is a bad idea well beyond Kubernetes security.

If you absolutely must run something privileged (say, a CSI driver or a networking plugin that must manipulate the host network stack), isolate it to its own namespace and prevent untrusted users from deploying containers there. By avoiding privileged mode, you retain the isolation mechanisms (like cgroups, seccomp, AppArmor, namespaces, capabilities restrictions) that make containers a secure way to deploy applications.
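If a workload only needs to manipulate network interfaces, a sketch of granting just that one capability instead of full privilege could look like this (NET_ADMIN here is an assumption about what the tool actually needs — audit your own workload first):

```yaml
securityContext:
  privileged: false        # never grant full privilege
  capabilities:
    drop: ["ALL"]          # start from zero
    add: ["NET_ADMIN"]     # only the single capability the tool needs
```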

3. Do not use hostPath Volumes

Avoid hostPath volumes in your Pods whenever possible. A hostPath volume mounts a file or directory from the host node’s filesystem directly into a pod, essentially giving the container direct access to part of the host’s file system. The security implications are significant: if an attacker compromises the container, they could read or modify critical files on the host through the hostPath mount. Even if the container isn’t running as root, an attacker can combine hostPath with other escalation paths (such as privileged mode or setuid binaries) to tamper with the node.

HostPath volumes “present security risks that could lead to container escape.” They break the isolation between your application and the host OS. For example, if you mount /var/run/docker.sock from the host (a common but extremely risky practice), the container can control the Docker daemon and effectively gain root on the host. Even mounting something seemingly innocuous, like /var/log, could allow a malicious container to poison logs or consume disk space. Writing to any hostPath containing system files could crash the node or alter its state.

Kubernetes acknowledges this risk: the Pod Security Restricted profile forbids hostPath volumes entirely. If you must use a hostPath (for example, a daemon that needs to read a host file), make it read-only and limit the path as much as possible. Also, run that pod with the least privileges (non-root user, no extra capabilities, not privileged).
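When a hostPath truly is unavoidable, a minimal sketch of the read-only, narrowly-scoped pattern could look like this (the path and names are hypothetical examples):

```yaml
containers:
- name: log-reader
  image: busybox
  volumeMounts:
  - name: app-logs
    mountPath: /host-logs
    readOnly: true            # never mount host paths writable
volumes:
- name: app-logs
  hostPath:
    path: /var/log/myapp      # as narrow a path as possible, never /
    type: Directory           # fail fast if the path is not a directory
```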

Example to avoid:

volumes:
- name: host-files
  hostPath:
    path: /etc
    type: Directory

The above would give the container access to the host’s /etc directory – clearly a bad idea, as it could read passwords or modify config. If your workload needs to read host info, see if there’s an API or Kubernetes mechanism (like Downward API for some metadata) instead.

There are a few legitimate use cases for hostPath (like a log collection agent reading /var/log/ or a storage plugin writing to a host directory), but those should be deployed with tight controls and usually in dedicated namespaces. For most apps, you shouldn’t need hostPath at all. Use ConfigMap/Secret for config, emptyDir for scratch space, and a PVC for persistent storage. By avoiding hostPath, you keep the container fully sandboxed from the host’s filesystem, which strengthens Kubernetes security.
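A sketch of the safer alternatives mentioned above — emptyDir for scratch space and a PersistentVolumeClaim for durable data (the names are illustrative):

```yaml
volumes:
- name: scratch
  emptyDir: {}               # ephemeral, lives and dies with the pod
- name: data
  persistentVolumeClaim:
    claimName: app-data      # durable storage managed by the cluster, not the node
```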

4. Do not use hostPort (It Opens a Port on the Node)

Be cautious with the hostPort setting on Pods. When you specify a hostPort for a container, that port on the Kubernetes node (host machine) is opened and mapped to your pod. This can be risky because it exposes the host’s network interface to the container. If an attacker compromises the container, they could potentially intercept traffic on that host port or exploit it to gain deeper access. Exposing a host port to a container can open network pathways into your cluster, allowing the container to intercept traffic to a host service or bypass network policies. It also constrains scheduling because only one Pod per node can use each host port, and it can lead to port conflicts.

In general, you should avoid using hostPort unless absolutely necessary. Kubernetes Services (NodePort or LoadBalancer types) or Ingress resources better serve most use cases, such as exposing a service externally, because they handle traffic routing more securely without binding directly to the host’s network ports. Reserve hostPort for low-level system pods or networking tools that require a specific port on every node, and even then, use it sparingly.
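Instead of a hostPort, the same nginx could be exposed through a Service; a hedged sketch (NodePort is chosen for illustration — a LoadBalancer or Ingress is often the better choice in production, and the names/labels are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  type: NodePort             # kube-proxy routes node traffic to matching pods
  selector:
    app: nginx               # assumes the pods carry the label app: nginx
  ports:
  - port: 80                 # service port inside the cluster
    targetPort: 80           # containerPort on the pod
```

The Service abstraction lets Kubernetes handle routing and failover without binding your pod directly to the host's network interface.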

Example to avoid:

apiVersion: v1
kind: Pod
metadata:
  name: hostport-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
    ports:
    - containerPort: 80
      hostPort: 80

5. Do not share the Host Namespace

Pods can request to share certain namespaces with the host (node) – namely, the network, PID (process), and IPC namespaces. When a container shares the host’s namespace, it essentially breaks the isolation between the container and the host for that aspect. You should avoid it for most workloads. For example:

  • If a pod sets hostNetwork: true, it means the pod is using the host machine’s network interface directly. The pod can see all host network interfaces and even potentially sniff traffic. This breaks the default network isolation between pods and the host.
  • If hostPID: true, the container shares the host’s process ID space. That means it can see (and potentially interact with) processes running on the host (or other pods on the host). An attacker might leverage this to tamper with host processes or simply gather sensitive info.
  • If hostIPC: true, the pod shares the host’s inter-process communication namespace (things like shared memory segments). That could allow a malicious container to read/write shared memory used by something on the host.

In short, sharing host namespaces can lead to a container escape, compromising Kubernetes security. Pods that share namespaces with the host can communicate with host processes and glean information about the host, which is why baseline security policies disallow it.

Unless you are running a system-level daemon that needs this (e.g., a monitoring agent that needs to see all host processes, or a network plugin that needs host networking), you should leave these fields false (which is the default).

Example to avoid:

spec:
  hostNetwork: true
  hostPID: true
  hostIPC: true

Each of those should normally be false or not set at all. If you need one of them for a specific reason (say, hostNetwork for a networking pod), isolate that to a dedicated namespace or node and tightly control it. And never run general application pods with any of those enabled.

Kubernetes Pod Security Standards (Baseline and Restricted) disallow sharing host namespaces for exactly these reasons. Adhere to that: pods should live in their own namespaces, not the host’s.
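To have the cluster enforce these standards automatically, you can label a namespace for the Pod Security Admission controller; a minimal sketch (the namespace name is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: apps
  labels:
    pod-security.kubernetes.io/enforce: baseline   # reject pods that share host namespaces
    pod-security.kubernetes.io/warn: restricted    # warn about anything the stricter profile forbids
```

With these labels in place, a pod that sets hostNetwork, hostPID, or hostIPC is rejected at admission time instead of relying on review discipline.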

6. Do not use Insecure Capabilities

Drop unnecessary Linux capabilities from your containers. By default, containers run with a limited set of Linux capabilities – these are like fine-grained permissions that the root user inside the container can have. Granting additional or “non-default” capabilities can be dangerous. Certain powerful capabilities (for example, SYS_ADMIN or NET_ADMIN) can allow a process in a container to perform actions that might lead to container escapes or privilege escalations on the node. As Google’s GKE security guidance notes: giving a container extra capabilities could allow it to break out of the container sandbox.

If you don’t explicitly drop capabilities, a container still has a small set of default capabilities. For better security, it’s best practice to drop all capabilities and only add back what you truly need. This adheres to the principle of least privilege. Kubernetes lets you specify this in the pod or container security context. For example:

securityContext:
  capabilities:
    drop: ["ALL"]
    add: ["NET_BIND_SERVICE"]

In the above snippet, we drop everything and then only add NET_BIND_SERVICE (which allows binding to ports below 1024) as an example of a minimally required capability. The Kubernetes Restricted policy profile actually expects that containers drop ALL capabilities and, at most, add only a very limited set, like NET_BIND_SERVICE. Many common containers (especially web apps) do not require any special Linux capabilities to function.

Example to avoid:

securityContext:
  capabilities:
    add: ["NET_RAW", "SYS_ADMIN"] 

Here, NET_RAW allows the container to create raw sockets (which could be abused for packet spoofing or sniffing), and SYS_ADMIN is an extremely privileged capability that (among other things) allows mounting filesystems and performing a wide range of administrative operations. An attacker could use these to escape the container and undermine Kubernetes security.

If your application truly needs a specific capability, add only that one and carefully audit the implications. In general, try to run with as few capabilities as possible.

7. Do not Disable or Override the AppArmor Profile

Avoid disabling or overriding AppArmor profiles for your containers. AppArmor is a Linux kernel security module that confines what a container can do at the system level. On AppArmor-enabled hosts, the container runtime applies a default profile (runtime/default) to containers, which restricts certain actions. If you run a pod with an unconfined AppArmor profile, you are effectively turning off these protections. Running a container unconfined is widely considered a bad security practice: AppArmor doesn’t restrict the container at all, which increases the potential damage if an attacker compromises it.

By default, if you don’t specify an AppArmor profile, the container runtime’s default policy is used (which is typically a reasonably safe profile that provides essential Kubernetes Security protection). You should not explicitly set the profile to unconfined (which disables AppArmor). Instead, allow the default or use a tailored profile if you have one. Kubernetes Pod Security policies recommend using either the runtime default or specific allowed profiles, and preventing any override to an unconfined state.

Example to avoid:

metadata:
  name: insecure-pod
  annotations:
    container.apparmor.security.beta.kubernetes.io/my-container: unconfined
spec:
  containers:
  - name: my-container
    image: alpine

In the above snippet, the annotation forces the container’s AppArmor profile to unconfined, disabling AppArmor confinement. Instead, you should either omit the annotation (to use the default profile) or set it to runtime/default (to explicitly use the default). This ensures AppArmor is enforcing some restrictions on what the container can access on the host.
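The safe counterpart to the example above explicitly requests the default profile (on Kubernetes 1.30 and newer, the securityContext.appArmorProfile field supersedes this annotation, but the annotation form shown here still illustrates the idea):

```yaml
metadata:
  name: secure-pod
  annotations:
    container.apparmor.security.beta.kubernetes.io/my-container: runtime/default
spec:
  containers:
  - name: my-container
    image: alpine
```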

8. Do not Override the Default /proc Mount

Ensure your containers use the default /proc mount behavior. In Linux, /proc is a virtual filesystem that exposes process and kernel information. Container runtimes normally mask or hide certain paths in /proc to prevent containers from seeing sensitive host information. Kubernetes has a procMount setting in the security context that can be either Default (the normal, masked behavior) or Unmasked. Do not use Unmasked unless you have a very good reason: an “unmasked” /proc lets the container see much more system information, which can lead to information leakage or even assist in a container escape.

Deployments with an unsafe /proc mount (procMount=Unmasked) bypass the default kernel protections. An unmasked /proc can potentially expose host information to the container, resulting in information leaks or providing an avenue for attackers to escalate privileges. For example, an Unmasked /proc might reveal details of processes running on the host or allow access to /proc/kcore (which could be dangerous). Unless you’re doing low-level debugging or monitoring that explicitly requires this (which is rare and usually better handled another way), you should not change the procMount from its default to maintain strong Kubernetes Security.

The best practice is simple: leave procMount as Default, which is also the Kubernetes default behavior if you don’t specify it. The Pod Security Restricted standard requires that the default /proc masks remain in place for all containers.

Example to avoid:

securityContext:
  procMount: Unmasked

In summary, do not unmask /proc. Keep the default masks that Kubernetes and the container runtime provide—they reduce the container’s visibility into the host’s processes and kernel.
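If you prefer to state the safe behavior explicitly rather than relying on the implicit default, the setting looks like this:

```yaml
securityContext:
  procMount: Default   # keep the runtime's /proc masks in place
```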

9. Restrict the Volume Types You Use

Not all types of volumes are equal when it comes to security. Kubernetes supports many volume types (ConfigMaps, Secrets, persistent volumes, hostPath, NFS, emptyDir, etc.), but some can expose your Pod to risk. The Pod Security “Restricted” standard defines an allow-list of safe volume types that a pod can use. The Restricted policy allows only the following volume types: ConfigMap, CSI, DownwardAPI, emptyDir, Ephemeral (inline CSI volumes), PersistentVolumeClaim, Projected, and Secret. In practice, these are volumes that do not directly mount the host’s filesystem in an unsafe way.

The volume types not on that list (for example, hostPath, NFS, awsElasticBlockStore, and some others) are either inherently risky or better handled via PersistentVolumeClaims. For instance, a hostPath volume mounts a directory from the node’s filesystem into your container, which can easily lead to container escapes or tampering with host files (see practice 3 above). NFS and other network storage volumes present a lower privilege escalation risk, but without proper management they can enable denial-of-service or data tampering. Manage them through the PersistentVolume subsystem instead of directly in a Pod spec.

Best practice: Use only the necessary volume types and prefer higher-level abstractions. If you need to mount storage, use PersistentVolumeClaim (with a proper StorageClass) instead of directly using a hostPath or other host-dependent volume. This way, the cluster can enforce storage isolation, and you avoid giving the container direct access to the host. You should pass most config data via ConfigMap or Secret volumes rather than baking it into images or using host paths.
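A sketch of requesting storage through a PVC instead of a host-dependent volume (the claim name, storage class, and size are illustrative assumptions — use whatever your cluster provides):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard   # whatever StorageClass your cluster offers
  resources:
    requests:
      storage: 1Gi
```

The pod then references the claim by name, and the cluster's storage layer decides where and how the data actually lives.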

To enforce this, consider enabling the Kubernetes Pod Security Admission controller in restricted mode for your namespaces; it will automatically reject Pods that use disallowed volume types. In short, limit volume types to the safe set – basically ephemeral volumes (emptyDir, etc.), config/secret volumes, and PVCs backed by external storage. This reduces the risk of a container directly accessing the host or other unintended data sources, thereby strengthening Kubernetes security.

10. Do not set Custom SELinux Options

Avoid specifying custom SELinux options for your pods unless you really know what you are doing. SELinux is another Linux kernel security mechanism (a Mandatory Access Control system) that labels resources and defines which processes can access which resources. Kubernetes, by default, will let the container runtime apply a default SELinux context to your container (usually a confined type like container_t). You have the option to override the SELinux context via the pod or container securityContext (seLinuxOptions field), but changing these labels can weaken isolation if done incorrectly.

The Snyk security blog warns: altering the SELinux labels of a container process could potentially allow that process to escape its container and access the host filesystem. In simpler terms, the default SELinux policy on a host prevents containers from seeing or modifying host files. If you override the SELinux type or role to something more privileged (or turn SELinux to permissive mode on the host), a compromised container might break out and read/write host files it shouldn’t.

Kubernetes’ Pod Security Standards reflect this by restricting SELinux options. The Restricted profile forbids setting a custom SELinux user or role, and only allows specific SELinux types (the standard container types like container_t or container_init_t). Unless you have a specific need (for example, integrating with a host that uses SELinux extensively and has custom policies), you typically won’t set these at all. Just let the container runtime apply the default confinement.

Example to avoid:

securityContext:
  seLinuxOptions:
    user: system_u
    role: system_r
    type: spc_t      # “spc_t” is a special type for super-privileged containers

The above would label the container in a very permissive way (depending on host policy, spc_t might allow broad access). This is not recommended unless absolutely required by your security team’s policy. In most cases, you should omit seLinuxOptions entirely. If you do need it, stick to the container types provided by your distribution (for example, on Red Hat-based systems, container_t is the confined type for containers).
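If you must set seLinuxOptions at all, stick to a confined type; a sketch for Red Hat-based hosts (the exact labels available depend on your host's SELinux policy):

```yaml
securityContext:
  seLinuxOptions:
    type: container_t   # the standard confined type for containers
```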

In summary, do not override SELinux labels with anything less restrictive. The defaults are there to keep your container constrained and maintain Kubernetes security. Manage SELinux at the cluster/node level rather than per Pod unless you’re an SELinux expert with a clear goal. And if SELinux is too much overhead, consider AppArmor or seccomp for added security – but never make the container more privileged than the defaults.

11. Do not Disable the Seccomp Profile

Enable a seccomp profile for your containers (or use the default one); do not run containers with seccomp turned off (“unconfined”). Seccomp (secure computing mode) is a Linux kernel feature that can filter system calls that a process is allowed to make. Kubernetes lets you specify a seccomp profile for pods/containers. If you don’t specify anything, historically many runtimes would run the container as unconfined (no filtering), but newer Kubernetes versions and runtimes often apply a default seccomp profile (e.g., Docker’s default seccomp profile) automatically. Regardless, you want seccomp filtering in place.

Running a container with seccomp=Unconfined means it can make any syscall it wants, which broadens the attack surface: no restrictions are placed on system calls at all. In contrast, the default seccomp profile blocks dozens of dangerous syscalls that containers typically never need (like loading kernel modules or changing the system clock) – exactly the calls that could be used to break out or harm the host.

Best practice: use the RuntimeDefault seccomp profile (Kubernetes’ way of saying “use the container runtime’s default seccomp policy”) or a specific custom profile if you have one. The Kubernetes Restricted policy requires that seccomp be explicitly set to either RuntimeDefault or a named profile, and not left as Unconfined, to maintain strong Kubernetes security.
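The recommended setting looks like this:

```yaml
securityContext:
  seccompProfile:
    type: RuntimeDefault   # use the container runtime's default syscall filter
```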

This ensures the container is running under seccomp confinement. If you wanted to use a custom profile, you’d set type: Localhost and provide the profile file, but that’s an advanced scenario. The main thing is – don’t set seccompProfile type to Unconfined. For example:

securityContext:
  seccompProfile:
    type: Unconfined

The above would explicitly disable syscall filtering, exposing you to exploits that leverage obscure syscalls. There have been real-world container breakouts that depended on having seccomp off, so turning it on is a simple way to mitigate whole classes of kernel vulnerabilities.

Unless you have a specific container that is failing due to seccomp (in which case, consider adjusting the profile rather than removing it entirely), you should always use seccomp. It’s a transparent layer of defense with little to no performance cost in typical applications.

12. Beware of using Insecure Sysctls

Do not enable unsafe sysctls in your Pods. Sysctls (system controls) are kernel parameters that can tweak networking, memory, and other settings. Kubernetes classifies sysctls into safe and unsafe. Safe sysctls are those that are namespaced to the container or pod (meaning their effects are limited to that pod) and isolated from the host. Unsafe sysctls are those that apply to the entire host kernel and could affect all pods or even compromise security. Examples of unsafe sysctls might include things like kernel.shmmax (which affects kernel shared memory limits globally) or net.ipv4.ip_forward (which could change node-level networking behavior).

Enabling unsafe sysctls can disable important security mechanisms or negatively impact the node’s stability. They might allow a pod to consume resources beyond its limits or interfere with other pods. In the worst case, a bad actor could use an unsafe sysctl to panic the kernel or elevate privileges.

Kubernetes by default prevents pods from using unsafe sysctls unless the cluster admin has explicitly allowed them (via the kubelet’s --allowed-unsafe-sysctls flag). The best practice is to stick to the safe sysctls. According to the Pod Security Standards, you should disallow all but the allowed safe subset. Safe sysctls include a handful of names like net.ipv4.ip_local_port_range, net.ipv4.tcp_syncookies, net.ipv4.ping_group_range, etc., which are known not to break isolation.

If you find yourself needing to set a kernel parameter for your application to run, double-check whether it’s truly namespaced. For example, if you need to increase kernel.shmmax for a database, use a proper host-level mechanism or make sure the cluster explicitly allows it, because that setting affects the host kernel’s shared memory allowance for all processes.

Example: The following Pod securityContext shows setting a sysctl:

securityContext:
  sysctls:
  - name: kernel.shmmax
    value: "16777216"

This particular sysctl (kernel.shmmax) is not namespaced per pod – it would raise the shared memory segment size limit on the host kernel itself. This could impact other pods or processes on the host. Such a sysctl is considered unsafe and would be rejected by Kubernetes unless the cluster is configured to allow it (and it generally shouldn’t be). In contrast, a “safe” sysctl like net.ipv4.tcp_syncookies could be set in a pod’s spec if needed, because it’s isolated to the pod’s network namespace.
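A sketch of a safe, namespaced sysctl that Kubernetes accepts without any kubelet allow-listing (the value shown is illustrative):

```yaml
securityContext:
  sysctls:
  - name: net.ipv4.tcp_syncookies   # scoped to the pod's own network namespace
    value: "1"
```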

In summary, avoid using sysctls that are not explicitly documented as safe for Kubernetes. If you absolutely require an unsafe sysctl for a specialized application, you’ll need a waiver in the cluster, and you should isolate that workload as much as possible. For everyone else – stick to defaults; don’t turn your pods into mini kernel tweakers. The default kernel settings are usually fine, and if not, they should be tuned on the host by admins, not on a per-pod basis by application owners.

Your Kubernetes Security Action Plan

By adhering to these 12 Kubernetes security best practices, you significantly harden your Kubernetes cluster’s security. Many of these boil down to least privilege – giving your pods only the access they truly need and nothing more.

It’s a good idea to integrate these checks into your development and CI/CD process. For the best IDE integration, give the Kubernetes Security plugin for JetBrains IDEs a try. It is written in pure Kotlin and builds on the features of the IntelliJ platform, so shift-left security happens through on-the-fly checks right in the editor. If you are interested in IDE plugin development, read my article “How I made a Docker linter for IntelliJ IDEA”.

Don’t miss my new articles—follow me on LinkedIn!
