Kubernetes Security Best Practices 2026: The Complete Hardening Guide
Introduction
A single misconfigured Kubernetes cluster can expose your entire infrastructure in minutes. In 2025 alone, over 60% of organizations reported at least one Kubernetes security incident — and the majority traced back to preventable misconfigurations, not zero-day exploits.
Kubernetes ships with minimal security defaults. Pods can talk to each other freely. Service account tokens are mounted into every pod automatically. Secrets sit in etcd unencrypted unless you enable encryption at rest. If you deploy a vanilla cluster and walk away, you are effectively running with the doors unlocked.
This guide walks through four critical security layers you must implement in any production Kubernetes cluster. Each section includes YAML manifests and CLI commands you can copy and adapt. Whether you run EKS, GKE, AKS, or bare-metal kubeadm clusters, these practices apply.
Who this is for: DevOps engineers, SREs, platform engineers, and anyone responsible for production Kubernetes clusters. You should be comfortable with kubectl and basic YAML. No prior security specialization required.
What you will implement by the end of this guide:
- Fine-grained RBAC with least-privilege principles
- Pod Security Admission replacing deprecated PSPs
- Zero-trust network policies with default-deny
- Image vulnerability scanning with Trivy in CI/CD
Let's lock it down.
1. RBAC: Least Privilege from Day One
Role-Based Access Control (RBAC) is your first and most important line of defense. The principle is simple: every user, service account, and application should have exactly the permissions it needs — nothing more.
The Problem with Default RBAC
Out of the box, Kubernetes binds the system:masters group (used by kubeadm-generated admin credentials) to cluster-admin. Several controller service accounts in the kube-system namespace also hold elevated privileges. The default service account in every namespace is created automatically and, unless you explicitly bind it, has no permissions beyond the API discovery access that every authenticated identity gets. But many teams accidentally grant it broad access during development and forget to revoke it.
Here is the most common anti-pattern we see in production audits:
# BAD: cluster-admin bound to default service account
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dangerous-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
This gives every pod in the default namespace unrestricted control over the entire cluster. If any pod gets compromised, the attacker owns everything.
RBAC Best Practices
1. Use namespace-scoped Roles, not ClusterRoles, whenever possible.
A Role is bound to a single namespace. A ClusterRole is cluster-wide. Most applications only need access to resources in their own namespace.
# GOOD: namespace-scoped Role for a web application
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: webapp-role
  namespace: production
rules:
- apiGroups: [""]
  # Drop "secrets" unless the app reads them through the API:
  # list/watch returns the contents of every Secret in the namespace.
  resources: ["pods", "services", "configmaps", "secrets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "update"]
2. Create a dedicated ServiceAccount for every application.
Never use the default service account. Always create a named ServiceAccount and bind it to a specific Role.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: webapp-sa
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: webapp-binding
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: webapp-role
subjects:
- kind: ServiceAccount
  name: webapp-sa
  namespace: production
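Then reference it in the pod template of your Deployment:
spec:
  template:
    spec:
      serviceAccountName: webapp-sa
      # Keep the token mounted only if the app calls the Kubernetes API; otherwise set false
      automountServiceAccountToken: true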
3. Avoid wildcard verbs and resources.
Every * in your RBAC rules is a potential escalation path. Be specific:
# BAD
verbs: ["*"]
resources: ["*"]
# GOOD: explicit
verbs: ["get", "list", "watch"]
resources: ["pods", "services"]
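A wildcard check like this is easy to automate. The following is a minimal sketch, assuming the rules have already been loaded into Python dictionaries (for example from a manifest); the function name `find_wildcard_rules` is hypothetical and it only looks for a bare `*`, not for patterns such as `pods/*`.

```python
def find_wildcard_rules(rules):
    """Return the indexes of RBAC rules that use '*' in verbs, resources, or apiGroups."""
    flagged = []
    for index, rule in enumerate(rules):
        for field in ("verbs", "resources", "apiGroups"):
            if "*" in rule.get(field, []):
                flagged.append(index)
                break  # one hit is enough to flag this rule
    return flagged


# Illustrative rules: the first is explicit, the second is wide open.
rules = [
    {"apiGroups": [""], "resources": ["pods", "services"], "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
]
print(find_wildcard_rules(rules))  # prints [1]
```

Running a check of this kind in code review or CI catches wildcards before they reach the cluster.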
4. Use kubectl auth can-i to verify permissions.
Before deploying, test what a service account can actually do:
kubectl auth can-i create deployments \
  --as=system:serviceaccount:production:webapp-sa \
  --namespace=production
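To reason about the answer before touching a cluster, the matching logic can be approximated offline. This is a simplified sketch, not the real authorizer: `rules_allow` is a hypothetical helper that ignores resourceNames, subresources, and aggregated roles, and the rules are modeled on the webapp-role example.

```python
def rules_allow(rules, verb, api_group, resource):
    """Return True if any rule grants `verb` on `resource` in `api_group`."""
    def matches(allowed, value):
        return "*" in allowed or value in allowed

    return any(
        matches(rule.get("apiGroups", []), api_group)
        and matches(rule.get("resources", []), resource)
        and matches(rule.get("verbs", []), verb)
        for rule in rules
    )


# Rules modeled on the webapp-role example in this section.
webapp_rules = [
    {"apiGroups": [""], "resources": ["pods", "services", "configmaps", "secrets"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["apps"], "resources": ["deployments"],
     "verbs": ["get", "list", "watch", "update"]},
]

print(rules_allow(webapp_rules, "create", "apps", "deployments"))  # False
print(rules_allow(webapp_rules, "update", "apps", "deployments"))  # True
```

The cluster remains the source of truth, so treat this as a way to sanity-check a Role while writing it and confirm with kubectl auth can-i afterwards.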
5. Separate human users from machine accounts.
Use OIDC (OpenID Connect) for human authentication. Service accounts are for pods and CI/CD pipelines. Map OIDC groups to Kubernetes roles — never give individual users cluster-admin. Tools like Dex, Keycloak, or cloud-provider IAM (aws-iam-authenticator for EKS, GCP IAM for GKE) handle this cleanly.
6. Audit your RBAC regularly.
Run this one-liner to find overly permissive bindings:
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.roleRef.name=="cluster-admin") | .subjects'
You will be surprised how many cluster-admin bindings accumulate over time. RBAC auditing tools like kubescape, kube-bench, or popeye can automate this check.
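If jq is not available, or you want the binding name next to each subject, the same filter can be sketched in Python. It assumes the input is the JSON printed by `kubectl get clusterrolebindings -o json`; the function name and the sample data are hypothetical.

```python
import json


def cluster_admin_subjects(bindings_json):
    """Return (binding name, subject kind, subject name) for every cluster-admin binding."""
    data = json.loads(bindings_json)
    found = []
    for item in data.get("items", []):
        if item.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        # `subjects` can be missing or null on a binding without subjects.
        for subject in item.get("subjects") or []:
            found.append((item["metadata"]["name"], subject["kind"], subject["name"]))
    return found


# Hypothetical sample shaped like the kubectl output.
sample = json.dumps({"items": [
    {"metadata": {"name": "dangerous-binding"},
     "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
     "subjects": [{"kind": "ServiceAccount", "name": "default", "namespace": "default"}]},
    {"metadata": {"name": "read-only"},
     "roleRef": {"kind": "ClusterRole", "name": "view"},
     "subjects": [{"kind": "Group", "name": "developers"}]},
]})
print(cluster_admin_subjects(sample))
# prints [('dangerous-binding', 'ServiceAccount', 'default')]
```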
RBAC for CI/CD Pipelines
CI/CD systems (GitHub Actions, GitLab CI, ArgoCD) need API access to deploy. The pattern: create a ServiceAccount with the minimum permissions required for deployments, extract its token, and inject it into your pipeline secrets.
kubectl create serviceaccount cicd-deployer -n production
kubectl create rolebinding cicd-deployer-binding \
  --role=webapp-role \
  --serviceaccount=production:cicd-deployer \
  -n production
# Kubernetes 1.24+: request a time-bound token (8760h = 1 year).
# The API server may cap the lifetime below what you request, so check the expiry.
kubectl create token cicd-deployer -n production --duration=8760h
Store that token in your CI secrets manager — never in source code.
2. Pod Security Standards & Pod Security Admission
Pod Security Policies (PSPs) were deprecated in Kubernetes 1.21 and removed in 1.25. The replacement is Pod Security Admission (PSA) — a built-in admission controller that enforces Pod Security Standards at the namespace level.
The Three Pod Security Standards
| Standard | Description | Key Restrictions |
|---|---|---|
| Privileged | Unrestricted. Equivalent to no policy. | None. Use only for system namespaces. |
| Baseline | Prevents known privilege escalations. Minimum for production. | No hostNetwork, hostPID, hostIPC, hostPorts, privileged containers, or hostPath volumes. |
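| Restricted | Hardened following Pod hardening best practices. | Everything Baseline restricts, plus: must run as non-root, allowPrivilegeEscalation must be false, seccomp profile required (RuntimeDefault or Localhost), all capabilities dropped (only NET_BIND_SERVICE may be added back), volume types limited to a safe set. A read-only root filesystem is recommended but not required. |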
Enforcing PSA with Namespace Labels
PSA uses namespace labels — no separate CRD needed. Apply a label to any namespace:
# Enforce the restricted policy on a namespace
kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted
# Also set audit and warn modes for visibility
kubectl label namespace production \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/warn=restricted
PSA has three modes, each set with its own label:
- enforce: reject pods that violate the policy
- audit: allow but log violations to the audit log
- warn: allow but show a warning to the user
Gradual rollout strategy: Start with warn and audit modes for a week. Fix all warnings. Then switch to enforce. Never jump straight to enforce=restricted on existing production namespaces — you will break running workloads.
Writing a Restricted-Compliant Pod
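Here is a pod that passes the restricted Pod Security Standard at admission. Note that the stock nginx image expects to start as root, so in practice you would pair this spec with an unprivileged build such as nginxinc/nginx-unprivileged: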
apiVersion: v1
kind: Pod
metadata:
  name: secure-nginx
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: nginx
    image: nginx:1.25-alpine
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
        add: ["NET_BIND_SERVICE"]
      readOnlyRootFilesystem: true
    volumeMounts:
    - name: tmp
      mountPath: /tmp
    - name: nginx-cache
      mountPath: /var/cache/nginx
  volumes:
  - name: tmp
    emptyDir: {}
  - name: nginx-cache
    emptyDir: {}
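Key points:
- runAsNonRoot: true (the container must not run as UID 0)
- seccompProfile type RuntimeDefault (applies the container runtime's default syscall filter)
- capabilities.drop: ["ALL"] (strips all Linux capabilities; NET_BIND_SERVICE is the only one the restricted standard lets you add back)
- readOnlyRootFilesystem: true (attackers cannot modify the image's filesystem; only the mounted emptyDir volumes are writable)
- allowPrivilegeEscalation: false (a process cannot gain more privileges than its parent, for example through setuid binaries)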
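These requirements can be checked before a manifest ever reaches the API server. The sketch below covers only a subset of the restricted controls and only the `containers` list (not init or ephemeral containers); it is an illustration, not a replacement for the admission controller, and `restricted_violations` is a hypothetical name.

```python
ALLOWED_SECCOMP = {"RuntimeDefault", "Localhost"}


def restricted_violations(pod_spec):
    """Check a pod spec dict against a subset of the 'restricted' controls."""
    problems = []
    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        # Container-level settings override pod-level ones where both exist.
        if ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot")) is not True:
            problems.append(f"{name}: runAsNonRoot must be true")
        if ctx.get("allowPrivilegeEscalation") is not False:
            problems.append(f"{name}: allowPrivilegeEscalation must be false")
        caps = ctx.get("capabilities", {})
        if "ALL" not in caps.get("drop", []):
            problems.append(f"{name}: capabilities.drop must include ALL")
        extra = set(caps.get("add", [])) - {"NET_BIND_SERVICE"}
        if extra:
            problems.append(f"{name}: disallowed added capabilities {sorted(extra)}")
        seccomp = ctx.get("seccompProfile", pod_ctx.get("seccompProfile", {}))
        if seccomp.get("type") not in ALLOWED_SECCOMP:
            problems.append(f"{name}: seccompProfile.type must be RuntimeDefault or Localhost")
    return problems


# Security-relevant parts of the secure-nginx pod from this section.
secure_spec = {
    "securityContext": {"runAsNonRoot": True, "seccompProfile": {"type": "RuntimeDefault"}},
    "containers": [{
        "name": "nginx",
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"], "add": ["NET_BIND_SERVICE"]},
        },
    }],
}
print(restricted_violations(secure_spec))  # prints []
```

A container with no securityContext at all produces four findings, which is a quick way to see how far a default pod is from the restricted standard.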
Exemptions (When You Need Them)
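Some system workloads genuinely need privileged access: CNI plugins, CSI drivers, monitoring agents. On clusters where you control the API server, configure exemptions through an admission configuration file. On managed services such as EKS, GKE, or AKS you usually cannot change API server flags, so label those namespaces with a more permissive level instead.
# In kube-apiserver.yaml (static pod manifest, abbreviated)
apiVersion: v1
kind: Pod
spec:
  containers:
  - command:
    - kube-apiserver
    - --admission-control-config-file=/etc/kubernetes/admission.yaml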
And in the admission configuration:
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
    exemptions:
      namespaces: ["kube-system", "cert-manager", "ingress-nginx"]
      usernames: ["system:serviceaccount:kube-system:calico-node"]
3. Network Policies: Zero-Trust Inside the Cluster
Kubernetes networking is flat by default. Every pod can reach every other pod in the cluster, across all namespaces — with zero built-in filtering. If one pod gets compromised, it can scan your entire internal network, hit internal APIs, and pivot to databases.
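Network Policies are your internal firewall. They are Kubernetes-native resources that control traffic flow at Layers 3 and 4 (IP and port level). Think of them as security group rules for pods. They are only enforced if your CNI plugin implements them (Calico and Cilium do, for example); on a CNI without support, the API accepts the objects but nothing is filtered.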
Default-Deny: Lock Everything First
Start by denying all ingress and egress traffic. Then selectively open only what each application needs:
# Default deny all ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {} # selects all pods
  policyTypes:
  - Ingress
---
# Default deny all egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
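With these two policies in place, no pod in the namespace can receive or initiate traffic, DNS lookups included. Then layer on allow rules for specific flows. Because egress is denied too, every flow needs an ingress rule on the destination and a matching egress rule on the source. The ingress half of a frontend-to-backend flow looks like this: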
# Allow frontend → backend on port 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
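The interplay between default-deny and allow rules is additive: a pod selected by any policy is isolated, and traffic passes only if some rule in some selecting policy allows it. The sketch below models that for same-namespace ingress, assuming podSelector peers with matchLabels only and ignoring protocols; the helper names are hypothetical and this is not how a CNI implements enforcement.

```python
def selector_matches(selector, labels):
    """An empty selector ({}) matches every pod; otherwise all matchLabels must be present."""
    return all(labels.get(k) == v for k, v in selector.get("matchLabels", {}).items())


def ingress_allowed(policies, dest_labels, source_labels, port):
    """Decide whether a source pod may reach a destination pod on `port`."""
    selecting = [p for p in policies
                 if "Ingress" in p["policyTypes"]
                 and selector_matches(p["podSelector"], dest_labels)]
    if not selecting:
        return True  # no policy selects the pod, so it is not isolated
    for policy in selecting:
        for rule in policy.get("ingress", []):
            peers, ports = rule.get("from"), rule.get("ports")
            # A rule without `from` allows all sources; without `ports`, all ports.
            peer_ok = peers is None or any(
                selector_matches(peer["podSelector"], source_labels) for peer in peers)
            port_ok = ports is None or any(p["port"] == port for p in ports)
            if peer_ok and port_ok:
                return True
    return False


default_deny = {"podSelector": {}, "policyTypes": ["Ingress"]}
allow_frontend = {
    "podSelector": {"matchLabels": {"app": "backend"}},
    "policyTypes": ["Ingress"],
    "ingress": [{"from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}],
                 "ports": [{"protocol": "TCP", "port": 8080}]}],
}
policies = [default_deny, allow_frontend]
print(ingress_allowed(policies, {"app": "backend"}, {"app": "frontend"}, 8080))  # True
print(ingress_allowed(policies, {"app": "backend"}, {"app": "batch"}, 8080))     # False
```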
Real-World Zero-Trust Policy Patterns
Pattern 1: Allow only from the ingress controller
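ingress:
- from:
  # One entry with both selectors means AND: matching pods in the matching namespace.
  # Two separate entries would mean OR and allow far more than intended.
  - namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: ingress-nginx
    podSelector:
      matchLabels:
        app.kubernetes.io/name: ingress-nginx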
Pattern 2: Allow DNS egress only (block everything else)
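egress:
- to:
  # Single entry: kube-dns pods AND the kube-system namespace
  - namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: kube-system
    podSelector:
      matchLabels:
        k8s-app: kube-dns
  ports:
  - protocol: UDP
    port: 53
  - protocol: TCP # DNS falls back to TCP for large responses
    port: 53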
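Whether namespaceSelector and podSelector sit in one `from`/`to` entry or in two changes the meaning: a single entry combines them with AND, while separate entries are evaluated independently and combined with OR. The sketch below models that rule with hypothetical helper names and matchLabels-only selectors, so the difference can be seen on a concrete pod.

```python
def labels_match(selector, labels):
    """All matchLabels in the selector must be present on the object."""
    return all(labels.get(k) == v for k, v in selector.get("matchLabels", {}).items())


def peer_matches(peer, pod_labels, ns_labels, same_namespace):
    """Match one `from`/`to` entry against a pod and the namespace it lives in."""
    has_ns = "namespaceSelector" in peer
    has_pod = "podSelector" in peer
    if has_ns and has_pod:
        return (labels_match(peer["namespaceSelector"], ns_labels)
                and labels_match(peer["podSelector"], pod_labels))
    if has_ns:
        return labels_match(peer["namespaceSelector"], ns_labels)
    if has_pod:
        # A podSelector on its own only looks inside the policy's own namespace.
        return same_namespace and labels_match(peer["podSelector"], pod_labels)
    return False


def any_peer_matches(peers, pod_labels, ns_labels, same_namespace):
    return any(peer_matches(p, pod_labels, ns_labels, same_namespace) for p in peers)


ns_sel = {"matchLabels": {"kubernetes.io/metadata.name": "kube-system"}}
pod_sel = {"matchLabels": {"k8s-app": "kube-dns"}}
combined = [{"namespaceSelector": ns_sel, "podSelector": pod_sel}]   # AND
split = [{"namespaceSelector": ns_sel}, {"podSelector": pod_sel}]    # OR

# A pod in kube-system that is NOT kube-dns (illustrative labels).
other_pod = {"k8s-app": "metrics-server"}
kube_system = {"kubernetes.io/metadata.name": "kube-system"}
print(any_peer_matches(combined, other_pod, kube_system, same_namespace=False))  # False
print(any_peer_matches(split, other_pod, kube_system, same_namespace=False))     # True
```

With the split form, every pod in kube-system becomes a permitted peer, which is rarely what a DNS-only or ingress-controller-only rule is meant to allow.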
Pattern 3: Allow egress only to specific IP ranges
egress:
- to:
  - ipBlock:
      cidr: 10.0.0.0/8 # internal VPC only
      except:
      - 10.0.0.0/28 # except management subnet
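An ipBlock allows an address when it falls inside `cidr` and outside every range listed under `except`. That logic can be sketched with Python's standard ipaddress module to check which destinations a rule covers; `ip_block_allows` is a hypothetical helper name.

```python
import ipaddress


def ip_block_allows(ip, cidr, excepts=()):
    """Return True if `ip` is inside `cidr` and not inside any `except` range."""
    address = ipaddress.ip_address(ip)
    if address not in ipaddress.ip_network(cidr):
        return False
    return not any(address in ipaddress.ip_network(e) for e in excepts)


# Same ranges as the ipBlock example in this section.
print(ip_block_allows("10.1.2.3", "10.0.0.0/8", ["10.0.0.0/28"]))  # True
print(ip_block_allows("10.0.0.5", "10.0.0.0/8", ["10.0.0.0/28"]))  # False
```

The /28 exception covers 10.0.0.0 through 10.0.0.15, so 10.0.0.16 is allowed again.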
Cilium: Beyond Basic Network Policies
If your CNI is Cilium (increasingly common in 2026 for eBPF-based networking), you get Layer 7 policies — filtering by HTTP method, path, or DNS name:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: l7-policy
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/api/v1/.*"
Layer 7 policies let you declare: only GET requests to /api/v1/ endpoints are allowed — no POST, no DELETE, no /admin. A compromised frontend pod cannot abuse backend endpoints it should not access.
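To see which requests a rule like this covers, the match can be approximated offline. The sketch assumes that method and path are regular expressions that must match the whole value; that anchoring behavior is an assumption here, so verify it against the Cilium documentation for your version. `http_rule_allows` is a hypothetical helper.

```python
import re


def http_rule_allows(rule, method, path):
    """Return True if the request matches the rule's method and path patterns in full."""
    return (re.fullmatch(rule["method"], method) is not None
            and re.fullmatch(rule["path"], path) is not None)


rule = {"method": "GET", "path": "/api/v1/.*"}
print(http_rule_allows(rule, "GET", "/api/v1/users"))   # True
print(http_rule_allows(rule, "POST", "/api/v1/users"))  # False
```

Under this assumption a request to /api/v1 without the trailing slash is rejected as well, which is the kind of edge case worth testing against a real cluster.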
Testing Network Policies
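Use netshoot (a network debugging container) to verify connectivity. Run it in the namespace and with the labels of the workload you are imitating, because policies select pods by both:
kubectl run netshoot --rm -it --image nicolaka/netshoot \
  -n production --labels app=frontend -- /bin/bash
# From inside, test connectivity (the timeout keeps blocked requests from hanging):
curl --max-time 5 backend-service.production.svc.cluster.local:8080
# Test DNS:
nslookup kubernetes.default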
Always test both positive (traffic that should flow) and negative (traffic that should be blocked) cases. Network policies are easy to misconfigure — a single missing label selector can leave a hole wide open.
4. Image Security: Scan Every Container with Trivy
Running containers from untrusted or unverified images is a common entry point for supply-chain attacks. In 2024, a backdoor planted in xz-utils (CVE-2024-3094) was caught shortly before it reached stable releases of major Linux distributions and the container base images built from them. In 2025, multiple malicious NPM and PyPI packages were found embedded in popular Docker images.
The rule: every image that enters your cluster must be scanned. Trivy, by Aqua Security, is the de-facto open-source scanner — fast, comprehensive, and CI/CD-friendly. It scans OS packages, language dependencies, and misconfigurations in a single pass.
Scanning Images
trivy image nginx:1.25-alpine
# Filter by severity — only flag HIGH and CRITICAL
trivy image --severity HIGH,CRITICAL nginx:1.25-alpine
# Output as JSON for pipeline integration
trivy image --format json --output trivy-report.json myapp:latest
# Scan filesystem for IaC misconfigurations
trivy config ./kubernetes/
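The JSON report is convenient for custom gating or reporting. The sketch below assumes the report layout in which a top-level `Results` list holds per-target `Vulnerabilities` entries with `VulnerabilityID` and `Severity` fields; the function name and the sample findings (including the CVE IDs) are made up for illustration.

```python
import json


def blocking_findings(report_json, severities=("HIGH", "CRITICAL")):
    """Return (target, vulnerability ID, severity) for findings at the given severities."""
    report = json.loads(report_json)
    findings = []
    # Both keys can be absent or null when a target has no findings.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in severities:
                findings.append(
                    (result.get("Target"), vuln.get("VulnerabilityID"), vuln.get("Severity")))
    return findings


# Hypothetical report with placeholder vulnerability IDs.
sample_report = json.dumps({"Results": [
    {"Target": "myapp:latest (alpine)", "Vulnerabilities": [
        {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libexample", "Severity": "CRITICAL"},
        {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libother", "Severity": "LOW"},
    ]},
    {"Target": "app/requirements.txt", "Vulnerabilities": None},
]})
print(blocking_findings(sample_report))
# prints [('myapp:latest (alpine)', 'CVE-0000-0001', 'CRITICAL')]
```

A pipeline step could load trivy-report.json, call a function like this, and fail the build when the list is non-empty.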
Integrating Trivy into CI/CD
The most effective pattern: scan in your pipeline and block deployments for HIGH or CRITICAL findings.
GitHub Actions — Trivy scan step:
- name: Scan container image with Trivy
  # Pin to a release tag or commit SHA in real pipelines; @master is a moving target
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: myapp:${{ github.sha }}
    format: sarif
    output: trivy-results.sarif
    severity: HIGH,CRITICAL
    exit-code: '1'
GitLab CI — Trivy scan job:
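trivy-scan:
  stage: security
  image:
    name: aquasec/trivy:latest
    entrypoint: [""] # the image's entrypoint is trivy itself; clear it so GitLab can run the script
  script:
    - trivy image --severity HIGH,CRITICAL --exit-code 1 $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  allow_failure: false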
Trivy Operator: Continuous Cluster Scanning
For runtime scanning of images already deployed in your cluster, install the Trivy Operator:
helm repo add aqua https://aquasecurity.github.io/helm-charts/
helm install trivy-operator aqua/trivy-operator \
  --namespace trivy-system \
  --create-namespace \
  --set trivy.ignoreUnfixed=true
Query vulnerability reports across your namespaces:
kubectl get vulnerabilityreports -n production
kubectl describe vulnerabilityreport replicaset-myapp-7d4f8b9c6d-nginx -n production
Image Pinning and Digest-Based References
Never use floating tags like :latest in production. Pin to a content-addressable digest:
# BAD: floating tag, can change underneath you
image: nginx:latest
# GOOD: digest pinning, immutable reference (placeholder: substitute the real 64-character hex digest)
image: nginx@sha256:<64-character-hex-digest>
Extract the digest of an image you have already pulled:
docker inspect nginx:1.25-alpine | jq -r '.[0].RepoDigests[0]'
Combine digest pinning with automated PRs from Renovate or Dependabot to keep digests updated without manual toil.
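A small check can keep tag-only references out of manifests. The sketch below assumes sha256 digests, which are 64 lowercase hex characters, and uses a deliberately loose pattern for the repository part; `is_digest_pinned` is a hypothetical helper, not part of any tool mentioned in this guide.

```python
import re

# Anything without whitespace or '@', followed by '@sha256:' and exactly 64 hex characters.
DIGEST_REF = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image):
    """Return True if the image reference ends in a well-formed sha256 digest."""
    return DIGEST_REF.match(image) is not None


print(is_digest_pinned("nginx:latest"))                # False
print(is_digest_pinned("nginx@sha256:" + "a" * 64))    # True
```

A reference that carries both a tag and a digest also passes, since the digest is what the runtime resolves.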
Conclusion
Kubernetes security is a layered defense. Start with RBAC — if every pod runs as cluster-admin, nothing else matters. Then enforce Pod Security Standards to prevent containers from escaping their sandbox. Lock down the network with default-deny policies so a compromised pod can't pivot. Finally, scan every image with Trivy to catch vulnerabilities before they reach production.
Each of these four layers independently raises the bar. Together, they turn a wide-open cluster into a hardened environment where an attacker needs to defeat multiple defenses to cause real damage. The YAML in this guide is a starting point: copy it, adapt the names, namespaces, and labels to your environment, and test it in staging before you apply it to production.
Originally published at devtocash.com