Author: A senior DevOps engineer who’s broken (and fixed) too many clusters so you don’t have to.
Executive Summary
- Carve tenants with Namespaces to isolate RBAC, policies, and quotas per team.
- Enforce fair sharing with ResourceQuota + LimitRange so noisy neighbors can’t starve the cluster.
- Lock down traffic with NetworkPolicy: start with default-deny, then allow the minimum (DNS, app→DB).
- Harden workloads with Pod Security Admission (PSA) to block privileged/unsafe pod specs at the namespace boundary.
- Ship a repeatable baseline: two team namespaces, quotas/limits, default-deny + specific allows, PSA `restricted`.
Prereqs
- A Kubernetes cluster (≥ v1.28 recommended) and `kubectl` configured.
- A CNI that supports NetworkPolicy (Calico, Cilium, Antrea, etc.).
- Helm optional (not required here).
- cluster-admin access for initial setup.
kubectl version
kubectl get nodes -o wide
kubectl get pods -n kube-system
Concepts
1) Namespaces for isolation
Definition: Namespaces slice a cluster into logical tenants with independent RBAC, quotas, and policies. They prevent accidental cross-team impact, scope access, and make policy application simple (`kubectl label ns` once).
Best practices:
- One namespace per team/app-tier, not per environment object (use labels like `env=prod`).
- Name predictably: `team-a`, `team-b`, `platform`, etc.
- Attach PSA labels, default network policies, and quotas at namespace creation.
Commands:
kubectl create ns team-a
kubectl create ns team-b
kubectl label ns team-a env=dev --overwrite
kubectl get ns --show-labels
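If you keep manifests in Git rather than running imperative commands, the same namespace plus its labels can be declared up front — a sketch; the `env: dev` value and the single PSA label are examples, extend as needed:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    env: dev                                   # environment as a label, not a namespace
    pod-security.kubernetes.io/enforce: restricted   # PSA attached at creation time
```

Applying this with `kubectl apply -f` gives you the namespace, its environment label, and its security posture in one reviewable file.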
Before → After:
- Before: All pods in `default`, broad access, hard to apply policies.
- After: `team-a/` and `team-b/` with their own quotas, PSA, and network baselines.
When to use: Always—namespaces are table stakes for multi-tenancy.
2) ResourceQuota & LimitRange
Definition: `ResourceQuota` caps aggregate usage per namespace; `LimitRange` sets per-pod/container defaults and maxima. Together they stop runaway resource grabs and ensure every pod has sensible requests/limits for scheduling and stability.
Best practices:
- Pair them: RQ for team-level ceilings; LR for sane per-workload defaults.
- Set CPU/memory requests+limits and object counts (pods, PVCs, services).
- Include ephemeral-storage where supported; keep room for rollouts (e.g., 20–30% headroom).
Commands:
kubectl apply -n team-a -f quota-team-a.yaml
kubectl get resourcequota -n team-a
kubectl describe limitrange -n team-a
Before → After:
- Before: Pods without limits; one job consumes all CPU → others starve.
- After: Each container gets defaults; namespace can’t exceed its slice.
When to use: Whenever multiple teams share nodes or costs matter (i.e., always).
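To turn the 20–30% headroom advice into concrete quota numbers, here's a quick back-of-the-envelope helper — a sketch; the figures are illustrative inputs, not recommendations:

```shell
#!/usr/bin/env sh
# Size a namespace CPU quota from per-pod requests, expected pod count,
# and a headroom factor that leaves room for surge pods during rollouts.
PER_POD_MILLICPU=200   # per-container CPU request, in millicores
REPLICAS=30            # expected pod count at steady state
HEADROOM_PCT=25        # extra room for rolling updates (20-30% is typical)

# Integer math: total steady-state demand scaled up by the headroom factor.
QUOTA_MILLICPU=$(( PER_POD_MILLICPU * REPLICAS * (100 + HEADROOM_PCT) / 100 ))
echo "requests.cpu: ${QUOTA_MILLICPU}m"
```

Run it before writing `quota-<ns>.yaml`; with the example inputs it suggests a `requests.cpu` ceiling of 7500m.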
3) NetworkPolicy
Definition: NetworkPolicy declares which pods may talk to which, for ingress and egress. Prevents lateral movement, accidental chats between unrelated services, and data exfiltration.
Best practices:
- Start with default-deny for both ingress and egress.
- Then allow only what you need (DNS 53, app→DB 5432, etc.).
- Use labels consistently; avoid relying on IPs. Add namespaceSelector + podSelector for system services like CoreDNS.
Commands:
kubectl apply -n team-a -f np-default-deny.yaml
kubectl apply -n team-a -f np-allow-dns.yaml
kubectl apply -n team-a -f np-allow-app-to-db.yaml
Before → After:
- Before: Any pod can connect to any pod/Internet.
- After: Only DNS + app→db allowed; everything else dropped.
When to use: In any regulated or multi-tenant cluster; also in prod by default.
4) Pod Security Admission (PSA)
Definition: PSA enforces Kubernetes security profiles (`privileged`, `baseline`, `restricted`) via namespace labels. It blocks dangerous specs (privileged containers, hostPID/IPC, hostPath, capability escalation) before pods land on nodes.
Best practices:
- Default to `restricted` enforcement on app namespaces; use `baseline` temporarily while migrating.
- Apply `enforce`, `warn`, and `audit` labels during rollout to see breakage before enforcing.
- Keep images running as non-root; avoid broad capabilities and host volumes.
Commands:
kubectl label ns team-a \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/enforce-version=latest \
pod-security.kubernetes.io/warn=restricted \
pod-security.kubernetes.io/warn-version=latest \
pod-security.kubernetes.io/audit=restricted \
pod-security.kubernetes.io/audit-version=latest --overwrite
Before → After:
-
Before: Developers could deploy privileged pods or mount
/var/run/docker.sock
. - After: Such pods are rejected at admission with a clear error.
When to use: Immediately after namespace creation; dev/prod alike (stricter in prod).
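For reference, here is a pod spec fragment that passes the `restricted` profile — non-root, no privilege escalation, all capabilities dropped, default seccomp. A minimal sketch; the image is an example of one that runs unprivileged:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restricted-ok
spec:
  securityContext:
    runAsNonRoot: true                    # required by restricted
    seccompProfile: { type: RuntimeDefault }  # required by restricted
  containers:
  - name: app
    image: nginxinc/nginx-unprivileged:1.27-alpine
    securityContext:
      allowPrivilegeEscalation: false     # required by restricted
      capabilities: { drop: ["ALL"] }     # required by restricted
```

Anything missing one of these fields is rejected at admission with a message naming the offending field.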
Diagram 1 — Namespace/Tenant Layout
Mini-Lab (≈25 min): Two tenants, quotas/limits, default-deny, app→db allow, PSA restricted
You can paste these as-is; tweak CPU/memory as needed.
Create namespaces + PSA labels
kubectl create ns team-a
kubectl create ns team-b
for ns in team-a team-b; do
kubectl label ns $ns \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/enforce-version=latest \
pod-security.kubernetes.io/warn=restricted \
pod-security.kubernetes.io/warn-version=latest \
pod-security.kubernetes.io/audit=restricted \
pod-security.kubernetes.io/audit-version=latest --overwrite
done
Apply ResourceQuota + LimitRange
quota-team-a.yaml
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
spec:
  hard:
    pods: "30"
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    persistentvolumeclaims: "10"
    services: "10"
    services.loadbalancers: "2"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-defaults
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: "200m"
      memory: "256Mi"
    default:
      cpu: "500m"
      memory: "512Mi"
    max:
      cpu: "2"
      memory: "2Gi"
```
Apply to both namespaces:
kubectl apply -n team-a -f quota-team-a.yaml
kubectl apply -n team-b -f quota-team-a.yaml
Deploy a tiny app and a “db” in team-a
app-db.yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: db
  labels: { app: db }
spec:
  replicas: 1
  selector: { matchLabels: { app: db } }
  template:
    metadata: { labels: { app: db } }
    spec:
      # Required so the pod passes PSA restricted (enforced above).
      securityContext:
        runAsNonRoot: true
        runAsUser: 70          # postgres uid in the alpine image
        seccompProfile: { type: RuntimeDefault }
      containers:
      - name: db
        image: postgres:16-alpine
        securityContext:
          allowPrivilegeEscalation: false
          capabilities: { drop: ["ALL"] }
        env:
        - { name: POSTGRES_PASSWORD, value: dev }
        ports:
        - { containerPort: 5432, name: pg }
---
apiVersion: v1
kind: Service
metadata:
  name: db
  labels: { app: db }
spec:
  selector: { app: db }
  ports:
  - { port: 5432, targetPort: pg, protocol: TCP }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels: { app: web }
spec:
  replicas: 1
  selector: { matchLabels: { app: web } }
  template:
    metadata: { labels: { app: web } }
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 100         # curl_user uid in curlimages/curl
        seccompProfile: { type: RuntimeDefault }
      containers:
      - name: web
        image: curlimages/curl:8.10.1
        command: ["sleep","infinity"]
        securityContext:
          allowPrivilegeEscalation: false
          capabilities: { drop: ["ALL"] }
```
kubectl apply -n team-a -f app-db.yaml
kubectl wait -n team-a --for=condition=available deploy/web deploy/db --timeout=90s
Smoke test (pre-policy, should connect):
POD=$(kubectl -n team-a get pod -l app=web -o jsonpath='{.items[0].metadata.name}')
kubectl -n team-a exec -it "$POD" -- sh -lc 'curl -m2 db:5432 || true'
# Expect: curl connects (an immediate protocol error or banner, not a timeout) — traffic is not blocked
Enforce default-deny (ingress+egress) in team-a
np-default-deny.yaml
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}   # select all pods
  policyTypes:
  - Ingress
  - Egress
```
kubectl apply -n team-a -f np-default-deny.yaml
Test (should now fail):
kubectl -n team-a exec -it "$POD" -- sh -lc 'curl -m2 db:5432 || echo BLOCKED'
# Expect: BLOCKED / timeout
Allow only DNS + app→db
np-allow-dns.yaml
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector: {}   # all pods may resolve DNS
  policyTypes: [ Egress ]
  egress:
  - to:
    - namespaceSelector:
        matchLabels: { kubernetes.io/metadata.name: kube-system }
      podSelector:
        matchLabels: { k8s-app: kube-dns }
    ports:
    - { protocol: UDP, port: 53 }
    - { protocol: TCP, port: 53 }
```
np-db-ingress-from-web.yaml
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-from-web
spec:
  podSelector:
    matchLabels: { app: db }
  policyTypes: [ Ingress ]
  ingress:
  - from:
    - podSelector:
        matchLabels: { app: web }
    ports:
    - { protocol: TCP, port: 5432 }
```
np-web-egress-to-db.yaml
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-allow-to-db
spec:
  podSelector:
    matchLabels: { app: web }
  policyTypes: [ Egress ]
  egress:
  - to:
    - podSelector:
        matchLabels: { app: db }
    ports:
    - { protocol: TCP, port: 5432 }
```
kubectl apply -n team-a -f np-allow-dns.yaml -f np-db-ingress-from-web.yaml -f np-web-egress-to-db.yaml
kubectl -n team-a exec -it "$POD" -- sh -lc 'curl -m2 db:5432 || true'
# Expect: succeeds (policy allows only web→db and DNS egress)
Repeat the same baseline (quotas/limits, default-deny, PSA labels) for `team-b`.
Diagram 2 — NetworkPolicy Traffic Matrix
YAML & Bash Reference
PSA labels (namespace):
kubectl label ns team-a pod-security.kubernetes.io/enforce=restricted --overwrite
kubectl label ns team-a pod-security.kubernetes.io/{enforce,warn,audit}-version=latest --overwrite
ResourceQuota & LimitRange:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata: { name: <ns>-quota }
spec:
  hard:
    pods: "30"
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
---
apiVersion: v1
kind: LimitRange
metadata: { name: <ns>-defaults }
spec:
  limits:
  - type: Container
    defaultRequest: { cpu: "200m", memory: "256Mi" }
    default: { cpu: "500m", memory: "512Mi" }
    max: { cpu: "2", memory: "2Gi" }
```
Default-deny + allows (template): see `np-default-deny.yaml`, `np-allow-dns.yaml`, `np-db-ingress-from-web.yaml`, and `np-web-egress-to-db.yaml` above.
Namespace-scoped context (handy):
kubectl config set-context --current --namespace=team-a
# or create a named context once, reusing the current cluster and user:
kubectl config set-context team-a --namespace=team-a \
  --cluster="$(kubectl config view --minify -o jsonpath='{.clusters[0].name}')" \
  --user="$(kubectl config view --minify -o jsonpath='{.users[0].name}')"
kubectl config use-context team-a
Cheatsheet Table
| Task | Command / File | Notes |
|---|---|---|
| Create namespace | `kubectl create ns <name>` | Add labels (`env=prod`, PSA) immediately. |
| Label PSA restricted | `kubectl label ns <name> pod-security.kubernetes.io/enforce=restricted --overwrite` | Add `warn`/`audit` to preview breaks. |
| Apply quotas/limits | `kubectl apply -n <ns> -f quota-<ns>.yaml` | Pair `ResourceQuota` with `LimitRange`. |
| Check quota usage | `kubectl describe resourcequota -n <ns>` | Watch `Used` vs `Hard`. |
| Default-deny policy | `kubectl apply -n <ns> -f np-default-deny.yaml` | Deny both Ingress and Egress. |
| Allow DNS | `kubectl apply -n <ns> -f np-allow-dns.yaml` | Needed for service discovery. |
| Allow app→db | `kubectl apply -n <ns> -f np-allow-app-to-db.yaml` | Pair ingress on db with egress on app. |
| Switch namespace | `kubectl config set-context --current --namespace=<ns>` | Keeps commands short and safe. |
| Dry-run a spec | `kubectl apply -f x.yaml --dry-run=server` | Admission & schema checks without deploying. |
Pitfalls & Recovery
**“Policies don’t seem to apply” (order confusion).**
Symptom: You created allows but traffic is still blocked.
Why: Policies are additive; any default-deny remains in effect unless an allow matches both sides (ingress on the target plus egress on the source, if you deny both).
Fix: Ensure an egress allow on the source and an ingress allow on the destination.

**DNS broke after default-deny egress.**
Symptom: `curl db` stalls; apps can’t resolve service names.
Fix: Add `np-allow-dns.yaml` allowing egress to `kube-system`/`k8s-app=kube-dns` on TCP/UDP 53.

**Pods rejected after enabling PSA restricted.**
Symptom: `Error from server (Forbidden)` with fields like `runAsNonRoot`.
Fix: Adjust the workload security context (non-root, drop capabilities, no hostPath). Temporarily set `enforce=baseline` and `warn=restricted` to migrate, then flip back.

**Quota exceeded during rollout.**
Symptom: The new ReplicaSet can’t scale (`exceeded quota`).
Fix: Increase the `pods`/`requests.cpu`/`requests.memory` quota, or lower `replicas`. Keep rollout headroom (20–30%).

**No limits → eviction during pressure.**
Symptom: Pods are OOMKilled or preempted under load.
Fix: Set defaults via `LimitRange`; ensure requests approximate real usage (watch `kubectl top pod`).

**East-west traffic across namespaces unexpectedly blocked.**
Symptom: Cross-namespace calls time out.
Fix: Add `namespaceSelector` + `podSelector` in allow rules, or route via ingresses with clear policies.
Wrap-Up (What this unlocks for reliability—Post 2 teaser)
With Namespaces, Quotas/LimitRanges, NetworkPolicies, and PSA in place, you’ve built a security and fairness floor: teams can’t trample each other, pods can’t talk without permission, and unsafe specs are stopped at the door.
In Post 2, we’ll layer observability SLOs, PodDisruptionBudgets, health/readiness gates, and autoscaling on top of this baseline to keep releases smooth and reliability measurable.
Appendix: “Before → After” Quick Contrasts
**Networking:**
Before: `web` can reach everything → lateral risk.
After: Only `web → db` + DNS; all else denied.

**Resources:**
Before: No limits; one job spikes → node thrash.
After: Default requests/limits; quotas cap team usage.

**Security:**
Before: Privileged pods slip through.
After: PSA `restricted` blocks them with actionable errors.
Got questions or want a tailored baseline for your stack? Drop them in, and I’ll fold them into a separate post.