DEV Community

cloud-sky-ops

Post 1/10 — Multi-Tenancy & Security Baseline with Namespaces, Quotas, NetworkPolicies, and Pod Security Admission

Author: A senior DevOps engineer who’s broken (and fixed) too many clusters so you don’t have to.


Executive Summary

  • Carve tenants with Namespaces to isolate RBAC, policies, and quotas per team.
  • Enforce fair sharing with ResourceQuota + LimitRange so noisy neighbors can’t starve the cluster.
  • Lock down traffic with NetworkPolicy: start with default-deny, then allow the minimum (DNS, app→DB).
  • Harden workloads with Pod Security Admission (PSA) to block privileged/unsafe pod specs at the namespace boundary.
  • Ship a repeatable baseline: two team namespaces, quotas/limits, default-deny + specific allows, PSA restricted.

Prereqs

  • A Kubernetes cluster (≥ v1.28 recommended) and kubectl configured.
  • Cluster has CNI that supports NetworkPolicy (Calico, Cilium, Antrea, etc.).
  • Helm optional (not required here).
  • You have cluster-admin for initial setup.
kubectl version   # --short was removed in recent kubectl releases
kubectl get nodes -o wide
kubectl get pods -n kube-system

Concepts

1) Namespaces for isolation

Definition: Namespaces slice a cluster into logical tenants with independent RBAC, quotas, and policies. They prevent accidental cross-team impact, scope access, and make policy application simple (label a namespace once and everything keyed to that label follows).
Best practices:

  • One namespace per team/app tier rather than one per environment; track environment with labels (env=prod).
  • Name predictably: team-a, team-b, platform, etc.
  • Attach PSA labels, default network policies, and quotas at namespace creation.

Commands:

kubectl create ns team-a
kubectl create ns team-b
kubectl label ns team-a env=dev --overwrite
kubectl get ns --show-labels

Before → After:

  • Before: All pods in default, broad access, hard to apply policies.
  • After: team-a/, team-b/ with their own quotas, PSA, and network baselines.

When to use: Always—namespaces are table stakes for multi-tenancy.
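If you manage namespaces declaratively (e.g., via GitOps), the same baseline can be attached at creation time. A sketch of such a manifest; the name and labels are illustrative:

```yaml
# ns-team-a.yaml — namespace with env + PSA labels applied at creation
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    env: dev
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

Apply with kubectl apply -f ns-team-a.yaml; this avoids the create-then-label window in which unlabeled pods could slip in.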


2) ResourceQuota & LimitRange

Definition: ResourceQuota caps aggregate usage per namespace; LimitRange sets per-pod/container defaults and maxima. Together they stop runaway resource grabs and ensure every pod has sensible requests/limits for scheduling and stability.

Best practices:

  • Pair them: RQ for team-level ceilings; LR for sane per-workload defaults.
  • Set CPU/memory requests+limits and object counts (pods, PVCs, services).
  • Include ephemeral-storage where supported; keep room for rollouts (e.g., 20–30% headroom).
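As a quick sanity check on the headroom rule, you can derive quota ceilings from your steady-state requests. A sketch with illustrative numbers (25% headroom picked from the 20–30% range above):

```shell
#!/bin/sh
# Sketch: size ResourceQuota ceilings from steady-state requests
# with ~25% rollout headroom. Input numbers are illustrative.
steady_cpu_m=6000     # 6 cores requested at steady state (millicores)
steady_mem_mi=12288   # 12Gi requested at steady state (Mi)

quota_cpu_m=$(( steady_cpu_m * 125 / 100 ))
quota_mem_mi=$(( steady_mem_mi * 125 / 100 ))

echo "requests.cpu: ${quota_cpu_m}m"      # 7500m
echo "requests.memory: ${quota_mem_mi}Mi" # 15360Mi
```

Rounding up to clean numbers (8 CPU, 16Gi) gives the quota used later in the lab.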

Commands:

kubectl apply -n team-a -f quota-team-a.yaml
kubectl get resourcequota -n team-a
kubectl describe limitrange -n team-a

Before → After:

  • Before: Pods without limits; one job consumes all CPU → others starve.
  • After: Each container gets defaults; namespace can’t exceed its slice.

When to use: Whenever multiple teams share nodes or costs matter (i.e., always).


3) NetworkPolicy

Definition: NetworkPolicy declares which pods may talk to which, for ingress and egress. It prevents lateral movement, accidental traffic between unrelated services, and data exfiltration.

Best practices:

  • Start with default-deny for both ingress and egress.
  • Then allow only what you need (DNS 53, app→DB 5432, etc.).
  • Use labels consistently; avoid relying on IPs. Add namespaceSelector + podSelector for system services like CoreDNS.

Commands:

kubectl apply -n team-a -f np-default-deny.yaml
kubectl apply -n team-a -f np-allow-dns.yaml
kubectl apply -n team-a -f np-allow-app-to-db.yaml

Before → After:

  • Before: Any pod can connect to any pod/Internet.
  • After: Only DNS + app→db allowed; everything else dropped.

When to use: In any regulated or multi-tenant cluster; also in prod by default.


4) Pod Security Admission (PSA)

Definition: PSA enforces Kubernetes security profiles (privileged, baseline, restricted) via namespace labels. It blocks dangerous specs (privileged mode, hostPID/hostIPC, hostPath mounts, capability escalation) before pods land on nodes.

Best practices:

  • Default restricted enforce on app namespaces; use baseline temporarily while migrating.
  • Apply enforce, warn, and audit labels during rollout to see breaks before enforcing.
  • Keep images running as non-root; avoid broad capabilities and host volumes.

Commands:

kubectl label ns team-a \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=latest \
  pod-security.kubernetes.io/warn=restricted \
  pod-security.kubernetes.io/warn-version=latest \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/audit-version=latest --overwrite

Before → After:

  • Before: Developers could deploy privileged pods or mount /var/run/docker.sock.
  • After: Such pods are rejected at admission with a clear error.

When to use: Immediately after namespace creation; dev/prod alike (stricter in prod).
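For reference, a container spec that passes the restricted profile typically needs the fields below. This is a sketch; the uid and image are illustrative, so use whatever non-root user your image actually supports:

```yaml
# Fragment: the fields the restricted profile checks, per pod/container
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001            # any non-root uid your image supports (illustrative)
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: example/app:1.0      # illustrative image
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```

Pods missing any of these (non-root, seccomp, no privilege escalation, all capabilities dropped) are rejected at admission in a restricted namespace.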


Diagram 1 — Namespace/Tenant Layout



Mini-Lab (≈25 min): Two tenants, quotas/limits, default-deny, app→db allow, PSA restricted

You can paste these as-is; tweak CPU/memory as needed.

Create namespaces + PSA labels

kubectl create ns team-a
kubectl create ns team-b

for ns in team-a team-b; do
  kubectl label ns $ns \
    pod-security.kubernetes.io/enforce=restricted \
    pod-security.kubernetes.io/enforce-version=latest \
    pod-security.kubernetes.io/warn=restricted \
    pod-security.kubernetes.io/warn-version=latest \
    pod-security.kubernetes.io/audit=restricted \
    pod-security.kubernetes.io/audit-version=latest --overwrite
done

Apply ResourceQuota + LimitRange

quota-team-a.yaml

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
spec:
  hard:
    pods: "30"
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    persistentvolumeclaims: "10"
    services: "10"
    services.loadbalancers: "2"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-defaults
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: "200m"
      memory: "256Mi"
    default:
      cpu: "500m"
      memory: "512Mi"
    max:
      cpu: "2"
      memory: "2Gi"

Apply to both namespaces:

kubectl apply -n team-a -f quota-team-a.yaml
kubectl apply -n team-b -f quota-team-a.yaml
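Note that reusing quota-team-a.yaml in team-b keeps the team-a-* object names. That's harmless (names are namespace-scoped), but if you want matching names, a quick rename works; this sketch assumes the lab's quota-team-a.yaml is in the current directory:

```shell
# Derive a team-b quota manifest with matching object names.
sed 's/team-a/team-b/g' quota-team-a.yaml > quota-team-b.yaml
```

Then apply it with kubectl apply -n team-b -f quota-team-b.yaml.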

Deploy a tiny app and a “db” in team-a

app-db.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: db
  labels: { app: db }
spec:
  replicas: 1
  selector: { matchLabels: { app: db } }
  template:
    metadata: { labels: { app: db } }
    spec:
      # Required by PSA restricted: non-root + default seccomp profile
      securityContext:
        runAsNonRoot: true
        runAsUser: 70          # postgres uid in the alpine image
        seccompProfile: { type: RuntimeDefault }
      containers:
      - name: db
        image: postgres:16-alpine
        securityContext:
          allowPrivilegeEscalation: false
          capabilities: { drop: ["ALL"] }
        env:
        - { name: POSTGRES_PASSWORD, value: dev }
        ports:
        - { containerPort: 5432, name: pg }
---
apiVersion: v1
kind: Service
metadata:
  name: db
  labels: { app: db }
spec:
  selector: { app: db }
  ports:
  - { port: 5432, targetPort: pg, protocol: TCP }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels: { app: web }
spec:
  replicas: 1
  selector: { matchLabels: { app: web } }
  template:
    metadata: { labels: { app: web } }
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 100         # curl_user uid in curlimages/curl
        seccompProfile: { type: RuntimeDefault }
      containers:
      - name: web
        image: curlimages/curl:8.10.1
        command: ["sleep","infinity"]
        securityContext:
          allowPrivilegeEscalation: false
          capabilities: { drop: ["ALL"] }
kubectl apply -n team-a -f app-db.yaml
kubectl wait -n team-a --for=condition=available deploy/web deploy/db --timeout=90s

Smoke test (pre-policy, should connect):

POD=$(kubectl -n team-a get pod -l app=web -o jsonpath='{.items[0].metadata.name}')
kubectl -n team-a exec -it "$POD" -- sh -lc 'curl -m2 db:5432 || true'
# Expect: a fast protocol-level error from curl (Postgres isn't HTTP) — the TCP connection itself succeeds; nothing is blocked

Enforce default-deny (ingress+egress) in team-a

np-default-deny.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}        # select all pods
  policyTypes:
  - Ingress
  - Egress
kubectl apply -n team-a -f np-default-deny.yaml

Test (should now fail):

kubectl -n team-a exec -it "$POD" -- sh -lc 'curl -m2 db:5432 || echo BLOCKED'
# Expect: BLOCKED / timeout

Allow only DNS + app→db

np-allow-dns.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector: { }      # all pods may resolve DNS
  policyTypes: [ Egress ]
  egress:
  - to:
    - namespaceSelector:
        matchLabels: { kubernetes.io/metadata.name: kube-system }
      podSelector:
        matchLabels: { k8s-app: kube-dns }
    ports:
    - { protocol: UDP, port: 53 }
    - { protocol: TCP, port: 53 }

np-db-ingress-from-web.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-from-web
spec:
  podSelector:
    matchLabels: { app: db }
  policyTypes: [ Ingress ]
  ingress:
  - from:
    - podSelector:
        matchLabels: { app: web }
    ports:
    - { protocol: TCP, port: 5432 }

np-web-egress-to-db.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-allow-to-db
spec:
  podSelector:
    matchLabels: { app: web }
  policyTypes: [ Egress ]
  egress:
  - to:
    - podSelector:
        matchLabels: { app: db }
    ports:
    - { protocol: TCP, port: 5432 }
kubectl apply -n team-a -f np-allow-dns.yaml -f np-db-ingress-from-web.yaml -f np-web-egress-to-db.yaml
kubectl -n team-a exec -it "$POD" -- sh -lc 'curl -m2 db:5432 || true'
# Expect: connects again (curl's protocol error is fine); only web→db plus DNS egress are allowed

Repeat the same baseline (quotas/limits, default-deny, PSA labels) for team-b.


Diagram 2 — NetworkPolicy Traffic Matrix



YAML & Bash Reference

PSA labels (namespace):

kubectl label ns team-a pod-security.kubernetes.io/enforce=restricted --overwrite
kubectl label ns team-a pod-security.kubernetes.io/warn=restricted pod-security.kubernetes.io/audit=restricted --overwrite
kubectl label ns team-a pod-security.kubernetes.io/{enforce,warn,audit}-version=latest --overwrite  # bash brace expansion; no spaces inside the braces

ResourceQuota & LimitRange:

apiVersion: v1
kind: ResourceQuota
metadata: { name: <ns>-quota }
spec:
  hard:
    pods: "30"
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
---
apiVersion: v1
kind: LimitRange
metadata: { name: <ns>-defaults }
spec:
  limits:
  - type: Container
    defaultRequest: { cpu: "200m", memory: "256Mi" }
    default:        { cpu: "500m", memory: "512Mi" }
    max:            { cpu: "2",    memory: "2Gi" }

Default-deny + allows (template): see np-default-deny.yaml, np-allow-dns.yaml, np-db-ingress-from-web.yaml, np-web-egress-to-db.yaml above.

Namespace-scoped context (handy):

kubectl config set-context --current --namespace=team-a
# or create a named context once (look up the current cluster and user; they are not the context name):
CTX=$(kubectl config current-context)
CLUSTER=$(kubectl config view -o jsonpath="{.contexts[?(@.name==\"$CTX\")].context.cluster}")
USER_NAME=$(kubectl config view -o jsonpath="{.contexts[?(@.name==\"$CTX\")].context.user}")
kubectl config set-context team-a --cluster="$CLUSTER" --user="$USER_NAME" --namespace=team-a
kubectl config use-context team-a

Cheatsheet Table

  • Create namespace: kubectl create ns <name>. Add labels (env=prod, PSA) immediately.
  • Label PSA restricted: kubectl label ns <name> pod-security.kubernetes.io/enforce=restricted --overwrite. Add warn/audit to preview breaks.
  • Apply quotas/limits: kubectl apply -n <ns> -f quota-<ns>.yaml. Pair ResourceQuota with LimitRange.
  • Check quota usage: kubectl describe resourcequota -n <ns>. Watch Used vs Hard.
  • Default-deny policy: kubectl apply -n <ns> -f np-default-deny.yaml. Denies both Ingress and Egress.
  • Allow DNS: kubectl apply -n <ns> -f np-allow-dns.yaml. Needed for service discovery.
  • Allow app→db: kubectl apply -n <ns> -f np-allow-app-to-db.yaml. Pair ingress on db with egress on app.
  • Switch namespace: kubectl config set-context --current --namespace=<ns>. Keeps commands short and safe.
  • Dry-run a spec: kubectl apply -f x.yaml --dry-run=server. Admission & schema checks without deploying.

Pitfalls & Recovery

  • “Policies don’t seem to apply” (order confusion).
    Symptom: You created allows but traffic still blocked.
    Why: Policies are additive; any default-deny remains in effect unless an allow matches both sides (ingress on target + egress on source if you deny both).
    Fix: Ensure you have egress allow on source and ingress allow on destination.

  • DNS broke after default-deny egress.
    Symptom: curl db hangs; apps can’t resolve service names.
    Fix: Add np-allow-dns.yaml allowing egress to kube-system/k8s-app=kube-dns on TCP/UDP 53.

  • Pods rejected after enabling PSA restricted.
    Symptom: Error from server (Forbidden) with fields like runAsNonRoot.
    Fix: Adjust workload security context (non-root, drop caps, no hostPath). Temporarily set enforce=baseline and warn=restricted to migrate, then flip back.

  • Quota exceeded during rollout.
    Symptom: New ReplicaSet can’t scale (exceeded quota).
    Fix: Increase pods/requests.cpu/memory quota, or lower replicas. Keep rollout headroom (20–30%).

  • No limits → eviction during pressure.
    Symptom: Pods OOMKilled or preempted under load.
    Fix: Set defaults via LimitRange; ensure requests approximate real usage (watch kubectl top pod).

  • East-west traffic across namespaces unexpectedly blocked.
    Symptom: Cross-ns calls time out.
    Fix: Add namespaceSelector + podSelector in allow rules, or route via ingresses with clear policies.
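For the last pitfall, a cross-namespace allow might look like the sketch below (labels are illustrative). Note that namespaceSelector and podSelector sit in the same `from` entry, so both must match the same peer:

```yaml
# Apply in team-a: allow ingress to db only from team-b pods labeled app=web
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-from-team-b-web
spec:
  podSelector:
    matchLabels: { app: db }
  policyTypes: [ Ingress ]
  ingress:
  - from:
    - namespaceSelector:
        matchLabels: { kubernetes.io/metadata.name: team-b }
      podSelector:
        matchLabels: { app: web }
    ports:
    - { protocol: TCP, port: 5432 }
```

If team-b also denies egress by default, its web pods additionally need a matching egress allow toward team-a.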


Wrap-Up (What this unlocks for reliability—Post 2 teaser)

With Namespaces, Quotas/LimitRanges, NetworkPolicies, and PSA in place, you’ve built a security and fairness floor: teams can’t trample each other, pods can’t talk without permission, and unsafe specs are stopped at the door.

In Post 2, we’ll layer observability SLOs, PodDisruptionBudgets, health/readiness gates, and autoscaling on top of this baseline to keep releases smooth and reliability measurable.


Appendix: “Before → After” Quick Contrasts

  • Networking:
    Before: web can reach everything → lateral risk.
    After: Only web → db + DNS; all else denied.

  • Resources:
    Before: No limits; one job spikes → node thrash.
    After: Default requests/limits; quotas cap team usage.

  • Security:
    Before: Privileged pods slip through.
    After: PSA restricted blocks them with actionable errors.


Got questions or want a tailored baseline for your stack? Drop them in, and I’ll fold them into a separate post.
