Noah Makau

Posted on May 22

The Disk-Pressure Incident That Taught Me to Always Set LimitRanges and Other Lessons from Mirroring EKS Locally.

#kubernetes #platformengineering #tutorial #eks

Part 6 of 7 — The Mac Kubernetes Lab: A Production-Mirror Setup from Scratch.

Previously in Part 5: We installed Istio with revision-based upgrades, MetalLB for LoadBalancer IPs, and practised traffic management with Gateways, VirtualServices, and fault injection. The cluster behaves. Now we wire up the last three pieces that turn it from “a working local cluster” into “a real mirror of our production EKS.”

The cluster works. Istio is running. MetalLB is handing out IPs. But it’s still missing three layers that make the production parity actually meaningful:

Vault Kubernetes auth: pods authenticate to Vault with their service account tokens, the same way they do in production. No hardcoded secrets, no static credentials.
Crossplane with the AWS provider: infrastructure compositions you can develop and test locally before they touch real AWS resources (or resources on any other provider; OpenStack, anyone?).
LimitRanges: default resource requests on every namespace. This one comes from a real incident I want to talk about.

The LimitRange story is the most important of the three, so I’ll tell it properly when we get there. First, the auth layer.

Vault Kubernetes auth.

Vault’s Kubernetes auth method lets pods authenticate by presenting their service account JWT. Vault validates the token against the Kubernetes API server and exchanges it for a Vault token with the appropriate policies attached.

On the production EKS clusters at work, this is how microservices retrieve database credentials, API keys, and TLS certificates: no hard-coded secrets, no secret sprawl, every issuance audit-logged in Vault.

Setting it up locally means I can test the full injection workflow without a VPN, and debug failures on a cluster where the stakes are zero.

Installing the Vault agent injector.

We deploy just the Vault agent injector in the lab cluster. It points to the external Vault VM rather than running its own Vault server:

# 💻 Mac
kubectx lab-cluster

helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update

# Get the vault VM IP - does not persist across sessions
export VAULT_IP=$(orb run -m vault hostname -I | awk '{print $1}')
echo "VAULT_IP=$VAULT_IP"
helm install vault hashicorp/vault \
  --namespace vault --create-namespace \
  --set "injector.externalVaultAddr=http://$VAULT_IP:8200"

kubectl get pods -n vault
# vault-agent-injector-xxx   1/1   Running   0   30s

Configuring K8s auth on Vault.

Run this on the vault VM, pointing Vault at the lab cluster’s API server:

# 🖥️ VM: vault

# Re-export - always required, doesn't persist across sessions
export VAULT_ADDR='http://127.0.0.1:8200'
export VAULT_ROOT_TOKEN=$(grep 'Initial Root Token' ~/vault-init.txt | awk '{print $NF}')

# If Vault is sealed after a reboot:
# vault operator unseal $(grep 'Unseal Key 1' ~/vault-init.txt | awk '{print $NF}')
vault login $VAULT_ROOT_TOKEN

# Get CP_IP from the Mac terminal: orb run -m cp01 hostname -I | awk '{print $1}'
export CP_IP=<cp01-ip>

# Regenerate the CA cert if /tmp was cleared after reboot
vault read -field=certificate pki_k8s/issuer/default > /tmp/lab-ca.crt

# Enable Kubernetes auth (safe to re-run - ignores "already enabled")
vault auth enable -path=lab-k8s kubernetes 2>/dev/null || echo "already enabled"

# Configure - point Vault at the lab cluster API server
vault write auth/lab-k8s/config \
  kubernetes_host="https://$CP_IP:6443" \
  kubernetes_ca_cert=@/tmp/lab-ca.crt

vault read auth/lab-k8s/config
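
The configuration above sets no token_reviewer_jwt. Assuming Vault then falls back to validating each login with the client's own JWT (its documented behaviour when it runs outside the cluster), the service account that logs in needs permission to call the TokenReview API. If logins come back with a permission error, a binding to the built-in system:auth-delegator ClusterRole is a likely fix. A minimal sketch; the function name and the binding name are placeholders of mine, not something from the production setup:

```shell
# 💻 Mac
# Print a ClusterRoleBinding that lets one service account call the TokenReview API.
# Assumption: Vault reviews tokens with the client's own JWT because no
# token_reviewer_jwt was configured. The binding name is a placeholder.
render_auth_delegator_binding() {
  sa="$1"   # service account name
  ns="$2"   # namespace the service account lives in
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vault-tokenreview-${ns}-${sa}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: ${sa}
  namespace: ${ns}
EOF
}

# Usage, for the myapp service account in the default namespace:
#   render_auth_delegator_binding myapp default | kubectl apply -f -
```

The trade-off of this mode is that every service account that authenticates needs its own binding, in exchange for Vault not storing a long-lived reviewer token.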

Testing K8s auth.

Create a simple role and test it from a pod:

# 🖥️ VM: vault

# Create a policy
vault policy write read-secrets - <<EOF
path "secret/data/myapp/*" {
  capabilities = ["read"]
}
EOF
# Create a K8s auth role
vault write auth/lab-k8s/role/myapp \
  bound_service_account_names=myapp \
  bound_service_account_namespaces=default \
  policies=read-secrets \
  ttl=1h

# Write a test secret
vault secrets enable -path=secret kv-v2 2>/dev/null || true
vault kv put secret/myapp/config db_password="supersecret"

# 💻 Mac — deploy a pod with Vault annotations
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp
  namespace: default
---
apiVersion: v1
kind: Pod
metadata:
  name: vault-test
  namespace: default
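  annotations:
    vault.hashicorp.com/agent-inject: "true"
    # The auth method is mounted at lab-k8s, not the injector's default auth/kubernetes
    vault.hashicorp.com/auth-path: "auth/lab-k8s"
    vault.hashicorp.com/role: "myapp"
    vault.hashicorp.com/agent-inject-secret-config: "secret/data/myapp/config"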
spec:
  serviceAccountName: myapp
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
EOF

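# Check the secret was injected (give the agent init container a few seconds first)
kubectl exec vault-test -c app -- cat /vault/secrets/config
# With the default template, a KV v2 secret renders as its raw data and metadata maps:
# data: map[db_password:supersecret]
# metadata: map[...]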

If that last command prints the secret, the whole chain works: service account JWT → Vault validation → Vault token → secret retrieval → file injection. Every link in that chain is one a real production app goes through.
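
When the injection fails, it can help to walk the first links of that chain by hand. The sketch below decodes the payload of a service account JWT so you can see the identity that has to line up with bound_service_account_names and bound_service_account_namespaces on the role. The helper name jwt_claims is my own, and the commands in the trailing comments assume kubectl 1.24 or newer, where kubectl create token is available:

```shell
# Decode the claims (middle segment) of a JWT passed as the first argument.
jwt_claims() {
  # Extract the payload and convert base64url characters to standard base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding strips.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 --decode
}

# Usage, step by step:
#   💻 Mac:
#     JWT=$(kubectl create token myapp -n default)
#     jwt_claims "$JWT"   # look for "sub":"system:serviceaccount:default:myapp"
#   🖥️ VM: vault
#     vault write auth/lab-k8s/login role=myapp jwt="<paste the JWT here>"
```

If the manual login succeeds but the injected pod still fails, the problem is more likely in the pod annotations than in the Vault role.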

Crossplane.

Crossplane turns a Kubernetes cluster into a universal control plane for cloud infrastructure. Instead of Terraform modules or CloudFormation stacks, you define infrastructure as Kubernetes custom resources, and Crossplane reconciles them continuously.

I use it at work to provision AWS resources (EKS node groups, RDS, S3 buckets, IAM roles) and VMware Cloud Director resources through a custom provider. The lab version mirrors the AWS side of that.

Installation:

# 💻 Mac
helm repo add crossplane-stable https://charts.crossplane.io/stable
helm repo update

# Composition Functions are enabled by default in recent versions.
# The --enable-composition-functions flag was removed.
helm install crossplane crossplane-stable/crossplane \
  --namespace crossplane-system --create-namespace

kubectl get pods -n crossplane-system -w
# NAME                                       READY   STATUS    AGE
# crossplane-xxx                             1/1     Running   60s
# crossplane-rbac-manager-xxx               1/1     Running   60s

Installing the AWS provider.

# 💻 Mac
kubectl apply -f - <<EOF
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws-ec2
spec:
  package: xpkg.upbound.io/upbound/provider-aws-ec2:latest
EOF

kubectl get pkg
# NAME               INSTALLED   HEALTHY   PACKAGE                                AGE
# provider-aws-ec2   True        True      xpkg.upbound.io/upbound/provider-...   60s

A minimal composition:

A bare-minimum ProviderConfig, just enough to verify the install is working:

# 💻 Mac
kubectl apply -f - <<EOF
apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: default
spec:
  credentials:
    source: Secret
    secretRef:
      namespace: crossplane-system
      name: aws-creds
      key: creds
EOF
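
The ProviderConfig above reads its credentials from a secret named aws-creds, under a key called creds, which nothing in this lab has created yet. A minimal sketch for creating it, assuming the provider accepts the usual AWS shared-credentials file format; the values are placeholders that will never authenticate, which is fine as long as nothing is meant to reach AWS. The function name and the /tmp path are my own choices:

```shell
# 💻 Mac
# Print a placeholder credentials file in AWS shared-credentials (INI) format.
# The key values are dummies, not real credentials.
render_placeholder_aws_creds() {
  cat <<'EOF'
[default]
aws_access_key_id = PLACEHOLDER_ACCESS_KEY
aws_secret_access_key = PLACEHOLDER_SECRET_KEY
EOF
}

# Usage:
#   render_placeholder_aws_creds > /tmp/aws-creds.ini
#   kubectl create secret generic aws-creds -n crossplane-system \
#     --from-file=creds=/tmp/aws-creds.ini
```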

In a real setup, you would use IRSA (IAM Roles for Service Accounts) to authenticate and give the provider permission to create and monitor resources. For local validation, the provider installs, and the compositions can be validated structurally without ever calling AWS.
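
One way to do a structural check on a single managed resource is a server-side dry run: the API server validates the manifest against the CRD schema the provider installed, but nothing is persisted, so Crossplane never tries to reconcile it. A sketch, assuming the provider-aws-ec2 package serves a VPC kind under ec2.aws.upbound.io/v1beta1; the function name, resource name, region, and CIDR are illustrative:

```shell
# 💻 Mac
# Print a VPC managed resource for schema validation only.
# Name, region, and CIDR are illustrative values, not taken from production.
render_lab_vpc() {
  cat <<'EOF'
apiVersion: ec2.aws.upbound.io/v1beta1
kind: VPC
metadata:
  name: lab-dry-run-vpc
spec:
  forProvider:
    region: eu-west-1
    cidrBlock: 10.42.0.0/16
  providerConfigRef:
    name: default
EOF
}

# Usage: validate against the installed CRDs without creating anything
#   render_lab_vpc | kubectl apply --dry-run=server -f -
```

A typo in a field name or a wrong type should be rejected at this point, which is the kind of error you want to catch before a manifest goes anywhere near a cluster with real credentials.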

The LimitRange story.

This is the one that came from a real incident at work.

We had repeated disk-pressure events in our production EKS cluster. Pods with no resource requests had crept into a few namespaces. Someone had deployed a YAML that omitted resources: entirely, and nobody caught it in review. The Kubernetes scheduler had no signal about their consumption, so nodes ended up overcommitted. Then ephemeral storage filled up, eviction kicked in, and a couple of unrelated pods went down with it. Total downtime was measured in tens of minutes, and the cause-and-effect chain took a while to untangle.

The fix is one of the most boring features in Kubernetes: LimitRanges. They set default resource requests and limits at the namespace level. Any container that doesn’t specify its own requests gets the defaults applied automatically by the admission controller. The scheduler always has a signal. Overcommit becomes a deliberate choice, not an accident.

# 💻 Mac
kubectl apply -f - <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: default
spec:
  limits:
  - default:
      memory: 512Mi
      cpu: 500m
    defaultRequest:
      memory: 128Mi
      cpu: 100m
    max:
      ephemeral-storage: 2Gi
    type: Container
EOF

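Apply this to every namespace that hosts workloads. In production, I now apply it as a post-provisioning step on every new namespace. For the loop below, save the manifest as limitrange.yaml with the metadata.namespace line removed; kubectl rejects an object whose hard-coded namespace differs from the one passed with -n:

# 💻 Mac - apply to multiple namespaces
for ns in default vault crossplane-system istio-system; do
  kubectl apply -f limitrange.yaml -n "$ns"
done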

The `ephemeral-storage` max is the part that specifically addresses the disk-pressure failure mode — it bounds how much scratch space a container can consume, which is what spirals when ephemeral storage runs unbounded.
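
To confirm the defaults are really being applied, one option is to start a pod with no resources block at all and read back what admission wrote into its spec. A sketch; the helper name and the pod name lr-test are hypothetical:

```shell
# 💻 Mac
# Print the resources recorded on the first container of a pod.
# Second argument is the namespace and defaults to "default".
show_injected_resources() {
  pod="$1"
  ns="${2:-default}"
  kubectl get pod "$pod" -n "$ns" -o jsonpath='{.spec.containers[0].resources}'
  echo
}

# Usage:
#   kubectl describe limitrange default-limits -n default
#   kubectl run lr-test --image=busybox --restart=Never -n default -- sleep 300
#   show_injected_resources lr-test default
#   kubectl delete pod lr-test -n default
```

The output should contain requests and limits even though the pod was created without any, which is the signal the scheduler was missing during the incident.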

Verifying the complete EKS mirror.

Let’s confirm the whole stack is up:

# 💻 Mac
kubectl get nodes -o wide
# NAME       STATUS   ROLES           VERSION
# cp01       Ready    control-plane   v1.34.x
# worker01   Ready    <none>          v1.34.x
# worker02   Ready    <none>          v1.34.x

kubectl get pods -A
# Cilium/Calico, CoreDNS, istiod-1-26, MetalLB, Crossplane, Vault injector - all Running

Local vs production — what’s the same and what differs:

The only meaningful differences are the CNI (because of OrbStack’s VM capabilities, as we covered in Part 4) and the LoadBalancer implementation. Everything else is identical in configuration. The mental model from this lab transfers directly to the production cluster, and vice versa.

In the final article: How to stop and start the lab without losing state, the CKS exam scenarios this cluster was purpose-built for, and the shell aliases that make the whole thing pleasant to live with.

← Part 5: How I Practise Istio Upgrades Locally Before Touching Production EKS | Part 7: The Day 2 Reality of Running a Kubernetes Lab on Your Mac: Stop/Start, CKS Scenarios, and What I Learned Building It →

I’m Noah Makau, a DevSecOps engineer based in Nairobi. I run a small DevOps consultancy and hold the CKA, CKAD, and AWS Solutions Architect Professional certifications, and I am currently preparing for CKS. I write about Kubernetes, Vault, Crossplane, and the day-to-day of running platforms that actually have to stay up.
Originally published at blog.arkilasystems.com
