In this series I walk through several different open source offerings for performing chaos testing / engineering within your Kubernetes clusters.
In K8s Chaos Dive: Chaos-Mesh Part 1 I covered an introduction to Chaos-Mesh in which we installed the toolkit into our cluster and used a
PodChaos experiment to kill off a random percentage of Nginx pods on a schedule.
Killing pods can be a great exercise for validating resiliency to pod death, something that can happen for a list of reasons in Kubernetes. Solutions tend to revolve around horizontal scaling (dependent on your target SLA):
- Ensure you have sufficient replicas for your application to handle < 100% failure, e.g. Node restarts, or Node network disruption causing health-checks to fail. This suits those wanting to achieve a ~99.9% uptime SLA (depending on cloud provider).
- Ensure you have a multi-cluster setup across availability zones to handle 100% failure in a single data-centre, e.g. a data-centre power-outage. This suits those wanting to achieve a ~99.99% uptime SLA (depending on cloud provider).
- Ensure you have a multi-region setup to handle 100% failure in a single region. This suits those wanting to achieve a ~99.999% uptime SLA (depending on cloud provider).
However, short of planned restarts by yourself or your cloud provider, it is generally good to try and avoid your pods dying everywhere! Pod death is generally a secondary effect of some other root cause in the cluster, and finding and fixing that root issue may save you from having to pay for excessive redundancy in compute.
In this post we're going to take a look at
StressChaos experiments in the Chaos-Mesh toolkit to see how we can validate our CPU and memory resourcing.
Chaos-Mesh offers two main supported forms of stress chaos:
- cpu-burn - Simulate pod CPU stress.
- memory-burn - Simulate pod memory stress.
For example, to generate a
StressChaos which will burn 100% of 1 CPU for 30 seconds, every 5 minutes, for one of your pods in the
my-app namespace, you could write:
    apiVersion: chaos-mesh.org/v1alpha1
    kind: StressChaos
    metadata:
      name: burn-cpu
      namespace: chaos-mesh
    spec:
      mode: one
      selector:
        namespaces:
          - my-app
      stressors:
        cpu:
          workers: 1
          load: 100
      duration: "30s"
      scheduler:
        cron: "@every 5m"
Equally, to generate a
StressChaos which will continually grow memory usage for 30s, every 5 minutes, for one of your pods in the
my-app namespace, you could write:
    apiVersion: chaos-mesh.org/v1alpha1
    kind: StressChaos
    metadata:
      name: burn-memory
      namespace: chaos-mesh
    spec:
      mode: one
      selector:
        namespaces:
          - my-app
      stressors:
        memory:
          workers: 1
      duration: "30s"
      scheduler:
        cron: "@every 5m"
You can further customize your stressors as follows:
- You can provide both a memory and a cpu stressor in your config.
- Both the cpu and memory stressors support an additional options field where you can pass additional flags / arguments to the stress-ng command.
- Alternatively, you can define fully custom stressors by providing a stressngStressors string instead of the stressors object. This expects a string of stress-ng flags / arguments.
Note that providing custom options (last two points) is not fully tested, so it should be seen as experimental!
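To make the first point concrete, here is a minimal sketch of a manifest that carries both stressors at once. It is wrapped in a small shell function so the name and target namespace can be swapped out. The function name, the stress-both experiment name and the argument defaults are my own placeholders rather than anything from Chaos-Mesh, and the fields simply mirror the two examples above.

```shell
# Print a StressChaos manifest that combines a cpu and a memory stressor.
# Arguments (hypothetical defaults): experiment name, target namespace.
combined_stress_manifest() {
  name="${1:-stress-both}"
  target_ns="${2:-my-app}"
  cat <<EOF
apiVersion: chaos-mesh.org/v1alpha1
kind: StressChaos
metadata:
  name: ${name}
  namespace: chaos-mesh
spec:
  mode: one
  selector:
    namespaces:
      - ${target_ns}
  stressors:
    cpu:
      workers: 1
      load: 100
    memory:
      workers: 1
  duration: "30s"
  scheduler:
    cron: "@every 5m"
EOF
}

# Print the manifest for inspection before applying it.
combined_stress_manifest stress-both my-app
```

If the output looks right, piping it into kubectl apply -f - should create the experiment in the same way as applying a file.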
So without further ado, let's try out some stress chaos!
Here we'll walk through setting up and executing the following two tests:
- A CPU stress test using Kubernetes manifest files.
- A Memory stress test using Kubernetes manifest files.
Please refer to previous tutorials for detailed instructions! Ultimately we run the following two commands:
    minikube start --driver=virtualbox
    minikube addons enable metrics-server
If you are following on from part 1 of the Chaos-Mesh series then you may not have to do anything! We will re-use our Nginx and Chaos-Mesh deployments from before.
If you've skipped over the previous tutorial then I recommend following its instructions for setting up the target application and deploying Chaos-Mesh. If you followed the clean-up in the previous tutorial, then you can reinstate the setup with the following commands:
    # Create Nginx Helm charts
    helm create nginx

    # Deploy Nginx to the cluster
    kubectl create ns nginx
    helm upgrade --install nginx ./nginx \
      -n nginx \
      --set replicaCount=10

    # Check the Nginx deployment is successful
    helm ls -n nginx
    kubectl get pod -n nginx

    # Pull down the Chaos-Mesh repo
    git clone https://github.com/chaos-mesh/chaos-mesh

    # Apply Chaos-Mesh Custom Resource Definitions
    kubectl apply -f ./chaos-mesh/manifests/crd.yaml

    # Deploy Chaos-Mesh to the cluster
    kubectl create ns chaos-mesh
    helm upgrade --install chaos-mesh ./chaos-mesh/helm/chaos-mesh \
      -n chaos-mesh \
      --set dashboard.create=true

    # Check the Chaos-Mesh deployment is successful
    helm ls -n chaos-mesh
    kubectl get pods -n chaos-mesh -l app.kubernetes.io/instance=chaos-mesh

    # Load the Chaos-Mesh dashboard
    minikube service chaos-dashboard -n chaos-mesh
Let's start by creating a new
stress-cpu.yaml file with the following contents:
    apiVersion: chaos-mesh.org/v1alpha1
    kind: StressChaos
    metadata:
      name: stress-cpu
      namespace: chaos-mesh
    spec:
      mode: all
      selector:
        labelSelectors:
          "app.kubernetes.io/name": nginx
        namespaces:
          - nginx
      stressors:
        cpu:
          workers: 1
          load: 100
      duration: "30s"
      scheduler:
        cron: "@every 5m"
This will add a single CPU stress worker to each of our
nginx pods that will load the pods 100% of the scheduled time. Check out the
stress-ng manpage for the
--cpu-load flag for further details.
These workers will be scheduled every 5 minutes and run for 30s in the pod before being stopped, meaning we should expect to see our Nginx pods' CPU spike for 30s and then drop back to near 0 for the remaining 4.5 minutes.
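The arithmetic behind that expectation is simple, but it is handy to have when you start tweaking the duration and schedule. Here is a small sketch (the stress_duty_cycle helper is just an illustrative name of mine) that works out how much of each cycle is spent under stress:

```shell
# Report how much of each schedule interval is spent under stress vs idle.
# Arguments: stress duration in seconds, schedule interval in seconds.
stress_duty_cycle() {
  duration_s="$1"
  interval_s="$2"
  idle_s=$((interval_s - duration_s))
  # Integer percentage of the interval spent under stress (rounded down).
  percent=$((100 * duration_s / interval_s))
  echo "stressed=${duration_s}s idle=${idle_s}s duty=${percent}%"
}

# 30s of stress every 5 minutes (300s)
stress_duty_cycle 30 300
```

For our experiment this gives 270s (4.5 minutes) of idle time per cycle, so the pods are only under stress for around 10% of the time.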
Let's install our experiment and see what happens!
    $ kubectl apply -f stress-cpu.yaml
    stresschaos.chaos-mesh.org/stress-cpu created
If we check the Chaos-Mesh dashboard we can see that the experiment has been deployed and is running:
minikube service chaos-dashboard -n chaos-mesh
The dashboard shows us that the test is successfully running for 30 seconds every 5 minutes 🎉.
Let's say we don't trust the dashboard? Let's check out what the CPU levels are actually like in the cluster!
Firstly in a terminal window install the
watch command if you don't already have it and run:
watch kubectl top pods -n nginx
This should output something like:
    Every 2.0s: kubectl top pods -n nginx    c-machine: Tue Aug 25 14:16:40 2020

    NAME                     CPU(cores)   MEMORY(bytes)
    nginx-5c96c8f58b-2l85d   0m           25Mi
    nginx-5c96c8f58b-btws8   0m           2Mi
    nginx-5c96c8f58b-c28ws   0m           2Mi
    nginx-5c96c8f58b-gmnkc   0m           2Mi
    nginx-5c96c8f58b-hzgx5   0m           2Mi
    nginx-5c96c8f58b-jzk2p   0m           2Mi
    nginx-5c96c8f58b-vhmqm   0m           2Mi
    nginx-5c96c8f58b-vs2cr   0m           21Mi
    nginx-5c96c8f58b-zc9qf   0m           2Mi
    nginx-5c96c8f58b-zjfc4   0m           2Mi
This will update every 2 seconds. There can be a bit of a lag getting metrics back from the cluster's metrics-server, but we should see that every 5 minutes these values jump up for ~30 seconds and then return to 0. Waiting for the next scheduled stress test we can see this happening:
    Every 2.0s: kubectl top pods -n nginx    c-machine: Tue Aug 25 14:19:11 2020

    NAME                     CPU(cores)   MEMORY(bytes)
    nginx-5c96c8f58b-2l85d   110m         25Mi
    nginx-5c96c8f58b-btws8   80m          2Mi
    nginx-5c96c8f58b-c28ws   94m          2Mi
    nginx-5c96c8f58b-gmnkc   89m          2Mi
    nginx-5c96c8f58b-hzgx5   70m          2Mi
    nginx-5c96c8f58b-jzk2p   132m         2Mi
    nginx-5c96c8f58b-vhmqm   63m          2Mi
    nginx-5c96c8f58b-vs2cr   115m         21Mi
    nginx-5c96c8f58b-zc9qf   80m          2Mi
    nginx-5c96c8f58b-zjfc4   94m          2Mi
Awesome! The single CPU worker in each pod seems to use up around 100m (100 millicores or 0.1 CPU cores) 💥.
If you wanted to apply more load to your pods, you could just up the number of workers in the
StressChaos yaml accordingly 🙃.
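If you'd rather not eyeball the numbers, a rough sketch like the following averages the CPU column for you. The avg_cpu_millicores helper is a name I've made up, and it assumes kubectl top reports CPU in millicores (e.g. 110m) as it did in the output above. The sample input here is just the first three rows from that output.

```shell
# Average the CPU(cores) column of `kubectl top pods` output read from stdin.
# Assumes values are reported in millicores (e.g. "110m").
avg_cpu_millicores() {
  awk '
    $1 == "NAME" { next }            # skip the header row
    NF >= 2 {
      sub(/m$/, "", $2)              # strip the millicore suffix
      total += $2
      pods++
    }
    END {
      if (pods > 0) printf "pods=%d total=%dm avg=%.1fm\n", pods, total, total / pods
    }
  '
}

# Try it against a few sample rows.
avg_cpu_millicores <<'EOF'
NAME                     CPU(cores)   MEMORY(bytes)
nginx-5c96c8f58b-2l85d   110m         25Mi
nginx-5c96c8f58b-btws8   80m          2Mi
nginx-5c96c8f58b-c28ws   94m          2Mi
EOF
```

Against a live cluster you could pipe kubectl top pods -n nginx straight into the function while the stressor is running.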
If you want to go deeper, we can even check out what is happening in the pod itself! 😲 First we find a target Nginx pod:
    $ kubectl get pods -n nginx
    NAME                     READY   STATUS    RESTARTS   AGE
    nginx-5c96c8f58b-2l85d   1/1     Running   0          23m
    nginx-5c96c8f58b-btws8   1/1     Running   0          23m
    nginx-5c96c8f58b-c28ws   1/1     Running   0          23m
    nginx-5c96c8f58b-gmnkc   1/1     Running   0          23m
    nginx-5c96c8f58b-hzgx5   1/1     Running   0          23m
    nginx-5c96c8f58b-jzk2p   1/1     Running   0          23m
    nginx-5c96c8f58b-vhmqm   1/1     Running   0          23m
    nginx-5c96c8f58b-vs2cr   1/1     Running   0          23m
    nginx-5c96c8f58b-zc9qf   1/1     Running   0          23m
    nginx-5c96c8f58b-zjfc4   1/1     Running   0          23m
We'll use the first in the list
nginx-5c96c8f58b-2l85d. Next we can actually open a shell inside the pod as follows:
    $ kubectl exec -it nginx-5c96c8f58b-2l85d -n nginx -- bash
    root@nginx-5c96c8f58b-2l85d:/#
From here we can install some utility packages:
root@nginx-5c96c8f58b-2l85d:/# apt-get update && apt-get install -y procps
This will allow us to use the ps and top commands to interrogate the processes running inside the pod. If we wait until load is happening, we can see
stress-ng processes start running inside the container with:
    root@nginx-5c96c8f58b-2l85d:/# watch ps aux

    Every 2.0s: ps aux    nginx-5c96c8f58b-9npqq: Tue Aug 25 13:48:56 2020

    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    root         1  0.0  0.2  32656  5204 ?        Ss   13:36   0:00 nginx: master process nginx -g daemon off;
    nginx        6  0.0  0.1  33112  3072 ?        S    13:36   0:00 nginx: worker process
    root        27  0.0  0.1  18136  3284 pts/0    Ss   13:46   0:00 bash
    root       355  0.1  0.1  11104  2520 pts/0    S+   13:48   0:00 watch ps aux
    root       383  0.0  0.1  18700  3696 ?        SL   13:48   0:00 stress-ng --cpu 1 --cpu-load 100
    root       384 21.0  0.2  19348  4668 ?        R    13:48   0:01 stress-ng --cpu 1 --cpu-load 100
    root       397  0.0  0.0  11104   652 pts/0    S+   13:48   0:00 watch ps aux
    root       398  0.0  0.0   4280   772 pts/0    S+   13:48   0:00 sh -c ps aux
    root       399  0.0  0.1  36636  2804 pts/0    R+   13:48   0:00 ps aux
Or equally we can use
top to see our processes and their resource usage with a slightly nicer UI:
From this command we can see a process list which shows a
stress-ng parent process and a
stress-ng-cpu worker process which consumes around 18% CPU. After 30s these processes disappear and CPU returns to near 0%.
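If you only care about the stressor processes, you can filter the ps output down rather than scanning the whole list. This is a rough sketch that assumes the usual ps aux column layout (with %CPU in the third column and the command starting in the eleventh). The stress_ng_cpu helper name is my own, and the sample rows are lifted from the ps output earlier in this post.

```shell
# Count stress-ng processes in `ps aux` output (read from stdin) and sum their %CPU.
# Assumes the standard `ps aux` layout: %CPU is field 3, the command starts at field 11.
stress_ng_cpu() {
  awk '
    $11 ~ /^stress-ng/ { count++; cpu += $3 }
    END { printf "processes=%d cpu=%.1f%%\n", count, cpu }
  '
}

# Try it against a few sample rows.
stress_ng_cpu <<'EOF'
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.2  32656  5204 ?        Ss   13:36   0:00 nginx: master process nginx -g daemon off;
root       383  0.0  0.1  18700  3696 ?        SL   13:48   0:00 stress-ng --cpu 1 --cpu-load 100
root       384 21.0  0.2  19348  4668 ?        R    13:48   0:01 stress-ng --cpu 1 --cpu-load 100
EOF
```

Inside the pod you would run ps aux and pipe it into the function, which needs awk to be available in the container image.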
Let's stop our experiment by removing the
StressChaos Kubernetes object we added earlier:
kubectl delete -f stress-cpu.yaml
So there we have it, we can successfully stress our pods with increased CPU levels! 🍾
Now let's perform a similar experiment, but this time we will load the memory of a single pod.
Create a new
StressChaos manifest called
stress-memory.yaml with the following contents:

    apiVersion: chaos-mesh.org/v1alpha1
    kind: StressChaos
    metadata:
      name: stress-memory
      namespace: chaos-mesh
    spec:
      mode: one
      selector:
        labelSelectors:
          "app.kubernetes.io/name": nginx
        namespaces:
          - nginx
      stressors:
        memory:
          workers: 1
      duration: "10s"
      scheduler:
        cron: "@every 2m"
This will add a single memory stress worker to one of our
nginx pods that will run for 10s every 2 minutes. The worker will grow its heap by reallocating memory at a rate of 64K per iteration. Check out the
stress-ng manpage for the
--bigheap flag for further details.
Let's install our experiment and see what happens!
$ kubectl apply -f stress-memory.yaml stresschaos.chaos-mesh.org/stress-memory created
Watching our Nginx pods, we can see that every 2 minutes one of the pods has an outburst in memory usage:
    $ watch kubectl top pods -n nginx

    Every 2.0s...    c-machine: Tue Aug 25 15:12:52 2020

    NAME                     CPU(cores)   MEMORY(bytes)
    nginx-5c96c8f58b-5tvz5   0m           3Mi
    nginx-5c96c8f58b-6fb9h   0m           2Mi
    nginx-5c96c8f58b-bjzd2   0m           2Mi
    nginx-5c96c8f58b-gfqww   0m           3Mi
    nginx-5c96c8f58b-jtn5f   0m           2Mi
    nginx-5c96c8f58b-rdlgk   145m         3Mi
    nginx-5c96c8f58b-sgx4s   0m           2Mi
    nginx-5c96c8f58b-szffx   0m           3Mi
    nginx-5c96c8f58b-ttlhr   65m          621Mi
    nginx-5c96c8f58b-xgh7j   0m           2Mi
Similarly, if we're quick and exec into a targeted pod and install
top we can see that the memory worker very quickly starts to consume a lot of memory!
When I ran this, top showed our targeted Nginx pod hitting 30.4% of the available memory. Given that we haven't set any resource limits on our Nginx deployment, that is actually 30.4% of the cluster's memory allocated to Minikube!!
This is why in this experiment I've set the mode to
one and not
all. Using all will likely consume all of the memory resources in your cluster and very potentially break it - I tried this initially and had to completely delete the Minikube cluster and start again as it became 100% non-responsive! 😱
Let's remove our experiment:
kubectl delete -f stress-memory.yaml
Let's clean-up and remove everything we've created today:
    helm delete chaos-mesh -n chaos-mesh
    kubectl delete ns chaos-mesh
    kubectl delete crd iochaos.chaos-mesh.org
    kubectl delete crd kernelchaos.chaos-mesh.org
    kubectl delete crd networkchaos.chaos-mesh.org
    kubectl delete crd podchaos.chaos-mesh.org
    kubectl delete crd podnetworkchaos.chaos-mesh.org
    kubectl delete crd stresschaos.chaos-mesh.org
    kubectl delete crd timechaos.chaos-mesh.org
    helm delete nginx -n nginx
    kubectl delete ns nginx
    minikube stop
    minikube delete
So what have these two experiments taught us today?
Our first experiment showed that it is very simple to apply additional CPU strain to our pods, but ultimately 1 worker didn't apply too much pressure. Given we weren't actually trying to use our Nginx pods in any way, the experiment is a little moot and there is likely little to be gained.
However, CPU is very important in a cluster, and for many applications the CPU profile can directly impact key performance indicators such as latency. By having the tooling to generate artificial strain on your pods (and cluster) you can measure how your applications behave under the kind of pressure you see during a peak load period.
Given the impact on latency, CPU stress can also be a good way to validate your deployment's healthchecks, where the aim should be to ensure that your application is highly available and that traffic is routed to happily working pods as opposed to ones having a CPU meltdown! This could be tested using one of the percentage-based modes available.
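As a rough sketch of what a percentage-based experiment might look like, the function below prints a variant of our earlier CPU manifest that targets a percentage of the matching pods. I'm assuming the fixed-percent mode and its accompanying value field from the Chaos-Mesh v1alpha1 API here, so do check them against the docs for your version. The percent_stress_manifest name and the stress-cpu-percent experiment name are placeholders of mine.

```shell
# Print a StressChaos manifest that stresses a percentage of the matching pods.
# Argument: percentage of pods to target (hypothetical default of 50).
# Assumes the `fixed-percent` mode and string `value` field of the v1alpha1 API.
percent_stress_manifest() {
  percent="${1:-50}"
  cat <<EOF
apiVersion: chaos-mesh.org/v1alpha1
kind: StressChaos
metadata:
  name: stress-cpu-percent
  namespace: chaos-mesh
spec:
  mode: fixed-percent
  value: "${percent}"
  selector:
    labelSelectors:
      "app.kubernetes.io/name": nginx
    namespaces:
      - nginx
  stressors:
    cpu:
      workers: 1
      load: 100
  duration: "30s"
  scheduler:
    cron: "@every 5m"
EOF
}

# Print a manifest targeting half of the Nginx pods.
percent_stress_manifest 50
```

With half the pods stressed and half left alone, you can watch whether your healthchecks and load balancing actually steer traffic towards the healthy half.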
Monitoring your application under pod CPU stress can also be a useful way to validate your pod resource request and limit values. Setting too small a CPU limit will likely result in CPU throttling which can have devastating impacts on applications, particularly multi-threaded applications.
Our second experiment showed that the memory stressor is certainly quite aggressive! In fact, leaving it running too long on too many pods could result in quite the flurry of out-of-memory killer jobs, if not entire cluster resource exhaustion and failure!
Fine-tuning the memory stressor using the
options value is probably advised, but a key takeaway is that no one pod (or collection of pods) should be allowed to consume an entire cluster's memory allocation! Strict memory resource requests and limits should be set on deployments to ensure that they are killed before they have the chance to disrupt other pods or fundamental cluster processes 🔥.
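For our Nginx deployment, one way to do this would be a small Helm values override. The sketch below assumes the chart scaffolded by helm create exposes a resources value (the default scaffold does, as far as I'm aware). The nginx_resource_values helper name and the 64Mi / 128Mi figures are purely illustrative and not tuned recommendations.

```shell
# Print a Helm values override that sets a memory request and limit.
# Arguments: memory request, memory limit (e.g. 64Mi 128Mi, illustrative only).
# Assumes the chart reads a top-level `resources` value, as the `helm create` scaffold does.
nginx_resource_values() {
  request="$1"
  limit="$2"
  cat <<EOF
resources:
  requests:
    memory: ${request}
  limits:
    memory: ${limit}
EOF
}

# Print an override with a 64Mi request and a 128Mi limit.
nginx_resource_values 64Mi 128Mi
```

Saving the output to a file and passing it to helm upgrade --install with -f should then cap each Nginx pod, so a runaway memory stressor gets that container OOM-killed rather than starving the rest of the cluster.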
Thanks for reading folks!
Enjoy the tutorial? Have questions or comments? Or do you have an awesome way to run chaos experiments in your Kubernetes clusters? Drop me a message in the section below or tweet me @CraigMorten!
Till next time 💥