
How to Deploy and Scale Strapi on a Kubernetes Cluster 2/2

The goal of this two-part series is to give a comprehensive guide on integrating Strapi with Kubernetes.
It covers the journey from building your image to deploying a highly available and robust application.
The first part, the previous article, focuses on the building blocks as well as an intermediate deployment.
The second part focuses on a highly available deployment and covers more advanced topics.

Strapi is the leading open-source headless CMS based on Node.js. Strapi projects can vary a lot from one to another, and Kubernetes offers a lot of flexibility, so it's worth investing some time in the best practices for integrating the two.

Prerequisites

Please refer to the first part of this blog series for the prerequisites.

Source Code

You can check out the source code of this article on GitHub.
The code is separated into two folders, one for each part.

Advanced Deployment

Until now, we have focused on deploying to K8s, reusing YAML files, and making everything work smoothly.
All of this is fine for a Hello World, a POC, or a simple setup, but real-life setups are far more complex.
Usually, the requirements revolve around robustness, availability, security, and other guarantees expected of "Tier 1" services.
Let us dig into some of these areas to improve our deployments.

Probes

K8s relies on probes to measure the status of an application.
This means that, by default, K8s has no clue whether the application is actually healthy unless we tell it how to check.
The most common probe for web services is an HTTP request to a dedicated endpoint that evaluates the health of the application (e.g., /ping) or to the root path (/, which is not recommended, but better than having no probes at all).
But not all applications are web services, so K8s provides different mechanisms to achieve this.
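
For reference, these are the three classic probe mechanisms; a minimal sketch of each (the paths, ports, and commands here are illustrative placeholders, not values from our app):

# Alternative probe mechanisms (pick one per probe):
# httpGet: healthy if the endpoint returns a 2xx/3xx status code
livenessProbe:
  httpGet:
    path: /ping
    port: http
# tcpSocket: healthy if a TCP connection can be opened on the port
livenessProbe:
  tcpSocket:
    port: 3306
# exec: healthy if the command exits with code 0
livenessProbe:
  exec:
    command: ["sh", "-c", "test -f /tmp/healthy"]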

Normally, K8s is designed (via controllers) to watch the status of all objects and react to that.
More specifically, probes are important since K8s can react based on the status of your app.
In the end, K8s will try to keep your application running and the more information it has, the better.
Let us dig into the probes of each deployment.

Remember that probes have many configuration options available, e.g., initialDelaySeconds, periodSeconds, timeoutSeconds, etc.
But for this article, we'll mostly stick to the default values.

App probes

This is considered a web application, so we can use the traditional httpGet mechanism.
Ideally, the app should expose separate liveness and startup endpoints, which only need to respond with HTTP status code 200 if everything is alright.
Keep in mind that these endpoints can vary depending on your setup.
For example, if your application relies on an external DB, an external Redis, and AWS S3, you may want to check connectivity to those services in the startup probe.
But for the liveness probe, you might want to focus more on the status of your app.
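
As a sketch, a dedicated startup probe could look like the following (the /ping path is an assumption, Strapi doesn't provide such an endpoint out of the box, and you'd also need to wire startupProbe through the chart's deployment template like the other probes):

# ~/strapi-k8s/helm/app.yaml (sketch, assuming a custom /ping endpoint)
startupProbe:
  httpGet:
    path: /ping
    port: http
  periodSeconds: 10
  failureThreshold: 30  # 30 failures x 10s period = up to 300s grace at startup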

For simplicity, we'll rely on the root endpoint (/), but remember that it's better to have a dedicated endpoint for this than the root endpoint.
In the app.yaml file, add the following:

# ~/strapi-k8s/helm/app.yaml
# ...
livenessProbe:
  httpGet:
    path: /
    port: http

readinessProbe:
  httpGet:
    path: /
    port: http

Important: these probes are consumed only by K8s, which means there is no need for a response body.
The only thing the probe looks at (in the case of httpGet) is the HTTP response code.
So using a path that returns HTML, JSON, or similar payloads is not ideal; it will work, but inefficiently.
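
You can check what the probe actually sees by printing only the status code:

# the probe ignores the body, so only this code matters
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:1337/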

To update our deployment, run the following command:

# from the helm folder
helm upgrade strapi strapi-chart -f app.yaml --atomic

DB probes

Things are a little more complicated for our database because it is not really a web application.
K8s provides us with other mechanisms to probe non-web applications.
And luckily for us, MySQL has some tools we can use to validate that it's running.

In the file db.yaml, add the following:

# ~/strapi-k8s/helm/db.yaml
# ...
livenessProbe:
  exec:
    command: ["bash", "-c", "mysqladmin --user=$MYSQL_USER --password=$MYSQL_PASSWORD ping"]
  initialDelaySeconds: 30
  timeoutSeconds: 5

readinessProbe:
  exec:
    command: ["bash", "-c", "mysql --host=127.0.0.1 --user=$MYSQL_USER --password=$MYSQL_PASSWORD -e 'SELECT 1'"]
  initialDelaySeconds: 5
  periodSeconds: 2

To update our deployment, run the following command:

# from the helm folder
helm upgrade mysql strapi-chart -f db.yaml --atomic

If this command hangs or gets stuck, that's expected. The new MySQL Pod will try to use the same disk, which contains runtime information, and won't be able to move forward unless we manually kill the other Pod. Let's fix this!

We need to add a different rollout strategy for this deployment. In the templates of the chart, in the deployment.yaml file, add the strategy configuration near the top of the file, between spec.replicas and spec.selector (the following snippet includes the surrounding code for reference):

# ~/strapi-k8s/helm/strapi-chart/templates/deployment.yaml
# ...
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  strategy:
    type: {{ .Values.strategy | default "RollingUpdate" }}
  selector:
    matchLabels:
      {{- include "strapi-chart.selectorLabels" . | nindent 6 }}
# ...

In the chart's values.yaml file, add the following after replicaCount:

# ~/strapi-k8s/helm/strapi-chart/values.yaml
# ...
strategy: RollingUpdate
# ...

Now, in the file db.yaml, add the following:

# ~/strapi-k8s/helm/db.yaml
# ...
strategy: Recreate

Finally, update our deployment by running the following command:

# from the helm folder
helm upgrade mysql strapi-chart -f db.yaml --atomic

Ok, so now, our application is more robust since it uses probes that help K8s "learn" about our application, and it will try to keep it running based on that information.
Therefore, if our app faces an unexpected issue and the probes stop replying, K8s will restart it automatically based on the probe configuration.

Assets

Strapi has 4 main folders: config, database, public, and src. As we discussed earlier in the "Development flow" section, src, database, and config should be packaged with the image and have proper semantic versioning for tracking and stability.
But the public folder poses some issues and challenges in a K8s environment.

If we have 1 Pod running, this is not a problem, but most real-world scenarios require more than 1 Pod, especially in production environments. Let's discuss our options, starting with whether the current PersistentVolume could be a solution.

The problem is that our current setup won't allow flexibility and high availability for our deployment.
If we look at our DB deployment, we used the only StorageClass available in our cluster, which is restricted to the ReadWriteOnce access mode.
This means the volume can be mounted by only one node at a time, so Pods running on other nodes can't access it.

This is because the underlying driver has decided to implement the access mode this way. For caches, temporary folders and other ephemeral data, it might not matter, but for "truly persistent" data, it's not a good long-term solution. Therefore, we need to look at other storage classes to improve this.
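
You can verify this on your own cluster; in k3d/k3s the default class is usually local-path, backed by the local-path provisioner:

# list storage classes and their provisioners
kubectl get storageclass
# show which access modes our existing claims use
kubectl get pvc -o custom-columns=NAME:.metadata.name,ACCESSMODES:.spec.accessModes,STORAGECLASS:.spec.storageClassName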

A better solution, but still not good enough, is to rely on storage from a cloud provider. If you look at the documentation for StorageClasses, you will find some options.
These require you to install a controller that handles the volumes' creation, attachment, and lifecycle.

Some examples from the most common can be:

  • AWS Elastic Block Storage (EBS)
  • Azure Disk
  • GCE Persistent Disk

These controllers are outside the scope of this article, but you should read about them in case you need this kind of storage. It's worth mentioning that these 3 examples can only be attached to a single node at a time, so they're still not the solution we are looking for.

Looking for a better solution, we have the one and only Network File System (a.k.a. NFS). This is a long-standing protocol and solution in the technology landscape, and it can be configured in K8s as a storage plugin. Deploying, managing, and/or taking care of an NFS server is outside the scope of this article, but there are some very good managed options, such as AWS EFS, Google Cloud Filestore, and Azure Files.

NFS Server (temporary)

For testing purposes, you could use an already configured NFS server at your disposal.
You could also set up an NFS server in your local machine, for macOS follow these instructions and for Linux follow these instructions.
Or you can also follow your favorite guide for setting up an NFS server.

Now, in case you don't have an NFS server available, we will use a simple NFS Server Provisioner, strictly for example purposes. As mentioned before, a managed solution from a cloud provider or a properly configured HA NFS server in your infrastructure is highly recommended. The solution we'll install is not the most up to date, but it works for demonstration.
We will follow the Quickstart found in the repo, mixed with this repo, which makes some small tweaks to work with k3d. This is summarized in the following commands, run from the helm folder:

# create the rbac rules needed
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner/master/deploy/kubernetes/rbac.yaml
# create the base pvc
kubectl apply -f https://raw.githubusercontent.com/kbristow/k3d-nfs-dynamic-volumes/main/pvc.yaml
# create the deployment
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner/master/deploy/kubernetes/deployment.yaml
# patch the deployment to use the pvc instead of the default hostPath volume
kubectl patch deploy nfs-provisioner --type=json -p='[{"op": "replace", "path": "/spec/template/spec/volumes", "value": [{ "name": "export-volume", "persistentVolumeClaim": {"claimName": "ganesha-pvc"}}]}]'
# create the storage class
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner/master/deploy/kubernetes/class.yaml

Once again, we don't intend to use this long term; it's a practical solution for the examples in this article.
Ideally, you should rely on a cloud provider solution, where you can automate encryption and backups through its proper CSI driver, ergo, its proper StorageClass.
But for now, we have a working example NFS server, so let's make use of it.

App conf

Now, we can add some storage configuration for our app.yaml, by adding the following at the end of the file:

# ~/strapi-k8s/helm/app.yaml
# ...
storage:
  claim:
    enabled: true
  capacity: 1Gi # it's not really relevant when using an NFS disk
  accessModes:
    - ReadWriteMany
  storageClassName: example-nfs
  mountPath: "/opt/app/public/uploads"

We can update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

To see it working, we can navigate our browser to http://localhost:1337/admin/plugins/upload. From there, click on Add new assets and add a couple of images. These will be stored on our NFS drive; in the case of our k3d cluster, that means the node path /var/lib/rancher/k3s/storage, which maps to the local machine path /tmp/k3d. So if you are curious about these files, you can ls on your local machine and explore the underlying folder structure.

You can start by running:

ls -la /tmp/k3d

From there, you can navigate the different PVC folders to understand how it's being stored under the hood.
Once again, this should be stored in a proper NFS drive with HA, encryption and backups.

This uses our previously created StorageClass (example-nfs), configured for NFS, and mounts the disk at /opt/app/public/uploads. With a proper NFS setup, as mentioned before, we can persist our data reliably. By doing this, we add the capability for this specific path to be accessed in read/write mode by many Pods simultaneously. This is fundamental for high-availability setups, which we will discuss next.

High Availability

We currently have our DB and app running with a single replica and nothing that provides high availability.
K8s, with the help of the liveness probes, will keep our DB and app running, but this doesn't guarantee zero downtime, nor does it reduce the probability of issues.

For the DB, it's highly recommended, once again, to use a cloud provider solution or a self-hosted HA DB.
For the cloud provider solutions, some potential options are Amazon RDS, Google Cloud SQL, and Azure Database for MySQL.

For the app, we still have some work to do since we are managing it.

Replicas

An initial step for HA is, and will always be: run more replicas of our application. For simple web applications this might be trivial, but for more complex apps it can become quite challenging. In our case, one of the trickiest parts was the shared assets, which we already solved with an NFS disk. To increase our replicas, we can use the Helm chart property replicaCount; add this line at the top of the app.yaml file:

# ~/strapi-k8s/helm/app.yaml
replicaCount: 3
# ...

Then, update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

Now, to check that it's working, let's start by checking how many pods are running for our app by running the following command:

kubectl get pods -l app.kubernetes.io/instance=strapi
# the output should be similar to this:
NAME                                   READY   STATUS    RESTARTS   AGE
strapi-strapi-chart-7679457c49-5qqnc   1/1     Running   0          21m
strapi-strapi-chart-7679457c49-h825x   1/1     Running   0          99s
strapi-strapi-chart-7679457c49-7tt7j   1/1     Running   0          99s

This is nice, but is it working? To figure that out, let's use the CLI tool stern, which makes it easy to query the logs of multiple Pods at once.

Let's run the command:

stern strapi-strapi-chart -n default

You should see the logs of the 3 Pods, and probably a lot of http: GET index.html messages, which is normal and due to the liveness probes. Now go to a browser, navigate to http://localhost:1337/admin/ (or any other page), and hard-refresh the page a few times.
You should see the page dispatched by different Pods. Additionally, if you navigate to the "Media Library", you will notice that all assets are shown regardless of which Pod serves your request. This is because all Pods read the same shared volume via NFS.
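
You can also generate traffic from the CLI while stern is running and watch the requests land on different Pods:

# fire 20 sequential requests; stern should show them handled by all 3 Pods
for i in $(seq 1 20); do curl -s -o /dev/null http://localhost:1337/; done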

Affinity

We currently have 3 replicas running, but what if 1 node fails? We shouldn't worry because we have 2 other nodes, right? Not really.
The K8s scheduler will make sure to schedule our Pods in the best possible way, but by default, it doesn't guarantee that the Pods are scheduled in different nodes, which impacts the overall availability.
We can modify this and inform the scheduler how we want our Pods to be scheduled; for this, we will use affinities, more specifically pod anti-affinity.
Our Helm chart already has a property called affinity, so let's add the following block to our app.yaml configuration:

# ~/strapi-k8s/helm/app.yaml
# ...
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/instance
                operator: In
                values:
                  - strapi
          topologyKey: kubernetes.io/hostname

Then, update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

Please note that we are using the anti-affinity pattern preferredDuringSchedulingIgnoredDuringExecution, which means the scheduler will try to comply with the rule, but if it can't, it will still schedule the Pods as best it can.
You can also use requiredDuringSchedulingIgnoredDuringExecution, but this is stricter: if for any reason the cluster can't comply, your Pods won't get scheduled.
So be careful; if you want the strict approach, you need something else in place to guarantee that your workloads get scheduled (e.g., a cluster autoscaler).
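
For reference, a sketch of the strict variant (same label selector as above, but no weight, and Pods stay Pending if the rule can't be satisfied):

# strict pod anti-affinity: Pods won't be scheduled if no node can satisfy it
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app.kubernetes.io/instance
              operator: In
              values:
                - strapi
        topologyKey: kubernetes.io/hostname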

To verify if this is working, you can get the pods but use the wide output option to see the node where they are running:

kubectl get pods -l app.kubernetes.io/instance=strapi -o wide

All Pods should now be scheduled on different nodes.

(Bonus) Topology spread

If you are running in a cloud provider environment, your nodes should have the labels topology.kubernetes.io/region and topology.kubernetes.io/zone.
By relying on the zone, you can ensure that your Pods are spread across all the available zones in that region. You can use a configuration like the following; remember that you need to wire it properly into the Helm chart, values.yaml, and app.yaml:

topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app.kubernetes.io/instance: strapi
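
The chart scaffold doesn't render this field out of the box, so here is a sketch of how you could wire it through (assuming you add an empty topologySpreadConstraints default to values.yaml):

# ~/strapi-k8s/helm/strapi-chart/templates/deployment.yaml (sketch)
# ... inside spec.template.spec, next to affinity:
      {{- with .Values.topologySpreadConstraints }}
      topologySpreadConstraints:
        {{- toYaml . | nindent 8 }}
      {{- end }}

# ~/strapi-k8s/helm/strapi-chart/values.yaml (sketch)
topologySpreadConstraints: []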

Resources

Currently, our app and DB are running as the King and Queen of the cluster, without any resource restrictions. This can be a terrible idea in a shared environment. K8s tries to be democratic with the available resources, but this also means it will "shift" resources as Pods demand them. For CPU this might work, because it's a compressible resource; memory, however, is non-compressible, so our app can crash (OOMKill). We need to inform K8s about our resource usage, so it can reserve and respect the minimum resources our Pod needs to operate.

Following the Hardware and software requirements from the Strapi docs, we need at minimum 1 CPU core and 2GB of RAM, and ideally 2+ CPU cores and 4GB+ of RAM. Remember that these requirements are meant for servers, not containers, but they are a starting point. Let's translate that to our Helm chart using the already existing resources property; add the following block to our app.yaml configuration:

# ~/strapi-k8s/helm/app.yaml
# ...
resources:
  requests:
    cpu: 100m
    memory: 2Gi
  limits:
    cpu: 2000m
    memory: 4Gi

You might notice that, instead of 1 core, we are requesting 100m. This lets the cluster overcommit its resources, which might change from cluster to cluster, so be mindful of these values. In some cases, a lower CPU request is desirable so you can fit more workloads by over-utilizing the infrastructure, but it depends on the nature of the workloads. For memory, which as mentioned before is non-compressible, we need to be stricter.

Then, update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

Keep in mind that the app's resources can change as the app evolves. Sometimes application code updates make the app more efficient and decrease resource usage. But the opposite can also happen: as the app evolves and more functionality is added, it can get more complex and use more resources. The best recommendation is to constantly monitor and optimize accordingly as new versions are rolled out. But how do we monitor it?

(Detour) Prometheus and Grafana

We need a way of monitoring our cluster; the most commonly recommended stack (but not the only one) is Prometheus, Alertmanager, and Grafana.
Prometheus is a monitoring system, and Grafana is a visualization system that can act as the frontend for Prometheus. The Prometheus community provides an easy way to install these tools via the Kube Prometheus Stack. This chart has many parameters you can configure, so we will not go into much detail here. The easiest way to install it is the following:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack --namespace prometheus --create-namespace --set grafana.service.type="NodePort" --set grafana.service.nodePort="30082" 

Now you can navigate your browser to http://localhost:8902 to access Grafana. The user should be admin and the password should be prom-operator (which you should change). Once inside Grafana, navigate to "Dashboards", under the "General" folder, and look for the "Kubernetes / Compute Resources / Namespace (Workloads)" dashboard. Then make sure that, at the top of the page, the default Data Source, the default namespace, and the deployment type are selected. Keep in mind that it might take a few minutes for data to start showing up. In this dashboard, you can see the resource usage of all deployments in the default namespace.
You can also drill down by clicking on the workloads (e.g., click on "strapi-strapi-chart").
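
If the NodePort-to-localhost mapping from part one isn't set up in your cluster, a port-forward works too (the service name below follows from the release name prometheus; double-check it with kubectl get svc -n prometheus):

# forward local port 8902 to the Grafana service inside the cluster
kubectl port-forward -n prometheus svc/prometheus-grafana 8902:80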

So going back to our Resources topic, this is how you can monitor the resource usage over time and tune it accordingly.

Pod disruption budget (PDB)

K8s provides us with a way of telling it how to handle our workloads whenever there is a node disruption/rotation: disruption budgets. As the name suggests, it's a mechanism to declare how much disruption we can tolerate. But when is this needed? Weren't deployments meant to happen with a rolling update strategy?

Pods can be created/deleted for different reasons and not only due to deployments. Whenever you deploy a Pod, it will indeed follow the strategy configured, which in most cases is a rolling update one. But when there is a node disruption/rotation, caused by auto-scaling, node maintenance, node failure, and/or node version update, this is not a deployment, so it doesn't follow the deployment strategy. For these scenarios, we use the Pod Disruption Budget (PDB).

K8s will try to perform node rotations while complying with any Pod disruption budgets. This means it will not drain a node until all workloads have been moved to another node in a way that complies with the PDB. Consequently, node rotations can sometimes take longer because of this, but it's better to protect our workloads.

Let's create a file under the templates folder of our Helm chart with the name pdb.yaml, with the following content:

# ~/strapi-k8s/helm/strapi-chart/templates/pdb.yaml

{{- $replicaInt := .Values.replicaCount | int }}
{{- if ge $replicaInt 2 }}
{{- if .Values.podDisruptionBudget.enabled }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ include "strapi-chart.fullname" . }}
  labels:
      {{- include "strapi-chart.labels" . | nindent 4 }}
spec:
  maxUnavailable: {{ .Values.podDisruptionBudget.maxUnavailable }}
  selector:
    matchLabels:
      {{- include "strapi-chart.selectorLabels" . | nindent 6 }}
{{- end }}
{{- end }}

Then, let's add at the end of the values.yaml file in our Helm chart:

# ~/strapi-k8s/helm/strapi-chart/values.yaml
# ...
podDisruptionBudget:
  enabled: true
  maxUnavailable: 1

Then, update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

Testing this is kind of tricky with k3d, but on a managed solution from a cloud provider you can update your cluster version, which will trigger a node rotation.
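
You can still inspect the budget locally and, if you want to experiment, drain a node manually; the node name below is an assumption, so check kubectl get nodes first:

# show how many Pods the budget allows to be disrupted right now
kubectl get pdb
# drain one node; the evictions respect the PDB (max 1 Pod unavailable)
kubectl drain k3d-k3s-default-agent-0 --ignore-daemonsets --delete-emptydir-data
# make the node schedulable again when you're done
kubectl uncordon k3d-k3s-default-agent-0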

Security

Up to this point, we haven't mentioned security, a very important area to cover on your K8s journey (or any app's journey) at some point. The topic is big, always evolving, and there are many approaches depending on the context. This is not a security article, so we will only cover some basics. But remember that you should devote a good amount of time to analyzing, auditing, and securing your application.

We will focus on securing the K8s Deployment, not on securing your K8s cluster or the surrounding components. K8s provides basic but useful mechanisms for securing our application; we will focus on configuring a Security Context for our app.

Our Helm chart already contains by default some configurations that will help us (you can find them in the values.yaml file of our chart).
We have podSecurityContext and securityContext. Some settings can be defined in both (though not all of them), and the container-level configuration (securityContext) takes precedence when a setting appears in both.

In the app.yaml file, add the following:

# ~/strapi-k8s/helm/app.yaml
# ...
securityContext:
  runAsNonRoot: true
  runAsUser: 2001
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  seccompProfile:
    type: RuntimeDefault
  capabilities:
    drop:
      - ALL

Let's analyze these options with their current values:

  • runAsNonRoot: true: doesn't allow running this container as the root user (user id 0).
  • runAsUser: 2001: runs the container's process as the user with id 2001.
  • readOnlyRootFilesystem: true: mounts the container's root filesystem as read-only.
  • allowPrivilegeEscalation: false: blocks the container's process from gaining more privileges than its parent process (runtime).
  • seccompProfile: ...: defines the seccomp (secure computing mode, which controls the syscalls the process can use) profile to apply.
  • capabilities: ...: drops all Linux capabilities from the container's process.

Then, update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

If everything worked as expected, this should have failed. This is because our Dockerfile was built using the root user. Docker, by default, builds images with the root user unless otherwise specified. So we need to update our Dockerfile: go to the file Dockerfile.prod and add the following:

# ~/strapi-k8s/Dockerfile.prod
# ...
# ... RUN rm -rf /var/cache/apk/*
RUN addgroup --gid 2000 strapi \
    && adduser --disabled-password --gecos "" --no-create-home \
    --uid 2001 --ingroup strapi strapi
USER strapi
# ... ARG NODE_ENV=production
# ...

And rebuild the image with the following command (don't forget to increase the version):

docker build -t mystrapiapp-prod:0.0.2 -f Dockerfile.prod .
docker tag mystrapiapp-prod:0.0.2 localhost:5050/mystrapiapp-prod:0.0.2
docker push localhost:5050/mystrapiapp-prod:0.0.2

Now, update the image tag in the app.yaml file to the new 0.0.2 version.
Then, update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

If everything worked as expected, this should have failed once again. Look at our app logs (in a new terminal tab while Helm is trying to apply our upgrade) by running the following command:

kubectl logs --selector app.kubernetes.io/instance=strapi --tail=10 --follow
# the output should be similar to this:
warning Skipping preferred cache folder "/home/strapi/.cache/yarn" because it is not writable.
warning Skipping preferred cache folder "/tmp/.yarn-cache-2001" because it is not writable.
warning Skipping preferred cache folder "/tmp/.yarn-cache" because it is not writable.

This is related to our readOnlyRootFilesystem configuration. But this is good: it's a sign that our configuration is working. To solve it, we need to make these specific folders writable, and only these. For this, we need to make, once again, some updates to our Helm chart. First, in values.yaml, add the following at the end:

# ~/strapi-k8s/helm/strapi-chart/values.yaml
# ...
volumes: {}

Then, in app.yaml, add the following at the end:

# ~/strapi-k8s/helm/app.yaml
# ...
volumes:
  yarn-cache:
    mount: /tmp
    definition:
      emptyDir: {}

This defines a map of volumes that we will mount, each with its mount path and volume definition.

Second, in the file deployment.yaml, let's modify the section (including the surrounding if) for spec.template.spec.volumes, with the following:

# ~/strapi-k8s/helm/strapi-chart/templates/deployment.yaml
# ...
      {{- if or .Values.storage.claim.enabled .Values.volumes }}
      volumes:
        {{- if .Values.storage.claim.enabled }}
        - name: {{ include "strapi-chart.fullname" . }}-storage
          persistentVolumeClaim:
            claimName: {{ include "strapi-chart.fullname" . }}-pvc
        {{- end }}
        {{- range $key, $val := .Values.volumes }}
        - name: {{ $key }}
          {{- toYaml $val.definition | nindent 10 }}
        {{- end }}
      {{- end }}

Finally, in the file deployment.yaml, modify the section (including the surrounding if) for spec.template.spec.containers[0].volumeMounts, with the following:

# ~/strapi-k8s/helm/strapi-chart/templates/deployment.yaml
# ...
          {{- if or .Values.storage.claim.enabled .Values.volumes }}
          volumeMounts:
            {{- if .Values.storage.claim.enabled }}
            - name: {{ include "strapi-chart.fullname" . }}-storage
              mountPath: {{ .Values.storage.mountPath }}
            {{- end }}
            {{- range $key, $val := .Values.volumes }}
            - name: {{ $key }}
              mountPath: {{ $val.mount }}
            {{- end }}
          {{- end }}
# ...

This will allow us to pass volume specs to our container if needed. Once again, update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

This should work. Luckily for us, yarn tries a list of locations for its cache; instead of making the first one writable, we make the later ones writable, since using /tmp seems safer. But you could also make the first option (/home/strapi/.cache) writable instead.
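
To double-check that the security context is in effect, you can run a couple of commands inside a running Pod (the deployment name is taken from the Pod names we saw earlier):

# should print uid=2001(strapi) gid=2000(strapi), not root
kubectl exec deploy/strapi-strapi-chart -- id
# should fail with "Read-only file system"
kubectl exec deploy/strapi-strapi-chart -- touch /test-file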

As mentioned before, the security aspect of K8s is broad, but here are some articles that can give you a lead on where to continue with the topic of security:

Autoscaling

To conclude with our app, we need to add the capability to scale depending on the load. This is another critical item to review before an app goes to production. Sadly, similar to security, this is a vast topic that varies widely depending on the requirements, the code itself, the infrastructure, and many other variables. Still, regardless of the variables involved, the process is pretty much the same.

We need to start monitoring our app, which we already did with Prometheus and Grafana. Ideally, we should add more monitoring to the app itself. We are currently relying on "core K8s metrics": CPU, memory, and networking. But the app should also report relevant metrics of its own; it's hard to make a general recommendation because they vary per app. As a starting point, you can check the Strapi Prometheus plugin, but you should export your own metrics as well.

Then you need some traffic to identify how your app behaves under a given load. Once again, how much traffic you need depends on the requirements, purpose, and expectations of the app. Plus, don't forget the other part: traffic to where? We need to identify how much traffic best models our production use and which endpoints are critical. This can change over time, so it's highly recommended to re-analyze it occasionally. Before going to production, it's also recommended to test well beyond your predictions.

For example, say you determine that you will receive 100 requests per second and that your critical endpoints are /api/restaurants and /api/categories. Then run stress tests with those values, but also try 200, 400, 600, and 1000 requests per second, and add some other endpoints for a better picture. Once you go to production, this process becomes easier because you will have actual data, which is always better than predictions. Surprises can still occur, and you should be as prepared as possible for them.

Now, we need to run our stress tests. Luckily, this is the easiest part, to some extent. It's worth mentioning that whenever you do these tests, you should run them against infrastructure similar (or identical) to production. Running the cluster, the app, and the tests on the same machine can create "invisible" conflicts that result in useless data.

But for the purposes of this article, given that we want to show the process, we will run everything on the same computer. We will use the tool vegeta; if you want to read more about it, you can read this article. Our Strapi project is based on the Getting Started guide, so we have restaurants and categories, and we will use those endpoints.

First, go to the Grafana Dashboards (http://localhost:8902/dashboards) and search for the dashboard Kubernetes / Compute Resources / Workload. At the top of it, make sure you select the proper namespace and workload; they should be default and strapi-strapi-chart, respectively. For convenience, you can enable auto-refresh for the dashboard in the upper-right corner, next to the "Refresh" button. Now let's run a simple test: for 60 seconds, request an endpoint 5 times per second and check the results.

Run the following command and wait for a minute to see some results in the CLI and in Grafana:

echo "GET http://localhost:1337/api/restaurants" | vegeta attack -duration=60s -rate=5 | vegeta report --type=text
# the output should be similar to this:
Requests      [total, rate, throughput]         300, 5.02, 5.02
Duration      [total, attack, wait]             59.807s, 59.801s, 6.298ms
Latencies     [min, mean, 50, 90, 95, 99, max]  5.964ms, 9.857ms, 9.284ms, 13.749ms, 15.139ms, 20.418ms, 27.374ms
Bytes In      [total, mean]                     584700, 1949.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:300
Error Set:

If we take a moment to analyze this, the 99th percentile of latencies is around 20ms, which is very good. But this value might be so low because all of our infra is running locally; these are the kinds of situations that can produce misleading data.
If we go to Grafana, we should see some spikes, mostly in CPU and networking. I recommend you run this test a couple of times for consistency; the values should remain similar. If not, you will start to see patterns, irregularities, or other interesting behavior.

Given that our app is based on the Getting Started guide, these tests might have barely moved the CPU; maybe there were some spikes, but nothing outstanding. So let's increase the rate and the duration of our tests.

Run the following commands multiple times and feel free to increase the duration:

echo "GET http://localhost:1337/api/restaurants" | vegeta attack -duration=60s -rate=30 | vegeta report --type=text
echo "GET http://localhost:1337/api/restaurants" | vegeta attack -duration=60s -rate=60 | vegeta report --type=text
echo "GET http://localhost:1337/api/restaurants" | vegeta attack -duration=60s -rate=100 | vegeta report --type=text
echo "GET http://localhost:1337/api/restaurants" | vegeta attack -duration=60s -rate=500 | vegeta report --type=text

This can be quite entertaining, so you can keep playing with the duration and the rate to learn more about your app's performance.
Keep in mind that if you are running everything locally, this might reach an inflection point given your machine's resources.

Eventually, when you cross-check these results against your expectations and requirements, you can decide on a metric and a threshold to use for autoscaling. As mentioned before, we are focusing only on CPU, memory, and networking, but ideally you should rely on more metrics from your app (e.g., requests, latencies, etc.). In our case, for our Getting Started app, we will start with a CPU threshold of 250m.

But first, let's fix the resources: after the tests, in our case, it seemed that the CPU never exceeded 1000m (or 1 core). So let's change that accordingly. In the file app.yaml, change resources.limits.cpu and resources.requests.cpu; the resources block should look like this:

# ~/strapi-k8s/helm/app.yaml
# ...
resources:
  requests:
    cpu: 300m
    memory: 2Gi
  limits:
    cpu: 1000m
    memory: 4Gi

These new values better reflect our resource utilization and prepare us to add autoscaling on top. Then, update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

Now that our resources are better adapted, we can create a Horizontal Pod Autoscaler (HPA). Basically, it's a K8s object that creates more replicas of our Pods based on a defined metric such as CPU or memory. We will rely on CPU, and it's important to highlight that the HPA target is a percentage of the requested resources: with a request of 300m and a target of 80%, scaling kicks in when the average usage across Pods exceeds 240m.

Therefore, it was also important to update the requests and limits of our resources. Our Helm chart, by default, includes the proper configuration to create an HPA, so let's use it. In the file app.yaml, add the following at the bottom:

# ~/strapi-k8s/helm/app.yaml
# ...
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 100
  targetCPUUtilizationPercentage: 80

Then, update our deployment by running:

helm upgrade strapi strapi-chart -f app.yaml --atomic

This HPA will create more replicas whenever the average CPU utilization across Pods exceeds 80% of the requested CPU. Test it by running the stress tests again. And that should be it.
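
While the stress tests run, you can watch the autoscaler react in real time from another terminal:

# TARGETS shows current vs. target CPU percentage; REPLICAS should grow under load
kubectl get hpa -w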

Custom metrics

The HPA uses only basic metrics like CPU and memory by default. If you implement your own metrics, you might want to switch the autoscaling to use those instead. For this, the most recommended tool is the KEDA controller, which grants you more advanced options for horizontal autoscaling. To use it, install it via Helm, and then replace the HorizontalPodAutoscaler object with KEDA's ScaledObject custom object.
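
As a rough sketch of what that could look like against our Prometheus install (the metric name, query, and threshold are illustrative assumptions, not values from this article):

# KEDA ScaledObject sketch: scale the Strapi Deployment on a Prometheus query
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: strapi-scaler
spec:
  scaleTargetRef:
    name: strapi-strapi-chart  # the Deployment created by our chart
  minReplicaCount: 3
  maxReplicaCount: 100
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-kube-prometheus-prometheus.prometheus.svc:9090
        query: sum(rate(http_requests_total{job="strapi"}[2m]))  # assumed metric
        threshold: "100"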

Conclusions

Kubernetes is a great tool, but at the same time, the number of things we need to do, install, and prepare to make the most of it can be overwhelming. Nonetheless, the horizon looks promising, since these tools are actively being improved, with many great features coming.

If you are interested in pushing your K8s journey further, here are some recommendations:

  • Helmfile: Helm by itself is great, but once you use a lot of charts, whether for your applications or for your controllers, it can become difficult to keep track of all of them. Helmfile helps you organize them with a declarative spec.
  • FluentBit: a log processor that can push all of your application logs to a central location like an Elasticsearch or OpenSearch cluster.
  • Karpenter: a cluster autoscaling solution that, at the time of writing, only works with AWS infrastructure; it increases the number of nodes automatically depending on your Pods' requirements.
  • Cluster Autoscaler: a cluster autoscaling solution designed to work with any cloud provider.
  • Falco: a security project that can help you detect threats from within your cluster.
  • OpenTelemetry: an observability framework, including tracing, that can help increase your observability.
  • Jaeger: another tracing solution, which can work in tandem with OpenTelemetry.
  • ArgoCD: a declarative tool for continuously delivering your applications to K8s the GitOps way. Argo also has other very interesting projects, like Workflows and Rollouts.
  • LitmusChaos: a platform that helps you run Chaos Engineering experiments in your cluster to identify weaknesses and improvement opportunities.

Good luck on your K8s journey!
