Collect logs and Kubernetes resources when a Pod is in CrashLoopBackOff state

Projectsveltos is a Kubernetes add-on controller that simplifies the deployment and management of add-ons and applications across multiple clusters. It runs in the management cluster and can programmatically deploy and manage add-ons and applications on any cluster in the fleet, including the management cluster itself. Sveltos supports a variety of add-on formats, including Helm charts, raw YAML, Kustomize, Carvel ytt, and Jsonnet.

Projectsveltos, though, goes beyond managing add-ons and applications across a fleet of Kubernetes clusters. It can also proactively monitor cluster health and provide real-time notifications.

Sveltos uses Lua to detect events in managed clusters and to evaluate their health. An EventSource instance defines what an event is. A HealthCheck instance defines what a health rule is.

Send a Slack notification when a Pod is in CrashLoopBackOff state

We already described here how to instruct Sveltos to detect a Pod in CrashLoopBackOff state and send a Slack notification when that happens.

Deploy add-ons and applications when an event happens

In this article we will describe how to configure Sveltos to detect an event and respond to it by deploying a new set of add-ons and applications.

Projectsveltos has two custom resource definitions to achieve this goal:

EventSource defines what an event is. It accepts a Lua script. Sveltos can watch any Kubernetes resource, including custom resources.

EventBasedAddOn defines in which clusters events need to be detected and which add-ons and applications to deploy in response.

apiVersion: lib.projectsveltos.io/v1alpha1
kind: EventSource
metadata:
 name: crashing-pod
spec:
 group: ""
 version: "v1"
 kind: "Pod"
 collectResources: true
 script: |
  function evaluate()
     -- hs is the result returned to Sveltos: matching flags the event, message describes it
     local hs = {}
     hs.matching = false
     hs.message = ""
     if obj.status and obj.status.containerStatuses then
        for _, containerStatus in ipairs(obj.status.containerStatuses) do
          local waiting = containerStatus.state and containerStatus.state.waiting
          if waiting and waiting.reason == "CrashLoopBackOff" then
            hs.matching = true
            -- waiting.message may be absent, so fall back to an empty string
            hs.message = obj.metadata.namespace .. "/" .. obj.metadata.name .. ":" .. (waiting.message or "")
            local terminated = containerStatus.lastState and containerStatus.lastState.terminated
            if terminated and terminated.reason then
              hs.message = hs.message .. "\nreason:" .. terminated.reason
            end
          end
        end
     end
     return hs
  end
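
To get a feel for what the Lua script above matches, its logic can be mirrored in a few lines of Python and run against sample Pod objects. This is only a sketch for offline experimentation, not something Sveltos runs: the Pod dictionaries below are hypothetical and trimmed to the fields the script reads, and the sketch treats missing fields as empty.

```python
from typing import Any, Dict


def evaluate(obj: Dict[str, Any]) -> Dict[str, Any]:
    """Python mirror of the Lua evaluate() function, for experimentation only."""
    hs = {"matching": False, "message": ""}
    for cs in obj.get("status", {}).get("containerStatuses") or []:
        waiting = cs.get("state", {}).get("waiting")
        if waiting and waiting.get("reason") == "CrashLoopBackOff":
            hs["matching"] = True
            meta = obj["metadata"]
            hs["message"] = f'{meta["namespace"]}/{meta["name"]}:{waiting.get("message", "")}'
            terminated = cs.get("lastState", {}).get("terminated")
            if terminated and terminated.get("reason"):
                hs["message"] += "\nreason:" + terminated["reason"]
    return hs


# Hypothetical Pod objects, reduced to the fields the script inspects.
crashing_pod = {
    "metadata": {"namespace": "default", "name": "web-0"},
    "status": {
        "containerStatuses": [
            {
                "state": {
                    "waiting": {
                        "reason": "CrashLoopBackOff",
                        "message": "back-off 40s restarting failed container",
                    }
                },
                "lastState": {"terminated": {"reason": "Error"}},
            }
        ]
    },
}

healthy_pod = {
    "metadata": {"namespace": "default", "name": "web-1"},
    "status": {"containerStatuses": [{"state": {"running": {}}, "lastState": {}}]},
}

result = evaluate(crashing_pod)
print(result["matching"])
print(result["message"])
print(evaluate(healthy_pod)["matching"])
```

Only the first Pod is reported as matching, and its message carries the namespace, the name, the waiting message and the last termination reason.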

apiVersion: lib.projectsveltos.io/v1alpha1
kind: EventBasedAddOn
metadata:
 name: hc
spec:
 sourceClusterSelector: env=fv
 eventSourceName: crashing-pod
 oneForEvent: true
 stopMatchingBehavior: LeavePolicies
 policyRefs:
 - name: k8s-collector
   namespace: default
   kind: ConfigMap

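The sourceClusterSelector field in the manifest above restricts event detection to clusters labeled env=fv. Assuming it follows the usual Kubernetes equality-based label selector syntax, the selection can be sketched in Python as below. The cluster names and labels are hypothetical, and the helper functions are illustrative only (they handle just the key=value form, not set-based or != requirements).

```python
from typing import Dict


def parse_equality_selector(selector: str) -> Dict[str, str]:
    """Parse 'k1=v1,k2=v2' into a dict; only plain key=value requirements are handled."""
    requirements = {}
    for part in selector.split(","):
        key, sep, value = part.strip().partition("=")
        # Reject anything that is not a simple key=value pair (e.g. '!=' or '==').
        if not sep or not key or key.endswith("!") or value.startswith("="):
            raise ValueError(f"unsupported selector requirement: {part!r}")
        requirements[key] = value
    return requirements


def selects(selector: str, labels: Dict[str, str]) -> bool:
    """Return True when every requirement in the selector is satisfied by labels."""
    wanted = parse_equality_selector(selector)
    return all(labels.get(key) == value for key, value in wanted.items())


# Hypothetical cluster labels; only the selector string comes from the manifest.
clusters = {
    "cluster-a": {"env": "fv", "region": "eu"},
    "cluster-b": {"env": "prod"},
    "cluster-c": {},
}

matching = [name for name, labels in clusters.items() if selects("env=fv", labels)]
print(matching)
```

In this sketch only cluster-a carries the env=fv label, so it is the only cluster where the crashing-pod event would be watched.
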
The ConfigMap referenced by the EventBasedAddOn instance contains all the resources that will be deployed in each cluster where a crashing Pod is found.
In this case:

A PersistentVolumeClaim.
A Job that runs a Kubernetes collector instance. This Job collects logs and Kubernetes resources and saves them in the corresponding volume.
A ConfigMap containing the Kubernetes collector configuration (which logs and resources to collect).

All YAMLs used in this example can be found here.

The Kubernetes collector used in this example can be found here.

👏 Support this project

If you enjoyed this article, please check out the Projectsveltos GitHub repo.

You can also star 🌟 the project if you found it helpful.

The GitHub repo is a great resource for getting started with the project. It contains the code, documentation, and examples, along with the latest news and updates.
Thank you for reading!
