Tools for Kubernetes Monitoring - Part 1: What is Kube-state-metrics?

Monitoring forms the backbone of a healthy Kubernetes environment. As applications scale, maintaining visibility into cluster performance and object health becomes essential. This is where Kube-state-metrics plays an important role. It focuses on the state, configuration, and health of resources, enabling DevOps teams to make informed decisions.

This article is the first part of the Kubernetes Monitoring Tools series. It explains what Kube-state-metrics is, how it works, its key features, and the benefits it brings to Kubernetes monitoring.

What is Kube-state-metrics?

Kube-state-metrics is a service that collects and exposes metrics about the state of Kubernetes objects. Unlike other Kubernetes monitoring tools that focus on resource usage such as CPU, memory, or network consumption, it specializes in tracking the state and configuration of resources. It gathers detailed information about deployments, pods, nodes, config maps, secrets, and other objects by watching the Kubernetes API server.

The service exports these details as metrics in a format compatible with Prometheus. These metrics help track whether the current state of Kubernetes objects matches the desired configuration, making it a valuable component for ensuring the stability and reliability of applications running in the cluster.

How Kube-state-metrics Works

Kube-state-metrics functions by continuously watching the Kubernetes API server. It collects metadata and state information for resources, then exposes it via an HTTP endpoint in a format that Prometheus can scrape.

The working process is simple:

Watches Kubernetes API objects such as pods, nodes, and deployments.
Extracts state-related data like desired vs. actual replicas, pod phase, or node readiness.
Exposes metrics on a /metrics endpoint in Prometheus exposition format.
Prometheus scrapes and stores these metrics for alerting and visualization through tools like Grafana.

This service does not require any persistent storage or database because it only collects and exposes live state information.
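
To make the scrape step more concrete, the following is a minimal Python sketch of what a client on the Prometheus side does: fetch the /metrics endpoint and read each sample line as a metric name, a set of labels, and a value. The parser is deliberately simplified and is not the one Prometheus uses. The function names parse_exposition and scrape and the EXAMPLE payload are assumptions made for illustration; the payload is hand-written, not captured from a real cluster.

```python
import re
from urllib.request import urlopen

# A sample line looks like: metric_name{label="value",...} 1
# An optional trailing timestamp is accepted and ignored.
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def scrape(url):
    """Fetch a /metrics endpoint over HTTP and parse the response body."""
    with urlopen(url, timeout=5) as response:
        return parse_exposition(response.read().decode("utf-8"))


# Illustrative payload (assumption: hand-written, not real cluster output).
EXAMPLE = """\
# HELP kube_pod_status_phase The pods current phase.
# TYPE kube_pod_status_phase gauge
kube_pod_status_phase{namespace="default",pod="web-0",phase="Running"} 1
kube_pod_status_phase{namespace="default",pod="web-0",phase="Pending"} 0
kube_deployment_status_replicas{namespace="default",deployment="web"} 3
"""

for name, labels, value in parse_exposition(EXAMPLE):
    print(name, labels, value)
```

Against a running cluster, scrape would be called with the address of the kube-state-metrics service, which depends on how it was installed. Nothing is written to disk at any point, which matches the stateless design described above.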

Key Features of Kube-state-metrics

Let’s explore the key features that make Kube-state-metrics a reliable and essential tool for monitoring Kubernetes object states.

1. State-focused Metrics Collection
Kube-state-metrics specializes in tracking the state and configuration of Kubernetes objects. For example, it provides details about how many replicas of a deployment are running compared to the desired number, or whether a node is ready to accept new pods. This information is crucial for identifying potential misconfigurations or failures.
2. Lightweight and Read-only
The service operates in a read-only mode. It lists and watches objects through the Kubernetes API server and never modifies them, making it safe to run in production clusters. Its footprint is usually small, although memory use grows with the number of objects in the cluster, so large clusters may need higher resource limits.
3. Prometheus-friendly Format
Kube-state-metrics exposes all metrics in Prometheus exposition format. This design makes integration with existing monitoring stacks easy and efficient, because no adapter or custom exporter is needed. Prometheus only needs a scrape target or service discovery rule that points at the service's /metrics endpoint.
4. Granular Object-level Insights
The service provides fine-grained details for each Kubernetes object. Examples include kube_pod_status_phase for pod state, kube_deployment_status_replicas for deployment replicas, and kube_node_status_condition for node health. These metrics help teams understand the exact state of every resource in the cluster.
5. Easy Deployment and Integration
Kube-state-metrics runs as a simple Kubernetes deployment. With a few configuration steps, teams can integrate it with Prometheus and Grafana. This ease of deployment makes it a preferred choice for DevOps teams looking to enhance their monitoring stack.
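
The desired-versus-actual comparison mentioned under the first feature can be sketched in a few lines of Python. The sketch assumes the metrics have already been scraped and parsed into (name, labels, value) tuples. The SAMPLES data and the function name replica_gaps are hypothetical. The metric names kube_deployment_spec_replicas (desired count) and kube_deployment_status_replicas_available (replicas currently available) are ones Kube-state-metrics exposes.

```python
# Hypothetical samples shaped as (metric name, labels, value).
SAMPLES = [
    ("kube_deployment_spec_replicas",
     {"namespace": "shop", "deployment": "api"}, 3.0),
    ("kube_deployment_status_replicas_available",
     {"namespace": "shop", "deployment": "api"}, 3.0),
    ("kube_deployment_spec_replicas",
     {"namespace": "shop", "deployment": "worker"}, 5.0),
    ("kube_deployment_status_replicas_available",
     {"namespace": "shop", "deployment": "worker"}, 2.0),
]


def replica_gaps(samples):
    """Return {(namespace, deployment): (desired, available)} for mismatches."""
    desired, available = {}, {}
    for name, labels, value in samples:
        key = (labels.get("namespace"), labels.get("deployment"))
        if name == "kube_deployment_spec_replicas":
            desired[key] = int(value)
        elif name == "kube_deployment_status_replicas_available":
            available[key] = int(value)
    # A deployment with no availability sample is treated as having 0 available.
    return {
        key: (want, available.get(key, 0))
        for key, want in desired.items()
        if available.get(key, 0) != want
    }


print(replica_gaps(SAMPLES))  # {('shop', 'worker'): (5, 2)}
```

In practice this comparison is usually written as a Prometheus query rather than application code; the Python version only shows the logic.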

Examples of Important Kube-state-metrics

Some commonly used metrics include:

kube_pod_status_phase - Shows whether a pod is running, pending, succeeded, or failed.
kube_deployment_status_replicas - Displays the number of replicas a deployment currently has, which can be compared with the desired count in kube_deployment_spec_replicas.
kube_node_status_condition - Indicates node conditions such as ready, memory pressure, or disk pressure.
kube_persistentvolumeclaim_status_phase - Shows the status of persistent volume claims.
kube_pod_container_status_restarts_total - Counts the total number of container restarts in a pod.

These metrics play a significant role in detecting issues and improving observability.
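
As a rough illustration of how two of these metrics are read, the Python sketch below flags pods that are outside a healthy phase and containers with a high restart count. It relies on the fact that kube_pod_status_phase emits one series per phase, with value 1 for the pod's current phase and 0 for the others. The SAMPLES data, the function names, and the choice of Running and Succeeded as "healthy" are assumptions for illustration.

```python
# Hypothetical samples shaped as (metric name, labels, value).
SAMPLES = [
    ("kube_pod_status_phase",
     {"namespace": "shop", "pod": "api-1", "phase": "Running"}, 1.0),
    ("kube_pod_status_phase",
     {"namespace": "shop", "pod": "api-1", "phase": "Pending"}, 0.0),
    ("kube_pod_status_phase",
     {"namespace": "shop", "pod": "job-7", "phase": "Failed"}, 1.0),
    ("kube_pod_container_status_restarts_total",
     {"namespace": "shop", "pod": "api-1", "container": "api"}, 12.0),
    ("kube_pod_container_status_restarts_total",
     {"namespace": "shop", "pod": "job-7", "container": "job"}, 0.0),
]

# Assumption: these two phases need no attention.
HEALTHY_PHASES = {"Running", "Succeeded"}


def pods_needing_attention(samples):
    """Return {(namespace, pod): phase} for pods outside the healthy phases."""
    flagged = {}
    for name, labels, value in samples:
        # The series with value 1 marks the pod's current phase.
        if name != "kube_pod_status_phase" or value != 1:
            continue
        if labels["phase"] not in HEALTHY_PHASES:
            flagged[(labels["namespace"], labels["pod"])] = labels["phase"]
    return flagged


def restart_counts(samples, minimum=1):
    """Return {(namespace, pod, container): restarts} at or above `minimum`."""
    counts = {}
    for name, labels, value in samples:
        if name == "kube_pod_container_status_restarts_total" and value >= minimum:
            key = (labels["namespace"], labels["pod"], labels["container"])
            counts[key] = int(value)
    return counts


print(pods_needing_attention(SAMPLES))     # {('shop', 'job-7'): 'Failed'}
print(restart_counts(SAMPLES, minimum=5))  # {('shop', 'api-1', 'api'): 12}
```

Because the restart metric is a counter that only grows, real alerts typically look at its increase over a time window instead of the absolute total used here.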

Benefits of Using Kube-state-metrics

Kube-state-metrics offers several advantages that make it an essential part of a Kubernetes monitoring strategy. Let’s explore how it helps improve observability, streamline troubleshooting, and maintain cluster reliability.

1. Improved Cluster Visibility
By monitoring the state of resources, teams gain better visibility into cluster health. It becomes easier to identify which deployments, pods, or nodes are facing issues.
2. Faster Issue Detection and Alerting
Kube-state-metrics supplies the data that Prometheus alerting rules evaluate, and Alertmanager then routes the resulting alerts when a resource is in an undesired state. For example, an alert can notify when a deployment does not meet the desired replica count or when a node is not ready.
3. Better Capacity Planning
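Understanding the state of scheduled workloads helps in planning capacity efficiently. Teams can compare what workloads request, such as CPU and memory requests and limits, with what nodes can allocate, and make decisions on scaling based on the observed state. Actual usage figures still come from usage-focused tools, since Kube-state-metrics reports state rather than consumption.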
4. Enhanced Troubleshooting
Detailed state information reduces the time spent identifying the root cause of failures. For instance, if a pod fails repeatedly, metrics such as container restart counts can guide the troubleshooting process.
5. Compliance and Auditing Support
Historical state data, when stored by Prometheus, helps track changes in resource configuration. This information is useful for auditing and ensuring compliance with operational policies.
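
Once Prometheus stores these metrics, the replica check from the alerting example above can be run as a query. The Python sketch below sends a PromQL expression to the Prometheus HTTP API (the /api/v1/query endpoint) and reads the instant-vector response. The host name prometheus.example, the function names, and the CANNED_RESPONSE payload are assumptions for illustration; the payload is hand-written to show the response shape, not taken from a real server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# PromQL: deployments whose available replicas differ from the desired count.
MISMATCH_QUERY = (
    "kube_deployment_spec_replicas != kube_deployment_status_replicas_available"
)


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def parse_instant_vector(payload):
    """Turn a query response body into a list of (labels, value) pairs."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise RuntimeError("query failed: " + str(body.get("error")))
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [
        (item["metric"], float(item["value"][1]))
        for item in body["data"]["result"]
    ]


def query(base_url, promql):
    """Run an instant query against a Prometheus server."""
    with urlopen(build_query_url(base_url, promql), timeout=5) as response:
        return parse_instant_vector(response.read().decode("utf-8"))


# Hand-written response in the shape Prometheus returns (assumption).
CANNED_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"namespace": "shop", "deployment": "worker"},
                "value": [1700000000, "5"],
            }
        ],
    },
})

# Hypothetical server address; replace with a real Prometheus URL.
print(build_query_url("http://prometheus.example:9090", MISMATCH_QUERY))
for labels, value in parse_instant_vector(CANNED_RESPONSE):
    print(labels["namespace"], labels["deployment"], "desired replicas:", value)
```

The same expression can be placed in a Prometheus alerting rule so that Alertmanager sends the notification, which is the more common setup than polling from a script.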

Conclusion

Kube-state-metrics is a powerful tool for monitoring the state of Kubernetes objects. Its lightweight design, easy integration with Prometheus, and ability to provide granular state information make it a must-have component in any Kubernetes observability stack. It helps maintain the reliability and performance of applications running in a cluster by enabling better visibility, faster troubleshooting, and accurate alerting. Kubernetes developers can help set up and optimize Kube-state-metrics to ensure that applications run smoothly, resources remain healthy, and potential issues are detected early.

This is the first part of the Tools for Kubernetes Monitoring series. In the next article, we will explore another tool that helps with Kubernetes monitoring. Stay tuned for more insights into building a comprehensive Kubernetes monitoring strategy.