DEV Community

irena-sayv

Google Cloud Monitoring: Monitoring and Alerting on the number of Kubernetes pod replicas in GKE

Monitoring and alerting are critical for the operations and cost management of business applications.

As part of the Operations team, we needed a mechanism to send automated alerts when the number of pod replicas for a deployment drops below the expected count, when no pod is available for a business-critical deployment, or when the replica count rises above the expected number, which may point to unusual traffic.

Cloud Monitoring has no built-in metric for the number of pod replicas in a Kubernetes deployment. In this post, we look at how to set up an alerting policy based on the pod replica count.

Using container/uptime metric

Containers are encapsulated in a Pod, so we can use the container uptime metric to determine the number of pod replicas, with the configuration below.

Alert policy configuration for pod replicas count

We filter on the resource label container_name along with the cluster name; you may need additional filters, such as namespace, to identify a container uniquely. The 'count' aggregation function reduces the multiple time series to a single value: the number of containers that are up at a given time.
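The filter described above can be sketched as a small helper that assembles a Cloud Monitoring filter string. This is a minimal sketch, not the exact filter from the original screenshots: the helper name `build_uptime_filter` and the values `my-cluster`, `my-app`, and `prod` are placeholders, and the metric type `kubernetes.io/container/uptime` on the `k8s_container` resource is assumed to be the metric the console labels as container uptime.

```python
def build_uptime_filter(cluster_name, container_name, namespace=None):
    """Build a Cloud Monitoring filter string for the container uptime metric.

    Hypothetical helper: the label values passed in are placeholders.
    """
    clauses = [
        'metric.type = "kubernetes.io/container/uptime"',
        'resource.type = "k8s_container"',
        f'resource.labels.cluster_name = "{cluster_name}"',
        f'resource.labels.container_name = "{container_name}"',
    ]
    # Optional extra filter, needed when the same container name
    # exists in more than one namespace.
    if namespace is not None:
        clauses.append(f'resource.labels.namespace_name = "{namespace}"')
    return " AND ".join(clauses)


uptime_filter = build_uptime_filter("my-cluster", "my-app", namespace="prod")
print(uptime_filter)
```

Leaving out `namespace` produces the two-label filter described in the text; adding it narrows the match when a container name is reused across namespaces.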

Here, sum is used as the aligner, but any of the available aligner options can be used, because the count depends only on how many time series report data, not on the intermediate aligned values.
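To illustrate why the aligner choice does not matter, here is a rough sketch of the aggregation settings in the shape used by the Cloud Monitoring API, followed by a toy stand-in for the count reducer. The 60-second alignment period is an assumption, and `reduce_count` is a hypothetical local function that only mimics the idea; it is not part of any Google library.

```python
# Aggregation settings as they might appear in an alert policy condition.
# The alignment period of 60s is an assumed value.
aggregation = {
    "alignmentPeriod": "60s",
    "perSeriesAligner": "ALIGN_SUM",       # "sum" aligner from the text
    "crossSeriesReducer": "REDUCE_COUNT",  # "count" aggregation from the text
}


def reduce_count(aligned_points):
    """Toy model of a count reducer: how many series have a point.

    aligned_points maps a series name to its aligned value, or None
    when the series reported nothing in the alignment window.
    """
    return sum(1 for value in aligned_points.values() if value is not None)


# Three containers reporting very different uptime values still count as 3.
aligned = {"pod-a": 3600.0, "pod-b": 120.0, "pod-c": 45.5}
print(reduce_count(aligned))
```

The aligned values (3600.0, 120.0, 45.5) never enter the result; only the number of reporting series does.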

Policy Conditions

Condition for checking when the number of containers is below 5
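A threshold condition like this one can be sketched as a dictionary in the shape of the Cloud Monitoring `AlertPolicy` resource. The function name, the display name, the 60-second duration, and the shortened filter string are all assumptions for illustration; only the threshold of 5 comes from the text above.

```python
def container_count_condition(metric_filter, threshold, comparison="COMPARISON_LT"):
    """Build a threshold condition on the number of running containers.

    Hypothetical helper. Pass comparison="COMPARISON_GT" to alert when
    the count rises above the expected number instead.
    """
    return {
        "displayName": f"Container count vs {threshold}",
        "conditionThreshold": {
            "filter": metric_filter,
            "aggregations": [
                {
                    "alignmentPeriod": "60s",  # assumed value
                    "perSeriesAligner": "ALIGN_SUM",
                    "crossSeriesReducer": "REDUCE_COUNT",
                }
            ],
            "comparison": comparison,
            "thresholdValue": threshold,
            "duration": "60s",  # assumed: how long the breach must last
        },
    }


# Placeholder filter; a real one would also name the cluster and container.
example_filter = 'metric.type = "kubernetes.io/container/uptime"'
below_five = container_count_condition(example_filter, 5)
print(below_five["conditionThreshold"]["comparison"])
```

The same helper covers the "too many replicas" case from the introduction by switching the comparison.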

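A container that is not up produces no record in the metric, so the threshold condition alone cannot detect the case where no data is reported. We therefore need an additional metric-absence condition that triggers an alert when a container is missing from the metric output.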
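The absence check can be sketched in the same `AlertPolicy` shape using a `conditionAbsent` block. The function name, display name, and the 300-second duration are assumptions, as is the choice to reuse the same filter and aggregation as the threshold condition; the original screenshots may have used different values.

```python
def container_absent_condition(metric_filter, duration="300s"):
    """Build a metric-absence condition for the container uptime metric.

    Hypothetical helper. The condition is met when no data matching
    the filter arrives for the given duration (an assumed 5 minutes).
    """
    return {
        "displayName": "Container uptime data absent",
        "conditionAbsent": {
            "filter": metric_filter,
            "aggregations": [
                {
                    "alignmentPeriod": "60s",  # assumed value
                    "perSeriesAligner": "ALIGN_SUM",
                    "crossSeriesReducer": "REDUCE_COUNT",
                }
            ],
            "duration": duration,
        },
    }


# Placeholder filter; a real one would also name the cluster and container.
example_filter = 'metric.type = "kubernetes.io/container/uptime"'
absent = container_absent_condition(example_filter)
print(absent["conditionAbsent"]["duration"])
```

Unlike the threshold condition, this block has no comparison or threshold value: the missing data is itself the signal.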

An alert is triggered when any of the above conditions is met.
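"Any of the conditions" corresponds to the `OR` combiner of an alert policy. The sketch below shows only how the conditions are combined; the policy name and the stub conditions are placeholders, and a real policy would carry full condition bodies (filter, aggregation, threshold or absence settings).

```python
import json


def build_policy(display_name, conditions):
    """Wrap conditions in a policy that fires when any one of them is met."""
    return {
        "displayName": display_name,
        "combiner": "OR",  # alert if any condition is met
        "conditions": conditions,
    }


# Stub conditions for illustration only; the bodies are intentionally omitted.
conditions = [
    {"displayName": "Container count below 5"},
    {"displayName": "Container uptime data absent"},
]
policy = build_policy("Pod replica count", conditions)
print(json.dumps(policy, indent=2))
```

With complete condition bodies, a structure like this matches the request body expected by the `projects.alertPolicies.create` method of the Cloud Monitoring API.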
