Alexandr Bandurchin for Uptrace

Originally published at uptrace.dev

Kubernetes Microservices Monitoring and Observability with OpenTelemetry

Kubernetes orchestrates containers at scale, but this introduces monitoring challenges that don't exist with traditional deployments. Pods are ephemeral—they start, run, and terminate constantly. When a pod crashes and restarts, its logs disappear unless you capture them elsewhere. IP addresses change with each pod restart, making traditional host-based monitoring ineffective.

Microservices on Kubernetes compound these challenges. A single user request might traverse five services running in fifteen pods spread across multiple nodes. When something breaks, you need to trace that request through constantly changing infrastructure while correlating metrics from pods that might no longer exist.

Kubernetes Architecture

Kubernetes clusters consist of control plane components and worker nodes. The control plane manages cluster state through the API server, scheduler, and controller manager. Worker nodes run your applications in pods, with each node running kubelet to communicate with the control plane.

Pods are the smallest deployable units in Kubernetes. Each pod contains one or more containers sharing network and storage. Pods are ephemeral—Kubernetes creates and destroys them based on load, health checks, and deployment updates. This means pod names and IPs change constantly.

Services provide stable network endpoints for groups of pods. A service abstracts pod IPs behind a single DNS name and load balances traffic across healthy pods. This decouples applications from pod lifecycle—when pods restart, the service continues routing traffic to new instances.

Pod-Level Observability

Monitoring pods requires tracking both infrastructure metrics and application performance. Infrastructure metrics show resource usage and pod health. Application metrics reveal what the code actually does.

Kubernetes exposes pod metrics through the Metrics API. These include CPU usage, memory consumption, network traffic, and disk I/O. The metrics server collects this data from kubelet on each node and makes it available for queries and horizontal pod autoscaling.
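
For a quick look at this data from the command line, you can query the Metrics API directly (the production namespace is illustrative):

# Per-pod and per-container CPU/memory usage from the Metrics API
kubectl top pods -n production --containers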

apiVersion: v1
kind: Pod
metadata:
  name: order-service
  labels:
    app: order-service
    version: v1.2.0
spec:
  containers:
  - name: order-service
    image: order-service:v1.2.0
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"

Resource requests tell Kubernetes the minimum resources a pod needs. Limits cap maximum usage. When a pod exceeds its memory limit, Kubernetes kills it with an OOMKilled status. Monitoring these events reveals whether your resource limits match actual usage.
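
To check whether a terminated container was OOM killed, inspect its last termination state (the pod name matches the example above):

# Reason for the last container termination, e.g. OOMKilled, Error, or Completed
kubectl get pod order-service \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'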

OpenTelemetry on Kubernetes

OpenTelemetry provides automatic instrumentation for applications running in Kubernetes. The OpenTelemetry Operator injects instrumentation into pods without code changes.

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: default-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector:4317
  propagators:
    - tracecontext
    - baggage
  sampler:
    type: parentbased_traceidratio
    argument: "0.1"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest

Annotate your deployments to enable automatic instrumentation:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
      annotations:
        instrumentation.opentelemetry.io/inject-java: "true"
    spec:
      containers:
      - name: payment-service
        image: payment-service:v2.0.0

The operator injects an init container that adds OpenTelemetry libraries to your application. When the pod starts, instrumentation activates automatically, capturing HTTP requests, database queries, and external API calls.
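
To confirm that a pod actually received the injection, inspect its init containers (the pod name here is illustrative):

# List init container names; a successfully injected pod gains one added by the operator
kubectl get pod payment-service-7d8f9c-hx2k9 \
  -o jsonpath='{.spec.initContainers[*].name}'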

Service Mesh Observability

Service meshes like Istio and Linkerd add a sidecar proxy to each pod. These proxies handle all network traffic, providing observability without instrumenting application code.

The sidecar captures request metrics (rate, error rate, latency), generates distributed traces for every request, and collects access logs with full request context. This gives you network-level observability across all services.

apiVersion: v1
kind: Service
metadata:
  name: order-service
  labels:
    app: order-service
spec:
  ports:
  - port: 8080
    name: http
  selector:
    app: order-service
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
        version: v1
    spec:
      containers:
      - name: order-service
        image: order-service:v1.2.0
        ports:
        - containerPort: 8080

Istio automatically injects sidecars when you enable injection on a namespace:

kubectl label namespace production istio-injection=enabled

Sidecars export metrics in Prometheus format. Query these metrics to understand traffic patterns between services, identify slow dependencies, and detect error rate spikes.
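
For example, a PromQL query along these lines surfaces the 5xx error ratio per destination service (label names follow Istio's standard metrics):

# Share of requests that returned 5xx, per destination service, over the last 5 minutes
sum(rate(istio_requests_total{reporter="destination", response_code=~"5.."}[5m])) by (destination_service)
  /
sum(rate(istio_requests_total{reporter="destination"}[5m])) by (destination_service)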

Debugging Ephemeral Pods

When a pod crashes, its logs often disappear before you can examine them. Kubernetes provides mechanisms to access logs from terminated containers.

The kubectl logs command retrieves logs from the previous container instance:

kubectl logs payment-service-7d8f9c-hx2k9 --previous

This works until the pod restarts again. For persistent log storage, deploy a log aggregator that ships logs from all pods to centralized storage.

The OpenTelemetry Collector handles this. Deploy it as a DaemonSet so one instance runs on each node:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector
spec:
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
      - name: otel-collector
        image: otel/opentelemetry-collector-contrib:latest  # contrib distribution includes the filelog receiver
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log

The collector reads logs from all pods on the node and exports them to your observability backend. When pods crash, logs remain accessible.
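
The DaemonSet above still needs a collector configuration that enables log collection. A minimal sketch, assuming the contrib distribution (which ships the filelog receiver), the standard /var/log/pods layout under the mounted /var/log path, and a hypothetical otel-gateway OTLP endpoint:

receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log   # container log files written by the kubelet/runtime

processors:
  batch: {}                        # batch telemetry before export

exporters:
  otlp:
    endpoint: otel-gateway:4317    # hypothetical backend address
    tls:
      insecure: true

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [otlp]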

Container Resource Monitoring

Kubernetes monitors container resource usage through cAdvisor, which runs as part of kubelet. cAdvisor collects CPU, memory, network, and disk metrics for each container.

Monitor memory usage patterns to detect leaks. A container whose memory usage climbs steadily will eventually hit its limit and get killed. Track the container_memory_working_set_bytes metric to see actual memory consumption.

CPU throttling occurs when a container exceeds its CPU limit. Kubernetes throttles the container, making it run slower. The container_cpu_cfs_throttled_seconds_total metric shows cumulative throttled time. Rising throttling indicates your CPU limits are too low.
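
A couple of PromQL sketches over the cAdvisor metrics exposed by kubelet (the order-service selector is illustrative):

# Trend of working-set memory over 30 minutes; a persistently positive slope suggests a leak
deriv(container_memory_working_set_bytes{container="order-service"}[30m])

# Seconds of CPU throttling per second; sustained non-zero values mean the limit is too low
rate(container_cpu_cfs_throttled_seconds_total{container="order-service"}[5m])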

Health Checks

Kubernetes uses health checks to determine when to restart containers and when to route traffic to pods. Liveness probes check if a container is alive. If a liveness probe fails repeatedly, Kubernetes restarts the container. Readiness probes check if a container can accept traffic. Kubernetes removes pods with failing readiness probes from service endpoints.

apiVersion: v1
kind: Pod
metadata:
  name: payment-service
spec:
  containers:
  - name: payment-service
    image: payment-service:v2.0.0
    livenessProbe:
      httpGet:
        path: /health/live
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /health/ready
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5

The liveness probe checks /health/live every 10 seconds. The readiness probe checks /health/ready every 5 seconds. These endpoints should return HTTP 200 when healthy, 503 when unhealthy.

Implement these endpoints to check actual application health, not just that the process is running. Verify database connections work, required services are reachable, and caches are populated.
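
How to implement that varies by stack. As one sketch, a Spring Boot Actuator HealthIndicator (assuming a JDBC DataSource is available) that backs the readiness endpoint:

import javax.sql.DataSource;
import java.sql.Connection;

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// Readiness check that reports DOWN when the database connection cannot be validated
@Component
public class DatabaseReadinessIndicator implements HealthIndicator {

    private final DataSource dataSource;

    public DatabaseReadinessIndicator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        try (Connection connection = dataSource.getConnection()) {
            // isValid() runs a lightweight validation round-trip with a 2-second timeout
            return connection.isValid(2)
                ? Health.up().build()
                : Health.down().withDetail("database", "validation failed").build();
        } catch (Exception e) {
            return Health.down(e).build();
        }
    }
}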

Network Policy Observability

Network policies control traffic between pods. When policies block traffic, applications fail with connection timeouts or refused connections. Without visibility into network policy enforcement, these errors look like application bugs.

Service meshes add connection-level visibility that helps here. Istio's sidecars record every TCP connection they handle: the istio_tcp_connections_opened_total and istio_tcp_connections_closed_total metrics track connection counts with labels for source and destination services, so blocked or abruptly terminated traffic shows up in telemetry rather than only in application error logs.

Query these metrics to understand traffic patterns and identify blocked connections:

rate(istio_tcp_connections_closed_total{
  response_flags="DC"
}[5m])

The DC response flag marks a downstream connection termination. A sustained rise in these abrupt closures is a common symptom of traffic being cut off by policy enforcement or unhealthy endpoints, and is worth correlating with recent policy changes.

Node-Level Monitoring

Nodes provide the infrastructure where pods run. Node failures take down all pods on that node. Monitor node health to detect issues before they cause widespread failures.

Key node metrics include CPU usage, memory usage, disk space, and network bandwidth. The node_memory_MemAvailable_bytes metric shows available memory. When this drops too low, Kubernetes starts evicting pods.

The node_disk_io_time_seconds_total metric tracks disk I/O time. High I/O times indicate disk saturation, which slows all containers on the node.

Monitor node conditions through the Kubernetes API. The Ready condition indicates whether the node can accept new pods. The DiskPressure condition signals low disk space. The MemoryPressure condition signals low available memory.
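
Assuming node_exporter (or an equivalent host-metrics source), queries like these capture the thresholds described above:

# Available memory as a fraction of total; nodes below ~10% are at risk of evicting pods
node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.10

# Time spent doing disk I/O per second; values approaching 1 indicate a saturated disk
rate(node_disk_io_time_seconds_total[5m])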

Distributed Tracing

Distributed tracing shows request paths through microservices. In Kubernetes, traces must handle the dynamic nature of pods—services scale up and down, pods restart frequently, and IPs change constantly.

OpenTelemetry propagates trace context through service calls. When Service A calls Service B, OpenTelemetry injects trace context into HTTP headers or message metadata. Service B extracts this context and continues the trace.
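
With the tracecontext propagator configured in the Instrumentation resource above, that context travels as a W3C traceparent header, for example:

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01

The segments are the version, the 128-bit trace ID, the parent span ID, and the trace flags (01 means sampled).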

Kubernetes labels and annotations help correlate traces with infrastructure. Add pod name, namespace, and node name to trace attributes:

// Attach Kubernetes metadata so traces can be joined with pod-level metrics
Span span = tracer.spanBuilder("process-payment")
    .setAttribute("k8s.pod.name", System.getenv("HOSTNAME"))
    .setAttribute("k8s.namespace.name", System.getenv("K8S_NAMESPACE"))
    .setAttribute("k8s.node.name", System.getenv("K8S_NODE_NAME"))
    .startSpan();
// ... process the payment, then call span.end() so the span is exported

This links traces to specific pod instances. When investigating slow requests, you can identify which pod processed them and check that pod's resource usage at that time.
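
The K8S_NAMESPACE and K8S_NODE_NAME variables used above are not set automatically; one way to provide them is the Kubernetes Downward API in the container spec:

    env:
    - name: K8S_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: K8S_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName

HOSTNAME already defaults to the pod name, so it needs no extra configuration.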

Horizontal Pod Autoscaling

Horizontal Pod Autoscaler (HPA) scales deployments based on metrics. When CPU usage exceeds a threshold, HPA increases replica count. When usage drops, HPA decreases replicas.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payment-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Monitor HPA decisions to understand scaling behavior. The kube_horizontalpodautoscaler_status_current_replicas metric shows current replica count. Compare this against kube_horizontalpodautoscaler_status_desired_replicas to see if HPA can achieve its target.

If the desired replica count exceeds the current count for extended periods, you've hit cluster capacity limits or pod scheduling constraints.
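
Assuming kube-state-metrics is installed, that gap is easy to watch directly:

# Replicas the HPA wants but cannot get; a sustained positive value signals capacity or scheduling limits
kube_horizontalpodautoscaler_status_desired_replicas{horizontalpodautoscaler="payment-service-hpa"}
  - kube_horizontalpodautoscaler_status_current_replicas{horizontalpodautoscaler="payment-service-hpa"}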

Monitoring Best Practices

Use labels consistently across all resources. Add labels for service name, version, environment, and team. This enables filtering and grouping in dashboards and queries.

Set up alerts for critical pod states: CrashLoopBackOff indicates repeated startup failures, ImagePullBackOff signals registry access problems, and OOMKilled shows memory limits are too low. These states require immediate investigation.
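
With kube-state-metrics available, these states can be alerted on directly:

# Containers stuck in startup or image-pull failure loops
kube_pod_container_status_waiting_reason{reason=~"CrashLoopBackOff|ImagePullBackOff"} > 0

# Containers whose most recent termination was an OOM kill
kube_pod_container_status_last_terminated_reason{reason="OOMKilled"} > 0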

Monitor the gap between resource requests and actual usage. Requesting more resources than needed wastes cluster capacity. Requesting too little causes throttling and OOM kills. Track the container_memory_working_set_bytes / container_spec_memory_limit_bytes ratio. Values consistently near 1.0 indicate tight limits.
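
A sketch of that ratio in PromQL, excluding pod-level series and containers that have no memory limit (reported as 0):

# Working-set memory as a fraction of the configured limit; values near 1.0 mean imminent OOM kills
container_memory_working_set_bytes{container!=""}
  / (container_spec_memory_limit_bytes{container!=""} > 0)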

Implement distributed tracing for all service-to-service communication. This reveals request paths, identifies slow dependencies, and helps diagnose cascading failures.

Uptrace for Kubernetes

Uptrace integrates with Kubernetes through OpenTelemetry and can itself be deployed on Kubernetes. Deploy the OpenTelemetry Operator and Collector to your cluster. Configure your applications to export telemetry to the collector. The collector forwards data to Uptrace.

Uptrace correlates metrics, logs, and traces across your entire cluster. When a pod crashes, view its final logs alongside traces of the requests it was processing. When latency spikes, see which pods were handling requests and their resource usage at that time.

For Spring Boot microservices on Kubernetes, combine the patterns from Spring Boot monitoring with Kubernetes-native instrumentation. For event-driven systems, check Kafka microservices monitoring.

Getting Started

Start with the basics: deploy the Metrics Server to enable resource metrics. This gives you CPU and memory usage for pods and nodes.

Add OpenTelemetry Operator to enable automatic instrumentation. Annotate your deployments to inject instrumentation without code changes.

Deploy OpenTelemetry Collector as a DaemonSet to collect logs and metrics from all nodes. Configure it to export to your observability backend.

Implement proper health checks on all services. Use liveness probes to detect crashed containers and readiness probes to manage traffic routing.

Set up alerts for pod states that indicate problems—CrashLoopBackOff, ImagePullBackOff, OOMKilled. These require immediate action.

Ready to monitor Kubernetes microservices? Start with Uptrace for unified observability across your cluster.
