Matheus

Posted on • Originally published at releaserun.com

Kubernetes Events Explained: Types, kubectl Commands, and Observability Patterns

What Are Kubernetes Events?

Every time something happens inside a Kubernetes cluster -- a pod gets scheduled, a container image is pulled, a volume fails to mount -- the control plane records it as an Event. Events are first-class API objects (kind: Event) that provide a running log of what is happening across your nodes, pods, deployments, and other resources.

Unlike application logs, which capture output from your code, Kubernetes events describe the lifecycle of cluster objects themselves. They answer questions like: Why is this pod stuck in Pending? Why did that node go NotReady? Why was my container OOM-killed?

Events are stored in etcd alongside other API objects and are accessible through the Kubernetes API. They are namespaced resources: each event belongs to a specific namespace (node-level events typically land in the default namespace). Understanding how to read, filter, and export events is one of the most practical debugging skills a Kubernetes operator can develop.
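Because events are ordinary API objects, you can also fetch them straight from the API server without kubectl's table formatting. A sketch, assuming a reachable cluster and the default namespace:

```shell
# Raw Event objects for the default namespace, as returned by the API server
kubectl get --raw /api/v1/namespaces/default/events

# The same list through the typed client, as JSON
kubectl get events -n default -o json
```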

Event Types: Normal and Warning

Kubernetes classifies every event into one of two types:

  • Normal -- Indicates that something expected happened. A pod was scheduled, a container started, a volume was successfully attached. These events confirm that the system is working as intended.
  • Warning -- Indicates that something unexpected or potentially problematic occurred. A container crashed, an image pull failed, a node ran out of resources. Warning events are the ones you typically want to monitor and alert on.

Here is an example of what a Normal event looks like when a pod starts successfully:

LAST SEEN   TYPE     REASON      OBJECT          MESSAGE
2m          Normal   Scheduled   pod/web-abc12   Successfully assigned default/web-abc12 to node-3
2m          Normal   Pulling     pod/web-abc12   Pulling image "nginx:1.27"
2m          Normal   Pulled      pod/web-abc12   Successfully pulled image "nginx:1.27" in 1.2s
2m          Normal   Created     pod/web-abc12   Created container nginx
2m          Normal   Started     pod/web-abc12   Started container nginx

And here is a Warning event when something goes wrong:

LAST SEEN   TYPE      REASON             OBJECT          MESSAGE
30s         Warning   FailedScheduling   pod/web-xyz99   0/5 nodes are available: 5 Insufficient memory.

Anatomy of a Kubernetes Event

Each event object contains several fields that together tell you exactly what happened, to which object, and when. Understanding these fields is essential for effective debugging.

Key Event Fields

  • type -- Either Normal or Warning.
  • reason -- A short, CamelCase string that categorizes the event. Examples: Scheduled, Pulling, FailedMount, BackOff.
  • message -- A human-readable description of what happened.
  • involvedObject -- The API object the event is about, including its kind, name, namespace, and uid.
  • source -- The component that generated the event (e.g., kubelet, default-scheduler, kube-controller-manager).
  • count -- How many times this event has occurred. Kubernetes deduplicates repeated events and increments this counter instead of creating new objects.
  • firstTimestamp -- When the event was first recorded.
  • lastTimestamp -- When the event was most recently recorded.

Here is a full event object in YAML format:

apiVersion: v1
kind: Event
metadata:
  name: web-abc12.17f3a2b8c9d1e4f6
  namespace: default
  creationTimestamp: "2026-02-16T10:30:00Z"
involvedObject:
  apiVersion: v1
  kind: Pod
  name: web-abc12
  namespace: default
  uid: a1b2c3d4-e5f6-7890-abcd-ef1234567890
reason: BackOff
message: "Back-off restarting failed container nginx in pod web-abc12_default"
source:
  component: kubelet
  host: node-3
type: Warning
count: 5
firstTimestamp: "2026-02-16T10:25:00Z"
lastTimestamp: "2026-02-16T10:30:00Z"

Viewing Events with kubectl

The most common way to inspect events is through kubectl. Here are the commands you will use most often.

List All Events in the Current Namespace

kubectl get events

This returns events in your current context's namespace (default unless you have switched it). To see events in a different namespace, add -n <namespace>. To see events across all namespaces:

kubectl get events --all-namespaces

Sort Events by Time

By default, events are not guaranteed to be in chronological order. Sort them by creation timestamp to see the most recent activity:

kubectl get events --sort-by=.metadata.creationTimestamp

This is one of the most useful flags when triaging an incident. It lets you reconstruct a timeline of what happened in the cluster.
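Recent kubectl versions (v1.26 and later) also ship a dedicated kubectl events subcommand that sorts chronologically by default and can scope to a single object with --for:

```shell
# Events for one pod, newest last, streaming new events as they arrive
kubectl events --for pod/web-abc12 --watch

# Only Warning events in the current namespace
kubectl events --types=Warning
```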

Filter Events by Type

To see only Warning events, which are typically the ones that matter during debugging:

kubectl get events --field-selector type=Warning

You can also filter by the involved object. For example, to see events for a specific pod:

kubectl get events --field-selector involvedObject.name=web-abc12

Or combine multiple field selectors:

kubectl get events --field-selector type=Warning,involvedObject.kind=Pod

View Events via kubectl describe

The kubectl describe command shows events at the bottom of its output for any resource. This is often the fastest way to check events for a specific pod:

kubectl describe pod web-abc12

The Events section at the bottom will show recent events related to that pod, sorted chronologically. This is usually the first command you run when a pod is misbehaving.

Wide Output and Custom Columns

For more detail, use wide output or custom columns:

kubectl get events -o wide

Or extract specific fields with JSONPath:

kubectl get events -o jsonpath='{range .items[*]}{.lastTimestamp}{"\t"}{.type}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'

Common Warning Events and What They Mean

Certain warning events appear frequently in production clusters. Knowing what they mean and how to respond to them will save you significant debugging time.

FailedScheduling

Warning   FailedScheduling   pod/app-xyz   0/5 nodes are available: 2 Insufficient cpu, 3 Insufficient memory.

The scheduler cannot find a node with enough resources to place the pod. This usually means you need to scale up your node pool, reduce resource requests, or free up capacity by evicting lower-priority workloads. Check your resource requests and limits against actual node capacity.

ImagePullBackOff

Warning   Failed    pod/app-xyz   Failed to pull image "myregistry.io/app:v2.1": rpc error: unauthorized
Warning   BackOff   pod/app-xyz   Back-off pulling image "myregistry.io/app:v2.1"

The kubelet cannot pull the container image. Common causes include incorrect image tags, missing or expired registry credentials (imagePullSecrets), or network connectivity issues to the registry. To debug, verify the image name and tag are correct, confirm that the imagePullSecret exists in the pod's namespace and contains valid credentials, and test registry connectivity from the node with curl or crictl pull.
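The verification steps above can be sketched as a few commands; the secret name regcred and namespace prod are placeholders for your own values:

```shell
# 1. Confirm the image reference the pod actually uses
kubectl get pod app-xyz -n prod -o jsonpath='{.spec.containers[*].image}{"\n"}'

# 2. Confirm the pull secret exists and inspect its registry credentials
kubectl get secret regcred -n prod -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d

# 3. From a node, test that the registry is reachable and the image pullable
crictl pull myregistry.io/app:v2.1
```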

BackOff (CrashLoopBackOff)

Warning   BackOff   pod/app-xyz   Back-off restarting failed container app in pod app-xyz_default

The container keeps crashing and Kubernetes is applying an exponential back-off delay before restarting it. Check the container logs with kubectl logs app-xyz --previous to see why the application is crashing.
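A quick triage sequence for a crash-looping container, using the pod name from the event:

```shell
# Logs from the crashed (previous) container instance
kubectl logs app-xyz --previous

# Exit code and reason of the last termination (e.g. "1 Error" or "137 OOMKilled")
kubectl get pod app-xyz -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}{" "}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}'
```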

Unhealthy (Liveness/Readiness Probe Failures)

Warning   Unhealthy   pod/app-xyz   Readiness probe failed: HTTP probe failed with statuscode: 503
Warning   Unhealthy   pod/app-xyz   Liveness probe failed: connection refused

The kubelet's health check probes are failing. If the liveness probe fails, Kubernetes will restart the container. If the readiness probe fails, the pod is removed from service endpoints. Review your probe configuration -- the path, port, and timeout values -- and verify that your application is actually healthy on those endpoints.
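For reference, a sketch of a typical probe configuration; the endpoint paths and port are assumptions to adapt to your application:

```yaml
livenessProbe:
  httpGet:
    path: /healthz          # assumed endpoint; must exist in your app
    port: 8080
  initialDelaySeconds: 10   # give the app time to boot before the first check
  periodSeconds: 10
  timeoutSeconds: 2
  failureThreshold: 3       # restart only after 3 consecutive failures
readinessProbe:
  httpGet:
    path: /ready            # assumed endpoint
    port: 8080
  periodSeconds: 5
```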

FailedMount and FailedAttachVolume

Warning   FailedMount          pod/db-abc   Unable to attach or mount volumes: timed out waiting for the condition
Warning   FailedAttachVolume   pod/db-abc   Multi-Attach error for volume "pvc-123": Volume is already attached to node-1

The pod's volume cannot be attached or mounted. This is common with cloud block storage (EBS, Persistent Disk) when a volume is still attached to a previous node after a failover. Some storage backends do not support ReadWriteMany access mode. When you see this event, check the PersistentVolumeClaim status with kubectl get pvc and verify the volume's availability in your cloud provider's console. In many cases, force-detaching the volume from the old node resolves the issue.
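A minimal check sequence for stuck volumes; the claim name db-data is illustrative:

```shell
# Is the claim bound, and to which PersistentVolume?
kubectl get pvc db-data -o wide

# Which node does the cluster think the volume is attached to?
kubectl get volumeattachments | grep pvc-123

# Events on the PVC itself often carry the provisioner's error message
kubectl describe pvc db-data
```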

OOMKilling

Warning   OOMKilling   pod/app-xyz   Memory cgroup out of memory: Killed process 12345 (java)

The container exceeded its memory limit and was killed by the kernel's OOM killer. Either the memory limit is too low for the workload, or the application has a memory leak. Increase the memory limit or investigate the application's memory usage patterns. For more on diagnosing node-level issues, see our guide to debugging Kubernetes nodes in NotReady state.
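You can confirm an OOM kill from the pod status and compare it against the configured limit (pod and container names are from the example above):

```shell
# Reason of the last termination; "OOMKilled" confirms the kernel killed it
kubectl get pod app-xyz -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}'

# The memory limit the container was killed against
kubectl get pod app-xyz -o jsonpath='{.spec.containers[0].resources.limits.memory}{"\n"}'
```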

NodeNotReady

Warning   NodeNotReady   node/node-3   Node node-3 status is now: NodeNotReady

A node has stopped reporting its status to the control plane. This can be caused by kubelet crashes, network partitions, or the node running out of resources (disk pressure, memory pressure, PID pressure). All pods on the affected node will eventually be rescheduled to other nodes after the pod-eviction-timeout expires (default: 5 minutes). Monitor for this event closely in production -- it often indicates a node that needs investigation or replacement. For a detailed troubleshooting guide, see our article on debugging Kubernetes nodes in NotReady state.
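When this fires, inspect the node's conditions to see which pressure or status check is failing:

```shell
# One line per condition: Ready, MemoryPressure, DiskPressure, PIDPressure, ...
kubectl get node node-3 -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'

# Full picture: conditions, capacity, allocated resources, and recent node events
kubectl describe node node-3
```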

Event Retention and the Default TTL

One of the most important things to understand about Kubernetes events is that they are ephemeral by default. The kube-apiserver has a default event time-to-live (TTL) of 1 hour. After that, events are garbage-collected from etcd.

This means that if you look at events after an incident that happened two hours ago, they will already be gone. This is one of the main reasons teams set up event exporters (covered in the next section). The short default TTL is intentional -- events can be high-volume in large clusters, and storing them indefinitely in etcd would increase storage and memory pressure on the control plane.

Configuring the Event TTL

You can change the default TTL by passing the --event-ttl flag to the kube-apiserver:

# In the kube-apiserver manifest (e.g., /etc/kubernetes/manifests/kube-apiserver.yaml)
spec:
  containers:
  - command:
    - kube-apiserver
    - --event-ttl=6h
    # ... other flags

Increasing the TTL gives you a longer window to inspect events, but it also increases the load on etcd since more objects are stored. For most production clusters, 2-6 hours is a reasonable range. Beyond that, you should be exporting events to an external system.

If you are planning a cluster upgrade, be aware that changes to apiserver flags may need to be reapplied. Our Kubernetes upgrade checklist covers these considerations.

Exporting Events for Long-Term Observability

Since events are garbage-collected after the TTL expires, exporting them to an external logging or observability platform is essential for production clusters. Several tools are available for this purpose.

Kubernetes Event Exporter

The Kubernetes Event Exporter (originally by Opsgenie, now maintained by Resmo) watches the event stream and forwards events to sinks like Elasticsearch, OpenSearch, Slack, webhooks, or files.

Here is a minimal configuration that forwards Warning events to Elasticsearch:

apiVersion: v1
kind: ConfigMap
metadata:
  name: event-exporter-cfg
  namespace: monitoring
data:
  config.yaml: |
    logLevel: error
    logFormat: json
    route:
      routes:
        - match:
            - receiver: "elasticsearch"
              type: Warning
    receivers:
      - name: "elasticsearch"
        elasticsearch:
          hosts:
            - "http://elasticsearch.monitoring.svc:9200"
          index: kube-events
          indexFormat: "kube-events-{2006-01-02}"

Fluentd and Fluent Bit

If you already run Fluentd or Fluent Bit for log collection, you can configure them to collect Kubernetes events as well. Fluent Bit has a built-in kubernetes_events input plugin:

[INPUT]
    Name              kubernetes_events
    Tag               kube_events.*
    Kube_URL          https://kubernetes.default.svc:443
    Kube_CA_File      /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File   /var/run/secrets/kubernetes.io/serviceaccount/token

[OUTPUT]
    Name              es
    Match             kube_events.*
    Host              elasticsearch.monitoring.svc
    Port              9200
    Index             kube-events
    Type              _doc

Kubernetes Event Router (Heptio/VMware)

The Event Router is a simpler alternative that captures events and writes them to stdout in a structured format. You can then collect that stdout with any log aggregation system (Fluentd, Promtail, Vector, etc.):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: eventrouter
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: eventrouter
  template:
    metadata:
      labels:
        app: eventrouter
    spec:
      serviceAccountName: eventrouter
      containers:
      - name: kube-eventrouter
        image: gcr.io/heptio-images/eventrouter:v0.4
        volumeMounts:
        - name: config-volume
          mountPath: /etc/eventrouter
      volumes:
      - name: config-volume
        configMap:
          name: eventrouter-cm

Prometheus and Alerting

While events themselves are not natively exposed as Prometheus metrics, you can use kube-state-metrics to generate metrics from events. The kube_pod_status_reason and similar metrics can trigger alerts for patterns like repeated OOMKills or CrashLoopBackOffs. You can also build custom Prometheus alerts that fire when specific event patterns appear in your exported event data, creating a bridge between Kubernetes events and your alerting infrastructure.
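As an illustration, here is a PrometheusRule that alerts on repeated container restarts, a common symptom behind BackOff events. The metric comes from kube-state-metrics; the thresholds are assumptions to tune for your cluster:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kube-event-symptoms
  namespace: monitoring
spec:
  groups:
  - name: pod-restarts
    rules:
    - alert: PodRestartingFrequently
      # More than 3 restarts in 15 minutes suggests a crash loop
      expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "{{ $labels.namespace }}/{{ $labels.pod }} is restarting frequently"
```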

Events in Modern Kubernetes: events.k8s.io/v1

Historically, Kubernetes events used the core v1 API (apiVersion: v1, kind: Event). In Kubernetes 1.19, the events.k8s.io API group graduated to v1 with several improvements. As of Kubernetes 1.35, this is the recommended API for working with events. For a full overview of what changed in this release, see our Kubernetes 1.35 release preview.

Key Changes in events.k8s.io/v1

  • regarding -- Replaces involvedObject. Contains a reference to the primary object the event is about.
  • related -- A new field that provides a reference to a secondary object. For example, if a pod event is related to a specific node, the node reference goes here.
  • reportingController -- Replaces source.component. A string identifying the controller that reported the event (e.g., k8s.io/kubelet).
  • reportingInstance -- Replaces source.host. Identifies the specific instance of the controller.
  • note -- Replaces message. A human-readable description of the event.
  • series -- Replaces count, firstTimestamp, and lastTimestamp with a structured EventSeries object that tracks recurring events more efficiently.

Here is what a modern event looks like in the new API:

apiVersion: events.k8s.io/v1
kind: Event
metadata:
  name: web-abc12.a1b2c3d4e5f6
  namespace: default
regarding:
  apiVersion: v1
  kind: Pod
  name: web-abc12
  namespace: default
related:
  apiVersion: v1
  kind: Node
  name: node-3
reason: BackOff
note: "Back-off restarting failed container nginx in pod web-abc12_default"
type: Warning
reportingController: kubelet
reportingInstance: node-3
eventTime: "2026-02-16T10:30:00.000000Z"
action: Restarting
series:
  count: 5
  lastObservedTime: "2026-02-16T10:30:00.000000Z"
Enter fullscreen mode Exit fullscreen mode

Practical Recipes for Event-Driven Debugging

Here are some workflows that combine event inspection with other kubectl commands to quickly diagnose common issues.

Recipe 1: Why Is My Pod Pending?

# Check events for the pending pod
kubectl get events --field-selector involvedObject.name=my-pod --sort-by=.metadata.creationTimestamp

# Look for FailedScheduling reason and read the message
# Common causes: insufficient CPU/memory, node affinity/anti-affinity rules,
# taints without matching tolerations, PVC not bound

# Check node resource availability
kubectl describe nodes | grep -A 5 "Allocated resources"

Recipe 2: Find All Failing Pods in a Namespace

# Get all Warning events in the production namespace, sorted by time
kubectl get events -n production \
  --field-selector type=Warning \
  --sort-by=.metadata.creationTimestamp \
  -o custom-columns=TIME:.lastTimestamp,REASON:.reason,OBJECT:.involvedObject.name,MESSAGE:.message

Recipe 3: Monitor Events in Real Time

# Watch events as they happen (like tail -f for events)
kubectl get events --watch

# Watch only warnings across all namespaces
kubectl get events --all-namespaces --field-selector type=Warning --watch

Recipe 4: Audit Node Stability

# Check events for a specific node
kubectl get events --field-selector involvedObject.kind=Node,involvedObject.name=node-3

# Look for patterns: NodeNotReady, NodeHasDiskPressure, NodeHasMemoryPressure,
# NodeHasInsufficientPID, NodeRebooted

Best Practices for Working with Kubernetes Events

  • Export events to a durable store. The 1-hour default TTL means events vanish quickly. Use an event exporter, Fluent Bit, or another tool to ship events to Elasticsearch, Loki, or your SIEM.
  • Alert on Warning events. Set up alerts for high-frequency warnings like OOMKilling, FailedScheduling, and CrashLoopBackOff. Track event counts over time to catch trends.
  • Use field selectors in scripts. When building automation, use --field-selector to filter events server-side rather than piping through grep. This reduces the load on the API server.
  • Correlate events with logs and metrics. Events tell you what happened at the orchestration layer. Combine them with container logs (the why) and metrics (the how much) for a complete picture.
  • Increase the TTL for staging and CI clusters. In environments where you debug after the fact, set --event-ttl=12h or higher to keep events around longer.
  • Treat events as a first-class observability signal. Events are often overlooked in favor of logs and metrics, but they provide the clearest view into Kubernetes control-plane decisions like scheduling, scaling, and health checking.

Summary

Kubernetes events are the cluster's built-in audit trail. They record every significant lifecycle change -- from pod scheduling to volume attachment to node health transitions. By mastering kubectl get events with field selectors and time sorting, setting up event exporters for long-term retention, and alerting on Warning-type events, you gain deep visibility into what your cluster is doing and why.

The shift to the events.k8s.io/v1 API brings cleaner semantics with regarding/related fields and better deduplication through the series structure. Whether you are debugging a single failing pod or building a comprehensive observability stack, events should be one of the first signals you reach for.

