Sadeek M

Debugging a Kubernetes Cluster, Part 2

Debugging a Kubernetes cluster requires a solid understanding of its components and their interdependencies. Part 2 of this guide covers advanced debugging techniques for common cluster issues:

  1. Node Issues

A. Node Not Ready

Check Node Status:

kubectl get nodes
kubectl describe node <node-name>

Inspect Kubelet Logs:
SSH into the node and review the logs for errors:

journalctl -u kubelet -l

Possible Causes:
Resource exhaustion (e.g., CPU, memory, disk).
Misconfigured networking (e.g., unable to reach the API server).
Issues with container runtime (Docker, containerd).
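
As a starting point for the checks above, here is a minimal sketch that lists only the nodes that are not Ready, so each one can be described in turn. The function name `not_ready_nodes` is made up for this example, and it assumes the default column layout of `kubectl get nodes` (name first, status second).

```shell
# Print the names of nodes whose STATUS column does not start with "Ready".
# Cordoned nodes ("Ready,SchedulingDisabled") count as ready here.
not_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 !~ /^Ready/ { print $1 }'
}

# Usage (requires access to a cluster):
#   for n in $(not_ready_nodes); do kubectl describe node "$n"; done
```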

B. Node Disk Pressure or Memory Pressure

Check Allocations:

kubectl describe node <node-name> | grep -A 10 "Allocated resources"

Clean Up Disk Space:
Remove stopped containers, unused networks, and dangling images on nodes that use Docker:

docker system prune

On nodes that use containerd, crictl rmi --prune removes unused images instead.

Reconfigure Resource Limits:
Adjust resource requests and limits for pods.
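
To confirm which pressure condition a node is actually reporting, the node's conditions can be read directly. The sketch below is one way to do that; `node_pressure` is a hypothetical helper name, not a kubectl feature.

```shell
# Print every pressure condition that is currently True on a node,
# for example "DiskPressure=True". Prints nothing when no pressure is reported.
node_pressure() {
  kubectl get node "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}' |
    grep 'Pressure=True'
}

# Usage (requires access to a cluster):
#   node_pressure <node-name>
```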

  2. Pod Issues

A. Pod Stuck in Pending

Inspect Events:

kubectl describe pod <pod-name>

Possible Causes:
Insufficient resources: Check node capacity and pod requests.
Scheduling constraints: Inspect nodeSelector, taints, and tolerations.
Networking issues: If the pod has been scheduled but stays in ContainerCreating, ensure the CNI plugin is functioning correctly.
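
When several pods are stuck, it helps to list them all before describing each one. This is a rough sketch using a field selector on the pod phase; the helper name `pending_pods` is an assumption made for this example.

```shell
# List every Pending pod in the cluster as "namespace/name".
pending_pods() {
  kubectl get pods --all-namespaces --no-headers \
    --field-selector=status.phase=Pending | awk '{ print $1 "/" $2 }'
}

# Usage (requires access to a cluster):
#   pending_pods
#   kubectl get events -n <namespace> \
#     --field-selector involvedObject.name=<pod-name>,reason=FailedScheduling
```

The FailedScheduling events usually say which constraint (resources, taints, node selectors) blocked the scheduler.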

B. CrashLoopBackOff

View Logs:

kubectl logs <pod-name> --previous

Check Events:

kubectl describe pod <pod-name>

Debugging Steps:
Ensure the container's entrypoint is correct.
Verify environment variables and mounted volumes.
Test locally using the same image.
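
The exit code of the last crashed container often narrows down which of these steps to try first. The mapping below is a sketch based on common shell and signal conventions, not a guarantee; `explain_exit_code` is a hypothetical helper.

```shell
# Map a container exit code to a likely cause (conventions, not guarantees).
explain_exit_code() {
  case "$1" in
    0)   echo "exited cleanly; the process may not be a long-running service" ;;
    126) echo "command found but not executable; check entrypoint permissions" ;;
    127) echo "command not found; check the entrypoint or command path" ;;
    137) echo "killed with SIGKILL; often the OOM killer, compare memory limits with usage" ;;
    143) echo "terminated with SIGTERM; the container was asked to stop" ;;
    *)   echo "application error; read the output of kubectl logs --previous" ;;
  esac
}

# Usage (requires access to a cluster; assumes a single-container pod):
#   code=$(kubectl get pod <pod-name> \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')
#   explain_exit_code "$code"
```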

C. Container Image Pull Issues

Inspect Events:

kubectl describe pod <pod-name>

Common Errors:
Unauthorized: Verify image pull secrets.
Image not found: Confirm the image exists in the registry.

  3. Networking Issues

A. Pods Can't Communicate

Ping Other Pods:

kubectl exec -it <pod-name> -- ping <pod-ip>

Check Network Policies:

kubectl get networkpolicy -n <namespace>

Debugging CNI Plugins:
Inspect the CNI plugin's logs on the node:

cat /var/log/containers/<cni-plugin-name>*.log

B. Service Not Accessible

Check Service Description:

kubectl describe svc <service-name>

Inspect Endpoints:

kubectl get endpoints <service-name>

Test Connectivity:
From within a pod:

curl http://<service-name>.<namespace>:<port>
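
An empty endpoints list is one of the most common reasons a Service does not respond: either its selector matches no pods, or the matching pods are not Ready. The sketch below turns the endpoints check into a yes/no test; `service_has_endpoints` is a hypothetical helper name.

```shell
# Succeed (exit 0) when the Service has at least one ready endpoint address.
# $1 = service name, $2 = namespace.
service_has_endpoints() {
  ips=$(kubectl get endpoints "$1" -n "$2" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  [ -n "$ips" ]
}

# Usage (requires access to a cluster):
#   service_has_endpoints <service-name> <namespace> ||
#     echo "no ready endpoints: compare the Service selector with the pod labels"
```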

  4. API Server Issues

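Inspect Logs:
If the API server runs as a systemd unit:

journalctl -u kube-apiserver

On kubeadm-based clusters it runs as a static pod in kube-system, so read the pod logs instead:

kubectl logs -n kube-system kube-apiserver-<node-name>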

Test API Server Availability:

kubectl get --raw '/readyz?verbose'

The older /healthz endpoint still responds, but it has been deprecated since Kubernetes v1.16 in favor of /livez and /readyz.

Common Causes:
SSL/TLS issues: Check certificates and CA bundle.
Resource bottlenecks: Monitor CPU/memory usage.
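
When the API server is unhealthy, the verbose health endpoint reports each internal check on its own line, prefixed with `[+]` for passing and `[-]` for failing. The sketch below filters for the failing ones using curl. It assumes the default setting that allows anonymous access to the health endpoints, and the helper name `failed_readyz_checks` and the port in the usage line are assumptions.

```shell
# Print only the failing readiness checks reported by the API server.
# $1 is the API server base URL.
# -k skips TLS verification, which is acceptable only for a quick manual check.
failed_readyz_checks() {
  curl -sk "$1/readyz?verbose" | grep '^\[-\]'
}

# Usage (from a host that can reach the API server; port 6443 is an assumption):
#   failed_readyz_checks https://<control-plane-host>:6443
```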

  5. Persistent Volume Issues

A. PVC Pending

Inspect Events:

kubectl describe pvc <pvc-name>

Common Causes:
No matching StorageClass.
Insufficient storage on nodes.
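
The first cause can be checked mechanically: read the StorageClass the claim asks for and see whether it exists. This is a minimal sketch; `pvc_storageclass_exists` is a hypothetical helper name.

```shell
# Check whether the StorageClass requested by a PVC exists.
# $1 = PVC name, $2 = namespace.
pvc_storageclass_exists() {
  sc=$(kubectl get pvc "$1" -n "$2" -o jsonpath='{.spec.storageClassName}')
  if [ -z "$sc" ]; then
    echo "PVC $1 names no StorageClass; it relies on a default class or a pre-created PV"
    return 0
  fi
  if kubectl get storageclass "$sc" >/dev/null 2>&1; then
    echo "StorageClass $sc exists"
  else
    echo "StorageClass $sc not found"
    return 1
  fi
}

# Usage (requires access to a cluster):
#   pvc_storageclass_exists <pvc-name> <namespace>
```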

B. PV Bound But Pod Can't Mount

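Inspect Events:
A container that cannot mount its volume never starts, so it has no logs yet. Look for FailedMount or FailedAttachVolume events instead:

kubectl describe pod <pod-name>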

Debugging Steps:
Verify volume permissions.
Test mounting the volume manually on a node.

  6. Cluster DNS Issues

Test DNS Resolution:

kubectl exec -it <pod-name> -- nslookup <service-name>

Inspect CoreDNS Logs:

kubectl logs -n kube-system <coredns-pod-name>

Common Fixes:
Restart CoreDNS pods if unresponsive.
Validate ConfigMap for CoreDNS (kubectl get cm -n kube-system coredns).
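
To compare DNS behavior across several pods or names, the resolution test can be wrapped in a small function. This sketch assumes the pod's image ships `nslookup`; the helper name `dns_check` is made up for this example.

```shell
# Report whether a name resolves from inside a pod.
# $1 = pod name, $2 = DNS name to resolve.
# Note: a missing pod or a missing nslookup binary is also reported as FAIL.
dns_check() {
  if kubectl exec "$1" -- nslookup "$2" >/dev/null 2>&1; then
    echo "ok: $2 resolves from $1"
  else
    echo "FAIL: $2 does not resolve from $1"
  fi
}

# Usage (requires access to a cluster):
#   dns_check <pod-name> kubernetes.default
```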

  7. Troubleshooting Tools

A. kubectl Debugging Tools

Debug running pods:

kubectl exec -it <pod-name> -- /bin/sh

Debug pods with ephemeral containers (enabled by default since Kubernetes v1.23, stable in v1.25):

kubectl debug -it <pod-name> --image=busybox

B. Third-Party Tools

Lens: GUI for Kubernetes cluster monitoring.
K9s: Terminal-based cluster management.
kubectl-trace: System-level tracing for Kubernetes.

C. Log Aggregation

Use tools like Fluentd, ELK Stack, or Loki for centralized logging.

  8. Proactive Cluster Monitoring

Implement monitoring systems like Prometheus, Grafana, or Datadog.
Set up alerting for critical metrics (e.g., node health, pod restarts).
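
Before a full alerting stack is in place, a quick ad hoc check on pod restarts can stand in for it. The sketch below assumes the default column layout of `kubectl get pods --all-namespaces` (restart count in the fifth column); `high_restart_pods` is a hypothetical helper name.

```shell
# List pods whose restart count exceeds a threshold.
# $1 = threshold (integer).
high_restart_pods() {
  kubectl get pods --all-namespaces --no-headers |
    awk -v t="$1" '$5 + 0 > t { print $1 "/" $2 ": " $5 " restarts" }'
}

# Usage (requires access to a cluster):
#   high_restart_pods 5
```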

Example: Debugging Workflow for a Non-Responsive Service

Check Pod Status:

kubectl get pods -n <namespace>

Describe the Service:

kubectl describe svc <service-name> -n <namespace>

Inspect Logs:

kubectl logs <pod-name> -n <namespace>

Test Connectivity:
From within the cluster:

curl http://<service-name>.<namespace>:<port>

From outside:

curl http://<external-ip>:<port>

This deeper dive equips you to troubleshoot and resolve complex Kubernetes issues effectively. Let me know if you'd like specific scenarios or additional examples!
