Wycliffe A. Onyango
100 Days of DevOps: Day 59

Fixing a Kubernetes Redis Deployment Failure

Summary

I successfully fixed a critical failure in the redis-deployment on the Kubernetes cluster. The issue was traced to two cascading configuration errors introduced during a recent update: a typo in the ConfigMap name and a typo in the container image tag. By systematically diagnosing the pod events and correcting the Deployment specification, the Redis application was restored to a fully functional state.

Detailed Troubleshooting Steps and Findings

The process followed a standard Kubernetes troubleshooting methodology to identify and correct the specific configuration mistakes.
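
In outline, the loop looked like this (each command appears in context in the phases below):

# Standard triage loop: observe, diagnose, fix, verify
kubectl get pods                           # observe the failing state
kubectl describe pod <pod-name>            # read the Events log for the root cause
kubectl edit deployment redis-deployment   # apply the fix in place
kubectl get pods                           # confirm the rollout recovered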

Phase 1: Initial Diagnosis (Identifying the First Failure)

The initial check of the pods showed the deployment was stuck, with the pod in a non-running state:

thor@jumphost ~$ kubectl get pods
NAME                                     READY   STATUS              RESTARTS   AGE
redis-deployment-6fd9d5fcb-nztfh         0/1     ContainerCreating   0          70s

A detailed inspection of the failing pod immediately revealed the root cause of the stuck ContainerCreating status: a failed volume mount.
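
The inspection command, using the pod name from the listing above:

# Inspect the pod's Events log for the failure reason
kubectl describe pod redis-deployment-6fd9d5fcb-nztfh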

Finding 1: ConfigMap Typo

The kubectl describe pod output clearly showed an error in the Events log:

Warning  FailedMount  ... kubelet  MountVolume.SetUp failed for volume "config" : configmap "redis-cofig" not found
  • Problem: The deployment template incorrectly referenced the ConfigMap as redis-cofig instead of the correct name, redis-config (a quick existence check is shown below).
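
A quick way to confirm the correct name before editing is to list the ConfigMaps in the namespace:

# redis-config should appear in this list; redis-cofig should not
kubectl get configmaps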

Phase 2: First Fix (Correcting the ConfigMap)

The deployment was immediately edited to fix the typo in the volume definition.

# Command used for correction
kubectl edit deployment redis-deployment

The ConfigMap name referenced by the config volume was corrected from redis-cofig to redis-config. Saving the edit triggered a rolling update that created a new pod with the corrected configuration.
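
For reference, a sketch of the corrected volume block inside the Deployment spec (field layout assumed from the FailedMount event; only the ConfigMap name changed):

volumes:
  - name: config             # volume name, as reported in the event
    configMap:
      name: redis-config     # was: redis-cofig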

Phase 3: Secondary Diagnosis (Identifying the Second Failure)

Checking the pod status after the first fix showed a new pod being created, but it immediately failed with a different error, confirming that the update had introduced a chain of mistakes rather than a single one:

thor@jumphost ~$ kubectl get pods
NAME                                     READY   STATUS              RESTARTS   AGE
redis-deployment-5bcd4c7d64-8cbx8        0/1     ImagePullBackOff    0          103s

The ImagePullBackOff status pointed directly to a problem with the container image specified in the deployment. Reviewing the deployment YAML confirmed the second typo.
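
Rather than scanning the full YAML, the configured image can be read directly with a JSONPath query:

# Print only the container image(s) from the deployment spec
kubectl get deployment redis-deployment \
  -o jsonpath='{.spec.template.spec.containers[*].image}'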

Finding 2: Container Image Tag Typo

The image tag in the deployment template was set to redis:alpin.

  • Problem: The official tag for the lightweight Redis distribution is redis:alpine; the configured value was missing the final 'e' (see the registry check below).
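
As an aside, a tag can be checked against the registry before touching the cluster. One option, assuming a reasonably recent Docker CLI is available on the jumphost (an assumption; the post does not show this step):

# Succeeds for redis:alpine; fails for the misspelled redis:alpin
docker manifest inspect redis:alpine > /dev/null && echo "tag exists"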

Phase 4: Second Fix (Correcting the Image Tag)

The deployment was edited a second time to rectify the image tag.

# Command used for correction
kubectl edit deployment redis-deployment

The image was corrected from image: redis:alpin to image: redis:alpine. This change triggered the final rolling update.
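
Equivalently, the tag could have been patched non-interactively with kubectl set image. The container name below is a placeholder, since the post does not show it:

# 'redis-container' is hypothetical; substitute the container name from the deployment spec
kubectl set image deployment/redis-deployment redis-container=redis:alpine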

Phase 5: Verification (Successful Restoration)

After the second edit and a short wait for the new pod to start, the final checks confirmed the resolution.

thor@jumphost ~$ kubectl get pods
NAME                                     READY   STATUS      RESTARTS   AGE
redis-deployment-7c8d4f6ddf-rzp46        1/1     Running     0          24s

thor@jumphost ~$ kubectl get deployment redis-deployment
NAME                READY   UP-TO-DATE   AVAILABLE   AGE
redis-deployment    1/1     1            1           11m
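Beyond pod status, a functional check can confirm Redis is actually answering. A minimal sketch, using the pod name above and the redis-cli binary that ships in the official image:

# Should print PONG if the server is healthy
kubectl exec redis-deployment-7c8d4f6ddf-rzp46 -- redis-cli ping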

Conclusion

The redis-deployment is now fully operational, with the new pod (redis-deployment-7c8d4f6ddf-rzp46) in the Running state. The incident serves as a reminder of the importance of meticulous configuration checking, especially when applying changes to existing production resources, and highlights the value of using kubectl describe to quickly pinpoint and resolve cascading configuration failures.
