DEV Community

Lord Jake


AKS node crash and Sonar rebuild

Today we hit a strange issue: a pipeline in Azure DevOps was failing. On investigation we found that it was failing because of Sonar errors, which in turn traced back to the SonarQube pod deployed in the cluster being evicted. The node was under severe memory pressure, with evicted pods piling up on it. The reason for the memory pressure was not clear, but we followed the steps below to get things back up again.

First we tried deleting the evicted pods to see whether the node would recover.

kubectl delete pods --field-selector=status.phase=Failed --all-namespaces

Since this didn't resolve the problem, we scaled out the cluster using the Azure portal:
Select the cluster -> Node pools -> select the node pool -> Scale pool -> add 2 more nodes (we had 3, so we scaled to 5).
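
The same scale-out can also be done from the Azure CLI instead of the portal. This is a minimal sketch, not what we ran during the incident: the resource group, cluster and node pool names are hypothetical placeholders, and the call is wrapped in a function (also hypothetical) so nothing runs until it is invoked.

```shell
# Hypothetical names; replace with your own resource group, cluster and node pool.
RESOURCE_GROUP="my-rg"
CLUSTER_NAME="my-aks"
NODEPOOL_NAME="nodepool1"

# Scale one AKS node pool to the node count given as the first argument.
scale_nodepool() {
  az aks nodepool scale \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$NODEPOOL_NAME" \
    --node-count "$1"
}

# Usage: scale_nodepool 5
```

Calling `scale_nodepool 5` would match the portal steps above (3 nodes scaled to 5).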


Meanwhile, the affected node carried the memory-pressure taint, so no new pods were being scheduled onto it.
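
To see which node is affected, the node conditions and taints can be listed side by side. A rough sketch, assuming a hypothetical helper name; when a node reports memory pressure, Kubernetes adds the `node.kubernetes.io/memory-pressure` taint to it, which is what keeps new pods away.

```shell
# Print each node's name, its MemoryPressure condition status and its taint keys.
# A node under pressure shows MEMORY_PRESSURE=True and the
# node.kubernetes.io/memory-pressure taint key.
show_memory_pressure() {
  kubectl get nodes -o custom-columns='NAME:.metadata.name,MEMORY_PRESSURE:.status.conditions[?(@.type=="MemoryPressure")].status,TAINTS:.spec.taints[*].key'
}

# Usage: show_memory_pressure
```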

Once the new nodes were up, we drained the problematic node, which evicts its pods safely, and then took it down. The Kubernetes documentation on safely draining a node has more detail.
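
A drain along these lines can be sketched as follows. The node name is a hypothetical placeholder and the function is illustrative only; `--delete-emptydir-data` assumes a reasonably recent kubectl (older versions called this flag `--delete-local-data`).

```shell
# Hypothetical node name; take the real one from `kubectl get nodes`.
NODE_NAME="aks-nodepool1-00000000-vmss000000"

# Cordon the node, then evict its pods.
drain_node() {
  node="$1"
  # Mark the node unschedulable so no new pods land on it.
  kubectl cordon "$node" || return 1
  # Evict the pods; DaemonSet pods are skipped and emptyDir data is discarded.
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
}

# Usage: drain_node "$NODE_NAME"
```

`kubectl drain` cordons the node on its own, so the explicit cordon is only there to make the two steps visible.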

Even after the node was drained and taken down, the sonar pod was still not being scheduled on any node.

We described the sonar pod to look at its events:
kubectl describe po sonar -n sonar

The events showed that the pod could not be scheduled because of a volume node affinity conflict.

We checked the persistent volume claim (PVC) in the sonar namespace:

kubectl get pvc -n sonar

Then we described the persistent volume (PV) bound to sonar-pvc:
kubectl describe pv pvc-xxxxxx


This narrowed the issue down to a likely cause: we no longer had a node in the Azure availability zone that the persistent volume is tied to. The sonar pod has to run on a node in the same availability zone as the volume in order to attach it, hence the error.
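
One way to confirm this kind of mismatch is to compare the zones of the nodes with the zone the volume requires. A minimal sketch with hypothetical helper names; it assumes the nodes carry the standard `topology.kubernetes.io/zone` label (older clusters may use `failure-domain.beta.kubernetes.io/zone` instead).

```shell
# List every node with its availability zone label as an extra column.
list_node_zones() {
  kubectl get nodes -L topology.kubernetes.io/zone
}

# Print the values required by the volume's node affinity
# (for a zonal disk, this is the zone the volume lives in).
show_pv_zone() {
  kubectl get pv "$1" -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms[*].matchExpressions[*].values[*]}'
}

# Usage:
#   list_node_zones
#   show_pv_zone pvc-xxxxxx
```

If the zone printed for the volume does not appear in the node list, the scheduler has nowhere to place the pod.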

So we scaled the node pool down to 2 and then back up to 3. AKS spread the nodes across the zones, which put a node in southeastasia-3, in agreement with the node affinity section of the persistent volume.


With that, the sonar pod was scheduled onto the node matching the affinity, which brought Sonar back up and resolved the pipeline issue as well.
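
To verify a recovery like this, the pod placement and the latest events in the namespace can be checked. A small sketch, assuming the `sonar` namespace used above; the function name is hypothetical.

```shell
# Show which node each sonar pod landed on, then the namespace events
# ordered by time so the most recent scheduling activity is at the bottom.
check_sonar() {
  kubectl get pods -n sonar -o wide
  kubectl get events -n sonar --sort-by=.lastTimestamp
}

# Usage: check_sonar
```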
