
meddlesome


Some GKE pods fail with a strange Google Auth error

Recently, I ran into some strange Pod behavior in Google Kubernetes Engine (GKE).

When rolling out a deployment, most pods came up healthy, but some failed with CrashLoopBackOff.


Digging into the Pod events, everything looked fine except for the unhealthy status. Going further, into the logs of an unhealthy pod, a hint showed up:

Error: Could not load the default credentials. Browse to https://cloud.google.com/docs/authentication/getting-started for more information.
    at GoogleAuth.getApplicationDefaultAsync (/usr/src/app/node_modules/google-auth-library/build/src/auth/googleauth.js:183:19)

We found an interesting fix: adding an environment variable to the pod.

DETECT_GCP_RETRIES=3

🎉🎉🎉 Now all pods come up without errors.



The cause of this issue: during a rolling deployment, multiple pods send authentication requests to GCP. Sometimes one of those requests times out, and with no retry the credential lookup fails.
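To make the "timeout without retry" idea concrete, here is a minimal sketch of what an environment-driven retry count could look like. It is not the code inside google-auth-library. The names `retriesFromEnv`, `withRetries`, and `makeFlakyProbe` are hypothetical, and the flaky probe only stands in for a credential lookup that times out.

```javascript
// Hypothetical helper: parse a retry count from an environment-like object.
// Missing, non-numeric, or non-positive values fall back to 0 (no retries).
function retriesFromEnv(env = process.env) {
  const n = Number(env.DETECT_GCP_RETRIES);
  return Number.isInteger(n) && n > 0 ? n : 0;
}

// Hypothetical helper: run an async operation once, then retry it up to
// `retries` more times, waiting `delayMs` between attempts.
async function withRetries(operation, retries, delayMs = 100) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Stand-in for a credential lookup that times out on its first `failures` calls.
function makeFlakyProbe(failures) {
  let calls = 0;
  return async () => {
    calls += 1;
    if (calls <= failures) {
      throw new Error("metadata probe timed out");
    }
    return "credentials-ok";
  };
}
```

With zero retries, `withRetries(makeFlakyProbe(1), 0)` rejects on the first timeout, which mirrors the crashing pods. With `withRetries(makeFlakyProbe(1), retriesFromEnv({ DETECT_GCP_RETRIES: "3" }))`, the second attempt succeeds.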
