DEV Community

meddlesome


Some GKE pods fail with a strange Google Auth error

Recently, I ran into some strange pod behavior in Google Kubernetes Engine (GKE).

When rolling out a deployment, most pods came up healthy, but some failed with CrashLoopBackOff.


Digging into the pod events, everything looked fine apart from the unhealthy status. The log of an unhealthy pod gave a hint:

Error: Could not load the default credentials. Browse to https://cloud.google.com/docs/authentication/getting-started for more information.
    at GoogleAuth.getApplicationDefaultAsync (/usr/src/app/node_modules/google-auth-library/build/src/auth/googleauth.js:183:19)

We found an interesting fix: adding an environment variable to the pod.

DETECT_GCP_RETRIES=3
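For reference, here is a minimal sketch of where that variable could go in a Deployment manifest. The deployment name, labels, image, and replica count are placeholders assumed for illustration, not taken from the actual setup; only the `DETECT_GCP_RETRIES` entry comes from the fix above.

```yaml
# Sketch only: every name and the image below are hypothetical placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: gcr.io/my-project/my-app:latest
          env:
            # Number of retries for the GCP metadata lookup.
            - name: DETECT_GCP_RETRIES
              value: "3"   # env values must be strings in a manifest
```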

🎉🎉🎉 Now all pods come up without errors.



The cause appears to be that during a rolling deployment, multiple pods make authentication requests to GCP. Sometimes a request times out, and with no retry, the pod fails.
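To illustrate why a retry helps here, the following is a minimal sketch of a generic retry wrapper. It is not the code that google-auth-library runs: `withRetries` and `makeFlakyLookup` are hypothetical names, and the flaky lookup only simulates a request that times out a few times before succeeding.

```javascript
// Illustrative only: a generic retry wrapper, not google-auth-library's code.
async function withRetries(operation, retries, delayMs = 100) {
  let lastError;
  // One initial attempt plus up to `retries` additional attempts.
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Wait a little longer before each new attempt (linear backoff).
        await new Promise((resolve) =>
          setTimeout(resolve, delayMs * (attempt + 1))
        );
      }
    }
  }
  throw lastError;
}

// Hypothetical stand-in for a credential lookup that times out
// on its first `failures` calls and then succeeds.
function makeFlakyLookup(failures) {
  let calls = 0;
  return async () => {
    calls += 1;
    if (calls <= failures) {
      throw new Error("metadata lookup timed out");
    }
    return { token: "fake-token", calls };
  };
}
```

With zero retries, the simulated lookup fails on its first timeout, which mirrors the crash described above. With three retries, `withRetries(makeFlakyLookup(2), 3)` succeeds on the third call.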
