Discussion on: Let's talk about Health Checks

Thanks for the article. I appreciate it.

One thing to consider is that health checks should be used to drive the behavior of the orchestration platform (e.g. Kubernetes). If a check fails, k8s will act on that failure (typically with a restart). This is very useful because it starts to "self-heal" outages, but it also means that your health check should include only things that are recoverable and can benefit from a restart.

Redis is actually a good example of this. Maybe your application will operate just fine without its cache (albeit slower). In that case a restart isn't the best response and a 200 is acceptable. For that kind of failure I typically rely on errors in the log files to trigger monitoring alerts. It's really case by case, but I usually only add checks to the health endpoint for things that are recoverable and also owned by the microservice that hosts the endpoint.
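
To make that concrete, here is a rough sketch of how I might split the checks. Everything in it is made up for illustration: `Check`, `evaluate_health`, the `critical` flag, and the check names are hypothetical, not from the article or any library. The idea is that only failures a restart could plausibly fix turn the response into a 503, while something like a Redis cache outage gets logged for monitoring and still returns 200.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger("health")


@dataclass
class Check:
    """One dependency probe (hypothetical structure for this sketch)."""
    name: str
    probe: Callable[[], bool]  # returns True when the dependency looks healthy
    critical: bool             # True only if a restart could plausibly fix a failure


def evaluate_health(checks: List[Check]) -> Tuple[int, Dict[str, str]]:
    """Return (HTTP status, per-check report) for a health endpoint."""
    status = 200
    report: Dict[str, str] = {}
    for check in checks:
        try:
            ok = check.probe()
        except Exception:
            # A probe that raises is treated the same as a failed probe.
            ok = False

        if ok:
            report[check.name] = "ok"
        elif check.critical:
            # Recoverable and owned by this service: let the orchestrator act.
            report[check.name] = "failed"
            status = 503
        else:
            # Not worth a restart: log it so monitoring can alert, stay healthy.
            logger.error("non-critical dependency %s is unavailable", check.name)
            report[check.name] = "degraded"
    return status, report
```

With a healthy in-process check and a failing cache check marked `critical=False`, `evaluate_health` still returns 200, so k8s leaves the pod alone and the log line is what drives the alert. Which checks deserve `critical=True` is the case-by-case part.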

Just my .02