When you deploy your code in the cloud, you must add probes to your application so that your cloud provider knows its status and can restart it or send traffic to it. Probes also prevent you from deploying a broken or misconfigured version.
Let's dive into the three probes, mainly inspired by Kubernetes probes.
Not all applications have the same "morning routine" when starting: some build internal caches, some construct a dependency injection graph before starting, some are instantly ready for business.
Before checking the other vital parameters of your application, you must check that it is awake. This kind of probe can be anything that returns an affirmative response once the application is open for business.
Checking the HTTP listen port is a good example: it's often the last thing started in your app, after everything else has loaded. Here is a good startup probe in Kubernetes syntax.
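A minimal sketch of such a TCP check, assuming a hypothetical application listening on port 8080 (tune the thresholds to your app's real startup time):

```yaml
startupProbe:
  tcpSocket:
    port: 8080          # hypothetical application port
  periodSeconds: 5
  failureThreshold: 30  # up to 5s x 30 = 150s allowed to start
```

Until this probe succeeds, Kubernetes holds off the liveness and readiness probes, so a slow starter isn't killed prematurely.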
Once your application is awake, you must check continuously, and indefinitely, that it's alive. If the liveness probe fails, your cloud provider may restart your instance after a few retries, assuming that your application is stuck.
You need a very deterministic way to ensure your application is alive.
I've often seen an HTTP probe on a /health route used to check whether an application is alive. But if your app is overwhelmed by a ton of requests, adding one more request to check whether it's alive just adds to the pain. HTTP probes often have a very short timeout, and if you don't answer in time, your app can be considered "broken" even though it's still answering business requests.
Think of it as a heartbeat: as long as your heart is beating, you're alive. If, instead of a heartbeat check, someone asked "are you okay?" every 5 seconds, under heavy effort you might not have the breath or the time to answer... but your heart is still beating.
I recommend using a simple open-port check, as for the startup probe: it's harmless for your application, it's fast, and it will stay positive as long as your HTTP server is running. Here is a good liveness probe.
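A sketch of such a probe, reusing the same hypothetical port 8080:

```yaml
livenessProbe:
  tcpSocket:
    port: 8080          # hypothetical application port
  periodSeconds: 10
  timeoutSeconds: 1
  failureThreshold: 3   # ~30s of consecutive failures before a restart
```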
The last probe is the trickiest to configure. Let's go back to our heartbeat example with a marathon runner. The heart is beating, so the liveness probe is good. In the morning, there are enough body resources, the right shoes, everything is in perfect condition. At the end of the run, the energy level is very low: the readiness probe is not okay, but the liveness probe is, hopefully, still okay: the heart is still beating.
It's the same concept in your application: the readiness probe is the one that tells the load balancer "I'm ready, give me requests!". Being ready means your database connection is okay, your messaging system too, you have enough space to work, you're not swamped by too many requests, etc.
To check that your application is not overwhelmed, it is now a good time to call a /ready HTTP endpoint on the same port that serves regular traffic: if your server can't handle a low-cost request in time, it's not a good idea to send it heavier requests.
On that endpoint, you may need to perform a simple ping on your dependencies and answer a valid HTTP code if everything is okay, or HTTP 503 otherwise. Here is a good readiness probe.
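A sketch, assuming the app exposes a /ready route on the same hypothetical port 8080:

```yaml
readinessProbe:
  httpGet:
    path: /ready        # hypothetical endpoint pinging dependencies
    port: 8080
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 2   # removed from the pool after ~10s of failures
```

Unlike a failing liveness probe, a failing readiness probe doesn't restart anything: the instance is only taken out of the load balancer until it answers successfully again.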
When checking your dependencies, don't create readiness loops! Don't check other APIs' statuses unless they are relevant for your business, and make sure they are not checking you either: otherwise you'll create a chicken-and-egg problem.
The previous behavior is only valid if your application can reconnect when dependencies become unavailable: if your database disconnects, can you reconnect on the fly once it comes back? If not, you may want to add that kind of check to your liveness probe instead, so that your application gets restarted.
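In that case only, a liveness probe pointing at an endpoint that verifies the connection might look like this (assuming a hypothetical /health route that pings the database):

```yaml
livenessProbe:
  httpGet:
    path: /health       # hypothetical route that pings the database
    port: 8080
  periodSeconds: 10
  failureThreshold: 6   # tolerate a short outage before restarting
```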
Being able to reconnect is a good feature because it limits the amount of work to do during an outage.
If your database is down and all your applications restart in a loop until the database is back, it can have a lot of side effects.
If you have configured reconnection, your applications are simply not ready while the database is down, and as soon as it comes back, everything recovers almost instantly.
Last, but not least, the readiness probe lets you handle graceful shutdown. When you receive the signal (often
SIGTERM) telling your application it is about to shut down (because a new version is coming, because of downscaling, etc.), you can start answering an unavailable status, wait a grace period, and then shut down the server completely.
This way, the load balancer knows that you don't want traffic anymore, and no request will be lost when the HTTP server is effectively shut down.
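In Kubernetes, the relevant knob is the pod's termination grace period; the rest happens in your application. A sketch, assuming the app flips its /ready endpoint to 503 when it receives SIGTERM:

```yaml
spec:
  terminationGracePeriodSeconds: 30  # SIGKILL only after this delay
  containers:
    - name: app
      # On pod deletion, Kubernetes sends SIGTERM to the process.
      # The app should mark /ready as unavailable, drain in-flight
      # requests, then exit before the grace period expires.
      readinessProbe:
        httpGet:
          path: /ready   # hypothetical endpoint, as above
          port: 8080
```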
A special note on performance: liveness and readiness probes are called very often, to ensure quick reaction if things go wrong. Probes must answer quickly and without allocating too many resources. They're like a "ping" on the server: the first thing to do to check that basic connectivity is right.