Building a “maintenance mode” in your application makes it much easier to manage

#devops #softwaredevelopment #coding #programming

Do you have an on-and-off switch for your applications?

I build one into all my applications by default, and it has saved me on more than one occasion! 😅

In the past, I’ve discussed health checks and readiness probes and how to use them to redirect traffic away from unhealthy application instances.

Another technique that goes along with this concept is maintenance toggles.

The idea behind a maintenance toggle is that it acts as a switch that allows you to put your application in “maintenance mode.”

What that maintenance mode does will depend on the application in question.

For an event-driven application, this might mean the instance is running but not subscribed to any topics/queues.
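
As a rough illustration, here is a minimal sketch of a consumer that stops pulling messages while the toggle is on. The `MaintenanceToggle` class and `drain` function are hypothetical names made up for this example, and an in-memory `queue.Queue` stands in for a real topic or queue.

```python
import queue
import threading


class MaintenanceToggle:
    """Thread-safe on/off switch (hypothetical helper for this sketch)."""

    def __init__(self) -> None:
        self._enabled = threading.Event()

    def enable(self) -> None:
        self._enabled.set()

    def disable(self) -> None:
        self._enabled.clear()

    def is_enabled(self) -> bool:
        return self._enabled.is_set()


def drain(source: queue.Queue, toggle: MaintenanceToggle, handle) -> int:
    """Process messages until the queue is empty or maintenance mode is on.

    The toggle is checked between messages, so a message that is already
    being handled finishes normally. Anything left in the queue stays there
    for other instances, or for later. Returns the number handled.
    """
    handled = 0
    while not toggle.is_enabled():
        try:
            message = source.get_nowait()
        except queue.Empty:
            break
        handle(message)
        handled += 1
    return handled
```

The process stays alive the whole time, so you can still attach a debugger or profiler to it. With a real broker, the equivalent step would be pausing or cancelling the subscription rather than skipping a local loop.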

A way to implement this with REST-based services is to have the maintenance toggle trigger readiness probe failures.

Flipping this switch will leave the service running but move traffic away.
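
Here is one way that could look, sketched with only the Python standard library. The `/readyz` and `/healthz` paths, the port, and the `MAINTENANCE` flag are assumptions for the example; use whatever your load balancer or orchestrator is configured to probe.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set = maintenance mode on. A threading.Event is safe to flip from any thread.
MAINTENANCE = threading.Event()


def readiness_status(maintenance_on: bool) -> tuple[int, bytes]:
    """Map the toggle state to the response the readiness probe should see."""
    if maintenance_on:
        return 503, b"maintenance mode\n"
    return 200, b"ready\n"


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/readyz":  # hypothetical readiness path
            status, body = readiness_status(MAINTENANCE.is_set())
        elif self.path == "/healthz":  # liveness stays green, so nothing restarts us
            status, body = 200, b"alive\n"
        else:
            status, body = 404, b"not found\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Build the probe server; call serve_forever() on the result to run it."""
    return ThreadingHTTPServer(("127.0.0.1", port), ProbeHandler)
```

Only the readiness check fails in maintenance mode. The liveness check keeps passing, which is what keeps the instance around for debugging instead of getting it restarted.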

So why would we want this? 🤨

Having the ability to control traffic manually can be a lifesaver.

In a recent situation, I had a service consuming a lot of CPU. I wanted to troubleshoot the issue but didn’t want to impact production traffic.

So, when I flipped the maintenance toggle, traffic switched to another availability zone, but my instances stayed up and running, leaving me with something to debug.

And more importantly, time to troubleshoot. ⏱️

Troubleshooting is much harder when production traffic is at risk. If you can flip a switch and divert that traffic, you’ve got a lot more time on your hands.

The key to maintenance toggles is making them dynamic, so they can be updated without restarting the service.

The easiest way is to leverage a distributed configuration system like etcd or Consul.
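
As a sketch of the idea, the class below polls a config value and keeps an in-memory flag in sync with it. `DynamicToggle` is a hypothetical name, and the fetch function is injected so the same code works with any store. The `fetch_from_consul` helper assumes a local Consul agent on the default port and a made-up key, `myservice/maintenance`.

```python
import threading
import urllib.error
import urllib.request
from typing import Callable


def fetch_from_consul(key: str = "myservice/maintenance",
                      base_url: str = "http://127.0.0.1:8500") -> str:
    """Read one key from Consul's KV HTTP API (key and address are placeholders)."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/kv/{key}?raw", timeout=2) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # key not set: treat as "not in maintenance"
            return "false"
        raise


class DynamicToggle:
    """Keeps an in-memory flag in sync with an external config value."""

    def __init__(self, fetch: Callable[[], str], interval_seconds: float = 5.0) -> None:
        self._fetch = fetch
        self._interval = interval_seconds
        self._enabled = False
        self._stop = threading.Event()

    def refresh(self) -> bool:
        """Fetch once; if the fetch fails, keep the last known state."""
        try:
            self._enabled = self._fetch().strip().lower() in {"1", "true", "on"}
        except Exception:
            pass  # config store unreachable: do not flap the toggle
        return self._enabled

    def is_enabled(self) -> bool:
        return self._enabled

    def start(self) -> threading.Thread:
        """Poll in a daemon thread until stop() is called."""
        def loop() -> None:
            while not self._stop.is_set():
                self.refresh()
                self._stop.wait(self._interval)

        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self) -> None:
        self._stop.set()
```

Usage would be something like `toggle = DynamicToggle(fetch_from_consul)` followed by `toggle.start()`, with the readiness check or consumer loop calling `toggle.is_enabled()`. Polling is used here to keep the example short; both etcd and Consul also let you watch a key for changes, which reacts faster.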

Building these system toggles is a piece of cake when your config is dynamic.