Humans are pretty good heuristic machines. But sometimes the heuristics we naturally develop work against us.
Consider the last time you went on vacaction. How many bags did you pack for yourself?
If you’re like most travelers, it depends on how long you were planning to be gone. If it was just a long weekend trip, maybe you stuffed 3 changes of clothes into a single carry-on bag along with a toothbrush and called it good. But if you were traveling for 2 weeks, you likely took a much larger bag, and probably had to check your luggage at the airport.
It seems to be human nature to pack more for a longer trip. Especially if you’ll be in multiple climates. There’s a wider possibility of circumstances you may face, so you compensate with more planning, and more stuff.
What if you were going to travel for 2 or 3 years?
I did just that.
Can you guess how many bags packed?
I had one carry-on bag, and one small day pack for my laptop. That’s it.
Of course traveling for 3 years with only a carry-on bag plus a day pack required some planning. But more important, it required a different mindset.
We fight the same problem with software delivery.
When we discover a bug in production, the natural tendancy is often to respond with more planning and preparation in the future. Our list of items to test before the release grows bit by bit. We might even hire more testers to help run through the checklist before every release.
The sad irony is that type of behavior actually increases risk.
Think about the traveling analogy again. Imagine you’ve just arrived in Cancún and want enjoy some underwater photography. Which is the bigger risk:
- You successfully planned ahead for snorkeling, and packed your snorkeling equipment in one of your large bags.
- You didn’t plan for snorkeling, and have only a single carry-on bag. And a wallet.
At first blush, you might think that having your snorkeling equipment in the bag is the better way to go. But consider: What if that bag is stollen? Or the airline loses it for a couple of days? Or your goggles get cracked in handling. Or what if there’s a tropical storm on snorkeling day, and you have to change your plans for something inland.
A much simpler, less risky approach is to travel as light as possible, and if snorkeling comes up, you can rent the equipment you need for the day.
The same works with software delivery. We can either:
- Try to reduce the number or frequency of failures, with more planning and preparation for every release. Sometimes called optimizing for Mean Time Between Failures (MTBF)
- Travel as light as possible, so that we’re prepared to respond to any circumstance immediately. Sometimes called optimizing for Mean Time To Recover (MTTR)
Now here’s the real irony:
When we optimize for MTBF—that is, we try to plan for every possible eventuality—we actually make the problem worse, by making the process slower, more burdensome, and riskier.
And on the other hand: When we optimize for MTTR—that is, we travel light, and nimble, so we can respond to whatever circumstances arise—we actually reduce the number and frequency of failures, or MTBF, as well.
If you enjoyed this message, subscribe to The Daily Commit to get future messages to your inbox.
Top comments (0)