As a Cloud Solution Architect, I help Microsoft's partners build solutions on Azure. Often the way forward is not clear and using a checklist can help point us in the right direction. Here's a discussion of what checklists are and why they're useful.
Here are some interesting health care stats that will set the stage.
- Hospital doctors evaluate 250 different diseases and conditions per year.
- Intensive Care Unit (ICU) patients require, on average, 178 daily tasks, and about half of these patients experience a complication.
- In one study of 41,000 trauma patients:
  - Unique injury diagnoses: 1,224.
  - Combinations of diagnoses: 32,261.
- In 2006 there were 230 million surgeries globally:
  - 3% - 17% had complications.
  - 7 million people died.
What do these stats have to do with checklists and cloud architecture?
First, all of these statistics come from the excellent book "The Checklist Manifesto" by Atul Gawande.
When you look at these stats, you can see how complicated it is to diagnose and treat healthcare issues.
It's also very complicated to design cloud architectures.
Now, to be clear, I AM NOT suggesting that cloud architecture is as complicated as diagnosing and treating healthcare issues. Nor am I suggesting the problems of cloud architecture are as serious or important as saving people's lives. I am merely suggesting that when things get complicated a checklist can come in handy.
Here's an eye chart of the Cloud Native ecosystem from the CNCF.
Think about this eye chart and then think about every other technology domain: security, data, identity, monitoring, etc. They have their own eye charts. There are thousands of options in near-infinite combinations. An architecture, especially an existing/legacy one, is going to be a Frankenstein of all these different services. And, as an architect, it’s your job to design and/or validate these architectures for your cloud platform.
And just think about how complicated that cloud platform is on its own! I work at Microsoft, and Azure itself has hundreds of different services. Check out this eye chart from Azure Charts.
So, how can a checklist help with all this complexity?
Let's go back to those healthcare examples and look at what happened when simple checklists were implemented.
- In 2001, Johns Hopkins Hospital implemented a 5-question checklist for ICU doctors. After 15 months, it had prevented 43 infections and 8 deaths and saved $2 million.
- In 2009, 8 hospitals piloted a 2-minute, 19-question surgery checklist. After 3 months, surgery complications fell 36% and deaths fell 47%.
Let's be clear: 5 questions saved a hospital two million dollars over 15 months. Spending 2 minutes at the beginning of a surgery cut the number of deaths nearly in half!
Again, these impressive stats come from "The Checklist Manifesto." A checklist is just asking a few questions before starting a process. It forces you to check the same things every time you start that process. Without a checklist, you rely on remembering to check them, and forgetting just once can have a tremendously bad impact. Checklists are used in a lot of other places besides healthcare: airplane pilots use them, as do investors and architects who work on buildings rather than clouds.
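To make that idea concrete, here's a minimal sketch in Python of a checklist acting as a gate: every question has to be answered before the process is considered ready to start. The `Checklist` class and the sample questions are made up for illustration; this is one possible shape for the idea, not a description of any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class Checklist:
    """A named list of questions that must all be answered before work starts."""

    name: str
    questions: list[str]
    answers: dict[str, str] = field(default_factory=dict)

    def answer(self, question: str, response: str) -> None:
        """Record a response; reject questions that are not on the list."""
        if question not in self.questions:
            raise KeyError(f"Not on the checklist: {question}")
        self.answers[question] = response.strip()

    def unanswered(self) -> list[str]:
        """Return the questions with no (or an empty) response, in list order."""
        return [q for q in self.questions if not self.answers.get(q)]

    def is_complete(self) -> bool:
        """True only when every question has a non-empty response."""
        return not self.unanswered()


# Hypothetical questions, for illustration only.
review = Checklist(
    name="architecture review",
    questions=[
        "What is the recovery time objective?",
        "How do users authenticate?",
        "Where is customer data stored?",
    ],
)

review.answer("How do users authenticate?", "Single sign-on via the company identity provider")

if not review.is_complete():
    print("Not ready to start. Still open:")
    for question in review.unanswered():
        print(f"  - {question}")
```

The point of the gate is that the list does the remembering, not the person: a skipped question shows up every time, no matter how routine the process has become.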
Checklists are a great tool for avoiding errors and for lowering stress. There's a great Freakonomics podcast episode featuring Atul Gawande, the author of The Checklist Manifesto, if you want to learn more about the power of checklists. The book is also a great quick read!
Last year I worked with my colleagues at Microsoft, who are among the best partner architects, to compile the checklists we use when working with Microsoft partners. We shared our Solution Architecture Questions (SAQ) on GitHub. Anyone is free to use this list, share it and contribute to it. Some of the questions are specific to Azure partnerships, but most can be used when designing architectures in any cloud environment.
I also started working on a web app for finding these questions, asking them and recording the answers. You can find it at https://saq.monster. Once you create an account, this web app runs completely in the browser. It is built with Blazor WebAssembly and hosted on Azure Static Web Apps, so whatever data you put into the app remains in your browser only. Eventually I may build a cloud sync service behind it, but for now it remains completely client-only.
What are your thoughts on checklists and the list we built? Are there strong opinions out there about using checklists in the field of cloud architecture? Do individuals have their own lists? I would love to hear your thoughts on this!