The Hawaii Missile Alert Was the Software Developer's Fault

Ben Halpern on January 15, 2018

The employee who accidentally triggered the missile alert on Saturday pushed the wrong button. It was a disastrous mistake that sent Americans pa...
 

Not a user error? Not even a double check on the selected option? 🤦

As for the design, I agree, but I'd also bet they didn't consult, or weren't allowed to consult, a proper UX specialist. Working on government apps is usually highly restrictive.

 

Yes, I would say "not a user error". One of the guiding principles of design is empathy. If a human makes an error, it is not their mistake EVER. The fact that most people facepalm at this statement is the very reason why most designs out there lack this factor.

A mistake made by a user is a shortcoming in the design. It could be because of lack of budget or motivation. If a designer fails to consider this as the axiomatic guiding principle, they are not a good designer to begin with.

More often than not, users blame themselves for not being savvy enough and for making mistakes, but good design makes sure that any user, no matter how unaware, is gently nudged in the right direction.

 

Another case where the how-not-to-do-an-ejector-seat principle is alive and well:

 

Great picture. Really drives home the idea because who wants to eject a puppy out of a jet?!?

 
 

Firstly, I totally agree, this was far more a failure of design than operator.

Secondly, it wouldn't surprise me in the least if the design failure wasn't the fault of the software vendor. I work with fedgov groups regularly, and requirements & acquisitions can be very broken. It's even possible that the contracting office required it to be designed this way (I'm not kidding).

This is one of the reasons good product management is so important. There is an acute distinction between "Buyer Requirements" and "User Requirements", and if a requirement involves UI/UX, it really needs to be done by someone who knows how to design such things :)

 

One can only hope we learn from this, but I could see a possibility where this could make requirements even more rigid in the ways you've described. We need checks and balances that account for actual risk and a design process that can improve with feedback and testing.

It seems like some organizations are hamstrung into no-win scenarios.

 

One of the things FedGov has always struggled with is how to enact "checks and balances" that don't ultimately become /rigid requirements/. There are a lot of complicating issues in government acquisition, and political maneuvering can completely crater otherwise great ideas (this is true at companies, too, but I would argue is less prevalent).

It's worth noting that the National Defense Authorization Act of 2018 actually has a section devoted to piloting an Agile development & acquisition process for defense software systems to address a slew of issues, including the one we are discussing here.

All of Subtitle H in the NDAA-2018 is relevant for people with an interest in how the FedGov handles software, but Section 873 specifically lays out the Agile Development initiative:

congress.gov/bill/115th-congress/h...

 
 

Universal Principles of Design - Confirmation:

"Confirmation is a technique used for critical actions, inputs, or commands. It provides a means for verifying that an action or input is intentional and correct before it is performed. Confirmations are primarily used to prevent a class of errors called slips, which are unintended actions."

So my question is, was there any form of confirmation? Confirmation is used for very critical steps in many applications, e.g., twist-and-turn, lift-and-twist, lift-and-drag, etc. I remember manual cars had a pull-up-and-pull-back thing for going into reverse, a sure way to NOT mistakenly break your engine by initiating reverse at full speed.

Therefore, if there had been a second-step confirmation saying "Are you really sure you want to initiate inbound missile broadcast? |Enter CONFIRM to proceed", whoever was responsible would have known that this was going out for real, no joke!
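
For what it's worth, a typed confirmation like that is cheap to build. Here is a minimal Python sketch under stated assumptions: `AlertRequest`, `confirmation_prompt`, and `may_send` are hypothetical names made up for illustration, and none of this reflects the real system, whose code we haven't seen.

```python
from dataclasses import dataclass

# Assumed phrase, borrowed from the prompt suggested above.
CONFIRM_PHRASE = "CONFIRM"


@dataclass(frozen=True)
class AlertRequest:
    """Hypothetical request object; not taken from the real system."""
    message: str
    is_live: bool  # True for a real broadcast, False for a drill


def confirmation_prompt(request: AlertRequest) -> str:
    """Return a prompt that names the action, so live and drill prompts differ."""
    if request.is_live:
        return ("Are you really sure you want to initiate inbound missile "
                f"broadcast? Enter {CONFIRM_PHRASE} to proceed: ")
    return "Send this DRILL message to internal recipients only? [y/N]: "


def may_send(request: AlertRequest, typed_reply: str) -> bool:
    """Live alerts need the exact phrase typed out; drills accept y/yes."""
    reply = typed_reply.strip()
    if request.is_live:
        return reply == CONFIRM_PHRASE  # exact, case-sensitive match
    return reply.lower() in {"y", "yes"}
```

The point of the sketch is only that a habitual "y" cannot confirm the live broadcast, because the live path asks for something the drill path never does.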

 

If they were both part of the same "dropdown", I'd think that the whole form might have contained an "are you sure" message. So if the form always has an "are you sure" message, it's easy for that to become an ignored message.

Regardless of the details we may never be sure about, it's a reminder/wakeup call to get this sort of stuff right.

 

Agreed. The interface apparently had a confirmation, but if the confirmation is the same for "Test Alert" and "Actual Alert", then that is a failure of design.

The confirmation for the Test Alert should be boring and grey.

The confirmation for the Actual Alert should be hella loud and striped.

Loud and striped is never the solution. It should simply be distinct enough from the test scenario, and require additional and unique steps to perform.

"require additional and unique steps to perform."

I completely disagree. There should be a single difference between an actual alert and a drill.

Is this a drill? Yes/No

If the test steps and the real steps are different, then what's the point of a drill?

 

I find this type of article incredibly disappointing as it peddles the myth that software engineering is a deterministic and perfect process and that this type of error is because somebody was "absolutely negligent". That smacks of an utter lack of professional respect and some classic results-oriented thinking. It also reeks of a tonne of hubris, unless the author has first-hand experience of the system in question and not simply the crisis-management sound-bite that came from the HEMA spokesperson.

Consider:

1) The author and readers don't know the level of authority that the user has and what training came with it, so they don't know the context that the engineers expected when the user was presented with these options.
2) The author and readers don't know the UX flow leading to the drop-down options, so they don't know what context was established by the interface in the user's mind when they clicked the buttons.
3) The author and readers don't know what the drop-down menu looks like, nor the size of the screen that the user had at the time. Were they using a supported browser?
4) The author and readers don't know what the user saw after they clicked the button. Did they gloss over the contents of prompts put before them? Did they confirm their selected action when they should have cancelled it?
5) The author and readers don't know the state of the system when this drop down was initially added. Were the options added at the same time? Was one added before the other? Were they added by the same engineer?
6) The author and readers don't know the constraints that the engineering and QA team(s) were under when the options were added. Were they added under an extreme time crunch? Did project management direct them to skimp on testing? Does the engineering team even have authority to prevent a release without losing their job?
7) The author and readers don't know what requirements are in place around how simple it has to be to send out a real alert. If the article had instead been "Missile launch alert delayed due to insufficient permissions for user", would we be having the exact same conversation with a few changes here and there?

It's disrespectful to cast about absolute claims about the quality of the team and to parlay this situation into a simplistic call to "write better software". As an engineer, you should know better than to suggest that it's as simple as that... there are definitely lessons to be learned here, but sniping from the sidelines is not the way to learn them.

 

I agree with Halpern but believe he may have over-simplified the issue by assuming that the developer(s) knew what the system would be used for at design time. Consider the possibility that the software vendor sold Hawaii a configurable system, one which allows admins to add/remove options from menus (such as the drop-down missile alert menu).

With a configurable system, it is difficult for the developers to ensure that proper precautions are taken before executing an action because they may not know what actions the system will be capable of executing.

This possibility complicates the issue by taking some blame off the developers and putting it on system admins.

In this hypothetical situation, who's to blame?

Does the system allow admins to add additional warning/confirmation dialogues to actions?

Did the Hawaii purchasing agency specify to the vendor that they would use the system for critical/impactful alerts?

There are too many unanswered questions to point the finger at one party.

 

User behavior, trust, and UX issues aside, it should be impossible to push a production notification from a dev/test/qa environment.
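
One way to sketch that guard in Python, assuming a hypothetical `ALERT_ENV` environment variable and a hypothetical `send_public_alert` function (both invented here for illustration, not taken from any real alerting product):

```python
import os
from typing import Callable, Optional


class WrongEnvironmentError(RuntimeError):
    """Raised when a public alert is attempted outside production."""


def send_public_alert(
    message: str,
    transport: Callable[[str], None],
    environment: Optional[str] = None,
) -> None:
    """Send `message` through `transport`, but only in production.

    ALERT_ENV is a made-up variable name. When it is unset we assume
    "test", so a misconfigured machine fails closed rather than open.
    """
    env = environment or os.environ.get("ALERT_ENV", "test")
    if env != "production":
        raise WrongEnvironmentError(
            f"refusing to send a public alert from the {env!r} environment"
        )
    transport(message)
```

In a dev/test/qa deployment the call raises before the transport is ever touched, so a drill run there cannot reach the public no matter which menu option was picked.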

 

Unless production is your only environment... If they indeed had a complete staging setup, they would not need both options in the dropdown :)

It's freaking horrifying.

 

Saying that an event was someone's ("the software developer's") fault isn't helpful. I'd even say it's harmful. We shouldn't be looking at individuals, but processes. Were there requirements for confirmation? If not, why? If so, were they implemented? How did this get past various reviews of requirements, designs (software and UI), and implementation? Was this raised as a concern early? These are the types of questions we should be asking. We shouldn't be using words like "fault" and "blame".

 

Per devops practices: "blame is a useless emotion that has no place in business"

 

I've never heard it in the context of DevOps, but in various Lean methodologies there's a similar idea. Blame and finger-pointing hinder continuous improvement.

 
 

"Software developers cannot ship interfaces that make this sort of human error possible." - Exactly this!
I can't get it into my head why there wasn't at least an additional confirmation dialog with a bright red warning! 🤦🏻‍♂️

 

I used to work at a company that placed a lot of people on a project written in C#, even though these people had no experience with C#. It was a government/public sector project. I am not surprised at all that these things happen.

 

I'm sorry, but you can't blame C#. It's like a surgeon blaming an unfamiliar scalpel manufacturer for a surgical mistake. This could be one of those if-it's-not-in-the-Acceptance-Criteria-,-even-if-it-makes-all-the-sense-in-the-world-,-don't-do-it kinds of things. Or, someone didn't do their due diligence when writing them up.

 

Not blaming C#; blaming the company for putting people with no C# skills on a C# project.

Kinda washes. But bad design is bad design no matter the language. Having to deal with an unfamiliar language is a distraction for sure, but that's a productivity issue, not a quality issue.

Of course it is. If you're unfamiliar with a language, you can't expect deadlines to be met with perfect design. A lot of time is wasted on learning, which leads to rushed/panicked work and ultimately bad design. I don't think putting anyone with programming skills on any role is a solution or okay. You're assuming everyone has a great knowledge of programming and you end up putting some people under unnecessary stress. This could all be resolved by hiring/putting people on the team with the right skills.

It still doesn't matter how unfamiliar with a language one is. UI blunders are language agnostic. In fact, they're programmer agnostic. Even an expert in C# wouldn't necessarily be responsible for a UI decision. As programmers, we design software to accomplish a job and that doesn't always include usability. While some programmers are better at catching this stuff and maybe even have an eye for UI, most of the time (particularly with UI), we are told what the UI is supposed to look like. On the other side of things, I often catch UI blunders like this and am told to shut up and color. A non-technical person should have reviewed this and identified the potential problem.

"Uh hey... I realized that we asked for this, but is it really such a great idea to store the hydrochloric acid in an unmarked mountain dew bottle in the fridge next to the mountain dew?"

This was a management failure, not a programmer failure.

That's a very romanticised view, especially when it comes to big contracting companies. I was on many projects where things needed to be done yesterday and there was minimal input, if any, from a design team. Yes, this is a management problem. However, if you're unfamiliar with a language, you have to hit a deadline, and you're unfamiliar with how to create different UI components, you're going to build what you know. In this instance, a drop-down menu with two options. Often the people reviewing these UIs aren't design/UX savvy and just think "does it work", and it gets through.

These people are as stressed as each other and are pressured to get things done. There is a difference between being unfamiliar and not knowing anything at all about a language.

Your example is a little exaggerated, so let's keep to the point. A drop-down menu might seem okay to some people and not to others, point proven by this article. I am willing to bet that this was done to get the job done, not to be perfectly designed. My example was to illustrate that the people in charge of these contracts don't care too much about quality, only about how fast they can get a job done. If they have bodies on seats, those people can work on the project, regardless of what they are skilled in.

 

The discussion here seems focused on the responsibilities of developers and designers, so I feel obliged to make a different point:

The fault for causing panic and fear over an incoming nuclear attack lies with the existence of a system in which one must fear an incoming nuclear attack. It's technology's biggest mistake as an industry and a collection of people that we look at this situation and conclude that, in an ideal world, the design of the nuclear warning system would prevent accidental false alarms. In an ideal world there would be no nuclear warning system and no nuclear weapons. Let's stop asking how we can make terrible, monstrous things more user-friendly and instead ask why we are building terrible, monstrous things.

 

You don't know for sure if it was the developer's fault or not. Not everything a developer or designer suggests makes it into the final product. The client (the government) could have declined the confirmation feature you're recommending. Or the client may have turned down repeated suggestions to update the design. Without more information, blaming this on the developer alone is speculation only.

 

If the mistake can be made, it will eventually be mdade. We call this type of error "fat fingering" and it happens all the time. Can't decide if those two sentences were intentionally side by side...

 
 

Blaming a dev (and only the dev) is quite arrogant and maybe also a bit ignorant of all the circumstances.

Especially in a scenario like this, it's counterproductive to try to pin the blame on someone. There was not only the dev, or the people who created the specification; there was also an uncountable number of people who had already used it and did not insist on a change.

There is never a way to create an application that prevents all user errors. And the examples you mentioned were not there from the beginning; they were created as a consequence of a user who did something wrong in the first place.

Besides, their software is probably 20 or 30 years old. Do you remember the state of blog and forum software back then?

Also, you completely ignore the main use case, which is firing off a warning in the fastest possible way.
Having a false positive from time to time is a lot less of a problem than failing to send out the notice because of too many safeguards.

 

This is kinda standard in my city. I worked in some places where we had buttons for very critical processes, but they didn't have a warning message or anything like that. It's pretty funny how the end user really blames the devs for that, but when we suggest a confirmation, in most cases the client or project manager says "NO, we don't need that". It's kinda like a loop.

I really can't blame the dev. Maybe the client said "****" the confirmation, or the UI designer did, we don't really know. But one thing can be said: that was a freaking bad mistake.

 

I agree, +1 design error. Lotta repercussions if the wrong switch is thrown. If there wasn't a lead or manager overseeing the development of this, that's bad. If there wasn't any QA involved, that's bad. Short story: the individual atop this software development effort gets 10 lashes with a wet noodle.

 

Please let me add that everything discussed here is possible and probable. In this particular situation, however, a later explanation was similar to what Pee Wee Herman said after he rode his bicycle too fast while "popping wheelies", flipped, flew up in the air, and landed back on the seat. When his audience looked amazed he said, "I meant to do that", in his Pee Wee voice. So, according to The New York Times article of Jan 30th, 2018, the person who pushed the button "meant to do that". Why? He really thought there was a missile heading to Hawai'i.

 

I'm just going to assume the team(s) that designed and developed this had reservations about the design but were powerless to change anything. That doesn't absolve them of anything, but this is definitely a lesson learned for me since I'm new to coding and development.

 

This is an important reminder that development is comprised of many roles, and they all need to work together to create a usable and safe product. Some of the comments I see blaming UX, or some kind of design, but not the programmers, are unfortunate. Divisiveness will not get problems like these fixed.

If you contribute to a software product, then you're a software developer. There's such an exceptional overlap between all roles, from programming, to UX, to graphics, that it's counter-productive to put hard walls between them.

 

To be fair, I'd rather that the UX design err on the side of getting the alert out than not issuing the alert at all.

A false positive in this case is much better than a false negative. A few minutes/hours of temporary fear is better than thousands dead because they didn't get to shelter.

As such I'm grateful for the current design.

 

I struggle with two sentences, Ben.

First one: "that is not a human error, that is a software design error."
I think it's best to say "it's not a user error" (software still designed by humans ;-) ).

Second one, the title: "The Hawaii Missile Alert Was the Software Developer's Fault". This is pure speculation on my part, but here it goes: I believe the software developer would have done it differently if he/she had a choice. But in these days of Product Managers, Product Owners and Project Managers contradicting each other, of budget restrictions and bitter discussions between customer and provider about the cost estimation of each Change Request, ... in such an environment, with such flawed and poisonous work processes, common sense cannot be exercised by the ones ultimately doing the job, i.e. the software developer in this case. Customer pays for a drop-down menu, customer gets a drop-down menu. End of story. Don't you dare put in a few more hours to design a better UI, or else next time the customer asks for something, he/she will expect the "drop-down menu" price for something that requires more effort.

I bet my hat something like this actually happened.

 

I dunno, I've been in many situations where I've personally pointed out defects like, "this allows a user to create orders for other users," or, "there are credit card numbers being dumped in the stack trace," only to have any form of suggested remediation panned by decision makers because they didn't consider it high risk.

You don't know that the system wasn't designed to have high and low risk messages, the contents of which were not generally known until runtime but were adequately handled, and the person managing the application chose to create the two message entries as low priority messages, breaking process.

I'm going to treat your perspective as, "if this scenario is the reality, the dev messed up," while understanding that in the scale of things that get budget or priority, PEBKAC Byzantine correction layers are generally filed as nice to have by the people signing your check, and doing work on things which were not explicitly approved is a good way to end up without a paycheck.

 

And let's not forget the other half of the problem, there was no cancel or takeback, or way to send an "Oops, it was a drill" message using the same broadcast method, so they had to resort to Twitter and highway signs to try to get the follow-on message out.
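
A takeback path does not have to be elaborate. Below is a rough Python sketch, assuming a hypothetical `Broadcaster` class whose channels are plain callables (the names, the wording of the correction, and the channel model are all made up for illustration). It shows a correction going out over exactly the same channels as the original alert.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A channel is anything that can deliver a string (SMS gateway, TV crawl, ...).
Channel = Callable[[str], None]


@dataclass
class Broadcaster:
    """Hypothetical broadcaster that remembers what it sent so it can retract it."""
    channels: List[Channel]
    sent: Dict[int, str] = field(default_factory=dict)
    next_id: int = 1

    def send(self, message: str) -> int:
        """Deliver `message` on every channel and return an id for later retraction."""
        alert_id = self.next_id
        self.next_id += 1
        for channel in self.channels:
            channel(message)
        self.sent[alert_id] = message
        return alert_id

    def retract(self, alert_id: int) -> None:
        """Send a correction for a previous alert over the same channels.

        Raises KeyError if the id is unknown or was already retracted.
        """
        original = self.sent.pop(alert_id)
        correction = (
            "CORRECTION: the earlier alert was sent in error. "
            f"There is no threat. (Original message: {original})"
        )
        for channel in self.channels:
            channel(correction)
```

Usage would be something like `alert_id = broadcaster.send(text)` followed, if needed, by `broadcaster.retract(alert_id)`, so the "Oops" message reaches the same people the alert did.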

 

Not sure what development practices you follow, but the whole team, including product owner/manager and stakeholder(s), is responsible for that UI. Not just the guy who added the line of code. Read what XP says about collective ownership.

 

Then no, it was not the "Software Developer's" fault. It was the fault of the UX/UI contribution (or lack thereof).

 
 

Ah, but what was in the Business Requirement Document?
