DEV Community

Jonathan Hall
Originally published at jhall.io

Has Facebook outgrown "Move fast and break things"?

Facebook’s famous motto, “Move fast and break things,” officially left the building in 2014, but that hasn’t stopped a large number of people on social media from criticizing Facebook for this old motto in light of recent major outages.

Since Facebook no longer lives by this motto, it’s a bit of a straw-man argument to begin with. But I want to defend this straw man anyway, with a bit of a contrarian view.

The assumption being made by those criticizing the “move fast and break things” motto is that the cost of a 6-hour outage is too high—that too many things broke.

Is this justified?

I don’t know. And neither do you, unless perhaps you work closely with Mark Zuckerberg.

You see, this type of judgement is the result of an ad-hoc ROI calculation. The problem with this armchair quarterbacking is that the public is only privy to one piece of data necessary for that calculation: We know (or can estimate) the cost of a single failure.

In ROI terms, that is part of the investment variable. Let’s simplify with round numbers, since we’re defending a straw man anyway, and say that a single outage costs US$1 billion.

“Oh my stars! That’s too expensive! It’s obviously a bad idea!” some might say (are saying).

If it’s not clear yet, the problem with this is that we have no idea of the return.

If Facebook can lose $1B in a single outage, it stands to reason that their earnings potential is also astronomical. What if the fast moving that caused the broken things also earned $50B that would not have been earned by acting more cautiously?

In this light, a $1B “investment” (in the form of an outage) to earn $50B seems like a bargain.
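To make the arithmetic explicit, here is a minimal sketch of that back-of-the-envelope calculation. The $1B cost and $50B return are the made-up round numbers from above, not real Facebook figures, and the `roi` helper is a hypothetical name used only for illustration.

```python
def roi(cost: float, gain: float) -> float:
    """Return on investment as a ratio: net gain divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


# Straw-man numbers from the article, not real data.
outage_cost = 1e9          # the "investment": one outage at US$1 billion
hypothetical_gain = 50e9   # assumed earnings from moving fast

ratio = roi(outage_cost, hypothetical_gain)
print(f"ROI: {ratio:.0%}")  # prints "ROI: 4900%"
```

The point is not the specific result but that the calculation needs both inputs. Set `hypothetical_gain` to a number below the outage cost and the same outage becomes a loss.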

In mature businesses, outages are expected. There’s a failure or outage budget. Management works to keep reliability at expected levels: not so unreliable that the business suffers, but, importantly, not too reliable either, because reliability is expensive.

John Allspaw and Paul Hammond make this point in their famous presentation “10+ Deploys Per Day” by referencing World of Warcraft’s at-the-time dismal uptime:

If the business requires that the site go down every 2 weeks, even though you’re the largest online gaming platform and you have millions of paying customers, those paying customers might be quite fine for you to have availability of 97%.
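To put an availability target like 97% in concrete terms, it helps to convert it into a downtime budget. The sketch below assumes a two-week window to match the quote. The function name is hypothetical, and this is only one simple way to express such a budget.

```python
def allowed_downtime_hours(availability: float, period_hours: float) -> float:
    """Hours of downtime permitted in a period at a given availability target."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return period_hours * (1.0 - availability)


two_weeks = 14 * 24  # 336 hours, assumed window matching the quote
budget = allowed_downtime_hours(0.97, two_weeks)
print(f"Downtime budget at 97% over two weeks: {budget:.2f} hours")
```

At 97%, that works out to roughly ten hours of downtime every two weeks. Whether that is acceptable is a business decision, not a purely technical one.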


