
Phuoc Nguyen Dang

Posted on • Originally published at youtube.com

One File Crashes 8.5 Million: The CrowdStrike Disaster

On July 19, 2024, CrowdStrike pushed a routine configuration update, Channel File 291, that crashed an estimated 8.5 million Windows machines.

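The new content referenced 21 input fields. The sensor code feeding the parser supplied only 20. No runtime bounds check caught the mismatch. No test exercised the 21st field with real data.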
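A field-count mismatch like this is the kind of thing a pre-release check can catch. Here is a minimal sketch in Python, assuming a made-up rule format in which each rule lists the zero-based input indexes it reads. `SUPPLIED_INPUTS` and `validate_rule` are hypothetical names for illustration, not CrowdStrike's actual format or tooling.

```python
# Hypothetical sketch: not CrowdStrike's real content format or validator.
SUPPLIED_INPUTS = 20  # assumed number of values the sensor-side code passes in


def validate_rule(field_indexes: list[int], supplied: int = SUPPLIED_INPUTS) -> list[str]:
    """Return a list of problems; an empty list means every index is in range."""
    problems = []
    for index in field_indexes:
        # Valid zero-based indexes run from 0 to supplied - 1.
        if not 0 <= index < supplied:
            problems.append(
                f"rule reads field {index}, but only {supplied} inputs are supplied"
            )
    return problems
```

With 20 supplied inputs, `validate_rule([0, 5, 19])` returns an empty list, while `validate_rule([0, 20])` reports one problem, because index 20 is the 21st field.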

Result: what is widely described as the largest IT outage in history. An estimated $5.4 billion in Fortune 500 losses. Hospitals postponing surgeries. 911 systems going dark. Airlines grounded worldwide.

Three things struck me about this story:

  1. The CEO had been here before. George Kurtz was CTO of McAfee in 2010, when a strikingly similar disaster happened: a faulty update sent Windows XP machines into reboot loops. He went on to co-found CrowdStrike to do it better. 14 years later, the same kind of failure hit at a far larger scale.

  2. The fix required physical presence. In an era of cloud computing, the solution for many machines was: walk to each one, boot into Safe Mode, delete a file, reboot. But on encrypted machines that meant entering a BitLocker recovery key, and in many cases those keys were stored on servers that were also down. Catch-22.

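  3. Some of the most security-conscious organizations got hit hardest. Fortune 500 companies. Government agencies. Major hospitals. The ones who paid a premium for protection. Meanwhile Southwest Airlines flew largely as normal. The popular explanation was its famously old tech, but the simpler one reported afterward is that the affected systems were not running CrowdStrike.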

The root cause was a missing array bounds check. Not a cyberattack. Not a zero-day. The most basic software engineering failure you can imagine.
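For readers who have not written one in a while, a bounds check is only a few lines. The sketch below is illustrative Python, not the sensor's actual code (which runs as a Windows kernel driver), and `read_field` and `rule_matches` are hypothetical names. Python itself would raise an `IndexError` on a bad index rather than read invalid memory. The explicit check is there to show the pattern, and it also rejects negative indexes, which Python would otherwise wrap around.

```python
from typing import Optional, Sequence


def read_field(inputs: Sequence[str], index: int) -> Optional[str]:
    """Return inputs[index], or None when the index is out of range."""
    if 0 <= index < len(inputs):
        return inputs[index]
    return None


def rule_matches(inputs: Sequence[str], index: int, expected: str) -> bool:
    """A rule that cannot read its field does not match, instead of crashing."""
    value = read_field(inputs, index)
    return value is not None and value == expected
```

Given 20 inputs, asking for index 20 (the 21st field) makes `read_field` return `None`, so `rule_matches` returns `False` and the caller carries on.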

Sometimes the biggest threat to your systems isn't outside the walls. It's the update you trust.

Full story on CodeLore: https://www.youtube.com/watch?v=2Ko30Tmy31c
