Alex P

Posted on • Originally published at yoursec.substack.com

How a Metric Mix-Up Burned $327 Million

A spacecraft traveled 670 million kilometers, but died at the very end because of a misunderstanding between two pieces of software

In September 1999, an ambitious NASA mission to Mars ended in a disaster that went down in history as a huge management failure. One of the most expensive “lost in translation” mistakes in history turned a science probe into a spectacular but useless meteor

The root cause of the Mars Climate Orbiter (MCO) loss was a communication gap between two teams of developers. The contractor, Lockheed Martin, wrote the ground software (SM_FORCES) that calculated the impulse produced by thruster firings, and it reported the result in imperial units (pound-force seconds). However, the navigation team at NASA JPL expected the data in metric units (newton-seconds), as required by the Software Interface Specification

The numbers were passed along raw, with no conversion and no unit attached. The navigation software read them as newton-seconds, although every value was really in pound-force seconds and had to be multiplied by about 4.45 before use

It is ironic and dangerous that the United States is one of the few countries in the world (Liberia and Myanmar are the others usually named) that has not fully switched to the metric system. While the global science community has used the metric system for decades, letting pound-force units slip into an interplanetary mission showed a severe lack of system awareness

Just a small reminder from school: 1 pound-force (lbf) is about 4.448 newtons (N), so an impulse of 1 lbf·s is about 4.448 N·s

Physics made the situation even worse. The spacecraft’s asymmetric solar array picked up a twisting force (torque) from solar radiation pressure. To stay stable, the spacecraft had to fire its thrusters to unload its reaction wheels much more often than originally planned. Every firing was reported to the navigation software with an impulse 4.45 times smaller than the real one (the ratio of a pound-force to a newton). An error that might have stayed negligible over a handful of firings built up to critical levels because the maneuvers happened so often

# Python sketch of the fatal out-of-sync issue
# between Lockheed Martin (ground software) and NASA JPL (navigation)
# Illustrative only: the numbers are made up and this is not flight code

# ---------------------------------------------------------
# Contractor module (Lockheed Martin) - Uses imperial system
# ---------------------------------------------------------
def get_raw_thruster_force():
    # Stand-in for thruster telemetry, in pound-force (lbf)
    return 1.0

def calculate_small_forces(thrust_time_sec):
    # Developer uses pound-force (lbf)
    thrust_lbf = get_raw_thruster_force()

    # BUG: the impulse is in lbf*s and goes into the AMD file
    # without being converted to N*s
    return thrust_lbf * thrust_time_sec

# ---------------------------------------------------------
# Navigation module (NASA JPL) - Expects metric system (SI)
# ---------------------------------------------------------
def update_spacecraft_trajectory(data_from_amd_file):
    # JPL assumes the value is already in newton-seconds (N*s)
    impulse_newton_seconds = data_from_amd_file

    # 1 lbf = 4.45 N, so navigation models an impulse 4.45 times too small
    return impulse_newton_seconds

modeled_impulse = update_spacecraft_trajectory(calculate_small_forces(10.0))
actual_impulse = calculate_small_forces(10.0) * 4.45  # real impulse in N*s

# RESULT: Ground computers heavily underestimated the effect of every firing
# In reality, the spacecraft was flying lower and lower towards the planet
print(f"modeled: {modeled_impulse} N*s, actual: {actual_impulse} N*s")
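To see why the frequency of the firings mattered, here is a minimal sketch of how the gap accumulates. The per-firing impulse (0.5 lbf·s) and the firing counts are made-up illustration values, not mission data, and accumulated_impulse_error is a hypothetical helper name

```python
# Sketch: how an unconverted lbf*s value understates impulse over many firings.
# All input numbers are illustrative assumptions, not mission telemetry.

LBF_S_TO_N_S = 4.448222  # 1 pound-force second expressed in newton-seconds


def accumulated_impulse_error(impulse_lbf_s_per_firing, firings):
    """Return (modeled, actual, missing) total impulse, all in N*s."""
    reported = impulse_lbf_s_per_firing * firings  # number written to the file
    modeled = reported                 # navigation reads it as N*s, unchanged
    actual = reported * LBF_S_TO_N_S   # what the thrusters really delivered
    return modeled, actual, actual - modeled


if __name__ == "__main__":
    for firings in (1, 10, 100):
        modeled, actual, missing = accumulated_impulse_error(0.5, firings)
        print(f"{firings:>3} firings: modeled {modeled:7.2f} N*s, "
              f"actual {actual:7.2f} N*s, missing {missing:7.2f} N*s")
```

The ratio between the modeled and the real impulse stays constant, but the absolute amount of unmodeled impulse grows linearly with the number of firings, which is why a spacecraft that fires often drifts further from its predicted path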

Catastrophic Consequences

On September 23, 1999, the spacecraft began its maneuver to enter orbit

The navigators expected it to pass at an altitude of 226 km, with the absolute minimum survival height being 80 km

But because of the built-up error, the probe hit the atmosphere at just 57 km

At that altitude the spacecraft could not survive: it was most likely torn apart by aerodynamic stress and burned up, or was thrown back out into space

Damage in numbers and facts:

  • $327.6 million - the total cost of the Mars Surveyor '98 program, which covered both the MCO and the Mars Polar Lander (including $193.1M for spacecraft development, $91.7M for launch, and $42.8M for mission operations)

  • Science failure: The MCO probe was supposed to act as the main data relay for the next spacecraft, the Mars Polar Lander (MPL). Sadly, in December 1999, the MPL was also lost during landing, most likely because of a different software bug (the engines shut off too early). This made 1999 one of the worst years in the history of Mars exploration

Lessons for the IT Industry

  1. Strict typing for physical values and clear naming. Never use raw types (double, float) for measurements. Use Value Objects (like NewtonSecond or PoundSecond classes) so the compiler stops you from mixing units. For example, some developers at banks create their own Money types, or separate UserId and CompanyId types, so they don’t accidentally add different currencies or confuse a user with a company. If strict types are impossible, arguments and methods must clearly state the unit in their name: for example, calculateThrustInNewtons instead of calculateThrust
  2. Integration testing is not just a bunch of Unit tests. Each team tested its own code, but nobody checked how the data flowed between the two systems in a real end-to-end (E2E) scenario. E2E tests and long simulations with real data exchange files are among the few ways to catch hidden errors that build up over time and don’t show up in short, isolated tests
  3. Investigate all anomalies deeply. Any time a system behaves differently than expected, it must be studied immediately. JPL navigators actually noticed strange things a week before the crash: the spacecraft needed path corrections too often, and its approach speed didn’t perfectly match the model. These “quiet alarms” were not acted on. Remember: if reality differs from your model by even a fraction of a percent, it is a reason to pull the emergency brake, not just hope for the best
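The first two lessons can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the parse_impulse helper and the {"value", "unit"} record format are hypothetical and not taken from any real flight software. Python has no compiler to reject a wrong unit, so the sketch checks at runtime; with the type hints in place, a static checker could flag the same mistake earlier

```python
# Sketch of lessons 1 and 2: unit-carrying value objects plus a checked boundary.
# Every name here is an illustrative assumption, not a real flight API.
from dataclasses import dataclass

LBF_S_TO_N_S = 4.448222  # 1 pound-force second expressed in newton-seconds


@dataclass(frozen=True)
class NewtonSeconds:
    value: float

    def __add__(self, other):
        # Only identical units may be added; anything else raises TypeError
        if not isinstance(other, NewtonSeconds):
            return NotImplemented
        return NewtonSeconds(self.value + other.value)


@dataclass(frozen=True)
class PoundForceSeconds:
    value: float

    def to_newton_seconds(self) -> NewtonSeconds:
        # The conversion lives in exactly one place
        return NewtonSeconds(self.value * LBF_S_TO_N_S)


def parse_impulse(record: dict) -> NewtonSeconds:
    """Read a hypothetical exchange record like {"value": 2.0, "unit": "lbf*s"}."""
    value, unit = float(record["value"]), record["unit"]
    if unit == "N*s":
        return NewtonSeconds(value)
    if unit == "lbf*s":
        return PoundForceSeconds(value).to_newton_seconds()
    raise ValueError(f"unknown impulse unit: {unit!r}")


def update_trajectory(impulse: NewtonSeconds) -> float:
    # Navigation refuses anything that is not explicitly newton-seconds
    if not isinstance(impulse, NewtonSeconds):
        raise TypeError(f"expected NewtonSeconds, got {type(impulse).__name__}")
    return impulse.value


if __name__ == "__main__":
    print(update_trajectory(parse_impulse({"value": 2.0, "unit": "lbf*s"})))
    try:
        update_trajectory(PoundForceSeconds(2.0))
    except TypeError as err:
        print("rejected:", err)
```

The design choice is that a bare number never crosses the boundary between teams: the exchange record names its unit, the parser converts or rejects it, and the consumer accepts only one type. An end-to-end test that feeds a real exchange file through parse_impulse into the navigation code would then exercise the exact seam where the MCO data went wrong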

Space requires perfect interfaces. The story of the MCO is a painful reminder that even the most complex and advanced system will fall apart if its parts speak different languages
