Shivam Rai

Priority Inversion and the NASA Mars Pathfinder Bug

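Priority inversion happens when a high-priority task is forced to wait because a low-priority task is holding a resource it needs. On its own, that wait lasts only as long as the low-priority task holds the resource. The problem becomes much worse when a medium-priority task preempts the low-priority task and keeps running: the low-priority task cannot finish and release the resource, so the high-priority task is blocked indirectly for as long as the medium-priority task runs.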

The most famous real-world example occurred on NASA’s Mars Pathfinder mission in 1997. The spacecraft’s flight software ran several tasks at different priority levels under a preemptive priority scheduler. A low-priority task gathered meteorological data and ran infrequently. A high-priority task managed the shared information bus and needed to run frequently. A medium-priority task handled communications and could run for a comparatively long time.

At one point, the low-priority task acquired a mutex guarding a shared resource so that it could publish its data. Before it could release the mutex, the high-priority task needed the same mutex and blocked. Normally the low-priority task would finish quickly, but the medium-priority task became ready, preempted it, and kept running. This meant the high-priority task was stuck waiting for a low-priority task that was unable to run. A watchdog timer noticed that the high-priority task had not run on schedule, treated this as a serious fault, and reset the system. This happened repeatedly.
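The sequence above can be reproduced with a small tick-based simulation. This is only a sketch under made-up assumptions: the task names, priorities, arrival times, and work amounts are hypothetical and are not Pathfinder's real values, and the scheduler is a toy fixed-priority preemptive one with a single mutex and no priority inheritance.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int      # larger number = higher priority
    arrival: int       # tick at which the task becomes ready
    work: int          # ticks of CPU time still needed
    needs_mutex: bool  # True if all of the task's work runs under the shared mutex


def simulate(tasks, max_ticks=50):
    """Toy fixed-priority preemptive scheduler, one mutex, no inheritance."""
    owner = None    # task currently holding the mutex
    timeline = []   # name of the task that ran during each tick
    finish = {}     # task name -> tick at which it completed
    for tick in range(max_ticks):
        ready = [t for t in tasks if t.arrival <= tick and t.work > 0]
        # A task that needs the mutex cannot run while another task holds it.
        runnable = [t for t in ready
                    if not t.needs_mutex or owner is None or owner is t]
        if not runnable:
            if all(t.work == 0 for t in tasks):
                break
            timeline.append("idle")
            continue
        current = max(runnable, key=lambda t: t.priority)
        if current.needs_mutex:
            owner = current          # take (or keep) the mutex
        current.work -= 1
        timeline.append(current.name)
        if current.work == 0:
            finish[current.name] = tick + 1
            if owner is current:
                owner = None         # release the mutex on completion
    return timeline, finish


# Hypothetical workload: low locks first, high blocks on it, medium preempts low.
tasks = [
    Task("low", priority=1, arrival=0, work=3, needs_mutex=True),
    Task("high", priority=3, arrival=1, work=1, needs_mutex=True),
    Task("medium", priority=2, arrival=2, work=5, needs_mutex=False),
]
timeline, finish = simulate(tasks)
print(" ".join(timeline))
print("high finished at tick", finish["high"])
```

With these numbers, `high` becomes ready at tick 1 and needs only one tick of work, yet it does not finish until tick 9, because `medium` occupies the CPU while `low` still holds the mutex. A watchdog with a shorter limit than that would fire.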

The engineers fixed the issue by enabling a mechanism called priority inheritance on the mutex. With this technique, when a low-priority task holds a resource that a high-priority task is waiting for, the system temporarily raises the low-priority task to the waiting task’s priority. Medium-priority tasks can then no longer preempt it, so it finishes its work quickly, releases the resource, and returns to its original priority, removing the blockage.
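The boosting rule described above can be sketched as plain bookkeeping. This is an illustrative model, not the flight software or any real RTOS API: the `Task` and `InheritanceMutex` classes are hypothetical, nothing actually blocks, and restoring the base priority on unlock is only correct under the simplifying assumption that a task holds one mutex at a time.

```python
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority  # priority assigned by the designer
        self.priority = priority       # effective priority the scheduler would use


class InheritanceMutex:
    """Bookkeeping for a priority-inheritance mutex (no real threads or blocking)."""

    def __init__(self):
        self.owner = None
        self.waiters = []

    def lock(self, task):
        """Return True if `task` acquired the mutex, False if it has to wait."""
        if self.owner is None:
            self.owner = task
            return True
        self.waiters.append(task)
        # Inheritance: the owner runs at least as high as its highest waiter.
        self.owner.priority = max(self.owner.priority, task.priority)
        return False

    def unlock(self):
        """Release the mutex, drop the boost, hand off to the highest waiter."""
        self.owner.priority = self.owner.base_priority
        self.owner = None
        if self.waiters:
            nxt = max(self.waiters, key=lambda t: t.priority)
            self.waiters.remove(nxt)
            self.owner = nxt
            # The new owner inherits from any tasks that are still waiting.
            for waiter in self.waiters:
                nxt.priority = max(nxt.priority, waiter.priority)
        return self.owner


low, medium, high = Task("low", 1), Task("medium", 2), Task("high", 3)
mutex = InheritanceMutex()
mutex.lock(low)                 # low takes the mutex
mutex.lock(high)                # high must wait, so low is boosted
print(low.priority > medium.priority)   # low now outranks medium
mutex.unlock()                  # low releases; high becomes the owner
print(low.priority, mutex.owner.name)
```

While `high` is waiting, `low` runs at priority 3, so a scheduler that always picks the highest effective priority would choose it over `medium`. After the unlock, `low` is back at priority 1 and `high` owns the mutex.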

This incident is a classic lesson in real-time systems: even simple concurrency issues can cause major failures if priority inversion is not handled correctly.

Top comments (1)

Ian Tepoot

This was a very interesting read, thank you. I used to love following Pathfinder, but I actually never knew this little tidbit.