Have you ever wondered how your 6-core MacBook is able to run more than 6 programs at once? Each core is executing one instruction at a time, it’s bizarre to me that it’s able to run more than 6 applications. That’s the magic of the operating system.
The most fundamental piece of hardware in a computer is its CPU. The CPU (central processing unit) executes instructions given to it and performs logic computations very quickly. In this article, we’ll talk about how the operating system makes it seem like all the programs on your computer have exclusive access to the CPU resource, when in reality, it’s shared across all of them.
To do this, the operating system uses an abstraction called a process. A process is an executing instance of a program. Each process is given some time to use the CPU and then that access is revoked and another process is given some time to use the CPU. This happens so quickly, humans don’t notice a pause when using these programs.
So how does a program get run in the first place? A program is just a file with data in it. The data is stored on a computer’s disk. To run a program, the operating system must first take the data on disk and move it to memory since a CPU cannot directly access the disk. For a CPU to interact with the disk, it must initiate an I/O request which, at a high level, involves sending messages back and forth between itself and the disk’s controller. The I/O is complete once the data is moved from disk into memory. Additionally, the operating system will allocate some stack and heap memory for the process to use during run-time.
Once a process has been created it is put into the READY state. The operating system groups processes into different states which can be generically categorized as READY, RUNNING, or BLOCKED. READY implies that the operating system can schedule it to be run on a CPU. Once the operating system has scheduled a program to run it is put in the RUNNING state as its instructions are executed on the CPU. Now, say the program wants to interact with disk, instead of waiting for the disk I/O to complete, which can take awhile, the process is put in a BLOCKED state in which it is descheduled from the CPU. During this time, another READY process can run on the CPU. Once disk I/O is completed, the process is put back into the READY state where the operating system can choose to schedule it again.
Wait a minute...if the operating system schedules a process to run on the CPU and the CPU is running the instructions of that process, how is the operating system able to tell the CPU to start executing the instructions of a different process? What if that process runs in an infinite-loop, does that mean no other processes on the operating system will be able to run? After all, the operating system is also just software running on the CPU. The answer to this question is a hardware-based timer-interrupt. This timer-interrupt runs at a predetermined interval and when it is run, the CPU jumps to a specific location in memory and starts executing instructions at that memory address. This memory address is the operating systems timer interrupt handler. At a high-level, the interrupt handler will determine whether or not the currently running process has been running too long and needs to be descheduled in favor of another READY process. If that’s the case, then the operating system will make the switch.
But what does “make the switch” even mean? The operating system has to take the state of the currently running process, save it into the operating system’s memory, take the state of the to-be-scheduled process from the operating system’s memory, load it and then run it on the CPU. The state that we’re referring to here is the CPUs registers. When a CPU does logical operations, it operates on values stored in registers on the CPU. Things like the memory address of the currently executing instruction are stored in a register.
One thing we didn’t talk about in detail is how the operating system decides what program should be run next. Scheduling is a widely studied discipline but we will only touch a brief part of it. The goals of an operating system scheduler are generally to minimize the amount of idle time (the amount of time a process isn’t running) while also ensuring fairness (all processes get an equal chunk of time to run). Let’s look at a couple scheduling policies and understand their pros and cons.
First-in First-Out This implies that a process that is scheduled will run to completion before another process is scheduled on the operating system. The benefit is that there is no overhead of context switching (saving registers into memory and then loading them again later) so it may minimize the amount of time necessary for all programs to reach completion. However, it’s not necessarily fair. What happens if program A is constantly running and never stops? Program B which was added after program A will never run!
Round Robin Round robin works by always switching the scheduled process during a timer interrupt. It gives each process a time slice and once that time slice has expired, the next process in line is scheduled. This scheduling policy is the most fair but the constant context switches could drastically degrade performance.
Now that we understand how the physical resource of the CPU is virtualized by the operating system, it should make more sense how the operating system is able to give the illusion that all processes have exclusive access to the CPU. One burning question you may have is: how does the operating give each process the illusion that it each has its own address space?