The Linux Programming Interface - Multiprocessing

#linux #c #lowcode #github

Let's talk about the multiprocessing mechanism in Linux!

Before diving into multiprocessing, let me explain the core terms that you must know.

First, you need to understand what a program and a process are, and how they relate. A program is a file containing the information that describes how to construct a process at run time. It follows that a process is an instance of an executing program.

From a low-level perspective, a program file contains the following:

Binary format identification: Each program file includes metainformation about the format of the executable file. Nowadays, the ELF (Executable and Linking Format) is the standard format.
Machine-language instructions: These are the machine code generated from the source files by the compiler. At run time they are mapped into the text segment (the .text section) of the process's virtual memory.
Program entry-point address: This identifies the location of the instruction at which execution of the program should commence.
Data: The program file also includes the data used by the program. If you inspect the virtual memory layout of a process, you will see two different data sections: .data and .bss. The compiler places initialized global and static variables in .data, and uninitialized (zero-filled) ones in .bss. There is also a section named .rodata, which holds read-only data such as string literals and const-qualified global variables.
Symbol and relocation tables: These are used for a variety of purposes, including debugging and run-time symbol resolution (dynamic linking).
Shared-library and dynamic-linking information: The program file includes fields listing the shared libraries that the program needs to use at run time and the pathname of the dynamic linker that should be used to load these libraries.
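
As a quick check of the first item in the list, here is a minimal sketch that reads the first four bytes of a file and compares them with the ELF magic number (the byte 0x7f followed by the letters E, L, F). The helper name is_elf_file is my own choice for illustration, not a standard API.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the file at `path` starts with the ELF magic bytes,
 * 0 if it does not, and -1 if the file cannot be opened.
 * (is_elf_file is a hypothetical helper name.) */
int is_elf_file(const char *path)
{
    unsigned char magic[4];
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    size_t n = fread(magic, 1, sizeof(magic), fp);
    fclose(fp);

    if (n != sizeof(magic))
        return 0;               /* too short to carry an ELF header */

    /* Every ELF file begins with 0x7f 'E' 'L' 'F'. */
    return memcmp(magic, "\177ELF", sizeof(magic)) == 0;
}
```

On Linux, calling is_elf_file("/proc/self/exe") from a compiled program should return 1, since that path refers to the running executable itself.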

From the kernel's point of view, a process is an abstract entity to which system resources are allocated in order to execute a program. You can treat this as an alternative definition of a process.

Once a program is run, the kernel allocates the required resources and sets up the process's virtual memory, which is divided into several segments. I've already mentioned .text, .data, .bss and .rodata. Apart from these, there are two more important areas: the stack and the heap. Unlike the others, they are not sections of the program file; they exist only at run time. The stack is a segment that grows and shrinks as functions are called and return (on most architectures, including x86, it grows toward lower memory addresses). It consists of stack frames, one per currently active function call, and each frame holds that function's arguments, its local variables, and call linkage information such as the return address. The heap is an area from which memory can be dynamically allocated at run time. The top end of the heap is called the program break (see the brk() system call).
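
To see these regions for yourself, the following sketch prints one representative address from each of them. The variable and function names are made up for the example, the section named in each comment is where a typical GCC/ELF toolchain places that object, and the actual addresses will differ from run to run when address space randomization is enabled.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int initialized_global = 42;            /* initialized data: .data */
int uninitialized_global;               /* uninitialized data: .bss, zero-filled at start-up */
const char read_only_text[] = "hello";  /* read-only data: .rodata */

/* Prints one representative address per memory region.
 * Returns 0 on success, -1 if the heap allocation fails. */
int print_memory_layout(void)
{
    int local = 0;                      /* lives in this function's stack frame */
    char *dynamic = malloc(64);         /* small requests usually come from the heap */

    if (dynamic == NULL)
        return -1;

    printf("text (function)   : %p\n", (void *)print_memory_layout);
    printf("rodata            : %p\n", (void *)read_only_text);
    printf("data              : %p\n", (void *)&initialized_global);
    printf("bss               : %p\n", (void *)&uninitialized_global);
    printf("heap              : %p\n", (void *)dynamic);
    printf("program break     : %p\n", sbrk(0));   /* current top of the heap */
    printf("stack             : %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

Calling print_memory_layout() from main() is enough to try it. On a typical x86-64 Linux system the stack address should come out far above the others, but treat that as an observation rather than a guarantee.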

Now that we have covered the basics of programs and processes, it's time to discuss multiprocessing.

Multiprocessing means running multiple processes simultaneously, and how to do that is the key topic of this post.

Any process (the parent) can create a new process (the child) with the fork() system call. It is the standard way to create a child process. The newly created child is an (almost) exact duplicate of its parent.

pid_t fork(void);

The fork() system call returns:

-1 in the parent if an error occurred (no child is created and errno is set, for example to EAGAIN, ENOMEM or ENOSYS)
0 in the child process
the child's process ID (a positive value) in the parent
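
The three return values are usually handled with a three-way branch. Here is a minimal sketch of that pattern; fork_once and CHILD_EXIT_CODE are names I made up for the example, and the parent waits for the child at the end only so that no zombie process is left behind.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_EXIT_CODE 7   /* arbitrary value chosen for the example */

/* Forks once. Returns the child's exit status as seen by the parent,
 * or -1 if fork() or waitpid() fails. */
int fork_once(void)
{
    fflush(stdout);                 /* avoid duplicating buffered output in the child */

    pid_t pid = fork();

    if (pid == -1) {                /* error: no child was created */
        perror("fork");
        return -1;
    }

    if (pid == 0) {                 /* fork() returned 0: this is the child */
        printf("child:  my pid is %ld\n", (long)getpid());
        fflush(stdout);
        _exit(CHILD_EXIT_CODE);     /* leave without running inherited exit handlers */
    }

    /* fork() returned a positive value: this is the parent,
     * and pid holds the child's process ID. */
    printf("parent: created child %ld\n", (long)pid);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both processes continue from the point where fork() returned, so the code after the call runs twice: once in the parent and once in the child. The return value is the only way to tell the two apart.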

After forking, the child process is (mostly) used to execute a new program. This is called execing. For this purpose, the execve() system call is used, usually through one of the library functions layered on top of it, such as execv(), execle() or execl().

int execve(const char *pathname, char *const argv[], char *const envp[]);

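This system call replaces the program running in the calling process: the existing text, data, heap and stack are discarded and replaced with those of the new program, while the process ID stays the same. When it is called in the child, the child's memory layout therefore becomes completely different from its parent's. On success execve() does not return; it returns -1 only if the new program could not be started.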
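
Putting fork() and execve() together gives the classic "fork, then exec" pattern. The sketch below runs a shell command in a child process. It assumes that /bin/sh exists, which is the case on practically every Linux system; run_shell_command is a hypothetical helper name, and the exit status 127 for a failed exec is borrowed from shell convention.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;      /* the current process's environment */

/* Runs `command` through /bin/sh in a child process.
 * Returns the child's exit status, or -1 on failure. */
int run_shell_command(const char *command)
{
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with /bin/sh. */
        char *const argv[] = { "sh", "-c", (char *)command, NULL };

        execve("/bin/sh", argv, environ);
        /* execve() returns only if it failed. */
        _exit(127);
    }

    /* Parent: wait for the child and report how it ended. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_shell_command("exit 3") returns 3, because the shell that replaced the child exits with that status.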

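The parent process generally wants to monitor its child and find out when and how it terminated. This is done with wait() or one of its relatives: waitpid(), waitid(), wait3() and wait4() (in glibc, wait() is typically implemented on top of the wait4() system call). Through these calls the parent retrieves the termination status of its child, which tells whether the child exited normally (and with which exit status) or was killed by a signal.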

pid_t wait(int *wstatus);
pid_t waitpid(pid_t pid, int *wstatus, int options);
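
The value stored in wstatus is not the exit code itself; it has to be decoded with the macros from <sys/wait.h>. The following sketch shows both outcomes, a normal exit and death by signal. The function names spawn_and_wait and print_wait_status are invented for this example.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that kills itself with signal `sig` (if sig != 0) or
 * otherwise exits with `code`. Stores the raw wait status in *wstatus.
 * Returns 0 on success, -1 if fork() or waitpid() fails. */
int spawn_and_wait(int sig, int code, int *wstatus)
{
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (sig != 0)
            kill(getpid(), sig);    /* does not return if the signal is fatal */
        _exit(code);
    }

    /* Wait for this specific child; wait(wstatus) would accept any child. */
    return waitpid(pid, wstatus, 0) == -1 ? -1 : 0;
}

/* Decodes a wait status with the macros from <sys/wait.h>. */
void print_wait_status(int wstatus)
{
    if (WIFEXITED(wstatus))
        printf("exited normally, exit status %d\n", WEXITSTATUS(wstatus));
    else if (WIFSIGNALED(wstatus))
        printf("killed by signal %d\n", WTERMSIG(wstatus));
    else
        printf("other state change (status 0x%x)\n", (unsigned)wstatus);
}
```

A typical use is to call spawn_and_wait(SIGKILL, 0, &st) and then pass st to print_wait_status(), which should take the "killed by signal" branch.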

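Lastly, both the parent and the child process need to terminate. This is done with exit(), a library function that runs the registered exit handlers, flushes the stdio buffers, and then calls the _exit() system call. The exit status passed to exit() is made available to the parent, but only its low 8 bits (0 to 255) are reported. By convention 0 means success and a nonzero value means failure; <stdlib.h> provides two portable constants for this, which glibc defines as EXIT_SUCCESS (0) and EXIT_FAILURE (1).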
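
The difference between exit() and _exit() is easy to observe with an exit handler. In this sketch, the child registers a handler with atexit() that writes one byte into a pipe, and the parent checks whether that byte ever arrives. All names here (notify_fd, on_exit_handler, handler_ran_in_child) are mine, chosen only for the demonstration.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int notify_fd = -1;      /* write end of the pipe, used by the handler */

/* Exit handler: tells the parent that it was executed. */
static void on_exit_handler(void)
{
    const char mark = 'x';

    if (write(notify_fd, &mark, 1) != 1) {
        /* Nothing useful can be done about a failed write here. */
    }
}

/* Forks a child that terminates with exit() (use_library_exit != 0)
 * or with _exit() (use_library_exit == 0).
 * Returns 1 if the child's exit handler ran, 0 if it did not, -1 on error. */
int handler_ran_in_child(int use_library_exit)
{
    int fds[2];

    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        notify_fd = fds[1];
        atexit(on_exit_handler);
        if (use_library_exit)
            exit(EXIT_SUCCESS);     /* runs handlers, flushes stdio, then _exit() */
        _exit(EXIT_SUCCESS);        /* terminates immediately, handlers are skipped */
    }

    /* Parent: read one byte, or see end-of-file if nothing was written. */
    close(fds[1]);
    char buf;
    ssize_t n = read(fds[0], &buf, 1);
    close(fds[0]);

    if (waitpid(pid, NULL, 0) == -1)
        return -1;
    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

handler_ran_in_child(1) returns 1 and handler_ran_in_child(0) returns 0. This is also why a forked child that does not exec usually terminates with _exit(): it avoids running handlers and flushing stdio buffers that were inherited from the parent.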