What are zombie processes, where do they come from and how do we get rid of them?
First we need to understand how to create a new process in a Unix operating system and how processes relate to each other.
Parent-child Relationship of Processes
To create a new process in a Unix operating system you fork(2) or, on Linux, clone(2) the current process, which becomes the parent of the new child process. This parent-child relationship between processes can be traced all the way up to the root, which is the process with id 1, typically /sbin/init, although on macOS this is /sbin/launchd.
We can see these parent-child relationships in the output of ps -ax -o pid,ppid,cmd:
$ ps -ax -o pid,ppid,cmd
PID PPID CMD
1 0 /sbin/init
...
297 1 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
...
615 297 sshd: julien [priv]
641 615 sshd: julien@pts/0
642 641 -bash
793 642 ps -ax -o pid,ppid,cmd
You can trace the ps command with process id 793 to its parent bash (642), then to sshd (641), and so on up to init (1).
Creating a Zombie
Now that we know how a process is created and who its parent is, what happens when a process exits?
When a process exits its resources are cleaned up by the kernel, but the process table entry is kept until the process's parent reads the termination status via the wait(2) system call. Such a process that has exited but is still present in the process table is called a zombie process.
From the exit(3) man page:
[...] the child becomes a "zombie" process: most of the process resources are recycled, but a slot containing minimal information about the child process (termination status, resource usage statistics) is retained in process table. This allows the parent to subsequently use waitpid(2) (or similar) to learn the termination status of the child; at that point the zombie process slot is released.
If we look at the running processes while the following C program runs we can see this in action.
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/wait.h>

int main(void)
{
    int stat;
    pid_t pid, wait_pid;

    if ((pid = fork()) < 0)
    {
        perror("fork error");
        exit(1);
    }
    else if (pid == 0)
    {
        /* Child: wait 3 seconds and then exit */
        sleep(3);
        exit(0);
    }
    else
    {
        /* Parent: wait 5 seconds before calling wait.
           After 3 seconds the child will become a zombie */
        sleep(5);
        if ((wait_pid = wait(&stat)) < 0)
        {
            perror("wait error");
            exit(1);
        }
        /* the child is now removed from the process table */
        sleep(5);
    }
}
At first we can see the parent (5084) and child (5085) processes running, or rather sleeping, as indicated by the S in the ps output:
$ ./zombie &
[1] 5084
$ ps -ax -o pid,ppid,state,cmd
5084 642 S ./zombie
5085 5084 S ./zombie
Once the child process exits, it is still present in the process table, but now in the Z state. This is also reflected in its name, [zombie] <defunct>:
$ ps -ax -o pid,ppid,state,cmd
5084 642 S ./zombie
5085 5084 Z [zombie] <defunct>
Finally, after the parent process calls wait, the child process is removed from the process table:
$ ps -ax -o pid,ppid,state,cmd
5084 642 S ./zombie
Reparenting
If we remove the wait call from the parent process and the parent exits before the child does, pid 1 (init) becomes the child's new parent. One of init's jobs is to call wait whenever one of its children terminates. This ensures that zombies are eventually cleaned up, even if a program doesn't wait on its child processes before terminating.