Linux Multiprocessing

Process 0, 1 and 2

There are three initial processes in Linux:

  • process 0: the idle process (also called swapper), the first process, set up by the kernel
    at boot. It starts process 1 and process 2.
  • process 1: the systemd process, which initializes the system. It’s the ancestor process of
    all the user processes.
  • process 2: the kthreadd process, which is the parent of all kernel threads.

Every process in Linux has a unique non-negative process id. At any given time, there can’t be
two different processes with the same id, although an id can be reused after its process has
exited. We can use ps -ef | grep process_name to check the details of the target process. The
output has the following columns:

  • UID is the user who started the process
  • PID is the process id
  • PPID is the parent process id
  • C is the CPU utilization of the process
  • STIME is the process start time
  • TTY is the controlling terminal of the process, which is not that important for now
  • TIME is the cumulative CPU time the process has used (not the time since it started)
  • CMD is the command used to start this process

In the ps -ef output, the PPID of both process 1 and process 2 is 0, so process 0 is their
parent. Processes 1 and 2 then start all the other processes. In other words, all processes can
be traced back to process 1 or process 2, and ultimately to process 0.

For example, in one ps -ef listing, the command ps -ef has a process id of 20110. Its parent
process is 20074, which is bash, whose parent process is 20073, which is sshd, whose parent
process is 20069, whose parent process is 1190, whose parent process is 1. So any process can
be traced back to process 1 or process 2.

Similarly, if we check the process running book, we can also trace it back to process 1.

PID

There are two C functions to get process ids.

  • getpid: Get the process id
  • getppid: Get the parent process id

Let’s illustrate with a simple program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  printf("getpid()=%d\n", getpid());
  printf("getppid()=%d\n", getppid());

  sleep(50);
}

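Compile and run the program. While it sleeps for 50 seconds, check the running process with ps -ef from another terminal: the PID and PPID columns should match the two values the program printed.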

How to create a process in a program

  • To create a new process, we invoke the fork function in an existing process. The new
    process is the child process and the original process is the parent process.
  • Both processes continue with the rest of the code after the fork call.
  • The fork function is invoked once and returns twice: it returns 0 in the child process and
    the child process’ pid in the parent process.
  • The child process is a copy of the parent process. It gets its own copy of the parent’s data
    space, stack and heap, so the two processes don’t share that data.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  printf("aaa=%d\n", getpid());
  sleep(10);
  printf("bbb=%d\n", getpid());

  fork();

  printf("ccc=%d\n", getpid());
  sleep(30);
  printf("ddd=%d\n", getpid());
}

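At the beginning there’s only one process, so aaa and bbb are printed once. Then the fork function is invoked to create a child process. Finally, both the parent and the child process continue with the rest of the program, so ccc and ddd are each printed twice, once with each pid.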

The fork function returns twice. Let’s verify this by printing its return value:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  printf("aaa=%d\n", getpid());
  sleep(10);
  printf("bbb=%d\n", getpid());

  int pid = fork();
  printf("pid = %d\n", pid);
  sleep(2);

  printf("ccc=%d\n", getpid());
  sleep(30);
  printf("ddd=%d\n", getpid());
}

Compile and run:

fork returned twice: in the parent process it returned the child process’ pid, and in the child process it returned 0. We can therefore branch on the return value and run different work in each process:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  int pid = fork();

  if (pid == 0)
  {
    printf("This is the child process %d, it will run the child process tasks\n", getpid());
    sleep(20);
  }
  if(pid > 0)
  {
    printf("This is the parent process %d, it will run the parent process tasks\n", getpid());
    sleep(30);
  }
}

In the above example fork succeeded. However, it can also fail, in which case it returns -1 in the calling process and no child process is created. Typical causes are insufficient memory or reaching the limit on the number of processes.

The child process and parent process don’t share data

We can illustrate it with the example below:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  int ii = 1;
  int pid = fork();

  if (pid == 0)
  {
    printf("This is the child process %d, it will run the child process tasks\n", getpid());
    printf("The child process ii=%d\n", ii++); sleep(1);
    printf("The child process ii=%d\n", ii++); sleep(1);
    printf("The child process ii=%d\n", ii++); sleep(1);
    printf("The child process ii=%d\n", ii++); sleep(1);
    printf("The child process ii=%d\n", ii++); sleep(1);
  }
  if(pid > 0)
  {
    printf("This is the parent process %d, it will run the parent process tasks\n", getpid());
    printf("The parent process ii=%d\n", ii); sleep(1);
    printf("The parent process ii=%d\n", ii); sleep(1);
    printf("The parent process ii=%d\n", ii); sleep(1);
    printf("The parent process ii=%d\n", ii); sleep(1);
    printf("The parent process ii=%d\n", ii); sleep(1);
  }
}

The variable ii changes over time in the child process, but the value remains unchanged in the parent process.

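A Note on File Descriptors

Files opened by the parent process before fork are inherited by the child process: the child gets its own copies of the open file descriptors and of the FILE stream built on top of them. Both copies refer to the same open file, so writes from either process end up in the same file.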

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  FILE *fp = fopen("/tmp/tmp.txt", "w+");
  fprintf(fp, "I'm a kick-ass coder.\n");
  fflush(fp); // flush the stdio buffer before fork, otherwise the child inherits the unwritten text and it is written twice
  int ii = 1;
  int pid = fork();

  if (pid == 0)
  {
    printf("This is the child process %d, it will run the child process tasks\n", getpid());
    fprintf(fp, "child process: clean code coder.\n");
  }
  if(pid > 0)
  {
    printf("This is the parent process %d, it will run the parent process tasks\n", getpid());
    fprintf(fp, "parent process: deep dive.\n");
  }
  fclose(fp);
}

Compile and run the program

The child and parent processes are independent of each other. If the file is closed in the child process, the parent process is not affected, and vice versa.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  FILE *fp = fopen("/tmp/tmp.txt", "w+");
  fprintf(fp, "I'm a kick-ass coder.\n");
  fflush(fp); // flush the stdio buffer before fork, otherwise the child inherits the unwritten text and it is written twice
  int ii = 1;
  int pid = fork();

  if (pid == 0)
  {
    printf("This is the child process %d, it will run the child process tasks\n", getpid());
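    fclose(fp); // closes only the child's copy of the stream
    fprintf(fp, "child process: clean code coder.\n"); // fp is invalid after fclose: undefined behavior, kept only for the demonstration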
  }
  if(pid > 0)
  {
    printf("This is the parent process %d, it will run the parent process tasks\n", getpid());
    fprintf(fp, "parent process: deep dive.\n"); sleep(1);
    fprintf(fp, "parent process: clean code.\n"); sleep(1);
    fprintf(fp, "parent process: deep dive.\n"); sleep(1);
    fprintf(fp, "parent process: clean code.\n"); sleep(1);
    fclose(fp); // closed only in the parent branch, because the child has already closed its own copy
  }
}

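In the example above, the child process closes its copy of the stream before writing, so its write does not reach the file, while the parent process can still write through its own copy. Strictly speaking, calling fprintf on a stream after fclose is undefined behavior in C, so the child’s write is there only to make the point.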

Orphan Process

An orphan process is a process whose parent process has finished or terminated, though it remains running itself.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  int pid = fork();

  if (pid == 0)
  {
    printf("This is the child process %d, it will run the child process tasks\n", getpid());
    sleep(10);
  }
  if(pid > 0)
  {
    printf("This is the parent process %d, it will run the parent process tasks\n", getpid());
    sleep(5);
  }
}

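When the parent process ends, the child process is still running, but its parent process is now process 1. (On some systems an orphan is adopted by another designated process instead, such as a per-user systemd instance.) An orphan process is not as harmful as a zombie process, because its new parent collects it when it exits.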

Zombie Process

A zombie process, or defunct process, is a process that has completed execution but still has an entry in the process table.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  int pid = fork();

  if (pid == 0)
  {
    printf("This is the child process %d, it will run the child process tasks\n", getpid());
    sleep(5);
  }
  if(pid > 0)
  {
    printf("This is the parent process %d, it will run the parent process tasks\n", getpid());
    sleep(10);
  }
}

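Compile and run the program. The child process exits after 5 seconds, but the parent process keeps sleeping for another 5 seconds without collecting the child’s exit status. During that time ps -ef shows the child process as <defunct>.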

There are three ways to avoid zombie processes:

Method 1: Ignore SIGCHLD signal

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>

int main()
{
  signal(SIGCHLD, SIG_IGN);
  int pid = fork();

  if (pid == 0)
  {
    printf("This is the child process %d, it will run the child process tasks\n", getpid());
    sleep(5);
  }
  if(pid > 0)
  {
    printf("This is the parent process %d, it will run the parent process tasks\n", getpid());
    sleep(10);
  }
}

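In this case the parent process ignores the SIGCHLD signal, so when the child process ends the kernel removes it right away and there won’t be a defunct process in the process table. The trade-off is that the parent process can no longer collect the child’s exit status.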

Method 2: Wait for the child process in the parent process

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
#include <sys/wait.h>

int main()
{
  int pid = fork();

  if (pid == 0)
  {
    printf("This is the child process %d, it will run the child process tasks\n", getpid());
    sleep(5);
  }
  if(pid > 0)
  {
    printf("This is the parent process %d, it will run the parent process tasks\n", getpid());
    int sts;
    wait(&sts);
    sleep(10);
  }
}

The disadvantage of this method is that wait blocks: the parent process can’t do anything else until a child process terminates.

Method 3: Install a SIGCHLD signal handler that waits for the child process

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

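// SIGCHLD handler: runs when a child terminates and collects it,
// so the child does not stay in the process table as a zombie
void func(int sig)
{
  (void)sig; // the signal number is not needed
  int sts;
  wait(&sts);
}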

int main()
{
  signal(SIGCHLD, func);
  int pid = fork();

  if (pid == 0)
  {
    printf("This is the child process %d, it will run the child process tasks\n", getpid());
    sleep(5);
  }
  if(pid > 0)
  {
    printf("This is the parent process %d, it will run the parent process tasks\n", getpid());
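    sleep(10); // returns early when SIGCHLD arrives and the handler has run
    sleep(10); // keeps the parent alive afterwards so the result can be checked with ps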
  }
}

Reprint policy

《Linux Multiprocessing》 by Isaac Zhou is licensed under a Creative Commons Attribution 4.0 International License.