Then I checked the running processes.
- Why the hell do I have 30K tasks?
- Oops, it's 30997 #zombie processes!
2/n
Definition: A #zombie process is a process that has completed execution but still has an entry in the process table.
Causes: Zombie processes occur when child processes have completed execution, and their exit status needs to be read by the parent process.
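Effects: A zombie has already released its memory and file descriptors; what it still holds is its entry in the process table, and therefore its PID. With enough of them the system can run out of PIDs and fail to start new processes.
The presence of a few zombie processes is usually harmless, but having too many usually indicates a bug in the parent process, which is not reading its children's exit status.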
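To make the definition concrete, here is a minimal sketch (not what ran on my server) that creates one zombie on purpose and then reaps it. It assumes Linux, because it reads the process state from `/proc`; the helper names `proc_state` and `zombie_demo` are made up for this illustration.

```python
import os


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat", errors="replace") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces, so the state is the first field after the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def zombie_demo():
    """Create a zombie child, observe it, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately

    # Block until the child has exited, but leave it unreaped (WNOWAIT),
    # so it stays in the process table as a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state_before = proc_state(pid)

    # Reading the exit status is what removes the process-table entry.
    os.waitpid(pid, 0)
    still_listed = os.path.exists(f"/proc/{pid}")
    return state_before, still_listed
```

On Linux I would expect `zombie_demo()` to report the state `Z` before the `waitpid` call and no `/proc` entry after it.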
3/n
Let's kill them !
I can't: #Zombie processes cannot be killed using regular signals like `SIGKILL` since they are already dead.
It explains their name: The term 'zombie process' is metaphorical, comparing it to an 'undead' person that has not been 'reaped'.
To remove zombie processes, the parent has to read the child's exit status with `wait()`. Sending SIGCHLD to the parent only helps if it has a handler that does this. If the parent is unresponsive, it can be terminated: the zombies are then adopted by init, which reaps them.
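What a well-behaved parent does can be sketched like this: a SIGCHLD handler that reaps every child that has already exited. This is an illustration under my own assumptions (Unix only, handler installed from the main thread), and `reaped`, `reap_children` and `spawn_and_reap` are hypothetical names, not part of any tool mentioned in this thread.

```python
import os
import signal
import time

reaped = []  # (pid, exit code) pairs collected by the handler


def reap_children(signum, frame):
    """SIGCHLD handler: collect every child that has already exited."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left at all
        if pid == 0:
            return  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        reaped.append((pid, code))


def spawn_and_reap(n=3, timeout=5.0):
    """Fork n short-lived children and let the handler reap them."""
    reaped.clear()
    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        pids = []
        for i in range(n):
            pid = os.fork()
            if pid == 0:
                os._exit(i)  # child: exit with a distinct code
            pids.append(pid)
        # Wait until the handler has collected every child (or give up).
        deadline = time.monotonic() + timeout
        while len(reaped) < n and time.monotonic() < deadline:
            time.sleep(0.01)
        return pids
    finally:
        if previous is not None:
            signal.signal(signal.SIGCHLD, previous)
```

The loop inside the handler matters: several children can exit before the handler runs once, so a single `waitpid` call per signal could leave zombies behind.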
4/n
`ps -A -ostat,pid,ppid | grep -e '[zZ]' | tail -10`
I used tail to avoid listing all 30k processes, and ppid to show the parent process ID
Then I killed the parents, keeping one for investigation
`sudo kill -9 240816 236637`
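With 30k zombies it helps to know which parents own most of them before killing anything. A rough sketch of that grouping, assuming Linux and a readable `/proc`; the function name `zombie_parents` and its `proc_root` parameter are my own, not an existing tool.

```python
import os
from collections import Counter


def zombie_parents(proc_root="/proc"):
    """Count zombie processes per parent PID by reading <proc_root>/<pid>/stat."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                # Format: pid (comm) state ppid ...; comm may contain spaces,
                # so split on the last ')'.
                fields = f.read().rsplit(")", 1)[1].split()
            state, ppid = fields[0], int(fields[1])
        except (OSError, IndexError, ValueError):
            continue  # process vanished or the line was malformed
        if state == "Z":
            counts[ppid] += 1
    return counts
```

`zombie_parents().most_common(10)` then gives the ten parents with the most zombies, which is the same idea as the `ps | grep | tail` one-liner but grouped by ppid.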
5/n
Time for investigation.
I checked one of the parents and found [ssl_client] <defunct>
I checked a second parent and found [wget] <defunct>
This reminded me that the last change I made was to enable #https using #letsencrypt certificate for most services
#wget is used in the healthcheck section inside #dockerCompose, but that doesn't explain the zombies, or maybe it does?
6/n
I checked that I had set the option to skip certificate verification, as I'm using 127.0.0.1 instead of the FQDN and #letsencrypt doesn't provide a #certificate for IP addresses
I exec'd into the #docker container to manually run the `wget --no-check-certificate`
It's working correctly
When I remove the healthcheck section in #DockerCompose there are no more #zombie processes
Root cause found: It's the wget used by the healthcheck that creates the zombies!
7/n
To summarize: I have #zombie processes created by the #wget command when doing an https request in the #healthcheck section of #DockerCompose
Zombie processes occur when child processes have completed execution, and their exit status needs to be read by the parent process.
A process in a #container is still a process on the host, so it takes up a PID on the host. Whatever you run as the container's main process is PID 1 inside the container, which means it only receives a signal such as SIGTERM if it has installed a handler for it.
8/n
The first thing to understand is that an init process doesn't magically remove zombies. A (normal) init is designed to reap zombies that are left hanging around when the parent process that failed to wait on them exits. The init process then becomes the zombies' parent and they can be cleaned up.
9/n
Next, a #container is a #cgroup of processes running in their own PID namespace. This cgroup is cleaned up when the container is stopped. Any zombies that are in a container are removed on stop. They don't reach the host's init.
10/n
Third is the different ways containers are used. Most run one main process and nothing else. If there is another process spawned it is usually a child of that main process. So until the parent exits, the zombie will exist. Then see point 2 (the zombies will be cleared on #container exit).
11/n
The other role an #init process can play is to install signal handlers so signals sent from the host can be passed on to the container process. PID 1 is a bit special, as the process has to install a handler for a signal in order to receive it.
If you can install a SIGINT and SIGTERM signal handler in your PID 1 process then an init process doesn't add much here.
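As an illustration of that signal-forwarding role, here is a small sketch of a wrapper that starts one child and passes SIGTERM and SIGINT on to it. It is not how tini is implemented, just the idea; `run_with_forwarding` is a hypothetical name and the sketch assumes it is called from the main thread on a Unix system.

```python
import signal
import subprocess


def run_with_forwarding(cmd):
    """Run cmd as a child, forward SIGTERM/SIGINT to it, return its exit code."""
    child = subprocess.Popen(cmd)

    def forward(signum, frame):
        # Only forward while the child is still running.
        if child.poll() is None:
            child.send_signal(signum)

    forwarded = (signal.SIGTERM, signal.SIGINT)
    previous = {s: signal.signal(s, forward) for s in forwarded}
    try:
        # wait() also reaps the child, so it never lingers as a zombie.
        return child.wait()
    finally:
        for s, handler in previous.items():
            if handler is not None:
                signal.signal(s, handler)
```

A real init does more than this (it also reaps orphans it adopts), which is why using the one that ships with Docker is usually less work than writing your own.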
!!! Those explanations come from this superb article in #stackoverflow
https://stackoverflow.com/questions/49162358/docker-init-zombies-why-does-it-matter !!!
12/n
The solution to avoid #zombie processes in the #container is to use an #init process. #docker ships one, #tini, which is enabled with the `--init` flag
https://github.com/krallin/tini
13/n
The syntax in #DockerCompose, set on the service, is:
init: true
What is the advantage of #tini? https://github.com/krallin/tini/issues/8
14/n