Usually, the main Apache process reaps any of its workers that exit. On a heavily loaded server like the one you describe, however, the main process may not get enough time to do this reaping. Unreaped (“zombie”) processes show up in top with state Z.
Another possibility is that your worker processes are dying abnormally. This can happen if your web application (or its engine) has a bug that crashes the worker process. Look in your Apache error log for any serious error messages.
The rest of this answer explains what zombie processes are and how they get reaped.
When a process exits (normally or abnormally), it enters a state known as “zombie” (which appears as Z in top). Its process ID stays in the process table until its parent waits on (or “reaps”) it. Under normal circumstances, a parent process that expects its child processes to exit sets up a signal handler for SIGCHLD; when that signal is sent (upon a child process's exit), the parent reaps the child at its convenience.
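To make the SIGCHLD pattern concrete, here is a minimal Python sketch of a parent that reaps its children from a signal handler. It is an illustration only, not how Apache itself does it; the names reap_children, spawn_and_reap and reaped are made up for this example, and it assumes a Unix-like system and that it runs in the main thread.

```python
import os
import signal
import time

reaped = []  # (pid, exit code) pairs collected by the handler


def reap_children(signum, frame):
    """SIGCHLD handler: reap every child that has already exited."""
    # Several exits can be merged into one SIGCHLD, so keep calling
    # waitpid() until it reports that nothing more is ready.
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left at all
        if pid == 0:
            return  # children exist, but none has exited yet
        # WEXITSTATUS is meaningful here because the children below
        # always exit normally via os._exit().
        reaped.append((pid, os.WEXITSTATUS(status)))


def spawn_and_reap(n_children=3, timeout=5.0):
    """Fork n_children short-lived children and let the handler reap them."""
    signal.signal(signal.SIGCHLD, reap_children)
    pids = []
    for i in range(n_children):
        pid = os.fork()
        if pid == 0:
            os._exit(i)  # child: exit immediately with a distinct code
        pids.append(pid)
    # The parent carries on with its own work; here it simply waits
    # until the handler has collected every child (or the timeout hits).
    deadline = time.monotonic() + timeout
    while len(reaped) < n_children and time.monotonic() < deadline:
        time.sleep(0.05)
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    return pids
```

Calling spawn_and_reap() returns the child PIDs, and reaped then holds one entry per child. The non-blocking os.WNOHANG flag is what lets the parent reap “at its convenience” rather than blocking inside the handler.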
If the parent process has hung for some reason (it is suspended, too busy, or deadlocked), child processes that exit will not be reaped until the parent resumes. This can cause serious problems if there are many child processes: each one occupies a slot in the process table that will not be freed.
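The effect of a parent that does not reap can be observed directly. The following sketch is Linux-only because it reads /proc/<pid>/stat; the helper names process_state and demo are hypothetical. It forks a child that exits at once, deliberately skips the wait, and checks the state the kernel reports before and after reaping.

```python
import os
import time


def process_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may contain spaces,
    # so split after the last ')' to find the state field.
    return data.rsplit(")", 1)[1].split()[0]


def demo():
    """Create a zombie, observe it, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    # The parent does NOT wait yet, so the child lingers as a zombie.
    for _ in range(100):
        if process_state(pid) == "Z":
            break
        time.sleep(0.01)
    state_before = process_state(pid)
    os.waitpid(pid, 0)  # reaping frees the process-table slot
    still_listed = os.path.exists(f"/proc/{pid}")
    return state_before, still_listed
```

Until os.waitpid() runs, the child's entry stays in the process table with state Z; once it has been reaped, the entry is gone.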
In that case, one solution (if the parent process is unrecoverable, say) is to kill the parent process. The child processes will then be reparented to the init process (process ID 1), which will reap them. (If the init process is stalled, then you have much, much bigger problems than child processes not being reaped. In fact, a crashed init process will usually cause a kernel panic.)
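Reparenting can be sketched in Python as well. In this hypothetical adopt_orphan helper, a stand-in “stuck” parent is killed with SIGKILL and its child reports which process adopted it. One caveat beyond what is described above: on Linux systems that use a child subreaper (a systemd user session, for instance) or inside a container, the adopting process may be something other than PID 1, so the sketch only checks that the parent changed.

```python
import os
import signal
import time


def adopt_orphan(timeout=5.0):
    """Kill a parent; return (killed_parent_pid, new_parent_pid) seen by its child."""
    read_fd, write_fd = os.pipe()
    parent = os.fork()
    if parent == 0:
        os.close(read_fd)
        parent_pid = os.getpid()
        if os.fork() == 0:
            # Child of the doomed parent: announce readiness, then wait
            # until the kernel hands it to a new parent.
            os.write(write_fd, b"R")
            deadline = time.monotonic() + timeout
            while os.getppid() == parent_pid and time.monotonic() < deadline:
                time.sleep(0.01)
            os.write(write_fd, str(os.getppid()).encode())
            os._exit(0)
        while True:
            signal.pause()  # stand-in for a hung, unrecoverable parent
    os.close(write_fd)
    os.read(read_fd, 1)              # wait for the child's "ready" byte
    os.kill(parent, signal.SIGKILL)  # kill the stuck parent
    os.waitpid(parent, 0)            # this process is its parent, so it reaps it
    with os.fdopen(read_fd, "rb") as pipe:
        new_parent = int(pipe.read().decode())
    return parent, new_parent
```

After the orphaned child exits, whichever process adopted it is responsible for reaping it, which is exactly the job init does in the scenario above.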