ProcessReaper: single thread reaper

Martin Buchholz martinrb at google.com
Thu Apr 17 07:07:19 UTC 2014


Many candidate solutions fail in the end because whatever we do must not
take ownership of any global resource.  Calling waitid on all child
processes, even with WNOWAIT and WNOHANG, changes global state (what if
another subprocess library in the same process is trying to do the same
thing?).
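
For reference, here is a minimal sketch of what a non-reaping poll of a single
known child looks like, written in Python because its os module wraps the
POSIX calls directly.  The function name peek_then_reap and the exit code 7
are made up for illustration.  It only shows that WNOWAIT leaves the child
waitable; it does not settle whether such polling is invisible to other
libraries in the same process.

```python
import os
import time


def peek_then_reap():
    """Fork a child that exits with code 7, peek at it, then reap it.

    Hypothetical helper for illustration only.
    """
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately with a recognizable code

    # WNOHANG: do not block.  WNOWAIT: leave the child in a waitable state.
    flags = os.WEXITED | os.WNOHANG | os.WNOWAIT
    peeked = None
    while peeked is None:
        # os.waitid returns None when WNOHANG is set and nothing is ready yet.
        peeked = os.waitid(os.P_PID, pid, flags)
        if peeked is None:
            time.sleep(0.01)

    # The child has not been reaped, so its PID is still reserved and a
    # null signal to it succeeds.
    os.kill(pid, 0)

    # A normal wait still finds the child and finally releases the PID.
    _, status = os.waitpid(pid, 0)
    return peeked.si_status, os.WEXITSTATUS(status)
```

Both values returned by peek_then_reap() come from the same child: the first
from the peek, the second from the real reap.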


On Wed, Apr 16, 2014 at 3:34 PM, David M. Lloyd <david.lloyd at redhat.com> wrote:

> On 04/16/2014 02:15 PM, Martin Buchholz wrote:
>
>> On Mon, Apr 14, 2014 at 1:57 PM, Peter Levart <peter.levart at gmail.com
>> <mailto:peter.levart at gmail.com>> wrote:
>>
>>
>>     There's already such a race in current implementation of
>>     Process.terminate(). It admittedly only concerns a small window
>>     between process exiting and the reaper thread managing to signal
>>     this state to the other threads wishing to terminate it at the same
>>     time, so it could happen that a KILL/TERM signal is sent to an
>>     already deceased PID which was re-used, but it doesn't happen in
>>     practice since PIDs are not re-used very soon typically.
>>
>>     But I agree, waiting between listing children and sending them
>>     signals increases the chance of hitting a reused PID.
>>
>>
>> We do rely on the OS not reusing a PID _immediately_.  We used to have
>> bugs in this area where Process.destroy would send a signal to a pid
>> that may have deceased arbitrarily long ago.
>>
>
> It seems to me that the key to avoiding this is to ensure that waitpid()
> is not called until we know the PID is ready to be cleaned.  As long as
> waitpid() has not yet been called, we can be certain that the process still
> exists and is ours.  So the real question is, how can we know a process is
> dead without actually calling wait() (thereby making that knowledge
> useless)?
>
> The aforementioned /proc trick seems like one good way to do so without,
> say, spawning a plethora of threads (though at one additional FD per
> thread, it is not free either).  Unfortunately /proc is not ubiquitous, and
> even where it does exist, it's not standardized (thus its behavior probably
> cannot be relied upon absolutely).
>
> A simple solution may be to use a synchronized set of child PIDs, and set
> a SIGCHLD handler or waiter which, when triggered, locks the set and
> performs a series of waitid() operations with WNOHANG, processing all the
> process status updates.  The signalling APIs would be required to
> synchronize on the set to determine if the process in question is owned by
> the parent process.  Previously unknown processes can be "adopted" into
> this area by acquiring the synchronization and calling waitpid() with WNOHANG
> on the PID in question, and using the result to determine whether the PID
> should be added to the set (or whether we just reaped it - or whether it
> doesn't belong to us at all).
>
> As long as the process API is restricted to managing direct children, this
> should work and be safe across all POSIX-ish environments.  Note the
> potential downside that all children will be automatically reaped, which is
> possibly somewhat hostile to naïve JNI libraries or embedders. Selectively
> enabling the /proc trick can mitigate this downside on platforms which
> support it, however.
>
> --
> - DML
>
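
The synchronized-set idea quoted above can be sketched roughly as follows.
This is an illustration under stated assumptions, not the JDK implementation:
the class ChildSet and its method names are hypothetical, the children are
trivial forked sleepers, and reap() is called by polling rather than from a
real SIGCHLD handler or waiter.  The point it tries to show is that signalling
and reaping take the same lock, so a signal is only ever sent to a PID that
has not yet been waited for and therefore cannot have been reused.

```python
import os
import signal
import threading
import time


class ChildSet:
    """Hypothetical registry of direct children, guarded by one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._live = set()     # PIDs of children we have not reaped yet
        self.exit_status = {}  # pid -> raw wait status, filled in on reap

    def spawn_sleeper(self):
        """Fork a child that just sleeps, and register it under the lock."""
        with self._lock:
            pid = os.fork()
            if pid == 0:
                try:
                    time.sleep(60)
                finally:
                    os._exit(0)
            self._live.add(pid)
            return pid

    def signal_child(self, pid, sig=signal.SIGTERM):
        """Send sig only if pid is still an unreaped child of ours."""
        with self._lock:
            if pid not in self._live:
                return False
            # Not reaped yet, so the PID cannot have been reused.
            os.kill(pid, sig)
            return True

    def reap(self):
        """One non-blocking pass over the set, as a SIGCHLD waiter might do."""
        with self._lock:
            for pid in list(self._live):
                try:
                    done, status = os.waitpid(pid, os.WNOHANG)
                except ChildProcessError:
                    # Someone else already reaped it; forget the PID.
                    self._live.discard(pid)
                    continue
                if done == pid:
                    self._live.discard(pid)
                    self.exit_status[pid] = status

    def adopt(self, pid):
        """Try to take over a PID that was not spawned through this set."""
        with self._lock:
            try:
                done, status = os.waitpid(pid, os.WNOHANG)
            except ChildProcessError:
                return "not-ours"      # not a child of this process
            if done == 0:
                self._live.add(pid)    # still running: track it from now on
                return "adopted"
            self.exit_status[pid] = status
            return "reaped"            # the probe itself reaped it
```

A caller would typically spawn through the set, call signal_child() to
terminate, and let the reaper pass record the status.  As the quoted text
notes, this reaps every tracked child, which is the part that may be hostile
to other code in the same process.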



More information about the core-libs-dev mailing list