ProcessReaper: single thread reaper
David M. Lloyd
david.lloyd at redhat.com
Thu Apr 17 15:15:17 UTC 2014
On 04/17/2014 09:43 AM, Peter Levart wrote:
> On 04/17/2014 09:07 AM, Martin Buchholz wrote:
>> Many possible solutions eventually fail because whatever we do cannot
>> take ownership of any global resource. Calling waitid on all child
>> processes, even with NOWAIT and NOHANG changes global state (what if
>> another subprocess library in the same process is trying to do the
>> same thing?)
>
> waitid(P_ALL, ..., NOWAIT | NOHANG) does not reap the child. It can be
> repeated multiple times. It can be used as a precursor to real
> waitid/waitpid which reaps a child, but only if it is "ours". The
> problem with this approach is what to do in the following scenario: the
> precursor waitid(P_ALL, ..., NOWAIT | NOHANG) returns a child that is
> not "ours" so we don't reap it. The "owner" of that child (JNI-library)
> does not do prompt reaping of their children. We loop, repeatedly
> getting the same child as a result, not seeing any other children that
> have exited in the meanwhile...
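
To make the behaviour described above concrete, here is a minimal sketch, assuming a Linux-like system and a process that has no other children. Python's os module is used only as a thin wrapper around waitid/waitpid; WNOWAIT and WNOHANG are the real names of the flags written NOWAIT and NOHANG above, and the exit code 5 is arbitrary. It shows that a peek never consumes the child, so repeated peeks keep returning the same one until somebody really reaps it.

```python
import os

pid = os.fork()
if pid == 0:
    os._exit(5)                       # child: exit at once and become a zombie

# Block until the child has exited, but leave it unreaped (WNOWAIT).
os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

# Peek at "any exited child" three times; nothing is consumed.
peeks = [os.waitid(os.P_ALL, 0, os.WEXITED | os.WNOWAIT | os.WNOHANG)
         for _ in range(3)]
peeked_pids = [p.si_pid for p in peeks]

# The real reap still succeeds afterwards, so the peeks did not reap.
reaped_pid, status = os.waitpid(pid, 0)
```

Which child a P_ALL peek returns when several have exited is not specified by POSIX, so the "same child every time" loop is something to expect, not something to rely on.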
Maybe it would be a good idea to create a process group for JDK-managed
subprocesses? Otherwise, it seems that the only other choice is to take
over all child process management.
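
A rough sketch of the process group idea, written against Python's os wrappers and not taken from any JDK code: managed children are placed in their own group with setpgid, and waitid(P_PGID, ...) then reaps only members of that group, so a child forked by some other library is left for its owner. The helper names spawn and reap_group are made up for the example, and it assumes the children never change their own process group afterwards.

```python
import os

def spawn(exit_code, pgid=None):
    """Fork a child that exits with exit_code. If pgid is given, move the
    child into that process group first (pgid=0 starts a new group)."""
    pid = os.fork()
    if pid == 0:                          # child
        try:
            if pgid is not None:
                os.setpgid(0, pgid)
        finally:
            os._exit(exit_code)
    if pgid is not None:
        try:
            os.setpgid(pid, pgid)         # parent repeats the call to close the race
        except OSError:
            pass                          # child already did it itself
    return pid

def reap_group(pgid):
    """Reap every child in process group pgid; return {pid: exit_status}."""
    reaped = {}
    while True:
        try:
            info = os.waitid(os.P_PGID, pgid, os.WEXITED)
        except ChildProcessError:         # no children left in this group
            return reaped
        reaped[info.si_pid] = info.si_status

leader = spawn(1, pgid=0)                 # first managed child leads a new group
managed = {leader: 1, spawn(2, pgid=leader): 2}
foreign = spawn(7)                        # stands in for a JNI library's child

reaped = reap_group(leader)               # touches only the managed group
foreign_pid, foreign_status = os.waitpid(foreign, 0)   # still there for its owner
```

The obvious cost is that signals sent to the parent's process group (for example from a terminal) no longer reach the managed children.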
>
> Regards, Peter
>
>>
>>
>> On Wed, Apr 16, 2014 at 3:34 PM, David M. Lloyd
>> <david.lloyd at redhat.com> wrote:
>>
>> On 04/16/2014 02:15 PM, Martin Buchholz wrote:
>>
>> On Mon, Apr 14, 2014 at 1:57 PM, Peter Levart
>> <peter.levart at gmail.com> wrote:
>>
>>
>> There's already such a race in current implementation of
>> Process.terminate(). It admittedly only concerns a small window
>> between process exiting and the reaper thread managing to signal
>> this state to the other threads wishing to terminate it at the same
>> time, so it could happen that a KILL/TERM signal is sent to an
>> already deceased PID which was re-used, but it doesn't happen in
>> practice since PIDs are not re-used very soon typically.
>>
>> But I agree, waiting between listing children and sending them
>> signals increases the chance of hitting a reused PID.
>>
>>
>> We do rely on the OS not reusing a PID _immediately_. We used to have
>> bugs in this area where Process.destroy would send a signal to a pid
>> that may have deceased arbitrarily long ago.
>>
>>
>> It seems to me that the key to avoiding this is to ensure that
>> waitpid() is not called until we know the PID is ready to be
>> cleaned. As long as waitpid() has not yet been called, we can be
>> certain that the process still exists and is ours. So the real
>> question is, how can we know a process is dead without actually
>> calling wait() (thereby making that knowledge useless)?
>>
>> The aforementioned /proc trick seems like one good way to do so
>> without, say, spawning a plethora of threads (though at one
>> additional FD per thread, it is not free either). Unfortunately
>> /proc is not ubiquitous, and even where it does exist, it's not
>> standardized (thus its behavior probably cannot be relied upon
>> absolutely).
>>
>> A simple solution may be to use a synchronized set of child PIDs,
>> and set a SIGCHLD handler or waiter which, when triggered, locks
>> the set and performs a series of waitid() operations with WNOHANG,
>> processing all the process status updates. The signalling APIs
>> would be required to synchronize on the set to determine if the
>> process in question is owned by the parent process. Previously
>> unknown processes can be "adopted" into this area by acquiring the
>> synchronization and calling "waitpid()"+WNOHANG on the PID in
>> question, and using the result to determine whether the PID should
>> be added to the set (or whether we just reaped it - or whether it
>> doesn't belong to us at all).
>>
>> As long as the process API is restricted to managing direct
>> children, this should work and be safe across all POSIX-ish
>> environments. Note the potential downside that all children will
>> be automatically reaped, which is possibly somewhat hostile to
>> naïve JNI libraries or embedders. Selectively enabling the /proc
>> trick can mitigate this downside on platforms which support it
>> however.
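
The scheme quoted above could look roughly like the following. This is only a sketch under stated assumptions: ChildRegistry and its methods are hypothetical names, Python's os module stands in for the raw POSIX calls, and reap_all() is called by hand where a real implementation would call it from a SIGCHLD handler or a waiter thread. The point is that reaping and signalling take the same lock, so a signal is only ever sent to a PID that has not been reaped yet and therefore cannot have been reused.

```python
import os
import signal
import threading
import time

class ChildRegistry:
    """Hypothetical holder for the set of child PIDs we own."""

    def __init__(self):
        self._lock = threading.Lock()
        self._live = set()            # PIDs forked by us and not yet reaped
        self.exit_info = {}           # pid -> (si_code, si_status) once reaped

    def spawn(self, exit_code, delay=0.0):
        with self._lock:              # hold the lock so a reap cannot miss the PID
            pid = os.fork()
            if pid == 0:              # child: sleep, then exit
                try:
                    time.sleep(delay)
                finally:
                    os._exit(exit_code)
            self._live.add(pid)
            return pid

    def reap_all(self):
        """Collect every exited child without blocking. Note that P_ALL
        also reaps children this registry does not track."""
        with self._lock:
            while True:
                try:
                    info = os.waitid(os.P_ALL, 0, os.WEXITED | os.WNOHANG)
                except ChildProcessError:
                    return            # no children at all
                if info is None:
                    return            # children exist, none has exited yet
                self._live.discard(info.si_pid)
                self.exit_info[info.si_pid] = (info.si_code, info.si_status)

    def kill(self, pid, sig=signal.SIGKILL):
        """Signal pid only while it is still in the set (not yet reaped)."""
        with self._lock:
            if pid not in self._live:
                return False
            os.kill(pid, sig)         # unreaped, so the PID cannot have been reused
            return True

reg = ChildRegistry()
quick = reg.spawn(exit_code=3)
slow = reg.spawn(exit_code=0, delay=60)
os.waitid(os.P_PID, quick, os.WEXITED | os.WNOWAIT)   # wait for exit, do not reap
reg.reap_all()
sent_quick = reg.kill(quick)          # refused: already reaped
sent_slow = reg.kill(slow)            # delivered: still ours
os.waitid(os.P_PID, slow, os.WEXITED | os.WNOWAIT)
reg.reap_all()
```

The "adoption" of previously unknown PIDs described above is left out of the sketch.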
>>
>> --
>> - DML
>>
>>
>
--
- DML