ProcessReaper: single thread reaper

Peter Levart peter.levart at gmail.com
Mon Apr 14 15:54:47 UTC 2014


On 04/14/2014 03:50 PM, roger riggs wrote:
> Hi Peter,
>
> The new API to handle process trees and processes not spawned by the
> Java process also will need a way to wait for exit status and destroy
> children, so I'm not sure the issue goes away.  It too will need to
> co-exist with non-JDK libraries that spawn and handle their own
> children.

Hi Roger,

At some point one has to decide whose responsibility it is to wait on a
child process. If a process is spawned by some native library, one can
expect that it is this library's responsibility to reap that child.
Otherwise you're providing just one half of the story and inviting
conflicts. Spawning a child and waiting on its exit (or exiting
yourself and leaving the orphan to be handled by init) are two
intrinsically interconnected actions on UNIX. It's also not possible to
wait on a grand-child.
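
To illustrate (a minimal sketch using today's Process API; the class
name and the /bin/sh command line are just examples for this message):
the JVM can wait only for the exit status of its direct child, while
anything the child spawns is reaped by the child itself or, if
orphaned, by init:

import java.io.IOException;

public class DirectChildOnly {
    public static void main(String[] args)
            throws IOException, InterruptedException {
        // The shell is our direct child; the backgrounded sleep is a
        // grand-child that the JVM can never wait on directly.
        Process child = new ProcessBuilder("/bin/sh", "-c", "sleep 1 &")
                .inheritIO()
                .start();
        int status = child.waitFor(); // exit status of the shell only
        System.out.println("direct child exited with status " + status);
    }
}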

Only immediate children are the clean-up responsibility of a parent
(regardless of which API was used to spawn them). In that light I
question the need to gracefully destroy anyone besides a direct child.
It might be convenient to be able to forcibly destroy the whole sub-tree
rooted at a particular direct child, but graceful destruction should be
initiated by sending a TERM signal just to the immediate child. It's
this child's responsibility to do any clean-up needed, including
stopping its children. So I think we need the following capabilities in
the new API (a rough sketch of such an interface follows the list):

- enumerate direct children (regardless of which API was used to spawn them)
- trigger graceful destruction of any direct child
- non-blocking query for liveness of any direct child
- trigger forcible termination of any direct child and all descendants 
in one call
- (optionally: obtain a Process object of any live direct child that was 
spawned by Process API)
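
As a sketch only - the type and method names (ChildHandle,
DirectChildren, ...) are invented for this example and are not a
proposal of concrete names - such a constrained API might look roughly
like this:

import java.util.List;
import java.util.Optional;

interface ChildHandle {
    int pid();
    boolean isAlive();              // non-blocking liveness query
    void destroyGracefully();       // send TERM to the direct child only
    void destroyTreeForcibly();     // forcibly kill the child and all descendants
    Optional<Process> asProcess();  // present only if spawned via the Process API
}

interface DirectChildren {
    List<ChildHandle> children();   // direct children, however they were spawned
}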

That's my view on an API that provides access to children/ancestors of
the JVM and has built-in constraints that prevent users from straying
from recommended practices.

The other possibility would be an API that provides access to any
process on the system. Such an API could enumerate all processes on the
system and gracefully/forcibly terminate any process that the OS allows
(an API equivalent of the UNIX commands "ps" and "kill"). That would be
a low-level API.
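
Purely to illustrate the shape such a low-level API might take (again,
all names here are invented for the example):

import java.util.stream.IntStream;

interface SystemProcesses {
    IntStream allPids();                // enumerate every visible process ("ps")
    boolean terminate(int pid);         // graceful termination, if the OS permits ("kill <pid>")
    boolean terminateForcibly(int pid); // forced termination ("kill -9 <pid>")
}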

>
> A selectable implementation may be a way to accommodate the needed
> backward compatibility.

So what do you think of the following internal reaper API (which could
be used for the implementation of the existing Process API and to
support a new API for obtaining Process objects for children):


interface ProcessReaper {

     // return a suitable implementation of ProcessReaper
     // (selectable by system property)
     static ProcessReaper getInstance() { .... }

     // register "our" process that we have just spawned and are
     // responsible to wait for; returns a CompletableFuture that will be
     // completed with the exit status when the process terminates
     CompletableFuture<Integer> processStarted(Process process);

     // return the Process object for "our" live process, or null if the
     // given pid does not represent a live process that we have spawned
     Process getProcess(int pid);
}


This assumes that the Process class will get a new public (or at least
package-private) method: int getProcessId();
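
Purely to make the intended usage concrete, here is a hypothetical
sketch (not actual ProcessImpl code; the class name is invented) of how
a Process implementation could build waitFor() and isAlive() on top of
such a reaper:

import java.util.concurrent.CompletableFuture;

class ProcessImplSketch {

    private final CompletableFuture<Integer> exitStatus;

    ProcessImplSketch(Process self) {
        // register the freshly spawned child with the shared reaper;
        // the returned future completes with the child's exit status
        this.exitStatus = ProcessReaper.getInstance().processStarted(self);
    }

    int waitFor() throws Exception {
        return exitStatus.get();        // blocking wait built on the future
    }

    boolean isAlive() {
        return !exitStatus.isDone();    // non-blocking liveness check
    }
}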

Regards, Peter

>
> Roger
>
> On 4/14/2014 5:02 AM, Peter Levart wrote:
>> Hi Martin, Roger,
>>
>> Just a thought. Would it be feasible to have two (or more) built-in
>> strategies, selectable by system property? A backwards compatible
>> thread per child, using waitpid(pid, ...), a single reaper thread
>> using waitpid(-1, ...), maybe also a single-threaded strategy
>> accessible only on Linux/Solaris using waitid(-1, ..., WNOWAIT)...
>> All packed nicely in a package-private interface (ProcessReaper) with
>> multiple implementations?
>>
>> Regards, Peter
>>
>> On 04/12/2014 01:37 AM, Martin Buchholz wrote:
>>> Let's step back again and try to check our goals...
>>>
>>> We could try to optimize the one-reaper-thread-per-subprocess thing. 
>>>  But that is risky, and the cost of what we're doing today is not 
>>> that high.
>>>
>>> We could try to implement the feature of killing off an entire 
>>> subprocess tree.  But historically, any kind of behavior change like 
>>> that has been vetoed.  I have tried and failed to make less 
>>> incompatible changes.  We would have to add a new API.
>>>
>>> The reality is that Java does not give you real access to the 
>>> underlying OS, and unless there's a seriously heterodox attempt to 
>>> provide OS-specific extensions, people will have to continue to 
>>> either write native code or delegate to an OS-savvy subprocess like 
>>> a perl script.
>>>
>>>
>>> On Fri, Apr 11, 2014 at 7:52 AM, Peter Levart
>>> <peter.levart at gmail.com> wrote:
>>>
>>>     On 04/09/2014 07:02 PM, Martin Buchholz wrote:
>>>>
>>>>
>>>>
>>>>     On Tue, Apr 8, 2014 at 11:08 PM, Peter Levart
>>>>     <peter.levart at gmail.com> wrote:
>>>>
>>>>         Hi Martin,
>>>>
>>>>         As you might have seen in my later reply to Roger, there's
>>>>         still hope on that front: setpgid() + wait(-pgid, ...)
>>>>         might be the answer. I'm exploring in that direction.
>>>>         Shells are doing it, so why can't the JDK?
>>>>
>>>>         It's a little trickier for the Process API, since I imagine
>>>>         that shells form a group of processes from a pipeline which
>>>>         is known in advance, while the Process API will have to add
>>>>         processes to the live group dynamically. So some races will
>>>>         have to be resolved, but I think it's doable.
>>>>
>>>>
>>>>     This is a clever idea, and it's arguably better to design
>>>>     subprocesses so they live in separate process groups (emacs
>>>>     does that), but:
>>>>     Every time you create a process group, you change the effect of
>>>>     a user signal like Ctrl-C, since it's sent to only one group.
>>>>     Maybe propagate signals to the subprocess group?  It's starting
>>>>     to get complicated...
>>>>
>>>
>>>     Hi Martin,
>>>
>>>     Yes, shells send Ctrl-C (SIGINT) and other signals initiated by
>>>     the terminal to a (foreground) process group. A process group is
>>>     formed from a pipeline of interconnected processes. Each
>>>     pipeline is considered to be a separate "job", hence shells call
>>>     this feature "job-control". Child processes by default inherit
>>>     the process group from their parent, so children born with the
>>>     Process API (and their children) inherit the process group from
>>>     the JVM process. Considering the intentions of shell
>>>     job-control, is propagating SIGTERM/SIGINT/SIGTSTP/SIGCONT
>>>     signals to children spawned by the Process API desirable? If so,
>>>     then yes, handling those signals in the JVM and propagating them
>>>     to the current process group that contains all children spawned
>>>     by the Process API and their descendants would have to be
>>>     performed by the JVM. That problem would certainly have to be
>>>     addressed. But let's first see what I found out about
>>>     sigaction(SIGCHLD, ...), setpgid(pid, pgid), waitpid(-pgid, ...),
>>>     etc...
>>>
>>>     waitpid(-pgid, ...) alone seems not to be enough for our task,
>>>     mainly because a process can re-assign its group and join some
>>>     other group. I don't know if this is a situation that occurs in
>>>     the real world, but imagine we have one live child process in a
>>>     process group pgid1 and no unwaited exited children. If we issue:
>>>
>>>         waitpid(-pgid1, &status, 0);
>>>
>>>     then this call blocks, because at the time it was issued there
>>>     were >0 child processes in the pgid1 group and none of them had
>>>     exited yet. Now if this one child process changes its process
>>>     group with:
>>>
>>>         setpgid(0, pgid2);
>>>
>>>     then the waitpid call in the parent does not return (maybe this
>>>     is a bug in Linux?) although there are no live child processes
>>>     in the pgid1 group any more. Even when this child exits, the
>>>     call to waitpid does not return, since this child is not in the
>>>     group we are waiting for when it exits. If all our children
>>>     "escape" the group in such a way, the thread doing the waiting
>>>     will never unblock. To solve this, we can employ signal
>>>     handlers. In a signal handler for the SIGCHLD signal we can
>>>     invoke:
>>>
>>>         waitpid(-pgid1, &status, WNOHANG); // non-blocking call
>>>
>>>     ...in a loop until it either returns 0, which means that there
>>>     are no more unwaited exited children in the group at the moment,
>>>     or -1 with errno == ECHILD, which means that there are no
>>>     children in the queried group any more - the group no longer
>>>     exists. Since the signal handler is invoked with SIGCHLD masked
>>>     and there is one bit of pending signal state in the kernel, no
>>>     child exit can be "skipped" this way - unless the child
>>>     "escapes" by changing its group. I don't know of a plausible
>>>     reason for a program to change its process group. If a program
>>>     executing as a JVM child wants to become a background daemon, it
>>>     usually behaves as follows:
>>>
>>>     - it fork()s a grand-child and then exit()s (so we get notified
>>>     via a signal and can successfully waitpid(-pgid, ...) for its
>>>     exit status)
>>>     - the grand-child then changes its session and group (becomes
>>>     session and group leader), closes file descriptors, etc. The
>>>     responsibility for waiting on the grand-child daemon is
>>>     transferred to the init process (pid=1) since the grand-child
>>>     becomes an orphan (has no parent).
>>>
>>>     Ignoring this still unsolved problem of a possibly ill-behaved
>>>     child program that changes its process group, I started
>>>     constructing a proof-of-concept prototype. What I will do in the
>>>     prototype is start throwing IllegalStateException from the
>>>     methods of the Process API that pertain to such children. I
>>>     think this is reasonable.
>>>
>>>     Stay tuned,
>>>
>>>     Peter
>>>
>>>
>>>
>>
>



