ProcessReaper: single thread reaper
Peter Levart
peter.levart at gmail.com
Mon Apr 14 09:02:28 UTC 2014
Hi Martin, Roger,
Just a thought. Would it be feasible to have two (or more) built-in
strategies, selectable by system property? A backwards-compatible thread
per child, using waitpid(pid, ...), a single reaper thread using
waitpid(-1, ...), maybe also a single-threaded strategy accessible only on
Linux/Solaris using waitid(-1, ..., WNOWAIT)... All packed nicely in a
package-private interface (ProcessReaper) with multiple implementations?
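
To make the three candidates a bit more concrete, here is a minimal C
sketch of the wait call each strategy would sit on. It is only an
illustration, not code from the JDK: the function names are made up, and
since the real waitid() takes an idtype_t/id_t pair, "waitid(-1, ...)"
above is shorthand for waitid(P_ALL, 0, ...).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Thread-per-child strategy: block until one specific child exits. */
pid_t reap_one(pid_t pid, int *status) {
    return waitpid(pid, status, 0);
}

/* Single-reaper strategy: block until any child exits, and reap it. */
pid_t reap_any(int *status) {
    return waitpid(-1, status, 0);
}

/*
 * WNOWAIT strategy: block until any child has exited, but leave it in a
 * waitable state. Returns the pid of that child, or -1 on error; the
 * exit status can still be collected later with reap_one(pid, ...).
 */
pid_t peek_exited(void) {
    siginfo_t info;
    info.si_pid = 0;
    if (waitid(P_ALL, 0, &info, WEXITED | WNOWAIT) != 0)
        return -1;
    return info.si_pid;
}
```

The point of the third variant, as I understand it, is that a single
thread can learn which child exited without consuming its status.
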
Regards, Peter
On 04/12/2014 01:37 AM, Martin Buchholz wrote:
> Let's step back again and try to check our goals...
>
> We could try to optimize the one-reaper-thread-per-subprocess thing.
> But that is risky, and the cost of what we're doing today is not that
> high.
>
> We could try to implement the feature of killing off an entire
> subprocess tree. But historically, any kind of behavior change like
> that has been vetoed. I have tried and failed to make less
> incompatible changes. We would have to add a new API.
>
> The reality is that Java does not give you real access to the
> underlying OS, and unless there's a seriously heterodox attempt to
> provide OS-specific extensions, people will have to continue to either
> write native code or delegate to an OS-savvy subprocess like a perl
> script.
>
>
> On Fri, Apr 11, 2014 at 7:52 AM, Peter Levart <peter.levart at gmail.com> wrote:
>
> On 04/09/2014 07:02 PM, Martin Buchholz wrote:
>>
>>
>>
>> On Tue, Apr 8, 2014 at 11:08 PM, Peter Levart <peter.levart at gmail.com> wrote:
>>
>> Hi Martin,
>>
>> As you might have seen in my later reply to Roger, there's
>> still hope on that front: setpgid() + waitpid(-pgid, ...) might
>> be the answer. I'm exploring in that direction. Shells are
>> doing it, so why can't the JDK?
>>
>> It's a little trickier for the Process API, since I imagine that
>> shells form a group of processes from a pipeline which is
>> known in advance, while the Process API will have to add processes
>> to the live group dynamically. So some races will have to be
>> resolved, but I think it's doable.
>>
>>
>> This is a clever idea, and it's arguably better to design
>> subprocesses so they live in separate process groups (emacs does
>> that), but:
>> Every time you create a process group, you change the effect of a
>> user signal like Ctrl-C, since it's sent to only one group.
>> Maybe propagate signals to the subprocess group? It's starting
>> to get complicated...
>>
>
> Hi Martin,
>
> Yes, shells send Ctrl-C (SIGINT) and other signals initiated by
> the terminal to a (foreground) process group. A process group is
> formed from a pipeline of interconnected processes. Each pipeline
> is considered to be a separate "job", hence shells call this
> feature "job control". Child processes by default inherit the process
> group from their parent, so children born with the Process API (and
> their children) inherit the process group from the JVM process.
> Considering the intentions of shell job control, is propagating
> SIGTERM/SIGINT/SIGTSTP/SIGCONT signals to children spawned by
> the Process API desirable? If so, then yes, handling those signals in
> the JVM and propagating them to the current process group that contains
> all children spawned by the Process API and their descendants would
> have to be performed by the JVM. That problem would certainly have to
> be addressed. But let's first see what I found out about
> sigaction(SIGCHLD, ...), setpgid(pid, pgid), waitpid(-pgid, ...),
> etc...
>
> waitpid(-pgid, ...) alone seems not to be enough for our task,
> mainly because a process can reassign its group and join some
> other group. I don't know if this is a situation that occurs in
> the real world, but imagine that we have one live child process in a
> process group pgid1 and no unwaited exited children. If we issue:
>
> waitpid(-pgid1, &status, 0);
>
> Then this call blocks, because at the time it was issued, there
> was at least one child process in the pgid1 group and none of them had
> exited yet. Now if this one child process changes its process
> group with:
>
> setpgid(0, pgid2);
>
> Then the waitpid call in the parent does not return (maybe this is
> a bug in Linux?) although there are no live child processes left
> in the pgid1 group. Even when this child exits, the call
> to waitpid does not return, since this child is not in the group
> we are waiting for when it exits. If all our children "escape" the
> group in such a way, the thread doing the waiting will never unblock. To
> solve this, we can employ signal handlers. In a signal handler for
> the SIGCHLD signal we can invoke:
>
> waitpid(-pgid1, &status, WNOHANG); // non-blocking call
>
> ...in a loop until it returns either 0, which means that there are
> no more unwaited exited children in the group at the moment, or -1
> with errno == ECHILD, which means that there are no children left
> in the queried group, i.e. the group no longer exists.
> Since the signal handler is invoked with SIGCHLD masked and
> there is one bit of pending signal state in the kernel, no child
> exit can be "skipped" this way, unless the child "escapes" by
> changing its group. I don't know of a plausible reason for a
> program to change its process group. If a program executing as
> a JVM child wants to become a background daemon, it usually behaves
> as follows:
>
> - it fork()s a grand-child and then exit()s (so we get notified via
> a signal and waitpid(-pgid, ...) successfully returns its exit status)
> - the grand-child then changes its session and group (becomes
> session and group leader), closes file descriptors, etc. The
> responsibility for waiting on the grand-child daemon is
> transferred to the init process (pid=1) since the grand-child
> becomes an orphan (has no parent).
>
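
The SIGCHLD-driven loop described earlier in this message could look
roughly like this in C. This is a sketch, not the prototype itself:
install_reaper(), on_sigchld() and the three variables are hypothetical
names, and a real handler would have to hand each pid/status pair over to
the Process machinery instead of just counting.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Process group holding the children; set once before the handler is installed. */
static volatile pid_t reaper_pgid;
static volatile sig_atomic_t reaped_count; /* children reaped so far */
static volatile sig_atomic_t group_empty;  /* set when waitpid reports ECHILD */

/* SIGCHLD handler: drain every exited child of the group without blocking. */
void on_sigchld(int signo) {
    int saved_errno = errno;
    int status;
    (void)signo;
    for (;;) {
        pid_t pid = waitpid(-reaper_pgid, &status, WNOHANG);
        if (pid > 0) {           /* reaped one child, look for more */
            reaped_count++;
            continue;
        }
        if (pid == 0)            /* children remain, none has exited yet */
            break;
        if (errno == ECHILD)     /* no children left in the group */
            group_empty = 1;
        break;
    }
    errno = saved_errno;
}

/* Install the handler for children living in process group pgid. */
int install_reaper(pid_t pgid) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    reaper_pgid = pgid;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

Looping until waitpid() stops returning pids is what makes the single
pending-signal bit sufficient: several exits may collapse into one
SIGCHLD, but one handler run drains all of them.
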
> Ignoring this still unsolved problem of a possibly ill-behaved child
> program that changes its process group, I started constructing a
> proof-of-concept prototype. What I will do in the prototype is
> start throwing IllegalStateException from the methods of the
> Process API that pertain to such children. I think this is reasonable.
>
> Stay tuned,
>
> Peter
>
>
>