ProcessReaper: single thread reaper

David M. Lloyd david.lloyd at redhat.com
Mon Apr 14 22:04:45 UTC 2014


On 04/14/2014 03:57 PM, Peter Levart wrote:
>
> On 04/14/2014 07:02 PM, David M. Lloyd wrote:
>> On 04/14/2014 11:37 AM, Peter Levart wrote:
>>> On 04/14/2014 04:37 PM, roger riggs wrote:
>>>> Hi,
>>>>
>>>> Jtreg, for example, needs a reliable way to cleanup after tests.
>>>> We've had a variety of problems with stray processes left over because
>>>> there is no visibility nor reliable way to identify and kill them.
>>>>
>>>> Roger
>>>
>>> Hi Roger,
>>>
>>> If you want to reliably get rid of all descendants then there's only one
>>> way on UNIX:
>>>
>>>
>>> for (Proc c : enumerateDirectChildrenOfJVM()) {
>>>      getRidOfTreeRootedAt(c);
>>> }
>>>
>>> void getRidOfTreeRootedAt(Proc p) {
>>>      // if p is not alive any more, we can't enumerate its children:
>>>      // they are orphans (their parent is now "init") and we can't
>>>      // identify them any more
>>>      if (p.isAlive()) {
>>>          // save the list of direct children first, since they will be
>>>          // re-parented when their parent is gone, preventing us from
>>>          // enumerating them later...
>>>          List<Proc> children = p.enumerateDirectChildren();
>>>          // try graceful termination first...
>>>          p.terminateGracefully();
>>>          // wait a while
>>>          if (p.isAlive()) p.terminateForcefully();
>>>          // now iterate over the saved children
>>>          for (Proc c : children) {
>>>              getRidOfTreeRootedAt(c);
>>>          }
>>>      }
>>> }
>>
>> I don't think this is a good idea.  If a grandchild process exits, and
>> the parent waits() on it, then by the time we get around to iterating
>> grandchild processes, the OS may have assigned a new process the old
>> PID.  Zombies are pretty much the only reliable way to ensure that the
>> process is the one we think it is, and we can only reliably do that
>> for immediate children AFAICT.
>
> There's already such a race in the current implementation of
> Process.terminate(). It admittedly only concerns a small window between
> process exiting and the reaper thread managing to signal this state to
> the other threads wishing to terminate it at the same time, so it could
> happen that a KILL/TERM signal is sent to an already deceased PID which
> was re-used, but it doesn't happen in practice since PIDs are not
> re-used very soon typically.

It seems like it would be trivial enough to introduce a synchronization 
between the reaper thread and whatever API signals child processes.
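
A minimal sketch of that synchronization, in Java. Everything in it is an assumption made for illustration: GuardedChild, NativeOps and their methods are hypothetical names, not existing JDK API. It further assumes the reaper can learn of the exit without reaping (for example via waitid() with WNOWAIT), so that the child stays a zombie until the state change is published under the same lock the signalling path takes.

```java
/**
 * Sketch only: one child's state, shared by the reaper thread and the
 * signalling API. All names here are hypothetical, not JDK API.
 */
final class GuardedChild {
    /** Hypothetical native hooks, named after the system calls they would wrap. */
    interface NativeOps {
        void awaitExitWithoutReaping(int pid); // e.g. waitid(P_PID, pid, &info, WEXITED | WNOWAIT)
        int reap(int pid);                     // e.g. waitpid(pid, &status, 0); the PID may be reused afterwards
        void kill(int pid, int signal);        // e.g. kill(pid, signal)
    }

    private final NativeOps os;
    private final int pid;
    private boolean reaped;  // guarded by "this"
    private int exitStatus;  // guarded by "this"

    GuardedChild(NativeOps os, int pid) {
        this.os = os;
        this.pid = pid;
    }

    /** Run by the reaper thread. */
    void reapOnce() {
        // Block outside the lock. The child stays a zombie, so its PID
        // cannot be handed to another process yet.
        os.awaitExitWithoutReaping(pid);
        synchronized (this) {
            // Reap and publish the new state in one critical section.
            exitStatus = os.reap(pid);
            reaped = true;
        }
    }

    /** Sends a signal unless the child was already reaped. */
    synchronized boolean signal(int signal) {
        if (reaped) {
            return false; // the PID may belong to some other process now
        }
        os.kill(pid, signal); // live or zombie: the PID is still ours
        return true;
    }

    synchronized boolean isReaped() {
        return reaped;
    }

    synchronized int exitStatus() {
        if (!reaped) {
            throw new IllegalStateException("child not reaped yet");
        }
        return exitStatus;
    }
}
```

With this shape, signal() returns false once the child has been reaped instead of sending a signal to a PID that may have been recycled. It only covers immediate children, which is the case I care about.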

> But I agree, waiting between listing children and sending them signals
> increases the chance of hitting a reused PID.
>
> Regards, Peter
>
>>
>>>
>>>
>>> - must 1st terminate the parent (hopefully with grace and it will take
>>> care of children) because if you kill a child 1st, a persistent parent
>>> might re-spawn it.
>>> - must enumerate the children before terminating the parent, because
>>> they are re-parented when the parent dies and you can't find them any
>>> more.
>>>
>>>
>>> So my list of requirements for the new API that I submitted in previous
>>> message:
>>>
>>> On 04/14/2014 05:54 PM, Peter Levart wrote:
>>>> - enumerate direct children (regardless of which API was used to spawn
>>>> them) of JVM
>>>> - trigger graceful destruction of any direct child
>>>> - non-blocking query for liveness of any direct child
>>>> - trigger forcible termination of any direct child and all descendants
>>>> in one call
>>>> - (optionally: obtain a Process object of any live direct child that
>>>> was spawned by Process API)
>>>
>>> ...must be augmented:
>>>
>>> - enumerate direct children (regardless of which API was used to spawn
>>> them) of JVM
>>> - enumerate direct children of any child enumerated by the API
>>> - trigger graceful destruction of any descendant enumerated by the API
>>> - non-blocking query for liveness of any descendant enumerated by the API
>>> - trigger forcible termination of any descendant enumerated by the API
>>> - (optionally: obtain a Process object of any live direct JVM child that
>>> was spawned by Process API)
>>>
>>>
>>> Regards, Peter
>>>
>>>
>>>>
>>>>
>>>> On 4/14/2014 10:31 AM, David M. Lloyd wrote:
>>>>> Where does the requirement to manage grandchild processes actually
>>>>> come from?  I'd hate to see the ability to "nicely" terminate
>>>>> immediate child processes lost just because it was difficult to
>>>>> implement some grander scheme.
>>>>>
>>>>> On 04/14/2014 08:49 AM, roger riggs wrote:
>>>>>> Hi Martin,
>>>>>>
>>>>>> A new API is needed; overloading the current Process API is not a
>>>>>> good option.
>>>>>> Even within Process, a new method will be needed to destroy the
>>>>>> subprocess and all of its children, to maintain backward
>>>>>> compatibility.
>>>>>>
>>>>>> Are there specific OS features that need to be exposed to
>>>>>> applications?
>>>>>> Is the destroy-process-and-all-children abstraction too coarse?
>>>>>>
>>>>>> Roger
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 4/11/2014 7:37 PM, Martin Buchholz wrote:
>>>>>>> Let's step back again and try to check our goals...
>>>>>>>
>>>>>>> We could try to optimize the one-reaper-thread-per-subprocess thing.
>>>>>>> But that is risky, and the cost of what we're doing today is not
>>>>>>> that high.
>>>>>>>
>>>>>>> We could try to implement the feature of killing off an entire
>>>>>>> subprocess tree.  But historically, any kind of behavior change like
>>>>>>> that has been vetoed.  I have tried and failed to make less
>>>>>>> incompatible changes.  We would have to add a new API.
>>>>>>>
>>>>>>> The reality is that Java does not give you real access to the
>>>>>>> underlying OS, and unless there's a seriously heterodox attempt to
>>>>>>> provide OS-specific extensions, people will have to continue to
>>>>>>> either write native code or delegate to an OS-savvy subprocess
>>>>>>> like a perl script.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 11, 2014 at 7:52 AM, Peter Levart
>>>>>>> <peter.levart at gmail.com> wrote:
>>>>>>>
>>>>>>>     On 04/09/2014 07:02 PM, Martin Buchholz wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>     On Tue, Apr 8, 2014 at 11:08 PM, Peter Levart
>>>>>>>>     <peter.levart at gmail.com> wrote:
>>>>>>>>
>>>>>>>>         Hi Martin,
>>>>>>>>
>>>>>>>>         As you might have seen in my later reply to Roger, there's
>>>>>>>>         still hope on that front: setpgid() + waitpid(-pgid, ...)
>>>>>>>>         might be the answer. I'm exploring in that direction.
>>>>>>>>         Shells are doing it, so why can't JDK?
>>>>>>>>
>>>>>>>>         It's a little trickier for Process API, since I imagine
>>>>>>>>         that shells form a group of processes from a pipeline
>>>>>>>>         which is known in advance, while Process API will have to
>>>>>>>>         add processes to the live group dynamically. So some races
>>>>>>>>         will have to be resolved, but I think it's doable.
>>>>>>>>
>>>>>>>>
>>>>>>>>     This is a clever idea, and it's arguably better to design
>>>>>>>>     subprocesses so they live in separate process groups (emacs
>>>>>>>>     does that), but:
>>>>>>>>     Every time you create a process group, you change the effect
>>>>>>>>     of a user signal like Ctrl-C, since it's sent to only one
>>>>>>>>     group. Maybe propagate signals to the subprocess group? It's
>>>>>>>>     starting to get complicated...
>>>>>>>>
>>>>>>>
>>>>>>>     Hi Martin,
>>>>>>>
>>>>>>>     Yes, shells send Ctrl-C (SIGINT) and other signals initiated by
>>>>>>>     the terminal to a (foreground) process group. A process group
>>>>>>>     is formed from a pipeline of interconnected processes. Each
>>>>>>>     pipeline is considered to be a separate "job", hence shells
>>>>>>>     call this feature "job-control". Child processes by default
>>>>>>>     inherit the process group from their parent, so children born
>>>>>>>     with Process API (and their children) inherit the process group
>>>>>>>     from the JVM process. Considering the intentions of shell
>>>>>>>     job-control, is propagating SIGTERM/SIGINT/SIGTSTP/SIGCONT
>>>>>>>     signals to children spawned by Process API desirable? If so,
>>>>>>>     then yes, the JVM would have to handle those signals and
>>>>>>>     propagate them to the current process group that contains all
>>>>>>>     children spawned by Process API and their descendants. That
>>>>>>>     problem would certainly have to be addressed. But let's first
>>>>>>>     see what I found out about sigaction(SIGCHLD, ...),
>>>>>>>     setpgid(pid, pgid), waitpid(-pgid, ...), etc...
>>>>>>>
>>>>>>>     waitpid(-pgid, ...) alone seems not to be enough for our task,
>>>>>>>     mainly because a process can change its group and join some
>>>>>>>     other group. I don't know if this situation occurs in the
>>>>>>>     real world, but imagine we have one live child process in a
>>>>>>>     process group pgid1 and no unwaited exited children. If we
>>>>>>>     issue:
>>>>>>>
>>>>>>>         waitpid(-pgid1, &status, 0);
>>>>>>>
>>>>>>>     Then this call blocks, because at the time it was issued, there
>>>>>>>     were >0 child processes in the pgid1 group and none of them had
>>>>>>>     exited yet. Now if this one child process changes its process
>>>>>>>     group with:
>>>>>>>
>>>>>>>         setpgid(0, pgid2);
>>>>>>>
>>>>>>>     Then the waitpid call in the parent does not return (maybe
>>>>>>>     this is a bug in Linux?) although there are no live child
>>>>>>>     processes in the pgid1 group any more. Even when this child
>>>>>>>     exits, the call to waitpid does not return, since this child
>>>>>>>     is not in the group we are waiting for when it exits. If all
>>>>>>>     our children "escape" the group in this way, the thread doing
>>>>>>>     the waiting will never unblock. To solve this, we can employ
>>>>>>>     signal handlers. In a signal handler for the SIGCHLD signal we
>>>>>>>     can invoke:
>>>>>>>
>>>>>>>         waitpid(-pgid1, &status, WNOHANG); // non-blocking call
>>>>>>>
>>>>>>>     ...in a loop until it returns either (0), which means that
>>>>>>>     there are no more unwaited exited children in the group at the
>>>>>>>     moment, or (-1) with errno == ECHILD, which means that there
>>>>>>>     are no children left in the queried group: the group does not
>>>>>>>     exist any more. Since the signal handler is invoked with
>>>>>>>     SIGCHLD masked and there is one bit of pending signal state in
>>>>>>>     the kernel, no child exit can be "skipped" this way, unless the
>>>>>>>     child "escapes" by changing its group. I don't know of a
>>>>>>>     plausible reason for a program to change its process group. If
>>>>>>>     a program executing as a JVM child wants to become a background
>>>>>>>     daemon it usually behaves as follows:
>>>>>>>
>>>>>>>     - fork()s a grand-child and then exit()s (so we get notified via
>>>>>>>     signal and successfully waitpid(-pgid, ...) for its exit status)
>>>>>>>     - the grand-child then changes its session and group (becomes
>>>>>>>     session and group leader), closes file descriptors, etc. The
>>>>>>>     responsibility for waiting on the grand-child daemon is
>>>>>>>     transferred to the init process (pid=1) since the grand-child
>>>>>>>     becomes an orphan (has no parent).
>>>>>>>
>>>>>>>     Ignoring this still unsolved problem of a possibly ill-behaved
>>>>>>>     child program that changes its process group, I started
>>>>>>>     constructing a proof-of-concept prototype. What I will do in
>>>>>>>     the prototype is start throwing IllegalStateException from the
>>>>>>>     methods of the Process API that pertain to such children. I
>>>>>>>     think this is reasonable.
>>>>>>>
>>>>>>>     Stay tuned,
>>>>>>>
>>>>>>>     Peter
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>


-- 
- DML



More information about the core-libs-dev mailing list