ProcessReaper: single thread reaper
Roger Riggs
Roger.Riggs at Oracle.com
Tue Mar 25 21:59:34 UTC 2014
Hi Peter,
The stream draining code was added in 2011 to address resource exhaustion
due to poor hygiene.
* JDK-6944584 <https://bugs.openjdk.java.net/browse/JDK-6944584>
Improvements to subprocess handling on Unix
Moves the resource 'leak' from the OS kernel/process space to the heap
and allows the process to be cleaned up sooner.
Roger
On 3/25/14 9:50 AM, Peter Levart wrote:
> On 03/25/2014 05:25 PM, Roger Riggs wrote:
>> Hi Peter,
>>
>> Thanks for the comments, I'll fix the races conditions, I intended
>> to go back to the dedicated thread but wanted to check for scalability
>> in the case of a large number of threads and non-trivial termination
>> overheads
>> where the streams had to be drained.
>>
>
> Why does stream need to be drained right away (in reaper thread)?
> Isn't it enough for child pid to be waited upon and let the client of
> the stream pull from the stream until it reaches the EOF or not if it
> doesn't need the data... Is this just to be able to reclaim the file
> descriptor of the pipe right away? Wouldn't FileInputStream's close()
> method do the job (or finalize() if user forgets to call close())?
> There would be a native resource leakage if user forgets to call
> close() AND retains the reference to InputStream, but this is the same
> with all normal streams like FileInputStream...
>
> Regards, Peter
>
>> Thanks, Roger
>>
>>
>>
>> On 3/25/14 7:05 AM, Peter Levart wrote:
>>> On 03/24/2014 10:05 PM, roger riggs wrote:
>>>> Hi Rob, Martin, et.al.
>>>>
>>>> I've prototyped (for 9) a thread reaper[1] that uses a single
>>>> thread to wait for exiting
>>>> processes and calling back to the process with the exit status.
>>>> the interesting part is getting the exit status back to the Process
>>>> that needs it
>>>> It needs more testing and hardening.
>>>>
>>>> I had not considered using a signal handler for SIGCHLD but that's
>>>> an option,
>>>> though we need to be very careful about thread usage.
>>>>
>>>> Roger
>>>>
>>>> p.s. comments on the single thread reaper appreciated (in a new thread)
>>>> [1] http://cr.openjdk.java.net/~rriggs/webrev-waitpid/
>>>>
>>>>
>>> Hi Roger,
>>>
>>> I think I found a little race. Suppose reaper task is still alive
>>> and that all consumers have been serviced (consumerCount is 0). The
>>> reaper task waits for 500 millis for next consumer to be registered,
>>> but times out. Before calling "reaperThread.release()", new consumer
>>> comes around and registers itself, also calling runReaper(), but
>>> since reaperThread.release() has not yet been called by old reaper
>>> task, new reaper task is not submitted to commonPool. The old reaper
>>> task finishes, leaving one consumer on the waitingList with no
>>> reaper task to service it. If no new consumers get registered, the
>>> waiting consumer will never be notified...
>>>
>>> The simplest solution for this race, I think, would be to have a
>>> dedicated long-running thread. It could be spawned lazily, but then
>>> it would never finish.
>>>
>>> Otherwise a nice solution with two lists (exitList/waitList) and
>>> avoidance of race with reversed orders between
>>> - consumer registration: register on waitList 1st then check
>>> exitList, and
>>> - exit event dispatch: register on exitList 1st then check waitList
>>>
>>> ...but the check you use with consumeCount local variable to detect
>>> processes spawned by other means (for purposes of logging only) has
>>> a race:
>>>
>>> thread1: Suppose a new consumer is being registered with
>>> onExitCall(...), is added on the waitList, but before checking
>>> exitList().size() and iterating the exitList, ...
>>> thread2: the reaper task detects that the very same process has
>>> finished (gets it's pid from waitpid) and adds it's pid to exitList.
>>> Then before iterating waitList, ...
>>> thread1: iterates the exitList, finds a match and consumes the pid,
>>> removing the matching entries from both exitList and waitList. Then ...
>>> thread2: the reaper task iterates waitList, doesn't find a matching
>>> entry for exitPid, doesn't increment consumeCount and voila: debug
>>> log("Unexpected process exit for pid:...").
>>>
>>>
>>> That's enough races for today.
>>>
>>> Regards, Peter
>>>
>>>>
>>>>
>>>> On 3/24/2014 12:38 AM, Rob McKenna wrote:
>>>>> Hi folks,
>>>>>
>>>>> Roger Riggs (cc'd) may want to chip in here as he's looking at the
>>>>> reaper thread arrangement in 9 at the moment.
>>>>>
>>>>> On another note, I too support the merging of those files. I
>>>>> didn't think there was much appetite for it at the time so I must
>>>>> admit this fell down my todo list. Looking at this bug did remind
>>>>> me that its something worth trying though. As per Alan's mail, I'm
>>>>> going to tackle it separately if you folks don't mind. I'll have a
>>>>> look at Peter's changes (thanks Peter!) as soon as I can and see
>>>>> about getting them in.
>>>>>
>>>>> -Rob
>>>>>
>>>>> On 23/03/14 22:30, Martin Buchholz wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Mar 23, 2014 at 2:34 AM, Martin Buchholz
>>>>>> <martinrb at google.com <mailto:martinrb at google.com>> wrote:
>>>>>>
>>>>>>
>>>>>> We have also thought about whether having reaper threads is
>>>>>> necessary. The Unix rule is that child processes should be
>>>>>> waited for, and some thread needs to do that. There's no way
>>>>>> to wait for a set of child pids, or to specify a "completion
>>>>>> handler". Well, you might be able to get the newish waitid()
>>>>>> to do what you want, but it looks like it's not sufficient
>>>>>> when java is running inside a process that might do
>>>>>> independent subprocess creation outside of the JVM.
>>>>>>
>>>>>>
>>>>>> Actually, I take it back. With sufficient work, it looks like
>>>>>> you can get SIGCHLD to give you pid information in siginfo_t
>>>>>> si_pid, and that can be used to trigger the reaping. It looks
>>>>>> like waitpid is "async-signal-safe", so we can call it from our
>>>>>> signal handler.
>>>>>>
>>>>>> While we're at it we can fix SIGCHLD handling to do signal
>>>>>> chaining, as with other signals.
>>>>>
>>>>
>>>
>>
>
More information about the core-libs-dev
mailing list