RFR 8220684 : Process.waitFor(long, TimeUnit) can return false for a process that exited within the timeout

Ivan Gerasimov ivan.gerasimov at oracle.com
Fri Mar 15 07:03:01 UTC 2019


Thank you David!


On 3/14/19 11:01 PM, David Holmes wrote:
> Hi Ivan,
>
> On 15/03/2019 12:02 pm, Ivan Gerasimov wrote:
>> Hi David!
>>
>>
>> On 3/14/19 5:28 PM, David Holmes wrote:
>>> Hi Ivan,
>>>
>>> On 15/03/2019 5:49 am, Ivan Gerasimov wrote:
>>>> Hello!
>>>>
>>>> The default implementation of Process.waitFor(long, TimeUnit) does 
>>>> not check if the process has exited after the last portion of the 
>>>> timeout has expired.
>>>
>>> Please clarify. There is always a race between detecting a timeout 
>>> and the process actually terminating. If the process actually 
>>> terminates while you're deciding to report a timeout that seems 
>>> just  an acceptable possibility. No matter what you do the process 
>>> could terminate just after you decided to report the timeout.
>>>
>>>
>> Current code for waitFor(...) is
>>
>>   212         do {
>>   213             try {
>>   214                 exitValue();
>>   215                 return true;
>>   216             } catch(IllegalThreadStateException ex) {
>>   217                 if (rem > 0)
>>   218                     Thread.sleep(
>>   219 Math.min(TimeUnit.NANOSECONDS.toMillis(rem) + 1, 100));
>>   220             }
>>   221             rem = unit.toNanos(timeout) - (System.nanoTime() - 
>> startTime);
>>   222         } while (rem > 0);
>>   223         return false;
>>
>> So, if the loop has just processed the last 100 ms portion of the 
>> timeout, it ends right away, without checking if the process has 
>> exited during this period.
>
> Ah I see. Yes there should be a final check after what will be the 
> last sleep.
>
>> Not a big deal of course, but it is more accurate to check if the 
>> process has exited at the *end* of the specified timeout, and not 100 
>> milliseconds before the end.
>
> Agreed.
>
> I share Pavel's concern about the implicit overflow in calculating a 
> deadline, rather than comparing elapsed time with the timeout. There 
> has been some effort in core-libs to remove assumptions about how 
> overflows are handled.
>
We only check sign of the remainingNanos, which is initially strictly 
positive timeout.  Over time it gets decreased by elapsed time.
The elapsed time is the difference between System.nanoTime() measured at 
line 217 and 213, so it is always non-negative (well, strictly positive, 
as there's sleep in between).

I don't really see a possibility of overflow for remainingNanos here.

The deadline may overflow, and it is totally fine, as we don't care 
about its sign.

Am I missing something?

With kind regards,
Ivan

> Thanks,
> David
> -----
>
>> With kind regards,
>> Ivan
>>
>>>
>>> David
>>> -----
>>>
>>>> JDK has two implementations of Process (for Unix and Windows) and 
>>>> they both override waitFor(), so it's not an issue for them.
>>>>
>>>> Still, it is better to provide a more accurate default implementation.
>>>>
>>>> I'm not quite certain the regression test needs to be included in 
>>>> the fix.  The test does demonstrate the issue with the unfixed JDK 
>>>> and passed Okay on all tested platforms in Mach5. Yet, I suspect 
>>>> the test can still show false negative results, as there are no 
>>>> guaranties that even such simple application as `true` will finish 
>>>> in 100 ms.
>>>> I can tag the test as @ignored with a comment, or simply remove it 
>>>> from the fix.
>>>>
>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8220684
>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8220684/00/webrev/
>>>>
>>>> Thanks in advance for reviewing!
>>>>
>>>
>>
>

-- 
With kind regards,
Ivan Gerasimov



More information about the core-libs-dev mailing list