RFR 8220684 : Process.waitFor(long, TimeUnit) can return false for a process that exited within the timeout
David Holmes
david.holmes at oracle.com
Fri Mar 15 07:16:21 UTC 2019
On 15/03/2019 5:03 pm, Ivan Gerasimov wrote:
> Thank you David!
>
>
> On 3/14/19 11:01 PM, David Holmes wrote:
>> Hi Ivan,
>>
>> On 15/03/2019 12:02 pm, Ivan Gerasimov wrote:
>>> Hi David!
>>>
>>>
>>> On 3/14/19 5:28 PM, David Holmes wrote:
>>>> Hi Ivan,
>>>>
>>>> On 15/03/2019 5:49 am, Ivan Gerasimov wrote:
>>>>> Hello!
>>>>>
>>>>> The default implementation of Process.waitFor(long, TimeUnit) does
>>>>> not check if the process has exited after the last portion of the
>>>>> timeout has expired.
>>>>
>>>> Please clarify. There is always a race between detecting a timeout
>>>> and the process actually terminating. If the process actually
>>>> terminates while you're deciding to report a timeout that seems
>>>> just an acceptable possibility. No matter what you do the process
>>>> could terminate just after you decided to report the timeout.
>>>>
>>>>
>>> Current code for waitFor(...) is
>>>
>>> 212 do {
>>> 213 try {
>>> 214 exitValue();
>>> 215 return true;
>>> 216 } catch(IllegalThreadStateException ex) {
>>> 217 if (rem > 0)
>>> 218 Thread.sleep(
>>> 219 Math.min(TimeUnit.NANOSECONDS.toMillis(rem) + 1, 100));
>>> 220 }
>>> 221 rem = unit.toNanos(timeout) - (System.nanoTime() -
>>> startTime);
>>> 222 } while (rem > 0);
>>> 223 return false;
>>>
>>> So, if the loop has just processed the last 100 ms portion of the
>>> timeout, it ends right away, without checking if the process has
>>> exited during this period.
>>
>> Ah I see. Yes there should be a final check after what will be the
>> last sleep.
>>
>>> Not a big deal of course, but it is more accurate to check if the
>>> process has exited at the *end* of the specified timeout, and not 100
>>> milliseconds before the end.
>>
>> Agreed.
>>
>> I share Pavel's concern about the implicit overflow in calculating a
>> deadline, rather than comparing elapsed time with the timeout. There
>> has been some effort in core-libs to remove assumptions about how
>> overflows are handled.
>>
> We only check sign of the remainingNanos, which is initially strictly
> positive timeout. Over time it gets decreased by elapsed time.
> The elapsed time is the difference between System.nanoTime() measured at
> line 217 and 213, so it is always non-negative (well, strictly positive,
> as there's sleep in between).
>
> I don't really see a possibility of overflow for remainingNanos here.
>
> The deadline may overflow, and it is totally fine, as we don't care
> about its sign.
It's deadline that I was flagging.
> Am I missing something?
Just a code cleanliness issue. I had thought, but now can't find, these
kinds of overflowing calculations were being cleaned up in core-libs.
But perhaps I'm just mis-remembering.
Cheers,
David
>
> With kind regards,
> Ivan
>
>> Thanks,
>> David
>> -----
>>
>>> With kind regards,
>>> Ivan
>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>>> JDK has two implementations of Process (for Unix and Windows) and
>>>>> they both override waitFor(), so it's not an issue for them.
>>>>>
>>>>> Still, it is better to provide a more accurate default implementation.
>>>>>
>>>>> I'm not quite certain the regression test needs to be included in
>>>>> the fix. The test does demonstrate the issue with the unfixed JDK
>>>>> and passed Okay on all tested platforms in Mach5. Yet, I suspect
>>>>> the test can still show false negative results, as there are no
>>>>> guaranties that even such simple application as `true` will finish
>>>>> in 100 ms.
>>>>> I can tag the test as @ignored with a comment, or simply remove it
>>>>> from the fix.
>>>>>
>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8220684
>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8220684/00/webrev/
>>>>>
>>>>> Thanks in advance for reviewing!
>>>>>
>>>>
>>>
>>
>
More information about the core-libs-dev
mailing list