RFR 8220684 : Process.waitFor(long, TimeUnit) can return false for a process that exited within the timeout
Roger Riggs
Roger.Riggs at oracle.com
Fri Mar 15 18:39:21 UTC 2019
Hi Ivan,
The updated code looks fine except for style.
Please put the return and the if's on different lines (by the style
conventions).
Line 210 and 211, and 216.
Thanks, Roger
On 03/15/2019 04:08 AM, Ivan Gerasimov wrote:
> Hi David!
>
> The code could be written like following:
>
> long start = System.nanoTime();
> do {
> ...
> long elapsed = System.nanoTime() - start;
> remainingNanos = timeoutNanos - elapsed;
> } while (remainingNanos > 0);
>
> but then one realizes that this always calculates (start +
> timeoutNanos), both of which are effectively final.
> So, the sum can be calculated in advance, and named, say, "deadline" :)
>
> With kind regards,
> Ivan
>
> On 3/15/19 12:16 AM, David Holmes wrote:
>> On 15/03/2019 5:03 pm, Ivan Gerasimov wrote:
>>> Thank you David!
>>>
>>>
>>> On 3/14/19 11:01 PM, David Holmes wrote:
>>>> Hi Ivan,
>>>>
>>>> On 15/03/2019 12:02 pm, Ivan Gerasimov wrote:
>>>>> Hi David!
>>>>>
>>>>>
>>>>> On 3/14/19 5:28 PM, David Holmes wrote:
>>>>>> Hi Ivan,
>>>>>>
>>>>>> On 15/03/2019 5:49 am, Ivan Gerasimov wrote:
>>>>>>> Hello!
>>>>>>>
>>>>>>> The default implementation of Process.waitFor(long, TimeUnit)
>>>>>>> does not check if the process has exited after the last portion
>>>>>>> of the timeout has expired.
>>>>>>
>>>>>> Please clarify. There is always a race between detecting a
>>>>>> timeout and the process actually terminating. If the process
>>>>>> actually terminates while you're deciding to report a timeout
>>>>>> that seems just an acceptable possibility. No matter what you do
>>>>>> the process could terminate just after you decided to report the
>>>>>> timeout.
>>>>>>
>>>>>>
>>>>> Current code for waitFor(...) is
>>>>>
>>>>> 212 do {
>>>>> 213 try {
>>>>> 214 exitValue();
>>>>> 215 return true;
>>>>> 216 } catch(IllegalThreadStateException ex) {
>>>>> 217 if (rem > 0)
>>>>> 218 Thread.sleep(
>>>>> 219 Math.min(TimeUnit.NANOSECONDS.toMillis(rem) + 1, 100));
>>>>> 220 }
>>>>> 221 rem = unit.toNanos(timeout) - (System.nanoTime()
>>>>> - startTime);
>>>>> 222 } while (rem > 0);
>>>>> 223 return false;
>>>>>
>>>>> So, if the loop has just processed the last 100 ms portion of the
>>>>> timeout, it ends right away, without checking if the process has
>>>>> exited during this period.
>>>>
>>>> Ah I see. Yes there should be a final check after what will be the
>>>> last sleep.
>>>>
>>>>> Not a big deal of course, but it is more accurate to check if the
>>>>> process has exited at the *end* of the specified timeout, and not
>>>>> 100 milliseconds before the end.
>>>>
>>>> Agreed.
>>>>
>>>> I share Pavel's concern about the implicit overflow in calculating
>>>> a deadline, rather than comparing elapsed time with the timeout.
>>>> There has been some effort in core-libs to remove assumptions about
>>>> how overflows are handled.
>>>>
>>> We only check sign of the remainingNanos, which is initially
>>> strictly positive timeout. Over time it gets decreased by elapsed
>>> time.
>>> The elapsed time is the difference between System.nanoTime()
>>> measured at line 217 and 213, so it is always non-negative (well,
>>> strictly positive, as there's sleep in between).
>>>
>>> I don't really see a possibility of overflow for remainingNanos here.
>>>
>>> The deadline may overflow, and it is totally fine, as we don't care
>>> about its sign.
>>
>> It's deadline that I was flagging.
>>
>>> Am I missing something?
>>
>> Just a code cleanliness issue. I had thought, but now can't find,
>> these kinds of overflowing calculations were being cleaned up in
>> core-libs. But perhaps I'm just mis-remembering.
>>
>> Cheers,
>> David
>>
>>>
>>> With kind regards,
>>> Ivan
>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> With kind regards,
>>>>> Ivan
>>>>>
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> JDK has two implementations of Process (for Unix and Windows)
>>>>>>> and they both override waitFor(), so it's not an issue for them.
>>>>>>>
>>>>>>> Still, it is better to provide a more accurate default
>>>>>>> implementation.
>>>>>>>
>>>>>>> I'm not quite certain the regression test needs to be included
>>>>>>> in the fix. The test does demonstrate the issue with the
>>>>>>> unfixed JDK and passed Okay on all tested platforms in Mach5.
>>>>>>> Yet, I suspect the test can still show false negative results,
>>>>>>> as there are no guaranties that even such simple application as
>>>>>>> `true` will finish in 100 ms.
>>>>>>> I can tag the test as @ignored with a comment, or simply remove
>>>>>>> it from the fix.
>>>>>>>
>>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8220684
>>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8220684/00/webrev/
>>>>>>>
>>>>>>> Thanks in advance for reviewing!
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the core-libs-dev
mailing list