RFR: 6313903: Thread.sleep(3) might wake up immediately on windows

David Holmes david.holmes at oracle.com
Wed Sep 4 03:38:36 UTC 2019


Hi Roger,

Thanks for clarifying - yes that change would be okay. I may look at 
making it further down the track with these changes so that we have 
minimal disruption for now.

Thanks again.

David

On 4/09/2019 1:29 pm, Roger Riggs wrote:
> Hi David,
> 
> No argument this code is sensitive;  I generally agree to minimize changes.
> 
> Here's what I had in mind:  (Not compiled)
> It assumes that before the first park(millis) there will be no 
> significant elapsed time that would reduce millis.
> If that's not valid then using the original code will take that into 
> account.
> 
>      jlong prevtime = javaTimeNanos();
> 
>      assert(thread->is_Java_thread(), "sanity check");
>      JavaThread *jt = (JavaThread *) thread;
> 
>      for (;;) {
> 
>        // interruption has precedence over timing out
>        if (os::is_interrupted(thread, true)) {
>          return OS_INTRPT;
>        }
> 
>        if (millis <= 0) {
>          return OS_OK;
>        }
> 
>        {
>          ThreadBlockInVM tbivm(jt);
>          OSThreadWaitState osts(jt->osthread(), false /* not Object.wait() */);
> 
>          jt->set_suspend_equivalent();
>          // cleared by handle_special_suspend_equivalent_condition() or
>          // java_suspend_self() via check_and_wait_while_suspended()
> 
>          slp->park(millis);
> 
>          // were we externally suspended while we were waiting?
>          jt->check_and_wait_while_suspended();
>          
>          // wait complete, compute remaining wait time
>          jlong newtime = javaTimeNanos();
> 
>          if (newtime - prevtime < 0) {
>            // time moving backwards, should only happen if no monotonic clock
>            // not a guarantee() because JVM should not abort on kernel/glibc bugs
>            assert(!os::supports_monotonic_clock(),
>                   "unexpected time moving backwards detected in os::sleep(interruptible)");
>          } else {
>            millis -= (newtime - prevtime) / NANOSECS_PER_MILLISEC;
>          }
>          prevtime = newtime;
>        }
>      }
>      return OS_OK;
> 
> I don't feel strongly about this, if you prefer your original; that's fine.
> 
> $.02, Roger
> 
> 
> On 9/3/19 7:20 PM, David Holmes wrote:
>> Hi Roger,
>>
>> Thanks for taking a look.
>>
>> On 4/09/2019 12:30 am, Roger Riggs wrote:
>>> Hi David,
>>>
>>> In os.cpp:1893, can the double read of javaTimeNanos be avoided on 
>>> the first iteration?
>>
>> Note this is the existing code and I'm deliberately trying not to 
>> change anything in that regard.
>>
>>> Set newtime = prevtime before the loop and re-read it after the call 
>>> to park(:1918)
>>
>> I can't quite see the full change you propose here. The code has to 
>> read the new time, check for timeout, update prevtime - in that order, 
>> before doing the actual wait. If all that is moved after the wait then 
>> we aren't giving interrupt precedence. If we move the interrupt check 
>> after the wait as well then we need to add an additional interrupt 
>> check before the loop for the first time through. (I actually had a 
>> version that did this and decided it was too disruptive as part of 
>> this code relocation.)
>>
>>> Negible nanotime should have elapsed between 1881 and 1893.
>>> Ditto the non-interruptable code in 1936-1947.
>>
>> Note the non-interuptible code will disappear in my next RFE.
>>
>>> (Unless javaTimeNanos is free?).
>>
>> Afraid not.
>>
>> Thanks,
>> David
>> -----
>>
>>> $.02, Roger
>>>
>>>
>>>
>>>
>>> On 9/2/19 2:06 AM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-6313903
>>>> webrev: http://cr.openjdk.java.net/~dholmes/6313903/webrev/
>>>>
>>>> This is a fix to a very long term problem on Windows that really 
>>>> should have been fixed many years ago: the Windows timed-wait 
>>>> routines can return early by up to a clock tick** (which itself 
>>>> could be as long as 16ms). The fix for Thread.sleep is to track the 
>>>> elapsed time (via nanoTime) and simply adjust and repeat the sleep 
>>>> if we were going to return earlier than requested. This is the same 
>>>> as what we have been doing on non-Windows for many years.
>>>>
>>>> ** whether or not you observe this seems highly dependent on 
>>>> hardware - see bug report for observed behaviours.
>>>>
>>>> The additional motivation for fixing this now is to reduce the 
>>>> amount of platform specific sleep/interrupt code and to allow 
>>>> changes to the shared interrupt code. There are some additional 
>>>> RFE's in the pipeline for that.
>>>>
>>>> This fix does the following:
>>>> - moves the os_posix.cpp os::sleep implementation to os.cpp to be 
>>>> shared by all platforms and deletes the os_windows.cpp os::sleep 
>>>> implementation
>>>>   - there are some additional comments added and the JavaThread 
>>>> assertion is moved to the head of the "interruptible" code path
>>>> - updates os_windows.cpp in these additional ways:
>>>>   - updates the commentary about creating the interrupt_event (which 
>>>> is now only used by JDK library code for use by Process.waitFor)
>>>>   - updates os::interrupt to also unpark the _SleepEvent
>>>>   - moves the use of the HighResolutionInterval utility class from 
>>>> the os::sleep code to the PlatformEvent::park code, so that sleep on 
>>>> Windows will still make adjustments to the clock tick if needed (it 
>>>> also means that Object.wait() calls will now make the same 
>>>> adjustments**).
>>>>
>>>> ** This is a change behaviour but I'm not concerned about there 
>>>> being any adverse effects from this. On many systems the code will 
>>>> have no affect as the system will be using the 1ms clock tick all 
>>>> the time anyway (e.g. my laptops do this). On systems with e.g. 15ms 
>>>> tick, what you will now see is Object.wait returning on time rather 
>>>> than much later than requested e.g. before this fix a timed-wait of 
>>>> 1ms returned in 15.14ms, but after the fix it was 2.3ms. (Note that 
>>>> this observation seems contrary to the general statement that 
>>>> windows timed-wait event operations can return early by up to a 
>>>> clock tick - as I said it seems highly hardware dependent.)
>>>>
>>>> Testing: tiers 1-3
>>>>          custom testing to look at sleep and wait times, as per bug 
>>>> report.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>
> 


More information about the hotspot-runtime-dev mailing list