RFR(s): 8214180: Need better granularity for sleeping

Tue Dec 18 09:23:19 UTC 2018

Hi David,

On 11/30/18 5:43 AM, David Holmes wrote:
> Hi Robbin,
> 
> On 28/11/2018 9:27 pm, Robbin Ehn wrote:
>> Hi David,
>>
>> Inc:
>> http://cr.openjdk.java.net/~rehn/8214180/v2/inc/webrev/
> 
> Okay. I still have my doubts/concerns about Windows, but as long as the observed 
> minimum "nanosleep" is no worse than the 1ms "short sleep" that was previously 
> requested by SpinYield::yield_or_sleep() then it should be okay.
> 
> I also have some concerns the test might fail on some versions of Windows or 
> running on particular hardware. Can you try to do a --test-repeat run in mach5 
> for Windows only so we hit as many of the Windows machines as possible. :)

You are correct: it failed in 7 of 200 Windows runs.

I removed the gtest. I'm not really happy about that, but I'm not sure what else to do.

Inc:
http://cr.openjdk.java.net/~rehn/8214180/v3/inc/webrev/
Full:
http://cr.openjdk.java.net/~rehn/8214180/v3/full/webrev/

Thanks, Robbin

> 
> Thanks,
> David
> 
>> Full:
>> http://cr.openjdk.java.net/~rehn/8214180/v2/webrev/
>>
>> On 2018-11-27 00:08, David Holmes wrote:
>>> Hi Robbin,
>>>
>>> On 22/11/2018 12:06 am, Robbin Ehn wrote:
>>>> Hi all, please review.
>>>>
>>>> naked_short_sleep is too coarse-grained on contemporary hardware/OSes.
>>>> A 1 ms minimum is a very long time when we can complete an entire safepoint 
>>>> in 0.5 ms.
>>>> Sleeping for a very short time instead of yielding has several use cases.
>>>
>>> So you factored out os::naked_short_sleep into os_posix.cpp for use by all 
>>> platforms except Windows. That seems fine. Solaris is already linked with 
>>> -lrt so use of nanosleep should be fine there.
>>>
>>> You added os::naked_short_nanosleep, defined in os_posix.cpp, to use 
>>> nanosleep. Also fine.
>>>
>>> Question: have you actually measured the observable minimum sleep time on 
>>> different OSes? (It can even vary depending on hardware.)
>>
>> Windows ~1 ms, Linux ~55 us (can vary a lot depending on power saving, scheduler 
>> timings, etc.).
>>
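For anyone who wants to reproduce numbers like the ones above, here is a minimal sketch of how the observable minimum sleep could be measured on a POSIX system. It is not the code from the webrev: the helper names `measured_short_sleep_ns` and `observed_min_sleep_ns` are hypothetical, and only `nanosleep` and `std::chrono::steady_clock` are real APIs.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstdint>
#include <time.h>   // POSIX nanosleep

// Hypothetical helper: sleep for at least `ns` nanoseconds (ns < 1 second)
// and return the wall-clock time actually spent, in nanoseconds.
int64_t measured_short_sleep_ns(int64_t ns) {
  assert(ns >= 0 && ns < 1000000000 && "short sleeps only");
  struct timespec req;
  req.tv_sec = 0;
  req.tv_nsec = static_cast<long>(ns);
  struct timespec rem;

  auto start = std::chrono::steady_clock::now();
  // If a signal interrupts the sleep, continue with the remaining time.
  while (nanosleep(&req, &rem) == -1 && errno == EINTR) {
    req = rem;
  }
  auto end = std::chrono::steady_clock::now();

  return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}

// Hypothetical helper: smallest observed duration over `samples` requests.
int64_t observed_min_sleep_ns(int64_t ns, int samples) {
  assert(samples > 0);
  int64_t best = measured_short_sleep_ns(ns);
  for (int i = 1; i < samples; i++) {
    int64_t t = measured_short_sleep_ns(ns);
    if (t < best) {
      best = t;
    }
  }
  return best;
}
```

Calling `observed_min_sleep_ns(1000, 1000)` asks for 1 us a thousand times and reports the best case; the result depends heavily on the scheduler, power-saving state, and hardware.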
>>>
>>> For Windows you create and use a WaitableTimer. That does not seem okay. That 
>>> seems extremely heavyweight. The time taken to create and use the timer might 
>>> be longer than what you intended to sleep for! And again there is the issue 
>>> of the actual accuracy of the timer even if you can specify nanosecond times. 
>>> I'm also unclear about the time value passed to the timer - the docs state it 
>>> is supposed to be expressed in 100ns increments, and it's unclear if that 
>>> also applies to the relative form?
>>
>> Yes, I commented the creation of the timer. I considered adding the timer to 
>> each thread, but I'd rather not. On Linux you don't need any syscall to create 
>> such primitives; whether you still need one on Windows I don't know. But as it 
>> turns out it doesn't matter, since the scheduler delay is ~1 ms on my Win10 
>> box (if I'm lucky I get 0.5 ms), so the cost is not measurable. Presumably 
>> Windows is still not tickless?
>>
>> Yes, I misread the docs. You're correct, it should be in hundreds of 
>> nanoseconds, not nanoseconds. Thanks.
>>
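On the units question: Microsoft's documentation for SetWaitableTimer states that the due time is given in 100-nanosecond intervals, with positive values meaning an absolute time and negative values a relative one. A minimal sketch of the conversion, assuming we round up so the request is never shorter than asked for, could look like the following. The function name `relative_due_time_100ns` is hypothetical and is not taken from the webrev.

```cpp
#include <cstdint>

// Hypothetical helper: convert a relative sleep time in nanoseconds into the
// value expected for a waitable timer's due time, i.e. a count of
// 100-nanosecond intervals, negated to mark it as relative.
int64_t relative_due_time_100ns(int64_t ns) {
  // Round up to the next 100 ns interval so the wait is not shortened.
  int64_t intervals = (ns + 99) / 100;
  // A negative due time means "relative to now".
  return -intervals;
}
```

On Windows the result would be stored in a `LARGE_INTEGER` (its `QuadPart` field) before being passed to `SetWaitableTimer`. For example, a 1 us request becomes -10.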
>>>
>>>> Here I add it to SpinYield to get a much smoother back-off delay curve, 
>>>> which means it will be usable in more places.
>>>
>>> Seems okay - assuming a 1 microsecond sleep time is achievable.
>>
>> As I said, it is not achievable today; it should be read as "do not execute 
>> for at least 1 us".
>> Arguably we could go higher or lower. I think of it the other way around:
>> your CAS has repeatedly failed, so how many instructions should the competing 
>> threads execute before it is worth testing again?
>>
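The back-off idea described above can be illustrated with a small sketch: spin first, then yield, then fall back to a very short sleep once the CAS keeps failing. This is not the HotSpot SpinYield implementation. The class `BackOff`, its limits (16 spins, 32 yields), and `add_with_backoff` are hypothetical, and `std::this_thread::sleep_for` stands in for os::naked_short_nanosleep.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical back-off helper; limits are illustrative assumptions.
class BackOff {
  static constexpr unsigned kSpinLimit = 16;   // attempts that just spin
  static constexpr unsigned kYieldLimit = 32;  // attempts that yield
  unsigned _attempts = 0;
  unsigned _spins = 0;
  unsigned _yields = 0;
  unsigned _sleeps = 0;

 public:
  // Called after each failed attempt; escalates from spin to yield to sleep.
  void wait() {
    if (_attempts < kSpinLimit) {
      _spins++;                      // cheapest: retry immediately
    } else if (_attempts < kSpinLimit + kYieldLimit) {
      std::this_thread::yield();     // give up the CPU slice
      _yields++;
    } else {
      // "Do not execute for at least 1 us"; the real delay is usually longer.
      std::this_thread::sleep_for(std::chrono::microseconds(1));
      _sleeps++;
    }
    _attempts++;
  }

  unsigned spins() const { return _spins; }
  unsigned yields() const { return _yields; }
  unsigned sleeps() const { return _sleeps; }
};

// Example use: add `delta` to `value` with a CAS loop, backing off on failure.
int add_with_backoff(std::atomic<int>& value, int delta) {
  BackOff backoff;
  int expected = value.load();
  // On failure compare_exchange_weak reloads `expected` with the current value.
  while (!value.compare_exchange_weak(expected, expected + delta)) {
    backoff.wait();
  }
  return expected + delta;
}
```

The point of the sleep stage is the one made above: after repeated failures, the competing threads should get to execute a meaningful number of instructions before the next retry.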
>> Thanks!
>>
>> /Robbin
>>
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> CR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8214180
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~rehn/8214180/webrev/
>>>>
>>>> Passes t1-3.
>>>>
>>>> Thanks, Robbin