RFR: 6900441 PlatformEvent.park(millis) on Linux could still be affected by changes to the time-of-day clock
David Holmes
david.holmes at oracle.com
Fri Sep 13 04:28:55 PDT 2013
Testing ok - v4 should be good to go.
Thanks,
David
On 13/09/2013 1:23 PM, David Holmes wrote:
> <sigh> Still not right. I confused the PlatformEvent logic with the
> Parker logic and thought that the check "s < 1" indicated a thread was
> definitely blocked; but it doesn't it only means that a thread may be
> blocked. Hence if the thread is not blocked _cur_index == -1 and the
> assert trips. So remove the assert and add a check for _cur_index != -1,
> as that means a thread is definitely blocked.
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v4/
>
> but testing still TBD so hold off on further reviews.
>
> The logic is messy now with yet-another mutex-unlock having to be added,
> but I wanted to minimize the potential change - let me know if you think
> it worthwhile refactoring.
>
> Thanks,
> David
>
> On 13/09/2013 10:59 AM, David Holmes wrote:
>> On 13/09/2013 9:30 AM, Daniel D. Daugherty wrote:
>>> On 9/12/13 3:47 PM, David Holmes wrote:
>>>>
>>>> I found a bug in the assertion related logic - new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v3/
>>>>
>>>> For the untimed case in park I wasn't setting the _cur_index value (I
>>>> was implicitly using zero) and so the assertion it wasn't -1 was
>>>> failing in unpark.
>>>
>>> So is the usage model for timed park() such that no other thread
>>> is calling unpark() on the thread that called timed park()?
>>>
>>> If another thread can call unpark() on a thread calling timed park(),
>>> then I think you may have a race in:
>>>
>>> 5917 assert(_cur_index != -1, "invariant");
>>>
>>> The thread calling unpark() might reach the assertion after the
>>> thread that made the timed call to park() set:
>>>
>>> 5888 _cur_index = -1;
>>>
>>> Don't know if it's really possible, but...
>>
>> The code you refer to is all inside a critical region locked with the
>> mutex. No races possible.
>>
>>
>>>> With regard to assert versus assert_status - yes this would work:
>>>>
>>>> assert_status(status == 0, errno, "gettimeofday");
>>>>
>>>> I had never considered that. :) When I added assert_status to diagnoze
>>>> a particular issue I never attempted to change all the existing
>>>> asserts on the library calls. So we have a lot of simple
>>>> "assert(status==0,"invariant") where assert_status could have been
>>>> used, as well as the old gettimeofday style calls. Do you want me to
>>>> change them all in os_linux.cpp?
>>>
>>> I guess that depends on whether we think this fix will need to be
>>> backported to HSX-24 or earlier... Less changes, the better, if a
>>> backport is in the future...
>>
>> Yes backports to 7 and 6 are definitely on the cards. The
>> assert/assert_status can be cleaned up as part of the big cleanup for 9
>> (where a lot of the calls will disappear anyway as we get rid of all the
>> ancient workarounds). So I will leave this aspect as is.
>>
>> Thanks,
>> David
>>
>>> Dan
>>>
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 13/09/2013 1:14 AM, Daniel D. Daugherty wrote:
>>>>> On 9/11/13 5:14 PM, David Holmes wrote:
>>>>>> updated webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v2/
>>>>>
>>>>> Looks good. Possibly modulo one comment below...
>>>>>
>>>>>
>>>>>> On 12/09/2013 7:10 AM, Daniel D. Daugherty wrote:
>>>>>>
>>>>>>> line 5541: assert_status(status == 0, status, "clock_gettime");
>>>>>>> line 5553: assert(status == 0, "gettimeofday");
>>>>>>> So why is one an assert_status() call and the other is a
>>>>>>> plain old assert() call?
>>>>>>
>>>>>> Different API's. The old unix API's, like gettimeofday return -1 on
>>>>>> error and set errno. The "modern" posix APIs, eg pthread APIs and
>>>>>> clock_gettime etc, return the actual error code on error - hence
>>>>>> assert_status can be used to show the actual error in that case.
>>>>>
>>>>> I don't quite get what you're trying to say here.
>>>>> It seems that both calls are trying to verify
>>>>> that "status == 0". Or are you saying that:
>>>>>
>>>>> assert_status(status == 0, status, "gettimeofday");
>>>>>
>>>>> is kind of a waste because "status" always be either "0" or "-1".
>>>>> So how about this:
>>>>>
>>>>> assert_status(status == 0, errno, "gettimeofday");
>>>>>
>>>>> instead? That pattern should work to get more info.
>>>>>
>>>>> Dan
>>>>>
>>>
More information about the hotspot-dev
mailing list