RFR: 6900441 PlatformEvent.park(millis) on Linux could still be affected by changes to the time-of-day clock
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Sep 13 10:03:55 PDT 2013
> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v4/
I think you want to get some other eyes on this change. This is
my fourth review and I'm not catching everything. Or I'm catching
something, but not for exactly the right reasons.
src/os/linux/vm/os_linux.hpp
No comments.
src/os/linux/vm/os_linux.cpp
No comments.
On 9/12/13 9:23 PM, David Holmes wrote:
> <sigh> Still not right. I confused the PlatformEvent logic with the
> Parker logic and thought that the check "s < 1" indicated a thread was
> definitely blocked; but it doesn't it only means that a thread may be
> blocked. Hence if the thread is not blocked _cur_index == -1 and the
> assert trips. So remove the assert and add a check for _cur_index !=
> -1, as that means a thread is definitely blocked.
Yup. So it was possible for the assert() in unpark() to race
with the value of _cur_index. Not for the reason I thought so
not my catch here.
> Updated webrev:
>
> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v4/
>
> but testing still TBD so hold off on further reviews.
>
> The logic is messy now with yet-another mutex-unlock having to be
> added, but I wanted to minimize the potential change - let me know if
> you think it worthwhile refactoring.
Strangely enough, the new version seems more clear to me.
Please don't refactor; that can wait for JDK9.
Dan
>
> Thanks,
> David
>
> On 13/09/2013 10:59 AM, David Holmes wrote:
>> On 13/09/2013 9:30 AM, Daniel D. Daugherty wrote:
>>> On 9/12/13 3:47 PM, David Holmes wrote:
>>>>
>>>> I found a bug in the assertion related logic - new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v3/
>>>>
>>>> For the untimed case in park I wasn't setting the _cur_index value (I
>>>> was implicitly using zero) and so the assertion it wasn't -1 was
>>>> failing in unpark.
>>>
>>> So is the usage model for timed park() such that no other thread
>>> is calling unpark() on the thread that called timed park()?
>>>
>>> If another thread can call unpark() on a thread calling timed park(),
>>> then I think you may have a race in:
>>>
>>> 5917 assert(_cur_index != -1, "invariant");
>>>
>>> The thread calling unpark() might reach the assertion after the
>>> thread that made the timed call to park() set:
>>>
>>> 5888 _cur_index = -1;
>>>
>>> Don't know if it's really possible, but...
>>
>> The code you refer to is all inside a critical region locked with the
>> mutex. No races possible.
>>
>>
>>>> With regard to assert versus assert_status - yes this would work:
>>>>
>>>> assert_status(status == 0, errno, "gettimeofday");
>>>>
>>>> I had never considered that. :) When I added assert_status to diagnoze
>>>> a particular issue I never attempted to change all the existing
>>>> asserts on the library calls. So we have a lot of simple
>>>> "assert(status==0,"invariant") where assert_status could have been
>>>> used, as well as the old gettimeofday style calls. Do you want me to
>>>> change them all in os_linux.cpp?
>>>
>>> I guess that depends on whether we think this fix will need to be
>>> backported to HSX-24 or earlier... Less changes, the better, if a
>>> backport is in the future...
>>
>> Yes backports to 7 and 6 are definitely on the cards. The
>> assert/assert_status can be cleaned up as part of the big cleanup for 9
>> (where a lot of the calls will disappear anyway as we get rid of all the
>> ancient workarounds). So I will leave this aspect as is.
>>
>> Thanks,
>> David
>>
>>> Dan
>>>
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 13/09/2013 1:14 AM, Daniel D. Daugherty wrote:
>>>>> On 9/11/13 5:14 PM, David Holmes wrote:
>>>>>> updated webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v2/
>>>>>
>>>>> Looks good. Possibly modulo one comment below...
>>>>>
>>>>>
>>>>>> On 12/09/2013 7:10 AM, Daniel D. Daugherty wrote:
>>>>>>
>>>>>>> line 5541: assert_status(status == 0, status,
>>>>>>> "clock_gettime");
>>>>>>> line 5553: assert(status == 0, "gettimeofday");
>>>>>>> So why is one an assert_status() call and the other is a
>>>>>>> plain old assert() call?
>>>>>>
>>>>>> Different API's. The old unix API's, like gettimeofday return -1 on
>>>>>> error and set errno. The "modern" posix APIs, eg pthread APIs and
>>>>>> clock_gettime etc, return the actual error code on error - hence
>>>>>> assert_status can be used to show the actual error in that case.
>>>>>
>>>>> I don't quite get what you're trying to say here.
>>>>> It seems that both calls are trying to verify
>>>>> that "status == 0". Or are you saying that:
>>>>>
>>>>> assert_status(status == 0, status, "gettimeofday");
>>>>>
>>>>> is kind of a waste because "status" always be either "0" or "-1".
>>>>> So how about this:
>>>>>
>>>>> assert_status(status == 0, errno, "gettimeofday");
>>>>>
>>>>> instead? That pattern should work to get more info.
>>>>>
>>>>> Dan
>>>>>
>>>
>
>
More information about the hotspot-dev
mailing list