RFR: 6900441 PlatformEvent.park(millis) on Linux could still be affected by changes to the time-of-day clock
David Holmes
david.holmes at oracle.com
Fri Sep 13 16:41:13 PDT 2013
On 14/09/2013 3:03 AM, Daniel D. Daugherty wrote:
> > http://cr.openjdk.java.net/~dholmes/6900441/webrev.v4/
>
> I think you want to get some other eyes on this change. This is
> my fourth review and I'm not catching everything. Or I'm catching
> something, but not for exactly the right reasons.
Still need that second reviewer anyway.
My fault for not doing full debug testing before asking for review.
> src/os/linux/vm/os_linux.hpp
> No comments.
>
> src/os/linux/vm/os_linux.cpp
> No comments.
>
>
> On 9/12/13 9:23 PM, David Holmes wrote:
>> <sigh> Still not right. I confused the PlatformEvent logic with the
>> Parker logic and thought that the check "s < 1" indicated a thread was
>> definitely blocked; but it doesn't it only means that a thread may be
>> blocked. Hence if the thread is not blocked _cur_index == -1 and the
>> assert trips. So remove the assert and add a check for _cur_index !=
>> -1, as that means a thread is definitely blocked.
>
> Yup. So it was possible for the assert() in unpark() to race
> with the value of _cur_index. Not for the reason I thought so
> not my catch here.
Not a race simply a logic error. unpark() is often issued even if the
thread is not parked, but the assert assumed a thread had to be parked.
>
>> Updated webrev:
>>
>> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v4/
>>
>> but testing still TBD so hold off on further reviews.
>>
>> The logic is messy now with yet-another mutex-unlock having to be
>> added, but I wanted to minimize the potential change - let me know if
>> you think it worthwhile refactoring.
>
> Strangely enough, the new version seems more clear to me.
> Please don't refactor; that can wait for JDK9.
Okay - thanks for re-review. We actually get a performance boost with
the new code because now Parker (like PlatformEvent) skips the
pthread_cond_signal if no thread is waiting.
David
> Dan
>
>>
>> Thanks,
>> David
>>
>> On 13/09/2013 10:59 AM, David Holmes wrote:
>>> On 13/09/2013 9:30 AM, Daniel D. Daugherty wrote:
>>>> On 9/12/13 3:47 PM, David Holmes wrote:
>>>>>
>>>>> I found a bug in the assertion related logic - new webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v3/
>>>>>
>>>>> For the untimed case in park I wasn't setting the _cur_index value (I
>>>>> was implicitly using zero) and so the assertion it wasn't -1 was
>>>>> failing in unpark.
>>>>
>>>> So is the usage model for timed park() such that no other thread
>>>> is calling unpark() on the thread that called timed park()?
>>>>
>>>> If another thread can call unpark() on a thread calling timed park(),
>>>> then I think you may have a race in:
>>>>
>>>> 5917 assert(_cur_index != -1, "invariant");
>>>>
>>>> The thread calling unpark() might reach the assertion after the
>>>> thread that made the timed call to park() set:
>>>>
>>>> 5888 _cur_index = -1;
>>>>
>>>> Don't know if it's really possible, but...
>>>
>>> The code you refer to is all inside a critical region locked with the
>>> mutex. No races possible.
>>>
>>>
>>>>> With regard to assert versus assert_status - yes this would work:
>>>>>
>>>>> assert_status(status == 0, errno, "gettimeofday");
>>>>>
>>>>> I had never considered that. :) When I added assert_status to diagnoze
>>>>> a particular issue I never attempted to change all the existing
>>>>> asserts on the library calls. So we have a lot of simple
>>>>> "assert(status==0,"invariant") where assert_status could have been
>>>>> used, as well as the old gettimeofday style calls. Do you want me to
>>>>> change them all in os_linux.cpp?
>>>>
>>>> I guess that depends on whether we think this fix will need to be
>>>> backported to HSX-24 or earlier... Less changes, the better, if a
>>>> backport is in the future...
>>>
>>> Yes backports to 7 and 6 are definitely on the cards. The
>>> assert/assert_status can be cleaned up as part of the big cleanup for 9
>>> (where a lot of the calls will disappear anyway as we get rid of all the
>>> ancient workarounds). So I will leave this aspect as is.
>>>
>>> Thanks,
>>> David
>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>> On 13/09/2013 1:14 AM, Daniel D. Daugherty wrote:
>>>>>> On 9/11/13 5:14 PM, David Holmes wrote:
>>>>>>> updated webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dholmes/6900441/webrev.v2/
>>>>>>
>>>>>> Looks good. Possibly modulo one comment below...
>>>>>>
>>>>>>
>>>>>>> On 12/09/2013 7:10 AM, Daniel D. Daugherty wrote:
>>>>>>>
>>>>>>>> line 5541: assert_status(status == 0, status,
>>>>>>>> "clock_gettime");
>>>>>>>> line 5553: assert(status == 0, "gettimeofday");
>>>>>>>> So why is one an assert_status() call and the other is a
>>>>>>>> plain old assert() call?
>>>>>>>
>>>>>>> Different API's. The old unix API's, like gettimeofday return -1 on
>>>>>>> error and set errno. The "modern" posix APIs, eg pthread APIs and
>>>>>>> clock_gettime etc, return the actual error code on error - hence
>>>>>>> assert_status can be used to show the actual error in that case.
>>>>>>
>>>>>> I don't quite get what you're trying to say here.
>>>>>> It seems that both calls are trying to verify
>>>>>> that "status == 0". Or are you saying that:
>>>>>>
>>>>>> assert_status(status == 0, status, "gettimeofday");
>>>>>>
>>>>>> is kind of a waste because "status" always be either "0" or "-1".
>>>>>> So how about this:
>>>>>>
>>>>>> assert_status(status == 0, errno, "gettimeofday");
>>>>>>
>>>>>> instead? That pattern should work to get more info.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>
>>
>>
>
More information about the hotspot-dev
mailing list