RFR(S): 8170409: CMS: Crash in CardTableModRefBSForCTRS::process_chunk_boundaries

dean.long at oracle.com dean.long at oracle.com
Wed Nov 30 21:43:45 UTC 2016


I'm fine with leaving it in.  How about adding a comment saying it's 
there for clarity?

dl


On 11/30/16 8:41 AM, Volker Simonis wrote:
> On Tue, Nov 29, 2016 at 11:37 PM,  <dean.long at oracle.com> wrote:
>> This looks good to me too, but I'm just wondering if the 2nd load_acquire
>> inside the lock is really needed.
>>
> Yes, we think you're right and the second load_acquire inside the lock
> could be replaced by a normal load because entering the lock does an
> implicit acquire and leaving the lock does an implicit release.
>
> Do you want me to remove it or can we leave it in for clarity (I don't
> think it hurts a lot) ?
>
> Thank you and best regards,
> Volker
>
>
>> dl
>>
>>
>>
>> On 11/29/16 10:26 AM, Volker Simonis wrote:
>>> On Tue, Nov 29, 2016 at 4:55 PM, Mikael Gerdin <mikael.gerdin at oracle.com>
>>> wrote:
>>>> Hi Volker,
>>>>
>>>> On 2016-11-29 14:59, Volker Simonis wrote:
>>>>> On Tue, Nov 29, 2016 at 11:58 AM, Mikael Gerdin
>>>>> <mikael.gerdin at oracle.com> wrote:
>>>>>> Hi Volker,
>>>>>>
>>>>>> On 2016-11-28 19:33, Volker Simonis wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> can I please have a review and sponsor for the following fix submitted
>>>>>>> by gunter.haug at sap.com:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8170409/
>>>>>>
>>>>>>
>>>>>> I haven't looked too closely on the change but one thing I'm curious
>>>>>> about
>>>>>> just to further my understanding of the issue:
>>>>>> Would an alternate solution be to always acquire the
>>>>>> ParGCRareEvent_lock
>>>>>> before reading _last_LNC_resizing_collection[i] since in that case the
>>>>>> mutex
>>>>>> semantics would ensure proper ordering of the memory accesses?
>>>>>>
>>>> Ok, thanks for confirming my thoughts around this.
>>>>
>>>> I think the change seems ok but what do you think about making
>>>> _last_LNC_resizing_collection volatile to further ensure that there's not
>>>> funny compiler optimizations going on?
>>>>
>>> In general we agree that it would be fine to also make
>>> _last_LNC_resizing_collection volatile and here's an updated webrev
>>> which does exactly that:
>>>
>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8170409.v2/
>>>
>>> Our initial reasoning was that the aquiring the ParGCRareEvent_lock
>>> implies a C++ "sequence point" which should prevent the compiler from
>>> caching the value read from _last_LNC_resizing_collection. But with
>>> the "volatile" modifier the code is defintely clearer.
>>>
>>> @Thomas: can you please use the new webrev for pushing?
>>>
>>> Thanks,
>>> Volker
>>>
>>>> /Mikael
>>>>
>>>>
>>>>> Hi Mikael,
>>>>>
>>>>> we've discussed that as well and we think you are right in that always
>>>>> acquiring the ParGCRareEvent_lock before reading
>>>>> _last_LNC_resizing_collection[i] would solve the problem as well. But
>>>>> as ParGCRareEvent_lock is potentially a heavy-weight OS lock that
>>>>> would not be a practical solution.
>>>>>
>>>>> Thanks,
>>>>> Volker
>>>>>
>>>>>> Thanks
>>>>>> /Mikael
>>>>>>
>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8170409
>>>>>>>
>>>>>>> We've observed a crash (see bug report for a stack trace) in
>>>>>>> CardTableModRefBSForCTRS::process_chunk_boundaries() from time to time
>>>>>>> since several years now, but only on non TSO platforms:
>>>>>>>
>>>>>>> - It only happens in opt builds.
>>>>>>> - Analysis of the assembly code revealed the actual crash site to be
>>>>>>> an array store to a pointer (_lowest_non_clean) which is an argument
>>>>>>> to process_chunk_boundaries()
>>>>>>> - The pointer is actually calculated in
>>>>>>> CardTableModRefBS::get_LNC_array_for_space() and passed as argument to
>>>>>>> CardTableModRefBSForCTRS::process_chunk_boundaries()
>>>>>>> - CardTableModRefBS::get_LNC_array_for_space() doesn't enforce TSO on
>>>>>>> _last_LNC_resizing_collection[i] so the pointer to an uninitialized
>>>>>>> structure (i.e._lowest_non_clean) could become visible to other
>>>>>>> threads before the value of _last_LNC_resizing_collection[i].
>>>>>>>
>>>>>>> Solution:
>>>>>>>
>>>>>>> Use OrderAccess::load_acquire and OrderAccess::release_store for
>>>>>>> accessing _last_LNC_resizing_collection[i] in
>>>>>>> CardTableModRefBSForCTRS::get_LNC_array_for_space()
>>>>>>>
>>>>>>> Thanks you and best regards,
>>>>>>> Volker
>>>>>>>




More information about the hotspot-gc-dev mailing list