RFR(S): 8170409: CMS: Crash in CardTableModRefBSForCTRS::process_chunk_boundaries

Volker Simonis volker.simonis at gmail.com
Wed Nov 30 16:41:10 UTC 2016


On Tue, Nov 29, 2016 at 11:37 PM,  <dean.long at oracle.com> wrote:
> This looks good to me too, but I'm just wondering if the 2nd load_acquire
> inside the lock is really needed.
>

Yes, we think you're right and the second load_acquire inside the lock
could be replaced by a normal load because entering the lock does an
implicit acquire and leaving the lock does an implicit release.

Do you want me to remove it or can we leave it in for clarity (I don't
think it hurts a lot) ?

Thank you and best regards,
Volker


> dl
>
>
>
> On 11/29/16 10:26 AM, Volker Simonis wrote:
>>
>> On Tue, Nov 29, 2016 at 4:55 PM, Mikael Gerdin <mikael.gerdin at oracle.com>
>> wrote:
>>>
>>> Hi Volker,
>>>
>>> On 2016-11-29 14:59, Volker Simonis wrote:
>>>>
>>>> On Tue, Nov 29, 2016 at 11:58 AM, Mikael Gerdin
>>>> <mikael.gerdin at oracle.com> wrote:
>>>>>
>>>>> Hi Volker,
>>>>>
>>>>> On 2016-11-28 19:33, Volker Simonis wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> can I please have a review and sponsor for the following fix submitted
>>>>>> by gunter.haug at sap.com:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8170409/
>>>>>
>>>>>
>>>>>
>>>>> I haven't looked too closely on the change but one thing I'm curious
>>>>> about
>>>>> just to further my understanding of the issue:
>>>>> Would an alternate solution be to always acquire the
>>>>> ParGCRareEvent_lock
>>>>> before reading _last_LNC_resizing_collection[i] since in that case the
>>>>> mutex
>>>>> semantics would ensure proper ordering of the memory accesses?
>>>>>
>>> Ok, thanks for confirming my thoughts around this.
>>>
>>> I think the change seems ok but what do you think about making
>>> _last_LNC_resizing_collection volatile to further ensure that there's not
>>> funny compiler optimizations going on?
>>>
>> In general we agree that it would be fine to also make
>> _last_LNC_resizing_collection volatile and here's an updated webrev
>> which does exactly that:
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8170409.v2/
>>
>> Our initial reasoning was that the aquiring the ParGCRareEvent_lock
>> implies a C++ "sequence point" which should prevent the compiler from
>> caching the value read from _last_LNC_resizing_collection. But with
>> the "volatile" modifier the code is defintely clearer.
>>
>> @Thomas: can you please use the new webrev for pushing?
>>
>> Thanks,
>> Volker
>>
>>> /Mikael
>>>
>>>
>>>> Hi Mikael,
>>>>
>>>> we've discussed that as well and we think you are right in that always
>>>> acquiring the ParGCRareEvent_lock before reading
>>>> _last_LNC_resizing_collection[i] would solve the problem as well. But
>>>> as ParGCRareEvent_lock is potentially a heavy-weight OS lock that
>>>> would not be a practical solution.
>>>>
>>>> Thanks,
>>>> Volker
>>>>
>>>>> Thanks
>>>>> /Mikael
>>>>>
>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8170409
>>>>>>
>>>>>> We've observed a crash (see bug report for a stack trace) in
>>>>>> CardTableModRefBSForCTRS::process_chunk_boundaries() from time to time
>>>>>> since several years now, but only on non TSO platforms:
>>>>>>
>>>>>> - It only happens in opt builds.
>>>>>> - Analysis of the assembly code revealed the actual crash site to be
>>>>>> an array store to a pointer (_lowest_non_clean) which is an argument
>>>>>> to process_chunk_boundaries()
>>>>>> - The pointer is actually calculated in
>>>>>> CardTableModRefBS::get_LNC_array_for_space() and passed as argument to
>>>>>> CardTableModRefBSForCTRS::process_chunk_boundaries()
>>>>>> - CardTableModRefBS::get_LNC_array_for_space() doesn't enforce TSO on
>>>>>> _last_LNC_resizing_collection[i] so the pointer to an uninitialized
>>>>>> structure (i.e._lowest_non_clean) could become visible to other
>>>>>> threads before the value of _last_LNC_resizing_collection[i].
>>>>>>
>>>>>> Solution:
>>>>>>
>>>>>> Use OrderAccess::load_acquire and OrderAccess::release_store for
>>>>>> accessing _last_LNC_resizing_collection[i] in
>>>>>> CardTableModRefBSForCTRS::get_LNC_array_for_space()
>>>>>>
>>>>>> Thanks you and best regards,
>>>>>> Volker
>>>>>>
>



More information about the hotspot-gc-dev mailing list