RFR(S): 8170409: CMS: Crash in CardTableModRefBSForCTRS::process_chunk_boundaries
Mikael Gerdin
mikael.gerdin at oracle.com
Tue Nov 29 15:55:45 UTC 2016
Hi Volker,
On 2016-11-29 14:59, Volker Simonis wrote:
> On Tue, Nov 29, 2016 at 11:58 AM, Mikael Gerdin
> <mikael.gerdin at oracle.com> wrote:
>> Hi Volker,
>>
>> On 2016-11-28 19:33, Volker Simonis wrote:
>>>
>>> Hi,
>>>
>>> can I please have a review and sponsor for the following fix submitted
>>> by gunter.haug at sap.com:
>>>
>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8170409/
>>
>>
>> I haven't looked too closely on the change but one thing I'm curious about
>> just to further my understanding of the issue:
>> Would an alternate solution be to always acquire the ParGCRareEvent_lock
>> before reading _last_LNC_resizing_collection[i] since in that case the mutex
>> semantics would ensure proper ordering of the memory accesses?
>>
Ok, thanks for confirming my thoughts around this.
I think the change seems ok but what do you think about making
_last_LNC_resizing_collection volatile to further ensure that there's
not funny compiler optimizations going on?
/Mikael
>
> Hi Mikael,
>
> we've discussed that as well and we think you are right in that always
> acquiring the ParGCRareEvent_lock before reading
> _last_LNC_resizing_collection[i] would solve the problem as well. But
> as ParGCRareEvent_lock is potentially a heavy-weight OS lock that
> would not be a practical solution.
>
> Thanks,
> Volker
>
>> Thanks
>> /Mikael
>>
>>
>>> https://bugs.openjdk.java.net/browse/JDK-8170409
>>>
>>> We've observed a crash (see bug report for a stack trace) in
>>> CardTableModRefBSForCTRS::process_chunk_boundaries() from time to time
>>> since several years now, but only on non TSO platforms:
>>>
>>> - It only happens in opt builds.
>>> - Analysis of the assembly code revealed the actual crash site to be
>>> an array store to a pointer (_lowest_non_clean) which is an argument
>>> to process_chunk_boundaries()
>>> - The pointer is actually calculated in
>>> CardTableModRefBS::get_LNC_array_for_space() and passed as argument to
>>> CardTableModRefBSForCTRS::process_chunk_boundaries()
>>> - CardTableModRefBS::get_LNC_array_for_space() doesn't enforce TSO on
>>> _last_LNC_resizing_collection[i] so the pointer to an uninitialized
>>> structure (i.e._lowest_non_clean) could become visible to other
>>> threads before the value of _last_LNC_resizing_collection[i].
>>>
>>> Solution:
>>>
>>> Use OrderAccess::load_acquire and OrderAccess::release_store for
>>> accessing _last_LNC_resizing_collection[i] in
>>> CardTableModRefBSForCTRS::get_LNC_array_for_space()
>>>
>>> Thanks you and best regards,
>>> Volker
>>>
>>
More information about the hotspot-gc-dev
mailing list