RFR(S): 8170409: CMS: Crash in CardTableModRefBSForCTRS::process_chunk_boundaries

Volker Simonis volker.simonis at gmail.com
Tue Nov 29 13:49:25 UTC 2016


Hi Thomas,

thanks a lot for the quick review!

although self-evident, I would nevertheless prefer to let the comment
in just in order to increase the awareness for non-TSO architectures.

But as I'd also need a sponsor (and kindly ask you hereby to be one :)
I'm fine to leave the ultimate decision to him :)

Notice that the fix was contributed by gunter.haug at sap.com so please
leave the "Contributed-by:" field in place.

Regards,
Volker

On Tue, Nov 29, 2016 at 11:43 AM, Thomas Schatzl
<thomas.schatzl at oracle.com> wrote:
> Hi,
>
> On Mon, 2016-11-28 at 19:33 +0100, Volker Simonis wrote:
>> Hi,
>>
>> can I please have a review and sponsor for the following fix
>> submitted
>> by gunter.haug at sap.com:
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8170409/
>> https://bugs.openjdk.java.net/browse/JDK-8170409
>>
>> We've observed a crash (see bug report for a stack trace) in
>> CardTableModRefBSForCTRS::process_chunk_boundaries() from time to
>> time since several years now, but only on non TSO platforms:
>>
>> - It only happens in opt builds.
>> - Analysis of the assembly code revealed the actual crash site to be
>> an array store to a pointer (_lowest_non_clean) which is an argument
>> to process_chunk_boundaries()
>> - The pointer is actually calculated in
>> CardTableModRefBS::get_LNC_array_for_space() and passed as argument
>> to
>> CardTableModRefBSForCTRS::process_chunk_boundaries()
>> - CardTableModRefBS::get_LNC_array_for_space() doesn't enforce TSO on
>> _last_LNC_resizing_collection[i] so the pointer to an uninitialized
>> structure (i.e._lowest_non_clean) could become visible to other
>> threads before the value of _last_LNC_resizing_collection[i].
>>
>> Solution:
>>
>> Use OrderAccess::load_acquire and OrderAccess::release_store for
>> accessing _last_LNC_resizing_collection[i] in
>> CardTableModRefBSForCTRS::get_LNC_array_for_space()
>>
>
>   looks good to me. Nice catch.
>
> During my review I did look at other uses of
> _last_LNC_resizing_collection and friends, aparently no issues there.
> I also looked at what parallel gc does, and while it implements the
> same idea of slicing the card table in chunks for parallel card table
> scan, it does not need or use the LNC tables (of course). That
> implementation also does not seem to use any similar shared data
> structure, so it should be good (in
> CardTableExtension::scavenge_contents_parallel).
>
> One comment: maybe the "Therefore use acquire/release to guarantee this
> in non TSO architectures." sentence in the added comment could be
> removed.
>
> Adding appropriate code so that collectors work on supported
> architectures seems to be an obvious thing to do at this point, and
> should not need to be argued for.
>
> However you can leave that sentence in if you really want. I do not
> need a re-review if you removed it before pushing either.
>
> Thanks,
>   Thomas
>



More information about the hotspot-gc-dev mailing list