RFR: 8351091: Shenandoah: global marking context completeness is not accurately maintained

Xiaolong Peng xpeng at openjdk.org
Tue Mar 4 21:26:04 UTC 2025


On Tue, 4 Mar 2025 17:48:58 GMT, William Kemper <wkemper at openjdk.org> wrote:

>> With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause expected behavior: [ShenandoahHeap::complete_marking_context()](https://github.com/openjdk/jdk/pull/23886/files#diff-d5ddf298c36b1c91bf33f9bff7bedcc063074edd68c298817f1fdf39d2ed970fL642) should throw assert error if the global marking context completeness flag is false, but now it always return the marking context even it marking is not complete, this may hide bugs where we expect the global/generational marking to be completed.
>> 
>> This change PR fix the bug in global marking context completeness flag, and update all the places using `ShenandoahHeap::complete_marking_context()` to use proper API.
>> 
>> ### Test
>> - [x] hotspot_gc_shenandoah
>> - [x] Tier 1
>> - [ ]  Tier 2
>
> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2837:
> 
>> 2835:   } else if (affiliation == OLD_GENERATION) {
>> 2836:     return old_generation();
>> 2837:   } else if (affiliation == FREE) {
> 
> I don't think it makes sense to connect `FREE` regions to the global generation in this way. Free regions are _not_ affiliated with any generation. I think in some of these cases where you want to find the mark context, it would be possible to take it from a `_generation` member variable.

Yeah, I don't think it is necessary to change the behavior here either, I'll remove it in later update.

> src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.cpp line 337:
> 
>> 335:   // If generation is young and referent is in old, marking context of the old
>> 336:   // may or may not be complete, we can safely drop the reference when old gen mark is complete.
>> 337:   if (_generation->is_young() && referent_region->is_old()) {
> 
> Have you seen this happen? The reference processor for each generation is only supposed to discover references for which the referent is in the collected generation. See `ShenandoahReferenceProcessor::should_discover`:
> 
>   if (!heap->is_in_active_generation(referent)) {
>     log_trace(gc,ref)("Referent outside of active generation: " PTR_FORMAT, p2i(referent));
>     return false;
>   }

Ok, I didn't see happen in any of the jtreg tests yet.

Just base on the the behavior we saw in old gc, I assumed this could happen. Now I am more curious about the real cause of the crash caused by reference from old to young, since we always check if the referent is in the active generation, that shouldn't have happened if it works as described, my feeling is there might be something fishy in the place where we use `_active_generation`(the comments it should be update only in the STW phases), maybe should we should get rid of it, currently we directly use _gc_generation in many places as well, not sure it if is possible to cause inconsistency.

I'll revert this part, I'll follow up on the questions in separate work.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1979976173
PR Review Comment: https://git.openjdk.org/jdk/pull/23886#discussion_r1980145388


More information about the hotspot-gc-dev mailing list