RFR: 8352185: Shenandoah: Invalid logic for remembered set verification
Xiaolong Peng
xpeng at openjdk.org
Tue Mar 18 22:18:44 UTC 2025
On Tue, 18 Mar 2025 01:25:11 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:
>> There are some scenarios in which GenShen may have improper remembered set verification logic:
>>
>> 1. Concurrent young cycles following a Full GC:
>>
>> In the end of ShenandoahFullGC, it resets bitmaps for the entire heap w/o resetting marking context to be incomplete, but ShenandoahVerifier has code like below to get a complete old marking context for remembered set verification
>>
>>
>> ShenandoahVerifier
>> ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() {
>> shenandoah_assert_generations_reconciled();
>> if (_heap->old_generation()->is_mark_complete() || _heap->gc_generation()->is_global()) {
>> return _heap->complete_marking_context();
>> }
>> return nullptr;
>> }
>>
>>
>> For the concurrent young GC cycles after a full GC, the old marking context used for remembered set verification is stale, and may cause unexpected result.
>>
>> 2. For the impl of `ShenandoahVerifier::get_marking_context_for_old` mentioned above, it always return a marking context for global GC, but marking bitmaps is already reset before before init-mark, `ShenandoahVerifier::help_verify_region_rem_set` always skip verification in this case.
>>
>> 3. ShenandoahConcurrentGC always clean remembered set read table, but only swap read/write table when gc generation is young, this issue causes remembered set verification before init-mark to use a completely clean remembered set, but it is covered by issue 2.
>>
>>
>> ### Test
>> - [x] `make test TEST=hotspot_gc_shenandoah`
>
> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 648:
>
>> 646: assert(!heap->has_forwarded_objects(), "No forwarded objects on this path");
>> 647:
>> 648: if (heap->mode()->is_generational()) {
>
> I think we do not want to change this code. We only swap remembered set for young-gen because gen will scan the remset and reconstruct it with more updated information. For a global GC, we do not scan the remset and do not reconstruct it. If we swap here, we will lose the information that is currently within the remset.
Thanks for for explanation, I have been reading and trying the understand how the remembered set works in GenShen. I wasn't sure whether this is actually right.
In generational mode, if the GC cycle is global, the read table is already cleaned during reset phase, so remembered set verification from `verify_before_concmark` and `verify_before_update_refs` shouldn't work properly, I think the remembered set verification before mark and update references should be disabled, what do you think? Meanwhile, there is no need to clean read table during global cycle in generational mode.
> src/hotspot/share/gc/shenandoah/shenandoahFullGC.cpp line 953:
>
>> 951: // pinned regions.
>> 952: if (!r->is_pinned()) {
>> 953: _heap->marking_context()->reset_top_at_mark_start(r);
>
> Here, and below, I think we want to keep complete_marking_context() rather than changing to marking_context()
The marking context is not complete anymore after ShenandoahMCResetCompleteBitmapTask, but ShenandoahMCResetCompleteBitmapTask only reset bitmaps for the regions w/o pinned objects, the place calling `set_mark_incomplete()` need to moved to some place after ShenandoahPostCompactClosure being executed if use complete_marking_context here.
> src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 1281:
>
>> 1279: ShenandoahMarkingContext* ShenandoahVerifier::get_marking_context_for_old() {
>> 1280: shenandoah_assert_generations_reconciled();
>> 1281: if (_heap->old_generation()->is_mark_complete()) {
>
> In the case that this is an global GC, we know that old-generation()->is_mark_complete() by virtue of the current program counter, I assume. (We would only ask for the old marking context if global marking were already finished.) In the case that we are doing a global GC cycle, I'm guessing that we do not set is-mark-complete for the old generation. So that's why I believe you need to keep the condition as originally written.
If it is global GC in generational mode, old-generation()->is_mark_complete() is always false after reset and before mark because bitmaps of the entire heap including old gen has been reset during concurrent reset phase, so old mark is not finished in when verify_before_concmark is called. The marking context return by this method is only used for remembered set verification, but as I pointed out in the first comments, we shouldn't do remembered set verification in such case because the rem-set read table is already cleaned/stale.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2000354419
PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2000361095
PR Review Comment: https://git.openjdk.org/jdk/pull/24092#discussion_r2000371950
More information about the shenandoah-dev
mailing list