RFR: Load balance remembered set scanning [v4]
Y. Srinivas Ramakrishna
ysr at openjdk.org
Mon Aug 8 16:11:47 UTC 2022
On Mon, 8 Aug 2022 13:01:09 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:
>> src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 127:
>>
>>> 125: return;
>>> 126: }
>>> 127: #endif
>>
>> The crashes are likely because you are pulling an assignment off the worklist, and if you see a cancellation, you are dropping it on the floor and returning.
>>
>> I think the check should be at the end of the loop after the assignment pulled off the work list has been processed but before the next one is pulled.
>>
>> Here's a diff that should work:
>>
>>
>> diff --git a/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp b/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
>> index 2d7cbbb97dd..03f647f23b2 100644
>> --- a/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
>> +++ b/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
>> @@ -81,15 +81,7 @@ void ShenandoahScanRememberedTask::do_work(uint worker_id) {
>> // set up thread local closure for shen ref processor
>> _rp->set_mark_closure(worker_id, &cl);
>> struct ShenandoahRegionChunk assignment;
>> - bool has_work = _work_list->next(&assignment);
>> - while (has_work) {
>> -#ifdef ENABLE_REMEMBERED_SET_CANCELLATION
>> - // This check is currently disabled to avoid crashes that occur
>> - // when we try to cancel remembered set scanning
>> - if (heap->check_cancelled_gc_and_yield(_is_concurrent)) {
>> - return;
>> - }
>> -#endif
>> + while (_work_list->next(&assignment)) {
>> ShenandoahHeapRegion* region = assignment._r;
>> log_debug(gc)("ShenandoahScanRememberedTask::do_work(%u), processing slice of region "
>> SIZE_FORMAT " at offset " SIZE_FORMAT ", size: " SIZE_FORMAT,
>> @@ -101,13 +93,15 @@ void ShenandoahScanRememberedTask::do_work(uint worker_id) {
>> assert(clusters * cluster_size == assignment._chunk_size, "Chunk assignments must align on cluster boundaries");
>> HeapWord* end_of_range = region->bottom() + assignment._chunk_offset + assignment._chunk_size;
>>
>> - // During concurrent mark, region->top() equals TAMS with respect to the current young-gen pass. */
>> + // During concurrent mark, region->top() equals TAMS with respect to the current young-gen pass.
>> if (end_of_range > region->top()) {
>> end_of_range = region->top();
>> }
>> scanner->process_region_slice(region, assignment._chunk_offset, clusters, end_of_range, &cl, false, _is_concurrent);
>> }
>> - has_work = _work_list->next(&assignment);
>> + if (heap->check_cancelled_gc_and_yield(_is_concurrent)) {
>> + return;
>> + }
>> }
>> }
>
> Thanks very much for figuring out this problem. I'd like to address this change in a different pull request to keep the history cleaner.
👍
-------------
PR: https://git.openjdk.org/shenandoah/pull/153
More information about the shenandoah-dev
mailing list