RFR: Load balance remembered set scanning [v4]

Y. Srinivas Ramakrishna ysr at openjdk.org
Mon Aug 8 16:11:47 UTC 2022


On Mon, 8 Aug 2022 13:01:09 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:

>> src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 127:
>> 
>>> 125:       return;
>>> 126:     }
>>> 127: #endif
>> 
>> The crashes are likely because you are pulling an assignment off the worklist, and if you see a cancellation, you are dropping it on the floor and returning.
>> 
>> I think the check should be at the end of the loop after the assignment pulled off the work list has been processed but before the next one is pulled.
>> 
>> Here's a diff that should work:
>> 
>> 
>> diff --git a/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp b/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
>> index 2d7cbbb97dd..03f647f23b2 100644
>> --- a/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
>> +++ b/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
>> @@ -81,15 +81,7 @@ void ShenandoahScanRememberedTask::do_work(uint worker_id) {
>>    // set up thread local closure for shen ref processor
>>    _rp->set_mark_closure(worker_id, &cl);
>>    struct ShenandoahRegionChunk assignment;
>> -  bool has_work = _work_list->next(&assignment);
>> -  while (has_work) {
>> -#ifdef ENABLE_REMEMBERED_SET_CANCELLATION
>> -    // This check is currently disabled to avoid crashes that occur
>> -    // when we try to cancel remembered set scanning
>> -    if (heap->check_cancelled_gc_and_yield(_is_concurrent)) {
>> -      return;
>> -    }
>> -#endif
>> +  while (_work_list->next(&assignment)) {
>>      ShenandoahHeapRegion* region = assignment._r;
>>      log_debug(gc)("ShenandoahScanRememberedTask::do_work(%u), processing slice of region "
>>                    SIZE_FORMAT " at offset " SIZE_FORMAT ", size: " SIZE_FORMAT,
>> @@ -101,13 +93,15 @@ void ShenandoahScanRememberedTask::do_work(uint worker_id) {
>>        assert(clusters * cluster_size == assignment._chunk_size, "Chunk assignments must align on cluster boundaries");
>>        HeapWord* end_of_range = region->bottom() + assignment._chunk_offset + assignment._chunk_size;
>>  
>> -      // During concurrent mark, region->top() equals TAMS with respect to the current young-gen pass.  */
>> +      // During concurrent mark, region->top() equals TAMS with respect to the current young-gen pass.
>>        if (end_of_range > region->top()) {
>>          end_of_range = region->top();
>>        }
>>        scanner->process_region_slice(region, assignment._chunk_offset, clusters, end_of_range, &cl, false, _is_concurrent);
>>      }
>> -    has_work = _work_list->next(&assignment);
>> +    if (heap->check_cancelled_gc_and_yield(_is_concurrent)) {
>> +      return;
>> +    }
>>    }
>>  }
>
> Thanks very much for figuring out this problem.  I'd like to address this change in a different pull request to keep the history cleaner.

👍

-------------

PR: https://git.openjdk.org/shenandoah/pull/153


More information about the shenandoah-dev mailing list