RFR: Load balance remembered set scanning [v4]
Kelvin Nilsen
kdnilsen at openjdk.org
Mon Aug 8 13:04:58 UTC 2022
On Mon, 8 Aug 2022 07:03:34 GMT, Y. Srinivas Ramakrishna <ysr at openjdk.org> wrote:
>> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix white space
>
> src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 127:
>
>> 125: return;
>> 126: }
>> 127: #endif
>
> The crashes are likely because you are pulling an assignment off the worklist, and if you see a cancellation, you are dropping it on the floor and returning.
>
> I think the check should be at the end of the loop after the assignment pulled off the work list has been processed but before the next one is pulled.
>
> Here's a diff that should work:
>
>
> diff --git a/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp b/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
> index 2d7cbbb97dd..03f647f23b2 100644
> --- a/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
> +++ b/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
> @@ -81,15 +81,7 @@ void ShenandoahScanRememberedTask::do_work(uint worker_id) {
> // set up thread local closure for shen ref processor
> _rp->set_mark_closure(worker_id, &cl);
> struct ShenandoahRegionChunk assignment;
> - bool has_work = _work_list->next(&assignment);
> - while (has_work) {
> -#ifdef ENABLE_REMEMBERED_SET_CANCELLATION
> - // This check is currently disabled to avoid crashes that occur
> - // when we try to cancel remembered set scanning
> - if (heap->check_cancelled_gc_and_yield(_is_concurrent)) {
> - return;
> - }
> -#endif
> + while (_work_list->next(&assignment)) {
> ShenandoahHeapRegion* region = assignment._r;
> log_debug(gc)("ShenandoahScanRememberedTask::do_work(%u), processing slice of region "
> SIZE_FORMAT " at offset " SIZE_FORMAT ", size: " SIZE_FORMAT,
> @@ -101,13 +93,15 @@ void ShenandoahScanRememberedTask::do_work(uint worker_id) {
> assert(clusters * cluster_size == assignment._chunk_size, "Chunk assignments must align on cluster boundaries");
> HeapWord* end_of_range = region->bottom() + assignment._chunk_offset + assignment._chunk_size;
>
> - // During concurrent mark, region->top() equals TAMS with respect to the current young-gen pass. */
> + // During concurrent mark, region->top() equals TAMS with respect to the current young-gen pass.
> if (end_of_range > region->top()) {
> end_of_range = region->top();
> }
> scanner->process_region_slice(region, assignment._chunk_offset, clusters, end_of_range, &cl, false, _is_concurrent);
> }
> - has_work = _work_list->next(&assignment);
> + if (heap->check_cancelled_gc_and_yield(_is_concurrent)) {
> + return;
> + }
> }
> }
Thanks very much for figuring out this problem. I'd like to address this change in a different pull request to keep the history cleaner.
-------------
PR: https://git.openjdk.org/shenandoah/pull/153
More information about the shenandoah-dev
mailing list