RFR: Load balance remembered set scanning [v4]
Y. Srinivas Ramakrishna
ysr at openjdk.org
Mon Aug 8 07:09:45 UTC 2022
On Mon, 1 Aug 2022 21:56:10 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:
>> This branch divides remembered set scanning into smaller units of work so that multiple cores can more effectively share the workload between them. The benefit is to reduce concurrent scan remembered set times and to increase the parallelism of this phase.
>
> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix white space
I noticed this while playing with the code. I haven't reviewed the rest of the changes, but I think this may explain (and potentially fix) the crashes you were seeing.
SPECjbb runs successfully with the fix, but I have not checked whether your original code reproduces the crash you saw with SPECjbb.
src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 127:
> 125: return;
> 126: }
> 127: #endif
The crashes are likely because the loop pulls an assignment off the work list and then, if it sees a cancellation, drops that assignment on the floor and returns without processing it.
I think the check should instead go at the end of the loop body: after the assignment pulled off the work list has been processed, but before the next one is pulled.
Here's a diff that should work:
diff --git a/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp b/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
index 2d7cbbb97dd..03f647f23b2 100644
--- a/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
+++ b/src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp
@@ -81,15 +81,7 @@ void ShenandoahScanRememberedTask::do_work(uint worker_id) {
// set up thread local closure for shen ref processor
_rp->set_mark_closure(worker_id, &cl);
struct ShenandoahRegionChunk assignment;
- bool has_work = _work_list->next(&assignment);
- while (has_work) {
-#ifdef ENABLE_REMEMBERED_SET_CANCELLATION
- // This check is currently disabled to avoid crashes that occur
- // when we try to cancel remembered set scanning
- if (heap->check_cancelled_gc_and_yield(_is_concurrent)) {
- return;
- }
-#endif
+ while (_work_list->next(&assignment)) {
ShenandoahHeapRegion* region = assignment._r;
log_debug(gc)("ShenandoahScanRememberedTask::do_work(%u), processing slice of region "
SIZE_FORMAT " at offset " SIZE_FORMAT ", size: " SIZE_FORMAT,
@@ -101,13 +93,15 @@ void ShenandoahScanRememberedTask::do_work(uint worker_id) {
assert(clusters * cluster_size == assignment._chunk_size, "Chunk assignments must align on cluster boundaries");
HeapWord* end_of_range = region->bottom() + assignment._chunk_offset + assignment._chunk_size;
- // During concurrent mark, region->top() equals TAMS with respect to the current young-gen pass. */
+ // During concurrent mark, region->top() equals TAMS with respect to the current young-gen pass.
if (end_of_range > region->top()) {
end_of_range = region->top();
}
scanner->process_region_slice(region, assignment._chunk_offset, clusters, end_of_range, &cl, false, _is_concurrent);
}
- has_work = _work_list->next(&assignment);
+ if (heap->check_cancelled_gc_and_yield(_is_concurrent)) {
+ return;
+ }
}
}
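
The ordering issue can also be seen in isolation. The following is a minimal standalone sketch, not the Shenandoah code: WorkList, scan_check_before, and scan_check_after are hypothetical names, and a plain atomic flag stands in for heap->check_cancelled_gc_and_yield(). It assumes, as in the patch, that a chunk handed out by next() is never handed out again, so a chunk that is claimed but skipped is never scanned by anyone.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the shared work list: next() hands out each
// chunk index exactly once, so a claimed chunk cannot be claimed again.
class WorkList {
public:
  explicit WorkList(std::size_t size) : _next(0), _size(size) {}

  bool next(std::size_t* out) {
    std::size_t i = _next.fetch_add(1);
    if (i >= _size) {
      return false;
    }
    *out = i;
    return true;
  }

private:
  std::atomic<std::size_t> _next;
  std::size_t _size;
};

// Original ordering: claim a chunk, then check for cancellation.
// On cancellation the chunk just claimed is dropped without being processed.
// Returns the number of chunks this worker claimed.
std::size_t scan_check_before(WorkList& work_list,
                              const std::atomic<bool>& cancelled,
                              std::vector<bool>& processed) {
  std::size_t claimed = 0;
  std::size_t chunk = 0;
  bool has_work = work_list.next(&chunk);
  while (has_work) {
    claimed++;
    if (cancelled.load()) {
      return claimed;  // chunk was claimed but never processed
    }
    processed[chunk] = true;
    has_work = work_list.next(&chunk);
  }
  return claimed;
}

// Proposed ordering: process the claimed chunk first, then check for
// cancellation before claiming the next one. Every claimed chunk is processed.
std::size_t scan_check_after(WorkList& work_list,
                             const std::atomic<bool>& cancelled,
                             std::vector<bool>& processed) {
  std::size_t claimed = 0;
  std::size_t chunk = 0;
  while (work_list.next(&chunk)) {
    claimed++;
    processed[chunk] = true;
    if (cancelled.load()) {
      return claimed;
    }
  }
  return claimed;
}
```

With the flag already set, scan_check_before claims one chunk and processes none, while scan_check_after claims one chunk and processes it. Whether a dropped chunk is in fact the cause of the crashes is still the hypothesis stated above, not something this sketch demonstrates.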
-------------
PR: https://git.openjdk.org/shenandoah/pull/153