RFR: 8251570: 8215624 causes assert(worker_id < _n_workers) failed: Invalid worker_id
Stefan Karlsson
stefan.karlsson at oracle.com
Fri Aug 14 18:25:37 UTC 2020
Hi all,
Further manual testing shows that this patch asserts when run with:
makec ../build/fastdebug-shenandoah/ test
TEST="java/util/logging/TestLoggerWeakRefLeak.java"
JTREG="JAVA_OPTIONS=-XX:+UseG1GC -XX:ParallelGCThreads=1"
# guarantee(num_workers <= total_workers()) failed: Trying to execute
task Iterating heap with 12 workers which is more than the amount of
total workers 1.
V [libjvm.so+0x19cf977] WorkGang::run_task(AbstractGangTask*, unsigned
int, bool)+0x457
V [libjvm.so+0xcbf38a] HeapInspection::populate_table(KlassInfoTable*,
BoolObjectClosure*, unsigned int)+0x60a
V [libjvm.so+0xcbf91c] HeapInspection::heap_inspection(outputStream*,
unsigned int)+0xcc
V [libjvm.so+0xc44fa5] VM_GC_HeapInspection::doit()+0x45
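For context, the guarantee that fires is the sanity check at the top of
WorkGang::run_task(). Paraphrasing from memory, it looks roughly like this
(not an exact quote of the source):

// gc/shared/workgroup.cpp (paraphrased): run_task() may temporarily adjust
// the number of active workers, but never above the gang's total_workers().
void WorkGang::run_task(AbstractGangTask* task, uint num_workers, bool add_foreground_work) {
  guarantee(num_workers <= total_workers(),
            "Trying to execute task %s with %u workers which is more than "
            "the amount of total workers %u.",
            task->name(), num_workers, total_workers());
  // ... temporarily set active workers to num_workers and dispatch ...
}

So asking for 12 workers from a gang created with -XX:ParallelGCThreads=1
can never work, no matter how the request is forwarded.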
My patch fixes the problem when the heap inspection code decides to use
fewer threads than the available worker threads, but it doesn't fix the
case when the heap inspection tries to use more threads.
Without this patch, the code silently reverts to using fewer worker
threads than the heap inspection requested. With the patch, we try to
run with the number of threads requested by the heap inspection code,
and an assert catches that this won't work.
There needs to be some sort of agreement between the HeapInspection and
the GC about how many threads to run.
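To make that concrete, the agreement I'm missing is roughly this (just a
sketch of the idea, not code from the webrev; I'm using the G1 case as the
example):

// Sketch: both the iterator/claimer and the task dispatch must see the
// same, already-clamped worker count.
WorkGang* workers = Universe::heap()->get_safepoint_workers();
uint requested = parallel_thread_num;            // what heap inspection asked for
uint agreed    = (workers != NULL)
                     ? MIN2(requested, workers->total_workers())
                     : 1u;
// 'agreed' is then used both for G1's HeapRegionClaimer and for the
// number of workers the task is actually run with.

With something like that in place, the claimer's worker_id assert and the
WorkGang guarantee can't disagree, because there is only one number.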
I've taken a step back and completely removed the
CollectedHeap::run_task code, since it's too fragile to use. Instead,
callers that want to fiddle with the number of worker threads now have
to deal with the added complexity themselves.
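For the heap inspection path, the kind of caller-side code I have in mind
looks roughly like this (a sketch of the direction only, not the actual
webrev.02 code; the early-out/fallback structure is illustrative):

// The caller picks one worker count up front and passes that same value
// to both the parallel iterator and the WorkGang.
WorkGang* workers = Universe::heap()->get_safepoint_workers();
if (workers != NULL && parallel_thread_num > 1) {
  uint num_workers = MIN2(parallel_thread_num, workers->total_workers());
  ParallelObjectIterator* poi =
      Universe::heap()->parallel_object_iterator(num_workers);
  if (poi != NULL) {
    ParHeapInspectTask task(poi, cit, filter);
    workers->run_task(&task, num_workers);   // same count as the iterator
    delete poi;
    return;
  }
}
// Otherwise fall back to the serial heap walk.

The point is that the complexity of picking and propagating the worker
count now lives in the caller, instead of being half-hidden inside a
generic CollectedHeap::run_task.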
https://cr.openjdk.java.net/~stefank/8251570/webrev.02.delta/
https://cr.openjdk.java.net/~stefank/8251570/webrev.02/
You can almost see traces of this problem in
SafepointSynchronize::do_cleanup_tasks(), which can't simply use a
CollectedHeap::run_task call because ParallelSPCleanupTask needs to know
the number of worker threads up front. The heap inspection problem is
similar, but worse, since it wants to decide the number of worker threads
for itself instead of relying on the current number of active workers.
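From memory, the shape of do_cleanup_tasks() is roughly the following,
which shows why a one-size-fits-all run_task doesn't fit there either
(paraphrased, constructor arguments elided):

// The task needs the worker count at construction time, so the caller
// has to query the gang before it can even create the task.
WorkGang* cleanup_workers = Universe::heap()->get_safepoint_workers();
if (cleanup_workers != NULL) {
  // Parallel cleanup using the GC-provided thread pool.
  uint num_cleanup_workers = cleanup_workers->active_workers();
  ParallelSPCleanupTask cleanup(num_cleanup_workers /* , ... */);
  cleanup_workers->run_task(&cleanup);
} else {
  // Serial cleanup on the VMThread.
  ParallelSPCleanupTask cleanup(1 /* , ... */);
  cleanup.work(0);
}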
This patch is becoming more and more GC-agnostic, so I think it would be
good if serviceability devs took a close look at it as well.
Note that even if the user tries to specify the number of threads to use
for the parallel heap inspection, there's no guarantee at all that we'll
use that many threads. I didn't change that with this patch, but I'm not
sure that was apparent to the authors and reviewers of 8215624.
I've only done limited testing of this patch on my computer. I'm
currently running more extended testing.
Thanks,
StefanK
On 2020-08-14 16:40, Stefan Karlsson wrote:
> Hi all,
>
> Please review this patch to fix a newly introduced bug where there can
> be a mismatch between the number of parallel heap inspection threads and
> the number of GC threads used.
>
> https://cr.openjdk.java.net/~stefank/8251570/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8251570
>
> The assert triggers because of this code:
> ParallelObjectIterator* poi =
>     Universe::heap()->parallel_object_iterator(parallel_thread_num);
> if (poi != NULL) {
>   ParHeapInspectTask task(poi, cit, filter);
>   Universe::heap()->run_task(&task);
>
> G1's parallel_object_iterator sets up a HeapRegionClaimer with
> parallel_thread_num, while run_task uses active_workers() of the used
> WorkGang.
>
> The original patch also has another, latent bug. Both ZGC and Shenandoah
> use the wrong WorkGang in run_task. Currently, none of those GCs provides
> a parallel_object_iterator, so that code isn't executed.
>
> The proposed fix for this is to simplify the code and introduce a
> concrete function CollectedHeap::run_task_at_safepoint that takes the
> number of requested workers as a parameter, and dispatches the task
> through the CollectedHeap::get_safepoint_workers() workers (if provided
> by the GC).
>
> Testing:
> - Locally with the failing tests and all GCs
> - tier1-tier5 on Linux
>
> Thanks,
> StefanK