RFR: 8251570: 8215624 causes assert(worker_id <' _n_workers) failed: Invalid worker_id

Fri Aug 14 16:48:21 UTC 2020

On 2020-08-14 18:35, Thomas Schatzl wrote:
> Hi,
> 
> On 14.08.20 16:40, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to fix a newly introduced bug where there can 
>> be a mismatch between the number of parallel heap inspection threads 
>> and the number of used GC threads
>>
>> https://cr.openjdk.java.net/~stefank/8251570/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8251570
>>
>> The assert triggers because of this code:
>>       ParallelObjectIterator* poi = 
>> Universe::heap()->parallel_object_iterator(parallel_thread_num);
>>       if (poi != NULL) {
>>         ParHeapInspectTask task(poi, cit, filter);
>>        Universe::heap()->run_task(&task);
>>
>> G1's parallel_object_iterator sets up a HeapRegionClaimer with 
>> parallel_thread_num, while run_task uses active_workers() of the used 
>> WorkGang.
>>
>> The original patch also has another, latent bug. Both ZGC and 
>> Shenandoah uses the wrong WorkGang in run_task. Currently, non of 
>> those GCs provide a parallel_object_iterator, so the code isn't 
>> executed ATM.
>>
>> The proposed fix for this is to simplify the code and introduce a 
>> concrete function CollectedHeap::run_task_at_safepoint that takes the 
>> number of requested workers as a parameter, and dispatches the task 
>> through the CollectedHeap::get_safepoint_workers() workers (if 
>> provided by the GC).
>>
> 
> Please add an assert(SafepointSynchronize::is_at_safepoint()) to the 
> CollectedHeap::run_task_at_safepoint() method to make sure it's only 
> called at that time.
> I do not need a re-review for such an addition.

OK.

> 
>> Testing:
>> - Locally with the failing tests and all GCs
>> - tier1-tier5 on Linux
> 
> Looks good otherwise.

Thanks!

StefanK

> 
> Thomas