RFR: 8309404: Parallel: Process class loader data graph in parallel in young gc

Guoxiong Li gli at openjdk.org
Tue Jun 6 16:10:00 UTC 2023


On Sat, 3 Jun 2023 09:57:43 GMT, Guoxiong Li <gli at openjdk.org> wrote:

> Hi all,
> 
> This patch parallelizes the processing of the class loader data graph in young GC.
> 
> The class `ClassLoaderData` has a field `_claim` that prevents an oop closure from being applied more than once,
> and the method `ClassLoaderData::oops_do` can check whether the CLD has already been claimed.
> The parallel full GC already uses them in `MarkFromRootsTask::work` and `PSAdjustTask::work`.
> 
> However, I have no experience testing or verifying GC performance improvements.
> If this patch needs such test data before it can be integrated, please guide and help me here.
> 
> Thanks for the review and guidance.
> 
> Best Regards,
> -- Guoxiong

> This is not parallelized in STW pauses because the CLD data structure is not amenable to parallelization at all. It is basically a linked list, and when parallelizing it this way,
> 
> * every thread visits every CLD anyway (doing the pointer chasing)
> * threads contend heavily on obtaining the claim value
> * the code adds another pass through the CLD linked list to clear the claim marks
> 
> In my experience you _will_ get significant negative scaling (i.e. that phase taking significant multiples of the original time) with this simple approach.
> 
> This has been analyzed before, see https://bugs.openjdk.org/browse/JDK-8030144; only Shenandoah does a parallel walk, and only with a limited number of threads (e.g. https://bugs.openjdk.org/browse/JDK-8246097).
> 
> The latter CR also provides some applications with apparently many CLDG entries (Spring Boot and CLion), which may be used for this investigation. FWIW, it may be easier to start this investigation with G1, as it already has the timing logging implemented (but it is no problem to add this to Parallel temporarily(?)).

Thanks for the guidance. The full GC in `Parallel` already parallelizes this, so I previously thought it would be good to do the same in young GC. Either the young GC or the full GC should be adjusted. I will look further into the related issues, mainly to test the performance.
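
To make the concern quoted above concrete, here is a minimal C++ sketch of a claim-based parallel walk over a linked list. It is not HotSpot code: `Node`, `try_claim`, `claimed_walk`, `clear_claims` and `link_nodes` are hypothetical stand-ins, and the claim flag only loosely imitates the `_claim` field. It counts how many nodes the workers touch in total versus how many are actually processed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a CLD entry; not HotSpot's ClassLoaderData.
struct Node {
    std::atomic<int> claim{0};  // 0 = unclaimed, 1 = claimed
    int processed = 0;          // times the "closure" was applied
    Node* next = nullptr;
};

struct WalkStats {
    std::size_t visits = 0;     // nodes touched by pointer chasing, all workers
    std::size_t processed = 0;  // nodes whose claim a worker won
};

// Link the vector's elements into a singly linked list; returns the head.
Node* link_nodes(std::vector<Node>& nodes) {
    for (std::size_t i = 0; i + 1 < nodes.size(); ++i) {
        nodes[i].next = &nodes[i + 1];
    }
    return nodes.empty() ? nullptr : &nodes[0];
}

// Exactly one caller succeeds per node until the claim is cleared.
bool try_claim(Node& n) {
    int expected = 0;
    return n.claim.compare_exchange_strong(expected, 1);
}

// Every worker walks the whole list and processes only the nodes it claims.
WalkStats claimed_walk(Node* head, unsigned num_workers) {
    assert(num_workers > 0);
    std::vector<WalkStats> per_worker(num_workers);
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < num_workers; ++w) {
        workers.emplace_back([head, &per_worker, w] {
            WalkStats local;
            for (Node* n = head; n != nullptr; n = n->next) {
                ++local.visits;        // pointer chasing happens regardless
                if (try_claim(*n)) {   // contended atomic on every node
                    ++n->processed;    // only the winning thread writes here
                    ++local.processed;
                }
            }
            per_worker[w] = local;
        });
    }
    for (auto& t : workers) t.join();

    WalkStats total;
    for (const auto& s : per_worker) {
        total.visits += s.visits;
        total.processed += s.processed;
    }
    return total;
}

// The additional serial pass that resets the claim marks afterwards.
std::size_t clear_claims(Node* head) {
    std::size_t cleared = 0;
    for (Node* n = head; n != nullptr; n = n->next) {
        n->claim.store(0);
        ++cleared;
    }
    return cleared;
}
```

With 4 workers over a 1000-node list, this sketch records 4000 visits for 1000 processed nodes, plus another 1000-node pass in `clear_claims`. It only illustrates the structure of the work; whether and how much this hurts pause times in a real collector still has to be measured.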

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14297#issuecomment-1579053881

