RFR: 8343704: Bad GC parallelism with processing Cleaner queues [v13]

Wed Nov 20 00:52:16 UTC 2024

On Tue, 19 Nov 2024 19:53:45 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> See the bug for more discussion and a reproducer. This PR replaces the ad-hoc linked list with a segmented list of arrays. Arrays are easy targets for GC. There are possible improvements here, the most glaring being parallelism, which is currently knee-capped by global synchronization. The synchronization scheme follows what we have in the original code, and I think it is safer to continue with it right now.
>> 
>> I'll put performance data in a separate comment.
>> 
>> Additional testing:
>>  - [x] Original reproducer improves drastically
>>  - [x] New microbenchmark shows no regression on "churning" tests, which covers insertion/removal perf
>>  - [x] New microbenchmark shows improvement on Full GC times (crude, but repeatable), serves as a proxy for reproducer
>>  - [x] `java/lang/ref` tests in release 
>>  - [x] `all` tests in fastdebug
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use RandomFactory in test

src/java.base/share/classes/jdk/internal/ref/PhantomCleanable.java line 55:

> 53:      * Synchronized by the same lock as the list itself.
> 54:      */
> 55:     CleanerImpl.CleanableList.Node node;

I think we can get away with not storing a `Node` in every `PhantomCleanable`, and instead find it by checking the slot at `index` in each `Node` in the list. But I don't know if this would be a performance win. It might save heap (one field fewer in each `PhantomCleanableRef`), but it might also have worse locality from loading multiple `Node`s.
Just an idea...
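
To make the trade-off concrete, here is a minimal sketch of both removal strategies. This is not the PR's code: `SegmentedListSketch`, `Item`, `removeByNode`, and `removeByScan` are hypothetical names, the capacity is deliberately tiny, and filling the hole with the head's last element is an assumption made for illustration.

```java
/** Hypothetical sketch only; not the JDK's CleanerImpl.CleanableList. */
final class SegmentedListSketch {
    static final int NODE_CAPACITY = 4; // tiny on purpose, for illustration

    /** Stand-in for PhantomCleanable. */
    static final class Item {
        int index = -1; // slot within its node; -1 when not in the list
        Node node;      // needed only by removeByNode; the idea is to drop this field
    }

    static final class Node {
        final Item[] arr = new Item[NODE_CAPACITY];
        int size;
        Node next;
    }

    private Node head = new Node();

    synchronized void insert(Item it) {
        if (head.size == NODE_CAPACITY) { // head is full: push a new segment
            Node n = new Node();
            n.next = head;
            head = n;
        }
        it.index = head.size;
        it.node = head;
        head.arr[head.size++] = it;
    }

    /** Current approach: the item knows its node, so the lookup is O(1). */
    synchronized boolean removeByNode(Item it) {
        return it.node != null && removeFrom(it.node, it);
    }

    /** Suggested approach: probe slot `index` in each node until the item is found. */
    synchronized boolean removeByScan(Item it) {
        if (it.index < 0) return false;
        for (Node n = head; n != null; n = n.next) {
            if (n.arr[it.index] == it) return removeFrom(n, it);
        }
        return false;
    }

    /** Fills the hole with the last element of the head node. */
    private boolean removeFrom(Node n, Item it) {
        if (it.index < 0 || n.arr[it.index] != it) return false;
        Item last = head.arr[--head.size];
        head.arr[head.size] = null;
        if (last != it) {
            n.arr[it.index] = last;
            last.index = it.index;
            last.node = n;
        }
        it.index = -1;
        it.node = null;
        if (head.size == 0 && head.next != null) head = head.next; // drop empty head
        return true;
    }

    synchronized int size() {
        int total = 0;
        for (Node n = head; n != null; n = n.next) total += n.size;
        return total;
    }
}
```

In this sketch the scan touches one array slot per `Node`, so its cost grows with the number of segments rather than the number of elements. Whether that beats carrying an extra reference per cleanable would need measuring.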

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22043#discussion_r1849349523