RFR: 8307106: Allow concurrent GCs to walk CLDG without ClassLoaderDataGraph_lock

Erik Österlund eosterlund at openjdk.org
Tue May 2 12:53:17 UTC 2023


On Tue, 2 May 2023 09:25:43 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> A concurrent GC with concurrent class unloading can't currently walk the CLDG without the CLDG_lock today. We should add some synchronization code so it can do that safely. This patch adds the missing bits.
>
> I don't see any change here where code stops acquiring the CLDG lock. Is this just a preparatory change?
> 
> Using acquire/release etc is certainly necessary to be able to walk the list safely, but it is not obvious it is sufficient. I can't tell when nodes in the CLDG actually get deleted such that the GC thread doing the walking can't access a no longer existing node.
> 
> Thanks.

Thanks for having a look, @dholmes-ora.

> I don't see any change here where code stops acquiring the CLDG lock. Is this just a preparatory change?

Yes, indeed. It's for generational ZGC.

> Using acquire/release etc is certainly necessary to be able to walk the list safely, but it is not obvious it is sufficient. I can't tell when nodes in the CLDG actually get deleted such that the GC thread doing the walking can't access a no longer existing node.

1. We unlink all unloading classes during do_unloading(). Then the GC performs a thread-local handshake with all threads, before it finally calls purge() to delete them. So in that sense, we use safe memory reclamation to ensure that nobody is poking around at the CLD concurrently any longer, when we delete them.
2. Note that we already perform release_store on the head when CLDs are inserted (always prepended). And that's because we need to safely publish the next pointer (and generally the initial CLD contents). However, when we unlink CLDs, we only use relaxed stores, because we are not mutating the CLD in any interesting way that needs ordering w.r.t. the unlinking stores.

> src/hotspot/share/classfile/classLoaderDataGraph.cpp line 531:
> 
>> 529:       assert(dead == _head, "sanity check");
>> 530:       // The GC might be walking this concurrently
>> 531:       Atomic::store(&_head, data);
> 
> If you are using `load_acquire` then surely this must be a `release-store` as they need to be paired.

The load_acquire synchronizes with release_store on insertions. This relaxed unlinking store, doesn't publish anything new that the load_acquire is interested in. It rolls back the head to a CLD that was installed at some earlier point with release_store, meaning it has already been published safely before.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13718#issuecomment-1531419021
PR Review Comment: https://git.openjdk.org/jdk/pull/13718#discussion_r1182503175


More information about the hotspot-runtime-dev mailing list