RFR: 8317007: Add bulk removal of dead nmethods during class unloading [v3]
Albert Mingkun Yang
ayang at openjdk.org
Wed Dec 6 18:01:39 UTC 2023
On Tue, 5 Dec 2023 11:50:46 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:
>> Hi all,
>>
>> please review this change that bulk-removes dead nmethods for the STW collectors instead of unregistering nmethod by nmethod. This significantly speeds up the class unloading phase.
>>
>> For G1, this is almost 100% the code that has been removed from the review for JDK-8315503.
>>
>> For Serial and Parallel GC, the code is new.
>>
>> This change does not try to improve the situation for concurrent collectors - this would at first glance require extending the scope of the `CodeCache_lock` which I did not want to do. See the CR for more details.
>>
>> Also, no parallelism for Parallel GC: the existing data structure for code root remembered set is a linked list. There is in almost all cases no point in trying to parallelize what is basically a traversal of the linked list (with each element not having a lot of work to do). I filed JDK-8320067. There should already be a significant speedup with this change, as each `unregister_nmethod` call previously traversed the entire linked list to find that element (i.e. complexity reduction of O(n^2) -> O(n)).
>>
>> One more comment:
>>
>> * So it would be possible to merge the `ScavengableNMethod::prune_nmethods_not_into_young` which removes elements from the code root sets that do no longer need to be remembered (due to object movement) with removing dead/unlinked nmethods. However, this would mean to put that class unloading code to the end of gc as during phase 1 the new locations which are relevant for pruning uninteresting code root remembered set entries did not move yet. I wanted to keep determining the unlinking nmethods and class/code unloading them together in the code, but I could be convinced to move it.
>>
>> Testing: tier1-4
>>
>> Depends on #16759, please review first.
>>
>> Thanks,
>> Thomas
>
> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits:
>
> - Remove some dead code introduced by merge
> - Merge branch 'master' of https://git.openjdk.org/jdk into submit/8317007-nmethod-bulk-removal
> - 8317007
>
> please review this change that bulk-removes dead nmethods for the STW collectors instead of unregistering nmethod by nmethod. This significantly speeds up the class unloading phase.
>
> For G1, this is almost 100% the code that has been removed from the review for JDK-8315503.
>
> For Serial and Parallel GC, the code is new.
>
> This change does not try to improve the situation for concurrent collectors - this would at first glance require extending the scope of the CodeCache_lock which I did not want to do. See the CR for more details.
>
> Also, no parallelism for Parallel GC: the existing data structure for code root remembered set is a linked list. There is in almost all cases no point in trying to parallelize what is basically a traversal of the linked list (with each element not having a lot of work to do). I file JDK-8320067. There should still be a significant speedup, as each unregister_nmethod call previously traversed the entire linked list to find that element (i.e. complexity reduction of O(n^2) -> O(n)).
>
> One more comment:
> So it would be possible to merge the ScavengableNMethod::prune_nmethods which removes elements from the code root sets that do no longer need to be remembered (due to object movement) and removing dead nmethods.
> However, this would mean to put that code at the end of gc (where prune_nmethods is called now) as during phase 1 the new locations which are relevant for pruning uninteresting code root remembered set entries did not move yet. I wanted to keep determining the dead nmethods and removing them together in the code, but I could be convinced to move it.
>
> Testing: tier1-4
> - 8317809 Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2)
>
> Introduce a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading.
> GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge).
>
> The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform
> th...
How much perf impact does "bulk removal" introduce? (If I understand this PR correctly, this IS an optimization after all.)
src/hotspot/share/code/compiledMethod.hpp line 177:
> 175: void* _gc_data;
> 176:
> 177: virtual void purge(bool free_code_cache_data = true, bool unregister_nmethods = true) = 0;
I wonder if it's possible to make them non-optional. Having multiple optional args hinders readability, IMO.
src/hotspot/share/code/nmethod.cpp line 1467:
> 1465: }
> 1466:
> 1467: if (unregister_nmethods) {
Could it be named sth like "unregister_individual_nmethod" to emphasize that this operation is on a single nmethod?
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 2521:
> 2519: HeapRegionClaimer _hrclaimer;
> 2520:
> 2521: class RemoveUnlinkedHeapRegionClosure : public HeapRegionClosure {
Can it be `struct`?
-------------
PR Review: https://git.openjdk.org/jdk/pull/16767#pullrequestreview-1768276025
PR Review Comment: https://git.openjdk.org/jdk/pull/16767#discussion_r1417759659
PR Review Comment: https://git.openjdk.org/jdk/pull/16767#discussion_r1417758332
PR Review Comment: https://git.openjdk.org/jdk/pull/16767#discussion_r1417760251
More information about the hotspot-gc-dev
mailing list