RFR: 8316694: Implement relocation of nmethod within CodeCache [v7]
Evgeny Astigeevich
eastigeevich at openjdk.org
Tue Apr 8 21:56:57 UTC 2025
On Wed, 26 Mar 2025 13:03:43 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:
>> Chad Rakoczy has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Relocate nmethod at safepoint
>> - Fix windows build
>
> I have only skimmed through what you are doing but what I have read makes me worried from a GC point of view. In general, I am not fond of "special nmethods" that work subtly different to normal nmethods and have their own special life cycles.
> It might be that some of my concerns are false because this is more of a drive by review to sanity check if you thought about the GC implications. These are just random things on top of my head.
> 1) You can't just copy oops. Propagating stale pointers to a new nmethod is not valid and will make the GC vomit. The GC assumes that it can traverse a snapshot of nmethods, and that new nmethods created after that snapshot, will have sane valid oops initially, and hence do not need fixing. Copying stale oops to a new nmethod would violate those invariants and inevitably blow up.
> 2) Class redefinition tracks in an external data structure which nmethods contained metadata that we want to eventually throw away. This is done to avoid walking the entire code cache just to keep tabs on the one nmethod that still uses the old metadata. If we clone the nmethod without putting it in said data structure, we will blow up.
> 3) I'm worried about the initial state of the nmethod entry barrier guard value being copied from the source nmethod, instead of having the initial value we expect for newly created nmethods. It means that the initial invocation will not get the nmethod entry barrier callback. The GC traverses the nmethods assuming that new nmethods created during the traversal will not start off with weird stale values.
> 4) I'm worried about copying the nmethod epoch counters used by virtual threads to mark which nmethods have been found on-stack. Copying it implies that this nmethod has been found on-stack even though it never has. To me, the implications are unknown, but perhaps you thought about it?
> 5) You don't check if the nmethod is_unloading() when cloning it. That means you can create a new nmethod that has dead oops from the get go - that cannot be allowed
> 6) Have you checked what the JVMCI speculation data and JVMCI data contains and if your approach will break that? JVMCI has an nmethod mirror object that refers back to the nmethod - this is unlikely to work out of the box with cloning.
> 7) By running the operation in a safepoint you a) introduce an obvious latency problem, b) create a new source for stale nmethod pointers that will become stale and burn. The _nm of the safepoint operation might not survive a safepoint. For ...
Hi @fisk,
Thank you for the very valuable comments. They raise points we had not thought about.
> I am not fond of "special nmethods" that work subtly different to normal nmethods and have their own special life cycles.
It's not clear to me what you mean by "special nmethods". IMO we don't introduce any special nmethods.
From my point of view, a normal nmethod is an nmethod for an ordinary Java method. Nmethods for non-ordinary Java methods are special, e.g. native nmethods or method handle linkers (JDK-8263377). I think normal nmethods should be relocatable within CodeCache.
> You can't just copy oops.
Yes, this is the main issue at the moment. Can we do this at a safepoint?
> I'm worried about copying the nmethod epoch counters
We should clear them. If not, it is a bug.
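To illustrate what "clearing" would mean here, a minimal sketch with a toy struct. This is not HotSpot's nmethod class, and every name in it (ToyNMethod, on_stack_epoch, relocate_copy) is hypothetical. It assumes that an epoch of 0 stands for "never found on-stack":

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for an nmethod; NOT the HotSpot class. All names are hypothetical.
struct ToyNMethod {
  std::vector<uint8_t> code;     // machine code bytes, carried over on relocation
  uint64_t on_stack_epoch = 0;   // assumption: 0 means "never found on-stack"
  bool not_entrant = false;      // not entrant nmethods are not relocated
};

// Copy the code into a new toy nmethod, but give the copy fresh per-instance state.
ToyNMethod relocate_copy(const ToyNMethod& src) {
  assert(!src.not_entrant && "only entrant nmethods are relocated");
  ToyNMethod dst;
  dst.code = src.code;           // the payload is copied
  dst.on_stack_epoch = 0;        // cleared: the copy has never been seen on a stack
  return dst;
}
```

The point is only that the copy starts from the state a newly created nmethod would have, rather than inheriting the source's marker.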
> You don't check if the nmethod is_unloading() when cloning it.
Should such nmethods be not entrant? We don't relocate not entrant nmethods.
> Have you checked what the JVMCI speculation data
Good point to check.
> By running the operation in a safepoint you a) introduce an obvious latency problem
Yes, we are going to measure it. We don't expect relocation to be a frequent operation.
> What are the consequences of copying the deoptimization generation?
What do you mean?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23573#issuecomment-2787747282