RFR: 8373697: AArch64: Remove deoptimization stub code for nmethods with small stack frames

Ruben duke at openjdk.org
Thu Dec 18 11:38:47 UTC 2025


On Tue, 16 Dec 2025 23:15:12 GMT, Ruben <duke at openjdk.org> wrote:

> Removal of deoptimization stub codes improves density of compiled code.
> It is possible at least for methods with relatively small stack frames.

Thank you for the feedback.

I agree that this change requires a detailed description.

I will be out of office until early January. When I am back, I plan to provide more details, update the PR description, and add comments.

= A high-level description =

Assumptions:
- the deoptimization stub code needs to be unique per method only so that the VM can identify the deoptimized method from the return address stored within the deoptimized compiled frame;
- for any deoptimized compiled frame, the original PC slot always contains a valid pointer into the deoptimized method.

Normally, the VM does not know how to find the original PC slot within a compiled frame until it knows which nmethod the frame belongs to, because the layouts of compiled frames, and therefore the offsets of the original PC slots, vary between compiled methods.

Prior to the change, the first thing the VM does when parsing a deoptimized compiled frame is to look up the nmethod using the return address, which points to the nmethod-specific deoptimization stub code.
The lookup chain was: return address -> find the containing nmethod blob in the code cache -> look up the original PC slot offset -> load the original PC from the compiled frame.
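
To make the pre-change chain concrete, here is a minimal sketch in plain C++. It is a toy model, not HotSpot code: `ToyNMethod`, `ToyFrame`, `find_blob` and `original_pc_via_stub` are hypothetical names, and the code cache is modelled as a simple vector of address ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the metadata of one nmethod.
struct ToyNMethod {
    std::uintptr_t code_begin;      // first address of the blob (stub code included)
    std::uintptr_t code_end;        // one past the last address of the blob
    std::size_t    orig_pc_offset;  // byte offset of the original PC slot in the frame
};

// Hypothetical compiled frame, modelled as a sequence of 8-byte slots.
struct ToyFrame {
    std::vector<std::uintptr_t> slots;

    std::uintptr_t load(std::size_t byte_offset) const {
        assert(byte_offset % 8 == 0 && byte_offset / 8 < slots.size());
        return slots[byte_offset / 8];
    }
};

// Find the blob whose address range contains pc (linear scan for brevity).
const ToyNMethod* find_blob(const std::vector<ToyNMethod>& code_cache,
                            std::uintptr_t pc) {
    for (const ToyNMethod& nm : code_cache) {
        if (pc >= nm.code_begin && pc < nm.code_end) return &nm;
    }
    return nullptr;
}

// Old chain: return address -> nmethod -> slot offset -> original PC.
// The return address must point into the nmethod's own deoptimization stub,
// otherwise the first step could not identify the nmethod.
std::uintptr_t original_pc_via_stub(const std::vector<ToyNMethod>& code_cache,
                                    const ToyFrame& frame,
                                    std::uintptr_t return_address) {
    const ToyNMethod* nm = find_blob(code_cache, return_address);
    assert(nm != nullptr);
    return frame.load(nm->orig_pc_offset);
}
```

The sketch shows why the stub has to live inside each nmethod in this scheme: the code cache lookup is the only source of the slot offset.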

If we are to remove the deoptimization stub code, we need an alternative way for the VM to identify the nmethod based on the return address value.
At the same time, the return address value must still be a valid pointer to executable code that transfers control to the shared deoptimization blob.

The proposal implemented in the current version:
 - introduce a number of extra entry points into the shared deoptimization blob: every entry point corresponds to a particular offset of the original PC slot within a compiled frame;
 - when deoptimizing a compiled frame, use the original PC slot as usual and patch the return address with the extra entry point that corresponds to the original PC slot offset within that particular compiled frame.

This allows the VM to identify the location of the original PC slot, and therefore to find the nmethod, based on the return address alone.
At the same time, the return address still transfers control to the shared deoptimization blob.

The proposed lookup chain is: return address -> look up the original PC slot offset -> load the original PC from the compiled frame -> find the containing nmethod blob in the code cache.
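
The proposed chain can be sketched the same way. Everything below is an illustrative assumption rather than the layout used in the PR: the base address, the entry stride, the number of extra entry points, and the rule that entry `i` stands for byte offset `i * 8` are all made up for the example, as are the names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical layout of the extra entry points in the shared deoptimization blob.
constexpr std::uintptr_t kExtraEntriesBase = 0x9000; // address of extra entry 0 (assumed)
constexpr std::size_t    kEntryStride      = 4;      // bytes between entries (assumed)
constexpr std::size_t    kNumExtraEntries  = 64;     // number of entries (assumed)
constexpr std::size_t    kSlotSize         = 8;      // bytes per stack slot

// Hypothetical, simplified stand-in for an nmethod: just its address range.
struct ToyNMethod {
    std::uintptr_t code_begin;
    std::uintptr_t code_end;
};

// Return address -> original PC slot offset.
// Empty if the address is not one of the extra entry points.
std::optional<std::size_t> slot_offset_for(std::uintptr_t return_address) {
    if (return_address < kExtraEntriesBase) return std::nullopt;
    const std::uintptr_t delta = return_address - kExtraEntriesBase;
    if (delta % kEntryStride != 0) return std::nullopt;
    const std::size_t index = delta / kEntryStride;
    if (index >= kNumExtraEntries) return std::nullopt;
    return index * kSlotSize;
}

// Find the blob whose address range contains pc (linear scan for brevity).
const ToyNMethod* find_blob(const std::vector<ToyNMethod>& code_cache,
                            std::uintptr_t pc) {
    for (const ToyNMethod& nm : code_cache) {
        if (pc >= nm.code_begin && pc < nm.code_end) return &nm;
    }
    return nullptr;
}

// New chain: return address -> slot offset -> original PC -> nmethod.
const ToyNMethod* nmethod_via_shared_entry(const std::vector<ToyNMethod>& code_cache,
                                           const std::vector<std::uintptr_t>& frame_slots,
                                           std::uintptr_t return_address) {
    const std::optional<std::size_t> offset = slot_offset_for(return_address);
    if (!offset) return nullptr;  // not an extra entry point
    assert(*offset / kSlotSize < frame_slots.size());
    const std::uintptr_t original_pc = frame_slots[*offset / kSlotSize];
    return find_blob(code_cache, original_pc);
}
```

Compared with the earlier chain, the code cache lookup moves to the last step and is keyed by the original PC instead of the return address.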

One restriction: the compiled frame size is not limited, while the number of extra entry points is finite. So for some methods, those with relatively big stack frames, the VM would not be able to find a matching extra entry point.
For those methods, the per-nmethod deoptimization stub code still has to be emitted.
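
The decision made when patching the return address might look roughly like the following sketch. The constants and the names `needs_own_deopt_stub` and `deopt_return_address` are hypothetical and only illustrate the fallback; the actual limit depends on how many extra entry points the shared blob provides.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout of the extra entry points (illustration only).
constexpr std::uintptr_t kExtraEntriesBase = 0x9000; // address of extra entry 0 (assumed)
constexpr std::size_t    kEntryStride      = 4;      // bytes between entries (assumed)
constexpr std::size_t    kNumExtraEntries  = 64;     // number of entries (assumed)
constexpr std::size_t    kSlotSize         = 8;      // bytes per stack slot

// True if no extra entry point encodes this original PC slot offset,
// so the nmethod still has to carry its own deoptimization stub.
constexpr bool needs_own_deopt_stub(std::size_t orig_pc_offset) {
    return orig_pc_offset % kSlotSize != 0 ||
           orig_pc_offset / kSlotSize >= kNumExtraEntries;
}

// Address to patch into the return address slot when deoptimizing a frame.
// own_stub is the address of the per-nmethod stub, used only as a fallback.
constexpr std::uintptr_t deopt_return_address(std::size_t orig_pc_offset,
                                              std::uintptr_t own_stub) {
    return needs_own_deopt_stub(orig_pc_offset)
               ? own_stub
               : kExtraEntriesBase + (orig_pc_offset / kSlotSize) * kEntryStride;
}
```

With these assumed constants, a slot at offset 24 maps to the fourth extra entry point, while a slot at offset 512 falls back to the per-nmethod stub.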


> like putting the original pc at a fixed slot in the compiled frames

Technically, the original PC slot is already a fixed slot:
https://github.com/openjdk/jdk/blob/2ba423db9925355348106fc9fcf84450123d2605/src/hotspot/share/opto/output.cpp#L232

The obstacle is that the fixed slots are located at the end of the compiled frame, so the offset to the first fixed slot depends on the number of spill and argument slots.
I believe it is possible to move the slot after the argument slots and before the spill slots; in that case the offset has a relatively low upper limit.
However, that would move all spill slots further into the compiled frame and correspondingly increase their offsets in memory access instructions.
That would increase the chance of a frequently accessed spill slot requiring an extra operation to compute its address within the stack frame.
This is unlikely to be an issue with LDR/STR because of the offset range they allow; for LDP/STP, however, it might occur often enough to matter.
As far as I understand, this might negatively affect performance because of the extra operation required, and might also increase code size, indirectly affecting performance further.
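
For reference, the difference between the two instruction families can be stated as a small check. The sketch assumes 64-bit registers and only the immediate-offset forms: LDR/STR with an unsigned 12-bit immediate scaled by 8, and LDP/STP with a signed 7-bit immediate scaled by 8. The function names are hypothetical.

```cpp
#include <cstdint>

// 64-bit LDR/STR, unsigned-offset form: 12-bit immediate scaled by 8,
// so byte offsets 0..32760 in steps of 8 are encodable.
constexpr bool fits_ldr_str_x(std::int64_t offset) {
    return offset >= 0 && offset % 8 == 0 && offset / 8 <= 4095;
}

// 64-bit LDP/STP, signed-offset form: 7-bit signed immediate scaled by 8,
// so byte offsets -512..504 in steps of 8 are encodable.
constexpr bool fits_ldp_stp_x(std::int64_t offset) {
    return offset % 8 == 0 && offset / 8 >= -64 && offset / 8 <= 63;
}

// A spill slot pushed from offset 504 to 512 stays reachable by LDR/STR
// but no longer fits the LDP/STP immediate.
static_assert(fits_ldr_str_x(512) && !fits_ldp_stp_x(512), "LDP/STP limit is 504");
```

So a spill slot only needs to sit 512 bytes or more from the stack pointer before a paired access requires a separate address computation.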

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28857#issuecomment-3669859579

