RFR: 8365265: x86 short forward jump exceeds 8-bit offset in methodHandles_x86.cpp when using Intel APX

Andrew Dinn adinn at openjdk.org
Tue Aug 12 10:39:12 UTC 2025


On Tue, 12 Aug 2025 10:19:54 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>>> > > Looks good. This is diagnostics code, so performance is not a question.
>>> > > I think we generally avoid shortening branches over `__ STOP`, for example, whose size is generally unpredictable. So this looks in line with that tactic. Maybe you want to unshorten the branch at L157 as well.
>>> > 
>>> > 
>>> > All this long-and-short branch stuff is a pain. I wonder, given that we're now saving stubs in an archive, whether we should just bite the bullet and implement branch relaxation for stubs. I don't think it would be very hard.
>>> 
>>> Code density still matters for runtime performance, alas.
>> 
>> Well, yes. I'm suggesting that we should generate short branches automagically.
>
>> Well, yes. I'm suggesting that we should generate short branches automagically.
> 
> We do generate short branches auto-magically, but only for back-branches, where we know where the target is at the time we emit the jump. So _forward jumps_ get the short (pun intended) end of the stick. 
> 
> I thought about this a bit a few years back: I can imagine how one could do multiple scratch emits that try to progressively figure out which forward jumps can be shortened. That would need to be iterative, because shortening an _inner_ jump likely opens opportunities for shortening more _outer_ jumps. Or maybe you can do this from the end, would that guarantee completeness? Anyway, this raises the question of how all this impacts compilation time. I guess it is not prohibitive for small code blobs like stubs. But then, going through all this hassle to only optimize stubs? We might as well spend this time hand-optimizing the forward jumps :)

@shipilev One difficulty with shuffling code up the buffer would be recognizing where an instruction or embedded data has been nop-padded for alignment. There is no marker for that at present. The other obvious one is keeping all your relocs targeted at the correct instruction (not just adjusting offsets incrementally but also, potentially, removing a reloc_None added to bridge a large enough gap between sites).
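
For illustration only, here is a toy C++ sketch of the iterative shortening idea quoted above. It is not HotSpot code: `Insn`, `short_disp` and `relax` are hypothetical names, and an instruction is modelled by nothing but its size and an optional jump target. The 2-byte and 5-byte sizes are those of x86 `jmp rel8` and `jmp rel32`. The sketch assumes there is no alignment padding and no relocation info to maintain, so it sidesteps both difficulties mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model (hypothetical): an instruction is just a size plus an optional target.
struct Insn {
  int size;    // current encoded size in bytes
  int target;  // index of the instruction jumped to, or -1 if not a jump
};

constexpr int kShortJmp = 2;  // x86 jmp rel8:  opcode + 8-bit displacement
constexpr int kLongJmp  = 5;  // x86 jmp rel32: opcode + 32-bit displacement

// Displacement jump i would need if it were encoded short,
// measured from the end of the jump instruction, as on x86.
static int short_disp(const std::vector<Insn>& code, int i) {
  int t = code[i].target;
  assert(t >= 0 && t < (int)code.size());
  int disp = 0;
  if (t > i) {
    // Forward: bytes strictly between the jump and its target.
    for (int k = i + 1; k < t; k++) disp += code[k].size;
  } else {
    // Backward: bytes from the target up to the jump, plus the short jump itself.
    for (int k = t; k < i; k++) disp -= code[k].size;
    disp -= kShortJmp;
  }
  return disp;
}

// Start with every jump long and shrink until a full pass changes nothing.
// Returns the number of passes, including the final pass that finds no change.
int relax(std::vector<Insn>& code) {
  int passes = 0;
  bool changed = true;
  while (changed) {
    changed = false;
    passes++;
    for (int i = 0; i < (int)code.size(); i++) {
      if (code[i].target < 0 || code[i].size == kShortJmp) continue;
      int d = short_disp(code, i);
      if (d >= INT8_MIN && d <= INT8_MAX) {
        code[i].size = kShortJmp;  // everything after i moves up by 3 bytes
        changed = true;
      }
    }
  }
  return passes;
}
```

With the input `{{kLongJmp, 4}, {100, -1}, {kLongJmp, 4}, {25, -1}, {1, -1}}`, the outer jump at index 0 initially has to span 130 bytes and stays long in the first pass. Once the inner jump at index 2 shrinks, the span drops to 127 bytes and the outer jump fits on the second pass. In this simplified model, shrinking an instruction never lengthens any other displacement, so a jump that has been shortened stays valid. That property is exactly what nop-padding for alignment would break.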

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26731#issuecomment-3178764360


More information about the hotspot-compiler-dev mailing list