RFR: 8359386: Fix incorrect value for max_size of C2CodeStub when APX is used [v3]

Tue Jun 17 04:26:31 UTC 2025

On Tue, 17 Jun 2025 04:07:14 GMT, Srinivas Vamsi Parasa <sparasa at openjdk.org> wrote:

>> The goal of this PR is to fix the value of max_size of the C2CodeStub hardcoded in the C2_MacroAssembler::convertF2I() function when Intel APX instructions are used. Currently, max_size is hardcoded to 23 (introduced in [JDK-8306706](https://bugs.openjdk.org/browse/JDK-8306706)). However, this value is incorrect when the code stub uses Intel APX instructions with extended general-purpose registers (EGPRs), because using an EGPR increases the instruction encoding size by 1 byte.
>> 
>> Without this fix, we see the following error for the C2 compiler tests below:
>> 
>> compiler/vectorization/runner/ArrayTypeConvertTest.java
>> compiler/intrinsics/zip/TestFpRegsABI.java
>> 
>> 
>> 
>> 
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error (/src/hotspot/share/opto/c2_CodeStubs.cpp:50), pid=3961123, tid=3961332
>> # assert(max_size >= actual_size) failed: Expected stub size (23) must be larger than or equal to actual stub size (24)
>> #
>> # JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.parasa.jdkdemotion)
>> # Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.parasa.jdkdemotion, mixed mode, sharing, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
>> # Problematic frame:
>> # V [libjvm.so+0x955a77] C2CodeStubList::emit(C2_MacroAssembler&)+0x227
>> #
>> 
>> 
>> This PR fixes the errors in the above-mentioned tests.
>> 
>> Currently, the ConvertF2I macro works as follows:
>> 
>> 
>>     vcvttss2si %xmm1,%eax
>>     cmp    $0x80000000,%eax
>>     je     STUB
>> CONTINUE:
>> 
>> STUB:
>>     sub    $0x8,%rsp
>>     vmovss %xmm1,(%rsp)
>>     call   Stub::f2i_fixup              ;   {runtime_call StubRoutines (initial stubs)}
>>     pop    %rax
>>     jmp    CONTINUE
>> 
>> 
>> The maximum size (max_size) of the stub is precomputed as 23. However, as seen in the convertF2I_slowpath implementation (below), the pop(dst) instruction needs 1 more byte to encode if dst is an extended general-purpose register (R16-R31).
>> 
>> For example, `pop(r15)` is encoded as `41 5f`, whereas `pop(r21)` is encoded as `d5 10 5d`.
>> 
>> 
>> 
>> 
>> static void convertF2I_slowpath(C2_MacroAssembler& masm, C2GeneralStub<Register, XMMRegister, address>& stub) {
>> #define __ masm.
>>   Register dst = stub.data<0>();
>>   XMMRegister src = stub.data<1>();
>>   address target = stub.data<2>();
>>   __ bind(stub.entry());
>>   __ subptr(rsp, 8);
>>   __ mo...
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix the change in the stub size by 1 byte

Please see the updated version, which fixes the error in the initial PR. While it's true that the convert-with-truncation and compare instructions in the main (common) path also grow in size, that growth does not contribute to the increase in the stub size. As explained in the updated PR description, the 1-byte increase is caused by the `pop(dst)` instruction in the stub.
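
To illustrate where the extra byte comes from, here is a minimal standalone sketch of how `pop` of a 64-bit register is encoded for the three register ranges. It is not HotSpot code: `encode_pop` is a hypothetical helper written only for this explanation, and it assumes the plain REX2 form (prefix byte `d5` followed by a payload byte carrying the upper register bits) for R16-R31.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of HotSpot): returns the bytes of
// `pop <reg>` in 64-bit mode for a register number in 0..31.
std::vector<uint8_t> encode_pop(int reg) {
    if (reg < 0 || reg > 31) {
        throw std::out_of_range("register number must be in 0..31");
    }
    // POP r64 is opcode 0x58 plus the low 3 bits of the register number.
    const uint8_t opcode = static_cast<uint8_t>(0x58 + (reg & 7));
    if (reg < 8) {
        return {opcode};            // rax..rdi: no prefix, 1 byte
    }
    if (reg < 16) {
        return {0x41, opcode};      // r8..r15: REX.B prefix, 2 bytes
    }
    // r16..r31: REX2 prefix 0xD5 plus a payload byte, 3 bytes in total.
    // Payload bit 0 holds register bit 3 (B3), payload bit 4 holds bit 4 (B4).
    const uint8_t payload =
        static_cast<uint8_t>((((reg >> 4) & 1) << 4) | ((reg >> 3) & 1));
    return {0xD5, payload, opcode};
}
```

With this sketch, `encode_pop(15)` gives `41 5f` and `encode_pop(21)` gives `d5 10 5d`, matching the two encodings quoted in the PR description. The EGPR form is one byte longer than the REX form, which is the 1-byte difference between the expected and actual stub sizes reported by the assert.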

Hello Jatin (@jatin-bhateja),

Please see the updated PR description, which fixes the error in the initial PR.

Thanks,
Vamsi

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25787#issuecomment-2978885267
PR Comment: https://git.openjdk.org/jdk/pull/25787#issuecomment-2978888298