RFR: 8351950: C2: AVX512 vector assembler routines causing SIGFPE / no valid evex tuple_table entry [v2]

Mon May 19 22:25:55 UTC 2025

On Mon, 12 May 2025 12:17:11 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> This PR adds missing EVEX compressed displacement attributes, which are used to compute the scale factor (N) of the compressed displacement.
>> AVX512 memory operand instructions use compressed disp8 encoding if the displacement is a multiple of the scale (N) and the quotient fits in a signed byte. N depends on vector length, embedded broadcasting, and lane size. Please refer to section 2.7.5 of the Intel SDM for more details.
>> 
>> For example, consider two instructions, one with displacement 0x10203040 and the other with displacement 0x40. Both operate over a full 64-byte vector, hence scale N = 64. The latter displacement is a multiple of the scale and the quotient (0x1) fits in one byte, so it can use the 1-byte compressed displacement encoding. The former, divided by 64, does not fit in a signed byte, so it requires 4 bytes to represent the displacement in the instruction encoding.
>> 
>> 
>> 1) vpternlogq $0xff,0x10203040(%r20,%r21,8),%zmm23,%zmm24
>>     EVEX        OP   MR   SIB       DISP       IMM
>> --------------|----|----|----|---------------|-----|
>> 62 6b c1 40     25   84   ec     40 30 20 10     ff
>> 
>> 2) vpternlogq $0xff,0x40(%r20,%r21,8),%zmm23,%zmm24
>> For a full vector width operation, the scale matches the vector size, hence scale N = 64
>> compressed DISP8 = effective displacement / N = 0x40 (64) / 64 = 0x1
>>     EVEX       OP   MR SIB    DISP     IMM
>> -------------|----|---|---|-----------|---|
>> 62 6b c1 40    25  44   ec      01     ff 
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Correcting tuple types in some assembler routines

A few more places need corrections to the address attributes:
1. evpmovzxbd: the tuple type needs to change from HVM to QVM.
2. The address attribute is missing for two additional instructions that take an Address as input/output: vpermb, paddd.
3. The input_size_in_bits should be EVEX_32bit for cvtsi2ssq, cvtsi2sdq.
4. The input_size_in_bits should be EVEX_64bit for evpgatherdq, evpscatterdq, evgatherdpd, evscatterdpd.
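
For reference, here is a minimal sketch of how the tuple type and input size feed into the disp8*N scale discussed in the quoted description. It is only an illustration, not the HotSpot code: the enum, the function names and the parameters are all hypothetical, and it covers only the no-broadcast case for full, half, quarter and scalar tuples.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; these are not the HotSpot enums.
enum class TupleType { FullVectorMem, HalfVectorMem, QuarterVectorMem, Tuple1Scalar };

// Scale factor N in bytes for a memory operand without embedded broadcast.
// vector_bytes is 16, 32 or 64; element_bits is only used for Tuple1Scalar.
inline int disp8_scale(TupleType tt, int vector_bytes, int element_bits) {
  switch (tt) {
    case TupleType::FullVectorMem:    return vector_bytes;      // whole vector is accessed
    case TupleType::HalfVectorMem:    return vector_bytes / 2;  // half of the vector
    case TupleType::QuarterVectorMem: return vector_bytes / 4;  // quarter of the vector
    case TupleType::Tuple1Scalar:     return element_bits / 8;  // a single element
  }
  return 1;
}

// Tries to compress a 32-bit displacement into disp8*N form.
// Returns true and stores the encoded byte in *out only when disp is a
// multiple of n and disp / n fits in a signed 8-bit value.
inline bool compress_disp8(int32_t disp, int n, int8_t* out) {
  assert(n > 0);
  if (disp % n != 0) return false;          // not a multiple of the scale
  int32_t q = disp / n;
  if (q < -128 || q > 127) return false;    // quotient needs more than 8 bits
  *out = static_cast<int8_t>(q);
  return true;
}
```

With a 64-byte vector and a full-vector tuple, 0x40 compresses to 0x1 while 0x10203040 falls back to disp32, matching the two encodings in the description. In this sketch, switching the tuple from half to quarter at 512 bits changes N from 32 to 16, so the same displacement can compress differently or not at all.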

src/hotspot/cpu/x86/assembler_x86.cpp line 11379:

> 11377:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit || VM_Version::supports_avx512vl()), "");
> 11378:   InstructionAttr attributes(vector_len, /* vex_w */ false,/* legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);
> 11379:   attributes.set_address_attributes(/* tuple_type */ EVEX_FVM,/* input_size_in_bits */ EVEX_NObit);

No address attribute needed for this instruction.

src/hotspot/cpu/x86/assembler_x86.cpp line 11408:

> 11406:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit || VM_Version::supports_avx512vl()), "");
> 11407:   InstructionAttr attributes(vector_len, /* vex_w */ false,/* legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);
> 11408:   attributes.set_address_attributes(/* tuple_type */ EVEX_FVM,/* input_size_in_bits */ EVEX_NObit);

No address attribute needed for this instruction.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25021#issuecomment-2892419514
PR Review Comment: https://git.openjdk.org/jdk/pull/25021#discussion_r2096520907
PR Review Comment: https://git.openjdk.org/jdk/pull/25021#discussion_r2096521528