RFR: 8351950: C2: masked vector MIN/MAX AVX512: SIGFPE / no valid evex tuple_table entry
Jatin Bhateja
jbhateja at openjdk.org
Tue May 6 18:11:21 UTC 2025
On Tue, 6 May 2025 16:40:05 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> PR adds missing EVEX compressed displacement attributes used for computing the scale factor (N) of compressed displacement.
>> AVX512 memory operand instructions use the compressed disp8 encoding if the displacement is a multiple of the scale (N) and the scaled-down displacement fits in a signed byte. N depends on vector length, embedded broadcasting, and lane size. Please refer to section 2.7.5 of the Intel SDM for more details.
>>
>> E.g., consider two instructions, one with displacement 0x10203040 and the other with displacement 0x40. Both operate over a full 64-byte vector, hence scale N = 64. The latter displacement is a multiple of N and its quotient (0x40 / 64 = 1) fits in one byte, so it can use the 1-byte compressed displacement encoding. The former's quotient does not fit in one byte, so it requires 4 bytes to represent the displacement in the instruction encoding.
>>
>>
>> 1) vpternlogq $0xff,0x10203040(%r20,%r21,8),%zmm23,%zmm24
>>    EVEX         | OP | MR | SIB | DISP        | IMM
>>    -------------|----|----|-----|-------------|----
>>    62 6b c1 40  | 25 | 84 | ec  | 40 30 20 10 | ff
>>
>> 2) vpternlogq $0xff,0x40(%r20,%r21,8),%zmm23,%zmm24
>> For a full vector width operation, the scale matches the vector size, hence N = 64.
>> compressed DISP8 = displacement / N = 0x40 / 64 = 0x1
>>    EVEX         | OP | MR | SIB | DISP | IMM
>>    -------------|----|----|-----|------|----
>>    62 6b c1 40  | 25 | 44 | ec  | 01   | ff
>>
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> src/hotspot/cpu/x86/assembler_x86.cpp line 11542:
>
>> 11540: assert(vector_len == AVX_512bit || VM_Version::supports_avx512vl(), "");
>> 11541: InstructionAttr attributes(vector_len, /* vex_w */ true,/* legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);
>> 11542: attributes.set_address_attributes(/* tuple_type */ EVEX_FV,/* input_size_in_bits */ EVEX_NObit);
>
> @jatin-bhateja How is this `fma` case related to the `min / max` cases that were reported? I also did not find a test below.
Hi @eme64, for tuple_type Full Vector (FV) the scale factor (N) does not take the lane size into account, so EVEX_NObit is the right argument here. Using EVEX_32bit would not cause a functional correctness issue, since the lane size is ignored anyway, but EVEX_NObit better conveys our intent.
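
For readers following along, here is a minimal sketch of the disp8*N rule as I read it from the SDM for the FV tuple type. It is not the HotSpot tuple_table code. The names fv_scale and compress_disp8 are hypothetical and exist only for illustration. The sketch assumes that without embedded broadcast N equals the vector length in bytes, and that with embedded broadcast N is 4 or 8 as selected by EVEX.W, so no separate input-size argument is needed.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (not HotSpot code): scale factor N for the
// Full Vector (FV) tuple type.
//   vector_bytes: 16, 32 or 64 (128/256/512-bit vectors)
//   broadcast:    EVEX.b, embedded broadcast of a single element
//   evex_w:       EVEX.W, selects 64-bit (true) or 32-bit (false) elements
inline int fv_scale(int vector_bytes, bool broadcast, bool evex_w) {
  assert(vector_bytes == 16 || vector_bytes == 32 || vector_bytes == 64);
  if (broadcast) {
    return evex_w ? 8 : 4;   // one element is loaded and broadcast
  }
  return vector_bytes;       // the full vector is loaded
}

// Hypothetical helper: returns the compressed disp8 when 'disp' is a
// multiple of n and the quotient fits in a signed byte; otherwise returns
// an empty optional, meaning a 4-byte displacement is required.
inline std::optional<int8_t> compress_disp8(int32_t disp, int n) {
  if (disp % n != 0) {
    return std::nullopt;     // not a multiple of the scale
  }
  int32_t quotient = disp / n;
  if (quotient < -128 || quotient > 127) {
    return std::nullopt;     // multiple of the scale, but too large for disp8
  }
  return static_cast<int8_t>(quotient);
}
```

With N = fv_scale(64, false, true) = 64, compress_disp8(0x40, N) yields 1, matching the second vpternlogq encoding in the PR description. compress_disp8(0x10203040, N) yields an empty optional because the quotient does not fit in a signed byte, so the first encoding keeps its 4-byte displacement.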
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/25021#discussion_r2076011064