RFR: 8360116: Add support for AVX10 floating point minmax instruction [v2]

Thu Jun 26 08:47:13 UTC 2025

On Wed, 25 Jun 2025 15:31:46 GMT, Manuel Hässig <mhaessig at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Update comments
>
> src/hotspot/cpu/x86/assembler_x86.hpp line 2752:
> 
>> 2750:   void eminmaxss(XMMRegister dst, XMMRegister nds, XMMRegister src, int imm8);
>> 2751:   void eminmaxsd(XMMRegister dst, XMMRegister nds, XMMRegister src, int imm8);
>> 2752:   void evminmaxph(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int imm8, int vector_len);
> 
> Is there a reason `evminmaxph` does not have a version where `src` has type `Address`?

Until now we did not have a matcher pattern that would consume it, since the existing MIN/MAX sequence was a bulky one anyway. Along with this patch, I have added a new pattern for the memory-operand flavor, specifically for AVX10.

The patch has been regression-tested with the following tests using Intel SDE (version 9.53), https://www.intel.com/content/www/us/en/download/684897/intel-software-development-emulator.html:

- test/jdk/jdk/incubator/vector/Double*VectorTests:: (min/max all variants including reduction)
- test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java
- test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java

Example command line: /home/jatinbha/softwares/sde-external-9.53.0-2025-03-16-lin/sde64 -future -ptr_raise -icount -- java <args>

> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1241:
> 
>> 1239: }
>> 1240: 
>> 1241: void C2_MacroAssembler::vminmax_fp(int opc, BasicType elem_bt, XMMRegister dst, KRegister mask,
> 
> Line 1122 mentions the differences between `vminps/vmaxps` and Java semantics. Perhaps a mention of the new instructions introduced in this PR might help people who are confused about the fact that `vminmax_fp` is overloaded.

Details on the instruction semantics can be found in section 11.2 of the AVX10 manual: https://www.intel.com/content/www/us/en/content-details/856721/intel-advanced-vector-extensions-10-2-intel-avx10-2-architecture-specification.html?wapkw=AVX10
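
For readers without the manual at hand, here is a rough scalar model of the gap between the legacy min instruction and Java semantics that the comment near line 1122 describes. This is only an illustrative sketch, not HotSpot code: `java_min`, `legacy_min` and `check_examples` are hypothetical names, and the legacy rule shown is the usual "return the second source if either input is NaN or both are zero" behavior of `minps`.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Reference model of Java's Math.min(float, float) (hypothetical helper).
inline float java_min(float a, float b) {
  if (std::isnan(a) || std::isnan(b)) {
    // Java: the result is NaN if either argument is NaN.
    return std::numeric_limits<float>::quiet_NaN();
  }
  if (a == 0.0f && b == 0.0f) {
    // -0.0f and +0.0f compare equal, so decide by sign bit: -0.0 is smaller.
    return std::signbit(a) ? a : b;
  }
  return a < b ? a : b;
}

// Model of the legacy per-lane min rule (hypothetical helper): the second
// source is returned when either input is NaN or when both are zero.
inline float legacy_min(float src1, float src2) {
  return src1 < src2 ? src1 : src2;
}

// Cases where the two rules disagree.
inline void check_examples() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  assert(std::signbit(java_min(0.0f, -0.0f)));     // Java picks -0.0
  assert(!std::signbit(legacy_min(-0.0f, 0.0f)));  // legacy returns src2 = +0.0
  assert(std::isnan(java_min(1.0f, nan)));         // Java propagates NaN
  assert(legacy_min(nan, 1.0f) == 1.0f);           // legacy returns src2
}
```

These mismatches are what the older, bulkier sequence has to patch up around `vminps/vmaxps`.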

> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1246:
> 
>> 1244:          opc == Op_MaxV || opc == Op_MaxReductionV, "sanity");
>> 1245:   if (elem_bt == T_FLOAT) {
>> 1246:     evminmaxps(dst, mask, src1, src2, true, opc == Op_MinV || opc == Op_MinReductionV ? 0x4 : 0x5, vlen_enc);
> 
> Perhaps `0x4` and `0x5` should be factored into named constants since they are used in multiple places and it would also help readability if one does not have the documentation handy when reading the code.

Hi @mhaessig,
The command bits follow Tables 11.1 and 11.2 of section 11.2. Bits [1:0] select the operation kind: 00 for min and 01 for max. Bits [3:2] select the sign selection logic, and bit 4 is 0 for both min and max. So 0x4 (0b00100) encodes min and 0x5 (0b00101) encodes max, and with this command word we can emulate the semantics of Math.max/min using a single AVX10 instruction.
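
If named constants are preferred, the encoding could be spelled out along these lines. This is only a sketch: the names `MINMAX_OP_MIN`, `MINMAX_OP_MAX`, `MINMAX_SIGN_CTL` and `minmax_imm8` are hypothetical and not part of the patch, which currently passes the literals `0x4` and `0x5`.

```cpp
#include <cstdint>

// Hypothetical names for the imm8 command word fields (not in the patch).
constexpr uint8_t MINMAX_OP_MIN   = 0x0;       // imm8[1:0] = 00 -> min
constexpr uint8_t MINMAX_OP_MAX   = 0x1;       // imm8[1:0] = 01 -> max
constexpr uint8_t MINMAX_SIGN_CTL = 0x1 << 2;  // imm8[3:2] = 01, as in 0x4 / 0x5

// Builds the command word; bit 4 stays 0 for both min and max.
constexpr uint8_t minmax_imm8(bool is_min) {
  return static_cast<uint8_t>(MINMAX_SIGN_CTL |
                              (is_min ? MINMAX_OP_MIN : MINMAX_OP_MAX));
}

// The helper reproduces the literals used in vminmax_fp.
static_assert(minmax_imm8(true)  == 0x4, "min control word");
static_assert(minmax_imm8(false) == 0x5, "max control word");
```

A call site would then read something like `minmax_imm8(opc == Op_MinV || opc == Op_MinReductionV)` in place of the ternary on the raw literals.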

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2168533731
PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2168533872
PR Review Comment: https://git.openjdk.org/jdk/pull/25914#discussion_r2168533554