RFR: 8338021: Support saturating vector operators in VectorAPI [v4]

Tue Aug 27 18:30:08 UTC 2024

On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Hi All,
>> 
>> As per the discussion on the panama-dev mailing list [1], this patch adds support for the following new vector operators.
>> 
>> 
>>      . SUADD   : Saturating unsigned addition.
>>      . SADD    : Saturating signed addition.
>>      . SUSUB   : Saturating unsigned subtraction.
>>      . SSUB    : Saturating signed subtraction.
>>      . UMAX    : Unsigned max.
>>      . UMIN    : Unsigned min.
>>      
>> 
>> The new vector operators apply only to integral types, since integral values wrap around on overflow or underflow after setting the appropriate status flags. For floating-point types, the IEEE 754 specification defines multiple schemes to handle underflow; one of them is gradual underflow, which transitions the value into the subnormal range. Similarly, overflow implicitly saturates a floating-point value to infinity.
>> 
>> As the name suggests, these are saturating operations, i.e. the result of the computation is strictly capped by lower and upper bounds of the result type and is not wrapped around in underflowing or overflowing scenarios.
>> 
>> Summary of changes:
>> - Java side implementation of new vector operators.
>> - New scalar saturating APIs, one for each of the above saturating vector operators, in the corresponding primitive box classes; the fallback implementation of the vector operators is built on top of them.
>> - C2 compiler IR and inline expander changes.
>> - Optimized x86 backend implementation for new vector operators and their predicated counterparts.
>> - Extension of the existing Vector API jtreg test suite to cover the new operations.
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> PS: Intrinsification and auto-vectorization of new core-lib API will be addressed separately in a follow-up patch.
>> 
>> [1] https://mail.openjdk.org/pipermail/panama-dev/2024-May/020408.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Review comments resolutions.
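
For readers following the thread, the saturating semantics described in the quoted summary can be sketched in scalar Java. This is only an illustration for the `int` lane type: the class and method names below are hypothetical and are not the scalar APIs added by the patch, and the patch's own fallback implementation may be written differently.

```java
// Hypothetical illustration; names are NOT the APIs introduced by the patch.
final class SaturatingSketch {
    private SaturatingSketch() {}

    // Signed saturating add: compute in 64 bits, then clamp to the int range.
    public static int saturatingAdd(int a, int b) {
        long r = (long) a + (long) b;
        if (r > Integer.MAX_VALUE) return Integer.MAX_VALUE;
        if (r < Integer.MIN_VALUE) return Integer.MIN_VALUE;
        return (int) r;
    }

    // Signed saturating subtract: same clamping as above.
    public static int saturatingSub(int a, int b) {
        long r = (long) a - (long) b;
        if (r > Integer.MAX_VALUE) return Integer.MAX_VALUE;
        if (r < Integer.MIN_VALUE) return Integer.MIN_VALUE;
        return (int) r;
    }

    // Unsigned saturating add: the wrapped sum is (unsigned) smaller than an
    // operand exactly when the addition overflowed; clamp to 0xFFFFFFFF then.
    public static int saturatingUnsignedAdd(int a, int b) {
        int r = a + b;
        return Integer.compareUnsigned(r, a) < 0 ? -1 : r;
    }

    // Unsigned saturating subtract: clamp to 0 when b is (unsigned) larger.
    public static int saturatingUnsignedSub(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0 ? 0 : a - b;
    }

    public static void main(String[] args) {
        // Plain addition wraps around; the saturating form is capped.
        System.out.println(Integer.MAX_VALUE + 1);                 // -2147483648
        System.out.println(saturatingAdd(Integer.MAX_VALUE, 1));   // 2147483647
        System.out.println(Integer.toUnsignedString(
                saturatingUnsignedAdd(-1, 1)));                    // 4294967295
        System.out.println(saturatingUnsignedSub(1, 2));           // 0
    }
}
```

Widening to `long` keeps the signed cases simple for `int` lanes; a `long` lane type would need an overflow check that does not rely on a wider type.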

src/hotspot/cpu/x86/x86.ad line 1773:

> 1771:         return false;
> 1772:       }
> 1773:       if (bt == T_LONG && !VM_Version::supports_avx512vl()) {

We should be able to support bt == T_LONG for 512-bit vectors irrespective of avx512vl.

src/hotspot/cpu/x86/x86.ad line 1953:

> 1951:        if (UseAVX < 1 || size_in_bits < 128 || (size_in_bits == 512 && !VM_Version::supports_avx512bw())) {
> 1952:          return false;
> 1953:        }

UseAVX < 1 could be written as UseAVX == 0. Could we not support the register version for size_in_bits < 128?

src/hotspot/cpu/x86/x86.ad line 1962:

> 1960:         return false; // Implementation limitation
> 1961:       }
> 1962:       break;

Could we not support the register version for size_in_bits < 128?

src/hotspot/cpu/x86/x86.ad line 2143:

> 2141:       if (is_subword_type(bt) && !VM_Version::supports_avx512bw()) {
> 2142:         return false; // Implementation limitation
> 2143:       }

UMinV and UMaxV are supported on AVX1 and AVX2 platforms.
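
For reference, the per-lane result expected from an unsigned min/max can be sketched in scalar Java using `Integer.compareUnsigned`. The class and method names here are hypothetical and only illustrate the semantics for `int` lanes; they are not taken from the patch.

```java
// Hypothetical illustration of unsigned min/max lane semantics.
final class UnsignedMinMaxSketch {
    private UnsignedMinMaxSketch() {}

    // Unsigned minimum: compare the operands as 32-bit unsigned values.
    public static int umin(int a, int b) {
        return Integer.compareUnsigned(a, b) <= 0 ? a : b;
    }

    // Unsigned maximum: compare the operands as 32-bit unsigned values.
    public static int umax(int a, int b) {
        return Integer.compareUnsigned(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        // -1 is 0xFFFFFFFF, the largest unsigned 32-bit value.
        System.out.println(Math.min(-1, 1));   // -1 (signed comparison)
        System.out.println(umin(-1, 1));       // 1  (unsigned comparison)
        System.out.println(Integer.toUnsignedString(umax(-1, 1))); // 4294967295
    }
}
```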

src/hotspot/cpu/x86/x86.ad line 2155:

> 2153:         return false; // Implementation limitation
> 2154:       }
> 2155:       return true;

Byte/short saturating vector add is supported on AVX1 and AVX2 platforms. Could we not support the register version for size_in_bits < 128?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1733330892
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1733333203
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1733333608
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1733336005
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1733338300