RFR: 8338021: Support new unsigned and saturating vector operators in VectorAPI [v30]

Mon Oct 21 09:14:34 UTC 2024

On Fri, 18 Oct 2024 14:56:29 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Hi All,
>> 
>> As per the discussion on panama-dev mailing list[1], patch adds the support for following new vector operators.
>> 
>> 
>>      . SUADD   : Saturating unsigned addition.
>>      . SADD    : Saturating signed addition. 
>>      . SUSUB   : Saturating unsigned subtraction.
>>      . SSUB    : Saturating signed subtraction.
>>      . UMAX    : Unsigned max
>>      . UMIN    : Unsigned min.
>>      
>> 
>> New vector operators are applicable to only integral types since their values wraparound in over/underflowing scenarios after setting appropriate status flags. For floating point types, as per IEEE 754 specs there are multiple schemes to handler underflow, one of them is gradual underflow which transitions the value to subnormal range. Similarly, overflow implicitly saturates the floating-point value to an Infinite value.
>> 
>> As the name suggests, these are saturating operations, i.e. the result of the computation is strictly capped by lower and upper bounds of the result type and is not wrapped around in underflowing or overflowing scenarios.
>> 
>> Summary of changes:
>> - Java side implementation of new vector operators.
>> - Add new scalar saturating APIs for each of the above saturating vector operator in corresponding primitive box classes, fallback implementation of vector operators is based over it.
>> - C2 compiler IR and inline expander changes.
>> - Optimized x86 backend implementation for new vector operators and their predicated counterparts.
>> - Extends existing VectorAPI Jtreg test suite to cover new operations.
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> PS: Intrinsification and auto-vectorization of new core-lib API will be addressed separately in a follow-up patch.
>> 
>> [1] https://mail.openjdk.org/pipermail/panama-dev/2024-May/020408.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Replacing flag based checks with CPU feature checks in IR validation test.

Wow this is really a very moving target - quite frustrating to review - it takes up way too much of the reviewers bandwidth. You really need to split up your PRs as much as possible so that review is easier and faster.

I think these optimizations should be done in a separate PR. I see no reason why they need to be mixed in.

https://github.com/openjdk/jdk/commit/c56508899b000b8b1eb6755c901798a2a3685ef5 The `UMinVNode::Ideal` etc changes with IR rules.

I also cannot easily review just such a diff, it does not let me make comments - so I still have to go find the relevant code in the whole PR.

Some comments on this section:

Node* UMinVNode::Ideal(PhaseGVN* phase, bool can_reshape) {
  bool match1 = in(1)->Opcode() == Op_UMinV || in(1)->Opcode() == Op_UMaxV;
  bool match2 = in(2)->Opcode() == Op_UMinV || in(2)->Opcode() == Op_UMaxV;
  // UMin (UMin(a, b), UMax(a, b))  => UMin(a, b)
  // UMin (UMin(a, b), UMax(b, a))  => UMin(a, b)
  if (match1 && match2) {
    if ((in(1)->in(1) == in(2)->in(1) && in(1)->in(2) == in(2)->in(2)) ||
         (in(1)->in(2) == in(2)->in(1) && in(1)->in(1) == in(2)->in(2))) {
      return new UMinVNode(in(1)->in(1), in(1)->in(2), vect_type());
    }
  }
  return nullptr;
}

What about the reverse case `min(max, min)`? And are we sure we do not need to verify any types in all of these cases? Maybe not - but I'd rather be super sure - not that things get misinterpreted and then folded the wrong way.

I mean if I now approve only that diff, then I still need to approve the whole PR, which means I need to spend a lot of time on everything again. Otherwise, in theory people could smuggle anything in.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20507#issuecomment-2426080652