[vectorIntrinsics+mask] RFR: 8264563: Add masked vector intrinsics for binary/store operations [v5]

Vladimir Ivanov vlivanov at openjdk.java.net
Wed Apr 21 10:58:48 UTC 2021


On Thu, 15 Apr 2021 16:22:45 GMT, Paul Sandoz <psandoz at openjdk.org> wrote:

>>> The changes to Java code generally look ok. @iwanowww will know more than I with regards to how this all behaves with C2 e.g. using `null` or some other sentinel, such as a constant mask with all bits set?
>> 
>> Hi @PaulSandoz , thanks for the comments! Yes, we could also use an `allTrue` constant mask for the non-mask APIs. That seems reasonable; the only drawback is that the compiler needs to generate more code to check whether the mask is all true or not. For performance reasons, I did not take this approach. I hope @iwanowww can give more advice on this. Thanks!
>> 
>>> There is an awkward set of differences between non-mask and mask for ROR etc., although most of that would go away if support is in C2, possibly leaving subtle differences for DIV. I imagine adjustment of other templates will be similar.
>> 
>> Yes, the adjustment of other templates will be similar.
>> 
>>> I am tempted to have a template method for both non-mask and mask. Thus specialized code (the part that runs before calling the intrinsic) may be somewhat duplicated before calling `binaryMaskOp`, using a constant for the op -> lambda function, and passing in null or the mask value.
>> 
>> Yeah, that's why I added the `lanewise0` and `lanewise0Template` methods.
>
> I am suggesting separating out the templates for non-masked and masked, similar to the current pattern. That would result in some duplication for the special cases, but I think it makes the code easier to reason about in one place, rather than being somewhat spread out. Perhaps this is easier to explain as code. We should just iterate on the branch.

It's better to ask @PaulSandoz how best to shape the implementation code on the JDK side.

IMO having an explicit null check (e.g., `Objects.requireNonNull()`) as part of argument validation in all public methods for masked operations should be a bare minimum. We already rely on the ability to propagate all the important information from the call site down to the intrinsic call.
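
A minimal sketch of that validation pattern, under loud assumptions: plain `int[]` and `boolean[]` arrays stand in for the real vector and mask types so it runs on any JDK, the class name `MaskedAddSketch` is hypothetical, and `binaryMaskOp` only borrows its name from the discussion above (its signature here is invented). It is not the code in the PR. The masked public method rejects `null` up front, so inside the implementation `null` can only ever mean "unmasked":

```java
import java.util.Objects;

// Hypothetical stand-in for the pattern discussed in this thread.
// Plain arrays replace the real vector and mask types.
final class MaskedAddSketch {

    // Public non-masked entry point: passes null as the "no mask" sentinel.
    static int[] add(int[] a, int[] b) {
        return binaryMaskOp(a, b, null);
    }

    // Public masked entry point: validate the mask before delegating, so a
    // caller-supplied null is never confused with the internal sentinel.
    static int[] add(int[] a, int[] b, boolean[] mask) {
        Objects.requireNonNull(mask, "mask");
        return binaryMaskOp(a, b, mask);
    }

    // Shared helper standing in for the intrinsic call (assumed signature).
    // Lanes whose mask bit is unset keep the value of the first operand.
    private static int[] binaryMaskOp(int[] a, int[] b, boolean[] maskOrNull) {
        if (a.length != b.length
                || (maskOrNull != null && maskOrNull.length != a.length)) {
            throw new IllegalArgumentException("length mismatch");
        }
        int[] result = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            boolean active = (maskOrNull == null) || maskOrNull[i];
            result[i] = active ? a[i] + b[i] : a[i];
        }
        return result;
    }
}
```

With this shape, `MaskedAddSketch.add(a, b, (boolean[]) null)` fails fast with a `NullPointerException` at the public boundary instead of silently behaving like the non-masked overload.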

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/57


More information about the panama-dev mailing list