[vectorIntrinsics+mask] RFR: 8266287: Basic mask IR implementation for the Vector API masking feature support [v2]

Fri Jun 18 08:21:02 UTC 2021

On Fri, 14 May 2021 16:20:02 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits:
>> 
>>  - Use "Flag_is_predicated_vector" for masked vector nodes
>>  - Merge branch 'vectorIntrinsics+mask'
>>  - 8266287: Basic mask IR implementation for the Vector API masking feature support
>
> Can you elaborate, please, what's the purpose of new nodes? 
> 
> There's some duplication with existing vector nodes and I'd like to understand how do you intend to merge them.
> 
> `LoadVectorMask`/`StoreVectorMask` duplicate `VectorLoadMask (LoadVector)`/`StoreVector (VectorStoreMask)`. What kind of benefits do you expect from exposing the operation as a single node?
> 
> Depending on how `MaskToVector`/`VectorToMask` are specified, they can duplicate  `VectorStoreMask`/`VectorLoadMask`.
> 
> If mask value always has a vector type,`AndVMask`/`OrVMask`/`XorVMask` can be replaced by `AndV`/`OrV`/`XorV` and special implementations for 2 representations (canonical and native). Same considerations apply to `VectorCmpMaskGen` (compared to `VectorMaskCmp`).
> 
> What is left is `MaskAll`, but its purpose is not evident to me. Broadcast, but for masks?

Hi @iwanowww @jatin-bhateja ,

Regarding the mask IRs for SVE/AVX-512, I'm fine with reusing most of the existing IRs, both for SVE/AVX-512 and for other platforms. I agree that it can work well by defining different types for different platforms (i.e. `TypeVectMask` for SVE/AVX-512 and `TypeVect` for others). The benefit is that no additional IRs are needed, and the compiler can select the right register for a vector mask according to its bottom type. So some mask-related IRs (`VectorLoadMask`, `VectorMaskCmp`, `VectorMaskCast`) can be reused.

However, to separate the vector and vector mask operations for SVE/AVX-512, we think it's better to define new mask IRs for nodes that could otherwise be either vector or vector mask operations (`AndV`/`OrV`/`XorV`/`Replicate`). So the following new IRs will be added for masks:

AndVMask, OrVMask, XorVMask, MaskAll

These IRs can also be used for other platforms that do not support predicate features.
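To make the intended semantics of these four nodes concrete, here is a minimal sketch in plain Java. It is only a model, not HotSpot code: the class `MaskSketch`, its method names, and the choice to pack a mask of up to 64 lanes into one `long` (loosely resembling a predicate register) are all assumptions made for illustration.

```java
// Hypothetical model of the proposed mask nodes. One bit per lane,
// packed into a long; this is an illustrative assumption, not the
// representation used by the compiler.
final class MaskSketch {

    // MaskAll: broadcast a single boolean to the low `lanes` bits (1..64).
    static long maskAll(boolean value, int lanes) {
        if (!value) {
            return 0L;
        }
        // A shift by 64 would wrap around in Java, so handle it separately.
        return lanes == 64 ? -1L : (1L << lanes) - 1;
    }

    // AndVMask / OrVMask / XorVMask: lane-wise logical operations on masks.
    static long andVMask(long a, long b) { return a & b; }
    static long orVMask(long a, long b)  { return a | b; }
    static long xorVMask(long a, long b) { return a ^ b; }

    public static void main(String[] args) {
        long all = maskAll(true, 8);   // all 8 lanes set
        long m = 0b1010_0101L;         // an arbitrary 8-lane mask
        // XOR with an all-true mask inverts the mask within the 8 lanes.
        System.out.println(Long.toBinaryString(xorVMask(all, m)));
    }
}
```

On a platform without predicate registers the same operations would presumably act on a vector whose lanes are all-ones or all-zeros instead of on packed bits.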

Furthermore, considering the performance of vector mask loading/storing with SVE, the following optimizations are also needed:

   fromArray:  LoadVector + VectorLoadMask     ==>   LoadVectorMask    (load + type extending)
   intoArray:  VectorStoreMask + StoreVector   ==>   StoreVectorMask   (type narrowing + store)
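The two fusions above can be sketched in plain Java as follows. This is a simplified model under stated assumptions, not the actual IR: the class `MaskLoadSketch` and its methods are hypothetical, the mask is modelled for `int` lanes, and a set lane is assumed to be represented as `-1` (all bits set) while the boolean array holds one 0/1 byte per lane.

```java
// Hypothetical model of the load/store fusions. Lane type is int;
// a true lane is -1 and a false lane is 0 (an assumption of this sketch).
final class MaskLoadSketch {

    // LoadVector: read `lanes` booleans as a vector of 0/1 bytes.
    static byte[] loadVector(boolean[] src, int offset, int lanes) {
        byte[] v = new byte[lanes];
        for (int i = 0; i < lanes; i++) {
            v[i] = (byte) (src[offset + i] ? 1 : 0);
        }
        return v;
    }

    // VectorLoadMask: extend each 0/1 byte to a lane-wide 0/-1 value.
    static int[] vectorLoadMask(byte[] v) {
        int[] mask = new int[v.length];
        for (int i = 0; i < v.length; i++) {
            mask[i] = v[i] != 0 ? -1 : 0;
        }
        return mask;
    }

    // LoadVectorMask: the fused form, load and type extension in one pass.
    static int[] loadVectorMask(boolean[] src, int offset, int lanes) {
        int[] mask = new int[lanes];
        for (int i = 0; i < lanes; i++) {
            mask[i] = src[offset + i] ? -1 : 0;
        }
        return mask;
    }

    // StoreVectorMask: the fused form, type narrowing and store in one pass.
    static void storeVectorMask(int[] mask, boolean[] dst, int offset) {
        for (int i = 0; i < mask.length; i++) {
            dst[offset + i] = mask[i] != 0;
        }
    }

    public static void main(String[] args) {
        boolean[] bits = {true, false, true, true};
        int[] twoStep = vectorLoadMask(loadVector(bits, 0, 4));
        int[] fused = loadVectorMask(bits, 0, 4);
        System.out.println(java.util.Arrays.equals(twoStep, fused));
    }
}
```

The fused methods produce the same result as the two-step path without materializing the intermediate byte vector, which is the saving the single nodes are meant to expose.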

Does this solution make sense to you? Any feedback is helpful and welcome, and I hope we can reach a conclusion on the mask IR definition part. Thanks so much!

Best Regards,
Xiaohong

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/78