[vectorIntrinsics] RFR: RFC: Vector API masking support proposal for Arm SVE
Xiaohong Gong
xgong at openjdk.java.net
Wed Mar 10 03:58:13 UTC 2021
On Fri, 5 Mar 2021 03:12:07 GMT, Jie Fu <jiefu at openjdk.org> wrote:
>>> Good job!
>>> Thanks @XiaohongGong and @nsjian .
>>>
>>> Just wondering: is there any performance number before and after this enhancement?
>>
>> Thanks for looking at this PR, @DamonFool. So far we have only run a small part of the JMH benchmarks (`ADDMasked`) internally. Because the existing cases are simple and each focuses on a single API operation, performance does improve, but the gain is modest (about 6% to 13% for integer types). Masking support is also still missing for many operations, so we think it's too early to run all the benchmarks.
>>
>>> There seems to be some code duplication between inline_vector_nary_operation and inline_vector_nary_mask_operation.
>>> Is it possible to support mask operations in inline_vector_nary_operation(int n)?
>>
>> This is a temporary state, since we have only tried masked binary operations with this solution so far. So yes, these two functions can be merged into one, and we will do that once unary/ternary masked operations are supported. Thanks!
>>
>>>
>>> Thanks.
>
>> Because the existing cases are simple and each focuses on a single API operation, performance does improve, but the gain is modest (about 6% to 13% for integer types). Masking support is also still missing for many operations, so we think it's too early to run all the benchmarks.
>
> Good to know that.
> Thanks.
Hi @sviswa7 @jatin-bhateja ,
Could you please take a look at this PR, which contains the basic masking support? Although the changes focus on the Arm SVE platform, we expect the ideas to apply to AVX-512 as well, especially in the shared code. Any comments from the AVX-512 side would be very helpful, and it would be great to arrive at a final solution for masking support that works well for both Arm SVE and AVX-512. Thanks so much!
BTW, this PR contains only part of the masking support code. We will continue working on the missing parts, including other masked vector operations, mask generation operations, and the mask operations. We are currently looking at deoptimization support for predicate registers. To avoid conflicts, it would be helpful if you could also share any ideas or plans from your side. Thanks so much!
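For readers less familiar with masking, here is a minimal scalar sketch of what a masked lanewise add computes, which is the kind of operation the `ADDMasked` benchmark exercises. It is only a model of the semantics, not the PR's implementation and not the Vector API itself: the class name `MaskedAddSketch` and the method `maskedAdd` are hypothetical, and a plain `boolean[]` stands in for a vector mask. With predicate support, the hardware can apply the mask inside the add instruction rather than doing an unmasked add followed by a blend.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the JDK or of this PR.
public class MaskedAddSketch {

    // Scalar model of a masked lanewise add:
    // lane i receives a[i] + b[i] where mask[i] is set,
    // and keeps a[i] unchanged where mask[i] is unset.
    static int[] maskedAdd(int[] a, int[] b, boolean[] mask) {
        int[] result = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = mask[i] ? a[i] + b[i] : a[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        boolean[] mask = {true, false, true, false};
        // Lanes 0 and 2 are added; lanes 1 and 3 keep the value from a.
        System.out.println(Arrays.toString(maskedAdd(a, b, mask)));
    }
}
```

Running `main` prints `[11, 2, 33, 4]`.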
Best Regards,
Xiaohong Gong
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/40