[vectorIntrinsics+mask] RFR: 8272100: VectorAPI: modify existing implementation of masked neg and not operation.

Mon Aug 9 11:37:43 UTC 2021

On Mon, 9 Aug 2021 07:12:33 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>> - Masked NEG/NOT operations can be implemented using existing masked operations instead of emitting an explicit vector blend after the operation. 
>> - The changes show around a 1.3-1.5X gain when merged with the X86 backend changes for masking operations: https://github.com/openjdk/panama-vector/pull/99
>
> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java line 596:
> 
>> 594:                 return lanewise(XOR, broadcast(-1), m);
>> 595:             } else if (op == NEG) {
>> 596:                 return lanewise(NOT, m).lanewise(ADD, broadcast(1), m);
> 
> Why not directly intrinsify `NEG` as a unary op and generate the "NegVI" node in the compiler? SVE supports the masked neg instruction. How about x86 AVX-512?

For targets that do not support a direct vector NEG, the logic will need to be decomposed into vector NOT + ADD.
Another approach was to emit a vector multiplication by -1, but that would be costly for sub-word types (byte/word). I think we can take either approach: keep this Java change as it is, and add a compiler transform that folds the logic into NegVI during XOR idealization, since it will currently be specific to one target (SVE).
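
To illustrate why the NOT + ADD decomposition is safe under a mask, here is a minimal scalar sketch in plain Java. It does not use the Vector API; the class `MaskedNegSketch` and its helper methods are hypothetical names made up for this example. It emulates masked NOT followed by masked ADD of 1 on byte lanes and compares the result with negating every lane and then blending with the original under the mask.

```java
import java.util.Arrays;

public class MaskedNegSketch {
    // Masked NOT: flip the bits only in lanes where the mask is set;
    // unset lanes pass through unchanged.
    static byte[] maskedNot(byte[] v, boolean[] m) {
        byte[] r = new byte[v.length];
        for (int i = 0; i < v.length; i++) {
            r[i] = m[i] ? (byte) ~v[i] : v[i];
        }
        return r;
    }

    // Masked ADD of 1: increment only in lanes where the mask is set.
    static byte[] maskedAddOne(byte[] v, boolean[] m) {
        byte[] r = new byte[v.length];
        for (int i = 0; i < v.length; i++) {
            r[i] = m[i] ? (byte) (v[i] + 1) : v[i];
        }
        return r;
    }

    // Reference behaviour: negate every lane, then blend with the
    // original value in lanes where the mask is unset.
    static byte[] negThenBlend(byte[] v, boolean[] m) {
        byte[] r = new byte[v.length];
        for (int i = 0; i < v.length; i++) {
            byte negated = (byte) -v[i];
            r[i] = m[i] ? negated : v[i];
        }
        return r;
    }

    public static void main(String[] args) {
        byte[] v = {0, 1, -1, 127, -128, 42, -42, 5};
        boolean[] m = {true, false, true, true, true, false, true, false};

        byte[] viaNotAdd = maskedAddOne(maskedNot(v, m), m);
        byte[] viaBlend = negThenBlend(v, m);

        // In two's complement, -x == ~x + 1 for every lane value,
        // including the wrap-around case -128.
        System.out.println(Arrays.equals(viaNotAdd, viaBlend)); // prints true
    }
}
```

Because both steps use the same mask, unset lanes are never touched, so no separate blend is needed after the operation.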

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/108