[vectorIntrinsics] RFR: RFC: Vector API masking support proposal for Arm SVE [v3]
Jatin Bhateja
jbhateja at openjdk.java.net
Thu Mar 18 11:54:51 UTC 2021
On Fri, 12 Mar 2021 09:47:33 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
>> Please help to review this proposal for Vector API masking support.
>>
>> This is the masking part of https://bugs.openjdk.java.net/browse/JDK-8261663 - the second incubator for JEP 338 Vector API. As the JEP describes, the masked vector operations are currently implemented by composing the non-masked operation with a blend operation. This can be improved by using the hardware mask feature on supported architectures such as Arm SVE and X86 AVX-512. So here are the proposals for Arm SVE. We assume the ideas could also be applied to X86 AVX-512.
>>
>> To support the masking feature, this PR adds the following:
>> - SVE predicate register allocation
>> - Mask type and basic mask IR definition
>> - Mask implementation for masked vector store
>> - Mask implementation for masked binary operations
>>
>> For the masked binary operations, we have created two proposals for discussion:
>> - By mainly changing the C2 compiler
>> - By improving the Vector API Java implementation together with simpler C2 compiler changes
>>
>> This PR shows the second solution since we think this solution is better. But we also have a prototype for the first solution. Please see: https://github.com/XiaohongGong/panama-vector/commit/372489feeae06bc53c46709d389cb0e46e9fb4f6 . The basic support changes are shared with this PR.
>>
>> This PR doesn't contain all the masking support changes. There are still many missing parts that we will continue working on, including:
>> - Mask support for other operations (unary, ternary, reduction, load, etc.)
>> - More mask IR implementations (maskAll, toVector, allTrue, anyTrue, trueCount, eq, etc.)
>> - Better solution for vector mask load/store (the memory type is boolean)
>> - Vector boxing/unboxing support for mask type (deoptimization?)
>> - Tail loop elimination?
>>
>> It is worth mentioning that this PR mainly provides the proposals for SVE masking support, and any suggestions and discussions are welcome! Thanks a lot!
>
> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision:
>
> Add mask support for masked binary operations
src/hotspot/share/opto/vectorIntrinsics.cpp line 583:
> 581: const TypeVMask* vmask_type = TypeVMask::make(elem_bt, num_elem);
> 582: mask = gvn().transform(new VectorToMaskNode(mask, vmask_type));
> 583: operation->add_req(mask);
The following is a link to a reference implementation (PoC) for X86:
http://cr.openjdk.java.net/~jbhateja/avx512_masked_operation_optimization/webrev.02/src/hotspot/share/opto/vectornode.cpp.udiff.html
This demonstrates the creation of a new masked operation node that folds blend + vector operation graph patterns. There is one new Ideal node for each kind of operation, i.e. ternary, binary and unary. The new node also carries metadata that specifies the kind of masked operation being performed.
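
To make the folding idea concrete, here is a minimal, self-contained Java sketch of the pattern. It is only a toy model and not the code in the webrev: the names Input, Binary, Blend, MaskedBinary, Op and fold are all hypothetical, and none of them are HotSpot classes. It assumes the blend semantics in which lanes with the mask set take the operation result and all other lanes keep the first operand.

```java
import java.util.Arrays;

public class MaskedFoldSketch {
    // Hypothetical operation kinds; stands in for the metadata carried by the masked node.
    enum Op { ADD, MUL }

    // Toy expression-graph node; eval() returns the lane values.
    interface Node { int[] eval(); }

    static int apply(Op op, int x, int y) {
        return op == Op.ADD ? x + y : x * y;
    }

    static final class Input implements Node {
        final int[] lanes;
        Input(int... lanes) { this.lanes = lanes; }
        public int[] eval() { return lanes.clone(); }
    }

    // Non-masked lanewise binary operation.
    static final class Binary implements Node {
        final Op op; final Node a; final Node b;
        Binary(Op op, Node a, Node b) { this.op = op; this.a = a; this.b = b; }
        public int[] eval() {
            int[] x = a.eval(), y = b.eval(), r = new int[x.length];
            for (int i = 0; i < r.length; i++) r[i] = apply(op, x[i], y[i]);
            return r;
        }
    }

    // Blend: take the lane from 'result' where the mask is set, else from 'src'.
    static final class Blend implements Node {
        final Node src; final Node result; final boolean[] mask;
        Blend(Node src, Node result, boolean[] mask) { this.src = src; this.result = result; this.mask = mask; }
        public int[] eval() {
            int[] s = src.eval(), t = result.eval(), r = new int[s.length];
            for (int i = 0; i < r.length; i++) r[i] = mask[i] ? t[i] : s[i];
            return r;
        }
    }

    // Single masked node; 'op' records which kind of operation it performs.
    static final class MaskedBinary implements Node {
        final Op op; final Node a; final Node b; final boolean[] mask;
        MaskedBinary(Op op, Node a, Node b, boolean[] mask) { this.op = op; this.a = a; this.b = b; this.mask = mask; }
        public int[] eval() {
            int[] x = a.eval(), y = b.eval(), r = new int[x.length];
            for (int i = 0; i < r.length; i++) r[i] = mask[i] ? apply(op, x[i], y[i]) : x[i];
            return r;
        }
    }

    // Fold Blend(a, Binary(op, a, b), m) into MaskedBinary(op, a, b, m); leave other graphs unchanged.
    static Node fold(Node n) {
        if (n instanceof Blend) {
            Blend bl = (Blend) n;
            if (bl.result instanceof Binary) {
                Binary bin = (Binary) bl.result;
                if (bin.a == bl.src) {
                    return new MaskedBinary(bin.op, bin.a, bin.b, bl.mask);
                }
            }
        }
        return n;
    }

    public static void main(String[] args) {
        Input a = new Input(1, 2, 3, 4);
        Input b = new Input(10, 20, 30, 40);
        boolean[] m = {true, false, true, false};
        Node composed = new Blend(a, new Binary(Op.ADD, a, b), m);
        Node folded = fold(composed);
        System.out.println(folded.getClass().getSimpleName());
        System.out.println(Arrays.toString(composed.eval()));
        System.out.println(Arrays.toString(folded.eval()));
    }
}
```

In this sketch the folded node evaluates to the same lanes as the blend + operation pair, so the rewrite does not change the result; a backend with hardware masking could then match the single masked node directly.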
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/40