[vectorIntrinsics+mask] RFR: 8272971: Intrinsification of VectorMask.cast operation for all compatible vector species [v2]

Jatin Bhateja jbhateja at openjdk.java.net
Mon Aug 30 06:27:19 UTC 2021


On Mon, 30 Aug 2021 03:51:45 GMT, Ningsheng Jian <njian at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   8272971: Optimizing IR for mask-casting over non-predicated targets.
>
> src/hotspot/share/opto/vectorIntrinsics.cpp line 2320:
> 
>> 2318:   if (is_mask && (type2aelembytes(elem_bt_from) != type2aelembytes(elem_bt_to))) {
>> 2319:     return false; // elem size mismatch
>> 2320:   }
> 
> Since Arm SVE has a different mask representation than AVX-512 for different types, I think this change will break the existing assertion in the VectorMaskCastNode constructor (https://github.com/openjdk/panama-vector/blob/vectorIntrinsics%2Bmask/src/hotspot/share/opto/vectornode.hpp#L1427)?

Hi @nsjian,
The optimization handles cast operations between all compatible mask species, i.e. species with the same vector length.
We see around a 2X speedup on AVX-512. The IR fragment created for non-predicated targets uses the existing VectorCast[B/S/I/L] nodes, with VectorMaskCast inserted where appropriate for floating-point types. Since these nodes operate on regular vector operands, this should be OK for AArch64 NEON.
Do you plan to extend the SVE backend to handle VectorMaskCast operations for the various compatible species?
Alternatively, as an interim solution, I can add a target-specific matcher routine, Matcher::supports_predicate_reg_cast(), and rewrite the above check as follows.


  if (is_mask && dst_type->isa_vectmask() && src_type->isa_vectmask() && !Matcher::supports_predicate_reg_cast()) {
    return false; // target cannot cast between predicate registers
  }
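
To make the intent of this check easier to discuss, here is a minimal standalone sketch of the decision it encodes. It is not HotSpot code: MaskType, Target and can_intrinsify_mask_cast are hypothetical stand-ins, the in_predicate flag only models what isa_vectmask() reports, and the supports_predicate_reg_cast flag models the proposed matcher routine, which does not exist yet. The lane-count test is my reading of "compatible species" above and may be enforced elsewhere in the real intrinsic.

```cpp
// Hypothetical model of a mask type; not a HotSpot class.
struct MaskType {
  int  lanes;         // number of mask lanes (vector length)
  int  elem_bytes;    // size of the element type the mask belongs to
  bool in_predicate;  // models isa_vectmask(): mask lives in a predicate register
};

// Hypothetical model of a target; the flag stands in for the
// proposed Matcher::supports_predicate_reg_cast().
struct Target {
  bool supports_predicate_reg_cast;
};

// Returns true if this sketch would allow the mask cast to be intrinsified.
bool can_intrinsify_mask_cast(const MaskType& src, const MaskType& dst,
                              const Target& target) {
  if (src.lanes != dst.lanes) {
    return false; // species are not compatible: vector lengths differ
  }
  if (src.in_predicate && dst.in_predicate &&
      !target.supports_predicate_reg_cast) {
    return false; // target cannot cast between predicate registers
  }
  return true; // element sizes may differ; the cast is still allowed
}
```

Under this model, a target whose masks are regular vectors (in_predicate is false for both operands) always passes the second test, while a predicated target passes it only once its backend opts in through the flag.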

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/113

