[vectorIntrinsics+mask] RFR: 8270349: Initial X86 backend support for optimizing masking operations on AVX512 targets. [v3]

Jatin Bhateja jbhateja at openjdk.java.net
Thu Aug 12 18:55:50 UTC 2021


On Thu, 12 Aug 2021 17:48:40 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:
>> 
>>  - 8270349: Review comments resolution.
>>  - Merge branch 'vectorIntrinsics+mask' of http://github.com/openjdk/panama-vector into JDK-8270349
>>  - 8270349: Merge with latest vectorIntrinsics+mask tip + extend backend support for XorV,AndV,OrV and Compare masked operations.
>>  - 8270349: Fix for 32-bit build failure.
>>  - 8270349: Initial X86 backend support for optimizing masking operations on AVX512 targets.
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 3915:
> 
>> 3913:       evand(eType, dst, mask, src1, src2, merge, vlen_enc); break;
>> 3914:     case Op_AbsVD:
>> 3915:     case Op_MulVB:
> 
> Why are only AbsVD and MulVB specifically called out here?

Will remove them.

> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8212:
> 
>> 8210:     case T_BOOLEAN:
>> 8211:     case T_BYTE:
>> 8212:        kandbl(dst, src1, src2);
> 
> Is the type here the vector element type? If so, for a byte vector we need kandql, for short kanddl, for int kandwl, and for long kandbl.

No, it's not the vector element type. From the instruction side, the T_LONG basic type is always passed. TypeVectMask is a vector of booleans. We could use the vector length to emit shorter instructions, but in terms of efficiency it won't be any different from doing the operation over the entire opmask register.
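
To illustrate the point, here is a minimal standalone sketch (not HotSpot code). The helper names mask_lanes, kop_suffix, kand and live_bits are hypothetical and exist only for this example. It models an opmask register as a 64-bit integer and shows that a full-width AND produces the same live mask bits as a narrower one sized to the lane count.

```cpp
#include <cstdint>

// Hypothetical helper: number of mask bits (lanes) for a vector of
// vector_bits total width holding element_bits-wide elements.
constexpr int mask_lanes(int vector_bits, int element_bits) {
  return vector_bits / element_bits;
}

// Hypothetical helper: narrowest opmask instruction suffix (b/w/d/q)
// that covers the given number of lanes.
constexpr char kop_suffix(int lanes) {
  return lanes <= 8 ? 'b' : lanes <= 16 ? 'w' : lanes <= 32 ? 'd' : 'q';
}

// Models a k-register AND of the given width (8, 16, 32 or 64 bits).
// Result bits above the width are cleared.
constexpr uint64_t kand(uint64_t a, uint64_t b, int width) {
  const uint64_t keep = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
  return (a & b) & keep;
}

// Extracts the low `lanes` bits of a mask, i.e. the bits that matter
// for a vector with that many lanes.
constexpr uint64_t live_bits(uint64_t k, int lanes) {
  return (lanes == 64) ? k : (k & ((1ULL << lanes) - 1));
}
```

For a 512-bit int vector, mask_lanes(512, 32) is 16, so only the low 16 mask bits are live, and live_bits(kand(a, b, 64), 16) equals kand(a, b, 16) for any a and b.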

> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8236:
> 
>> 8234:     case T_BOOLEAN:
>> 8235:     case T_BYTE:
>> 8236:        korbl(dst, src1, src2);
> 
> Is the type here the vector element type? If so, for a byte vector we need korql, for short kordl, for int korwl, and for long korbl.

Same as above.

> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8260:
> 
>> 8258:     case T_BOOLEAN:
>> 8259:     case T_BYTE:
>> 8260:        kxorbl(dst, src1, src2);
> 
> Is the type here the vector element type? If so, for a byte vector we need kxorql, for short kxordl, for int kxorwl, and for long kxorbl.

Same as above.

> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8384:
> 
>> 8382:   switch(type) {
>> 8383:     case T_INT:
>> 8384:       Assembler::evpord(dst, mask, nds, src, merge, vector_len); break;
> 
> What about subword types here for or, xor, and?

These are skipped for now through match_rule_supported_vector_mask, since a vector blend operation may be more beneficial in that case than any special handling for sub-words.
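
As a rough scalar model of that fallback (not HotSpot code), the sketch below performs a full-width OR and then blends the result with the original destination under the mask. The names blend and masked_or_via_blend are hypothetical, and it assumes at most 64 lanes, so the mask fits in a uint64_t.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a vector blend: lane i comes from if_set
// when mask bit i is 1, otherwise from if_clear. Assumes N <= 64.
template <typename T, std::size_t N>
std::array<T, N> blend(const std::array<T, N>& if_clear,
                       const std::array<T, N>& if_set,
                       uint64_t mask) {
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    out[i] = ((mask >> i) & 1) ? if_set[i] : if_clear[i];
  }
  return out;
}

// Merge-masked OR expressed as an unmasked OR followed by a blend:
// selected lanes receive a | b, unselected lanes keep the value from dst.
template <typename T, std::size_t N>
std::array<T, N> masked_or_via_blend(const std::array<T, N>& dst,
                                     const std::array<T, N>& a,
                                     const std::array<T, N>& b,
                                     uint64_t mask) {
  std::array<T, N> full{};
  for (std::size_t i = 0; i < N; ++i) {
    full[i] = static_cast<T>(a[i] | b[i]);
  }
  return blend(dst, full, mask);
}
```

With byte lanes dst = {9, 9, 9, 9}, a = {1, 2, 4, 8}, b = {16, 32, 64, 128} and mask 0x5, lanes 0 and 2 take the OR result while lanes 1 and 3 keep the value from dst.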

> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8425:
> 
>> 8423: }
>> 8424: 
>> 8425: void MacroAssembler::evpperm(BasicType type, XMMRegister dst, KRegister mask, XMMRegister nds, Address src, bool merge, int vector_len) {
> 
> This should be moved next to the register variant of the evpperm.

Will make this change.

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/99

