[vectorIntrinsics] RFR: 8274569: X86 backend related incorrectness issues in legacy store mask patterns
Jie Fu
jiefu at openjdk.java.net
Fri Oct 1 00:39:15 UTC 2021
On Thu, 30 Sep 2021 19:34:43 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
> - Issue was seen while unit testing changes for masking related optimizations on vectorIntrinsics branch.
>
> - The following patterns result in incorrect store mask computation. Even though, contextually, the store mask is followed by a StoreVector that takes care of the various vector sizes, C2 still emits a VectorStoreMask node in the IR to populate a byte vector.
>
> - instruct storeMask2B
> - instruct storeMask4B
> - instruct storeMask8B
> - instruct storeMask8B_avx
>
> - Replicate with an immI operand correctly takes care of shorter vector lengths, so it can serve as the fallback case for the following pattern with an immediate -1 argument.
> - instruct ReplI_M1
>
> The patch also adds a test case which exhaustively covers load/store vector mask operations for different SPECIES.
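For context, a VectorStoreMask narrows an element-wise mask vector (each lane all-ones or all-zeros) into a byte vector holding a 0/1 flag per lane; that is the computation the storeMask2B/4B/8B patterns implement. A minimal plain-Java sketch of that semantics (class and method names are hypothetical, not HotSpot code):

```java
import java.util.Arrays;

public class StoreMaskSketch {
    // Hypothetical model of VectorStoreMask: each input lane is all-ones
    // (-1) or all-zeros; the result is one 0/1 byte per lane.
    static byte[] storeMask(long[] maskLanes) {
        byte[] out = new byte[maskLanes.length];
        for (int i = 0; i < maskLanes.length; i++) {
            out[i] = (byte) (maskLanes[i] == 0 ? 0 : 1);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] lanes = { -1, 0, -1, 0 };
        System.out.println(Arrays.toString(storeMask(lanes))); // [1, 0, 1, 0]
    }
}
```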
Do we still need `andq`?
void C2_MacroAssembler::vector_mask_operation(int opc, Register dst, XMMRegister mask, XMMRegister xtmp,
                                              XMMRegister xtmp1, Register tmp, int masklen, int vec_enc) {
  assert(VM_Version::supports_avx(), "");
  vpxor(xtmp, xtmp, xtmp, vec_enc);
  vpsubb(xtmp, xtmp, mask, vec_enc);
  vpmovmskb(tmp, xtmp, vec_enc);
  if (masklen < 64) {
    andq(tmp, (((jlong)1 << masklen) - 1)); // <---- Do we still need this?
  }
  switch (opc) {
    case Op_VectorMaskTrueCount:
      popcntq(dst, tmp);
      break;
    case Op_VectorMaskLastTrue:
      mov64(dst, -1);
      bsrq(tmp, tmp);
      cmov(Assembler::notZero, dst, tmp);
      break;
    case Op_VectorMaskFirstTrue:
      mov64(dst, masklen);
      bsfq(tmp, tmp);
      cmov(Assembler::notZero, dst, tmp);
      break;
    default: assert(false, "Unhandled mask operation");
  }
}
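For what it's worth, the `andq` looks like it is still needed whenever `masklen` is smaller than the register's byte count: `vpmovmskb` extracts one bit per byte of the whole register, so bits at positions `masklen` and above may reflect stale upper lanes of the mask rather than real lanes. A small plain-Java sketch (names hypothetical, just modeling the bit manipulation) of the true-count with and without the mask-off:

```java
public class MaskBits {
    // Model of the tmp register after vpmovmskb: the low `masklen` bits
    // are real lanes; higher bits may contain garbage from unused lanes.
    static long trueCount(long rawMask, int masklen) {
        long m = rawMask;
        if (masklen < 64) {
            m &= (1L << masklen) - 1;   // the andq in question
        }
        return Long.bitCount(m);        // popcntq
    }

    public static void main(String[] args) {
        // 8-lane mask, all lanes true, with stray garbage above bit 7.
        long raw = 0xFFL | (0xABL << 8);
        System.out.println(trueCount(raw, 8));    // 8: garbage masked off
        System.out.println(Long.bitCount(raw));   // 13: over-counts without it
    }
}
```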
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/139
More information about the panama-dev mailing list