[jdk17] RFR: 8269568: JVM crashes when running VectorMask query tests
Jie Fu
jiefu at openjdk.java.net
Thu Jul 1 00:28:03 UTC 2021
On Thu, 1 Jul 2021 00:14:08 GMT, Jie Fu <jiefu at openjdk.org> wrote:
>>> IIRC we wrote them as smoke tests because they were not intrinsic. We need to think more carefully about converting them from smoke tests.
>>>
>> Thanks for looking at this PR @PaulSandoz ! Yes, I think this need more work and more carefully to move them from the smoke tests. Maybe we can revisit them in future?
>>
>>> Ideally we should convert them to kernel tests, but that is more work. Instead we can copy the generated pattern and do the following:
>>>
>>> * move the assertion outside of the loops (it will generate garbage with string concatenation)
>>> * assert over arrays, thereby also moving the actual scalar computation result outside the loops. The simplest approach is to create an `int[] array` of the same length as the input and write the reduced result at index `i`. Thus it's sparse.
>>>
>>> That should result in an inner loop body that is very focused on exercising the intrinsic method. It will also likely reduce the test execution times.
>>
>> Agree, I will move the assertion outside of the loop first. Thanks so much!
>
>> @XiaohongGong The following should fix this for x86:
>>
>> ```
>> diff --git a/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp b/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
>> index 147bcb8..b548877 100644
>> --- a/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
>> +++ b/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
>> @@ -3870,6 +3870,9 @@ void C2_MacroAssembler::vector_mask_operation(int opc, Register dst, XMMRegister
>> vpsubb(xtmp, xtmp, mask, vec_enc);
>> evpmovb2m(ktmp, xtmp, vec_enc);
>> kmovql(tmp, ktmp);
>> + if (masklen < 64) {
>> + andq(tmp, (((jlong)1 << masklen) - 1));
>> + }
>> switch(opc) {
>> case Op_VectorMaskTrueCount:
>> popcntq(dst, tmp);
>> @@ -3894,6 +3897,9 @@ void C2_MacroAssembler::vector_mask_operation(int opc, Register dst, XMMRegister
>> vpxor(xtmp, xtmp, xtmp, vec_enc);
>> vpsubb(xtmp, xtmp, mask, vec_enc);
>> vpmovmskb(tmp, xtmp, vec_enc);
>> + if (masklen < 64) {
>> + andq(tmp, (((jlong)1 << masklen) - 1));
>> + }
>> switch(opc) {
>> case Op_VectorMaskTrueCount:
>> popcntq(dst, tmp);
>> ```
>>
>> Please include this in your PR.
>
> The fix seems to be tricky.
> So do you think we need to change the code-gen like this https://github.com/openjdk/jdk17/pull/168#issuecomment-871412973 ? @sviswa7
> @DamonFool This is the cleanest solution I could think of.
> The problem comes only with newly added intrinsic where firstTrue, lastTrue, trueCount become consumers to VectorStoreMask. All the other consumers of VectorStoreMask consume the right number of bits.
> The fix is to make the new intrinsic also consume the right number of bits by doing the 'and' masking.
But we actually do not `and` masking with AVX3.
So I'm afraid the root cause is the incorrect code-gen for VectorStoreMask.
-------------
PR: https://git.openjdk.java.net/jdk17/pull/168
More information about the hotspot-compiler-dev
mailing list