[vectorIntrinsics+mask] RFR: 8270349: Initial X86 backend support for optimizing masking operations on AVX512 targets. [v7]
Jatin Bhateja
jbhateja at openjdk.java.net
Thu Aug 19 20:03:29 UTC 2021
On Mon, 16 Aug 2021 18:41:25 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>>
>> 8270349: Optimizing JIT sequence for alltrue/anytrue and maskAll operations.
>
> src/hotspot/cpu/x86/x86.ad line 1962:
>
>> 1960: assert(bt != T_INT || VM_Version::supports_avx512bw(), "");
>> 1961: assert(bt != T_LONG || VM_Version::supports_avx512bw(), "");
>> 1962: if (bt == T_BYTE && VM_Version::supports_avx512dq()) {
>
> Should this be
> if (bt == T_BYTE && !VM_Version::supports_avx512dq())
DONE
> src/hotspot/cpu/x86/x86.ad line 3798:
>
>> 3796: BasicType elem_bt = vector_element_basic_type(this);
>> 3797: assert(!is_subword_type(elem_bt), "sanity"); // T_INT, T_LONG, T_FLOAT, T_DOUBLE
>> 3798: __ kmovwl($ktmp$$KRegister, $mask$$KRegister);
>
> Why do we need kmovwl here?
The gather/scatter instruction updates the predicate register as it executes, so the mask is copied to a temporary first.
> src/hotspot/cpu/x86/x86.ad line 3838:
>
>> 3836: assert(vector_length_in_bytes(this, $src) >= 16, "sanity");
>> 3837: assert(!is_subword_type(elem_bt), "sanity"); // T_INT, T_LONG, T_FLOAT, T_DOUBLE
>> 3838: __ kmovwl($ktmp$$KRegister, $mask$$KRegister);
>
> Why do we need kmovwl here?
Same reason as above: the gather/scatter instruction updates the predicate register as it executes, so the mask is copied to a temporary first.
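
To illustrate why the temporary is needed, here is a simplified software model of a masked gather that clears each mask bit once its lane has been loaded. This is a sketch only: the class, the KReg type and the method names are hypothetical and are not HotSpot code, and the model assumes the instruction runs to completion without a fault.

```java
// Sketch only: a software model of a masked gather that consumes its mask.
// All names here are hypothetical; nothing below is HotSpot or Vector API code.
final class GatherMaskSketch {

    // Stand-in for an opmask (k) register.
    static final class KReg {
        int bits;
        KReg(int bits) { this.bits = bits; }
    }

    // Loads memory[indices[lane]] into dst[lane] for every set mask bit,
    // clearing each bit after its lane completes (the mask is clobbered).
    static void gather(int[] dst, int[] memory, int[] indices, KReg k) {
        for (int lane = 0; lane < dst.length; lane++) {
            int bit = 1 << lane;
            if ((k.bits & bit) != 0) {
                dst[lane] = memory[indices[lane]];
                k.bits &= ~bit;
            }
        }
    }

    // Models the pattern in the patch: copy the mask to a temporary
    // (the kmovwl step) and let the gather consume the copy.
    static void gatherPreservingMask(int[] dst, int[] memory, int[] indices, KReg mask) {
        KReg ktmp = new KReg(mask.bits);
        gather(dst, memory, indices, ktmp);
    }
}
```

In this model, calling gather directly on the mask leaves it at zero, while gatherPreservingMask leaves the caller's mask intact for later use.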
> src/hotspot/cpu/x86/x86.ad line 9061:
>
>> 9059: __ movslq($tmp$$Register, $src$$Register);
>> 9060: __ kmovql($dst$$KRegister, $tmp$$Register);
>> 9061: __ kshiftrql($dst$$KRegister, $dst$$KRegister, 64 - vec_len);
>
> Could we not do kmovdl followed by kshiftrdl here? Thereby removing the need for movslq.
maskAll accepts a boolean argument (false (0), true (-1)).
The value operand has the rRegI register class, which represents a 32-bit register. This value needs to be sign-extended to 64 bits before the final mask value is computed with the shift-right operation.
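
As a rough illustration of the arithmetic, the sequence can be modelled with plain 64-bit integer operations. This is a sketch under stated assumptions, not the actual backend code: the class and method names are hypothetical, and the zero-extended variant exists only to show what the 64-bit shift would produce if the sign extension were skipped.

```java
// Sketch only: models the maskAll sequence with long arithmetic.
// Class and method names are hypothetical, not HotSpot code.
final class MaskAllSketch {

    // Models movslq + kmovql + kshiftrql for a vector with vecLen lanes.
    static long maskAll(int value, int vecLen) {
        checkLanes(vecLen);
        long tmp = value;              // sign-extend 32 -> 64 bits (movslq)
        return tmp >>> (64 - vecLen);  // logical shift right (kshiftrql)
    }

    // Contrast case: same 64-bit shift, but the 32-bit value is zero-extended.
    static long maskAllZeroExtended(int value, int vecLen) {
        checkLanes(vecLen);
        long tmp = value & 0xFFFFFFFFL; // upper 32 bits are zero
        return tmp >>> (64 - vecLen);
    }

    private static void checkLanes(int vecLen) {
        if (vecLen < 1 || vecLen > 64) {
            throw new IllegalArgumentException("vecLen must be in [1, 64]");
        }
    }
}
```

With value = -1 and vecLen = 16, maskAll yields 0xFFFF (16 lanes set), whereas the zero-extended variant yields 0 because the set bits are shifted out entirely.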
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/99