RFR: 8318650: Optimized subword gather for x86 targets. [v2]

Fri Oct 27 06:44:31 UTC 2023

On Fri, 27 Oct 2023 05:09:23 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> src/hotspot/share/opto/vectorIntrinsics.cpp line 1486:
>> 
>>> 1484:     // Check whether the predicated gather/scatter node is supported by architecture.
>>> 1485:     VectorMaskUseType mask = (is_scatter || !is_subword_type(elem_bt)) ? (VectorMaskUseType) (VecMaskUseLoad | VecMaskUsePred) : VecMaskUseLoad;
>>> 1486:     if (!arch_supports_vector(is_scatter ? Op_StoreVectorScatterMasked : Op_LoadVectorGatherMasked, num_elem, elem_bt, mask)) {
>> 
>> What is the difference between a subword-type load gather and the others? It seems that checking only `VecMaskUseLoad` is enough for all kinds of masked gather/scatter. Checking only `is_match_rule_supported_vector` for these ops should be OK. WDYT?
>
> Sub-word gathers do not emit any predicated instructions, so only `VecMaskUseLoad` is relevant in this context. However, AVX512 and SVE do have direct predicated gather instructions for 32/64-bit types.

I see, thanks! `VecMaskUsePred` is added to check `match_rule_supported_vector_masked` for normal vector ops, because we may add an additional mask input to those ops. But `LoadVectorGatherMasked` and `StoreVectorScatterMasked` are different: they denote masked operations no matter how they are implemented. So checking just `VecMaskUseLoad` seems fine to me for both of these ops.
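
To make the two options concrete, here is a minimal standalone sketch, not HotSpot code. The enum bit values, the `BasicTypeSketch` type, and the helpers `is_subword_sketch`, `mask_use_in_patch` and `mask_use_suggested` are all hypothetical names made up for illustration. Only the conditional expression is taken from line 1485 of the patch.

```cpp
// Hypothetical stand-in for HotSpot's VectorMaskUseType.
// The bit values are assumptions chosen for this sketch.
enum VectorMaskUseType {
  VecMaskUseLoad = 1 << 0,
  VecMaskUsePred = 1 << 2
};

// Hypothetical, simplified element types (not HotSpot's BasicType).
enum BasicTypeSketch { T_BYTE_S, T_SHORT_S, T_INT_S, T_LONG_S };

// Assumption: "subword" means byte or short in this sketch.
static bool is_subword_sketch(BasicTypeSketch bt) {
  return bt == T_BYTE_S || bt == T_SHORT_S;
}

// Mirrors the expression on line 1485 of the patch: subword gathers
// only need the mask-load check, everything else also needs the
// predicated-rule check.
static int mask_use_in_patch(bool is_scatter, BasicTypeSketch bt) {
  return (is_scatter || !is_subword_sketch(bt))
             ? (VecMaskUseLoad | VecMaskUsePred)
             : VecMaskUseLoad;
}

// The simplification suggested in this comment: the masked gather and
// scatter ops only need the mask-load check, whatever the element type.
static int mask_use_suggested(bool /*is_scatter*/, BasicTypeSketch /*bt*/) {
  return VecMaskUseLoad;
}
```

In this sketch the two functions agree only for a subword gather, e.g. `mask_use_in_patch(false, T_BYTE_S)`. For every other combination the patch additionally requests `VecMaskUsePred`.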

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1374135147