RFR: 8318650: Optimized subword gather for x86 targets. [v3]

Sandhya Viswanathan sviswanathan at openjdk.org
Mon Nov 6 18:58:47 UTC 2023


On Sun, 5 Nov 2023 12:58:57 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1606:
>> 
>>> 1604: void C2_MacroAssembler::vpgather8b_offset(BasicType elem_bt, XMMRegister dst, Register base, Register idx_base,
>>> 1605:                                           Register offset, Register rtmp, int vlen_enc) {
>>> 1606:   vpxor(dst, dst, dst, vlen_enc);
>> 
>> Why the vpxor here and in vpgather8b. These are not masked gathers.
>
> This is to clear the vector gathering 8 subwords with each iteration.

This is not a masked operation so every lane of dst will be written through pinsrw/pinsrb. An vpxor before is not required.

>> src/hotspot/cpu/x86/x86.ad line 1921:
>> 
>>> 1919:     case Op_LoadVectorGatherMasked:
>>> 1920:       if (!is_subword_type(bt) && size_in_bits < 512 && !VM_Version::supports_avx512vl()) {
>>> 1921:         return false;
>> 
>> Also need to return false when size_in_bits == 64 for non subword types.
>
> match_rule_supported_vector  called in the beginning will enforce these checks.

This method is match_rule_support_vector and it is not enforcing this check now. It was doing so before through fall through.

>> src/hotspot/cpu/x86/x86.ad line 1939:
>> 
>>> 1937:       // fallthrough
>>> 1938:     case Op_LoadVectorGather:
>>> 1939:       if (!is_subword_type(bt) && size_in_bits == 64 ) {
>> 
>> Since this is a fallthrough case, the change also applies to StoreVectorScatter*.  The !is_subword_type() is needed only for LoadVectorGather.
>
> We do not intrinsify sub-word scatter operations currently.

The StoreVectorScatter* should return false when size_in_bits == 64 irrespective of subword.  It was doing so before and is not doing so now.

>> src/hotspot/share/opto/vectorIntrinsics.cpp line 1579:
>> 
>>> 1577:   Node* index    = argument(11);
>>> 1578:   Node* indexMap = argument(12);
>>> 1579:   Node* indexM   = argument(13);
>> 
>> Could be renamed as follows for better understanding: index -> offset, indexM -> indexOffset. Also this should be moved under else part of if (is_scatter).
>
> Naming scheme is based on the actual names used in intrinsic entry point for these arguments.

How about moving this to else part as we are not using these for Scatter?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1383795523
PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1383799090
PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1383803037
PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1383818954


More information about the core-libs-dev mailing list