RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes [v3]

Jatin Bhateja jbhateja at openjdk.org
Thu Sep 19 07:10:38 UTC 2024


On Wed, 18 Sep 2024 17:00:30 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Currently the rearrange and selectFrom APIs check shuffle indices and throw IndexOutOfBoundsException if there is any exceptional source index in the shuffle. This causes the generated code to be less optimal. This PR modifies the rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes and performs optimizations to generate efficient code.
>> 
>> Summary of changes is as follows:
>>  1) The rearrange/selectFrom methods do wrapIndexes instead of checkIndexes.
>>  2) Intrinsic for wrapIndexes and selectFrom to generate efficient code
>> 
>> For the following source:
>> 
>> 
>>     public void test() {
>>         var index = ByteVector.fromArray(bspecies128, shuffles[1], 0);
>>         for (int j = 0; j < bspecies128.loopBound(size); j += bspecies128.length()) {
>>             var inpvect = ByteVector.fromArray(bspecies128, byteinp, j);
>>             index.selectFrom(inpvect).intoArray(byteres, j);
>>         }
>>     }
>> 
>> 
>> The code generated for inner main now looks as follows:
>> ;; B24: #      out( B24 B25 ) <- in( B23 B24 ) Loop( B24-B24 inner main of N173 strip mined) Freq: 4160.96
>>   0x00007f40d02274d0:   movslq %ebx,%r13
>>   0x00007f40d02274d3:   vmovdqu 0x10(%rsi,%r13,1),%xmm1
>>   0x00007f40d02274da:   vpshufb %xmm2,%xmm1,%xmm1
>>   0x00007f40d02274df:   vmovdqu %xmm1,0x10(%rax,%r13,1)
>>   0x00007f40d02274e6:   vmovdqu 0x20(%rsi,%r13,1),%xmm1
>>   0x00007f40d02274ed:   vpshufb %xmm2,%xmm1,%xmm1
>>   0x00007f40d02274f2:   vmovdqu %xmm1,0x20(%rax,%r13,1)
>>   0x00007f40d02274f9:   vmovdqu 0x30(%rsi,%r13,1),%xmm1
>>   0x00007f40d0227500:   vpshufb %xmm2,%xmm1,%xmm1
>>   0x00007f40d0227505:   vmovdqu %xmm1,0x30(%rax,%r13,1)
>>   0x00007f40d022750c:   vmovdqu 0x40(%rsi,%r13,1),%xmm1
>>   0x00007f40d0227513:   vpshufb %xmm2,%xmm1,%xmm1
>>   0x00007f40d0227518:   vmovdqu %xmm1,0x40(%rax,%r13,1)
>>   0x00007f40d022751f:   add    $0x40,%ebx
>>   0x00007f40d0227522:   cmp    %r8d,%ebx
>>   0x00007f40d0227525:   jl     0x00007f40d02274d0
>> 
>> Best Regards,
>> Sandhya
>
> Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Change method name

src/hotspot/share/opto/vectorIntrinsics.cpp line 772:

> 770: 
> 771:   if (elem_klass == nullptr || shuffle_klass == nullptr || shuffle->is_top() || vlen == nullptr) {
> 772:     return false; // dead code

Why dead code  in comment ? this is a failed intrinsification condition.

src/hotspot/share/opto/vectorIntrinsics.cpp line 776:

> 774:   if (!vlen->is_con() || shuffle_klass->const_oop() == nullptr) {
> 775:     return false; // not enough info for intrinsification
> 776:   }

Why don't you club it with above conditions to be consistent with other inline expanders ?

src/hotspot/share/opto/vectorIntrinsics.cpp line 790:

> 788:   // Shuffles use byte array based backing storage
> 789:   BasicType shuffle_bt = T_BYTE;
> 790: 

No need a of new line b/w 789 and 791

src/hotspot/share/opto/vectorIntrinsics.cpp line 793:

> 791:   if (!arch_supports_vector(Op_AndV, num_elem, shuffle_bt, VecMaskNotUsed) ||
> 792:       !arch_supports_vector(Op_Replicate, num_elem, shuffle_bt, VecMaskNotUsed)) {
> 793:     return false;

You should emit proper intrinsification failure message here.

src/hotspot/share/opto/vectorIntrinsics.cpp line 805:

> 803:   const TypeVect* vt  = TypeVect::make(shuffle_bt, num_elem);
> 804:   const Type* shuffle_type_bt = Type::get_const_basic_type(shuffle_bt);
> 805: 

No need of a blank line here.

src/hotspot/share/opto/vectorIntrinsics.cpp line 808:

> 806:   Node* mod_mask = gvn().makecon(TypeInt::make(num_elem-1));
> 807:   Node* bcast_mod_mask  = gvn().transform(VectorNode::scalar2vector(mod_mask, num_elem, shuffle_type_bt));
> 808: 

Remove redundant new line.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766272449
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766273205
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766273880
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766274718
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766275107
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766275345


More information about the core-libs-dev mailing list