RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes [v4]

Sandhya Viswanathan sviswanathan at openjdk.org
Thu Sep 19 21:43:01 UTC 2024


> Currently, the rearrange and selectFrom APIs check the shuffle indices and throw an IndexOutOfBoundsException if the shuffle contains any exceptional source index. This check makes the generated code less efficient. This PR modifies the rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes and adds optimizations so that efficient code is generated.
> 
> The changes are summarized as follows:
>  1) The rearrange/selectFrom methods now perform wrapIndexes instead of checkIndexes.
>  2) Intrinsic support is added for wrapIndexes and selectFrom to generate efficient code.
> 
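As a rough illustration of the difference between the two policies, here is a scalar sketch. It assumes that wrapping reduces an index into [0..VLENGTH-1] by adding or subtracting a multiple of VLENGTH, which for a power-of-two VLENGTH is the same as index & (VLENGTH - 1). The class and method names below are hypothetical and are not the JDK implementation.

```java
import java.util.Arrays;

// Hypothetical scalar model; not the Vector API implementation.
class WrapIndexSketch {

    // Assumed wrapping policy: force the index into [0, vlength-1].
    static int wrapIndex(int index, int vlength) {
        return Math.floorMod(index, vlength);
    }

    // Checking policy: reject any index outside [0, vlength-1].
    static int checkIndex(int index, int vlength) {
        if (index < 0 || index >= vlength) {
            throw new IndexOutOfBoundsException("index " + index + " out of range for length " + vlength);
        }
        return index;
    }

    // Scalar model of a selectFrom that wraps: result[i] = src[wrap(indexes[i])].
    static byte[] selectFromWrapped(byte[] indexes, byte[] src) {
        int vlength = src.length;
        byte[] result = new byte[vlength];
        for (int i = 0; i < vlength; i++) {
            result[i] = src[wrapIndex(indexes[i], vlength)];
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] src = {10, 20, 30, 40};
        byte[] indexes = {0, 5, -1, 3};   // 5 and -1 are out of range for 4 lanes
        System.out.println(Arrays.toString(selectFromWrapped(indexes, src)));
        // prints [10, 20, 40, 40]
    }
}
```

With the checking policy the same input would throw at index 5, so every shuffle needs a range test and an exceptional path. With wrapping there is no exceptional path at all.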
> For the following source:
> 
> 
>     public void test() {
>         var index = ByteVector.fromArray(bspecies128, shuffles[1], 0);
>         for (int j = 0; j < bspecies128.loopBound(size); j += bspecies128.length()) {
>             var inpvect = ByteVector.fromArray(bspecies128, byteinp, j);
>             index.selectFrom(inpvect).intoArray(byteres, j);
>         }
>     }
> 
> 
> The code generated for the inner main loop now looks as follows:
> ;; B24: #      out( B24 B25 ) <- in( B23 B24 ) Loop( B24-B24 inner main of N173 strip mined) Freq: 4160.96
>   0x00007f40d02274d0:   movslq %ebx,%r13
>   0x00007f40d02274d3:   vmovdqu 0x10(%rsi,%r13,1),%xmm1
>   0x00007f40d02274da:   vpshufb %xmm2,%xmm1,%xmm1
>   0x00007f40d02274df:   vmovdqu %xmm1,0x10(%rax,%r13,1)
>   0x00007f40d02274e6:   vmovdqu 0x20(%rsi,%r13,1),%xmm1
>   0x00007f40d02274ed:   vpshufb %xmm2,%xmm1,%xmm1
>   0x00007f40d02274f2:   vmovdqu %xmm1,0x20(%rax,%r13,1)
>   0x00007f40d02274f9:   vmovdqu 0x30(%rsi,%r13,1),%xmm1
>   0x00007f40d0227500:   vpshufb %xmm2,%xmm1,%xmm1
>   0x00007f40d0227505:   vmovdqu %xmm1,0x30(%rax,%r13,1)
>   0x00007f40d022750c:   vmovdqu 0x40(%rsi,%r13,1),%xmm1
>   0x00007f40d0227513:   vpshufb %xmm2,%xmm1,%xmm1
>   0x00007f40d0227518:   vmovdqu %xmm1,0x40(%rax,%r13,1)
>   0x00007f40d022751f:   add    $0x40,%ebx
>   0x00007f40d0227522:   cmp    %r8d,%ebx
>   0x00007f40d0227525:   jl     0x00007f40d02274d0
> 
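For readers who do not want to trace the assembly, the loop above can be modeled in scalar Java roughly as follows. This is only a sketch under assumptions: the index vector is wrapped once outside the loop (it is loop-invariant in the test), each vpshufb corresponds to one 16-byte block, and the four load/shuffle/store groups per iteration are the unrolled blocks. The class and method names are hypothetical.

```java
// Hypothetical scalar model of the vectorized loop; not generated or JDK code.
class SelectFromLoopSketch {
    static final int VLENGTH = 16; // lanes in a 128-bit byte vector

    static void shuffleBlocks(byte[] index, byte[] inp, byte[] res, int size) {
        // Wrap the loop-invariant index vector once, before the loop.
        int[] wrapped = new int[VLENGTH];
        for (int i = 0; i < VLENGTH; i++) {
            wrapped[i] = index[i] & (VLENGTH - 1); // valid because VLENGTH is a power of two
        }
        // Same bound as species.loopBound(size): size rounded down to a multiple of VLENGTH.
        int bound = size - (size % VLENGTH);
        for (int j = 0; j < bound; j += VLENGTH) {
            // One block here corresponds to one load, one vpshufb and one store.
            for (int i = 0; i < VLENGTH; i++) {
                res[j + i] = inp[j + wrapped[i]];
            }
        }
    }

    public static void main(String[] args) {
        int size = 40;
        byte[] inp = new byte[size];
        for (int i = 0; i < size; i++) inp[i] = (byte) i;
        byte[] index = new byte[VLENGTH];
        for (int i = 0; i < VLENGTH; i++) index[i] = (byte) (15 - i); // reverse each block
        index[0] = 31; // out of range; wraps to 15
        byte[] res = new byte[size];
        shuffleBlocks(index, inp, res, size);
        System.out.println(res[0] + " " + res[1] + " " + res[16]); // prints 15 14 31
    }
}
```

Once the indices are known to lie in [0..15], a byte shuffle on a 128-bit vector needs no per-iteration range test, which is consistent with the loop body above containing only loads, vpshufb and stores.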
> Best Regards,
> Sandhya

Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision:

  Implement review comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/20634/files
  - new: https://git.openjdk.org/jdk/pull/20634/files/87e103ee..f8e67fb3

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20634&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20634&range=02-03

  Stats: 27 lines in 1 file changed: 9 ins; 8 del; 10 mod
  Patch: https://git.openjdk.org/jdk/pull/20634.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20634/head:pull/20634

PR: https://git.openjdk.org/jdk/pull/20634

