RFR: 8310691: [REDO] [vectorapi] Refactor VectorShuffle implementation

Quan Anh Mai qamai at openjdk.org
Wed Sep 18 16:15:38 UTC 2024


Hi,

This is a redo of https://github.com/openjdk/jdk/pull/13093, mostly just a revert of the backout.

Regarding the related issues:

- [JDK-8306008](https://bugs.openjdk.org/browse/JDK-8306008) and [JDK-8309531](https://bugs.openjdk.org/browse/JDK-8309531) have been fixed before the backout.
- [JDK-8309373](https://bugs.openjdk.org/browse/JDK-8309373) was due to a missing `ForceInline` on `AbstractVector::toBitsVectorTemplate`.
- [JDK-8306592](https://bugs.openjdk.org/browse/JDK-8306592): I have not been able to find the root cause. I'm not sure whether this is a blocker; at the moment I cannot even build the x86-32 tests.

Finally, I moved the implementations of some public methods, and of methods that call into intrinsics, to the concrete classes, as that may help the compiler know the correct types of the variables.

Please take a look and leave reviews. Thanks a lot.

The description of the original PR:

This patch reimplements `VectorShuffle` as a vector of the bit type. Currently, a `VectorShuffle` is stored as a byte array and is expanded upon use. This has several drawbacks:

- Inefficient conversions between a shuffle and its corresponding vector. This hinders performance when the shuffle indices are not constant and are loaded or computed dynamically.
- Redundant expansions in `rearrange` operations. On all platforms, it seems that a shuffle index vector is always expanded to the correct type before the `rearrange` operation is executed.
- Some redundant intrinsics are needed to support this handling, as well as special considerations in the C2 compiler.
- Range checks are performed using `VectorShuffle::toVector`, which is inefficient for FP types since both FP conversions and FP comparisons are more expensive than the integral ones.

With these changes, a `rearrange` can emit more efficient code:

    var species = IntVector.SPECIES_128;
    var v1 = IntVector.fromArray(species, SRC1, 0);
    var v2 = IntVector.fromArray(species, SRC2, 0);
    v1.rearrange(v2.toShuffle()).intoArray(DST, 0);

    Before:
    movabs $0x751589fa8,%r10            ;   {oop([I{0x0000000751589fa8})}
    vmovdqu 0x10(%r10),%xmm2
    movabs $0x7515a0d08,%r10            ;   {oop([I{0x00000007515a0d08})}
    vmovdqu 0x10(%r10),%xmm1
    movabs $0x75158afb8,%r10            ;   {oop([I{0x000000075158afb8})}
    vmovdqu 0x10(%r10),%xmm0
    vpand  -0xddc12(%rip),%xmm0,%xmm0        # Stub::vector_int_to_byte_mask
                                                            ;   {external_word}
    vpackusdw %xmm0,%xmm0,%xmm0
    vpackuswb %xmm0,%xmm0,%xmm0
    vpmovsxbd %xmm0,%xmm3
    vpcmpgtd %xmm3,%xmm1,%xmm3
    vtestps %xmm3,%xmm3
    jne    0x00007fc2acb4e0d8
    vpmovzxbd %xmm0,%xmm0
    vpermd %ymm2,%ymm0,%ymm0
    movabs $0x751588f98,%r10            ;   {oop([I{0x0000000751588f98})}
    vmovdqu %xmm0,0x10(%r10)

    After:
    movabs $0x751589c78,%r10            ;   {oop([I{0x0000000751589c78})}
    vmovdqu 0x10(%r10),%xmm1
    movabs $0x75158ac88,%r10            ;   {oop([I{0x000000075158ac88})}
    vmovdqu 0x10(%r10),%xmm2
    vpxor  %xmm0,%xmm0,%xmm0
    vpcmpgtd %xmm2,%xmm0,%xmm3
    vtestps %xmm3,%xmm3
    jne    0x00007fa818b27cb1
    vpermd %ymm1,%ymm2,%ymm0
    movabs $0x751588c68,%r10            ;   {oop([I{0x0000000751588c68})}
    vmovdqu %xmm0,0x10(%r10)
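
To make the difference between the two listings easier to follow, here is a minimal scalar sketch in plain Java. It is only a model, not the JDK implementation: the class `ShuffleModel` and its methods are hypothetical names, the index normalization that `VectorShuffle` performs is omitted, and any negative index is simply treated as exceptional. The byte-backed path narrows the indices and widens them again on every `rearrange`, which roughly corresponds to the pack and extend instructions in the "Before" listing; the int-backed path checks and uses the indices directly.

```java
import java.util.Arrays;

public class ShuffleModel {
    // Modelled old layout: indices are narrowed to bytes when the shuffle is created.
    static byte[] toByteShuffle(int[] indices) {
        byte[] out = new byte[indices.length];
        for (int i = 0; i < indices.length; i++) {
            out[i] = (byte) indices[i];
        }
        return out;
    }

    // Modelled old rearrange: the byte indices are widened back to int on every use.
    static int[] rearrangeWithByteShuffle(int[] src, byte[] shuffle) {
        int[] widened = new int[shuffle.length];
        for (int i = 0; i < shuffle.length; i++) {
            widened[i] = shuffle[i]; // sign-extending byte -> int conversion
        }
        return rearrangeWithIntShuffle(src, widened);
    }

    // Modelled new rearrange: the indices already have the lane type,
    // so only the range check and the permutation remain.
    static int[] rearrangeWithIntShuffle(int[] src, int[] shuffle) {
        for (int idx : shuffle) {
            if (idx < 0) { // assumption: a negative index marks an exceptional lane
                throw new IndexOutOfBoundsException("exceptional shuffle index: " + idx);
            }
        }
        int[] dst = new int[src.length];
        for (int i = 0; i < dst.length; i++) {
            dst[i] = src[shuffle[i]];
        }
        return dst;
    }

    public static void main(String[] args) {
        int[] src = {10, 20, 30, 40};
        int[] idx = {3, 2, 1, 0};
        // Both paths produce the same permutation; only the amount of conversion work differs.
        System.out.println(Arrays.toString(rearrangeWithByteShuffle(src, toByteShuffle(idx))));
        System.out.println(Arrays.toString(rearrangeWithIntShuffle(src, idx)));
    }
}
```

Both calls in `main` print `[40, 30, 20, 10]`. The sketch says nothing about actual performance; it only shows which conversion steps disappear once the shuffle is stored with the lane type.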

-------------

Commit messages:
 - copyright year
 - remove LoadShuffle from riscv, whitespace
 - tighten concrete types
 - [vectorapi] Refactor VectorShuffle implementation

Changes: https://git.openjdk.org/jdk/pull/21042/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21042&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8310691
  Stats: 4984 lines in 64 files changed: 2984 ins; 981 del; 1019 mod
  Patch: https://git.openjdk.org/jdk/pull/21042.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21042/head:pull/21042

PR: https://git.openjdk.org/jdk/pull/21042

