[vectorIntrinsics] RFR: 8283413: Add C2 mid-end and x86 back-end implementation for bit REVERSE and REVERSE_BYTES operation [v3]

Thu Mar 24 08:46:24 UTC 2022

On Wed, 23 Mar 2022 22:05:34 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Hi All,
>> 
>> The patch includes the following changes:
>> - New C2 IR nodes to support the VectorOperators.REVERSE operation.
>> - X86 backend implementation for targets supporting AVX2, AVX512 and GFNI features.
>> 
>> Performance data for the Vector API JMH microbenchmarks is shown below:
>> 
>> System Configuration:
>> ICX: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (40C 2S)
>> CLX: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (28C 2S)
>> 
>> ![image](https://user-images.githubusercontent.com/59989778/159196997-fd1ae2ad-37ee-4294-9928-5764707bb456.png)
>> 
>> Kindly review and share your feedback.
>> 
>> Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:
> 
>  - 8283413: White space removal.
>  - 8283413: Review comments resolution, ReverseBytes IR and x86 backend support.
>  - Merge branch 'vectorIntrinsics' of http://github.com/openjdk/panama-vector into JDK-8283413
>  - 8283413: Adding Ideal transform for (ReverseV (ReverseV VEC)) => VEC and (ReverseV (ReverseV VEC MASK) MASK) => VEC
>  - 8283413: Add C2 mid-end and x86 back-end implementation for bit REVERSE operation

Looks good to me otherwise, thank you very much.

src/hotspot/cpu/x86/assembler_x86.cpp line 9971:

> 9969:   assert(VM_Version::supports_avx(), "");
> 9970:   InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
> 9971:   attributes.set_is_evex_instruction();

This is not an EVEX-exclusive instruction, so `set_is_evex_instruction()` should not be set here.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4535:

> 4533: void C2_MacroAssembler::vector_reverse_bit(BasicType bt, XMMRegister dst, XMMRegister src, XMMRegister xtmp1,
> 4534:                                            XMMRegister xtmp2, XMMRegister xtmp3, Register rtmp, int vec_enc) {
> 4535:   if (VM_Version::supports_avx512bw()) {

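This check should be `VM_Version::supports_avx512bw() && (VM_Version::supports_avx512vl() || vec_enc == AVX_512bit)`, since AVX512BW instructions on vectors shorter than 512 bits also need AVX512VL.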
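
To spell out the condition above, here is a minimal standalone sketch. `CpuFeatures`, `VectorLength` and `can_use_avx512bw_path` are hypothetical stand-ins for illustration, not HotSpot identifiers; the only thing taken from the review is the shape of the boolean expression.

```cpp
// Hypothetical model of the CPU feature flags queried via VM_Version.
struct CpuFeatures {
  bool avx512bw;  // byte/word AVX-512 instructions
  bool avx512vl;  // AVX-512 instructions at 128-bit and 256-bit lengths
};

// Hypothetical vector length encoding (128, 256 or 512 bits).
enum VectorLength { Vec128, Vec256, Vec512 };

// The AVX512BW path is usable only if the CPU has AVX512BW and either
// supports AVX512VL or the vector is a full 512 bits wide.
constexpr bool can_use_avx512bw_path(CpuFeatures f, VectorLength len) {
  return f.avx512bw && (f.avx512vl || len == Vec512);
}

// BW without VL: fine for 512-bit vectors, not for shorter ones.
static_assert(can_use_avx512bw_path({true, false}, Vec512), "");
static_assert(!can_use_avx512bw_path({true, false}, Vec256), "");
// BW with VL: fine for every length.
static_assert(can_use_avx512bw_path({true, true}, Vec128), "");
// No BW: never usable, regardless of VL or length.
static_assert(!can_use_avx512bw_path({false, true}, Vec512), "");
```

With only `supports_avx512bw()` as the guard, the second case (BW without VL on a 256-bit vector) would wrongly take the AVX-512 path.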

src/hotspot/cpu/x86/x86.ad line 8982:

> 8980: 
> 8981: // -------------------------------- Bit and Byte Reversal Vector Operations ------------------------
> 8982: instruct vreverse_reg_avx(vec dst, vec src, vec xtmp1, vec xtmp2, vec xtmp3, rRegI rtmp) %{

The vector operands should be `legVec`, as this pattern relies on the AVX encoding.

src/hotspot/cpu/x86/x86.ad line 8997:

> 8995: 
> 8996: instruct vreverse_reg_evex(vec dst, vec src, vec xtmp1, vec xtmp2, rRegI rtmp) %{
> 8997:   predicate((VM_Version::supports_avx512bw() || Matcher::vector_length_in_bytes(n) == 64) && !VM_Version::supports_gfni());

If the vector length is less than 64 bytes, we need AVX512VL here. I think this and the check above should use `VM_Version::supports_avx512vlbw()` instead.

src/hotspot/cpu/x86/x86.ad line 9026:

> 9024: 
> 9025: instruct vreverse_byte_reg(vec dst, vec src, rRegI rtmp) %{
> 9026:   predicate(VM_Version::supports_avx512bw() || Matcher::vector_length_in_bytes(n) < 64);

For the `!VM_Version::supports_avx512bw()` case, this needs `legVec`.
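
As background for the `legVec` remarks, a rough sketch of the constraint, assuming the usual 64-bit register model: a VEX-encoded instruction can address only the lower 16 vector registers, while EVEX can address all 32. The names below are hypothetical and not taken from HotSpot.

```cpp
// Illustrative model only; these names are hypothetical.
enum class Encoding { Vex, Evex };

// Number of vector registers each encoding can address in 64-bit mode.
constexpr int addressable_vector_regs(Encoding enc) {
  return enc == Encoding::Evex ? 32 : 16;
}

// True if vector register number `reg` can be encoded with `enc`.
constexpr bool is_encodable(int reg, Encoding enc) {
  return reg >= 0 && reg < addressable_vector_regs(enc);
}

static_assert(is_encodable(15, Encoding::Vex), "xmm15 is reachable with VEX");
static_assert(!is_encodable(16, Encoding::Vex), "xmm16 needs EVEX");
static_assert(is_encodable(31, Encoding::Evex), "xmm31 is reachable with EVEX");
```

If the register allocator is free to pick an upper register for an operand that ends up in a VEX-encoded instruction, the instruction cannot be emitted correctly, which is why such patterns restrict their operands to the legacy register class.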

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/182