[vectorIntrinsics] RFR: 8283413: Add C2 mid-end and x86 back-end implementation for bit REVERSE and REVERSE_BYTES operation [v2]
Jatin Bhateja
jbhateja at openjdk.java.net
Wed Mar 23 22:05:35 UTC 2022
On Tue, 22 Mar 2022 10:02:28 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>>
>> 8283413: Adding Ideal transform for (ReverseV (ReverseV VEC)) => VEC and (ReverseV (ReverseV VEC MASK) MASK) => VEC
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4397:
>
>> 4395: #endif
>> 4396:
>> 4397: void C2_MacroAssembler::vector_reverse_bit_avx(BasicType bt, XMMRegister dst, XMMRegister src, XMMRegister xtmp1,
>
> You can do this bit reversal using a lookup table on each nibble and ORing the results; the pseudocode would look something like this:
>
> lut = broadcasti128(0b0000, 0b1000, 0b0100, 0b1100, 0b0010, 0b1010, 0b0110, 0b1110, 0b0001, ...)
> mask = pbroadcastd(0x0f0f0f0f)
> hi = pand(src, mask)
> hi = pshufb(lut, hi)
> hi = pslld(hi, 4)
> lo = psrld(src, 4)
> lo = pand(lo, mask)
> lo = pshufb(lut, lo)
> dst = por(lo, hi)
@merykitty, your comments have been addressed.
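
For readers following the thread, the nibble-lookup idea in the quoted pseudocode can be modelled in scalar Java. This is only a sketch under stated assumptions, not the code in the PR: the class name `NibbleReverse` and both method names are hypothetical, an `int` stands in for a vector of four byte lanes, and the second half of the table (elided as `...` in the review comment) is filled in here by assuming it continues the same 4-bit reversal pattern.

```java
public final class NibbleReverse {
    // LUT[i] is the 4-bit reversal of i (e.g. 0b0001 -> 0b1000).
    // Entries 9..15 are an assumption: the review comment elides them.
    private static final byte[] LUT = {
        0b0000, 0b1000, 0b0100, 0b1100, 0b0010, 0b1010, 0b0110, 0b1110,
        0b0001, 0b1001, 0b0101, 0b1101, 0b0011, 0b1011, 0b0111, 0b1111
    };

    // Scalar stand-in for pshufb(lut, idx): each of the four byte lanes of
    // idx holds a value in 0..15 and is replaced by the table entry.
    static int lookupPerByte(int idx) {
        int result = 0;
        for (int lane = 0; lane < 4; lane++) {
            int nibble = (idx >>> (8 * lane)) & 0x0f;
            result |= LUT[nibble] << (8 * lane);
        }
        return result;
    }

    // Reverses the bits inside each byte of src, following the quoted steps.
    static int reverseBitsPerByte(int src) {
        int mask = 0x0f0f0f0f;
        int hi = lookupPerByte(src & mask) << 4;      // reversed low nibbles, moved up
        int lo = lookupPerByte((src >>> 4) & mask);   // reversed high nibbles, moved down
        return lo | hi;
    }
}
```

Since `Integer.reverse` also reverses the byte order, the per-byte result should match `Integer.reverseBytes(Integer.reverse(x))`, which gives a convenient cross-check for the table.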
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/182