RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 [v11]

Mon Sep 23 12:25:39 UTC 2024

On Mon, 23 Sep 2024 10:49:40 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Mikhail Ablakatov has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - Add asm tests for Neon Vector - Scalar insts
>>  - fixup: restrict Vm to V0-V15 for mulvs when esize is H
>
> src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 2887:
> 
>> 2885:       f(0b10, 23, 22), f(index & 1, 21), rf(Vm, 16), f(op2, 15, 12), f(index >> 1, 11);             \
>> 2886:     }                                                                                               \
>> 2887:     f(0, 10), rf(Vn, 5), rf(Vd, 0);                                                                 \
> 
> Suggestion:
> 
> #define INSN(NAME, op1, op2)                                                              \
>   void NAME(FloatRegister Vd, SIMD_Arrangement T, FloatRegister Vn, FloatRegister Vm, int index) { \
>     starti;                                                                               \
>     assert(T == T4H || T == T8H || T == T2S || T == T4S, "invalid arrangement");          \
>     assert(index >= 0 &&                                                                  \
>                ((T == T2S && index <= 1) || (T != T2S && index <= 3) || (T == T8H && index <= 7)), \
>            "invalid index");                                                              \
>     assert((T != T4H && T != T8H) || Vm->encoding() < 16, "invalid source SIMD&FP register"); \
>     f(0, 31), f((int)T & 1, 30), f(op1, 29), f(0b01111, 28, 24), f(0b01, 23, 22);         \
>     if (T == T4H || T == T8H) {                                                           \
>       f(index & 0b11, 21, 20), lrf(Vm, 16);                                               \
>     } else {                                                                              \
>       f(index & 1, 21), rf(Vm, 16);                                                       \
>     }                                                                                     \
>     f(op2, 15, 12), f(index >> 1, 11), f(0, 10), rf(Vn, 5), rf(Vd, 0);                    \
> 
> I think it's a bit easier to see what's going on here if we lose the duplicated code.

Looks like that's incorrect: bits 23:22 and bit 11 differ between the two branches, so they can't be hoisted out of the `if`.
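
To illustrate, here is a minimal standalone sketch (not HotSpot code) of the fields that depend on the element size. `Esize` and `by_element_fields` are hypothetical names made up for this example. The S branch mirrors quoted line 2885; the H branch is my assumption based on the Arm by-element encoding (size = 0b01, index = H:L:M, 4-bit Rm), since that part isn't shown in the quote above.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only: builds just the size-dependent
// fields (bits 23:22, 21:16 and 11) of a NEON "vector x scalar" instruction.
enum class Esize { H, S };

constexpr uint32_t by_element_fields(Esize esize, unsigned vm, unsigned index) {
  uint32_t insn = 0;
  if (esize == Esize::H) {
    // Assumed H layout: size = 0b01, index = H:L:M, Vm restricted to V0-V15.
    insn |= 0b01u << 22;                // bits 23:22
    insn |= (index & 0b11u) << 20;      // L:M in bits 21:20
    insn |= (vm & 0xFu) << 16;          // 4-bit Rm in bits 19:16
    insn |= ((index >> 2) & 1u) << 11;  // H in bit 11
  } else {
    // S layout as in quoted line 2885: size = 0b10, index = H:L.
    insn |= 0b10u << 22;                // bits 23:22
    insn |= (index & 1u) << 21;         // L in bit 21
    insn |= (vm & 0x1Fu) << 16;         // 5-bit M:Rm in bits 20:16
    insn |= ((index >> 1) & 1u) << 11;  // H in bit 11
  }
  return insn;
}
```

With index 2, for example, bit 11 is clear in the H case (`index >> 2`) but set in the S case (`index >> 1`), and bits 23:22 are 0b01 versus 0b10. A shared tail would therefore have to take both values from the branch.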

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18487#discussion_r1771302511