RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 [v5]
Andrew Haley
aph at openjdk.org
Thu Aug 22 15:56:09 UTC 2024
On Thu, 22 Aug 2024 12:23:04 GMT, Mikhail Ablakatov <duke at openjdk.org> wrote:
> > One thing that's odd, but not really wrong. Why do you process byte arrays 32-wide instead of 16-wide like everything else? It makes the code more complex than doing everything 8-wide ...
>
> There's no arrangement specifier for `LD1 (multiple structures)` that loads 4 single-byte elements per SIMD&FP register.
Isn't that `ld1 V1.s, V2.s, V3.s, v4.s, [x1]`?
> > ... and doesn't seem to increase performance, either with my measurements or yours.
>
> What measurements are you referring to here?
Your performance figures, and mine, as quoted in this PR.
It's really not important, though.
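
For anyone following the width question: a scalar Java model, not the AArch64 code in this PR, of why the block width should not change the result. The names `BlockedHash`, `scalarHash` and `blockedHash` are made up for this sketch. It assumes only the usual `h = 31 * h + a[i]` recurrence, with the per-lane weights being powers of 31.

```java
import java.util.Arrays;

final class BlockedHash {
    // Reference: the element-at-a-time polynomial hash, h = 31 * h + a[i].
    static int scalarHash(byte[] a) {
        int h = 1;
        for (byte b : a) {
            h = 31 * h + b;
        }
        return h;
    }

    // Hypothetical blocked variant: consumes `width` elements per step,
    // the way a vector loop would, then finishes the tail one by one.
    static int blockedHash(byte[] a, int width) {
        // pow[k] = 31^k, wrapping modulo 2^32 like all int arithmetic here.
        int[] pow = new int[width + 1];
        pow[0] = 1;
        for (int k = 1; k <= width; k++) {
            pow[k] = 31 * pow[k - 1];
        }

        int h = 1;
        int i = 0;
        for (; i + width <= a.length; i += width) {
            // Weighted sum of one block: element j is scaled by 31^(width-1-j).
            int sum = 0;
            for (int j = 0; j < width; j++) {
                sum += a[i + j] * pow[width - 1 - j];
            }
            // Advance the running hash by a whole block at once.
            h = h * pow[width] + sum;
        }
        // Scalar tail for the remaining (a.length % width) elements.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 7 - 50);
        }
        System.out.println(Arrays.hashCode(data) == blockedHash(data, 8));
        System.out.println(blockedHash(data, 16) == blockedHash(data, 32));
    }
}
```

In this model 8, 16 and 32 all give the same hash as `Arrays.hashCode`, so the choice of width is only about code complexity and speed, which is the point being argued above.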
-------------
PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2305108265