RFR: 8353237: [AArch64] Incorrect result of VectorizedHashCode intrinsic on Cortex-A53
Stuart Monteith
smonteith at openjdk.org
Thu Apr 10 17:05:39 UTC 2025
On Thu, 10 Apr 2025 16:10:26 GMT, Volker Simonis <simonis at openjdk.org> wrote:
>> The root of the problem is that the VectorizedHashCode intrinsic introduced by JDK-8341194 is not aware of JDK-8079203. JDK-8079203 emits an additional nop with each madd instruction on Cortex-A53 as a workaround for Cortex-A53 erratum 835769, "AArch64 multiply-accumulate instruction might produce incorrect result". The current VectorizedHashCode intrinsic calculates a byte offset to jump into the unrolled loop code. It assumes 2 instructions per unrolled iteration (load and madd). The additional nop that JDK-8079203 adds on Cortex-A53 breaks this offset calculation.
>>
>> The current offset calculation uses a shift instead of a multiplication, which works because each unrolled loop iteration contains a power-of-2 number of instructions. To keep it simple, this fix adds one more nop to each loop iteration on Cortex-A53 so that there are 4 instructions per iteration, which is also a power of 2. To account for that, the shift amount in the offset calculation is increased by 1, because each loop iteration has twice as many instructions on Cortex-A53.
>>
>> This fix was tested on a Raspberry Pi 3 (based on Cortex-A53) by running the initially reported application and the hotspot jtreg tests (not a single test could be run on Cortex-A53 before the fix). After the fix, the specialized test hotspot/jtreg/compiler/intrinsics/TestArraysHashCode.java passes.
>>
>> The performance gain from the intrinsic is also observed on Cortex-A53 using the ArraysHashCode benchmark.
>
> This sounds a little scary. Is there a comprehensive list of available ARM errata that require special workarounds in the native compiler and in HotSpot's own code generation? @stooart-mon?
This doesn't fully answer your question, @simonis, but errata are published for each processor, with details on how each revision of the CPU is affected. For example, for the A53:
https://developer.arm.com/documentation/epm048406/2100/?lang=en
I'm not aware of a curated list of CPU errata and mitigations for compilers, so I'll get back to you.
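
For readers following the quoted description, the shift-based offset arithmetic can be sketched roughly as below. This is only an illustration in plain Java, not the HotSpot code: the class name, the bytesToSkip helper and its parameters are hypothetical. It relies on AArch64 instructions being 4 bytes wide and on the instruction counts given in the PR description (2 per unrolled iteration normally, 4 on Cortex-A53 after the fix).

```java
// Illustrative sketch only; names are hypothetical and not taken from HotSpot.
class UnrollOffsetSketch {
    // AArch64 instructions have a fixed width of 4 bytes.
    static final int INSTRUCTION_BYTES = 4;

    // Byte offset to jump over `skippedIterations` unrolled iterations.
    // Assumption from the PR description: 2 instructions per iteration
    // (load + madd), or 4 on Cortex-A53 (load + nop + madd + padding nop).
    static int bytesToSkip(int skippedIterations, boolean cortexA53Workaround) {
        int instructionsPerIteration = cortexA53Workaround ? 4 : 2;
        int bytesPerIteration = instructionsPerIteration * INSTRUCTION_BYTES;
        // A shift can replace the multiplication only when bytesPerIteration
        // is a power of 2: 8 bytes gives shift 3, 16 bytes gives shift 4.
        int shift = Integer.numberOfTrailingZeros(bytesPerIteration);
        return skippedIterations << shift;
    }

    public static void main(String[] args) {
        System.out.println(bytesToSkip(3, false));
        System.out.println(bytesToSkip(3, true));
    }
}
```

With 3 instructions per iteration (load, nop, madd) the iteration size would be 12 bytes, which no single shift can produce. That is why the fix pads to 4 instructions and raises the shift by 1.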
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24489#issuecomment-2794561452