RFR: 8331558: AArch64: optimize integer remainder [v2]
Jin Guojie
duke at openjdk.org
Wed May 8 02:26:58 UTC 2024
On Mon, 6 May 2024 10:08:30 GMT, Jin Guojie <duke at openjdk.org> wrote:
>> src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 447:
>>
>>> 445: inline void msub(Register Rd, Register Rn, Register Rm, Register Ra) {
>>> 446: if (VM_Version::supports_a53mac() && Ra != zr)
>>> 447: nop();
>>
>> This was first introduced in JDK-8079203 [1]. May I ask what is special about a53mac?
>>
>> [1] https://github.com/openjdk/jdk/commit/a65f9f95894e22ce2fd160024ce46f6aaa6c8bd3
>
> This code entered the JDK in 2015. Frankly, I have no idea why an extra nop is needed on CPUs with the a53mac feature.
> Perhaps the author of patch a65f9f9589, enevill at openjdk.org, could explain?
@e1iu
The erratum behind this feature is described in the following Arm document:
**Cortex-A53 MPCore Product Revision r0 - Software Developers Errata Notice**
https://developer.arm.com/documentation/EPM048406/2000/?lang=en
> 835769: AArch64 multiply-accumulate instruction might produce incorrect result
>
> Description
> When executing in AArch64 state, some multiply-accumulate instructions which read an accumulator operand from the
> result of an earlier multiply instruction might produce incorrect results.
>
> Workaround
> The only viable workaround is to avoid any of these code sequences, typically by avoiding the use of multiply-
> accumulate instructions, or by inserting a NOP between any adjacent load/store/prefetch instruction and multiply-
> accumulate instruction with no data dependency between them.
>
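To make the connection between the erratum and the quoted `msub` wrapper concrete, here is a minimal standalone sketch. It is not the HotSpot implementation: `SketchAssembler`, `has_a53mac`, and `emit_sequence` are hypothetical names, and the "instructions" are just recorded mnemonics. It only mirrors the guard shown in the quoted snippet (`supports_a53mac() && Ra != zr`). Note that MSUB with the zero register as accumulator is the MNEG alias, which is presumably why that case is skipped.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for HotSpot types; not the real MacroAssembler API.
enum Register { r0, r1, r2, r3, zr };

struct SketchAssembler {
    bool has_a53mac;                // stands in for VM_Version::supports_a53mac()
    std::vector<std::string> code;  // emitted mnemonics, in program order

    explicit SketchAssembler(bool a53mac) : has_a53mac(a53mac) {}

    void nop() { code.push_back("nop"); }
    void ldr() { code.push_back("ldr"); }  // any load/store/prefetch would do

    // Mirrors the guard in the quoted wrapper: pad with a NOP only when the
    // CPU is flagged for the erratum and a real accumulator register is used.
    void msub(Register Rd, Register Rn, Register Rm, Register Ra) {
        (void)Rd; (void)Rn; (void)Rm;  // operands are irrelevant to the sketch
        if (has_a53mac && Ra != zr) {
            nop();
        }
        // With zr as accumulator, MSUB is the MNEG alias (no accumulation).
        code.push_back(Ra == zr ? "mneg" : "msub");
    }
};

// Emits a load followed directly by msub and returns the resulting sequence.
std::vector<std::string> emit_sequence(bool a53mac, Register Ra) {
    SketchAssembler masm(a53mac);
    masm.ldr();
    masm.msub(r0, r1, r2, Ra);
    return masm.code;
}
```

With the flag set and a real accumulator, the load and the multiply-accumulate end up separated by a NOP, which matches the second workaround in the erratum text. The quoted wrapper appears to apply the padding conservatively, before every such instruction rather than only after a memory access.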
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19093#discussion_r1593300840