RFR: 8366441: AArch64: Support WFET in OnSpinWait [v3]

Per Minborg pminborg at openjdk.org
Tue Feb 10 07:24:42 UTC 2026


On Mon, 9 Feb 2026 21:33:44 GMT, Ruben <duke at openjdk.org> wrote:

>> Implement OnSpinWait based on WFET - wait for event with timeout:
>>  - introduce OnSpinWaitDelay - the OnSpinWait time in nanoseconds;
>>  - the OnSpinWaitInstCount is expected to be 1 when WFET is used;
>>  - the waiting loop is followed by SB - to ensure following instructions aren't speculated until wait is finished;
>>  - the timer register is read via the self-synchronized view CNTVCTSS_EL0 to prevent the read being hoisted out of the loop.
>> 
>> The WFET and CNTVCTSS_EL0 read are added to aarch64-asmtest.py as hex values - using the instruction mnemonics would require support of -march=armv9-2.a, and consequently, the binutils 2.36+.
>
> Ruben has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits:
> 
>  - Set default OnSpinWaitDelay to 100
>  - Address review comments
>  - Apply PR review "Suggested changes" from @theRealAph
>  - Merge from mainline
>  - Fix bsd_aarch64 build
>  - Update
>    
>    - Address review comments
>    - Fix test
>    - Mark the support experimental
>    - Remove changes in src/hotspot/os_cpu/bsd_aarch64
>  - Merge from mainline
>  - 8366441: AArch64: Support WFET in OnSpinWait
>    
>    Implement OnSpinWait based on WFET - wait for event with timeout:
>     - introduce OnSpinWaitDelay - the OnSpinWait time in nanoseconds;
>     - the OnSpinWaitInstCount is expected to be 1 when WFET is used;
>     - the waiting loop is followed by SB - to ensure following instructions
>       aren't speculated until wait is finished;
>     - the timer register is read via the self-synchronized view
>       CNTVCTSS_EL0 to prevent the read being hoisted out of the loop.
>    
>    The WFET and CNTVCTSS_EL0 read are added to aarch64-asmtest.py as
>    hex values - using the instruction mnemonics would require support of
>    -march=armv9-2.a, and consequently, the binutils 2.36+.
>    
>    Co-authored-by: Stuart Monteith <stuart.monteith at arm.com>

Unless the default value of OnSpinWaitDelay is changed, this will have a relatively large impact on certain low-latency applications that wait for a CAS lock in a spin loop. Latencies in such applications are often measured in nanoseconds, so this is something that should be communicated to the community in some way.
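
To make the concern concrete, here is a minimal sketch of the kind of spin loop in question, assuming a hand-rolled CAS lock built on AtomicBoolean. The class and method names (SpinLockSketch, contendedIncrements) are hypothetical and not taken from the PR; only Thread.onSpinWait() and the java.util.concurrent.atomic API are real.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only: a CAS-based spin lock of the kind
// found in low-latency code. Not part of the PR under review.
public class SpinLockSketch {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private long counter; // guarded by the spin lock

    void lock() {
        // Each failed CAS is followed by one spin-wait hint, so the
        // cost of the hint adds directly to lock acquisition latency.
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait();
        }
    }

    void unlock() {
        locked.set(false); // volatile write publishes the guarded state
    }

    // Runs `threads` workers that each take the lock `perThread` times
    // and returns the final value of the guarded counter.
    static long contendedIncrements(int threads, int perThread) {
        SpinLockSketch s = new SpinLockSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    s.lock();
                    try {
                        s.counter++;
                    } finally {
                        s.unlock();
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join(); // join makes the workers' writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.counter;
    }

    public static void main(String[] args) {
        System.out.println(contendedIncrements(4, 10_000));
    }
}
```

In a loop like this, a thread that loses the CAS cannot retry until Thread.onSpinWait() returns. If the hint can wait for a time on the order of OnSpinWaitDelay, that wait is paid on every contended acquisition, which is why the default matters for code whose latency budget is itself measured in nanoseconds.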

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27030#issuecomment-3875804714


More information about the hotspot-dev mailing list