[aarch64-port-dev ] RFR: 8186671: Use `yield` instruction in SpinPause on linux-aarch64

Dmitry Chuyko dmitry.chuyko at bell-sw.com
Thu Aug 24 15:33:25 UTC 2017


On 08/23/2017 10:39 PM, White, Derek wrote:
> Hi Andrew,
>
>> -----Original Message-----
>> From: aarch64-port-dev [mailto:aarch64-port-dev-
>> bounces at openjdk.java.net] On Behalf Of Andrew Haley
>> Sent: Wednesday, August 23, 2017 12:32 PM
>> To: aarch64-port-dev at openjdk.java.net
>> Subject: Re: [aarch64-port-dev ] RFR: 8186671: Use `yield` instruction in
>> SpinPause on linux-aarch64
>>
>> On 23/08/17 17:07, Dmitry Chuyko wrote:
>>> Please review a change in SpinPause implementation.
>>>
>>> related study:
>>> http://cr.openjdk.java.net/~dchuyko/8186670/yield/spinwait.html
>>> rfe: https://bugs.openjdk.java.net/browse/JDK-8186671
>>> webrev: http://cr.openjdk.java.net/~dchuyko/8186671/webrev.00/
>>>
>>> The function was moved to platform .S file and now contains yield
>>> instruction.
>> Re the use of YIELD for onSpinWait(), I think this probably would be a
>> mistake:
Andrew, thank you for the discussion, but I don't quite get your point.
Some thoughts and questions are below.
>> Intel's PAUSE is intended to improve the performance of spin-wait
>> loops, whereas ARM's YIELD is intended to hint that the task performed by a
>> thread is of low importance so that it could yield.
If we read further in the YIELD subsection of the ARMv8 Reference Manual, it says:
"Examples of when the YIELD instruction might be used include a thread
that is sitting in a spin-lock", which to me is exactly this case. If we
look at the Java usage, it is like
----
        else if ((LockSupport.nextSecondarySeed() & OVERFLOW_YIELD_RATE) == 0)
            Thread.yield();
        else
            Thread.onSpinWait();
----
The yield instruction is also used in the kernel's cpu_relax() variants,
which look semantically close.
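
To make the pattern above concrete, here is a minimal, self-contained sketch of such a loop. It is not JDK code: the class name SpinThenYield, the mask value, and the use of ThreadLocalRandom in place of LockSupport.nextSecondarySeed() (which is not public API) are all assumptions made for illustration. It only shows where the two hints sit: Thread.yield() is the rare OS-level hint, and Thread.onSpinWait() is the per-iteration CPU-level hint that the intrinsic would map to.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, not JDK code: spin on a flag, mostly issuing the
// spin-wait hint and occasionally yielding to the OS scheduler.
final class SpinThenYield {
    // Assumed value for illustration: yield roughly once per 8 iterations.
    static final int YIELD_RATE_MASK = 7;

    // Spins until the flag becomes true; returns the number of iterations.
    static long awaitFlag(AtomicBoolean flag) {
        long spins = 0;
        while (!flag.get()) {
            if ((ThreadLocalRandom.current().nextInt() & YIELD_RATE_MASK) == 0)
                Thread.yield();       // OS-level hint: let another thread run
            else
                Thread.onSpinWait();  // CPU-level hint: we are in a spin loop
            spins++;
        }
        return spins;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean ready = new AtomicBoolean(false);
        Thread setter = new Thread(() -> ready.set(true));
        setter.start();
        long spins = awaitFlag(ready);
        setter.join();
        System.out.println("flag observed after " + spins + " iterations");
    }
}
```

The iteration count it prints depends on scheduling and on the hardware, so it says nothing by itself about whether a given instruction makes the loop faster.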
>> So, despite that the instructions superficially look similar, they have
>> diametrically opposite semantics!  But we won't really know if YIELD will
>> make a spin loop faster until somebody implements it.
I can imagine yield raising the throughput of the entire app in the SMT
case if it gives more cycles to the neighboring strand.

How do you see typical yield usage with opposite semantics?

What are other possible implementations for the intrinsic?
I'd say that even issuing 2-4 NOP instructions may be useful in both the
SMT and the temporal MT cases.
> I might re-word this to say that both PAUSE and YIELD were implemented to improve system performance while running spin-loops.
>   
> Intel's PAUSE has several parts to it:
> 1) Cancels out checking for memory order violation on out-of-order reads to speed up spin-loop exit. Also referred to as "de-pipelining"?
> 2) Adds a pause (delay) for some number of cycles (0, or 10-140) to slow down the spin-loop.
>   - Memory updates cannot happen as quickly as instruction execution,
>   - Which may reduce power consumption, or:
> 3) On hyper-threaded cores, may give core resources to the other thread for some number of cycles. If the other thread was part of the spin-loop transaction, this speeds up the spin-loop, otherwise it speeds up the system as a whole.
>
> Aarch64's YIELD seems to address feature (3) only:
>   - A hint that the current thread is low priority and can yield.
>      - On an SMT system this should be able to release core resources to other HW threads for some number of cycles.
> (ARM ARM also talks about how this can be used to "suspend and resume multiple software threads if it supports the capability", but if it's talking about triggering thread rescheduling in the kernel I don't see how the kernel gets notified.)
>
> But I assume that hardware vendors are allowed to implement NOPs like YIELD with whatever latency they choose, so feature (2) can also be supported by YIELD, depending on the implementation (which is true for Intel as well). This could be independent of the core supporting SMT.
>
>   - Derek
>
> In any case we
>> Re the use of yield in SpinPause(): this looks correct to me.  OK.
Good. This part seemed scarier.

--
Dmitry
>>
>> --
>> Andrew Haley
>> Java Platform Lead Engineer
>> Red Hat UK Ltd. <https://www.redhat.com>
>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


