RFR: 8322535: Change default AArch64 SpinPause instruction

Andrew Haley aph at openjdk.org
Tue Jan 16 14:57:22 UTC 2024


On Tue, 16 Jan 2024 10:18:26 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> The Java option OnSpinWaitInst lets you choose which AArch64 instruction should be used in `SpinPause()`. Valid values are "none", "nop", "isb" and "yield". Today the default value for OnSpinWaitInst is unfortunately "none".
>> 
>> However, for some CPUs the default SpinPause instruction is changed to something better if the user hasn't set the OnSpinWaitInst option. For instance, if you run on a Neoverse N1, N2, V1 or V2, the default SpinPause instruction will be changed to "isb". After doing some measurements on Apple's M1-M3 CPUs, it also seems like "isb" is the best yielding instruction on those CPUs.
>> 
>> This PR changes the default SpinPause instruction to "yield" on all AArch64 platforms except on Apple's M1, M2 and M3 CPUs on which the default value will be "isb".
>> 
>> Tested tier1-tier7 successfully on linux-aarch64 and macosx-aarch64.
>
> ISB isn't really the right thing for this. Sure, it causes a delay, but the extent of the delay depends on what else the processor is doing. In some cases an ISB can work well, in other cases not. Some microbenchmarks show a great improvement with ISB.
> It doesn't depend only on the target hardware, but on the application. Sure, in some cases an ISB is going to be exactly right, but in others it might be too much.

> > For the most part, "YIELD" is probably going to be equivalent to a "NOP". Unless there is a demonstrable reason for this change, I would leave it as it is. With regards to the change, do you have a suite of benchmark data that demonstrates this is a benefit on Apple Silicon? Otherwise, as @theRealAph says, microbenchmarks can demonstrate a benefit from ISBs, but applications overall won't necessarily show any benefit.
> 
> I agree it would be interesting to see whether desktop applications get any improvements from ISB. We observed good performance improvements in our customers' cloud applications.

Your customers are running cloud apps on Apple M1/M2?

> BTW, ISB gets spread: https://github.com/DLTcollab/sse2neon/blob/master/sse2neon.h#L4812C14-L4812C14 [rust-lang/rust at c064b65](https://github.com/rust-lang/rust/commit/c064b6560b7ce0adeb9bbf5d7dcf12b1acb0c807) https://github.com/simd-everywhere/simde/blob/14311d60539303ca8bad6204dcbc6a29f51b0e09/simde/x86/sse2.h#L4770

Yeah, I know. What I don't know is how much of a cargo cult this is. Apple M1 etc. have very large reorder buffers, and serializing all instructions may not be the best plan.
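
For readers following along, here is a rough sketch of the kind of spin-wait loop under discussion. It is not HotSpot's `SpinPause()` implementation. `spin_pause`, `spin_until_set` and the `SPIN_USE_ISB` macro are hypothetical names made up for illustration, and the inline assembly assumes a GCC- or Clang-compatible compiler. On non-AArch64 targets the pause degrades to a plain compiler barrier.

```c
#include <stdatomic.h>

/* Hypothetical compile-time switch for this sketch; NOT a HotSpot flag. */
#ifndef SPIN_USE_ISB
#define SPIN_USE_ISB 0
#endif

/* One "pause" step inside a spin loop. */
static inline void spin_pause(void) {
#if defined(__aarch64__) && SPIN_USE_ISB
    /* ISB: instruction synchronization barrier. The delay it causes
       depends on what else the core is doing at the time. */
    __asm__ volatile("isb" ::: "memory");
#elif defined(__aarch64__)
    /* YIELD: a hint instruction; may behave like a NOP. */
    __asm__ volatile("yield" ::: "memory");
#else
    /* Other targets: compiler barrier only, no instruction emitted. */
    __asm__ volatile("" ::: "memory");
#endif
}

/* Spin until *flag becomes non-zero or max_spins pauses have been made.
   Returns the number of pause steps actually executed. */
static long spin_until_set(atomic_int *flag, long max_spins) {
    long spins = 0;
    while (spins < max_spins &&
           atomic_load_explicit(flag, memory_order_acquire) == 0) {
        spin_pause();
        spins++;
    }
    return spins;
}
```

Building this with and without `-DSPIN_USE_ISB=1` and timing a contended loop gives a microbenchmark of the sort mentioned above. As noted, such numbers need not carry over to whole applications.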

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17430#issuecomment-1893909840

