RFR: 8258604: Use 'isb' instruction in SpinPause on linux-aarch64 [v2]

Stuart Monteith smonteith at openjdk.java.net
Fri Oct 1 17:49:42 UTC 2021


On Fri, 13 Aug 2021 19:33:45 GMT, Evgeny Astigeevich <github.com+42899633+eastig at openjdk.org> wrote:

>> To implement `SpinPause `an instruction creating a small delay without consuming ALU resources is needed. AArch64 has the YIELD instruction but most of AArch64 vendors still implement it as a NOP. It's been found that ISB can create a small delay without consuming ALU resources. It is more reliable than NOPs.
>> 
>> This patch implements `SpinPause` with ISB.
>> Testing: gtest, tier1, tier2
>> 
>> Results of https://github.com/ben-manes/caffeine/wiki/Benchmarks  
>> 
>> `GetPutBenchmark` has 8 threads concurrently working with a cache.
>> `GetPutBenchmark.read_only` all 8 thread read from a cache.
>> `GetPutBenchmark.write_only` all 8 thread write to a cache.
>> `GetPutBenchmark.readwrite` 6 threads concurrently read from and 2 threads write to a cache.
>> 
>> Improvements when only 4 CPUs are available:
>> 
>> 
>> (cacheType)         Benchmark                               Units    Baseline       Error           CoV     New             Error           CoV     New vs Base
>> Cache2k             GetPutBenchmark.readwrite:readwrite_get ops/s   58,893,980.32   1,903,084.75    3.23%   65,685,338.20   4,858,329.69    7.40%   11.5%
>> ExpiringMap_Fifo    GetPutBenchmark.read_only               ops/s   5,060,227.98    695,582.74      13.75%  6,040,948.02    1,046,581.70    17.32%  19.4%
>> ExpiringMap_Fifo    GetPutBenchmark.readwrite:readwrite_put ops/s   433,046.21      194,297.64      44.87%  783,825.18      272,868.60      34.81%  81.0%
>> LinkedHashMap_Lru   GetPutBenchmark.read_only               ops/s   6,405,770.38    607,772.98      9.49%   10,259,858.51   1,446,471.70    14.10%  60.2%
>> LinkedHashMap_Lru   GetPutBenchmark.readwrite:readwrite_get ops/s   3,083,275.32    400,878.84      13.00%  5,934,784.52    1,309,784.39    22.07%  92.5%
>> LinkedHashMap_Lru   GetPutBenchmark.readwrite:readwrite_put ops/s   1,669,852.29    356,469.07      21.35%  2,491,807.50    299,230.23      12.01%  49.2%
>> LinkedHashMap_Lru   GetPutBenchmark.write_only              ops/s   4,623,049.93    449,438.94      9.72%   7,069,151.98    1,725,457.51    24.41%  52.9%
>> 
>> 
>> Improvements when only 8 CPUs are available:
>> 
>> 
>> (cacheType)         Benchmark                               Units    Baseline       Error       CoV     New             Error       CoV      New vs Base
>> LinkedHashMap_Lru   GetPutBenchmark.read_only               ops/s   4,054,507.30    71,693.81   1.77%   4,306,745.31    226,212.96  5.25%   6.2%
>> LinkedHashMap_Lru   GetPutBenchmark.readwrite:readwrite_get ops/s   3,087,578.38    86,158.78   2.79%   2,818,742.51    301,261.76  10.69%  -8.7%
>> LinkedHashMap_Lru   GetPutBenchmark.readwrite:readwrite_put ops/s   1,053,560.98    66,412.90   6.30%   1,401,928.05    239,055.34  17.05%  33.1%
>> LinkedHashMap_Lru   GetPutBenchmark.write_only              ops/s   3,823,208.44    305,096.58  7.98%   4,510,823.31    390,622.00  8.66%   18.0%
>> 
>> 
>> [LinkedHashMapCache source](https://github.com/ben-manes/caffeine/blob/master/caffeine/src/jmh/java/com/github/benmanes/caffeine/cache/impl/LinkedHashMapCache.java)
>> [Cache2k source](https://github.com/ben-manes/caffeine/blob/master/caffeine/src/jmh/java/com/github/benmanes/caffeine/cache/impl/Cache2k.java)
>> [ExpiringMapCache source](https://github.com/ben-manes/caffeine/blob/master/caffeine/src/jmh/java/com/github/benmanes/caffeine/cache/impl/ExpiringMapCache.java)
>> [GetPutBenchmark source](https://github.com/ben-manes/caffeine/blob/master/caffeine/src/jmh/java/com/github/benmanes/caffeine/cache/GetPutBenchmark.java)
>
> Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8258604: Use 'isb' instruction in SpinPause on linux-aarch64
>   
>   Fix: SpinPause() should return 1 if it executes code.

src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp line 387:

> 385:     // ISB creates a small delay without consuming ALU resources.
> 386:     // This allows it to act as x86 PAUSE.
> 387:     __asm volatile("isb");

I hadn't realised this was unconditional. Similar to my comment on the onSpinWait patch, while the ISB instruction happens to work for you just now on the platforms you have tested, there is no guarantee that it will continue to do so in future processors. 
Could this be made optional? This is awkward, as it is C++ code, but it may have to be a build configuration option.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5112


More information about the hotspot-runtime-dev mailing list