RFR: 8297092: [macos_aarch64] Add support for SHA feature detection [v3]

Hao Sun haosun at openjdk.org
Thu Jan 12 07:06:29 UTC 2023


On Tue, 10 Jan 2023 22:02:50 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:

>> Hao Sun has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>> 
>>  - Update the copyright year to 2023
>>  - Merge branch 'master' into jdk-8297092
>>  - Enable UseSHA3Intrinsics by default on Apple silicon
>>    
>>    Here is the test result on M1.
>>    
>>    ```
>>    $ ./build-fastdebug/images/jdk/bin/java -XX:+PrintFlagsFinal --version | grep SHA
>>      bool UseSHA                   = true       {product} {default}
>>      bool UseSHA1Intrinsics        = true    {diagnostic} {default}
>>      bool UseSHA256Intrinsics      = true    {diagnostic} {default}
>>      bool UseSHA3Intrinsics        = true    {diagnostic} {default}
>>      bool UseSHA512Intrinsics      = true    {diagnostic} {default}
>>    ```
>>  - 8297092: [macos_aarch64] Add support for SHA feature detection
>>    
>>    The detection of SHA features, including SHA1, SHA2, SHA512 and SHA3, is
>>    missing on macOS on Apple silicon. This patch adds the support.
>>    
>>    * SHA1 and SHA2 are set unconditionally to true on darwin. See [1].
>>    
>>    * SHA512 and SHA3 will be set depending on the existing `hw.optional`
>>      flags
>>    
>>    * Some old links are updated in this patch as well.
>>    
>>    With the detection, the corresponding SHA-related intrinsic flags are
>>    enabled. See the test result in my local environment, i.e. M1.
>>    
>>    ```
>>    $ ./jdk/bin/java -XX:+PrintFlagsFinal --version | grep SHA
>>         bool UseSHA                = true       {product} {default}
>>         bool UseSHA1Intrinsics     = true    {diagnostic} {default}
>>         bool UseSHA256Intrinsics   = true    {diagnostic} {default}
>>         bool UseSHA3Intrinsics     = false   {diagnostic} {default}
>>         bool UseSHA512Intrinsics   = true    {diagnostic} {default}
>>    
>>    $ ./jdk/bin/java -XX:+PrintFlagsFinal -XX:+UseSHA3Intrinsics  --version | grep SHA3
>>         bool UseSHA3Intrinsics     = true    {diagnostic} {command line}
>>    ```
>>    
>>    Note that UseSHA3Intrinsics is NOT enabled by default on supported
>>    hardware because there is a performance regression on some hardware. See
>>    JDK-8292894. Hence we have to enable it explicitly.
>>    
>>    Jtreg testing:
>>    
>>    We ran tier1~3 on M1 and found the following two new failures. They can
>>    be fixed once JDK-8292894 is resolved.
>>    
>>    ```
>>    compiler/intrinsics/sha/cli/TestUseSHA3IntrinsicsOptionOnSupportedCPU.java
>>    ```
>>    
>>    Performance testing:
>>    
>>    The JMH test case MessageDigests.java is used to evaluate the performance
>>    uplift of UseSHA*Intrinsics. Here is the data on M1.
>>    
>>    ```
>>    Benchmark                    (digesterName)  (length)   Mode  Cnt    Before      After   Units
>>    MessageDigests.digest                 SHA-1        64  thrpt   15  2670.195  14767.116  ops/ms
>>    MessageDigests.digest                 SHA-1     16384  thrpt   15    21.756    147.816  ops/ms
>>    MessageDigests.digest               SHA-224        64  thrpt   15  1729.686  17366.833  ops/ms
>>    MessageDigests.digest               SHA-224     16384  thrpt   15    14.044    149.707  ops/ms
>>    MessageDigests.digest               SHA-256        64  thrpt   15  1729.529  16915.554  ops/ms
>>    MessageDigests.digest               SHA-256     16384  thrpt   15    14.040    149.720  ops/ms
>>    MessageDigests.digest               SHA-384        64  thrpt   15  2551.285   8996.372  ops/ms
>>    MessageDigests.digest               SHA-384     16384  thrpt   15    21.338     86.996  ops/ms
>>    MessageDigests.digest               SHA-512        64  thrpt   15  2528.185   8886.926  ops/ms
>>    MessageDigests.digest               SHA-512     16384  thrpt   15    21.298     86.840  ops/ms
>>    MessageDigests.digest              SHA3-256        64  thrpt   15  3595.392   5628.810  ops/ms
>>    MessageDigests.digest              SHA3-256     16384  thrpt   15    29.208     52.917  ops/ms
>>    MessageDigests.digest              SHA3-512        64  thrpt   15  3604.652   5563.199  ops/ms
>>    MessageDigests.digest              SHA3-512     16384  thrpt   15    16.545     28.182  ops/ms
>>    MessageDigests.getAndDigest           SHA-1        64  thrpt   15  2467.470  11878.581  ops/ms
>>    MessageDigests.getAndDigest           SHA-1     16384  thrpt   15    21.179    146.849  ops/ms
>>    MessageDigests.getAndDigest         SHA-224        64  thrpt   15  1635.583   5864.632  ops/ms
>>    MessageDigests.getAndDigest         SHA-224     16384  thrpt   15    14.023    148.556  ops/ms
>>    MessageDigests.getAndDigest         SHA-256        64  thrpt   15  1633.913   5741.416  ops/ms
>>    MessageDigests.getAndDigest         SHA-256     16384  thrpt   15    14.029    148.518  ops/ms
>>    MessageDigests.getAndDigest         SHA-384        64  thrpt   15  2373.041   7634.589  ops/ms
>>    MessageDigests.getAndDigest         SHA-384     16384  thrpt   15    21.335     86.452  ops/ms
>>    MessageDigests.getAndDigest         SHA-512        64  thrpt   15  2366.305   7781.053  ops/ms
>>    MessageDigests.getAndDigest         SHA-512     16384  thrpt   15    21.324     86.352  ops/ms
>>    MessageDigests.getAndDigest        SHA3-256        64  thrpt   15  3154.196   4770.368  ops/ms
>>    MessageDigests.getAndDigest        SHA3-256     16384  thrpt   15    29.028     52.621  ops/ms
>>    MessageDigests.getAndDigest        SHA3-512        64  thrpt   15  3345.374   4828.986  ops/ms
>>    MessageDigests.getAndDigest        SHA3-512     16384  thrpt   15    16.443     28.063  ops/ms
>>    ```
>>    
>>    [1] https://github.com/golang/go/blob/master/src/internal/cpu/cpu_arm64_darwin.go#L14
>
> src/hotspot/os_cpu/bsd_aarch64/vm_version_bsd_aarch64.cpp line 75:
> 
>> 73:     _features |= CPU_CRC32;
>> 74:   }
>> 75:   if (cpu_has("hw.optional.armv8_1_atomics")) {
> 
> Why aren't we using `hw.optional.arm.FEAT_LSE` instead of `hw.optional.armv8_1_atomics` as per https://developer.apple.com/documentation/kernel/1387446-sysctlbyname/determining_instruction_set_characteristics ?

Hi @gerard-ziemski, I improved the ASIMD and LSE feature detection in the way I described in my previous comment. See the latest revision. Does that look good to you? Thanks.

-------------

PR: https://git.openjdk.org/jdk/pull/11382


More information about the hotspot-runtime-dev mailing list