RFR: 8297092: [macos_aarch64] Add support for SHA feature detection [v4]

Hao Sun haosun at openjdk.org
Thu Jan 12 07:00:37 UTC 2023


> Detection of the SHA features (SHA1, SHA2, SHA512 and SHA3) is missing on macOS on Apple silicon. This patch adds it.
> 
> * SHA1 and SHA2 are set unconditionally to true on darwin. See [1].
> 
> * SHA512 and SHA3 are set depending on the existing `hw.optional` flags.
> 
> * Some old links are updated in this patch as well.
> 
> With the detection in place, the corresponding SHA-related intrinsics flags are enabled. See the test results from my local environment (M1):
> 
> 
> $ ./jdk/bin/java -XX:+PrintFlagsFinal --version | grep SHA
>      bool UseSHA                = true       {product} {default}
>      bool UseSHA1Intrinsics     = true    {diagnostic} {default}
>      bool UseSHA256Intrinsics   = true    {diagnostic} {default}
>      bool UseSHA3Intrinsics     = false   {diagnostic} {default}
>      bool UseSHA512Intrinsics   = true    {diagnostic} {default}
> 
> $ ./jdk/bin/java -XX:+PrintFlagsFinal -XX:+UseSHA3Intrinsics  --version | grep SHA3
>      bool UseSHA3Intrinsics     = true    {diagnostic} {command line}
> 
> 
> Note that UseSHA3Intrinsics is NOT enabled by default on supported hardware because there is a performance regression on some hardware. See JDK-8292894. Hence we have to enable it explicitly.
> 
> Jtreg testing:
> 
> We ran tier1~3 on M1 and found the following two new failures. They can be fixed once JDK-8292894 is resolved.
> 
> 
> compiler/intrinsics/sha/cli/TestUseSHA3IntrinsicsOptionOnSupportedCPU.java
> 
> 
> Performance testing:
> 
> The JMH test case MessageDigests.java is used to evaluate the performance uplift from UseSHA*Intrinsics. The data on M1 is shown below.
> 
> 
> Benchmark                    (digesterName)  (length)   Mode  Cnt    Before      After   Units
> MessageDigests.digest                 SHA-1        64  thrpt   15  2670.195  14767.116  ops/ms
> MessageDigests.digest                 SHA-1     16384  thrpt   15    21.756    147.816  ops/ms
> MessageDigests.digest               SHA-224        64  thrpt   15  1729.686  17366.833  ops/ms
> MessageDigests.digest               SHA-224     16384  thrpt   15    14.044    149.707  ops/ms
> MessageDigests.digest               SHA-256        64  thrpt   15  1729.529  16915.554  ops/ms
> MessageDigests.digest               SHA-256     16384  thrpt   15    14.040    149.720  ops/ms
> MessageDigests.digest               SHA-384        64  thrpt   15  2551.285   8996.372  ops/ms
> MessageDigests.digest               SHA-384     16384  thrpt   15    21.338     86.996  ops/ms
> MessageDigests.digest               SHA-512        64  thrpt   15  2528.185   8886.926  ops/ms
> MessageDigests.digest               SHA-512     16384  thrpt   15    21.298     86.840  ops/ms
> MessageDigests.digest              SHA3-256        64  thrpt   15  3595.392   5628.810  ops/ms
> MessageDigests.digest              SHA3-256     16384  thrpt   15    29.208     52.917  ops/ms
> MessageDigests.digest              SHA3-512        64  thrpt   15  3604.652   5563.199  ops/ms
> MessageDigests.digest              SHA3-512     16384  thrpt   15    16.545     28.182  ops/ms
> MessageDigests.getAndDigest           SHA-1        64  thrpt   15  2467.470  11878.581  ops/ms
> MessageDigests.getAndDigest           SHA-1     16384  thrpt   15    21.179    146.849  ops/ms
> MessageDigests.getAndDigest         SHA-224        64  thrpt   15  1635.583   5864.632  ops/ms
> MessageDigests.getAndDigest         SHA-224     16384  thrpt   15    14.023    148.556  ops/ms
> MessageDigests.getAndDigest         SHA-256        64  thrpt   15  1633.913   5741.416  ops/ms
> MessageDigests.getAndDigest         SHA-256     16384  thrpt   15    14.029    148.518  ops/ms
> MessageDigests.getAndDigest         SHA-384        64  thrpt   15  2373.041   7634.589  ops/ms
> MessageDigests.getAndDigest         SHA-384     16384  thrpt   15    21.335     86.452  ops/ms
> MessageDigests.getAndDigest         SHA-512        64  thrpt   15  2366.305   7781.053  ops/ms
> MessageDigests.getAndDigest         SHA-512     16384  thrpt   15    21.324     86.352  ops/ms
> MessageDigests.getAndDigest        SHA3-256        64  thrpt   15  3154.196   4770.368  ops/ms
> MessageDigests.getAndDigest        SHA3-256     16384  thrpt   15    29.028     52.621  ops/ms
> MessageDigests.getAndDigest        SHA3-512        64  thrpt   15  3345.374   4828.986  ops/ms
> MessageDigests.getAndDigest        SHA3-512     16384  thrpt   15    16.443     28.063  ops/ms
> 
> 
> [1] https://github.com/golang/go/blob/master/src/internal/cpu/cpu_arm64_darwin.go#L14
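
The uplift in the table above is easier to read as a ratio of After to Before. The following is a small sketch that computes that ratio for four rows copied from the `MessageDigests.digest` part of the table; the `Row` struct and the helper names are illustrative only and are not part of the patch or of the JMH benchmark.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// One benchmark row: throughput in ops/ms before and after the patch.
struct Row {
  std::string digester;
  int length;
  double before;
  double after;
};

// Speedup factor: how many times higher the throughput is after the patch.
inline double speedup(const Row& r) { return r.after / r.before; }

// A few MessageDigests.digest rows copied verbatim from the table.
inline std::vector<Row> sample_rows() {
  return {
      {"SHA-1", 64, 2670.195, 14767.116},
      {"SHA-256", 16384, 14.040, 149.720},
      {"SHA-512", 16384, 21.298, 86.840},
      {"SHA3-256", 16384, 29.208, 52.917},
  };
}

// Print one line per row with the speedup rounded to two decimals.
inline void print_speedups(const std::vector<Row>& rows) {
  for (const Row& r : rows) {
    std::printf("%-9s %6d  %.2fx\n", r.digester.c_str(), r.length, speedup(r));
  }
}
```

Calling `print_speedups(sample_rows())` from a `main` function prints one ratio per row; the remaining rows of the table can be added to `sample_rows()` in the same way.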

Hao Sun has updated the pull request incrementally with one additional commit since the last revision:

  Improve the comment
  
  1) As suggested by aph, state the performance result of
  UseSHA3Intrinsics on existing hardware with SHA3 feature.
  
  2) As suggested by gerard-ziemski, use Apple developer document and XNU
  kernel code base as the reference, and improve the ASIMD and LSE feature
  detection (i.e. checking both parameters of the sysctlbyname function).
  
  Besides, we also add assertions for AES/PMULL/SHA1/SHA2 features.
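
The detection scheme described in the PR (SHA1 and SHA2 assumed present on darwin, SHA512 and SHA3 read from `hw.optional` flags) can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: the `ShaFeatures` struct, the `FlagReader` type and the helper names are hypothetical, and the two sysctl names `hw.optional.armv8_2_sha512` and `hw.optional.armv8_2_sha3` are assumptions about which `hw.optional` flags are queried. The reader is passed in as a parameter so the logic can be exercised on machines that do not have these flags.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Reads an integer flag by name. Returns true and stores the value on
// success, false if the flag cannot be read.
using FlagReader = std::function<bool(const std::string&, uint32_t&)>;

// Reader backed by sysctlbyname on macOS. On other platforms it reports
// every flag as unavailable.
inline bool read_sysctl_flag(const std::string& name, uint32_t& value) {
#if defined(__APPLE__)
  uint32_t val = 0;
  size_t len = sizeof(val);
  if (sysctlbyname(name.c_str(), &val, &len, nullptr, 0) != 0) {
    return false;  // flag is unknown on this system
  }
  value = val;
  return true;
#else
  (void)name;
  (void)value;
  return false;
#endif
}

// Hypothetical result type for this sketch.
struct ShaFeatures {
  bool sha1;
  bool sha2;
  bool sha512;
  bool sha3;
};

// A feature is present only if the flag is readable and non-zero.
inline bool cpu_has(const FlagReader& read, const std::string& name) {
  uint32_t v = 0;
  return read(name, v) && v != 0;
}

inline ShaFeatures detect_sha_features(const FlagReader& read) {
  ShaFeatures f;
  f.sha1 = true;  // set unconditionally on darwin, per the PR description
  f.sha2 = true;  // set unconditionally on darwin, per the PR description
  // Flag names below are assumptions, see the lead-in.
  f.sha512 = cpu_has(read, "hw.optional.armv8_2_sha512");
  f.sha3 = cpu_has(read, "hw.optional.armv8_2_sha3");
  return f;
}
```

On a Mac, `detect_sha_features(read_sysctl_flag)` queries the running system; in a unit test, a lambda that returns canned values can be passed instead.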

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/11382/files
  - new: https://git.openjdk.org/jdk/pull/11382/files/2d22a67c..727d2bef

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=11382&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11382&range=02-03

  Stats: 31 lines in 2 files changed: 20 ins; 5 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/11382.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/11382/head:pull/11382

PR: https://git.openjdk.org/jdk/pull/11382
