RFR: 8297092: [macos_aarch64] Add support for SHA feature detection [v3]
Hao Sun
haosun at openjdk.org
Tue Jan 3 08:02:32 UTC 2023
> The detection of SHA features, including SHA1, SHA2, SHA512 and SHA3, is missing on macOS on Apple silicon. This patch adds the support.
>
> * SHA1 and SHA2 are set unconditionally to true on darwin. See [1].
>
> * SHA512 and SHA3 are set according to the existing `hw.optional` flags.
>
> * Some old links are updated in this patch as well.
>
> With the detection in place, the corresponding SHA-related intrinsics flags are enabled. See the test result from my local environment (M1).
>
>
> $ ./jdk/bin/java -XX:+PrintFlagsFinal --version | grep SHA
> bool UseSHA = true {product} {default}
> bool UseSHA1Intrinsics = true {diagnostic} {default}
> bool UseSHA256Intrinsics = true {diagnostic} {default}
> bool UseSHA3Intrinsics = false {diagnostic} {default}
> bool UseSHA512Intrinsics = true {diagnostic} {default}
>
> $ ./jdk/bin/java -XX:+PrintFlagsFinal -XX:+UseSHA3Intrinsics --version | grep SHA3
> bool UseSHA3Intrinsics = true {diagnostic} {command line}
>
>
> Note that UseSHA3Intrinsics is NOT enabled by default on supported hardware because there is a performance regression on some hardware. See JDK-8292894. Hence we have to enable it explicitly.
>
> Jtreg testing:
>
> We ran tier1~3 on M1 and found the following two new failures. They can be fixed once JDK-8292894 is resolved.
>
>
> compiler/intrinsics/sha/cli/TestUseSHA3IntrinsicsOptionOnSupportedCPU.java
>
>
> Performance testing:
>
> The JMH test case MessageDigests.java is used to evaluate the performance uplift of UseSHA*Intrinsics. The data on M1 is shown below.
>
>
> Benchmark                    (digesterName)  (length)   Mode  Cnt    Before      After  Units
> MessageDigests.digest                 SHA-1        64  thrpt   15  2670.195  14767.116  ops/ms
> MessageDigests.digest                 SHA-1     16384  thrpt   15    21.756    147.816  ops/ms
> MessageDigests.digest               SHA-224        64  thrpt   15  1729.686  17366.833  ops/ms
> MessageDigests.digest               SHA-224     16384  thrpt   15    14.044    149.707  ops/ms
> MessageDigests.digest               SHA-256        64  thrpt   15  1729.529  16915.554  ops/ms
> MessageDigests.digest               SHA-256     16384  thrpt   15    14.040    149.720  ops/ms
> MessageDigests.digest               SHA-384        64  thrpt   15  2551.285   8996.372  ops/ms
> MessageDigests.digest               SHA-384     16384  thrpt   15    21.338     86.996  ops/ms
> MessageDigests.digest               SHA-512        64  thrpt   15  2528.185   8886.926  ops/ms
> MessageDigests.digest               SHA-512     16384  thrpt   15    21.298     86.840  ops/ms
> MessageDigests.digest              SHA3-256        64  thrpt   15  3595.392   5628.810  ops/ms
> MessageDigests.digest              SHA3-256     16384  thrpt   15    29.208     52.917  ops/ms
> MessageDigests.digest              SHA3-512        64  thrpt   15  3604.652   5563.199  ops/ms
> MessageDigests.digest              SHA3-512     16384  thrpt   15    16.545     28.182  ops/ms
> MessageDigests.getAndDigest           SHA-1        64  thrpt   15  2467.470  11878.581  ops/ms
> MessageDigests.getAndDigest           SHA-1     16384  thrpt   15    21.179    146.849  ops/ms
> MessageDigests.getAndDigest         SHA-224        64  thrpt   15  1635.583   5864.632  ops/ms
> MessageDigests.getAndDigest         SHA-224     16384  thrpt   15    14.023    148.556  ops/ms
> MessageDigests.getAndDigest         SHA-256        64  thrpt   15  1633.913   5741.416  ops/ms
> MessageDigests.getAndDigest         SHA-256     16384  thrpt   15    14.029    148.518  ops/ms
> MessageDigests.getAndDigest         SHA-384        64  thrpt   15  2373.041   7634.589  ops/ms
> MessageDigests.getAndDigest         SHA-384     16384  thrpt   15    21.335     86.452  ops/ms
> MessageDigests.getAndDigest         SHA-512        64  thrpt   15  2366.305   7781.053  ops/ms
> MessageDigests.getAndDigest         SHA-512     16384  thrpt   15    21.324     86.352  ops/ms
> MessageDigests.getAndDigest        SHA3-256        64  thrpt   15  3154.196   4770.368  ops/ms
> MessageDigests.getAndDigest        SHA3-256     16384  thrpt   15    29.028     52.621  ops/ms
> MessageDigests.getAndDigest        SHA3-512        64  thrpt   15  3345.374   4828.986  ops/ms
> MessageDigests.getAndDigest        SHA3-512     16384  thrpt   15    16.443     28.063  ops/ms
>
>
> [1] https://github.com/golang/go/blob/master/src/internal/cpu/cpu_arm64_darwin.go#L14
Hao Sun has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
- Update the copyright year to 2023
- Merge branch 'master' into jdk-8297092
- Enable UseSHA3Intrinsics by default on Apple silicon
The test result on M1 is shown below.
```
$ ./build-fastdebug/images/jdk/bin/java -XX:+PrintFlagsFinal --version | grep SHA
bool UseSHA = true {product} {default}
bool UseSHA1Intrinsics = true {diagnostic} {default}
bool UseSHA256Intrinsics = true {diagnostic} {default}
bool UseSHA3Intrinsics = true {diagnostic} {default}
bool UseSHA512Intrinsics = true {diagnostic} {default}
```
- 8297092: [macos_aarch64] Add support for SHA feature detection
The detection of SHA features, including SHA1, SHA2, SHA512 and SHA3, is
missing on macOS on Apple silicon. This patch adds the support.
* SHA1 and SHA2 are set unconditionally to true on darwin. See [1].
* SHA512 and SHA3 are set according to the existing `hw.optional`
flags.
* Some old links are updated in this patch as well.
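The detection rule in the bullets above can be sketched outside HotSpot as follows. This is only an illustration in Python, not the C++ code of the patch: the helper names `query_sysctl_flag` and `detect_sha_features` are hypothetical, and the sysctl keys `hw.optional.armv8_2_sha512` and `hw.optional.armv8_2_sha3` are assumed names for the `hw.optional` flags mentioned above.

```python
import subprocess

# Assumed sysctl keys for the optional features. The text above only says
# "`hw.optional` flags", so treat these exact names as an assumption.
OPTIONAL_FLAGS = {
    "SHA512": "hw.optional.armv8_2_sha512",
    "SHA3": "hw.optional.armv8_2_sha3",
}


def query_sysctl_flag(name):
    """Return True if `sysctl -n <name>` prints 1, False on any failure."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", name],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        # The sysctl tool is not available on this system.
        return False
    return result.returncode == 0 and result.stdout.strip() == "1"


def detect_sha_features(query=query_sysctl_flag):
    """Hypothetical helper mirroring the rule described in the bullets."""
    # SHA1 and SHA2 are set unconditionally, as on darwin in the patch.
    features = {"SHA1": True, "SHA2": True}
    # SHA512 and SHA3 depend on what the OS reports.
    for feature, flag in OPTIONAL_FLAGS.items():
        features[feature] = bool(query(flag))
    return features


if __name__ == "__main__":
    for feature, present in detect_sha_features().items():
        print(f"{feature}: {present}")
```

The result is only meaningful on macOS on Apple silicon, since the unconditional SHA1/SHA2 entries encode the darwin-specific assumption. Passing a custom `query` callable makes the rule easy to exercise without the hardware.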
With the detection in place, the corresponding SHA-related intrinsics
flags are enabled. See the test result from my local environment (M1).
```
$ ./jdk/bin/java -XX:+PrintFlagsFinal --version | grep SHA
bool UseSHA = true {product} {default}
bool UseSHA1Intrinsics = true {diagnostic} {default}
bool UseSHA256Intrinsics = true {diagnostic} {default}
bool UseSHA3Intrinsics = false {diagnostic} {default}
bool UseSHA512Intrinsics = true {diagnostic} {default}
$ ./jdk/bin/java -XX:+PrintFlagsFinal -XX:+UseSHA3Intrinsics --version | grep SHA3
bool UseSHA3Intrinsics = true {diagnostic} {command line}
```
Note that UseSHA3Intrinsics is NOT enabled by default on supported
hardware because there is a performance regression on some hardware. See
JDK-8292894. Hence we have to enable it explicitly.
Jtreg testing:
We ran tier1~3 on M1 and found the following two new failures. They can
be fixed once JDK-8292894 is resolved.
```
compiler/intrinsics/sha/cli/TestUseSHA3IntrinsicsOptionOnSupportedCPU.java
```
Performance testing:
The JMH test case MessageDigests.java is used to evaluate the performance
uplift of UseSHA*Intrinsics. The data on M1 is shown below.
```
Benchmark                    (digesterName)  (length)   Mode  Cnt    Before      After  Units
MessageDigests.digest                 SHA-1        64  thrpt   15  2670.195  14767.116  ops/ms
MessageDigests.digest                 SHA-1     16384  thrpt   15    21.756    147.816  ops/ms
MessageDigests.digest               SHA-224        64  thrpt   15  1729.686  17366.833  ops/ms
MessageDigests.digest               SHA-224     16384  thrpt   15    14.044    149.707  ops/ms
MessageDigests.digest               SHA-256        64  thrpt   15  1729.529  16915.554  ops/ms
MessageDigests.digest               SHA-256     16384  thrpt   15    14.040    149.720  ops/ms
MessageDigests.digest               SHA-384        64  thrpt   15  2551.285   8996.372  ops/ms
MessageDigests.digest               SHA-384     16384  thrpt   15    21.338     86.996  ops/ms
MessageDigests.digest               SHA-512        64  thrpt   15  2528.185   8886.926  ops/ms
MessageDigests.digest               SHA-512     16384  thrpt   15    21.298     86.840  ops/ms
MessageDigests.digest              SHA3-256        64  thrpt   15  3595.392   5628.810  ops/ms
MessageDigests.digest              SHA3-256     16384  thrpt   15    29.208     52.917  ops/ms
MessageDigests.digest              SHA3-512        64  thrpt   15  3604.652   5563.199  ops/ms
MessageDigests.digest              SHA3-512     16384  thrpt   15    16.545     28.182  ops/ms
MessageDigests.getAndDigest           SHA-1        64  thrpt   15  2467.470  11878.581  ops/ms
MessageDigests.getAndDigest           SHA-1     16384  thrpt   15    21.179    146.849  ops/ms
MessageDigests.getAndDigest         SHA-224        64  thrpt   15  1635.583   5864.632  ops/ms
MessageDigests.getAndDigest         SHA-224     16384  thrpt   15    14.023    148.556  ops/ms
MessageDigests.getAndDigest         SHA-256        64  thrpt   15  1633.913   5741.416  ops/ms
MessageDigests.getAndDigest         SHA-256     16384  thrpt   15    14.029    148.518  ops/ms
MessageDigests.getAndDigest         SHA-384        64  thrpt   15  2373.041   7634.589  ops/ms
MessageDigests.getAndDigest         SHA-384     16384  thrpt   15    21.335     86.452  ops/ms
MessageDigests.getAndDigest         SHA-512        64  thrpt   15  2366.305   7781.053  ops/ms
MessageDigests.getAndDigest         SHA-512     16384  thrpt   15    21.324     86.352  ops/ms
MessageDigests.getAndDigest        SHA3-256        64  thrpt   15  3154.196   4770.368  ops/ms
MessageDigests.getAndDigest        SHA3-256     16384  thrpt   15    29.028     52.621  ops/ms
MessageDigests.getAndDigest        SHA3-512        64  thrpt   15  3345.374   4828.986  ops/ms
MessageDigests.getAndDigest        SHA3-512     16384  thrpt   15    16.443     28.063  ops/ms
```
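To read the table as speedups, divide the After column by the Before column. The sketch below does this for a few MessageDigests.digest rows copied from the table above; the `speedup` helper and the `ROWS` selection are hypothetical and only for illustration.

```python
# (digesterName, length, before, after) in ops/ms, copied from the
# MessageDigests.digest rows of the table above.
ROWS = [
    ("SHA-1", 64, 2670.195, 14767.116),
    ("SHA-256", 16384, 14.040, 149.720),
    ("SHA-512", 16384, 21.298, 86.840),
    ("SHA3-256", 16384, 29.208, 52.917),
]


def speedup(before, after):
    """Throughput ratio; values above 1.0 mean the intrinsic is faster."""
    return after / before


if __name__ == "__main__":
    for name, length, before, after in ROWS:
        print(f"{name:9} {length:6d} {speedup(before, after):6.2f}x")
```

For these rows the ratio ranges from roughly 1.8x (SHA3-256) to more than 10x (SHA-256 at 16384 bytes).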
[1] https://github.com/golang/go/blob/master/src/internal/cpu/cpu_arm64_darwin.go#L14
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/11382/files
- new: https://git.openjdk.org/jdk/pull/11382/files/4970f1a4..2d22a67c
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=11382&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=11382&range=01-02
Stats: 115736 lines in 2534 files changed: 59966 ins; 42543 del; 13227 mod
Patch: https://git.openjdk.org/jdk/pull/11382.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/11382/head:pull/11382
PR: https://git.openjdk.org/jdk/pull/11382