RFR: 8297092: [macos_aarch64] Add support for SHA feature detection [v4]
Hao Sun
haosun at openjdk.org
Thu Jan 12 07:00:37 UTC 2023
> Detection of the SHA features (SHA1, SHA2, SHA512 and SHA3) is missing on macOS on Apple silicon. This patch adds it.
>
> * SHA1 and SHA2 are set unconditionally to true on darwin. See [1].
>
> * SHA512 and SHA3 are set depending on the existing `hw.optional` flags.
>
> * Some old links are updated in this patch as well.
>
> With the detection in place, the corresponding SHA-related intrinsic flags are enabled. See the test result from my local environment, an M1.
>
>
> $ ./jdk/bin/java -XX:+PrintFlagsFinal --version | grep SHA
> bool UseSHA = true {product} {default}
> bool UseSHA1Intrinsics = true {diagnostic} {default}
> bool UseSHA256Intrinsics = true {diagnostic} {default}
> bool UseSHA3Intrinsics = false {diagnostic} {default}
> bool UseSHA512Intrinsics = true {diagnostic} {default}
>
> $ ./jdk/bin/java -XX:+PrintFlagsFinal -XX:+UseSHA3Intrinsics --version | grep SHA3
> bool UseSHA3Intrinsics = true {diagnostic} {command line}
>
>
> Note that UseSHA3Intrinsics is NOT enabled by default on supported hardware because there is a performance regression on some hardware. See JDK-8292894. Hence we have to enable it explicitly.
>
> Jtreg testing:
>
> We ran tier1~3 on M1 and found the following two new failures. They can be fixed once JDK-8292894 is resolved.
>
>
> compiler/intrinsics/sha/cli/TestUseSHA3IntrinsicsOptionOnSupportedCPU.java
>
>
> Performance testing:
>
> The JMH test case MessageDigests.java is used to evaluate the performance uplift from UseSHA*Intrinsics. The data on M1 is shown below.
>
>
> Benchmark                    (digesterName)  (length)   Mode  Cnt    Before      After  Units
> MessageDigests.digest                 SHA-1        64  thrpt   15  2670.195  14767.116  ops/ms
> MessageDigests.digest                 SHA-1     16384  thrpt   15    21.756    147.816  ops/ms
> MessageDigests.digest               SHA-224        64  thrpt   15  1729.686  17366.833  ops/ms
> MessageDigests.digest               SHA-224     16384  thrpt   15    14.044    149.707  ops/ms
> MessageDigests.digest               SHA-256        64  thrpt   15  1729.529  16915.554  ops/ms
> MessageDigests.digest               SHA-256     16384  thrpt   15    14.040    149.720  ops/ms
> MessageDigests.digest               SHA-384        64  thrpt   15  2551.285   8996.372  ops/ms
> MessageDigests.digest               SHA-384     16384  thrpt   15    21.338     86.996  ops/ms
> MessageDigests.digest               SHA-512        64  thrpt   15  2528.185   8886.926  ops/ms
> MessageDigests.digest               SHA-512     16384  thrpt   15    21.298     86.840  ops/ms
> MessageDigests.digest              SHA3-256        64  thrpt   15  3595.392   5628.810  ops/ms
> MessageDigests.digest              SHA3-256     16384  thrpt   15    29.208     52.917  ops/ms
> MessageDigests.digest              SHA3-512        64  thrpt   15  3604.652   5563.199  ops/ms
> MessageDigests.digest              SHA3-512     16384  thrpt   15    16.545     28.182  ops/ms
> MessageDigests.getAndDigest           SHA-1        64  thrpt   15  2467.470  11878.581  ops/ms
> MessageDigests.getAndDigest           SHA-1     16384  thrpt   15    21.179    146.849  ops/ms
> MessageDigests.getAndDigest         SHA-224        64  thrpt   15  1635.583   5864.632  ops/ms
> MessageDigests.getAndDigest         SHA-224     16384  thrpt   15    14.023    148.556  ops/ms
> MessageDigests.getAndDigest         SHA-256        64  thrpt   15  1633.913   5741.416  ops/ms
> MessageDigests.getAndDigest         SHA-256     16384  thrpt   15    14.029    148.518  ops/ms
> MessageDigests.getAndDigest         SHA-384        64  thrpt   15  2373.041   7634.589  ops/ms
> MessageDigests.getAndDigest         SHA-384     16384  thrpt   15    21.335     86.452  ops/ms
> MessageDigests.getAndDigest         SHA-512        64  thrpt   15  2366.305   7781.053  ops/ms
> MessageDigests.getAndDigest         SHA-512     16384  thrpt   15    21.324     86.352  ops/ms
> MessageDigests.getAndDigest        SHA3-256        64  thrpt   15  3154.196   4770.368  ops/ms
> MessageDigests.getAndDigest        SHA3-256     16384  thrpt   15    29.028     52.621  ops/ms
> MessageDigests.getAndDigest        SHA3-512        64  thrpt   15  3345.374   4828.986  ops/ms
> MessageDigests.getAndDigest        SHA3-512     16384  thrpt   15    16.443     28.063  ops/ms
>
>
> [1] https://github.com/golang/go/blob/master/src/internal/cpu/cpu_arm64_darwin.go#L14
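
For readers who want a rough feel for what the `MessageDigests.digest` rows above exercise, here is a minimal sketch. It is not the JMH benchmark from the JDK repository: the class name `DigestSketch` and both helper methods are hypothetical, and the `System.nanoTime` loop is only a crude stand-in for a proper JMH run. It simply hashes a zero-filled buffer of the given length with `java.security.MessageDigest`, which is the code path the SHA intrinsics accelerate.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestSketch {
    // Hash the buffer with the named algorithm, e.g. "SHA-256" or "SHA3-512".
    static byte[] digest(String algorithm, byte[] data) {
        try {
            return MessageDigest.getInstance(algorithm).digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Algorithm not available: " + algorithm, e);
        }
    }

    // Crude throughput estimate in operations per millisecond (no warmup, no forks).
    static double opsPerMs(String algorithm, int length, int iterations) {
        byte[] data = new byte[length];
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += digest(algorithm, data)[0]; // use the result so the call is not optimized away
        }
        long elapsedNs = Math.max(1, System.nanoTime() - start);
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically unreachable; keeps 'sink' live
        }
        return iterations / (elapsedNs / 1_000_000.0);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"SHA-1", "SHA-256", "SHA-512", "SHA3-256"}) {
            System.out.printf("%-9s 16384 bytes: %.3f ops/ms%n", name, opsPerMs(name, 16384, 2000));
        }
    }
}
```

Running it once with default flags and once with the relevant intrinsics switched off (for example `-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA256Intrinsics`) should show the same direction of change as the table, although the absolute numbers will not match a JMH measurement.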
Hao Sun has updated the pull request incrementally with one additional commit since the last revision:
Improve the comment
1) As suggested by aph, state the performance result of
UseSHA3Intrinsics on existing hardware with SHA3 feature.
2) As suggested by gerard-ziemski, use the Apple developer documentation and
the XNU kernel code base as references, and improve the ASIMD and LSE feature
detection (i.e. checking both parameters of the sysctlbyname function).
Besides, we also add assertions for AES/PMULL/SHA1/SHA2 features.
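
The detection rule described in this thread can be modeled in a few lines. The actual patch is C++ inside HotSpot and queries the kernel directly; the Java sketch below only mirrors the decision logic, with a `Predicate<String>` standing in for the sysctl lookup. The class name `ShaFeatureSketch` is hypothetical, and the four `hw.optional` key names are my assumption about which keys are consulted (one newer and one older spelling per feature), not something stated in the message.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.Predicate;

public class ShaFeatureSketch {
    enum Feature { SHA1, SHA2, SHA512, SHA3 }

    // 'hwOptional' is a stand-in for a sysctl query: it returns true when the
    // named hw.optional key exists and is non-zero on the running machine.
    static Set<Feature> detect(Predicate<String> hwOptional) {
        // SHA1 and SHA2 are set unconditionally on darwin, as the description states.
        EnumSet<Feature> features = EnumSet.of(Feature.SHA1, Feature.SHA2);

        // Assumed key names: accept either spelling for each optional feature.
        if (hwOptional.test("hw.optional.arm.FEAT_SHA512")
                || hwOptional.test("hw.optional.armv8_2_sha512")) {
            features.add(Feature.SHA512);
        }
        if (hwOptional.test("hw.optional.arm.FEAT_SHA3")
                || hwOptional.test("hw.optional.armv8_2_sha3")) {
            features.add(Feature.SHA3);
        }
        return features;
    }

    public static void main(String[] args) {
        // Simulated machine that only reports the older SHA512 key.
        Set<String> reported = Set.of("hw.optional.armv8_2_sha512");
        System.out.println(detect(reported::contains));
    }
}
```

Passing the lookup in as a predicate keeps the sketch runnable on any platform; on a real Mac the predicate would wrap a native query instead of a fixed set.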
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/11382/files
- new: https://git.openjdk.org/jdk/pull/11382/files/2d22a67c..727d2bef
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=11382&range=03
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=11382&range=02-03
Stats: 31 lines in 2 files changed: 20 ins; 5 del; 6 mod
Patch: https://git.openjdk.org/jdk/pull/11382.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/11382/head:pull/11382
PR: https://git.openjdk.org/jdk/pull/11382
More information about the hotspot-runtime-dev mailing list