RFR: 8297092: [macos_aarch64] Add support for SHA feature detection [v3]
Andrew Haley
aph at openjdk.org
Wed Jan 11 09:59:21 UTC 2023
On Tue, 10 Jan 2023 22:02:50 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:
>> Hao Sun has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>>
>> - Update the copyright year to 2023
>> - Merge branch 'master' into jdk-8297092
>> - Enable UseSHA3Intrinsics by default on Apple silicon
>>
>> Here is the test result on M1.
>>
>> ```
>> $ ./build-fastdebug/images/jdk/bin/java -XX:+PrintFlagsFinal --version | grep SHA
>> bool UseSHA = true {product} {default}
>> bool UseSHA1Intrinsics = true {diagnostic} {default}
>> bool UseSHA256Intrinsics = true {diagnostic} {default}
>> bool UseSHA3Intrinsics = true {diagnostic} {default}
>> bool UseSHA512Intrinsics = true {diagnostic} {default}
>> ```
>> - 8297092: [macos_aarch64] Add support for SHA feature detection
>>
>> The detection of SHA features, including SHA1, SHA2, SHA512 and SHA3, is
>> missing on macOS on Apple silicon. This patch adds the support.
>>
>> * SHA1 and SHA2 are set unconditionally to true on darwin. See [1].
>>
>> * SHA512 and SHA3 will be set depending on the existing `hw.optional`
>> flags.
>>
>> * Some old links are updated in this patch as well.
>>
>> With the detection in place, the corresponding SHA-related intrinsic
>> flags are enabled. See the test result in my local environment, i.e. M1.
>>
>> ```
>> $ ./jdk/bin/java -XX:+PrintFlagsFinal --version | grep SHA
>> bool UseSHA = true {product} {default}
>> bool UseSHA1Intrinsics = true {diagnostic} {default}
>> bool UseSHA256Intrinsics = true {diagnostic} {default}
>> bool UseSHA3Intrinsics = false {diagnostic} {default}
>> bool UseSHA512Intrinsics = true {diagnostic} {default}
>>
>> $ ./jdk/bin/java -XX:+PrintFlagsFinal -XX:+UseSHA3Intrinsics --version | grep SHA3
>> bool UseSHA3Intrinsics = true {diagnostic} {command line}
>> ```
>>
>> Note that UseSHA3Intrinsics is NOT enabled by default on supported
>> hardware because there is a performance regression on some hardware. See
>> JDK-8292894. Hence we have to enable it explicitly.
>>
>> Jtreg testing:
>>
>> We ran tier1~3 on M1 and found the following two new failures. They can
>> be fixed once JDK-8292894 is resolved.
>>
>> ```
>> compiler/intrinsics/sha/cli/TestUseSHA3IntrinsicsOptionOnSupportedCPU.java
>> ```
>>
>> Performance testing:
>>
>> The JMH test case MessageDigests.java is used to evaluate the performance
>> uplift from UseSHA*Intrinsics. Here is the data on M1.
>>
>> ```
>> Benchmark                    (digesterName)  (length)   Mode  Cnt    Before      After  Units
>> MessageDigests.digest                 SHA-1        64  thrpt   15  2670.195  14767.116  ops/ms
>> MessageDigests.digest                 SHA-1     16384  thrpt   15    21.756    147.816  ops/ms
>> MessageDigests.digest               SHA-224        64  thrpt   15  1729.686  17366.833  ops/ms
>> MessageDigests.digest               SHA-224     16384  thrpt   15    14.044    149.707  ops/ms
>> MessageDigests.digest               SHA-256        64  thrpt   15  1729.529  16915.554  ops/ms
>> MessageDigests.digest               SHA-256     16384  thrpt   15    14.040    149.720  ops/ms
>> MessageDigests.digest               SHA-384        64  thrpt   15  2551.285   8996.372  ops/ms
>> MessageDigests.digest               SHA-384     16384  thrpt   15    21.338     86.996  ops/ms
>> MessageDigests.digest               SHA-512        64  thrpt   15  2528.185   8886.926  ops/ms
>> MessageDigests.digest               SHA-512     16384  thrpt   15    21.298     86.840  ops/ms
>> MessageDigests.digest              SHA3-256        64  thrpt   15  3595.392   5628.810  ops/ms
>> MessageDigests.digest              SHA3-256     16384  thrpt   15    29.208     52.917  ops/ms
>> MessageDigests.digest              SHA3-512        64  thrpt   15  3604.652   5563.199  ops/ms
>> MessageDigests.digest              SHA3-512     16384  thrpt   15    16.545     28.182  ops/ms
>> MessageDigests.getAndDigest           SHA-1        64  thrpt   15  2467.470  11878.581  ops/ms
>> MessageDigests.getAndDigest           SHA-1     16384  thrpt   15    21.179    146.849  ops/ms
>> MessageDigests.getAndDigest         SHA-224        64  thrpt   15  1635.583   5864.632  ops/ms
>> MessageDigests.getAndDigest         SHA-224     16384  thrpt   15    14.023    148.556  ops/ms
>> MessageDigests.getAndDigest         SHA-256        64  thrpt   15  1633.913   5741.416  ops/ms
>> MessageDigests.getAndDigest         SHA-256     16384  thrpt   15    14.029    148.518  ops/ms
>> MessageDigests.getAndDigest         SHA-384        64  thrpt   15  2373.041   7634.589  ops/ms
>> MessageDigests.getAndDigest         SHA-384     16384  thrpt   15    21.335     86.452  ops/ms
>> MessageDigests.getAndDigest         SHA-512        64  thrpt   15  2366.305   7781.053  ops/ms
>> MessageDigests.getAndDigest         SHA-512     16384  thrpt   15    21.324     86.352  ops/ms
>> MessageDigests.getAndDigest        SHA3-256        64  thrpt   15  3154.196   4770.368  ops/ms
>> MessageDigests.getAndDigest        SHA3-256     16384  thrpt   15    29.028     52.621  ops/ms
>> MessageDigests.getAndDigest        SHA3-512        64  thrpt   15  3345.374   4828.986  ops/ms
>> MessageDigests.getAndDigest        SHA3-512     16384  thrpt   15    16.443     28.063  ops/ms
>> ```
>>
>> [1] https://github.com/golang/go/blob/master/src/internal/cpu/cpu_arm64_darwin.go#L14
>
> src/hotspot/os_cpu/bsd_aarch64/vm_version_bsd_aarch64.cpp line 75:
>
>> 73: _features |= CPU_CRC32;
>> 74: }
>> 75: if (cpu_has("hw.optional.armv8_1_atomics")) {
>
> Why aren't we using `hw.optional.arm.FEAT_LSE` instead of `hw.optional.armv8_1_atomics` as per https://developer.apple.com/documentation/kernel/1387446-sysctlbyname/determining_instruction_set_characteristics ?
This is Apple AArch64. Is there any point to testing for FEAT_LSE?
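
For readers following the thread, here is a minimal sketch of the kind of `hw.optional` probing being discussed. It is not the code in the patch. The feature bits, `detect_features`, and the `Probe` type are hypothetical names for illustration, and the SHA512/SHA3 key names (`hw.optional.armv8_2_sha512`, `hw.optional.armv8_2_sha3`) are assumed rather than taken from this thread. The probe is passed in as a parameter so the logic can be exercised on any platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical feature bits for this sketch; not HotSpot's actual values.
enum : uint32_t {
  F_SHA1   = 1u << 0,
  F_SHA2   = 1u << 1,
  F_SHA512 = 1u << 2,
  F_SHA3   = 1u << 3,
  F_LSE    = 1u << 4
};

// True if the named sysctl exists and reports a non-zero value.
// On non-Apple platforms this sketch simply reports "absent".
inline bool cpu_has(const char* name) {
#ifdef __APPLE__
  uint32_t val = 0;
  size_t len = sizeof(val);
  if (sysctlbyname(name, &val, &len, nullptr, 0) != 0) return false;
  return val != 0;
#else
  (void)name;
  return false;
#endif
}

using Probe = std::function<bool(const char*)>;

inline uint32_t detect_features(const Probe& has) {
  // SHA1 and SHA2 are set unconditionally, as the PR description states.
  uint32_t f = F_SHA1 | F_SHA2;
  // Try the newer FEAT_* key first, then fall back to the legacy key.
  if (has("hw.optional.arm.FEAT_LSE") || has("hw.optional.armv8_1_atomics")) {
    f |= F_LSE;
  }
  // Assumed key names for the optional SHA extensions.
  if (has("hw.optional.armv8_2_sha512")) f |= F_SHA512;
  if (has("hw.optional.armv8_2_sha3"))   f |= F_SHA3;
  return f;
}

// Small self-check with a fake probe that only knows the legacy LSE key.
inline void sanity_check() {
  Probe legacy = [](const char* n) {
    return std::strcmp(n, "hw.optional.armv8_1_atomics") == 0;
  };
  assert((detect_features(legacy) & F_LSE) != 0);
}
```

On a real machine the call would be `uint32_t features = detect_features(cpu_has);`. If LSE is always present on Apple AArch64, the probe for it could be dropped and the bit set unconditionally, the same way SHA1 and SHA2 are handled.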
-------------
PR: https://git.openjdk.org/jdk/pull/11382