RFR: 8295698: AArch64: test/jdk/sun/security/ec/ed/EdDSATest.java failed with -XX:+UseSHA3Intrinsics

Yudi Zheng yzheng at openjdk.org
Thu Jul 20 07:06:56 UTC 2023


On Wed, 2 Nov 2022 03:06:21 GMT, Dong Bo <dongbo at openjdk.org> wrote:

> In JDK-8252204, when implemented SHA3 intrinsics, we use `digest_length` to differentiate SHA3-224, SHA3-256, SHA3-384, SHA3-512 and calculate `block_size` with `block_size = 200 - 2 * digest_length`.
> However, there are two extra SHA3 instances, SHAKE256 and SHAKE128, allowing an arbitrary `digest_length`:
> 
> 	digest_length	block_size
> SHA3-224	28	144
> SHA3-256	32	136
> SHA3-384	48	104
> SHA3-512	64	72
> SHAKE128	variable	168
> SHAKE256	variable	136
> 
> 
> This causes SIGSEGV crash or hash code mismatch with `test/jdk/sun/security/ec/ed/EdDSATest.java`. The test calls `SHAKE256` in `Ed448`.
> 
> The main idea of the patch is to pass the `block_size` to differentiate SHA3 instances.
> Tests `test/jdk/sun/security/ec/ed/EdDSATest.java` and `./test/jdk/sun/security/provider/MessageDigest/SHA3.java` both passed.
> And tier1~3 passed on SHA3 supported hardware.
> 
> The SHA3 intrinsics still deliver 20%~40% performance improvement on our pre-silicon simulated platform.
> The latency and throughput of crypto SHA3 ops are designed to be 1 cpu cycle and 2 execution pipes respectively.
> 
> Compared with the main stream code, the performance change with this patch are negligible on real hardware and simulation platform.
> Based on the JMH results of SHA3 intirinsics, performance can be improved by ~50% on some hardware, while some hardware have ~30% regression.
> These performance details are available in the comments of the issue page.
> I guess the performance benefit of SHA3 intrinsics is dependent on the micro architecture, it should be switched on/off based on the running platform.

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 3910:

> 3908: 
> 3909:     // block_size == 136, bit4 == 0 and bit5 == 0, SHA3-256 or SHAKE256
> 3910:     __ andw(c_rarg5, block_size, 48);

does `c_rarg5` serve as a temporary register here?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10939#discussion_r1269021039


More information about the hotspot-compiler-dev mailing list