RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v2]

Antonios Printezis tonyp at openjdk.org
Mon Dec 18 16:43:46 UTC 2023


On Mon, 18 Dec 2023 14:04:52 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hi,
>> Can you review this patch to implement SHA-1 intrinsic for riscv?
>> Thanks!
>> 
>> 
>> ## Test
>> 
>> ### Functionality
>> 
>> tests under `test/hotspot/jtreg/compiler/intrinsics/sha`
>> tests found via `find test/jdk -iname "*SHA1*.java"`
>> 
>> ### Performance
>> 
>> tested on `T-HEAD Light Lichee Pi 4A`
>> 
>> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`.
>> 
>> **when intrinsic is enabled**
>> 
>> o.o.b.java.security.GetMessageDigest.cloneInstance                    N/A         N/A           SHA-1       N/A         N/A  avgt   10        489.860 ?       6.277  ns/op
>> o.o.b.java.security.GetMessageDigest.getInstance                      N/A         N/A           SHA-1       N/A         N/A  avgt   10       3477.197 ?     204.203  ns/op
>> o.o.b.java.security.GetMessageDigest.getInstanceWithProvider          N/A         N/A           SHA-1       N/A         N/A  avgt   10       4111.164 ?     108.861  ns/op
>> o.o.b.java.security.MessageDigests.digest                             N/A         N/A           SHA-1        64     DEFAULT  avgt   10       3454.207 ?      53.924  ns/op
>> o.o.b.java.security.MessageDigests.digest                             N/A         N/A           SHA-1     16384     DEFAULT  avgt   10     184063.834 ?     677.635  ns/op
>> o.o.b.java.security.MessageDigests.getAndDigest                       N/A         N/A           SHA-1        64     DEFAULT  avgt   10       8260.011 ?     150.045  ns/op
>> o.o.b.java.security.MessageDigests.getAndDigest                       N/A         N/A           SHA-1     16384     DEFAULT  avgt   10     191325.246 ?    3298.882  ns/op
>> o.o.b.javax.crypto.full.MacBench.mac                             HmacSHA1         128             N/A       N/A              avgt   10       8220.886 ?      53.684  ns/op
>> o.o.b.javax.crypto.full.MacBench.mac                             HmacSHA1        1024             N/A       N/A              avgt   10      18006.955 ?      92.432  ns/op
>> o.o.b.javax.crypto.small.MessageDigestBench.digest                   SHA1     1048576             N/A       N/A              avgt   10   11688843.558 ?   34924.678  ns/op
>> 
>> 
>> **when intrinsic is disabled**
>> 
>> o.o.b.java.security.GetMessageDigest.cloneInstance                    N/A         N/A           SHA-1       N/A         N/A  avgt   10        496.890 ?       6.695  ns/op
>> o.o.b.java.security.GetMessageDigest.getInstance           ...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add some comments

I'll finish the review tomorrow. But posting the comments I have so far.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4442:

> 4440:   //    M't,                                      0 <=  t <= 15
> 4441:   //    ROTL'1(W't-3 ^ W't-8 ^ W't-14 ^ W't-16),  16 <= t <= 79
> 4442:   void sha1_prepare_w(int round, Register cur_w, Register ws[], Register buf, Register tmp) {

I'd just use `t0` directly instead of passing the temp register as an arg.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4448:

> 4446: 
> 4447:       if (round%2 == 0) {
> 4448:         __ ld(ws[round/2], Address(buf, 0));

Instead of incrementing `buf` 8 times, could you just increment the offset (0, 8, 16, etc.) and only increment `buf` once per loop iteration?

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4585:

> 4583:   }
> 4584: 
> 4585:   void sha1_reserve_prev_abcde(Register a, Register b, Register c, Register d, Register e,

I think it's safe to just use t0 and t1 for intermediate results without passing them as args. I used to pass them as args too, but I changed that for md5.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4592:

> 4590: 
> 4591:     __ slli(tmp1, b, 32);
> 4592:     __ andi(prev_ab, a, mask32, tmp2);

I think this will materialize `mask32` in `tmp2` twice, once per `andi`, given that the value won't work as an intermediate. I'd do `__ mv(tmp2, mask32)` and use `__ andr(prev_ab, a, tmp2)` and `__ andr(prev_cd, c, tmp2)`. I think it will save 2-3 instructions here. No idea how performance-critical this section is, though! I assume not much?

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4627:

> 4625: 
> 4626:     // c_rarg0 - c_rarg3: x10 - x13
> 4627:     Register buf    = c_rarg0;

You could copy the four arguments to a different set of registers and use a0 -> a3 for some of the other values to see if you can increase the number of compressed instructions that can be used. Unclear whether it's worth it or not.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4664:

> 4662: 
> 4663:     RegSet saved_regs = RegSet::range(x18, x27);
> 4664:     saved_regs += RegSet::of(t2);

Do you need to save t2?

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4673:

> 4671:     __ srli(d, c, 32);
> 4672:     __ lw(e, Address(state, 16));
> 4673: 

(nit) extra whitespace

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4713:

> 4711:     __ sd(c, Address(state, 8));
> 4712:     __ sw(e, Address(state, 16));
> 4713: 

(nit) extra whitespace

-------------

PR Review: https://git.openjdk.org/jdk/pull/17130#pullrequestreview-1786898719
PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430181504
PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430361118
PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430162672
PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430175000
PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430156869
PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430146051
PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430147125
PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430148290


More information about the hotspot-dev mailing list