RFR: 8319716: RISC-V: Add SHA-2
Robbin Ehn
rehn at openjdk.org
Mon Nov 20 15:33:13 UTC 2023
On Wed, 15 Nov 2023 07:41:34 GMT, Fei Yang <fyang at openjdk.org> wrote:
>> Hi, please consider.
>>
>> Main author is @luhenry, I only fixed some minor things and tested it.
>>
>> Such as:
>> test/hotspot/jtreg/compiler/intrinsics/sha/
>> test/jdk/java/security/MessageDigest/
>> test/jdk/jdk/security/
>> tier1
>>
>> And still running some tests.
>
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3926:
>
>> 3924: //--------------------------------------------------------------------------------
>> 3925: // Quad-round 1 (+1, v11->v12->v13->v10)
>> 3926: __ vl1re32_v(v15, consts);
>
> I am still worried about the vector load latency if we do one `vl1re32_v` to get the consts for each round even for single pass. Preloading the constants into vectors is less likely to have this issue, right? We should have enough vector registers for that purpose.
Depending on the hardware pipeline depth, this load can actually be executed right after
"__ vadd_vv(v14, v15, v10);", so the load may already have retired by the time round 1 is reached.
As for preloading: depending on the number of vector load ports, the preloading itself can be very costly, since the loads can't all be executed out-of-order in parallel.
So hiding the load in the previous round can be faster. My quick conclusion, without numbers, was therefore that at least for a single pass, no preloading *should* be better on bigger hardware.
I guess I need to get those numbers :)
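Until real measurements exist, the trade-off can be roughed out with a toy cycle model. The sketch below is not taken from the PR: the `Core` struct, its fields and both cost functions are hypothetical. It assumes the simplest possible behaviour, namely that preloaded constants must all arrive before the first round starts, while an interleaved load is issued at the start of the previous round and only stalls for the part of its latency that one round of arithmetic does not cover.

```cpp
#include <algorithm>

// Hypothetical toy model, not a measurement of any real RISC-V core.
struct Core {
  int load_latency;  // cycles from issuing a vector load until its data is usable
  int load_ports;    // vector loads that can be issued per cycle
  int round_cycles;  // cycles of arithmetic in one quad-round
};

// Preloading: issue one constant load per round up front, limited by the
// number of load ports, wait for the last one, then run all rounds.
// Assumes rounds >= 1 and load_ports >= 1.
int preload_cycles(const Core& c, int rounds) {
  int issue = (rounds + c.load_ports - 1) / c.load_ports;  // ceil(rounds / ports)
  return issue + c.load_latency + rounds * c.round_cycles;
}

// Interleaving: the first load is fully exposed; each later load is issued
// at the start of the previous round, so only latency beyond one round stalls.
int interleaved_cycles(const Core& c, int rounds) {
  int stall = std::max(0, c.load_latency - c.round_cycles);
  return c.load_latency + rounds * c.round_cycles + (rounds - 1) * stall;
}
```

With made-up values of 16 quad-rounds, one load port, a 4-cycle load latency and 8-cycle rounds, interleaving comes out ahead in this model (132 vs 148 cycles). With a 20-cycle latency and two ports, preloading wins (156 vs 328). Real out-of-order hardware will behave differently, so this only indicates which parameters matter.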
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1399372865