RFR: 8333867: SHA3 performance can be improved

Tue Jun 11 08:06:14 UTC 2024

On Mon, 10 Jun 2024 21:09:16 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> src/java.base/share/classes/sun/security/provider/SHA3.java line 100:
>> 
>>> 98:         b2lLittle(b, ofs, longBuf, 0, blockSize);
>>> 99:         for (int i = 0; i < blockSize / 8; i++) {
>>> 100:             state[i] ^= longBuf[i];
>> 
>> Clever. So the intrinsic (C2 code) still generates code corresponding original loop with `byte b[]` array. This will be confusing. It will also slowdown execution in Interpreter so - additional array copy.
>> 
>> New code also assumes that `buffer.length == blockSize` and `(buffer.length % 8) == 0`. I hope there is some assertions/checks in java code to verify that.
>> 
>> Some one from core-libs have to review this.
>
> Well, the intrinsic function treats the input and state as long arrays anyways, and so it only works on little endian architectures, where the conversion is a no-op. There is no additional array copy, this b2lLittle() call used to be in the keccak() method (along with the conversion back to byte array), the point of this whole change is that only one of these conversions should be done with every keccak() call (an additional benefit is that the xor and the corresponding loads+store is done on longs, not on bytes).

Oh, and about the length: buffer is allocated in the constructor of the parent class (DigestBase) like this:
  buffer = new byte[blockSize];
Here blockSize is one of { 72, 104, 136, 144, 168 }, so divisible by 8.
buffer.length was used before probably because blockSize was declared private in DigestBase. I made it protected, because in my opinion it is easier to read the code this way.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19632#discussion_r1634390741