RFR: 8318756 Create better internal buffer for AEADs
Daniel Jeliński
djelinski at openjdk.org
Thu Nov 23 12:27:21 UTC 2023
On Fri, 3 Nov 2023 04:08:27 GMT, Anthony Scarpino <ascarpino at openjdk.org> wrote:
> Hi,
>
> I need a review for a new internal buffer class called AEADBufferStream. AEADBufferStream extends ByteArrayOutputStream, but eliminates some data checking and copying that are not necessary for what GaloisCounterMode.java and ChaCha20Cipher.java need.
>
> The changes greatest benefit is with decryption operations. ChaCha20-Poly1305 had larger performance gains by adopting similar techniques that AES/GCM already uses.
>
> The new buffer shows up to 21% bytes/sec performance increase for decryption for ChaCha20-Poly1305 and 12% for AES/GCM. 16K data sizes saw a memory usage reduction of 46% with and 83% with ChaCha20-Poly1305. These results come from the JMH tests updated in this request and memory usage using the JMH gc profile gc.alloc.rate.norm entry
>
> thanks
>
> Tony
Thanks for doing this! A few comments in line.
src/java.base/share/classes/com/sun/crypto/provider/AEADBufferedStream.java line 45:
> 43: * Create an instance with no buffer
> 44: */
> 45: public AEADBufferedStream() {
Never used. And once you remove it, `buf` is never null
src/java.base/share/classes/com/sun/crypto/provider/AEADBufferedStream.java line 55:
> 53:
> 54: public AEADBufferedStream(int len) {
> 55: buf = new byte[len];
Suggestion:
super(len);
otherwise buf will be initialized twice, once here, once in the base class constructor
src/java.base/share/classes/com/sun/crypto/provider/AEADBufferedStream.java line 56:
> 54: public AEADBufferedStream(int len) {
> 55: buf = new byte[len];
> 56: count = 0;
no need to zero-initialize, it's zero by default
src/java.base/share/classes/com/sun/crypto/provider/AEADBufferedStream.java line 67:
> 65: */
> 66: @Override
> 67: public synchronized byte[] toByteArray() {
remove this `synchronized`; the new `write` methods are not synchronized, so this does not add value.
src/java.base/share/classes/com/sun/crypto/provider/AEADBufferedStream.java line 68:
> 66: @Override
> 67: public synchronized byte[] toByteArray() {
> 68: if (buf.length > count) {
Can this ever happen? And if it can, would it be better to return the whole buffer and have the caller extract the necessary range?
src/java.base/share/classes/com/sun/crypto/provider/AEADBufferedStream.java line 94:
> 92: } else {
> 93: if (buf.length < (count + len)) {
> 94: buf = Arrays.copyOf(buf, count + len);
this will copy a lot if the user performs many small writes, for example when decrypting with CipherInputStream; see `AESGCMCipherOutputStream` benchmark.
src/java.base/share/classes/com/sun/crypto/provider/ChaCha20Cipher.java line 699:
> 697: * @throws ShortBufferException if the buffer {@code out} does not have
> 698: * enough space to hold the resulting data.
> 699: */ @Override
Suggestion:
*/
@Override
src/java.base/share/classes/com/sun/crypto/provider/ChaCha20Cipher.java line 792:
> 790:
> 791: /*
> 792: * Optimized version of bufferCrypt from CipherSpi.java. Direct
Can you document the optimizations done in this function?
src/java.base/share/classes/com/sun/crypto/provider/ChaCha20Cipher.java line 844:
> 842: total = engine.doFinal(inArray, inOfs, inLen, outArray, outOfs);
> 843: }
> 844: } catch (BadPaddingException | KeyException e) {
Preexisting, and probably out of scope for this PR, but we should expose the "Counter exhausted" exception in javax.crypto.Cipher
src/java.base/share/classes/com/sun/crypto/provider/ChaCha20Cipher.java line 1427:
> 1425: input.get(in);
> 1426: byte[] out = new byte[in.length];
> 1427: doUpdate(in, 0, in.length, out, out.length);
Suggestion:
byte[] out = in;
doUpdate(in, 0, in.length, out, 0);
I guess we need more test coverage here
src/java.base/share/classes/com/sun/crypto/provider/ChaCha20Cipher.java line 1505:
> 1503: byte[] in = new byte[input.remaining()];
> 1504: input.get(in);
> 1505: byte[] out = new byte[in.length];
Suggestion:
byte[] out = in;
src/java.base/share/classes/com/sun/crypto/provider/ChaCha20Cipher.java line 1516:
> 1514: input.get(in);
> 1515: byte[] out = new byte[in.length + TAG_LENGTH];
> 1516: doFinal(in, 0, in.length, out, 0);
A bit more complex, but you can avoid one allocation here too; something like (untested):
Suggestion:
int inLen = input.remaining();
byte[] out = new byte[inLen + TAG_LENGTH];
byte[] in = out;
input.get(in, 0, inLen);
doFinal(in, 0, inLen, out, 0);
src/java.base/share/classes/com/sun/crypto/provider/ChaCha20Cipher.java line 1652:
> 1650: int ctLen = getBufferedLength() + inLen;
> 1651: if (ctLen == 0) {
> 1652: throw new AEADBadTagException("Tag mismatch");
Suggestion:
if (ctLen < TAG_LENGTH) {
throw new AEADBadTagException("Input too short - need tag");
test/micro/org/openjdk/bench/javax/crypto/full/AESGCMBench.java line 43:
> 41: AlgorithmParameterSpec getNewSpec() {
> 42: iv_index = (iv_index + 1) % IV_MODULO;
> 43: return new GCMParameterSpec(96, iv, iv_index, 12);
Can you also change tag length to 128 bits? TLS uses 128, and 128-bit tag generates a slightly different flamegraph.
test/micro/org/openjdk/bench/javax/crypto/full/AESGCMByteBuffer.java line 44:
> 42: AlgorithmParameterSpec getNewSpec() {
> 43: iv_index = (iv_index + 1) % IV_MODULO;
> 44: return new GCMParameterSpec(96, iv, iv_index, 12);
Also use 128-bit tag here.
test/micro/org/openjdk/bench/javax/crypto/full/BenchBase.java line 122:
> 120: public void decrypt() throws Exception {
> 121: decryptCipher.init(Cipher.DECRYPT_MODE, ks,
> 122: encryptCipher.getParameters().
Consider using a different method to obtain parameter spec here; the flamegraph suggests that `getParameters` is responsible for up to 20% of the time spent in these benchmarks
test/micro/org/openjdk/bench/javax/crypto/small/AESGCMBench.java line 48:
> 46: AlgorithmParameterSpec getNewSpec() {
> 47: iv_index = (iv_index + 1) % IV_MODULO;
> 48: return new GCMParameterSpec(96, iv, iv_index, 12);
Also use 128-bit tag here.
test/micro/org/openjdk/bench/javax/crypto/small/AESGCMByteBuffer.java line 48:
> 46: AlgorithmParameterSpec getNewSpec() {
> 47: iv_index = (iv_index + 1) % IV_MODULO;
> 48: return new GCMParameterSpec(96, iv, iv_index, 12);
Also use 128-bit tag here.
-------------
Changes requested by djelinski (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16487#pullrequestreview-1746152716
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403191252
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403308026
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403191602
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403193417
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403194450
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403196426
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403196854
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403224371
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403205814
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403238862
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403239235
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403244203
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403252213
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403271442
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403271853
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403303743
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403274634
PR Review Comment: https://git.openjdk.org/jdk/pull/16487#discussion_r1403274860
More information about the security-dev
mailing list