RFR: 8296507: GCM using more memory than necessary with in-place operations

Wed Nov 16 16:36:56 UTC 2022

On Sun, 13 Nov 2022 02:54:10 GMT, Anthony Scarpino <ascarpino at openjdk.org> wrote:

> I would like a review of an update to the GCM code.  A recent report showed that GCM memory usage for TLS was very large.  This was a result of in-place buffers, which TLS uses, and how the code handled the combined intrinsic method during decryption.  A temporary buffer was used because the combined intrinsic does gctr before ghash which results in a bad tag.  The fix is to not use the combined intrinsic during in-place decryption and depend on the individual GHASH and CounterMode intrinsics.  Direct ByteBuffers are not affected as they are not used by the intrinsics directly.
> 
> The reduction in memory usage brought performance back to where it was before, despite using the slower intrinsics (gctr & ghash individually).  The extra memory allocation for the temporary buffer outweighed the benefit of the faster combined intrinsic.
> 
> 
>     JDK 17:   122913.554 ops/sec
>     JDK 19:    94885.008 ops/sec
>     Post fix: 122735.804 ops/sec 
> 
> There is no regression test because this is a memory change and test coverage already exists.

Thanks for looking into this, @ascarpino!

In testing this with a local build, it improves performance in cases using heap buffers (a superset of the socket case). However, servers which use direct ByteBuffers still exhibit a similar performance regression (heavy allocation compared to JDK 17, ~10% slower TLS performance in HTTP+TLS benchmarks). It's possible that this has a different root cause, but the outcome is strikingly similar.
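
To make the two cases concrete, here is a minimal sketch of in-place AES/GCM decryption, where the input and output are views over the same buffer, once with a heap ByteBuffer and once with a direct one. The class and method names (InPlaceGcmSketch, decryptInPlace, roundTrip) are hypothetical and exist only for illustration. This is not the benchmark behind the numbers above; it just exercises both paths so that allocation can be compared under a profiler.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration class; not part of the JDK or of the PR.
class InPlaceGcmSketch {
    static final int TAG_BITS = 128;

    // Decrypts ciphertext (including the trailing tag) in place: input and
    // output are two views over the same heap or direct buffer.
    static byte[] decryptInPlace(boolean direct, byte[] key, byte[] iv,
                                 byte[] ciphertext) throws GeneralSecurityException {
        ByteBuffer buf = direct
                ? ByteBuffer.allocateDirect(ciphertext.length)
                : ByteBuffer.allocate(ciphertext.length);
        buf.put(ciphertext).flip();

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, iv));

        // Cipher rejects the identical ByteBuffer object as both input and
        // output, so a duplicate sharing the same memory serves as the input.
        ByteBuffer in = buf.duplicate();
        cipher.doFinal(in, buf);

        // buf's position now marks the end of the plaintext.
        buf.flip();
        byte[] plain = new byte[buf.remaining()];
        buf.get(plain);
        return plain;
    }

    // Encrypts with the byte[] API, decrypts in place, and compares.
    static boolean roundTrip(boolean direct) {
        try {
            SecureRandom random = new SecureRandom();
            byte[] key = new byte[16];
            byte[] iv = new byte[12];
            random.nextBytes(key);
            random.nextBytes(iv);
            byte[] plain = "in-place GCM sketch".getBytes(StandardCharsets.UTF_8);

            Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
            enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = enc.doFinal(plain);

            return Arrays.equals(plain, decryptInPlace(direct, key, iv, ciphertext));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("heap:   " + roundTrip(false));
        System.out.println("direct: " + roundTrip(true));
    }
}
```

Running decryptInPlace in a loop with larger records under an allocation profiler, once per buffer type, should show whether the heap and direct paths allocate differently on a given build.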

-------------

PR: https://git.openjdk.org/jdk/pull/11121