com.sun.crypto.provider.GHASH performance fix

Tim Whittington jdk-security-dev at whittington.net.nz
Thu Nov 20 22:37:53 UTC 2014


On 19/08/2014, at 12:32 am, Florian Weimer <fweimer at redhat.com> wrote:

> This change addresses a severe performance regression, first introduced in JDK 8, triggered by the negotiation of a GCM cipher suite in the TLS implementation.  The regression is caused by the slow implementation of the GHASH function.
> 
> I first tried to eliminate just the allocations in blockMult while still retaining the byte arrays.  This did not substantially increase performance in my micro-benchmark.  I then replaced the 16-byte arrays with longs, replaced the inner loops with direct bit fiddling on the longs, eliminated data-dependent conditionals (which are generally frowned upon in cryptographic algorithms due to the risk of timing attacks), and split the main loop in two, one for each half of the hash state.  This is the result:
> 
>  <https://fweimer.fedorapeople.org/openjdk/ghash-performance/>
> 
> The new code is roughly ten times faster.  My test download over HTTPS is no longer CPU-bound, and GHASH hardly shows up in profiles anymore. (That's why I didn't consider further changes, lookup tables in particular.)  Micro-benchmarking shows roughly a ten-fold increase in throughput, but this probably underestimates the real gain because of the high allocation rate of the old code.
> 
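For readers following the thread, the approach described in the quoted text (two longs instead of a 16-byte array, masks instead of data-dependent branches, one pass per 64-bit half) can be sketched roughly as below. This is a minimal illustration and not the code from the posted patch: the class name GhashMultSketch, the method signature, and the R_HIGH constant name are assumptions made for this example, and the multiplication follows the bitwise algorithm in NIST SP 800-38D.

```java
// Hypothetical sketch for illustration; not the code from the posted patch.
final class GhashMultSketch {
    // High 64 bits of the reduction constant R = 11100001 || 0^120 (NIST SP 800-38D).
    private static final long R_HIGH = 0xe100000000000000L;

    /**
     * Multiplies x by h in GF(2^128) using GCM's bit order.
     * Each block is two longs; index 0 holds the first 8 bytes, big-endian.
     */
    static long[] blockMult(long[] x, long[] h) {
        long z0 = 0, z1 = 0;          // accumulator Z
        long v0 = h[0], v1 = h[1];    // running multiple V of h

        // One pass per 64-bit half of x, scanning bits from the most significant end.
        for (int half = 0; half < 2; half++) {
            long bits = x[half];
            for (int i = 0; i < 64; i++) {
                // Arithmetic shift: all ones if the current bit of x is set, else zero.
                long take = bits >> 63;
                z0 ^= v0 & take;
                z1 ^= v1 & take;

                // All ones if a bit is about to be shifted out of V, else zero.
                long reduce = -(v1 & 1L);
                // V = V >> 1 across both halves, then conditional reduction via the mask.
                v1 = (v1 >>> 1) | (v0 << 63);
                v0 = (v0 >>> 1) ^ (R_HIGH & reduce);

                bits <<= 1;
            }
        }
        return new long[] { z0, z1 };
    }
}
```

As a quick sanity check, multiplying any block by { 0x8000000000000000L, 0L }, which encodes the field element 1 in GCM's bit order, returns the block unchanged. The sketch returns a fresh array for clarity; code aiming to avoid allocation would presumably write the result back into the hash state instead.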

Hi Florian

It looks like your GHASH implementation as posted isn’t passing the tests in TestGHASH.java.
The existing JDK implementation passes them, and the Bouncy Castle GHASH produces the same results as the JDK one.

Can you reproduce that?

cheers
tim


> The performance improvement on 32-bit architectures is probably a bit less, but I suspect that using four ints instead of two longs would penalize 64-bit architectures.
> 
> -- 
> Florian Weimer / Red Hat Product Security


