RFR: 8073108: GHASH Intrinsics [need second reviewer]

John Rose john.r.rose at oracle.com
Wed Jun 17 18:40:22 UTC 2015


On Jun 17, 2015, at 10:40 AM, Anthony Scarpino <anthony.scarpino at oracle.com> wrote:
> 
> On 06/15/2015 05:20 PM, John Rose wrote:
>> Thanks for taking this on.
>> 
>> It looks good, except for one thing. The intrinsic does not need to be
>> an instance method, and making it one creates an undesirable coupling
>> between the JVM and JDK. Specifically, the JVM should not need to know
>> about the subkeyH and state fields. The Java code should pass those as
>> plain (long[2] array) arguments to the intrinsic method processBlocks,
>> which should be adjusted to be static. The domain check routine should
>> be adjusted to be static also.
>> 
>> On my wish list for the future (but not now) is even less coupling
>> with the JVM. The loop code for processBlocks should be written in Java,
>> with various intrinsics (xmulx*) for dealing with single operations on
>> 128-bit values (stored in long[2] boxes and 64-bit registers).
> 
> I forget the exact numbers, but having the loop in assembly instead of Java resulted in about a 10-15% performance improvement. The tighter loop was definitely beneficial.

That's good information; please put it in the RFE.

IMO the best (overall) way to get that 10-15% back is to get the JIT to tighten the loop.
If that works, it will of course benefit all Java loops.
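
For concreteness, here is a minimal sketch of the shape I have in mind: a static processBlocks that takes state and subkeyH as plain long[2] arguments, with the loop in Java and the per-block work isolated in one helper. This is not the code under review. The class name GhashSketch, the helper name multiply, and the argument order are assumptions for illustration only, and the bit-serial multiply is a stand-in for whatever carry-less multiply intrinsics (xmulx* or similar) a platform would supply.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; names and signatures are illustrative, not the JDK's.
final class GhashSketch {
    // Reduction constant for GCM's GF(2^128): x^128 = x^7 + x^2 + x + 1,
    // in GCM's bit order (bit 0 is the most significant bit of word 0).
    private static final long R = 0xe100000000000000L;

    // Static domain check: no instance fields are consulted.
    static void checkArgs(byte[] data, int offset, int blocks,
                          long[] state, long[] subkeyH) {
        if (state.length != 2 || subkeyH.length != 2) {
            throw new IllegalArgumentException("state and subkeyH must be long[2]");
        }
        if (offset < 0 || blocks < 0 || offset > data.length
                || blocks > (data.length - offset) / 16) {
            throw new ArrayIndexOutOfBoundsException("input range out of bounds");
        }
    }

    // For each 16-byte block: state = (state XOR block) * subkeyH.
    static void processBlocks(byte[] data, int offset, int blocks,
                              long[] state, long[] subkeyH) {
        checkArgs(data, offset, blocks, state, subkeyH);
        // ByteBuffer reads big-endian by default, which matches GCM.
        ByteBuffer in = ByteBuffer.wrap(data, offset, blocks * 16);
        for (int i = 0; i < blocks; i++) {
            state[0] ^= in.getLong();
            state[1] ^= in.getLong();
            multiply(state, subkeyH);
        }
    }

    // x = x * h in GF(2^128), one bit at a time (shift-and-add with
    // conditional reduction). This is the single operation a JIT
    // intrinsic could replace.
    static void multiply(long[] x, long[] h) {
        long z0 = 0, z1 = 0;
        long v0 = h[0], v1 = h[1];
        for (int i = 0; i < 128; i++) {
            long word = (i < 64) ? x[0] : x[1];
            // All ones if bit i of x is set, else zero.
            long mask = -((word >>> (63 - (i & 63))) & 1L);
            z0 ^= v0 & mask;
            z1 ^= v1 & mask;
            // Shift v right by one; reduce if a bit fell off the end.
            long reduce = -(v1 & 1L);
            v1 = (v1 >>> 1) | (v0 << 63);
            v0 = (v0 >>> 1) ^ (R & reduce);
        }
        x[0] = z0;
        x[1] = z1;
    }

    public static void main(String[] args) {
        long[] state = new long[2];
        long[] subkeyH = {0x66e94bd4ef8a2c3bL, 0x884cfa59ca342b2eL}; // arbitrary example key
        byte[] data = new byte[32];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i + 1);
        processBlocks(data, 0, 2, state, subkeyH);
        System.out.println(Long.toHexString(state[0]) + " " + Long.toHexString(state[1]));
    }
}
```

With this shape the JVM sees only arrays and a static call; where the Java class keeps state and subkeyH is its own business.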

>> The
>> Unsafe misaligned access routines could help simplify this also, if the
>> coding were done in Java. This is not too hard to express in Java and
>> compile to excellent code. There will be a little extra awkwardness
>> working with 64x2-vectors in a way that will compile naturally to a
>> range of ALUs (both 64- and 128-bit).
> 
> I would have to look back at that again. At first I was going to use Unsafe, but it seemed more complicated coding-wise compared to the assembly I saw that was in AES and SHA already.

Yes; that makes perfect sense. Eventually I want to get to the place where, when the next coder looks at how "stuff gets done", they will see less assembly and more Java, and take the Java route.

> 
>> If we get it right we can reduce
>> the amount of assembly code in the JVM and get even more timely access
>> to new data-processing instructions. Would you please file a followup
>> bug (low pri. for now) to track this, at least for GHASH and other
>> crypto loops?
>> 
>> — John
> 
> Sure, I can file them.

Good; thanks.

— John


More information about the hotspot-dev mailing list