review request: 7184394: add intrinsics to use AES instructions
Deneau, Tom
tom.deneau at amd.com
Thu Jul 26 15:17:04 PDT 2012
I have submitted
http://cr.openjdk.java.net/~tdeneau/aes-intrinsics/webrev.03
which
* incorporates some feedback from Vladimir regarding the
global flags I was using
* corrects a misunderstanding on my part about the xmm register
saving requirements on 32-bit Windows.
-- Tom
-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Thursday, July 19, 2012 6:07 PM
To: Deneau, Tom
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: Re: review request: add intrinsics to use AES instructions
I will ask my colleagues to look at these changes.
Thanks,
Vladimir
Deneau, Tom wrote:
> The 32-bit stubs have been added, the new webrev is at
> http://cr.openjdk.java.net/~tdeneau/aes-intrinsics/webrev.02
>
> The stubGenerator files were basically the only changes.
>
> While adding the 32-bit stubs, I noticed that the 64-bit stubs
> could be cleaned up quite a bit, so I did that as well: I used
> symbolic names rather than raw names for registers, etc.
>
> -- Tom
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Monday, July 16, 2012 6:14 PM
> To: Deneau, Tom
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: review request: add intrinsics to use AES instructions
>
> Deneau, Tom wrote:
>> Vladimir --
>>
>> OK I see now that the stubroutines_x86_xxx are bitness-dependent.
>> And are you saying that you would prefer that the intrinsics actually
>> be supported on 32-bit, not just that it builds and runs without support on 32-bit?
>
> Yes, please add support on 32-bit (when AES is present). The stub code
> should be the same except for the incoming arguments.
>
> Vladimir
>
>> -- Tom
>>
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Monday, July 16, 2012 2:08 PM
>> To: Deneau, Tom
>> Cc: hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: review request: add intrinsics to use AES instructions
>>
>> You can't run a 32-bit VM without the 32-bit changes because the flags are
>> set but the stubs are missing. And, yes, the 32-bit VM is still used.
>>
>> Vladimir
>>
>> Deneau, Tom wrote:
>>> Vladimir --
>>>
>>> Right I didn't include 32-bit changes thinking that the majority
>>> of users of AES encryption/decryption would be 64-bit servers.
>>>
>>> But there is no technical reason why 32-bit couldn't be added.
>>> Do you feel 32-bit support is important?
>>>
>>> -- Tom
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>> Sent: Monday, July 16, 2012 12:40 PM
>>> To: Deneau, Tom
>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>> Subject: Re: review request: add intrinsics to use AES instructions
>>>
>>> Thank you, Tom
>>>
>>> I created the following RFE and will sponsor the changes. But I don't see the 32-bit changes.
>>>
>>> 7184394: add intrinsics to use AES instructions
>>>
>>> Vladimir
>>>
>>> Deneau, Tom wrote:
>>>> Please review the following webrev which adds intrinsic support to
>>>> allow some of the com/sun/crypto/provider methods to use AES
>>>> instructions when a processor supports such instructions.
>>>>
>>>> http://cr.openjdk.java.net/~tdeneau/aes-intrinsics/webrev.01/
>>>>
>>>> I do not have a bug number for this change but a description would be
>>>> something like the following:
>>>>
>>>> Modern x86 processors have AES instructions to accelerate AES
>>>> encryption and decryption but Hotspot does not have a way to
>>>> generate such instructions. There is a way to hook in a native
>>>> crypto library using PKCS11 and there are a few native libraries
>>>> that support hardware AES instructions. However, these native
>>>> PKCS11 libraries
>>>>
>>>> * do not scale well with multiple threads
>>>> * are not supported on all platforms, for instance Hotspot does
>>>> not have PKCS11 support on 64-bit Windows.
>>>> * can be confusing to configure.
>>>>
>>>> Since this webrev adds intrinsic support for the default
>>>> com/sun/crypto/provider classes, they are supported on all platforms
>>>> and there is no additional configuration required. Measurements have
>>>> shown that they scale very well with multiple threads.
>>>>
>>>> The rest of this mail describes the scope of the intrinsics and
>>>> summarizes the source file changes.
>>>>
>>>> -- Tom Deneau
>>>>
>>>> Scope of the Intrinsics
>>>> -----------------------
>>>> When creating a cipher the application specifies a "transformation"
>>>> consisting of "algorithm/mode/padding". For more details see
>>>> http://docs.oracle.com/javase/7/docs/api/javax/crypto/Cipher.html
>>>>
>>>> * These intrinsics kick in only when the algorithm part is "AES". A
>>>> single block in AES is always 16 bytes and there are intrinsics
>>>> for encrypting or decrypting a single block. These single-block
>>>> intrinsics can work with any mode that uses AES and with any of
>>>> the three AES key sizes (128, 192 or 256 bit).
>>>>
>>>> * A more optimized multi-block intrinsic can kick in if the
>>>> algorithm/mode is "AES/CBC" (Cipher Block Chaining). Again all
>>>> three AES key sizes are supported. There is no technical reason
>>>> why we couldn't do multi-block intrinsics for the other modes
>>>> (e.g., ECB) but I want to get some feedback from the reviewers on
>>>> the implementation before charging off on this path.
>>>>
>>>> * The padding part is handled by java routines outside of these
>>>> intrinsics.
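For illustration only (not part of the webrev): a minimal Java sketch of how an application selects the "AES/CBC/PKCS5Padding" transformation described above. The class name AesCbcRoundTrip, the all-zero demo key and IV, and the 40-byte input are assumptions made for the example. Whether these calls actually reach the intrinsics depends on the JVM, the CPU, and which provider serves the request.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCbcRoundTrip {
    // Runs one encrypt or decrypt pass with the transformation
    // "AES/CBC/PKCS5Padding" (algorithm/mode/padding).
    static byte[] run(int mode, byte[] key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];   // all-zero 128-bit key, demo only
        byte[] iv = new byte[16];    // all-zero IV, demo only
        byte[] plain = new byte[40]; // two full 16-byte blocks plus 8 bytes
        Arrays.fill(plain, (byte) 'x');

        byte[] cipherText = run(Cipher.ENCRYPT_MODE, key, iv, plain);
        byte[] roundTrip = run(Cipher.DECRYPT_MODE, key, iv, cipherText);

        // Padding rounds 40 bytes up to the next multiple of the block size.
        System.out.println("ciphertext length: " + cipherText.length);
        System.out.println("round trip ok: " + Arrays.equals(plain, roundTrip));
    }
}
```

The padding is applied by the Java code before the last block is encrypted, which is why the ciphertext is longer than the 40-byte input.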
>>>>
>>>> Summary of Changes
>>>> ------------------
>>>> src/cpu/x86/vm/assembler_x86.cpp, hpp
>>>> Defined the aes instructions which are used by the stub routines.
>>>>
>>>> src/cpu/x86/vm/stubGenerator_x86_64.cpp
>>>> Actual stub code for the aes intrinsics. As described earlier there
>>>> are both single-block and multi-block intrinsic stubs.
>>>>
>>>> Note that the stubs make use of the "expanded key" which gets
>>>> created each time the key changes. The expanded key is used by both
>>>> the java code and the intrinsic AES instructions.
>>>>
>>>> The java code stores the "expanded key" in big-endian 32-bit
>>>> integers. The x86 AES instructions require the expanded key to be
>>>> in little-endian 128-bit words. Hence the stubs use pshufb
>>>> instructions to get the key into the little-endian format.
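For illustration only: a small Java sketch of the kind of byte reordering involved. It assumes the shuffle amounts to reversing the four bytes inside each 32-bit word while keeping the word order; the exact shuffle mask used by the stubs is not given in this mail, and the class and method names here are hypothetical.

```java
public class KeyWordShuffle {
    // Reverses the byte order within each 32-bit word and leaves the
    // order of the words themselves unchanged. This is an assumed model
    // of the reordering, not the stub code.
    static int[] swapBytesInEachWord(int[] words) {
        int[] out = new int[words.length];
        for (int i = 0; i < words.length; i++) {
            out[i] = Integer.reverseBytes(words[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Four 32-bit words make up one 128-bit round key.
        int[] roundKey = {0x00010203, 0x04050607, 0x08090a0b, 0x0c0d0e0f};
        for (int w : swapBytesInEachWord(roundKey)) {
            System.out.printf("%08x%n", w);
        }
    }
}
```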
>>>>
>>>> src/cpu/x86/vm/vm_version_x86.cpp, hpp
>>>> Detect and store the aes capability bit in cpuid. A global boolean
>>>> command line flag UseAES can be used to turn off AES even if the
>>>> hardware supports it.
>>>>
>>>> src/share/vm/classfile/vmSymbols.hpp
>>>> src/share/vm/opto/runtime.cpp, hpp
>>>> The usual definitions of class names, method names and signatures
>>>> for the java methods that are being intrinsified and the signatures
>>>> for the stubs.
>>>>
>>>> src/share/vm/oops/methodOop.cpp
>>>> Up until now, every intrinsic was replacing a routine that was
>>>> loaded by the "default" (NULL) class loader.
>>>> com/sun/crypto/provider is not loaded by the default class
>>>> loader so we had to add a check here.
>>>>
>>>> src/share/vm/opto/escape.cpp
>>>> Escape analysis knows about certain stubs, but if it sees a leaf
>>>> stub it also checks against a predefined list. So the new intrinsic
>>>> names were added to the list.
>>>>
>>>> src/share/vm/opto/library_call.cpp
>>>> src/share/vm/opto/callGenerator.cpp
>>>> src/share/vm/opto/doCall.cpp
>>>>
>>>> The main logic for building up the calls to the stubs at compile
>>>> time, assuming the platform has a stub and the global flags have
>>>> not turned these intrinsics off.
>>>>
>>>> A new helper routine to load a field from an object was added since
>>>> we ended up loading fields in a few places.
>>>>
>>>> For best performance, we wanted to hook into the multi-block
>>>> encrypt and decrypt methods such as in CipherBlockChaining.java.
>>>> This code is not AES-specific but handles CBC mode for any
>>>> algorithm. (The algorithm part is handled by the enclosed
>>>> "embeddedCipher" object).
>>>>
>>>> Thus at runtime we want to do the equivalent of an instanceof check
>>>> on embeddedCipher and either call the stub (if it is AESCrypt) or
>>>> call the original java code (if it is some other algorithm
>>>> type). For CipherBlockChaining.decrypt there is a further
>>>> runtime check that the source and destination are not the same
>>>> array. Because of the way CBC works, decrypting in place would
>>>> require cloning the source (the ciphertext) first.
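For illustration only: a Java sketch of why CBC decryption into the same array needs a copy of the ciphertext. Each plaintext block is the decrypted ciphertext block XORed with the previous ciphertext block, so overwriting the source destroys data that the next block still needs. The toy() function is a stand-in XOR "cipher" chosen for brevity, not AES, and all names are hypothetical.

```java
import java.util.Arrays;

public class CbcInPlaceDemo {
    static final int BLOCK = 16;

    // Toy stand-in for a block cipher (its own inverse); NOT AES.
    static byte toy(byte b) {
        return (byte) (b ^ 0x5A);
    }

    // CBC encrypt: c[i] = E(p[i] XOR c[i-1]), with the IV as c[-1].
    static byte[] encrypt(byte[] plain, byte[] iv) {
        byte[] out = new byte[plain.length];
        for (int i = 0; i < plain.length; i++) {
            byte prev = (i < BLOCK) ? iv[i] : out[i - BLOCK];
            out[i] = toy((byte) (plain[i] ^ prev));
        }
        return out;
    }

    // Naive CBC decrypt: p[i] = D(c[i]) XOR c[i-1]. It reads the previous
    // ciphertext block from src, so it is wrong when src and dst alias.
    static void decryptNaive(byte[] src, byte[] dst, byte[] iv) {
        for (int i = 0; i < src.length; i++) {
            byte prev = (i < BLOCK) ? iv[i] : src[i - BLOCK];
            dst[i] = (byte) (toy(src[i]) ^ prev);
        }
    }

    // Same-array check: clone the ciphertext only when src and dst alias.
    static void decryptSafe(byte[] src, byte[] dst, byte[] iv) {
        byte[] cipherText = (src == dst) ? src.clone() : src;
        decryptNaive(cipherText, dst, iv);
    }

    public static void main(String[] args) {
        byte[] iv = new byte[BLOCK];
        byte[] plain = new byte[2 * BLOCK];
        for (int i = 0; i < plain.length; i++) plain[i] = (byte) i;
        byte[] cipherText = encrypt(plain, iv);

        byte[] a = cipherText.clone();
        decryptNaive(a, a, iv);
        System.out.println("naive in place correct: " + Arrays.equals(a, plain));

        byte[] b = cipherText.clone();
        decryptSafe(b, b, iv);
        System.out.println("safe in place correct: " + Arrays.equals(b, plain));
    }
}
```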
>>>>
>>>> Vladimir added some infrastructure to generate predicated
>>>> intrinsics to solve the above problem. A particular intrinsic need
>>>> only specify that it is predicated, and generate the particular
>>>> guard node which if false will take the Java path. This
>>>> infrastructure can be used for future intrinsics that have to make
>>>> such a runtime choice. These changes from Vladimir are in
>>>> callGenerator.cpp, doCall.cpp, and a small bit in library_call.cpp.
>>>>
>>>> src/share/vm/runtime/globals.hpp
>>>> Global flags were added to
>>>> * turn off either AES encryption or AES decryption intrinsics separately
>>>> * turn off the multi-block CBC/AES intrinsics.
>>>>
>>>> By default all of the above are on. These are really there for
>>>> testing, for example one could encrypt using Java and decrypt using
>>>> the intrinsics.
>>>>
>>>> Also, a UseAES flag was added to ignore the hardware capability, as described above.
>>>>
>>
>
>