RFR: 8073108: GHASH Intrinsics

Anthony Scarpino anthony.scarpino at oracle.com
Tue Feb 17 23:57:15 UTC 2015

Ok, I'll add these checks.. Thanks for looking through this.


On 02/17/2015 02:30 PM, Vladimir Kozlov wrote:
> The concern was about java code and not intrinsic. If intrinsic is not
> used we will get additional range checks which were not present before.
> By indirect load I mean object->array_oop->element instead of
> object->field.
> But in blockMult() you access them outside main loops. As result the
> code without intrinsic should not be affected much.
> So, please, ignore my suggestion about "keep 'subkey' and 'state'
> original values". You are right for intrinsic it is better to use arrays.
> Anyway, all is fine except you may need to check that size of arrays is
> at least 2 before calling processBlocks().
> thanks,
> Vladimir
> On 2/17/15 2:03 PM, Anthony Scarpino wrote:
>> On 02/17/2015 01:14 PM, Vladimir Kozlov wrote:
>>> Florian's concern is valid.
>>> "Range check elimination" means that C2 moves checks from a loop. Checks
>>> are still present. Since 'state' array is not final we can't eliminate
>>> range check.
>> I don't understand the concern here.  The arrays are private and are
>> only used by math calculations either in the same GHASH object or by the
>> intrinsics.  What ranges check issues are we expecting to occur given
>> they are inaccessible by an API?
>>> An other thing is an additional indirect load to access
>>> arrays elements.
>>> I would suggest to keep original code for 'subkey' and 'state' which use
>>> separate values instead of arrays.
>> By the indirect loading, you referring to the loading done by
>> get_vars_from_ghash_object() in library_call.cpp?  If so, I thought it
>> would be faster to load the 128bit values directly into one register
>> instead of loading one long from the class to a register, shift left,
>> loading the second long and xor'ing into the register.  The assembly
>> code is doing the math on the whole 128bit value, while the GHASH.java
>> side needs to split it into 2 values.  Putting it into an array makes in
>> easy, coding wise, to access the whole 128bit value at once.
>> If the suggestion also is to put the 4 longs on the argument list, that
>> would be 7 arguments for processBlocks, which would be ugly and not
>> useful for the non-intrinsic side since GHASH.java would ignore the
>> longs.
>>  >
>>> Regards,
>>> Vladimir
>>> On 2/17/15 1:00 PM, Anthony Scarpino wrote:
>>>> On 02/17/2015 12:05 PM, Florian Weimer wrote:
>>>>> On 02/17/2015 08:59 PM, Anthony Scarpino wrote:
>>>>>> On 02/17/2015 12:57 AM, Florian Weimer wrote:
>>>>>>> On 02/16/2015 10:11 PM, Anthony Scarpino wrote:
>>>>>>>> http://cr.openjdk.java.net/~ascarpino/8073108/jdk/webrev/
>>>>>>> I think the “state” field in GHASH should be final.  Is C2 able to
>>>>>>> eliminate the array bounds checks?  (Although it's not in the inner
>>>>>>> loop
>>>>>>> and thus probably not relevant for performance.)
>>>>>> I'm not sure want you asking about in regard to the bounds checking?
>>>>>> Are
>>>>>> you asking about checking the bounds of "state"?
>>>>> state[0] and state[1]—I wonder if those expressions need array bounds
>>>>> checks when compiled.
>>>> The glossary says C2 eliminates array range checking.
>>>> http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
>>>> Tony

More information about the security-dev mailing list