RFR: 8073108: GHASH Intrinsics
Anthony Scarpino
anthony.scarpino at oracle.com
Tue Feb 17 23:57:15 UTC 2015
Ok, I'll add these checks.. Thanks for looking through this.
Tony
On 02/17/2015 02:30 PM, Vladimir Kozlov wrote:
> The concern was about java code and not intrinsic. If intrinsic is not
> used we will get additional range checks which were not present before.
>
> By indirect load I mean object->array_oop->element instead of
> object->field.
>
> But in blockMult() you access them outside main loops. As result the
> code without intrinsic should not be affected much.
>
> So, please, ignore my suggestion about "keep 'subkey' and 'state'
> original values". You are right for intrinsic it is better to use arrays.
>
> Anyway, all is fine except you may need to check that size of arrays is
> at least 2 before calling processBlocks().
>
> thanks,
> Vladimir
>
> On 2/17/15 2:03 PM, Anthony Scarpino wrote:
>> On 02/17/2015 01:14 PM, Vladimir Kozlov wrote:
>>> Florian's concern is valid.
>>>
>>> "Range check elimination" means that C2 moves checks from a loop. Checks
>>> are still present. Since 'state' array is not final we can't eliminate
>>> range check.
>>
>> I don't understand the concern here. The arrays are private and are
>> only used by math calculations either in the same GHASH object or by the
>> intrinsics. What ranges check issues are we expecting to occur given
>> they are inaccessible by an API?
>>
>>> An other thing is an additional indirect load to access
>>> arrays elements.
>>>
>>> I would suggest to keep original code for 'subkey' and 'state' which use
>>> separate values instead of arrays.
>>
>> By the indirect loading, you referring to the loading done by
>> get_vars_from_ghash_object() in library_call.cpp? If so, I thought it
>> would be faster to load the 128bit values directly into one register
>> instead of loading one long from the class to a register, shift left,
>> loading the second long and xor'ing into the register. The assembly
>> code is doing the math on the whole 128bit value, while the GHASH.java
>> side needs to split it into 2 values. Putting it into an array makes in
>> easy, coding wise, to access the whole 128bit value at once.
>>
>> If the suggestion also is to put the 4 longs on the argument list, that
>> would be 7 arguments for processBlocks, which would be ugly and not
>> useful for the non-intrinsic side since GHASH.java would ignore the
>> longs.
>>
>> >
>>> Regards,
>>> Vladimir
>>>
>>>
>>> On 2/17/15 1:00 PM, Anthony Scarpino wrote:
>>>> On 02/17/2015 12:05 PM, Florian Weimer wrote:
>>>>> On 02/17/2015 08:59 PM, Anthony Scarpino wrote:
>>>>>> On 02/17/2015 12:57 AM, Florian Weimer wrote:
>>>>>>> On 02/16/2015 10:11 PM, Anthony Scarpino wrote:
>>>>>>>> http://cr.openjdk.java.net/~ascarpino/8073108/jdk/webrev/
>>>>>>>
>>>>>>> I think the “state” field in GHASH should be final. Is C2 able to
>>>>>>> eliminate the array bounds checks? (Although it's not in the inner
>>>>>>> loop
>>>>>>> and thus probably not relevant for performance.)
>>>>>>
>>>>>> I'm not sure want you asking about in regard to the bounds checking?
>>>>>> Are
>>>>>> you asking about checking the bounds of "state"?
>>>>>
>>>>> state[0] and state[1]—I wonder if those expressions need array bounds
>>>>> checks when compiled.
>>>>
>>>> The glossary says C2 eliminates array range checking.
>>>> http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
>>>>
>>>> Tony
>>>>
>>
More information about the hotspot-dev
mailing list