RFR: 8073108: GHASH Intrinsics
Vladimir Kozlov
vladimir.kozlov at oracle.com
Tue Feb 17 22:30:08 UTC 2015
The concern was about java code and not intrinsic. If intrinsic is not
used we will get additional range checks which were not present before.
By indirect load I mean object->array_oop->element instead of object->field.
But in blockMult() you access them outside main loops. As result the
code without intrinsic should not be affected much.
So, please, ignore my suggestion about "keep 'subkey' and 'state'
original values". You are right for intrinsic it is better to use arrays.
Anyway, all is fine except you may need to check that size of arrays is
at least 2 before calling processBlocks().
thanks,
Vladimir
On 2/17/15 2:03 PM, Anthony Scarpino wrote:
> On 02/17/2015 01:14 PM, Vladimir Kozlov wrote:
>> Florian's concern is valid.
>>
>> "Range check elimination" means that C2 moves checks from a loop. Checks
>> are still present. Since 'state' array is not final we can't eliminate
>> range check.
>
> I don't understand the concern here. The arrays are private and are
> only used by math calculations either in the same GHASH object or by the
> intrinsics. What ranges check issues are we expecting to occur given
> they are inaccessible by an API?
>
>> An other thing is an additional indirect load to access
>> arrays elements.
>>
>> I would suggest to keep original code for 'subkey' and 'state' which use
>> separate values instead of arrays.
>
> By the indirect loading, you referring to the loading done by
> get_vars_from_ghash_object() in library_call.cpp? If so, I thought it
> would be faster to load the 128bit values directly into one register
> instead of loading one long from the class to a register, shift left,
> loading the second long and xor'ing into the register. The assembly
> code is doing the math on the whole 128bit value, while the GHASH.java
> side needs to split it into 2 values. Putting it into an array makes in
> easy, coding wise, to access the whole 128bit value at once.
>
> If the suggestion also is to put the 4 longs on the argument list, that
> would be 7 arguments for processBlocks, which would be ugly and not
> useful for the non-intrinsic side since GHASH.java would ignore the longs.
>
> >
>> Regards,
>> Vladimir
>>
>>
>> On 2/17/15 1:00 PM, Anthony Scarpino wrote:
>>> On 02/17/2015 12:05 PM, Florian Weimer wrote:
>>>> On 02/17/2015 08:59 PM, Anthony Scarpino wrote:
>>>>> On 02/17/2015 12:57 AM, Florian Weimer wrote:
>>>>>> On 02/16/2015 10:11 PM, Anthony Scarpino wrote:
>>>>>>> http://cr.openjdk.java.net/~ascarpino/8073108/jdk/webrev/
>>>>>>
>>>>>> I think the “state” field in GHASH should be final. Is C2 able to
>>>>>> eliminate the array bounds checks? (Although it's not in the inner
>>>>>> loop
>>>>>> and thus probably not relevant for performance.)
>>>>>
>>>>> I'm not sure want you asking about in regard to the bounds checking?
>>>>> Are
>>>>> you asking about checking the bounds of "state"?
>>>>
>>>> state[0] and state[1]—I wonder if those expressions need array bounds
>>>> checks when compiled.
>>>
>>> The glossary says C2 eliminates array range checking.
>>> http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
>>>
>>> Tony
>>>
>
More information about the hotspot-dev
mailing list