RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm

Suchismith Roy sroy at openjdk.org
Mon Jan 6 13:49:37 UTC 2025


On Fri, 20 Dec 2024 17:14:32 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437)
>> 
>> Currently, acceleration code for GHASH is missing for PPC64.
>> 
>> The current implementation utilises SIMD instructions on Power and uses Karatsuba multiplication to obtain the final result.
>
> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 686:
> 
>> 684:   Label L_end, L_aligned;
>> 685: 
>> 686:   static const unsigned char perm_pattern[16] __attribute__((aligned(16))) = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
> 
> This pattern can be produced by `lvsl`. Loading it from memory is not needed.

Hi @TheRealMDoerr  

I tried something like
 __ lvsl(loadOrder, 0);

This generated the pattern below:
 {0xf, 0xe, 0xd, 0xc, 0xb, 0xa, 0x9, 0x8, 0x7, 0x6, 0x5, 0x4, 0x3,
    0x2, 0x1, 0x0}
With this pattern the data is loaded into the vector in the wrong order.

The desired pattern is {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}

Since the data is stored as bytes and `lxvb16x` is not available on Power8, the pattern has to be enforced.

Is there a better way to do this?
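
A minimal sketch of the two byte orders discussed above, in plain C++ rather than HotSpot assembler code. It models the architected result of `lvsl` (byte i of the target register, in big-endian register numbering, is sh + i, where sh is the low four bits of the effective address) and then lists the same register from the least significant byte first. The helper names `lvsl_model` and `little_endian_view` are hypothetical, and the idea that the reversed pattern above comes from such a little-endian listing is an assumption, not something confirmed in this thread.

```cpp
#include <array>
#include <cstdint>

// Model of the architected lvsl result: byte i of the target register
// (big-endian register numbering) holds sh + i, where sh is the low
// four bits of the effective address.
std::array<uint8_t, 16> lvsl_model(uint64_t effective_address) {
  const uint8_t sh = static_cast<uint8_t>(effective_address & 0xF);
  std::array<uint8_t, 16> reg{};
  for (int i = 0; i < 16; ++i) {
    reg[i] = static_cast<uint8_t>(sh + i);
  }
  return reg;
}

// The same register contents listed from the least significant byte
// first. Assumption: this is the order a little-endian dump would show.
std::array<uint8_t, 16> little_endian_view(const std::array<uint8_t, 16>& reg) {
  std::array<uint8_t, 16> view{};
  for (int i = 0; i < 16; ++i) {
    view[i] = reg[15 - i];
  }
  return view;
}
```

Under these assumptions, `lvsl_model(0)` holds 0 through 15 in register order, while `little_endian_view(lvsl_model(0))` starts at 0xf and counts down, which would match the reversed pattern if the values were read in little-endian element order.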

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1904171740

