RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm
Suchismith Roy
sroy at openjdk.org
Mon Jan 6 13:49:37 UTC 2025
On Fri, 20 Dec 2024 17:14:32 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:
>> JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437)
>>
>> Currently acceleration code for GHASH is missing for PPC64.
>>
>> The current implementation utilises SIMD instructions on Power and uses Karatsuba multiplication for obtaining the final result.
>
> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 686:
>
>> 684: Label L_end, L_aligned;
>> 685:
>> 686: static const unsigned char perm_pattern[16] __attribute__((aligned(16))) = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
>
> This pattern can be produced by `lvsl`. Loading it from memory is not needed.
Hi @TheRealMDoerr
I tried something like:
__ lvsl(loadOrder, 0);
This generated the pattern below:
{0xf, 0xe, 0xd, 0xc, 0xb, 0xa, 0x9, 0x8, 0x7, 0x6, 0x5, 0x4, 0x3, 0x2, 0x1, 0x0}
This causes the data to be loaded into the vector in the wrong order.
The desired pattern is {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
Since the data is stored as bytes and lxvb16x is not available on Power8, the pattern has to be enforced explicitly.
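To illustrate the two orderings, here is a rough sketch in plain C++ (not HotSpot code). It assumes that `lvsl` with a zero shift yields the bytes 0..15 in big-endian element order, and that the reversed listing above is the same register read in little-endian element order. The names `lvsl_pattern` and `little_endian_view` are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Model of the lvsl result for a shift amount sh (the low 4 bits of the
// effective address), listed in big-endian element order, i.e. index 0 is
// the most significant byte of the register.
Bytes16 lvsl_pattern(unsigned sh) {
  assert(sh < 16);  // only the low 4 address bits take part
  Bytes16 v{};
  for (unsigned i = 0; i < 16; ++i) {
    v[i] = static_cast<std::uint8_t>(sh + i);
  }
  return v;
}

// The same register contents listed in little-endian element order, i.e.
// index 0 is the least significant byte. Assumption: this is the order in
// which the pattern above was printed.
Bytes16 little_endian_view(const Bytes16& be) {
  Bytes16 le{};
  for (unsigned i = 0; i < 16; ++i) {
    le[i] = be[15 - i];
  }
  return le;
}
```

In this model `lvsl_pattern(0)` lists 0 through 15, while `little_endian_view(lvsl_pattern(0))` lists 0xf down to 0x0, which matches the pattern observed above.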
Is there a better way to do this?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1904171740
More information about the hotspot-dev mailing list