RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v26]
Martin Doerr
mdoerr at openjdk.org
Fri Feb 21 15:25:57 UTC 2025
On Thu, 20 Feb 2025 15:41:12 GMT, Suchismith Roy <sroy at openjdk.org> wrote:
>> JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437)
>>
>> Currently acceleration code for GHASH is missing for PPC64.
>>
>> The current implementation utilises SIMD instructions on Power and uses Karatsuba multiplication to obtain the final result.
>
> Suchismith Roy has updated the pull request incrementally with two additional commits since the last revision:
>
> - change branch and remove not needed variables
> - change branch and remove not needed variables
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 686:
> 684: __ bind(L_aligned_loop);
> 685: __ lvx(vH, temp1, data);
> 686: __ vec_perm(vH, vH, vH, loadOrder);
I think this instruction is only needed on Big Endian and can be optimized out on Little Endian (like in https://github.com/openjdk/jdk/blob/dfcd0df60c60cf89dc01682264a573ad39e61a17/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp#L4155).
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 703:
> 701: __ vec_perm(vTmp4, vHigh, vHigh, loadOrder);
> 702: __ vec_perm(vTmp5, vLow, vLow, loadOrder);
> 703: __ vec_perm(vH, vTmp5, vTmp4, vPerm);
Can we compute a different vPerm such that we only need one `vec_perm` instruction in the loop?
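To illustrate the idea, here is a minimal host-side sketch, not stub code. It assumes plain `vperm` semantics (result byte i is byte `c[i] & 0x1f` of the 32-byte concatenation `a || b`) and ignores any endianness-specific operand handling the `vec_perm` macro may do. The helper names `vperm`, `compose`, `three_permutes` and `one_permute` are hypothetical, and the actual contents of `loadOrder` and `vPerm` in the PR are not assumed here. Under those assumptions, the per-register permutation can be folded into the selector once, outside the loop:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec = std::array<std::uint8_t, 16>;

// Model of vperm: result byte i is byte (c[i] & 0x1f) of the
// 32-byte concatenation a || b.
Vec vperm(const Vec& a, const Vec& b, const Vec& c) {
  Vec r{};
  for (int i = 0; i < 16; ++i) {
    unsigned idx = c[i] & 0x1f;
    r[i] = (idx < 16) ? a[idx] : b[idx - 16];
  }
  return r;
}

// Fold the per-register permutation `inner` (applied to both inputs)
// into the two-register selector `outer`. Bit 0x10 of the selector still
// picks the source register; the low 4 bits are remapped through `inner`.
Vec compose(const Vec& inner, const Vec& outer) {
  Vec m{};
  for (int i = 0; i < 16; ++i) {
    unsigned j = outer[i] & 0x1f;
    m[i] = static_cast<std::uint8_t>((j & 0x10) | (inner[j & 0x0f] & 0x0f));
  }
  return m;
}

// Shape of the quoted loop body: three permutes per iteration.
Vec three_permutes(const Vec& low, const Vec& high,
                   const Vec& loadOrder, const Vec& vPerm) {
  Vec t4 = vperm(high, high, loadOrder);
  Vec t5 = vperm(low, low, loadOrder);
  return vperm(t5, t4, vPerm);
}

// Proposed shape: one permute per iteration. In a real stub the combined
// selector would be computed once before the loop.
Vec one_permute(const Vec& low, const Vec& high,
                const Vec& loadOrder, const Vec& vPerm) {
  return vperm(low, high, compose(loadOrder, vPerm));
}
```

Whether this carries over directly depends on how `loadOrder` and `vPerm` are actually set up and on the operand order `vec_perm` uses on each endianness, so please treat it as a starting point only.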
src/hotspot/cpu/ppc/vm_version_ppc.cpp line 314:
> 312: } else if (UseGHASHIntrinsics) {
> 313: if (!FLAG_IS_DEFAULT(UseGHASHIntrinsics))
> 314: warning("GHASH intrinsics are not available on this CPU");
Coding style: HotSpot uses curly braces even for a single-statement `if` body (like the `else if` above).
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1965667282
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1965675196
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1965672939