RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v26]

Martin Doerr mdoerr at openjdk.org
Fri Feb 21 15:25:57 UTC 2025


On Thu, 20 Feb 2025 15:41:12 GMT, Suchismith Roy <sroy at openjdk.org> wrote:

>> JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437)
>> 
>> Currently acceleration code for GHASH is missing for PPC64. 
>> 
>> The current implementation utilises SIMD instructions on Power and uses Karatsuba multiplication to obtain the final result.
>
> Suchismith Roy has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - change branch and remove not needed variables
>  - change branch and remove not needed variables

src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 686:

> 684:     __ bind(L_aligned_loop);
> 685:       __ lvx(vH, temp1, data);
> 686:       __ vec_perm(vH, vH, vH, loadOrder);

I think this instruction is only needed on Big Endian and can be optimized out on Little Endian (like in https://github.com/openjdk/jdk/blob/dfcd0df60c60cf89dc01682264a573ad39e61a17/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp#L4155).

src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 703:

> 701:       __ vec_perm(vTmp4, vHigh, vHigh, loadOrder);
> 702:       __ vec_perm(vTmp5, vLow, vLow, loadOrder);
> 703:       __ vec_perm(vH, vTmp5, vTmp4, vPerm);

Can we compute a different vPerm such that we only need one `vec_perm` instruction in the loop?
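
To illustrate why a single permute should be enough, here is a minimal standalone sketch. It is not HotSpot code. It assumes a simplified model in which a permute picks byte `ctl[i] & 0x1f` out of the 32-byte concatenation of its two sources; the operand order and endianness handling of the real `vec_perm` may differ. The names `perm`, `chained`, `fuse_controls` and `check_fusion` are hypothetical. Under that model, `loadOrder` can be folded into `vPerm` once, outside the loop:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec = std::array<std::uint8_t, 16>;

// Simplified permute model (an assumption, not the exact hardware semantics):
// result byte i is byte (ctl[i] & 0x1f) of the concatenation a || b.
Vec perm(const Vec& a, const Vec& b, const Vec& ctl) {
  Vec r{};
  for (std::size_t i = 0; i < 16; ++i) {
    const std::uint8_t s = ctl[i] & 0x1f;
    r[i] = (s < 16) ? a[s] : b[s - 16];
  }
  return r;
}

// The three permutes from the quoted loop body, in the same order.
Vec chained(const Vec& low, const Vec& high,
            const Vec& loadOrder, const Vec& vPerm) {
  const Vec t4 = perm(high, high, loadOrder);
  const Vec t5 = perm(low, low, loadOrder);
  return perm(t5, t4, vPerm);
}

// Loop-invariant: compose loadOrder with vPerm into one control vector.
// Bit 4 of each selector still chooses the source (low or high); the low
// four bits are redirected through loadOrder.
Vec fuse_controls(const Vec& loadOrder, const Vec& vPerm) {
  Vec fused{};
  for (std::size_t i = 0; i < 16; ++i) {
    const std::uint8_t s = vPerm[i] & 0x1f;
    fused[i] = static_cast<std::uint8_t>((s & 0x10) |
                                         (loadOrder[s & 0x0f] & 0x0f));
  }
  return fused;
}

// Checks that one permute with the fused control matches the chain of three.
void check_fusion(const Vec& low, const Vec& high,
                  const Vec& loadOrder, const Vec& vPerm) {
  const Vec fused = fuse_controls(loadOrder, vPerm);
  assert(perm(low, high, fused) == chained(low, high, loadOrder, vPerm));
}
```

Calling `check_fusion` with, for example, a byte-reversing `loadOrder` and a shifting `vPerm` exercises both sources. Whether the same folding carries over to the actual stub depends on how `vPerm` is derived there.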

src/hotspot/cpu/ppc/vm_version_ppc.cpp line 314:

> 312:   } else if (UseGHASHIntrinsics) {
> 313:     if (!FLAG_IS_DEFAULT(UseGHASHIntrinsics))
> 314:       warning("GHASH intrinsics are not available on this CPU");

Coding style: HotSpot uses curly braces even for single-statement `if` bodies (like the code above).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1965667282
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1965675196
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1965672939

