RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm
Martin Doerr
mdoerr at openjdk.org
Fri Dec 20 17:25:40 UTC 2024
On Thu, 18 Jul 2024 14:31:57 GMT, Suchismith Roy <sroy at openjdk.org> wrote:
> JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437)
>
> Currently acceleration code for GHASH is missing for PPC64.
>
> The current implementation utlilises SIMD instructions on Power and uses Karatsuba multiplication for obtaining the final result.
Thanks for implementing it! Reviewing the algorithm will take more time. I already have some comments and suggestions.
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 634:
> 632: }
> 633:
> 634: // Generate stub for ghash process blocks.
There are multiple double-whitespaces in the new comments. Please clean them up!
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 639:
> 637: // state: R3_ARG1
> 638: // subkeyH: R4_ARG2
> 639: // data: R5_ARG3
Argument "blocks" missing.
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 684:
> 682: VectorRegister vS = VR25;
> 683: VectorSRegister vXS = VSR33;
> 684: Label L_end, L_aligned;
I suggest to declare `VectorRegisters` only and using `->to_vsr()` below. This should improve readability.
Non-volatile VectorRegisters need to be preserved. See https://github.com/openjdk/jdk/blob/bcb1bdaae772c752d54939dae3a0d95892acc228/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp#L3866
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 686:
> 684: Label L_end, L_aligned;
> 685:
> 686: static const unsigned char perm_pattern[16] __attribute__((aligned(16))) = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
This pattern can be produced by `lvsl`. Loading it from memory is not needed.
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 713:
> 711: __ vxor(vTmp1, vTmp1, vTmp1);
> 712: __ vxor(vZero, vZero, vZero);
> 713: __ mtctr(blocks);
Can `blocks` be 0?
`blocks` is an int. The higher half of the register may possibly contain garbage and should be cleared. (Can be combined with 0 check if needed.)
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 759:
> 757: __ bdnz(loop);
> 758: __ stxvd2x(vZero->to_vsr(), state);
> 759: __ blr(); // Return from function
Some empty lines would improve readability.
-------------
PR Review: https://git.openjdk.org/jdk/pull/20235#pullrequestreview-2517971876
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1894178520
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1894193680
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1894184131
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1894189657
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1894194165
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1894197698
More information about the hotspot-dev
mailing list