RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v28]
Martin Doerr
mdoerr at openjdk.org
Sat Mar 8 18:18:00 UTC 2025
On Fri, 7 Mar 2025 16:59:30 GMT, Suchismith Roy <sroy at openjdk.org> wrote:
>> Your version extracts two 8-byte parts and feeds them into separate xor instructions. My proposal performs both 8-byte xor operations with one vxor instruction by selecting the input bits accordingly. Furthermore, it avoids swapping the halves back and forth (I swap the halves of vReducedLow instead).
>> Have you tried?
>
> @TheRealMDoerr Yes. The tests do not pass with this.
> I am still looking for ways to reduce the number of instructions. The sequence
> masm->vsldoi(vLowProduct, vLowProduct, vLowProduct, 8); // Swap
> masm->vxor(vLowProduct, vLowProduct, vReducedLow); // Reduction using constant
> masm->vsldoi(vCombinedResult, vLowProduct, vLowProduct, 8); // Swap
>
>
> can be brought down to 2 instructions.
> I am still looking for further reductions. Let me know your thoughts.
I still find it hard to read. Can you describe the algorithm in pseudocode or mathematical equations? We can then try to map it to a shorter instruction sequence.
Btw., the comment on this line looks wrong: vxor(vLowProduct, vLowProduct, vReducedLow); // Reduction using constant
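
To illustrate why the quoted swap / xor / swap sequence can shrink to two instructions, here is a minimal scalar sketch. It assumes that vsldoi(dst, src, src, 8) simply swaps the two 8-byte halves of a 128-bit register, and that the intermediate value of vLowProduct is not needed anywhere else. The Vec128 type and the helper names are hypothetical and only model the registers; they are not part of the MacroAssembler API, and the sketch says nothing about whether the surrounding GHASH reduction is correct.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 128-bit vector register as two 64-bit halves.
struct Vec128 {
  uint64_t hi;
  uint64_t lo;
};

static bool same(const Vec128& a, const Vec128& b) {
  return a.hi == b.hi && a.lo == b.lo;
}

// Models vsldoi(dst, src, src, 8): shifting the concatenation src:src
// by 8 bytes exchanges the two halves of src.
static Vec128 swap_halves(const Vec128& v) {
  return Vec128{v.lo, v.hi};
}

// Models vxor(dst, a, b): bitwise xor over all 128 bits.
static Vec128 vxor128(const Vec128& a, const Vec128& b) {
  return Vec128{a.hi ^ b.hi, a.lo ^ b.lo};
}

// The three-instruction sequence quoted above: swap, xor, swap.
static Vec128 three_step(const Vec128& low_product, const Vec128& reduced_low) {
  Vec128 t = swap_halves(low_product);
  t = vxor128(t, reduced_low);
  return swap_halves(t);
}

// Two-instruction variant: swap the other operand once, then xor.
// Works because swap(swap(a) ^ b) == a ^ swap(b).
static Vec128 two_step(const Vec128& low_product, const Vec128& reduced_low) {
  return vxor128(low_product, swap_halves(reduced_low));
}

// Compares both variants on a few arbitrary inputs.
static bool variants_agree() {
  const Vec128 samples[] = {
    {0x0123456789abcdefULL, 0xfedcba9876543210ULL},
    {0xc200000000000000ULL, 0x0000000000000001ULL},
    {0xffffffffffffffffULL, 0x0000000000000000ULL},
  };
  for (const Vec128& a : samples) {
    for (const Vec128& b : samples) {
      if (!same(three_step(a, b), two_step(a, b))) return false;
    }
  }
  return true;
}
```

In assembler terms this would correspond to one vsldoi on vReducedLow (into a scratch register if vReducedLow must be preserved) followed by one vxor writing vCombinedResult.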
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1986127977