RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v28]

Martin Doerr mdoerr at openjdk.org
Sat Mar 8 18:18:00 UTC 2025


On Fri, 7 Mar 2025 16:59:30 GMT, Suchismith Roy <sroy at openjdk.org> wrote:

>> Your version extracts two 8-byte parts and feeds them into separate xor instructions. My proposal performs both 8-byte xor operations with one vxor instruction by selecting the input bits accordingly. It also avoids swapping halves back and forth (I swap the halves of vReducedLow instead).
>> Have you tried it?
>
> @TheRealMDoerr  Yes. The tests do not pass with this.
> I am trying to find ways to reduce the number of instructions.
>
>     masm->vsldoi(vLowProduct, vLowProduct, vLowProduct, 8);           // Swap
>     masm->vxor(vLowProduct, vLowProduct, vReducedLow);                // Reduction using constant
>     masm->vsldoi(vCombinedResult, vLowProduct, vLowProduct, 8);       // Swap
>
> can be brought down to 2 instructions.
> I am still looking for further reductions. Let me know your thoughts.

I still find it hard to read. Can you describe the algorithm in pseudocode or mathematical equations? Then we can try to map it to a shorter instruction sequence.
Btw., the comment looks wrong here: vxor(vLowProduct, vLowProduct, vReducedLow); // Reduction using constant
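
As a starting point for such a description, here is a minimal sketch of the data movement in the three-instruction sequence quoted above. It assumes that vsldoi with both sources equal and a shift of 8 simply swaps the two 8-byte halves, and it models a 128-bit register as two 64-bit words. Vec128, swap_halves, combine3 and combine2 are hypothetical names for illustration, not HotSpot code, and the sketch says nothing about the rest of the intrinsic. Under those assumptions, swap(swap(A) ^ B) equals A ^ swap(B), so swapping vReducedLow once and doing a single vxor should give the same result:

```cpp
#include <cstdint>

// Hypothetical model of a 128-bit vector register as two 64-bit halves.
struct Vec128 {
  std::uint64_t hi;
  std::uint64_t lo;
};

constexpr bool operator==(Vec128 a, Vec128 b) {
  return a.hi == b.hi && a.lo == b.lo;
}

// Models vsldoi(dst, src, src, 8): with identical sources, shifting the
// concatenation by 8 bytes swaps the two halves.
constexpr Vec128 swap_halves(Vec128 v) { return Vec128{v.lo, v.hi}; }

// Models vxor(dst, a, b): bitwise xor over all 128 bits.
constexpr Vec128 vxor(Vec128 a, Vec128 b) {
  return Vec128{a.hi ^ b.hi, a.lo ^ b.lo};
}

// Three-instruction form from the quoted snippet: swap, xor, swap.
constexpr Vec128 combine3(Vec128 lowProduct, Vec128 reducedLow) {
  return swap_halves(vxor(swap_halves(lowProduct), reducedLow));
}

// Two-instruction form: swap the other operand once, then xor.
constexpr Vec128 combine2(Vec128 lowProduct, Vec128 reducedLow) {
  return vxor(lowProduct, swap_halves(reducedLow));
}

// Arbitrary sample values, checked at compile time.
constexpr Vec128 sampleA{0x0123456789abcdefULL, 0xfedcba9876543210ULL};
constexpr Vec128 sampleB{0xc200000000000000ULL, 0x0000000000000001ULL};
static_assert(combine3(sampleA, sampleB) == combine2(sampleA, sampleB),
              "swap/xor/swap equals xor with the other operand swapped");
```

If vReducedLow can be produced with its halves already swapped, the remaining swap may not be needed either, but that depends on the preceding instructions, which are not shown here.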

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20235#discussion_r1986127977

