RFR(S): 8218991: s390: Add intrinsic for GHASH algorithm

Schmidt, Lutz lutz.schmidt at sap.com
Fri Feb 15 09:23:31 UTC 2019


Hi Martin,

this is a nice improvement! Thanks a lot for implementing it. Looks like low-hanging fruit.

Your change looks good, overall. But remember, I’m not a reviewer. 

I have two comments, though:
Why didn’t you use the MVC instruction for the memory-to-memory copies? It exists for exactly that purpose and could save a few extra cycles. Example:

    // Copy back result and free parameter block.
    __ z_lg( Z_R0, Address(Z_R1));
    __ z_stg(Z_R0, Address(state));
    __ z_lg( Z_R0, Address(Z_R1, 8));
    __ z_stg(Z_R0, Address(state, 8));
    __ z_aghi(Z_SP, frame_resize);

would become

    // Copy back result and free parameter block.
    __ z_mvc(0, 8-1, state, 0, Z_R1);
    __ z_mvc(8, 8-1, state, 8, Z_R1);
    __ z_aghi(Z_SP, frame_resize);

or even

    // Copy back result and free parameter block.
    __ z_mvc(0, 2*8-1, state, 0, Z_R1);
    __ z_aghi(Z_SP, frame_resize);

Looks pretty and compact, doesn't it? A similar transformation (two MVC instructions) is possible for "Fill parameter block".

Second: how about adding a
    __ z_xc(16, 2*8-1, Z_R1, 16, Z_R1);
to clear the key from the stack? XC of a field with itself sets it to zero.

Take care not to forget the -1 in the length field. The instruction uses an 8-bit length field that stores (length - 1), so the values 0..255 encode lengths 1..256.
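
To make the length encoding concrete, here is a minimal standalone C++ sketch. It is not HotSpot code: the helpers `encode_ss_length`, `model_mvc`, and `model_xc` are hypothetical names made up for illustration, and the byte-wise loops are only a simplified model of what MVC and XC do. The block layout (16 result bytes followed by 16 key bytes) is assumed from the offsets used in the snippets above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of HotSpot): convert a byte count into the
// 8-bit length field of an SS-format instruction such as MVC or XC.
// The field stores (length - 1), so 0..255 encodes 1..256 bytes.
inline uint8_t encode_ss_length(int byte_count) {
  assert(byte_count >= 1 && byte_count <= 256);
  return static_cast<uint8_t>(byte_count - 1);
}

// Simplified model of MVC: copy (len_field + 1) bytes, left to right.
inline void model_mvc(uint8_t* dst, const uint8_t* src, uint8_t len_field) {
  for (int i = 0; i <= len_field; ++i) {
    dst[i] = src[i];
  }
}

// Simplified model of XC: dst[i] ^= src[i] for (len_field + 1) bytes.
// When dst and src are the same field, every byte becomes zero.
inline void model_xc(uint8_t* dst, const uint8_t* src, uint8_t len_field) {
  for (int i = 0; i <= len_field; ++i) {
    dst[i] ^= src[i];
  }
}
```

With these helpers, `model_mvc(state, block, encode_ss_length(16))` corresponds to the single 16-byte MVC, and `model_xc(block + 16, block + 16, encode_ss_length(16))` corresponds to the XC that wipes the key.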

Thanks,
Lutz


From: "Doerr, Martin (martin.doerr at sap.com)" <martin.doerr at sap.com>
Date: Thursday, 14. February 2019 at 18:28
To: "'hotspot-compiler-dev at openjdk.java.net'" <hotspot-compiler-dev at openjdk.java.net>
Cc: Lutz Schmidt <lutz.schmidt at sap.com>
Subject: RFR(S): 8218991: s390: Add intrinsic for GHASH algorithm

Hi,
 
I’d like to contribute a GHASH stub for s390 to fix an SSL performance bottleneck.
 
Webrev:
http://cr.openjdk.java.net/~mdoerr/8218991_s390_ghash/webrev.00/
 
TestAESMain improves by about a factor of 3 with the following setup on z13 hardware: algorithm=AES, mode=GCM, paddingStr=PKCS5Padding, msgSize=12000, keySize=128, noReinit=false, checkOutput=false, encInputOffset=0, encOutputOffset=0, decOutputOffset=0, lastChunkSize=32
 
Please review.
 
Best regards,
Martin
 


