RFR(M): 8216060: [PPC64] Vector CRC implementation should be used by interpreter and be faster for short arrays

Doerr, Martin martin.doerr at sap.com
Thu Jan 3 17:34:57 UTC 2019


Hi Gustavo,

thanks for testing and your feedback. I just fixed the comment typos in place.

Unfortunately, I can't reproduce the crash. TestCRC32C works stable on our machine (with fastdbg build).
I guess that the frameless spills mess up the stack. Can you check if the patch below helps?

Best regards,
Martin


diff -r a33f49d5998c src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp
--- a/src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp  Thu Jan 03 17:30:03 2019 +0100
+++ b/src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp  Thu Jan 03 18:33:16 2019 +0100
@@ -1924,6 +1924,9 @@
       __ addi(data, data, arrayOopDesc::base_offset_in_bytes(T_BYTE));
     }

+    // Restore caller sp for c2i case.
+    __ mr(R1_SP, R21_sender_SP); // Cut the stack back to where the caller started.
+
     StubRoutines::ppc64::generate_load_crc_table_addr(_masm, table);

     if (!VM_Version::has_vpmsumb()) {
@@ -1933,8 +1936,6 @@
       __ kernel_crc32_vpmsum(crc, data, dataLen, table, t0, t1, t2, t3, tc0, tc1, tc2, true);
     }

-    // Restore caller sp for c2i case and return.
-    __ mr(R1_SP, R21_sender_SP); // Cut the stack back to where the caller started.
     __ blr();

     // Generate a vanilla native entry as the slow path.
@@ -2014,6 +2015,9 @@
       __ addi(data, data, arrayOopDesc::base_offset_in_bytes(T_BYTE));
     }

+    // Restore caller sp for c2i case.
+    __ mr(R1_SP, R21_sender_SP); // Cut the stack back to where the caller started.
+
     StubRoutines::ppc64::generate_load_crc32c_table_addr(_masm, table);

     if (!VM_Version::has_vpmsumb()) {
@@ -2023,8 +2027,6 @@
       __ kernel_crc32_vpmsum(crc, data, dataLen, table, t0, t1, t2, t3, tc0, tc1, tc2, false);
     }

-    // Restore caller sp for c2i case and return.
-    __ mr(R1_SP, R21_sender_SP); // Cut the stack back to where the caller started.
     __ blr();

     BLOCK_COMMENT("} CRC32C_update{Bytes|DirectByteBuffer}");


-----Original Message-----
From: Gustavo Romero <gromero at linux.vnet.ibm.com> 
Sent: Donnerstag, 3. Januar 2019 17:13
To: Doerr, Martin <martin.doerr at sap.com>; 'hotspot-compiler-dev at openjdk.java.net' <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR(M): 8216060: [PPC64] Vector CRC implementation should be used by interpreter and be faster for short arrays

Hi Martin,

oh that's nice. You removed the 512-byte block constraint and also wired it up to the Interpreter :)

For the worst case, unaligned 512 byte array, I see the gap to aligned 512 byte array reduced by about ~5.7x.

On the Interpreter I see an improvement of at least 50% for 1024 bytes.

This is all for the CRC32 class.

On CRC32C I'm getting a SIGSEV that can be reproduced running against ./test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java.

I've upload a full log into http://cr.openjdk.java.net/~gromero/logs/crc32c_sigsegv/

I'm leaving for the lunch and I'll take a closer look when back. But probably you will figure it out before I sit to appreciate the meal :)

Finally, since the change does some cleanup, I wonder if it would be worth fixing the following typos:

I think it's Barrett const., not Barret. Probably 'barret' is used in the code as a short version
for Barrett but it should be changed in

+  // Point to Barret constants
+  add_const_optimized(cur_const, constants, outer_consts_size + inner_consts_size);
+    

?

s/not/note/ in:
cpu/ppc/macroAssembler_ppc.cpp:3977:// A not on the lookup table address(es):

d/lives/ in:
cpu/ppc/macroAssembler_ppc.cpp:4265:  mtvrwz(VCRC, crc); // crc lives lives in VCRC, now

Best regards,
Gustavo

On 01/03/2019 12:17 PM, Doerr, Martin wrote:
> Hi,
> 
> the JVM on PPC64 currently misses usage of the fast vector implementation in the interpreter code.
> 
> In addition, performance is not good for short arrays (unaligned 512 byte arrays or shorter arrays) because the current vector implementation needs at least 512 bytes.
> 
> Bug:
> 
> https://bugs.openjdk.java.net/browse/JDK-8216060
> 
> I have addressed these 2 issues + some cleanup with the following webrev:
> 
> http://cr.openjdk.java.net/~mdoerr/8216060_PPC64_CRC/webrev.00/ <http://cr.openjdk.java.net/%7Emdoerr/8216060_PPC64_CRC/webrev.00/>
> 
> Please review.
> 
> Best regards,
> 
> Martin
> 



More information about the hotspot-compiler-dev mailing list