RFR (M) 8175369: [ppc] Provide intrinsic implementation for CRC32C
Schmidt, Lutz
lutz.schmidt at sap.com
Wed Mar 8 14:38:17 UTC 2017
Martin, thanks for reviewing my change!
All,
Sorry for the delayed response. I had to take care of some urgent private stuff the last few days.
After rebasing, I have updated the comment in kernel_crc32_1word. It was outdated anyway. I would like to postpone the decision about the fate of kernel_crc32_2word.
The conflict in share/vm/c1/c1_Compiler.cpp is somewhat tricky. I decided to remove the file from my s390 webrev (8175368) and to enable both CPUs (ppc and s390) with this change. This “trick” avoids conflicting changes to the same source code line.
Please find the most recent webrev here: http://cr.openjdk.java.net/~lucy/webrevs/8175369.01/
Thank you,
Lutz
On 03.03.17, 17:33, "Doerr, Martin" <martin.doerr at sap.com> wrote:
Hi Lutz,
thank you very much for implementing it for PPC64, too.
Unfortunately, the patch doesn’t apply cleanly any more (due to clashes with other changes).
But it should be easy to fix.
The copyright change of templateInterpreterGenerator_ppc.cpp doesn’t fit any more (same is true for s390 webrev).
In addition, the PPC64 change was not built on top of the s390 one. The shared code change in c1_Compiler.cpp needs to be adapted in this change and pushed after the s390 one.
As the change is very similar to what you’ve done on s390, it already contains what I had requested in the other review.
I only have a comment on the macroAssembler part:
I didn't review kernel_crc32_2word as it is no longer used and should get removed sooner or later (not necessarily with this change).
kernel_crc32_1word contains an s390 comment "ahgi".
Besides that, it looks good to me.
Thanks and best regards,
Martin
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Schmidt, Lutz
Sent: Thursday, March 2, 2017 17:24
To: hotspot-compiler-dev at openjdk.java.net
Subject: RFR (M) 8175369: [ppc] Provide intrinsic implementation for CRC32C
Hi all,
may I kindly request reviews for my medium size enhancement? Further down the road, I would need a sponsor, too.
Bug: https://bugs.openjdk.java.net/browse/JDK-8175369
Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8175369/
Description:
This intrinsic implementation provides some performance benefit over the standard Java implementation. It uses only well-known “standard” instructions, available on all supported POWER CPUs. Because it is very similar to the CRC32 intrinsics, I tried to share as much code as possible between the two.
Performance was observed to improve by up to 2.0x, depending on the length of the byte array fed into CRC32C. Short byte arrays benefit more. That is due to the fact that the Java implementation of CRC32C is well inlined and optimized by the JIT, at least with my simple microbenchmarks.
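For anyone who wants to get a rough feeling for the length dependence, a minimal sketch follows. It is not the microbenchmark used for this change. The class name Crc32cSketch, its methods, and the chosen array lengths are hypothetical and for illustration only. It calls java.util.zip.CRC32C (the method the intrinsic replaces), checks it against a bit-at-a-time reference using the reflected Castagnoli polynomial 0x82F63B78, and prints a crude time per call. A serious measurement would use a proper harness such as JMH.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32C;

// Hypothetical helper class, not part of the webrev.
final class Crc32cSketch {
    // Reflected form of the Castagnoli polynomial 0x1EDC6F41.
    private static final int POLY_REFLECTED = 0x82F63B78;

    // Bit-at-a-time CRC32C reference; slow, for cross-checking only.
    static int crc32cBitwise(byte[] data) {
        int crc = 0xFFFFFFFF;                  // initial value: all ones
        for (byte b : data) {
            crc ^= (b & 0xFF);
            for (int i = 0; i < 8; i++) {
                int mask = -(crc & 1);         // all ones if the low bit is set
                crc = (crc >>> 1) ^ (POLY_REFLECTED & mask);
            }
        }
        return ~crc;                           // final inversion
    }

    // CRC32C via the JDK class (available since JDK 9).
    static long crc32cJdk(byte[] data) {
        CRC32C crc = new CRC32C();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("check value: 0x%08X%n", crc32cJdk(check));

        // Crude timing over a few assumed array lengths; no warmup control.
        int[] lengths = {16, 256, 4096, 65536};
        for (int len : lengths) {
            byte[] buf = new byte[len];
            for (int i = 0; i < len; i++) {
                buf[i] = (byte) i;
            }
            if ((crc32cBitwise(buf) & 0xFFFFFFFFL) != crc32cJdk(buf)) {
                throw new AssertionError("mismatch at length " + len);
            }
            int reps = 100_000;
            long sink = 0;                     // keeps the JIT from dropping the calls
            long start = System.nanoTime();
            for (int r = 0; r < reps; r++) {
                sink += crc32cJdk(buf);
            }
            long perCall = (System.nanoTime() - start) / reps;
            System.out.println("len=" + len + " ns/call=" + perCall + " (sink=" + sink + ")");
        }
    }
}
```

Running the same class with and without the intrinsic enabled gives a first impression of the speedup per array length, although numbers from a loop like this are noisy.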
Thanks,
Lutz