[10] RFR (S): 8189177 - AARCH64: Improve _updateBytesCRC32C intrinsic

Dmitry Chuyko dmitry.chuyko at bell-sw.com
Thu Nov 2 21:07:18 UTC 2017


As with the CRC32 change, I added a private 
MacroAssembler::kernel_crc32c_using_crc32c().

webrev: http://cr.openjdk.java.net/~dchuyko/8189177/webrev.01/

-Dmitry


On 10/20/2017 08:45 PM, Dmitry Chuyko wrote:
> Hello,
>
> Please review an improvement to the CRC32C calculation on AArch64. It is 
> done much like the change for JDK-8189176 described in [1].
>
> MacroAssembler::kernel_crc32c receives table registers that it does not 
> use. They can be used to make neighboring loads and CRC calculations 
> independent. Adding a prologue and an epilogue to the main by-64 loop 
> makes it applicable only from len=128, so an additional by-32 loop is 
> added for smaller lengths.
>
> rfe: https://bugs.openjdk.java.net/browse/JDK-8189177
> webrev: http://cr.openjdk.java.net/~dchuyko/8189177/webrev.00/
> benchmark: 
> http://cr.openjdk.java.net/~dchuyko/8189177/crc32c/CRC32CBench.java
>
> Results for T88 and A53 [2] are similar to those for the CRC32 change 
> (good), but again, splitting pair loads may slow down other CPUs, so 
> measurements on different HW are welcome.
>
> -Dmitry
>
> [1] 
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-October/027225.html
> [2] 
> https://bugs.openjdk.java.net/browse/JDK-8189177?focusedCommentId=14124535&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14124535
>
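
The loop structure described in the quoted message can be sketched in plain Java. This is only an illustration of the control flow, not the AArch64 assembly from the webrev. The class name, the helper names, and the exact shape of the prologue and epilogue are assumptions, and crc32cx/crc32cb are bitwise stand-ins for the hardware instructions of the same names. The by-64 loop loads the next block while folding the current one, which is why it needs at least two blocks (len >= 128) in this sketch.

```java
import java.util.zip.CRC32C;

class Crc32cLoopSketch {
    // Reflected Castagnoli polynomial used by CRC32C.
    private static final int POLY = 0x82F63B78;

    // Bitwise stand-in for a one-byte CRC32C step.
    static int crc32cb(int crc, int b) {
        crc ^= b;
        for (int k = 0; k < 8; k++) {
            crc = (crc >>> 1) ^ (POLY & -(crc & 1));
        }
        return crc;
    }

    // Bitwise stand-in for an 8-byte CRC32C step (little-endian byte order).
    static int crc32cx(int crc, long data) {
        for (int i = 0; i < 8; i++) {
            crc = crc32cb(crc, (int) (data >>> (8 * i)) & 0xFF);
        }
        return crc;
    }

    // Little-endian 64-bit load from a byte array.
    static long load64(byte[] buf, int off) {
        long v = 0;
        for (int i = 7; i >= 0; i--) {
            v = (v << 8) | (buf[off + i] & 0xFFL);
        }
        return v;
    }

    static int crc32c(byte[] buf, int off, int len) {
        int crc = ~0; // same initial/final inversion as java.util.zip.CRC32C
        if (len >= 128) {
            long[] cur = new long[8];
            long[] next = new long[8];
            // Prologue (assumed shape): preload the first 64-byte block.
            for (int i = 0; i < 8; i++) cur[i] = load64(buf, off + 8 * i);
            off += 64; len -= 64;
            // Main by-64 loop: each load targets the next block, so it does
            // not depend on the CRC step that consumes the current block.
            while (len >= 64) {
                for (int i = 0; i < 8; i++) {
                    next[i] = load64(buf, off + 8 * i);
                    crc = crc32cx(crc, cur[i]);
                }
                long[] t = cur; cur = next; next = t;
                off += 64; len -= 64;
            }
            // Epilogue: fold the last preloaded block.
            for (int i = 0; i < 8; i++) crc = crc32cx(crc, cur[i]);
        }
        // By-32 loop for inputs too short for the pipelined loop, and for leftovers.
        while (len >= 32) {
            for (int i = 0; i < 4; i++) crc = crc32cx(crc, load64(buf, off + 8 * i));
            off += 32; len -= 32;
        }
        while (len >= 8) { crc = crc32cx(crc, load64(buf, off)); off += 8; len -= 8; }
        while (len > 0) { crc = crc32cb(crc, buf[off] & 0xFF); off++; len--; }
        return ~crc;
    }

    public static void main(String[] args) {
        byte[] data = new byte[300];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i * 31 + 7);
        // Compare against the JDK implementation around the loop thresholds.
        for (int len : new int[] {0, 31, 32, 127, 128, 300}) {
            CRC32C ref = new CRC32C();
            ref.update(data, 0, len);
            long mine = crc32c(data, 0, len) & 0xFFFFFFFFL;
            System.out.println(len + ": " + (mine == ref.getValue() ? "match" : "MISMATCH"));
        }
    }
}
```

The lengths in main are chosen to sit on either side of the 32- and 128-byte thresholds, since those are the points where a different loop takes over.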


