[10] RFR: 8189176 - AARCH64: Improve _updateBytesCRC32 intrinsic
Dmitry Chuyko
dmitry.chuyko at bell-sw.com
Wed Oct 11 16:30:54 UTC 2017
Hello,
Please review an improvement of CRC32 calculation on AArch64.
MacroAssembler::kernel_crc32 receives table registers that are unused on
the -XX:+UseCRC32 path. They can serve as extra temporaries so that
neighboring loads and CRC calculations become independent of each other.
Adding a prologue and an epilogue to the main by-64 loop means it only
applies from len=128, so an additional by-32 loop is added for smaller
lengths.
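
To make the loop structure easier to discuss, here is a rough Java model of
it. This is only a sketch and not the code in the webrev: the class
Crc32LoopSketch and its helpers are hypothetical, crc32x is a bitwise
stand-in for the hardware instruction, and the byte-wise tail is a
simplification. It assumes the by-64 loop is a do-while that loads the next
block before consuming the current one, which is one way to see why the path
needs at least 128 bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32;

// Hypothetical model of the loop layout; not the HotSpot implementation.
final class Crc32LoopSketch {
    // Folds one byte into the CRC (reflected polynomial 0xEDB88320).
    static int crc32b(int crc, int b) {
        crc ^= b & 0xFF;
        for (int k = 0; k < 8; k++) {
            crc = (crc >>> 1) ^ (0xEDB88320 & -(crc & 1));
        }
        return crc;
    }

    // Software stand-in for an 8-byte CRC step on a little-endian word.
    static int crc32x(int crc, long data) {
        for (int i = 0; i < 8; i++) {
            crc = crc32b(crc, (int) (data >>> (8 * i)));
        }
        return crc;
    }

    // Loads dst.length 8-byte words starting at pos (models the load step).
    static void load(ByteBuffer bb, int pos, long[] dst) {
        for (int i = 0; i < dst.length; i++) {
            dst[i] = bb.getLong(pos + 8 * i);
        }
    }

    // Consumes already-loaded words (models the CRC step).
    static int consume(int crc, long[] block) {
        for (long w : block) {
            crc = crc32x(crc, w);
        }
        return crc;
    }

    static int update(int crc, byte[] buf) {
        ByteBuffer bb = ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN);
        int pos = 0;
        int len = buf.length;
        crc = ~crc;
        if (len >= 128) {
            long[] cur = new long[8];
            long[] next = new long[8];
            load(bb, pos, cur);               // prologue: first 64 bytes
            pos += 64;
            do {
                load(bb, pos, next);          // load block i+1 ...
                pos += 64;
                crc = consume(crc, cur);      // ... while consuming block i
                long[] t = cur; cur = next; next = t;
            } while (len - pos >= 64);
            crc = consume(crc, cur);          // epilogue: last loaded block
        }
        long[] small = new long[4];
        while (len - pos >= 32) {             // by-32 loop for shorter inputs
            load(bb, pos, small);
            pos += 32;
            crc = consume(crc, small);
        }
        while (pos < len) {                   // simplified byte-wise tail
            crc = crc32b(crc, buf[pos++]);
        }
        return ~crc;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i * 31 + 7);
        CRC32 ref = new CRC32();
        ref.update(data, 0, data.length);
        // Compares the model against java.util.zip.CRC32.
        System.out.println((update(0, data) & 0xFFFFFFFFL) == ref.getValue());
    }
}
```

Lengths around 32, 64 and 128 are the interesting ones to check, since they
switch between the tail, the by-32 loop and the by-64 path.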
rfe: https://bugs.openjdk.java.net/browse/JDK-8189176
webrev: http://cr.openjdk.java.net/~dchuyko/8189176/webrev.00/
benchmark: http://cr.openjdk.java.net/~dchuyko/8189176/crc32/CRC32Bench.java
Results for T88 and A53 are good, but splitting the paired loads may slow
down other CPUs, so measurements on different hardware are highly welcome.
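
For a quick first look on other hardware, something like the following
could be used. It is a crude System.nanoTime sketch, not a replacement for
the benchmark linked above: the class Crc32TimingSketch, the chosen lengths
and the iteration count are all assumptions made for illustration, and the
numbers it prints are only rough indications.

```java
import java.util.zip.CRC32;

// Hypothetical timing helper; unrelated to the linked CRC32Bench.java.
final class Crc32TimingSketch {
    // Returns a rough nanoseconds-per-byte figure for CRC32 over len bytes.
    static double nanosPerByte(int len, int iterations) {
        byte[] data = new byte[len];
        for (int i = 0; i < len; i++) data[i] = (byte) i;
        CRC32 crc = new CRC32();
        long sink = 0;
        // Warm-up pass so the measured pass is more likely to run compiled code.
        for (int i = 0; i < iterations; i++) {
            crc.reset();
            crc.update(data, 0, len);
            sink += crc.getValue();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            crc.reset();
            crc.update(data, 0, len);
            sink += crc.getValue();
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated value so the loops cannot be discarded.
        if (sink == 42) System.out.println(sink);
        return (double) elapsed / ((double) iterations * len);
    }

    public static void main(String[] args) {
        // Lengths chosen to straddle the by-32 and by-64 thresholds.
        int[] lengths = {31, 32, 64, 127, 128, 256, 1024, 65536};
        for (int len : lengths) {
            System.out.printf("len=%d: %.3f ns/byte%n", len, nanosPerByte(len, 100_000));
        }
    }
}
```

Running it with and without the patch on the same machine gives a first
impression; the benchmark above should be preferred for numbers worth
reporting.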
-Dmitry
More information about the hotspot-compiler-dev
mailing list