8248830 : RFR[S] : C2 : Rotate API intrinsification for X86

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Fri Jul 10 17:32:00 UTC 2020


> WebRev: http://cr.openjdk.java.net/~jbhateja/8248830/webrev.01/

Nice work, Jatin!

High-level comment: so far, there was no pressing need to explicitly
mark the methods as intrinsics. ROR/ROL instructions were selected
during matching [1]. Now the patch introduces dedicated nodes
(RotateLeft/RotateRight) specifically for the intrinsics, which partly
duplicates existing logic.

As a consequence, while ROL/ROR instructions can be utilized without 
using the dedicated API methods, auto-vectorization won't handle 
rotations unless the intrinsics are used.

It would be nice to unify the approaches and get rid of the duplication.
(Either by folding scalar operations into Rotate nodes or by extending
the auto-vectorizer to detect vector rotates in the same way scalar
rotates are handled.)
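
To make the two source-level forms concrete, here is a minimal Java
sketch. The class and method names (RotateForms, rotlIdiom, rotlApi,
rotateAll) are made up for illustration and are not taken from the
patch or the benchmark. It only shows that the hand-written shift/or
idiom and the API call compute the same value; which machine code the
JIT actually emits for each is not something this snippet demonstrates.

```java
public class RotateForms {
    // Hand-written shift/or idiom. Java masks int shift counts to
    // 5 bits, so "x >>> -s" behaves like "x >>> (32 - s)" for s in 1..31
    // and like "x >>> 0" for s == 0.
    static int rotlIdiom(int x, int s) {
        return (x << s) | (x >>> -s);
    }

    // Same operation expressed through the dedicated API method.
    static int rotlApi(int x, int s) {
        return Integer.rotateLeft(x, s);
    }

    // Loop shape of the kind auto-vectorization would look at
    // (hypothetical example, written with the API method).
    static void rotateAll(int[] src, int[] dst, int s) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Integer.rotateLeft(src[i], s);
        }
    }

    public static void main(String[] args) {
        int x = 0x12345678;
        // Both forms agree for every shift distance in 0..31.
        for (int s = 0; s < 32; s++) {
            if (rotlIdiom(x, s) != rotlApi(x, s)) {
                throw new AssertionError("mismatch at shift " + s);
            }
        }
        System.out.println("idiom and API agree");
    }
}
```

With the patch as posted, a loop written with rotlIdiom in place of
Integer.rotateLeft would be the case that stays scalar.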

Otherwise, looks good. I'll submit it for testing.

Minor comments:

src/hotspot/share/opto/countbitsnode.hpp

Though the nodes look like they are in the right company, formally
speaking, RotateLeft/RotateRight aren't subtypes of CountBitsNode. Maybe
rename countbitsnode.hpp or move the RotateLeft/RotateRight declarations
to src/hotspot/share/opto/intrinsicnode.hpp?

Best regards,
Vladimir Ivanov

[1] http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/cpu/x86/x86_64.ad#l8970
> 
> AVX512 offers 8 new vector rotate instructions [1], these can accept both immediate and variable rotate count
> arguments. Patch exploits both these flavors of instructions.
> 
> Following are the benchmarks results
> 
> Before:
> UseAVX=3
> Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score   Error   Units
> RotateBenchmark.testRotateLeftI        20        1024  thrpt    2  13336.170          ops/ms
> RotateBenchmark.testRotateLeftL        20        1024  thrpt    2   8897.930          ops/ms
> RotateBenchmark.testRotateRightI       20        1024  thrpt    2  13447.273          ops/ms
> RotateBenchmark.testRotateRightL       20        1024  thrpt    2   8783.535          ops/ms
> 
> After:
> UseAVX=3
> Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score   Error   Units
> RotateBenchmark.testRotateLeftI        20        1024  thrpt    2  20438.609          ops/ms
> RotateBenchmark.testRotateLeftL        20        1024  thrpt    2  11238.110          ops/ms
> RotateBenchmark.testRotateRightI       20        1024  thrpt    2  20306.805          ops/ms
> RotateBenchmark.testRotateRightL       20        1024  thrpt    2  11190.639          ops/ms
> 
> Kindly review the patch.
> 
> Best Regards,
> Jatin
> 
> [1] : https://www.felixcloutier.com/x86/vprold:vprolvd:vprolq:vprolvq
>               https://www.felixcloutier.com/x86/vprord:vprorvd:vprorq:vprorvq
> 
> 

