8248830 : RFR[S] : C2 : Rotate API intrinsification for X86
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Fri Jul 10 17:32:00 UTC 2020
> WebRev: http://cr.openjdk.java.net/~jbhateja/8248830/webrev.01/
Nice work, Jatin!
High-level comment: so far, there has been no pressing need to explicitly
mark the methods as intrinsics. ROR/ROL instructions were selected
during matching [1]. Now the patch introduces dedicated nodes
(RotateLeft/RotateRight) specifically for the intrinsics, which partly
duplicates existing logic.
As a consequence, while ROL/ROR instructions can be utilized without
using the dedicated API methods, auto-vectorization won't handle
rotations unless the intrinsics are used.
It would be nice to unify the approaches and get rid of the duplication,
either by folding scalar operations into Rotate nodes or by extending the
auto-vectorizer to detect vector rotates in the same way scalar rotates
are handled.
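To make the two source-level forms concrete, here is a minimal sketch of
what I mean. The class and method names are made up for illustration and
are not part of the patch. The shift/or form is the kind of pattern the
matcher can already turn into ROL, while the API form is what the new
nodes are created for. Whether a given loop actually gets vectorized
depends on the JVM build and flags, so the sketch only checks that both
forms compute the same values.

```java
public class RotateIdioms {
    // Shift/or idiom. Java masks int shift counts to the low 5 bits,
    // so (x >>> -s) shifts right by (32 - s) & 31.
    static int rotlIdiom(int x, int s) {
        return (x << s) | (x >>> -s);
    }

    // Dedicated API method, the candidate for intrinsification.
    static int rotlApi(int x, int s) {
        return Integer.rotateLeft(x, s);
    }

    // Array loop using the idiom (hypothetical auto-vectorization candidate).
    static void rotlArrayIdiom(int[] src, int[] dst, int s) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = (src[i] << s) | (src[i] >>> -s);
        }
    }

    // Same loop written with the API method.
    static void rotlArrayApi(int[] src, int[] dst, int s) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Integer.rotateLeft(src[i], s);
        }
    }

    public static void main(String[] args) {
        int[] src = {1, -1, 0x80000001, 12345};
        int[] a = new int[src.length];
        int[] b = new int[src.length];
        rotlArrayIdiom(src, a, 20);
        rotlArrayApi(src, b, 20);
        System.out.println(java.util.Arrays.equals(a, b));
    }
}
```

Since both loops are semantically identical, ideally they would end up
with the same IR shape regardless of which one the user wrote.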
Otherwise, looks good. I'll submit it for testing.
Minor comments:
src/hotspot/share/opto/countbitsnode.hpp
Though the nodes look like they are in the right company, formally speaking,
RotateLeft/RotateRight aren't subtypes of CountBitsNode. Maybe rename
countbitsnode.hpp or move the RotateLeft/RotateRight declarations to
src/hotspot/share/opto/intrinsicnode.hpp?
Best regards,
Vladimir Ivanov
[1]
http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/cpu/x86/x86_64.ad#l8970
>
> AVX512 offers 8 new vector rotate instructions [1], which accept both immediate and variable rotate count
> arguments. The patch exploits both flavors of these instructions.
>
> The benchmark results are as follows:
>
> Before:
> UseAVX=3
> Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score  Error   Units
> RotateBenchmark.testRotateLeftI        20        1024  thrpt    2  13336.170         ops/ms
> RotateBenchmark.testRotateLeftL        20        1024  thrpt    2   8897.930         ops/ms
> RotateBenchmark.testRotateRightI       20        1024  thrpt    2  13447.273         ops/ms
> RotateBenchmark.testRotateRightL       20        1024  thrpt    2   8783.535         ops/ms
>
> After:
> UseAVX=3
> Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score  Error   Units
> RotateBenchmark.testRotateLeftI        20        1024  thrpt    2  20438.609         ops/ms
> RotateBenchmark.testRotateLeftL        20        1024  thrpt    2  11238.110         ops/ms
> RotateBenchmark.testRotateRightI       20        1024  thrpt    2  20306.805         ops/ms
> RotateBenchmark.testRotateRightL       20        1024  thrpt    2  11190.639         ops/ms
>
> Kindly review the patch.
>
> Best Regards,
> Jatin
>
> [1] : https://www.felixcloutier.com/x86/vprold:vprolvd:vprolq:vprolvq
> https://www.felixcloutier.com/x86/vprord:vprorvd:vprorq:vprorvq
>
>