RFR: 8257483: C2: Split immediate vector rotate from RotateLeftV and RotateRightV nodes
Dong Bo
dongbo at openjdk.java.net
Wed Dec 2 06:39:55 UTC 2020
On Tue, 1 Dec 2020 23:56:11 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:
> Why do you introduce new ideal nodes (`RotateLeftImmV`/`RotateRightImmV`)? You already extended `is_vector_rotate_supported` to take the shift argument into account.
>
I think extended `is_vector_rotate_supported` can't offer what we are going after.
Aarch64 supports `OrV/LShiftV/URShiftV`, so `is_vector_rotate_supported` returns true anyway with or without `Rotate*ImmV`.
If we do not introduce `RotateLeftImmV/RotateRightImmV` nodes, `RotateLeftV/RotateRightV`nodes will be created for both immediate and variable rotation.
As mentioned before, variable rotation matching rules should no exsit in `aarch64.ad`, due to the performance regression.
It leads to a crash at `src/hotspot/share/opto/matcher.cpp:1657` with message `assert(false) failed: bad AD file`.
If extended `is_vector_rotate_supported` return false anyhow for variable rotations like:
bool VectorNode::is_vector_rotate_supported(int vopc, uint vlen, BasicType bt, bool is_imm) {
assert(vopc == Op_RotateLeftV || vopc == Op_RotateRightV, "wrong opcode");
+ AARCH64_ONLY(if(!is_imm) return false);
+
// Get the opcode for rotate immediate
if (is_imm) {
vopc = vopc == Op_RotateLeftV ? Op_RotateLeftImmV : Op_RotateRightImmV;
The variable roations would not be vectorized by `OrV/LShiftV/URShiftV` combinations.
With `RotateLeftImmV/RotateRightImmV` nodes, we can match them separately and optimized in a CPU and keep the variable rotation the same as before.
I'am sorry if I miss something.
> Also, you mention aarch64, but I see only x86 changes.
>
I tried to optimize immediate vector rotations with shift&insert in JDK-8256820 for aarch64.
WIth `RotateLeftImmV/RotateRightImmV`, an initial optimization for aarch64 is shown in attachment [optimize_rotateImm_aarch64.txt](https://github.com/openjdk/jdk/files/5627342/optimize_rotateImm_aarch64.txt)
, tests are good.
But I also want to discuss some other modifications (for aarch64 only) in that issue, e.g. trivial fix for absolute and accumulate with small vector length, move shifting operations to aarch64_neon.ad, etc.
To make this progress clean and clear, and I think this splitting immediate roatation can be discussed separately, so I raised up this new PR. Hope this would not be a bother.
-------------
PR: https://git.openjdk.java.net/jdk/pull/1532
More information about the hotspot-compiler-dev
mailing list