RFR: 8257483: C2: Split immediate vector rotate from RotateLeftV and RotateRightV nodes

Wed Dec 2 11:54:54 UTC 2020

On Wed, 2 Dec 2020 11:17:26 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>>> Why do you introduce new ideal nodes (`RotateLeftImmV`/`RotateRightImmV`)? You already extended `is_vector_rotate_supported` to take the shift argument into account.
>>> 
>> 
>> I think extended `is_vector_rotate_supported` can't offer what we are going after.
>> 
>> Aarch64 supports `OrV/LShiftV/URShiftV`, so `is_vector_rotate_supported`  returns true anyway with or without `Rotate*ImmV`.
>> If we do not introduce `RotateLeftImmV/RotateRightImmV` nodes, `RotateLeftV/RotateRightV`nodes will be created for both immediate and variable rotation.
>> As mentioned before, variable rotation matching rules should no exsit in `aarch64.ad`, due to the performance regression.
>> It leads to a crash at `src/hotspot/share/opto/matcher.cpp:1657` with message `assert(false) failed: bad AD file`.
>> 
>> If extended `is_vector_rotate_supported` return false anyhow for variable rotations like:
>>  bool VectorNode::is_vector_rotate_supported(int vopc, uint vlen, BasicType bt, bool is_imm) {
>>    assert(vopc == Op_RotateLeftV || vopc == Op_RotateRightV, "wrong opcode");
>> 
>> +  AARCH64_ONLY(if(!is_imm) return false);
>> +
>>    // Get the opcode for rotate immediate
>>    if (is_imm) {
>>      vopc = vopc == Op_RotateLeftV ? Op_RotateLeftImmV : Op_RotateRightImmV;
>> The variable roations would not be vectorized by `OrV/LShiftV/URShiftV` combinations.
>> 
>> With `RotateLeftImmV/RotateRightImmV` nodes, we can match them separately and optimized in a CPU and keep the variable rotation the same as before.
>> I'am sorry if I miss something.
>> 
>>> Also, you mention aarch64, but I see only x86 changes.
>>>
>> 
>> I tried to optimize immediate vector rotations with shift&insert in JDK-8256820 for aarch64.
>> WIth `RotateLeftImmV/RotateRightImmV`,  an initial optimization for aarch64 is shown in attachment [optimize_rotateImm_aarch64.txt](https://github.com/openjdk/jdk/files/5627342/optimize_rotateImm_aarch64.txt)
>> , tests are good.
>> But I also want to discuss some other modifications (for aarch64 only) in that issue, e.g. trivial fix for absolute and accumulate with small vector length, move shifting operations to aarch64_neon.ad, etc.
>> To make this progress clean and clear, and I think this splitting immediate roatation can be discussed separately, so I raised up this new PR. Hope this would not be a bother.
>
> Thanks for the detailed answer, but I'm still confused why `RotateLeftImmV` is more powerful  than `RotateLeftV` + a check that shift is constant. Why can't `RotateLeftV`/`RotateRightV` vector nodes be allowed on some platforms only with constant shifts?

It looks like you implicitly depend on the check that there are rules present which match that particular opcode:
const bool Matcher::match_rule_supported(int opcode) {
  if (!has_match_rule(opcode)) {
    return false; // no match rule present
  }

But you don't have to and I don't see why additional checks in `VectorNode::is_vector_rotate_supported()` can't be extended to cover that for you.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1532