RFR: 8321308: AArch64: Fix matching predication for cbz/cbnz

Andrew Dinn adinn at openjdk.org
Mon Feb 12 11:24:03 UTC 2024


On Fri, 9 Feb 2024 20:29:59 GMT, Dean Long <dlong at openjdk.org> wrote:

>> @dean-long I am not sure what you mean by 'so that it uses the unsigned condition code values'. Are you suggesting that it would be inappropriate for a `CmpU` node to employ an `le` condition? That is certainly a possibility that arises with the current opto code.
>> 
>> n.b. Field '$cmp$$cmpcode' of the CmpUNode is actually a value of the enum type `BoolTest`. So, strictly the cast on line 16187 ought to perform the conversion using
>> 
>>     Assembler::Condition cond = to_assembler_cond(cmp$$cmpcode);
>> 
>> but the result will be the same since the values in enum `Assembler::Condition` mirror those in enum `BoolTest`.
>> 
>> In this specific case the compiler installs `BoolTest::gt` into the `CmpUNode` during bytecode parsing because that is the test specified in the example code provided by @fg1417. This condition will be flipped to `BoolTest::le` as part of normalization of the resulting `IfNode` (the true and false branches attached to the if are switched as part of this change). So, after parsing and normalization we end up with a node that looks like `CmpU[le](LoadRange, IntConstant(0))` and this is what is embedded as an operand of the `If` node that gets considered as a match candidate by the current rule.
>> 
>> So, while there is nothing to stop an unsigned comparison passed through to the back end from employing a `BoolTest::le` comparison against 0, by contrast the compiler should never pass through a `BoolTest::lt` unsigned comparison against 0 because it will always evaluate to false. I am not certain but I believe from my brief look at the code that this case gets detected as part of the IfNode normalization and the if is eliminated as a dead node.
>> 
>> Granted then that the back end can see `CmpU[le](op1, IntConstant(0))`, the current operand and rule do not cover that case. Instead of `LE` they actually test for an `LT` against 0 -- which we are never going to match. If we change the rule to match for `LE` then it *will* match and insert an appropriate compare and branch. With the current match failure the `CmpU` and `If` get translated independently leading to the inefficiency @fg1417 describes.
>
> No, sorry for the confusion, that's not what I meant.  I'm OK with the backend using BoolTest values, but if it is going to use Assembler::Condition values, why not use LO/LS/HI/HS like operand cmpOpU does?  To be specific, I'm talking about the interface(COND_INTER)  table.

@dean-long Ah, apologies for missing your point. Yes, I looked into how the generated ad code uses the interface definitions to translate the BoolTest codes and what you say makes sense.

Operand `CmpOpUEqNeLeGt` really ought to be translating the `BoolTest` values using the same translation scheme as `CmpOpU` i.e.

    0 == BoolTest::eq          --> equal         --> 0 == Assembler::Condition::EQ
    1 == BoolTest::gt          --> greater       --> 8 == Assembler::Condition::HI
    3 == BoolTest::lt          --> less          --> 3 == Assembler::Condition::LO
    4 == BoolTest::ne          --> not_equal     --> 1 == Assembler::Condition::NE
    5 == BoolTest::le          --> less_equal    --> 9 == Assembler::Condition::LS
    7 == BoolTest::ge          --> greater_equal --> 2 == Assembler::Condition::HS
    2 == BoolTest::overflow    --> overflow      --> 6 == Assembler::Condition::VS
    6 == BoolTest::no_overflow --> no_overflow   --> 7 == Assembler::Condition::VC

These are the correct codes for an unsigned comparison whether or not it is comparing two values or one value against zero.

The current (revised) definition of `CmpOpUEqNeLeGt` is translating the `BoolTest` values as follows:

    0 == BoolTest::eq          --> equal         -->  0 == Assembler::Condition::EQ
    1 == BoolTest::gt          --> greater       --> 12 == Assembler::Condition::GT
    3 == BoolTest::lt          --> less          --> 11 == Assembler::Condition::LT
    4 == BoolTest::ne          --> not_equal     -->  1 == Assembler::Condition::NE
    5 == BoolTest::le          --> less_equal    --> 13 == Assembler::Condition::LE
    7 == BoolTest::ge          --> greater_equal --> 10 == Assembler::Condition::GE
    2 == BoolTest::overflow    --> overflow      -->  6 == Assembler::Condition::VS
    6 == BoolTest::no_overflow --> no_overflow   -->  7 == Assembler::Condition::VC

The amended rules propose that when we have `EQ` or `LE` we generate `cbz` and otherwise (when we have `NE` or `GT`) we generate `cbnz`. However, the rules do not really need changing.

If we amend the `interface(COND_ITER)` definition for `CmpOpUEqNeLeGt` so it is the same as the one provided for `CmpOpU` then the `BoolTest` conditions get translated into the correct Assembler conditions for an unsigned comparison.

    operand cmpOpUEqNeLeGt()
    %{
      match(Bool);
      op_cost(0);
    
      predicate(n->as_Bool()->_test._test == BoolTest::eq
                || n->as_Bool()->_test._test == BoolTest::ne
                || n->as_Bool()->_test._test == BoolTest::le
                || n->as_Bool()->_test._test == BoolTest::gt);
    
      format %{ "" %}
      interface(COND_INTER) %{
        equal(0x0, "eq");
        not_equal(0x1, "ne");
        less(0x3, "lo");
        greater_equal(0x2, "hs");
        less_equal(0x9, "ls");
        greater(0x8, "hi");
        overflow(0x6, "vs");
        no_overflow(0x7, "vc");
      %}
    %}

The above change means both the current rules that perform unsigned compare against zero are correct. When we find `EQ` or `LS` we generate `cbz` and otherwise (meaning we have `NE` or `HI`) we would generate `cbnz` e.g.

    instruct cmpUI_imm0_branch(cmpOpUEqNeLtGe cmp, iRegIorL2I op1, immI0 op2, label labl, rFlagsRegU cr) %{
      match(If cmp (CmpU op1 op2));
      effect(USE labl);
    
      ins_cost(BRANCH_COST);
      format %{ "cbw$cmp   $op1, $labl" %}
      ins_encode %{
        Label* L = $labl$$label;
        Assembler::Condition cond = (Assembler::Condition)$cmp$$cmpcode;
        if (cond == Assembler::EQ || cond == Assembler::LS)
          __ cbzw($op1$$Register, *L);
        else
          assert (cond == Assembler::NE || cond == Assembler::HI);
          __ cbnzw($op1$$Register, *L);
      %}
      ins_pipe(pipe_cmp_branch);
    %}

So, neither of the rules needs 'fixing'. They would only benefit from one change, the addition of an assert for the else case.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16989#discussion_r1486045573


More information about the hotspot-compiler-dev mailing list