RFR: 8249893: AARCH64: optimize the construction of the value from the bits of the other two [v5]

Andrew Dinn adinn at openjdk.java.net
Wed Nov 4 17:44:58 UTC 2020


On Tue, 3 Nov 2020 20:56:14 GMT, Boris Ulasevich <bulasevich at openjdk.org> wrote:

>> Let me revive the change request [3] to C2 and AArch64 that applies Bitfield Insert instruction in the expression "(v1 & 0xFF) | ((v2 & 0xFF) << 8)". 
>>  
>> Compared to the last round of review [2] I updated the transformation to apply BFI in more cases and added a jtreg test.
>> 
>> As before, compared to the original patch [1], the transformation logic is now in the common C2 code: a new BitfieldInsert node has been introduced to replace Or+Shift+And sequence when possible, on AARCH a single BFI instruction is emitted for the new node.
>>  
>> [1] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-July/039161.html 
>> [2] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-August/039653.html
>> [3] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-August/039792.html
>
> Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update comments and function name to make them clearer

src/hotspot/share/opto/addnode.cpp line 872:

> 870:           return new BitfieldInsertINode(dst, value, phase->intcon(offset), phase->intcon(width));
> 871:         }
> 872:       }

This code and its accompanying comments need to be made much clearer:

 - `dst` and `src` is a rather perverse choice of names for the inputs of the Or node. `l` and `r` or `left` and `right`,  following the convention in the preceding code, would be better.

 - You mention `dst` in the comment to identify it as the left `Or` input but do not clearly identify `src` with the two alternative matched patterns for the the right hand side of the `Or`.

 - The name `shift `is used in those two patterns for an operand that actually has to be a constant bit mask for the transformation to be applicable (so why not use `mask`, `1s_mask` or `constmask`?).

 - Your comment incoherently employs different notations: i.e. you use a term for the `BitfieldInsert` expression and refer to the outermost node using the term name `Or` but you specify the patterns using the infix C language operators `&` and `<<`.

 - The qualifiying comment 'if the Or argument value range masks do not overlap' states a condition for the replacement without properly explaining the meaning of that condition i.e. that the various subexpressions operate on disjoint bitfields of the integral value being computed.

 - You use var `mask` for the const mask node and then reuse it for the constant value it identifies (directly after using `mask` to compute `width` which is used to define `mask`, making it look like mask is used to define itself).

 - Most importantly, the introductory comment does not provide a clear summary of what sort of graph shape is being replaced and how the match and bit range constraints legitimize that replacement.

I would suggest the following as a replacement:

    if (can_reshape && !phase->C->major_progress() && Matcher::match_rule_supported(Op_BitfieldInsertI)) {
      // If the right input of this Or is an And with mask or an LShifted
      // And with mask and the left and right inputs can be determined
      // to construct values lying in disjoint bit ranges then the Or
      // can be replaced with BitfieldInsert.
      //
      // There are two substitution rules:
      //
      // 1) (Or left (And value mask)) => (BitfieldInsert left value width 0))
      //    where width == bitcount(mask) AND
      //         (value_range_mask(left) & mask) == 0
      //
      // 2) (Or left (LShift (And value mask) offset) => (BitfieldInsert left value width 0)
      //    where width == bitcount(mask) AND
      //          (value_range_mask(left) & (mask << offset)) == 0
      // n.b.
      // mask is an integer constant comprising a contiguous sequence of 1s
      // value_range_mask(node) computes a mask identifying the range of bits
      // that could be set by its argument

      Node *left = in(1);
      Node *right = in(2);
      Node *andi = NULL;
      int offset = 0;

      if (right->Opcode() == Op_LShiftI && right->in(1)->Opcode() == Op_AndI && right->in(2)->is_Con()) {
        andi   = right->in(1);
        offset = right->in(2)->get_int();
      } else if (right->Opcode() == Op_AndI) {
        andi = right;
      }
      if (andi != NULL) {
        Node* mask  = andi->in(2);
        if (mask->is_Con() && is_power_of_2(mask->get_int() + 1)) {
          Node* value = andi->in(1);
          int width = exact_log2(mask->get_int() + 1);
          int maskval = ((1 << width) - 1) << offset;
          if (width + offset <= 32 && ((value_range_mask(phase, left) & maskval) == 0)) {
            return new BitfieldInsertINode(left, value, phase->intcon(offset), phase->intcon(width));
          }
        }

Note that I am reusing the SEXPR format used in the ad files to describe the graph patterns and using (C-like) pseudo-code to define the substitution conditions.

src/hotspot/share/opto/addnode.cpp line 963:

> 961:       }
> 962:     }
> 963:   }

This needs reworking as above. I suggest you comment it by referring the reader back to the previous method.

-------------

PR: https://git.openjdk.java.net/jdk/pull/511


More information about the hotspot-compiler-dev mailing list