RFR: 8332856: C2: Add new transform for bool eq/ne (cmp (and (urshift X const1) const2) 0)

Tobias Hotz duke at openjdk.org
Tue May 28 18:04:04 UTC 2024


On Tue, 28 May 2024 16:26:05 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> This PR adds a new ideal optimization for the following pattern:
>> 
>> public boolean testFunc(int a) {
>>     int mask = 0b101;
>>     int shift = 12;
>>     return ((a >> shift) & mask) == 0;
>> }
>> 
>> Where the mask and shift are constant values and a is a variable. For this optimization to work, the right shift has to be idealized to a unsinged right shift earlier in the pipeline, which here: https://github.com/openjdk/jdk/blob/b92bd671835c37cff58e2cdcecd0fe4277557d7f/src/hotspot/share/opto/mulnode.cpp#L731
>> If the shift is already an unsiged bit shift, it works as well.
>> On AMD64 CPUs, this means that this whole line computation can be reduced to a simple `test` instruction.
>
> I'm looking at `AndINode::Ideal`.
> 
> `AndIL_add_shift_and_mask` seems to do something similar, but not exactly this. Maybe it could be extended.
> 
> Ha, what about this?
> 
>   // Masking off sign bits?  Dont make them!
>   if( lop == Op_RShiftI ) {
>     const TypeInt *t12 = phase->type(load->in(2))->isa_int();
>     if( t12 && t12->is_con() ) { // Shift is by a constant
>       int shift = t12->get_con();
>       shift &= BitsPerJavaInteger-1;  // semantics of Java shifts
>       const int sign_bits_mask = ~right_n_bits(BitsPerJavaInteger - shift);
>       // If the AND'ing of the 2 masks has no bits, then only original shifted
>       // bits survive.  NO sign-extension bits survive the maskings.
>       if( (sign_bits_mask & mask) == 0 ) {
>         // Use zero-fill shift instead
>         Node *zshift = phase->transform(new URShiftINode(load->in(1),load->in(2)));
>         return new AndINode( zshift, in(2) );
>       }
>     }
>   }

Hi @eme64! Thanks for the quick look.
This patch only transforms bool eq/ne 0 cases because the general transformation of `AndI(Shift X shift, mask)` -> `AndI(X, new_mask)` is not valid. As a small example on 4-bit numbers, `(0b0010 >>> 1) & 0b1` would result in `0b0001`, but `0b0010 & 0b0010` would result in `0b0010`, which is obviously not that same result.
It always works for scenarios where only a eq/ne test against zero is performed, as the position of the bits doesn't matter there.

The transform of `RShift` nodes to `URShiftI` Nodes in the `AndI` Ideal func actually already helps with this transform, as we only need to match `URShiftI` now.

As for an example: Take a look at java.awt.Color, especially the `getGreen` or similar functions:
https://github.com/openjdk/jdk/blob/da6aa2a86c86ba5fce747b36dcb2d6001cfcc44e/src/java.desktop/share/classes/java/awt/Color.java#L548-L550
If a caller only wants to check that if the green color channel is present, it would call `color.getGreen() != 0`. After inlining, this optimization could be applied in this case, removing the need for the shift. I've also seen this optimization being triggered on real-life applications and will provide a benchmark shortly.

I hope this answers your questions/concerns!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19310#issuecomment-2135829419


More information about the hotspot-compiler-dev mailing list