RFR: 8332856: C2: Add new transform for bool eq/ne (cmp (and (urshift X const1) const2) 0) [v2]
Andrew Haley
aph at openjdk.org
Sun Jun 30 16:46:20 UTC 2024
On Tue, 28 May 2024 20:11:35 GMT, Tobias Hotz <duke at openjdk.org> wrote:
>> This PR adds a new ideal optimization for the following pattern:
>>
>> public boolean testFunc(int a) {
>>     int mask = 0b101;
>>     int shift = 12;
>>     return ((a >> shift) & mask) == 0;
>> }
>>
>> Here the mask and shift are constant values and a is a variable. For this optimization to work, the right shift has to be idealized to an unsigned right shift earlier in the pipeline, which happens here: https://github.com/openjdk/jdk/blob/b92bd671835c37cff58e2cdcecd0fe4277557d7f/src/hotspot/share/opto/mulnode.cpp#L731
>> If the shift is already an unsigned right shift, it works as well.
>> On AMD64 CPUs, this means that the whole computation can be reduced to a simple `test` instruction.
>
> Tobias Hotz has updated the pull request incrementally with two additional commits since the last revision:
>
> - LF endings...
> - Add a benchmark to measure effect of new ideal transformation
There is little to no speed difference on a fast AArch64 system (Apple M3) because the benchmark is dominated by the latency of loading the data from memory, which is far longer than the time to do the work. However, the code is smaller, so this is a win. Before (left) and after (right):
add x17, x15, w7, sxtw #2;*ia add x3, x7, w13, sxtw #2;
;
;
ldr w10, [x17, #0x10] ldr w15, [x3, #0x10]
ldr w12, [x17, #0x2c]
ldp w1, w21, [x3, #0x24]
ldp w1, w11, [x17, #0x14] ldr w2, [x3, #0x20]
ldr w16, [x3, #0x2c]
ldp w14, w13, [x17, #0x1c] tst w15, #0xff00
ldr w14, [x3, #0x14]
ldp w16, w17, [x17, #0x24] cset w15, eq ;
ubfx w2, w17, #8, #8
ubfx w17, w1, #8, #8
ubfx w3, w10, #8, #8
ubfx w1, w11, #8, #8 ldp w0, w17, [x3, #0x18]
ubfx w10, w12, #8, #8 tst w21, #0xff00
ubfx w11, w14, #8, #8 cset w3, eq
ubfx w12, w16, #8, #8 tst w1, #0xff00
ubfx w14, w13, #8, #8 cset w15, eq
cmp w3, #0 tst w2, #0xff00
cset w13, eq cset w2, eq
cmp w12, #0 tst w17, #0xff00
cset w12, eq cset w17, eq
cmp w14, #0 tst w0, #0xff00
cset w14, eq cset w1, eq
cmp w11, #0 tst w14, #0xff00
cset w11, eq cset w14, eq
tst w16, #0xff00 ;
cmp w1, #0
cset w16, eq
cmp w17, #0 add w13, w13, #8 ;
cset w13, eq
cmp w10, #0
cset w0, eq ;
cset w17, eq
cmp w13, w12
b.lt #0x1100dd420 ;
cmp w2, #0
add w7, w7, #8
cset w10, eq
cmp w7, w22
b.lt #0x10be2ea80
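
For reference, the identity behind the listing above (ubfx #8, #8 followed by cmp #0 becoming tst #0xff00) can be checked at the Java level. This is only a sketch of the arithmetic, not the C2 implementation: the class and method names are made up for illustration, and it assumes an unsigned shift with a shift count in the range 0..31.

```java
import java.util.Random;

// Hypothetical helper class, not part of the PR.
public class ShiftMaskEquivalence {
    // The shape the PR describes: unsigned shift right, mask, compare with zero.
    static boolean shiftThenMask(int a, int shift, int mask) {
        return ((a >>> shift) & mask) == 0;
    }

    // Single-mask form: shift the constant mask left instead of shifting a right.
    // Assumes 0 <= shift <= 31; mask bits shifted past bit 31 are dropped, which
    // matches the bits of a that the unsigned right shift can never supply.
    static boolean maskOnly(int a, int shift, int mask) {
        return (a & (mask << shift)) == 0;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        // {shift, mask} pairs: the PR example, the pair seen in the assembly, and edge cases.
        int[][] cases = { {12, 0b101}, {8, 0xff}, {0, 1}, {31, 1} };
        for (int[] c : cases) {
            for (int i = 0; i < 100_000; i++) {
                int a = rnd.nextInt();
                if (shiftThenMask(a, c[0], c[1]) != maskOnly(a, c[0], c[1])) {
                    throw new AssertionError("mismatch for a=" + a
                            + " shift=" + c[0] + " mask=" + c[1]);
                }
            }
        }
        System.out.println("ok");
    }
}
```

With shift 8 and mask 0xff the pre-shifted mask is 0xff00, which is the immediate in the tst instructions on the right. The sketch uses >>> throughout; for a signed >> the same result holds only when the mask does not select any sign-extended bits, which is presumably why the PR relies on the earlier conversion to an unsigned shift.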
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19310#issuecomment-2198617427