RFR: 8332856: C2: Add new transform for bool eq/ne (cmp (and (urshift X const1) const2) 0) [v2]
Jasmine Karthikeyan
jkarthikeyan at openjdk.org
Sat Jun 29 18:11:21 UTC 2024
On Tue, 28 May 2024 20:11:35 GMT, Tobias Hotz <duke at openjdk.org> wrote:
>> This PR adds a new ideal optimization for the following pattern:
>>
>> public boolean testFunc(int a) {
>> int mask = 0b101;
>> int shift = 12;
>> return ((a >> shift) & mask) == 0;
>> }
>>
>> Where the mask and shift are constant values and a is a variable. For this optimization to work, the right shift has to be idealized to an unsigned right shift earlier in the pipeline, which happens here: https://github.com/openjdk/jdk/blob/b92bd671835c37cff58e2cdcecd0fe4277557d7f/src/hotspot/share/opto/mulnode.cpp#L731
>> If the shift is already an unsigned bit shift, it works as well.
>> On AMD64 CPUs, this means that the whole computation can be reduced to a simple `test` instruction.
>
> Tobias Hotz has updated the pull request incrementally with two additional commits since the last revision:
>
> - LF endings...
> - Add a benchmark to measure effect of new ideal transformation
I had a chance to test the patch on a Raspberry Pi 4B (Cortex-A72 CPU), and I got these results on the attached benchmark:
                                                          Baseline                   Patch             Improvement
Benchmark                                   Mode  Cnt    Score   Error  Units    Score   Error  Units
ShiftAndCmpZeroIdeal.testConstantShiftMask  avgt    9  232.528 ± 2.385  ns/op  166.497 ± 2.292  ns/op    + 33.1%
So it does seem like there's a similar benefit there! I think it would be nice to test RISC-V as well. FWIW, it would be possible to match this pattern in the ad files with a `predicate(...)` block, like how we match `cmov(0, 1)` into a `setcc` instruction ([here](https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L6126-L6130)). Though, if the pattern is beneficial on most architectures it might be easier to keep it in the middle-end.
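For anyone following along, here is a minimal sketch of why the quoted pattern can end up as a single `test`. It assumes the transform amounts to moving the shift onto the constant mask, i.e. rewriting `((a >>> shift) & mask) == 0` as `(a & (mask << shift)) == 0`; that is my reading of the description, not something taken from the patch itself. The class and method names are made up for illustration.

```java
import java.util.Random;

public class ShiftMaskEquivalence {
    // Shape from the PR description, with the shift already unsigned.
    static boolean shifted(int a, int shift, int mask) {
        return ((a >>> shift) & mask) == 0;
    }

    // Assumed folded shape: the shift is applied to the constant mask instead,
    // so at run time only an AND against a constant and a zero check remain.
    static boolean folded(int a, int shift, int mask) {
        return (a & (mask << shift)) == 0;
    }

    public static void main(String[] args) {
        int mask = 0b101;
        int shift = 12;
        Random rng = new Random(42);
        // Spot-check the two shapes against each other on random inputs.
        for (int i = 0; i < 1_000_000; i++) {
            int a = rng.nextInt();
            if (shifted(a, shift, mask) != folded(a, shift, mask)) {
                throw new AssertionError("mismatch for a = " + a);
            }
        }
        System.out.println("no mismatch found");
    }
}
```

With `mask = 0b101` and `shift = 12`, the folded constant is `0x5000`, so the check only looks at bits 12 and 14 of `a`. This is a randomized spot check rather than a proof.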
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19310#issuecomment-2198280869