RFR: 8324655: Identify integer minimum and maximum patterns created with if statements [v3]
Jasmine Karthikeyan
jkarthikeyan at openjdk.org
Tue Feb 27 18:26:54 UTC 2024
On Tue, 27 Feb 2024 17:59:10 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> @eme64 I've designed this benchmark, which has a cheap common path and an expensive rare path:
>>
>> private static final Random RANDOM = new Random();
>> @Benchmark
>> public void testCheapCommon(Blackhole blackhole) {
>> for (int i = 0; i < 300; i++) {
>> int a = i & 1; // Cheap
>> int b = RANDOM.nextInt(33); // Expensive
>> int c = b >= a ? a : b;
>> blackhole.consume(c + 20);
>> }
>> }
>>
>> The condition `b >= a` should be true in the majority of cases, selecting `a`. I made sure that C2 was creating a MinI here, and it was. Here are the results I got:
>>
>> Benchmark Mode Cnt Score Error Units
>> IfMinMax.testCheapCommon avgt 15 1127.771 ± 10.838 ns/op // Baseline
>> IfMinMax.testCheapCommon avgt 15 1104.754 ± 4.097 ns/op // Patch
>>
>> It seems that there isn't a regression here, and is in fact marginally better (but this may just be noise). Is this what you had in mind for the benchmark?
>>
>>> I'm asking @jaskarth if it is easier to transform a CMove into a Min/Max instead of trying to look for matching Phis.
>>
>> @merykitty I did a bit of experimentation with the current benchmarks, and it seems that in many cases the `min/max` patterns aren't turned into `CMove`s, as the `CMove` heuristic is generally quite conservative. Since we're interested in a subset of `CMove`s that are simpler, I think it would be better to group this optimization with the others that operate on CFG diamonds, to find more cases that could be optimized. Though, in general I think this group of optimizations should be refactored to apply to both CFG diamonds and `CMove`s, since a `CMove` is a sort of CFG diamond in itself- but I think this cleanup should be done in a future RFE.
>>
>>> While you are handling this, following Identity transforms can be added to integer and long Max/Min.
>>
>> @jatin-bhateja I think this is a good idea, I can take a look at this in a later RFE as suggested.
>
> @jaskarth
>> I've designed this benchmark
>
> Nice. Can you also post the generated assembly for Baseline/Patch?
> I'm just worried that there is some method call, or something else that does not get cleanly inlined and could mess with the benchmark.
@eme64 Sure, here is the assembly for the baseline: https://gist.github.com/jaskarth/1fe6f00a5b37fe3efb0dd6a2d24840e0
And after: https://gist.github.com/jaskarth/99c56e2f081f996987b96d7e866aca6c
I must have missed this originally when evaluating the benchmark, but looking at the assembly it seems like the baseline JDK creates a `CMove` for that ternary already. I made a quick patch to disable where `PhaseIdealLoop::conditional_move` is called, and the performance still stays the same on the benchmark. I've also attached that assembly if it's of interest: https://gist.github.com/jaskarth/7b12b688f82a3b8e854785f1827b0c20
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17574#issuecomment-1967344016
More information about the hotspot-compiler-dev
mailing list