RFR: 8324655: Identify integer minimum and maximum patterns created with if statements [v3]
Jasmine Karthikeyan
jkarthikeyan at openjdk.org
Tue Feb 27 17:57:44 UTC 2024
On Tue, 27 Feb 2024 17:19:06 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> @eme64 I thought the merged values are the ones used in the comparison, which means that the 2 branches are empty? Maybe there are cases that I am missing here?
>
> @merykitty I am thinking of a case like this:
>
> a = <something very expensive>
> b = <something cheap>
> x = (a < b) ? a : b;
>
> If in most cases we take `b` (because it is larger), then we might speculatively assign `x = b`, before we have finished computing `a`. That way we can already continue (speculatively) with `x = b`, while `a` is still computing. If the speculation is wrong, then the CPU flushes the pipeline.
>
> If this is converted to `max/min`, then we need to wait for `a` to be computed.
>
> @merykitty @jaskarth does this make sense as an example for a potential performance regression?
@eme64 I've designed this benchmark, which has a cheap common path and an expensive rare path:
private static final Random RANDOM = new Random();
@Benchmark
public void testCheapCommon(Blackhole blackhole) {
for (int i = 0; i < 300; i++) {
int a = i & 1; // Cheap
int b = RANDOM.nextInt(33); // Expensive
int c = b >= a ? a : b;
blackhole.consume(c + 20);
}
}
The condition `b >= a` should be true in the majority of cases, selecting `a`. I made sure that C2 was creating a MinI here, and it was. Here are the results I got:
Benchmark Mode Cnt Score Error Units
IfMinMax.testCheapCommon avgt 15 1127.771 ± 10.838 ns/op // Baseline
IfMinMax.testCheapCommon avgt 15 1104.754 ± 4.097 ns/op // Patch
It seems that there isn't a regression here, and is in fact marginally better (but this may just be noise). Is this what you had in mind for the benchmark?
> I'm asking @jaskarth if it is easier to transform a CMove into a Min/Max instead of trying to look for matching Phis.
@merykitty I did a bit of experimentation with the current benchmarks, and it seems that in many cases the `min/max` patterns aren't turned into `CMove`s, as the `CMove` heuristic is generally quite conservative. Since we're interested in a subset of `CMove`s that are simpler, I think it would be better to group this optimization with the others that operate on CFG diamonds, to find more cases that could be optimized. Though, in general I think this group of optimizations should be refactored to apply to both CFG diamonds and `CMove`s, since a `CMove` is a sort of CFG diamond in itself- but I think this cleanup should be done in a future RFE.
> While you are handling this, following Identity transforms can be added to integer and long Max/Min.
@jatin-bhateja I think this is a good idea, I can take a look at this in a later RFE as suggested.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17574#issuecomment-1967299149
More information about the hotspot-compiler-dev
mailing list