RFR: 8324655: Identify integer minimum and maximum patterns created with if statements [v3]

Emanuel Peter epeter at openjdk.org
Tue Feb 27 18:01:44 UTC 2024


On Tue, 27 Feb 2024 17:54:53 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:

>> @merykitty I am thinking of a case like this:
>> 
>> a = <something very expensive>
>> b = <something cheap>
>> x = (a < b) ? a : b;
>> 
>> If in most cases we take `b` (because it is larger), then we might speculatively assign `x = b`, before we have finished computing `a`. That way we can already continue (speculatively) with `x = b`, while `a` is still computing. If the speculation is wrong, then the CPU flushes the pipeline.
>> 
>> If this is converted to `max/min`, then we need to wait for `a` to be computed.
>> 
>> @merykitty @jaskarth does this make sense as an example for a potential performance regression?
>
> @eme64 I've designed this benchmark, which has a cheap common path and an expensive rare path:
> 
> private static final Random RANDOM = new Random();
> @Benchmark
> public void testCheapCommon(Blackhole blackhole) {
>     for (int i = 0; i < 300; i++) {
>         int a = i & 1;               // Cheap
>         int b = RANDOM.nextInt(33);  // Expensive
>         int c = b >= a ? a : b;
>         blackhole.consume(c + 20);
>     }
> }
> 
> The condition `b >= a` should be true in the majority of cases, selecting `a`. I made sure that C2 was creating a MinI here, and it was. Here are the results I got:
> 
> Benchmark                  Mode  Cnt     Score   Error  Units
> IfMinMax.testCheapCommon  avgt   15  1127.771 ± 10.838  ns/op // Baseline
> IfMinMax.testCheapCommon  avgt   15  1104.754 ± 4.097   ns/op // Patch
> 
> It seems that there isn't a regression here, and is in fact marginally better (but this may just be noise). Is this what you had in mind for the benchmark?
> 
>> I'm asking @jaskarth if it is easier to transform a CMove into a Min/Max instead of trying to look for matching Phis.
> 
> @merykitty I did a bit of experimentation with the current benchmarks, and it seems that in many cases the `min/max` patterns aren't turned into `CMove`s, as the `CMove` heuristic is generally quite conservative. Since we're interested in a subset of `CMove`s that are simpler, I think it would be better to group this optimization with the others that operate on CFG diamonds, to find more cases that could be optimized. Though, in general I think this group of optimizations should be refactored to apply to both CFG diamonds and `CMove`s, since a `CMove` is a sort of CFG diamond in itself- but I think this cleanup should be done in a future RFE.
> 
>> While you are handling this, following Identity transforms can be added to integer and long Max/Min.
> 
> @jatin-bhateja I think this is a good idea, I can take a look at this in a later RFE as suggested.

@jaskarth 
> I've designed this benchmark

Nice. Can you also post the generated assembly for Baseline/Patch?
I'm just worried that there is some method call, or something else that does not get cleanly inlined and could mess with the benchmark.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17574#issuecomment-1967305832


More information about the hotspot-compiler-dev mailing list