RFR: 8324655: Identify integer minimum and maximum patterns created with if statements [v7]
Jasmine Karthikeyan
jkarthikeyan at openjdk.org
Fri Mar 8 06:14:57 UTC 2024
On Fri, 8 Mar 2024 03:26:12 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:
>> Hi all, I've created this patch which aims to convert common integer mininum and maximum patterns created using if statements into Min and Max nodes. These patterns are usually in the form of `a > b ? a : b` and similar, as well as patterns such as `if (a > b) b = a;`. While this transform doesn't generally improve code generation it's own, it simplifies control flow and creates new opportunities for vectorization.
>>
>> I've created a benchmark for the PR, and I've attached some data from my (Zen 3) machine:
>>
>> Baseline Patch Improvement
>> Benchmark Mode Cnt Score Error Units Score Error Units
>> IfMinMax.testReductionInt avgt 15 500.307 ± 16.687 ns/op 509.383 ± 32.645 ns/op (no change)*
>> IfMinMax.testReductionLong avgt 15 493.184 ± 17.596 ns/op 513.587 ± 28.339 ns/op (no change)*
>> IfMinMax.testSingleInt avgt 15 3.588 ± 0.540 ns/op 2.965 ± 1.380 ns/op (no change)
>> IfMinMax.testSingleLong avgt 15 3.673 ± 0.128 ns/op 3.506 ± 0.590 ns/op (no change)
>> IfMinMax.testVectorInt avgt 15 340.425 ± 13.123 ns/op 59.689 ± 7.509 ns/op + 5.7x
>> IfMinMax.testVectorLong avgt 15 326.420 ± 15.554 ns/op 117.190 ± 5.622 ns/op + 2.8x
>>
>>
>> * After writing this benchmark I discovered that the compiler doesn't seem to create some simple min/max reductions, even when using Math.min/max() directly. Is this known or should I create a followup RFE for this?
>>
>> The patch passes tier 1-3 testing on linux x64. Reviews or comments would be appreciated!
>
> Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision:
>
> Move logic to CMoveNode::Ideal and improve IR test
Actually, while experimenting with Min/Max identities I found a case that the current CMove code couldn't fully transform:
private static long test(long a, long b) {
return Math.max(a, Math.max(a, b));
}
Currently, only the second `Math.max` is transformed into a CMove- the first one is left as-is. The issue seems to be that the CMove code is trying to mistakenly use the more conservative loop-based heuristic instead of the one for straight-line code, even though there is no loop. This seems to happen in two places, first here:
https://github.com/openjdk/jdk/blob/de428daf9adef5afe7347319f7a6f6732e9b6c4b/src/hotspot/share/opto/loopopts.cpp#L701-L704
This logic seems to be inverted, as it's checking if the region's enclosing loop is the root of the loop tree, or otherwise not a loop. It seems to be `true` if it's *not* in a loop, and `false` when it *is* in a loop. This also looks to be corroborated by the [JVM Anatomy Quarks on CMove](https://shipilev.net/jvm/anatomy-quarks/30-conditional-moves/) linked earlier where CMove only kicks in when the branch percent is >18% or <82%, which was the logic for loop CMoves before [JDK-8319451](https://bugs.openjdk.org/browse/JDK-8319451), even though `doCall` doesn't contain loops. I think this is a pretty simple fix to just invert the boolean expression. Then, there's a second place it happens, here:
https://github.com/openjdk/jdk/blob/de428daf9adef5afe7347319f7a6f6732e9b6c4b/src/hotspot/share/opto/loopopts.cpp#L764-L775
Here, it sees if any consumers of the phi are a Cmp or Encode/DecodeNarrowOop, to delay the transform to split-if. In this case, the second If's Cmp consumes the phi so this code path is triggered. I'm less sure of what to do for this case, though. In this case I would say that it's being triggered in error, but there may be other cases where there is a benefit.
I think the min/max transform should still be done after the CMove transform, but I think it'll be a good idea to look at this separately because it could have a widespread impact.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17574#issuecomment-1985098406
More information about the hotspot-compiler-dev
mailing list