RFR: 8324655: Identify integer minimum and maximum patterns created with if statements [v3]

Emanuel Peter epeter at openjdk.org
Mon Feb 26 16:16:46 UTC 2024


On Mon, 26 Feb 2024 15:51:14 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:

>>> But there are also secondary effects like computing both sides of cmov instead of one side of the branch, that would be hard to catch in benchmarks.
>> 
>> I guess you would have to create a branch that has either high latency (many chained ops) or that would saturate the CPU pipeline?
>
> Thanks for looking at this again @eme64 :)
> 
>> Just FYI: I am slowly refactoring SuperWord / AutoVectorizer, and my hope is that I could eventually do if-conversion for loops. That means I would convert CFG to CMove (with min/max as special cases). But that is probably many months in the future. If we can vectorize, then CMove and min/max is often more clearly profitable: we probably execute both branches anyway. For straight-line code, we have tradeoffs with latency vs misprediction costs.
> 
> This is really exciting! It would be really nice to eventually do this transform with knowledge of nearby vectorization opportunities, to prioritize transforming `CMove`s and min/max that would benefit vectorization directly.
> 
>> @jaskarth can you construct a benchmark with `a (a cmp b) ? a : b` where you now convert to min/max, and we very often take the `a` branch, and where the `b` value is quite expensive, and the `a` value is very cheap? 
> 
> I think this would be an interesting experiment, I'll try to design a benchmark for this and get back with the results.
> 
>> @jaskarth what would it take to do the same for `float/double`? Maybe it is not so easy, I don't remember exactly, but the comparators may have a different semantics compared to min/max, especially for NaN values. But it would be nice if you explained this here.
> 
> Indeed, the trouble comes with `NaN` and negative zero. If we have `Math.min(1.0, NaN)` and `if_min(1.0, NaN)` (representing a min created with a branch statements), then `Math.min(1.0, NaN) == NaN` but `if_min(1.0, NaN) == 1.0`. Similarly for negative zeroes, `Math.min(0.0, -0.0) == -0.0` but `if_min(0.0, -0.0) == 0.0`. Theoretically, at least on x86 it would be simpler to encode the `if_min` behavior as we would only need to output the `vminsd` without the blends and compares needed to encode the NaN and -0.0 behavior, but we would need new nodes to encode the different semantics. It didn't seem worth it to create (and eventually maintain) those floating point min/max nodes with slightly different semantics so I limited the transform to only integers, as the semantics are identical there.

@jaskarth 
> I think this would be an interesting experiment, I'll try to design a benchmark for this and get back with the results.

Thanks, that would be excellent. This is to make sure that there is no regression. And if there is one, we'd have to see what to do about it.

And thanks for the explanation around `float/double`. Yes, we would probably need additional nodes, or somehow give additional options for the existing ones. But that is out of scope for this RFE 😊

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17574#issuecomment-1964547542


More information about the hotspot-compiler-dev mailing list