RFR: 8324655: Identify integer minimum and maximum patterns created with if statements [v3]
Jasmine Karthikeyan
jkarthikeyan at openjdk.org
Mon Feb 26 15:53:56 UTC 2024
On Mon, 26 Feb 2024 11:21:08 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Jasmine Karthikeyan has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>>
>> - Merge master
>> - Apply changes from review
>> - Don't transform highly predictable branches
>> - Convert integer min/max patterns to Min/Max nodes
>
>> But there are also secondary effects like computing both sides of cmov instead of one side of the branch, that would be hard to catch in benchmarks.
>
> I guess you would have to create a branch that has either high latency (many chained ops) or that would saturate the CPU pipeline?
Thanks for looking at this again @eme64 :)
> Just FYI: I am slowly refactoring SuperWord / AutoVectorizer, and my hope is that I could eventually do if-conversion for loops. That means I would convert CFG to CMove (with min/max as special cases). But that is probably many months in the future. If we can vectorize, then CMove and min/max is often more clearly profitable: we probably execute both branches anyway. For straight-line code, we have tradeoffs with latency vs misprediction costs.
This is really exciting! It would be really nice to eventually do this transform with knowledge of nearby vectorization opportunities, to prioritize transforming `CMove`s and min/max that would benefit vectorization directly.
> @jaskarth can you construct a benchmark with `a (a cmp b) ? a : b` where you now convert to min/max, and we very often take the `a` branch, and where the `b` value is quite expensive, and the `a` value is very cheap?
I think this would be an interesting experiment, I'll try to design a benchmark for this and get back with the results.
> @jaskarth what would it take to do the same for `float/double`? Maybe it is not so easy, I don't remember exactly, but the comparators may have a different semantics compared to min/max, especially for NaN values. But it would be nice if you explained this here.
Indeed, the trouble comes with `NaN` and negative zero. If we have `Math.min(1.0, NaN)` and `if_min(1.0, NaN)` (representing a min created with a branch statements), then `Math.min(1.0, NaN) == NaN` but `if_min(1.0, NaN) == 1.0`. Similarly for negative zeroes, `Math.min(0.0, -0.0) == -0.0` but `if_min(0.0, -0.0) == 0.0`. Theoretically, at least on x86 it would be simpler to encode the `if_min` behavior as we would only need to output the `vminsd` without the blends and compares needed to encode the NaN and -0.0 behavior, but we would need new nodes to encode the different semantics. It didn't seem worth it to create (and eventually maintain) those floating point min/max nodes with slightly different semantics so I limited the transform to only integers, as the semantics are identical there.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17574#issuecomment-1964472503
More information about the hotspot-compiler-dev
mailing list