RFR: 8306302: C2 Superword fix: use VectorMaskCmp and VectorBlend instead of CMoveVF/D [v2]

Emanuel Peter epeter at openjdk.org
Thu May 11 08:45:45 UTC 2023


> **Bug**
> In `x86`, `CMoveVF/D` were not correctly implemented for the `eq` and `neq` case (leads to assert). And the `lt/le/gt/ge` cases did not all handle `NaN's` correctly (ordered vs unordered comparision, leads to wrong results).
> 
> The assert gets triggered in the code from this change: [JDK-8285973](https://bugs.openjdk.org/browse/JDK-8285973)
> On this line: https://github.com/openjdk/jdk/commit/c1db70d827f7ac81aa6c6646e2431f672c71c8dc#diff-e5266a3774f26ac663dcc67e0be18608b1735f38c0576673ce36e0cd689bab4aR4309
> 
> The problematic line wants to find a Cmp above the Bool, and compare its inputs. But we have no Cmp there, just a constant, that we have set during matching:
> https://github.com/openjdk/jdk/blob/af4d5600e37ec6d331e62c5d37491ee97cad5311/src/hotspot/share/opto/matcher.cpp#L2394
> 
> The wrong results with `NaN` are because of a bug in `x`:
> https://github.com/openjdk/jdk/commit/0485593fbc4a3264b79969de192e8e7d36e5b590#diff-7070c036c7d88ba4a8467e404d8d88aee646b97bf7bacc8b73cbc93f3ef11d2dR2106
> The cases `lt` and `le` include the `-1` case, which shoud return `true` if any comparison input is a `NaN`, just as defined for java bytecode `fcmpl/dcmpl`. But they were mapped to ordered comparison codes, not unordered ones. More [here](https://bugs.openjdk.org/browse/JDK-8306302?focusedCommentId=14579078&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14579078).
> 
> **Solution**
> @fg1417 suggested that `CMoveVF/D` is perfectly composed of `VectorMaskCmp + VectorBlend`. So instead of fixing `CMoveVF/D`, I replaced it. Performance should be the same, as it goes down to the same assembly instructions.
> 
> This has a few benefits:
>  - `VectorMaskCmp + VectorBlend` is more powerful:
>    - `CMoveVF/D` required the same inputs to the compare than to the move itself.
>    - `CMoveVF/D` on x86 was only implemented for 32 bytes. Any other size would simply fail to vectorize.
>    - `VectorMaskCmp` and `VectorBlend` can have different compare inputs, and even different types. For now, the input types must have the same data-width (`float` and `int`, `double` and `long`).
>    - We need less code (I completely removed all code for `CMoveVF/D`).
> 
> I also moved the whole `CMove` code in `SuperWord` into `SuperWord::output`, rather than the complex code `SuperWord::merge_packs_to_cmove / CMoveKit`.
> 
> As reported in [JDK-8306088](https://bugs.openjdk.org/browse/JDK-8306088) https://github.com/openjdk/jdk/pull/13354, the CMove code did not prop...

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  Improved comment on request of @fg1417

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13493/files
  - new: https://git.openjdk.org/jdk/pull/13493/files/81d4de72..89fe29e8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13493&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13493&range=00-01

  Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/13493.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13493/head:pull/13493

PR: https://git.openjdk.org/jdk/pull/13493


More information about the hotspot-compiler-dev mailing list