RFR: 8313345: SuperWord fails due to CMove without matching Bool pack

Tobias Hartmann thartmann at openjdk.org
Tue Aug 8 11:00:01 UTC 2023


SuperWord fails after [JDK-8306302](https://bugs.openjdk.org/browse/JDK-8306302), when trying to convert `Bool + Cmp + CMove` packs into `VectorMaskCmp + VectorBlend` because it does not find the `Bool` (and `Cmp`) packs for a `CMoveD`:


After filter_packs
packset
Pack: 0
 align: 0 	 674  StoreD  === 691 695 678 675  [[ 669 672 ]]  @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8;  Memory: @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=8; !orig=603,364,300 !jvms: Reproducer2$A::fill @ bci:17 (line 16)
 align: 8 	 669  StoreD  === 691 674 673 670  [[ 603 606 ]]  @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8;  Memory: @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=8; !orig=364,300 !jvms: Reproducer2$A::fill @ bci:17 (line 16)
 align: 16 	 603  StoreD  === 691 669 607 604  [[ 367 364 ]]  @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8;  Memory: @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=8; !orig=364,300 !jvms: Reproducer2$A::fill @ bci:17 (line 16)
 align: 24 	 364  StoreD  === 691 603 368 428  [[ 695 363 514 ]]  @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8;  Memory: @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=8; !orig=300 !jvms: Reproducer2$A::fill @ bci:17 (line 16)
Pack: 1
 align: 0 	 677  LoadD  === 525 695 678  [[ 675 676 676 ]]  @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; #double !orig=606,367,193 !jvms: Reproducer2$A::fill @ bci:13 (line 16)
 align: 8 	 672  LoadD  === 525 674 673  [[ 670 671 671 ]]  @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; #double !orig=367,193 !jvms: Reproducer2$A::fill @ bci:13 (line 16)
 align: 16 	 606  LoadD  === 525 669 607  [[ 604 605 605 ]]  @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; #double !orig=367,193 !jvms: Reproducer2$A::fill @ bci:13 (line 16)
 align: 24 	 367  LoadD  === 525 603 368  [[ 366 366 428 ]]  @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; #double !orig=193 !jvms: Reproducer2$A::fill @ bci:13 (line 16)
Pack: 2
 align: 0 	 675  CMoveD  === _ 327 676 677  [[ 674 ]]  #double !orig=604,428,[393],278 !jvms: Reproducer2$A::fill @ bci:14 (line 16)
 align: 8 	 670  CMoveD  === _ 327 671 672  [[ 669 ]]  #double !orig=428,[393],278 !jvms: Reproducer2$A::fill @ bci:14 (line 16)
 align: 16 	 604  CMoveD  === _ 327 605 606  [[ 603 ]]  #double !orig=428,[393],278 !jvms: Reproducer2$A::fill @ bci:14 (line 16)
 align: 24 	 428  CMoveD  === _ 327 366 367  [[ 364 ]]  #double !orig=[393],278 !jvms: Reproducer2$A::fill @ bci:14 (line 16)
Pack: 3
 align: 0 	 676  MulD  === _ 677 677  [[ 675 ]]  !orig=605,366,273 !jvms: Reproducer2$A::transform @ bci:2 (line 21) Reproducer2$A::fill @ bci:14 (line 16)
 align: 8 	 671  MulD  === _ 672 672  [[ 670 ]]  !orig=366,273 !jvms: Reproducer2$A::transform @ bci:2 (line 21) Reproducer2$A::fill @ bci:14 (line 16)
 align: 16 	 605  MulD  === _ 606 606  [[ 604 ]]  !orig=366,273 !jvms: Reproducer2$A::transform @ bci:2 (line 21) Reproducer2$A::fill @ bci:14 (line 16)
 align: 24 	 366  MulD  === _ 367 367  [[ 428 ]]  !orig=273 !jvms: Reproducer2$A::transform @ bci:2 (line 21) Reproducer2$A::fill @ bci:14 (line 16)


In the failing case, both the `Cmp` and the `Bool` are outside of the loop. I propose to detect this case in `SuperWord::profitable` and simply bail out. This is obviously not a profitability check but the fix for [JDK-8306302](https://bugs.openjdk.org/browse/JDK-8306302) already added similar checks just above. This should be refactored at some point. @eme64, I think you had plans for that, right?

Looking at https://github.com/openjdk/jdk/pull/13493, I noticed the following statement:

> From what I understand, we currently never introduce a CMoveF/D, unless asked for by UseCMoveUnconditionally (C->use_c_move()). If the flag is set, we attribute no cost to the CMove, else we take Matcher::float_cmove_cost(), which seems to be ConditionalMoveLimit, and so the Phi is never converted into a CMove.

This is not true, because `Matcher::float_cmove_cost()` is `0` on AArch64 and RISCV:
https://github.com/openjdk/jdk/blob/055b4b426cbc56d97e82219f3dd3aba1ebf977e4/src/hotspot/cpu/aarch64/matcher_aarch64.hpp#L70-L73

It's true on the other platforms, where we would need to set `-XX:+UseCMoveUnconditionally` to avoid the bailout. I added runs for both cases to the test.

As also stated in https://github.com/openjdk/jdk/pull/13493, this only affects `CMoveF` and `CMoveD`. For other `CMove` nodes, we bail out during `filter_packs` with "Unimplemented" because `VectorNode::implemented` -> `VectorNode::opcode` only handles `Op_CMoveF` and `Op_CMoveD`:
https://github.com/openjdk/jdk/blob/055b4b426cbc56d97e82219f3dd3aba1ebf977e4/src/hotspot/share/opto/vectornode.cpp#L84-L87

I first tried to bail out in `SuperWord::output` but that does not work because the graph was already modified. We hit an assert in IGVN due to a vector vs. non-vector type mismatch. In general, I don't understand how the `do_reserve_copy` bailouts are supposed to work because we might have already replaced nodes by vector nodes and the `do_reserve_copy` logic does not undo these changes:
https://github.com/openjdk/jdk/blob/055b4b426cbc56d97e82219f3dd3aba1ebf977e4/src/hotspot/share/opto/superword.cpp#L2690-L2694

@eme64 I slightly remember that we talked about this before, did you observe a similar issue? Do we have a tracking bug for this broken bailout logic?

Big thanks to @SirYwell for the report and test case and to @eme64 for the initial investigation!

Thanks,
Tobias

-------------

Commit messages:
 - Fix
 - 8313345: SuperWord fails due to CMove without matching Bool pack

Changes: https://git.openjdk.org/jdk/pull/15189/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15189&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8313345
  Stats: 79 lines in 2 files changed: 79 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/15189.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/15189/head:pull/15189

PR: https://git.openjdk.org/jdk/pull/15189


More information about the hotspot-compiler-dev mailing list