RFR: 8313345: SuperWord fails due to CMove without matching Bool pack
Tobias Hartmann
thartmann at openjdk.org
Tue Aug 8 11:00:01 UTC 2023
SuperWord fails after [JDK-8306302](https://bugs.openjdk.org/browse/JDK-8306302), when trying to convert `Bool + Cmp + CMove` packs into `VectorMaskCmp + VectorBlend` because it does not find the `Bool` (and `Cmp`) packs for a `CMoveD`:
After filter_packs
packset
Pack: 0
align: 0 674 StoreD === 691 695 678 675 [[ 669 672 ]] @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; Memory: @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=8; !orig=603,364,300 !jvms: Reproducer2$A::fill @ bci:17 (line 16)
align: 8 669 StoreD === 691 674 673 670 [[ 603 606 ]] @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; Memory: @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=8; !orig=364,300 !jvms: Reproducer2$A::fill @ bci:17 (line 16)
align: 16 603 StoreD === 691 669 607 604 [[ 367 364 ]] @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; Memory: @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=8; !orig=364,300 !jvms: Reproducer2$A::fill @ bci:17 (line 16)
align: 24 364 StoreD === 691 603 368 428 [[ 695 363 514 ]] @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; Memory: @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=8; !orig=300 !jvms: Reproducer2$A::fill @ bci:17 (line 16)
Pack: 1
align: 0 677 LoadD === 525 695 678 [[ 675 676 676 ]] @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; #double !orig=606,367,193 !jvms: Reproducer2$A::fill @ bci:13 (line 16)
align: 8 672 LoadD === 525 674 673 [[ 670 671 671 ]] @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; #double !orig=367,193 !jvms: Reproducer2$A::fill @ bci:13 (line 16)
align: 16 606 LoadD === 525 669 607 [[ 604 605 605 ]] @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; #double !orig=367,193 !jvms: Reproducer2$A::fill @ bci:13 (line 16)
align: 24 367 LoadD === 525 603 368 [[ 366 366 428 ]] @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=8; #double !orig=193 !jvms: Reproducer2$A::fill @ bci:13 (line 16)
Pack: 2
align: 0 675 CMoveD === _ 327 676 677 [[ 674 ]] #double !orig=604,428,[393],278 !jvms: Reproducer2$A::fill @ bci:14 (line 16)
align: 8 670 CMoveD === _ 327 671 672 [[ 669 ]] #double !orig=428,[393],278 !jvms: Reproducer2$A::fill @ bci:14 (line 16)
align: 16 604 CMoveD === _ 327 605 606 [[ 603 ]] #double !orig=428,[393],278 !jvms: Reproducer2$A::fill @ bci:14 (line 16)
align: 24 428 CMoveD === _ 327 366 367 [[ 364 ]] #double !orig=[393],278 !jvms: Reproducer2$A::fill @ bci:14 (line 16)
Pack: 3
align: 0 676 MulD === _ 677 677 [[ 675 ]] !orig=605,366,273 !jvms: Reproducer2$A::transform @ bci:2 (line 21) Reproducer2$A::fill @ bci:14 (line 16)
align: 8 671 MulD === _ 672 672 [[ 670 ]] !orig=366,273 !jvms: Reproducer2$A::transform @ bci:2 (line 21) Reproducer2$A::fill @ bci:14 (line 16)
align: 16 605 MulD === _ 606 606 [[ 604 ]] !orig=366,273 !jvms: Reproducer2$A::transform @ bci:2 (line 21) Reproducer2$A::fill @ bci:14 (line 16)
align: 24 366 MulD === _ 367 367 [[ 428 ]] !orig=273 !jvms: Reproducer2$A::transform @ bci:2 (line 21) Reproducer2$A::fill @ bci:14 (line 16)
In the failing case, both the `Cmp` and the `Bool` are outside of the loop. I propose to detect this case in `SuperWord::profitable` and simply bail out. This is obviously not a profitability check but the fix for [JDK-8306302](https://bugs.openjdk.org/browse/JDK-8306302) already added similar checks just above. This should be refactored at some point. @eme64, I think you had plans for that, right?
Looking at https://github.com/openjdk/jdk/pull/13493, I noticed the following statement:
> From what I understand, we currently never introduce a CMoveF/D, unless asked for by UseCMoveUnconditionally (C->use_c_move()). If the flag is set, we attribute no cost to the CMove, else we take Matcher::float_cmove_cost(), which seems to be ConditionalMoveLimit, and so the Phi is never converted into a CMove.
This is not true, because `Matcher::float_cmove_cost()` is `0` on AArch64 and RISCV:
https://github.com/openjdk/jdk/blob/055b4b426cbc56d97e82219f3dd3aba1ebf977e4/src/hotspot/cpu/aarch64/matcher_aarch64.hpp#L70-L73
It's true on the other platforms, where we would need to set `-XX:+UseCMoveUnconditionally` to avoid the bailout. I added runs for both cases to the test.
As also stated in https://github.com/openjdk/jdk/pull/13493, this only affects `CMoveF` and `CMoveD`. For other `CMove` nodes, we bail out during `filter_packs` with "Unimplemented" because `VectorNode::implemented` -> `VectorNode::opcode` only handles `Op_CMoveF` and `Op_CMoveD`:
https://github.com/openjdk/jdk/blob/055b4b426cbc56d97e82219f3dd3aba1ebf977e4/src/hotspot/share/opto/vectornode.cpp#L84-L87
I first tried to bail out in `SuperWord::output` but that does not work because the graph was already modified. We hit an assert in IGVN due to a vector vs. non-vector type mismatch. In general, I don't understand how the `do_reserve_copy` bailouts are supposed to work because we might have already replaced nodes by vector nodes and the `do_reserve_copy` logic does not undo these changes:
https://github.com/openjdk/jdk/blob/055b4b426cbc56d97e82219f3dd3aba1ebf977e4/src/hotspot/share/opto/superword.cpp#L2690-L2694
@eme64 I slightly remember that we talked about this before, did you observe a similar issue? Do we have a tracking bug for this broken bailout logic?
Big thanks to @SirYwell for the report and test case and to @eme64 for the initial investigation!
Thanks,
Tobias
-------------
Commit messages:
- Fix
- 8313345: SuperWord fails due to CMove without matching Bool pack
Changes: https://git.openjdk.org/jdk/pull/15189/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15189&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8313345
Stats: 79 lines in 2 files changed: 79 ins; 0 del; 0 mod
Patch: https://git.openjdk.org/jdk/pull/15189.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/15189/head:pull/15189
PR: https://git.openjdk.org/jdk/pull/15189
More information about the hotspot-compiler-dev
mailing list