RFR: 8350756: C2 SuperWord Multiversioning: remove useless slow loop when the fast loop disappears
Christian Hagedorn
chagedorn at openjdk.org
Tue Mar 4 08:51:59 UTC 2025
On Mon, 3 Mar 2025 13:38:55 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
> @rwestrel asked me for this here:
> https://github.com/openjdk/jdk/pull/22016#issuecomment-2684365921
>
> The idea: an unused `multiversion_if` (i.e. one where the `slow_loop` is still delayed, i.e. where the `fast_loop` has not yet added a runtime-check to the `multiversion_if`) can be constant-folded if the main `fast_loop` disappears. Because at that point we know that we will never add a new condition to the `multiversion_if`, and it will constant fold to true (towards the `fast_loop`) after loop-opts anyway.
>
> It also seems to fix a bug, where all multiversioned loops (fast / slow, pre/main/post) disappear, and then we are left with a if-diamond with the multiversion_if. This then hits assertion code in `PhaseIdealLoop::conditional_move`. This issue is also addressed with this patch here. I adjusted the code around that assert slightly to give better reporting and ensure we bail out of the optimization if we see an unexpected pattern.
A few comments, otherwise the fix idea looks good to me!
src/hotspot/share/opto/loopnode.cpp line 2743:
> 2741: IfTrueNode* before_predicates = predicates.entry()->isa_IfTrue();
> 2742: if (before_predicates != nullptr &&
> 2743: before_predicates->in(0)->is_If() &&
Since this is after IGVN and we have not applied any loop transformations in this round, yet, shouldn't this always hold since we check that we have a `IfTrue` as child?
src/hotspot/share/opto/loopnode.cpp line 6653:
> 6651: if (!_verify_only && n->Opcode() == Op_OpaqueMultiversioning) {
> 6652: _multiversion_opaque_nodes.push(n);
> 6653: }
I'm wondering if we should do it here or as a list in `Compile`, similar to what we do with Parse and Template Assertion Predicates:
https://github.com/openjdk/jdk/blob/1f10ffba88119caab169b1fc43ccfd143e3b85a6/src/hotspot/share/opto/compile.hpp#L372-L374
Doing it in `PhaseIdealLoop` has the advantage that we limit the lifetime of the list whereas in `Compile` it's live until the entire compilation is done - which is a little too long. The disadvantage with having the list in `PhaseIdealLoop` is that you reallocate and reinitialize the list over and over again. So, I'm not entirely clear which one is better.
Since we also do it like that for zero trip guards, we can also move forward with this and revisit it again later if we decide that we should move it to `Compile`.
While thinking about this, it would have been great to have a class `LoopOpts` where we can store all such things that should be live over multiple loop opts pass but which is then not used anymore. Anyway, that's definitely out of scope.
src/hotspot/share/opto/loopopts.cpp line 794:
> 792: return nullptr;
> 793: }
> 794: if (bol->Opcode() != Op_Bool) {
Could we use `!bol->is_Bool()` here?
src/hotspot/share/opto/loopopts.cpp line 795:
> 793: }
> 794: if (bol->Opcode() != Op_Bool) {
> 795: assert(false, "Expected Bool, but got %s", NodeClassNames[bol->Opcode()]);
Good idea to print the opcode!
src/hotspot/share/opto/opaquenode.hpp line 104:
> 102: private:
> 103: bool _is_delayed_slow_loop;
> 104: bool _is_useful;
I suggest to flip it to `_uselss` to be in line with what we have in `ParsePredicateNode`:
https://github.com/openjdk/jdk/blob/1f10ffba88119caab169b1fc43ccfd143e3b85a6/src/hotspot/share/opto/cfgnode.hpp#L487-L489
Then we could also add a `dump_spec()` method that print `#useless` if it becomes useless (similar to what we have for `ParsePredicateNode`):
https://github.com/openjdk/jdk/blob/1f10ffba88119caab169b1fc43ccfd143e3b85a6/src/hotspot/share/opto/ifnode.cpp#L2227-L2229
This is also useful to check in IGV if such an opaque node is useless or not.
Side note: I'm in the process of having such a `_useless` flag for `OpaqueTemplateAssertionPredicate` nodes as well instead of directly replacing it with a constant. This seems cleaner and does not interfere with pattern matching.
src/hotspot/share/opto/opaquenode.hpp line 121:
> 119: }
> 120:
> 121: void set_useless() {
I suggest `mark_useless()` to be in line with `ParsePredicateNode`.
Suggestion:
void mark_useless() {
test/hotspot/jtreg/compiler/loopopts/superword/TestMultiversionRemoveUselessSlowLoop.java line 38:
> 36: */
> 37:
> 38: public class TestMultiversionRemoveUselessSlowLoop {
Is it also possible to add an IR test where we can check that a `OpaqueMultiversioning` node is first in the graph and then removed? Maybe we can check that in loop opts phase n, `COUNTED_LOOP_MAIN` and `OpaqueMultiversioning` is present. Then in loop opts phase n + 1, `COUNTED_LOOP_MAIN` is removed and in n + 2, `OpaqueMultiversioning` is removed as well. But I'm not sure if this is reliable enough when some loop opts change - though could easily be fixed or dropped again if the test fails at some point.
-------------
PR Review: https://git.openjdk.org/jdk/pull/23865#pullrequestreview-2656333142
PR Review Comment: https://git.openjdk.org/jdk/pull/23865#discussion_r1978827019
PR Review Comment: https://git.openjdk.org/jdk/pull/23865#discussion_r1978904536
PR Review Comment: https://git.openjdk.org/jdk/pull/23865#discussion_r1978825753
PR Review Comment: https://git.openjdk.org/jdk/pull/23865#discussion_r1978874732
PR Review Comment: https://git.openjdk.org/jdk/pull/23865#discussion_r1978822824
PR Review Comment: https://git.openjdk.org/jdk/pull/23865#discussion_r1978871295
PR Review Comment: https://git.openjdk.org/jdk/pull/23865#discussion_r1978887601
More information about the hotspot-compiler-dev
mailing list