RFR: 8323582: C2 SuperWord AlignVector: misaligned vector memory access with unaligned native memory [v4]
Emanuel Peter
epeter at openjdk.org
Wed Feb 26 10:36:06 UTC 2025
On Wed, 26 Feb 2025 10:15:48 GMT, Roland Westrelin <roland at openjdk.org> wrote:
>>> Would it be possible and make sense to remove useless slow path loops the way it's done for predicates or zero trip guards? In `PhaseIdealLoop::build_loop_late_post_work()`, collect all `OpaqueMultiversioningNode` in a list. Then iterate over all loops the way it's done in `PhaseIdealLoop::eliminate_useless_zero_trip_guard()`, find loops marked as multi version, check we can get from the loop to the `OpaqueMultiversioningNode` and mark that one as useful. Eliminate all `OpaqueMultiversioningNode` not marked as useful. That way if some transformation such as peeling makes the loop non multi version or if the expected shape breaks for some reason, the slow loop is eliminated on next loop opts pass.
>>
>> I suppose we could try that. Is it ok to do that in a separate RFE, so we are keeping this here to a more manageable size?
>>
>> I don't see it as super critical personally, as the slow_path is `delayed`, so no loop-opts are performed on it. The overhead is minimal if we keep it until after loop-opts, I think. But I'm not against trying. It would take a bit of effort to construct test cases where we have the loop fold away after multiversion_if is added, but that is probably possible.
>>
>> And would we not have similar issues with traversing from the loops to their `OpaqueMultiversioningNode`? What if some are not reachable in the meantime? Then we would just lose the `multiversion_if` early, and could not use it any more. So maybe we'd have to do that after the verification:
>> [JDK-8350637](https://bugs.openjdk.org/browse/JDK-8350637): C2: verify that main_loop finds pre_loop and that multiversion loops find the multiversion_if
>>
>> I wonder if we do not have similar issues with `PhaseIdealLoop::eliminate_useless_zero_trip_guard()` currently. Maybe it's rare enough we don't notice.
>>
>> @rwestrel What do you think?
>
>> I suppose we could try that. Is it ok to do that in a separate RFE, so we are keeping this here to a more manageable size?
>
> Ok
>
>> And would we not have similar issues with traversing from the loops to their `OpaqueMultiversioningNode`? What if some are not reachable in the meantime? Then we would just lose the `multiversion_if` early, and could not use it any more. So maybe we'd have to do that after the verification: [JDK-8350637](https://bugs.openjdk.org/browse/JDK-8350637): C2: verify that main_loop finds pre_loop and that multiversion loops find the multiversion_if
>>
>> I wonder if we do not have similar issues with `PhaseIdealLoop::eliminate_useless_zero_trip_guard()` currently. Maybe it's rare enough we don't notice.
>
> I don't think that's a problem. When that code runs the graph is in a stable shape. There's no dead condition that needs to go through igvn to be cleaned up. We've just run igvn and haven't made any change to the graph yet.
@rwestrel I filed this follow-up RFE:
[JDK-8350756](https://bugs.openjdk.org/browse/JDK-8350756): C2 SuperWord Multiversioning: remove useless slow loop when the fast loop disappears
We'll have to be careful to only fold the `slow_loop` away if it is not used, i.e. if we did not in the meantime use the `multiversion_if`, and maybe the `fast_loop` structure is only desintegrating because of some speculative assumption, maybe because of more unrolling that only happens with vectorization. It would be good to have a test-case for that. I'm writing that here so I will remember it later ;)
@rwestrel Do you have any other ideas / suggestions?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/22016#issuecomment-2684567780
More information about the hotspot-dev
mailing list