RFR: 8342692: C2: long counted loop/long range checks: don't create loop-nest for short running loops [v15]
Emanuel Peter
epeter at openjdk.org
Wed May 7 21:26:56 UTC 2025
On Wed, 7 May 2025 13:22:30 GMT, Roland Westrelin <roland at openjdk.org> wrote:
>> test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegment.java line 799:
>>
>>> 797: IRNode.ADD_VI, "> 0",
>>> 798: IRNode.STORE_VECTOR, "> 0"},
>>> 799: applyIfAnd = { "ShortRunningLongLoop", "true", "AlignVector", "false" },
>>
>> Can you just copy the IR rule, please, so that we still have a failing rule without `ShortRunningLongLoop`?
>>
>> The reason I have it here is so that I will catch these cases that are currently not properly vectorized... and it would be a shame if we lost these tests.
>>
>> Also: can we whitelist `ShortRunningLongLoop` for the IR framework? I think we should make sure that we run all these MemorySegment tests with `ShortRunningLongLoop` enabled and disabled, just to make sure everything is ok with and without.
>>
>> What do you think?
>>
>> FYI: I'm making changes to this test again in https://github.com/openjdk/jdk/pull/24278. But I don't want to hold you back here with that.
>>
>> Still: maybe you can take my approach with `NoSpeculativeAliasingCheck`, and add a run with `ShortRunningLongLoop` enabled or disabled. Just to make sure we have at least something running with both enabled and also with disabled.
>
> Wouldn't I then need to duplicate every `@run` line in the test i.e.:
>
> @run driver compiler.loopopts.superword.TestMemorySegment ByteArray
> @run driver compiler.loopopts.superword.TestMemorySegment ByteArray AlignVector
>
>
> would become:
>
>
> @run driver compiler.loopopts.superword.TestMemorySegment ByteArray
> @run driver compiler.loopopts.superword.TestMemorySegment ByteArray AlignVector
> @run driver compiler.loopopts.superword.TestMemorySegment ByteArray ShortLoop
> @run driver compiler.loopopts.superword.TestMemorySegment ByteArray AlignVector ShortLoop
>
>
> Same for `CharArray` etc...
> That seems like a lot of extra complexity. Or would it be sufficient to only add it for `ByteArray` to have the non short loop case at least minimally covered?
Yeah, I would only do it for one or two cases. Doing it for all would be a little excessive, and eventually we have too many combinations.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21630#discussion_r2078517077
More information about the hotspot-compiler-dev
mailing list