RFR: 8342692: C2: long counted loop/long range checks: don't create loop-nest for short running loops [v5]
Emanuel Peter
epeter at openjdk.org
Thu Dec 5 09:22:41 UTC 2024
On Wed, 4 Dec 2024 15:45:54 GMT, Roland Westrelin <roland at openjdk.org> wrote:
>> Hi @rwestrel this looks very interesting!
>>
>> Which benchmarks are you referring to?
>>
>> I just gave it a quick skim, will come back to this later again.
>
>> Which benchmarks are you referring to?
>
> The one mentioned in the bug: https://github.com/openjdk/jdk/compare/master...mcimadamore:jdk:manual_mismatch_bench?expand=1
@rwestrel it would be nice to see a plot like this, with the benchmark results:
X-axis: increasing loop iterations
Y-axis: time
Similar to what I did here: https://github.com/openjdk/jdk/pull/22070

You could sweep loop sizes from 500 to 2000 in steps of 100, just to get a rough sense of whether your constant threshold of `1000` is about right.
Maybe you can even extend the benchmark I wrote there with MemorySegment cases. That would also be useful for the other efforts where we are working on short running loops:
[JDK-8307084](https://bugs.openjdk.org/browse/JDK-8307084): C2: Vector atomic post loop is not executed for some small trip counts
[JDK-8344085](https://bugs.openjdk.org/browse/JDK-8344085): C2 SuperWord: improve vectorization for small loop iteration count
I just linked these two issues with this RFE on JBS.
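
A rough sketch of the sweep I have in mind (sizes 500 to 2000 in steps of 100, time per call on the Y-axis). This is plain Java rather than JMH, so the numbers would only be indicative. The class `LoopSizeSweep` and its methods are hypothetical names for illustration, and the `byte[]` kernel with a `long` induction variable is only a stand-in for the MemorySegment cases, not the benchmark from the bug:

```java
import java.util.Arrays;

// Hypothetical harness: all names here are illustrative, not from the PR.
final class LoopSizeSweep {
    // Loop sizes from start to end (inclusive), advancing by step.
    static int[] sizes(int start, int end, int step) {
        int count = (end - start) / step + 1;
        int[] result = new int[count];
        for (int i = 0; i < count; i++) {
            result[i] = start + i * step;
        }
        return result;
    }

    // Stand-in kernel: a loop with a long induction variable over a byte[].
    // Assumption: a real benchmark would read from a MemorySegment instead.
    static long sum(byte[] data, long n) {
        long total = 0;
        for (long i = 0; i < n; i++) {
            total += data[(int) i];
        }
        return total;
    }

    // Average wall-clock nanoseconds per call of sum() for a given loop size.
    static double nanosPerCall(byte[] data, int n, int reps) {
        long sink = 0;
        long t0 = System.nanoTime();
        for (int r = 0; r < reps; r++) {
            sink += sum(data, n);
        }
        long t1 = System.nanoTime();
        if (sink == 42) {
            System.out.print(""); // use the result so the loop is not removed
        }
        return (double) (t1 - t0) / reps;
    }

    public static void main(String[] args) {
        byte[] data = new byte[2000];
        Arrays.fill(data, (byte) 1);
        int[] loopSizes = sizes(500, 2000, 100);
        for (int n : loopSizes) {
            nanosPerCall(data, n, 20_000); // warm-up so the kernel gets compiled
        }
        // Two columns: loop size (X-axis) and ns per call (Y-axis).
        for (int n : loopSizes) {
            System.out.printf("%d\t%.1f%n", n, nanosPerCall(data, n, 200_000));
        }
    }
}
```

The two-column output can be fed straight into a plotting tool. A proper version would use JMH with `@Param` for the loop size to get trustworthy numbers.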
-------------
PR Comment: https://git.openjdk.org/jdk/pull/21630#issuecomment-2519722350