RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times
Emanuel Peter
epeter at openjdk.org
Mon Jun 2 10:30:51 UTC 2025
On Fri, 30 May 2025 07:43:29 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
> The C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes.
> This prevents optimizations such as range check elimination, loop unrolling, and auto-vectorization for these loops. See [1] for a
> detailed discussion of a related performance issue.
>
> The ideal graph of such a loop typically looks like:
>
>
> /-----------|
> | |
> | ConI |
> loop | / /
> | | / /
> \ AddI /
> RangeCheck \ / |
> | \ / |
> IfTrue Phi |
> \ | |
> RangeCheck \ | |
> \ CastII / <- Range check #1
> | | /
> IfTrue | |
> \ | |
> CastII | <- Range check #2
> | /
> |-------/
>
>
>
> For a counted loop, the loop induction variable (i.e. `Phi`) should ideally be the input of `AddI`. In the case above, however, it is used
> by two consecutive `CastII` nodes generated by two different range checks. The compiler should skip all such `CastII` nodes when recognizing a counted loop.
>
> This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations.
>
> Test:
> - Tested tier1, tier2, and tier3; no regressions were found.
> - An additional test case is added to verify the fix.
>
> Performance:
> Here is the performance gain on an NVIDIA Grace machine (AArch64):
>
>
> Benchmark                        Mode   Cnt  Unit   Before       After         Gain
> CountedLoopCastIV.loop_iv_int    thrpt  30   ops/s  941482.597   4389292.439   4.66
> CountedLoopCastIV.loop_iv_long   thrpt  30   ops/s  884563.232   1441485.455   1.62
>
>
> We can also observe a similar uplift on an x86_64 machine.
>
> [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654
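
For readers following along, the "iteratively uncast" idea from the description above can be sketched with a toy model. This is only an illustration in Java, not the HotSpot C++ code: the `Node` class, the string opcodes, and the `uncastOnce`/`uncastAll` helpers are all hypothetical names made up for this sketch. It contrasts stripping a single `CastII` with looping until none remain.

```java
public class UncastSketch {
    // Hypothetical, heavily simplified stand-in for a C2 ideal-graph node.
    // Real C2 nodes have many inputs; here we track only one value input.
    static final class Node {
        final String op;
        final Node in; // the value input, or null for a leaf
        Node(String op, Node in) { this.op = op; this.in = in; }
    }

    // Strips at most one CastII: enough for a single range check,
    // but it stops at the inner CastII when two are stacked.
    static Node uncastOnce(Node n) {
        return n.op.equals("CastII") ? n.in : n;
    }

    // Strips every consecutive CastII, so the caller sees the
    // underlying value (the Phi in the graph above).
    static Node uncastAll(Node n) {
        while (n != null && n.op.equals("CastII")) {
            n = n.in;
        }
        return n;
    }

    public static void main(String[] args) {
        Node phi = new Node("Phi", null);
        Node cast1 = new Node("CastII", phi);   // range check #1
        Node cast2 = new Node("CastII", cast1); // range check #2

        System.out.println(uncastOnce(cast2).op); // prints "CastII"
        System.out.println(uncastAll(cast2).op);  // prints "Phi"
    }
}
```

In this model, a pattern matcher that expects a `Phi` would reject the result of `uncastOnce` but accept the result of `uncastAll`, which mirrors why the loop was not recognized before the patch.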
test/hotspot/jtreg/compiler/c2/irTests/TestCountedLoopCastIV.java line 2:
> 1: /*
> 2: * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Can you please move the test to `test/hotspot/jtreg/compiler/loopopts`? The `irTests` directory was not the best idea; it makes more sense to group tests thematically.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2120715051