RFR: 8357726: Improve C2 to recognize counted loops with multiple casts in trip counter

Xiaohong Gong xgong at openjdk.org
Wed Jun 4 08:06:17 UTC 2025


On Tue, 3 Jun 2025 07:17:32 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>>> @XiaohongGong I suggest you change the title from: `8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times` to `8357726: C2 recognize loops with multiple casts in trip counter` or even: `8357726: C2 recognize loops with multiple casts in trip counter: phi -> CastII* -> AddI -> phi`
>> 
>> Thanks for your suggestion! Sounds better to me. How about changing the title to `Improve C2 to recognize counted loops with multiple casts in trip counter` ?
> 
> @XiaohongGong Sounds good too :)

Hi @eme64, I've updated the IR test and the JMH benchmark based on your comments. Could you please check whether they look fine to you? Thanks for all your suggestions!

The following table shows the performance data of the new JMH test on Grace (the performance gain is almost the same on my x64 machine):

Benchmark                       Mode   Cnt  limit  Unit        Before  Error (99.9%)        After  Error (99.9%)  Gain
CountedLoopCastIV.loop_iv_int   thrpt   30   1024  ops/s  1225620.536   39505.158362  5778120.132    4781.602088  4.71
CountedLoopCastIV.loop_iv_int   thrpt   30   1536  ops/s   830600.832   14758.561182  3839404.338    3362.727083  4.62
CountedLoopCastIV.loop_iv_int   thrpt   30   2048  ops/s   618114.174   36999.511727  2890853.495     416.969862  4.67
CountedLoopCastIV.loop_iv_long  thrpt   30   1024  ops/s  1063902.078    4616.608855  1314828.963    1267.470199  1.23
CountedLoopCastIV.loop_iv_long  thrpt   30   1536  ops/s   714538.178     630.085477   870801.472     753.347684  1.21
CountedLoopCastIV.loop_iv_long  thrpt   30   2048  ops/s   536724.086     131.313178   652775.363     539.107806  1.21


The error term is larger than before, but I don't think this is caused by the large variance of loop iterations. Does the new benchmark look fine to you? Thanks!
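
As a quick sanity check on the numbers, the Gain column appears to be the throughput ratio After / Before, truncated (not rounded) to two decimals. The sketch below recomputes two of the rows under that assumption; the class and method names are hypothetical and not part of the PR.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class GainCheck {
    // Hypothetical helper: throughput ratio After / Before,
    // truncated to two decimals (assumed from the reported values).
    static String gain(String before, String after) {
        return new BigDecimal(after)
                .divide(new BigDecimal(before), 2, RoundingMode.DOWN)
                .toPlainString();
    }

    public static void main(String[] args) {
        // loop_iv_int, limit 1024
        System.out.println(gain("1225620.536", "5778120.132"));
        // loop_iv_long, limit 2048
        System.out.println(gain("536724.086", "652775.363"));
    }
}
```

Under this assumption the two calls print 4.71 and 1.21, matching the reported Gain values for those rows.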

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25539#issuecomment-2939030428


More information about the hotspot-compiler-dev mailing list