RFR: 8328528: C2 should optimize long-typed parallel iv in an int counted loop [v24]
Christian Hagedorn
chagedorn at openjdk.org
Mon Oct 21 06:24:01 UTC 2024
On Fri, 18 Oct 2024 21:04:38 GMT, Kangcheng Xu <kxu at openjdk.org> wrote:
>> Currently, parallel iv optimization only happens in an int counted loop with int-typed parallel ivs. This PR adds support for optimizing long-typed parallel ivs as well.
>>
>> Additionally, this ticket contributes to the resolution of [JDK-8275913](https://bugs.openjdk.org/browse/JDK-8275913). Meanwhile, I'm working on adding support for parallel IV replacement for long counted loops which will depend on this PR.
>
> Kangcheng Xu has updated the pull request incrementally with one additional commit since the last revision:
>
> update comment pseudo code, improve readability with explicit skip
src/hotspot/share/opto/loopnode.cpp line 3973:
> 3971: // conversions are required:
> 3972: //
> 3973: // long iv2 = ((long) phi * stride_con2 / stride_con) + (init2 - ((long) init * stride_con2 / stride_con))
Thanks for updating the example. I think we should use `iv` consistently; I missed doing that in my earlier example:
Suggestion:
// int a = init2;
// for (int iv = init; iv < limit; iv += stride_con) {
//   a += stride_con2;
// }
//
// and transforms it to:
//
// int iv2 = init2
// int iv = init
// loop:
//   if (iv >= limit) goto exit
//   iv += stride_con
//   iv2 = init2 + (iv - init) * (stride_con2 / stride_con)
//   goto loop
// exit:
// ...
//
// Such transformation introduces more optimization opportunities. In this
// particular example, the loop can be eliminated entirely given that
// `stride_con2 / stride_con` is exact (i.e., no remainder). Checks are in
// place to only perform this optimization if such a division is exact. This
// example will be transformed into its semantic equivalent:
//
// int iv2 = (iv * stride_con2 / stride_con) + (init2 - (init * stride_con2 / stride_con))
//
// which corresponds to the structure of the transformed subgraph.
//
// However, if there is a mismatch between the types of the loop and the parallel
// induction variable (e.g., a long-typed IV in an int-typed loop), type
// conversions are required:
//
// long iv2 = ((long) iv * stride_con2 / stride_con) + (init2 - ((long) init * stride_con2 / stride_con))
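
As a quick sanity check of the long-typed formula above, here is a minimal plain-Java sketch. It is not C2 code and says nothing about how the transformation is implemented on the ideal graph. The class and method names (`ParallelIvSketch`, `byLoop`, `exitIv`, `fromIv`) are hypothetical, and the sketch assumes a positive `stride_con`, an exact `stride_con2 / stride_con`, and values small enough that nothing overflows.

```java
// Hypothetical illustration only: compares the original loop with the
// closed-form expression from the comment, for a long iv2 in an int loop.
final class ParallelIvSketch {

    // Original shape: iv2 is advanced in lock-step with the int counter iv.
    static long byLoop(int init, int limit, int strideCon, long init2, long strideCon2) {
        checkStrides(strideCon, strideCon2);
        long iv2 = init2;
        for (int iv = init; iv < limit; iv += strideCon) {
            iv2 += strideCon2;
        }
        return iv2;
    }

    // Value of the int counter once the loop above has exited.
    static int exitIv(int init, int limit, int strideCon) {
        checkStrides(strideCon, strideCon);
        int iv = init;
        while (iv < limit) {
            iv += strideCon;
        }
        return iv;
    }

    // Closed form from the comment, with the (long) conversions applied
    // before the multiplication.
    static long fromIv(int iv, int init, int strideCon, long init2, long strideCon2) {
        checkStrides(strideCon, strideCon2);
        return ((long) iv * strideCon2 / strideCon)
                + (init2 - ((long) init * strideCon2 / strideCon));
    }

    // Assumptions of this sketch: positive loop stride and an exact ratio.
    private static void checkStrides(int strideCon, long strideCon2) {
        if (strideCon <= 0) {
            throw new IllegalArgumentException("sketch assumes stride_con > 0");
        }
        if (strideCon2 % strideCon != 0) {
            throw new IllegalArgumentException("stride_con2 / stride_con must be exact");
        }
    }

    public static void main(String[] args) {
        int init = 3, limit = 20, strideCon = 2;
        long init2 = 5_000_000_000L, strideCon2 = 6;

        long expected = byLoop(init, limit, strideCon, init2, strideCon2);
        long actual = fromIv(exitIv(init, limit, strideCon), init, strideCon, init2, strideCon2);

        System.out.println(expected + " " + actual); // prints "5000000054 5000000054"
    }
}
```

With these inputs the loop runs 9 times and exits with `iv == 21`, so both computations give `init2 + 9 * stride_con2`. The starting value is deliberately larger than `Integer.MAX_VALUE` to show why `iv2` has to be a long here.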
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18489#discussion_r1808152983