RFR: 8328528: C2 should optimize long-typed parallel iv in an int counted loop [v24]

Christian Hagedorn chagedorn at openjdk.org
Mon Oct 21 06:24:01 UTC 2024


On Fri, 18 Oct 2024 21:04:38 GMT, Kangcheng Xu <kxu at openjdk.org> wrote:

>> Currently, parallel iv optimization only happens in an int counted loop with int-typed parallel ivs. This PR adds support for optimizing long-typed ivs as well.
>> 
>> Additionally, this ticket contributes to the resolution of [JDK-8275913](https://bugs.openjdk.org/browse/JDK-8275913). Meanwhile, I'm working on adding support for parallel IV replacement for long counted loops which will depend on this PR.
>
> Kangcheng Xu has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update comment pseudo code, improve readability with explicit skip

src/hotspot/share/opto/loopnode.cpp line 3973:

> 3971: // conversions are required:
> 3972: //
> 3973: //     long iv2 = ((long) phi * stride_con2 / stride_con) + (init2 - ((long) init * stride_con2 / stride_con))

Thanks for updating the example. I guess we should use `iv` consistently; I missed doing that in my earlier example:
Suggestion:

//    int a = init2;
//    for (int iv = init; iv < limit; iv += stride_con) {
//      a += stride_con2;
//    }
//
// and transforms it to:
//
//    int iv2 = init2
//    int iv = init
//    loop:
//      if ( iv >= limit ) goto exit
//      iv += stride_con
//      iv2 = init2 + (iv - init) * (stride_con2 / stride_con)
//      goto loop
//    exit:
//    ...
//
// Such transformation introduces more optimization opportunities. In this
// particular example, the loop can be eliminated entirely given that
// `stride_con2 / stride_con` is exact (i.e., no remainder). Checks are in
// place to only perform this optimization if such a division is exact. This
// example will be transformed into its semantic equivalent:
//
//     int iv2 = (iv * stride_con2 / stride_con) + (init2 - (init * stride_con2 / stride_con))
//
// which corresponds to the structure of the transformed subgraph.
//
// However, if there is a mismatch between types of the loop and the parallel
// induction variable (e.g., a long-typed IV in an int-typed loop), type
// conversions are required:
//
//     long iv2 = ((long) iv * stride_con2 / stride_con) + (init2 - ((long) init * stride_con2 / stride_con))
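
As a sanity check of the formula above, here is a minimal Java sketch that runs the original loop with a long-typed parallel iv in an int counted loop and compares the result against the closed form. It is not part of the patch, and `ParallelIvSketch`, `runLoop`, and `closedForm` are hypothetical names. It assumes a positive `stride_con`, no overflow of `iv`, and that `stride_con2 / stride_con` is computed up front, which is only one reading of the expression in the comment.

```java
public class ParallelIvSketch {
    // Runs the original loop and returns { final iv, final parallel iv }.
    // Assumes strideCon > 0 and that iv does not overflow before reaching limit.
    public static long[] runLoop(int init, int limit, int strideCon,
                                 long init2, long strideCon2) {
        long a = init2;
        int iv = init;
        for (; iv < limit; iv += strideCon) {
            a += strideCon2;
        }
        return new long[] { iv, a };
    }

    // Closed form of the parallel iv for a given value of iv. The (long)
    // casts mirror the int-to-long conversions mentioned in the comment.
    // Only valid when strideCon2 is an exact multiple of strideCon.
    public static long closedForm(int iv, int init, int strideCon,
                                  long init2, long strideCon2) {
        if (strideCon == 0 || strideCon2 % strideCon != 0) {
            throw new IllegalArgumentException("stride_con2 / stride_con must be exact");
        }
        long ratio = strideCon2 / strideCon;
        return (long) iv * ratio + (init2 - (long) init * ratio);
    }

    public static void main(String[] args) {
        long[] result = runLoop(3, 20, 2, 100L, 6L);
        long viaLoop = result[1];
        long viaFormula = closedForm((int) result[0], 3, 2, 100L, 6L);
        System.out.println("loop: " + viaLoop + ", closed form: " + viaFormula);
    }
}
```

With an inexact ratio (e.g. `stride_con = 2`, `stride_con2 = 3`) the sketch throws instead of computing a value, which loosely mirrors the check that makes C2 skip the optimization in that case.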

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18489#discussion_r1808152983

