RFR: 8316918: Optimize conversions duplicated across phi nodes

Tobias Hartmann thartmann at openjdk.org
Wed Oct 4 11:57:36 UTC 2023


On Tue, 3 Oct 2023 23:33:37 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:

> Hi all,
> 
> I've created this changeset, which introduces a minor optimization that de-duplicates primitive type conversion nodes behind a phi by replacing them with a single conversion that follows the phi. In addition, it cleans up the conversion node classes by introducing a common superclass to host shared behavior. The transformation is beneficial because it reduces the size of the IR and of the generated code, and the pattern is fairly frequent. Most notably, array creation with a non-constant size parameter contains a duplicated ConvI2L in a branch; once the transformation is applied, the entire branch can be removed, because the compiler can then see that the phi has only one unique input. In the future, I would like to apply this transformation more generally to other kinds of pure operations with shared inputs, but I figured that this is a good starting point.
> 
> Here are some performance benchmarks from my (Zen 3) machine:
> 
>                                                           Baseline                     Patch          Improvement
> Benchmark                                 Mode   Cnt  Score    Error  Units      Score    Error  Units
> PhiDuplicatedConversion.testDouble2Float  avgt   12  679.987 ± 29.678 ns/op  /  592.162 ± 11.354 ns/op  + 12.9%
> PhiDuplicatedConversion.testDouble2Int    avgt   12  737.388 ± 24.690 ns/op  /  651.517 ± 12.950 ns/op  + 11.6%
> PhiDuplicatedConversion.testDouble2Long   avgt   12  685.582 ± 24.236 ns/op  /  662.577 ± 16.498 ns/op  + 3.3%
> PhiDuplicatedConversion.testFloat2Double  avgt   12  670.812 ± 22.945 ns/op  /  641.940 ± 15.954 ns/op  + 4.3%
> PhiDuplicatedConversion.testFloat2Int     avgt   12  703.796 ± 21.627 ns/op  /  652.882 ± 14.300 ns/op  + 7.2%
> PhiDuplicatedConversion.testFloat2Long    avgt   12  682.821 ± 22.023 ns/op  /  651.343 ± 13.281 ns/op  + 4.6%
> PhiDuplicatedConversion.testInt2Double    avgt   12  694.062 ± 15.567 ns/op  /  637.920 ±  8.959 ns/op  + 8.0%
> PhiDuplicatedConversion.testInt2Float     avgt   12  709.544 ± 20.454 ns/op  /  637.696 ±  7.011 ns/op  + 10.1%
> PhiDuplicatedConversion.testInt2Long      avgt   12  660.117 ± 22.712 ns/op  /  637.106 ± 10.776 ns/op  + 3.4%
> PhiDuplicatedConversion.testLong2Double   avgt   12  666.147 ± 18.828 ns/op  /  635.747 ±  6.524 ns/op  + 4.5%
> PhiDuplicatedConversion.testLong2Float    avgt   12  675.239 ± 16.210 ns/op  /  640.328 ±  6.551 ns/op  + 5.1%
> PhiDuplicatedConversion.testLong2Int      avgt   12  665.644 ± 13.507 ns/op  /  637.952 ± 10.94...
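
For readers unfamiliar with the pattern described in the quoted text, here is a minimal Java sketch of its source-level shape. The class and method names are hypothetical and do not come from the patch or its benchmark. The first method converts in each branch, so the merge point sees two conversions; the second merges the int values first and converts once, which is roughly the form the optimization is described as producing in the IR.

```java
// Hypothetical illustration only: names are not taken from the changeset.
class PhiConversionSketch {
    // Each branch performs its own int-to-long conversion, so the value
    // merged after the if/else comes from two separate conversions.
    static long convertInBranches(boolean cond, int a, int b) {
        long result;
        if (cond) {
            result = (long) a;
        } else {
            result = (long) b;
        }
        return result;
    }

    // Equivalent form: merge the int values first, then convert once.
    static long convertAfterMerge(boolean cond, int a, int b) {
        int merged = cond ? a : b;
        return (long) merged;
    }

    public static void main(String[] args) {
        // Both forms return the same value for every input.
        System.out.println(convertInBranches(true, 7, -3) == convertAfterMerge(true, 7, -3));
        System.out.println(convertInBranches(false, 7, -3) == convertAfterMerge(false, 7, -3));
    }
}
```

Because the conversion is a pure operation, both methods compute the same result; only the number of conversion operations differs.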

Thanks for the contribution! I haven't looked at the changes yet, but I ran some quick tests.

`compiler/c2/irTests/TestPhiDuplicatedConversion.java` fails with `-XX:UseAVX=0`:


Failed IR Rules (2) of Methods (2)
----------------------------------
1) Method "public static short compiler.c2.irTests.TestPhiDuplicatedConversion.float2HalfFloat(boolean,float,float)" - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={DEFAULT}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#CONV#_", "1"}, applyIfAnd={}, failOn={}, applyIfOr={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(Conv.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 0 = 1 [given]
           - No nodes matched!

2) Method "public static float compiler.c2.irTests.TestPhiDuplicatedConversion.halfFloat2Float(boolean,short,short)" - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={DEFAULT}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#CONV#_", "1"}, applyIfAnd={}, failOn={}, applyIfOr={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(Conv.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 0 = 1 [given]
           - No nodes matched!
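
To make the failed constraint easier to read, here is a small Java sketch that applies the regex from "Constraint 1" to a made-up graph dump. The regex is copied from the report; the dump lines, class name, and helper method are assumptions for illustration and are not real PrintIdeal output. The rule expects exactly one line naming a `Conv...` node, while the report shows that none was found with `-XX:UseAVX=0`. The report itself does not say why the node is missing.

```java
import java.util.regex.Pattern;

// Hypothetical illustration: the dump lines below are invented, not real compiler output.
class IrConstraintSketch {
    // Regex copied verbatim from "Constraint 1" in the failure report.
    static final Pattern CONV =
            Pattern.compile("(\\d+(\\s){2}(Conv.*)+(\\s){2}===.*)");

    // Counts the lines of a dump that contain a match for the constraint.
    static long countMatches(String dump) {
        long count = 0;
        for (String line : dump.split("\n")) {
            if (CONV.matcher(line).find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up dump with one conversion node: satisfies a count of "1".
        String withConv = " 42  ConvF2HF  === _ 10  [[ 50 ]]\n"
                        + " 43  AddI  === _ 11 12  [[ 51 ]]";
        // Made-up dump without any conversion node: the count is 0.
        String withoutConv = " 43  AddI  === _ 11 12  [[ 51 ]]";
        System.out.println(countMatches(withConv));
        System.out.println(countMatches(withoutConv));
    }
}
```

The second dump mirrors the reported situation, where the comparison `[found] 0 = 1 [given]` fails because no line matches.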

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16036#issuecomment-1746722176

