RFR: 8250808: Re-associate loop invariants with other associative operations

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Fri Aug 7 16:45:56 UTC 2020


> Webrev: http://cr.openjdk.java.net/~xgong/rfr/8250808/webrev.00/

Looks good.

So far, testing results look good (hs-tier1/2 are clean, tier1-4 are in 
progress).

Best regards,
Vladimir Ivanov

> C2 already re-associates loop invariants. However, the current implementation
> only supports re-association for add and subtract on the 32-bit integer type.
> Re-association also applies to other associative operations, such as
> multiplication and the bitwise logical operations, and to operations on the
> long type.
> 
> This patch adds the missing re-associations for other associative operations
> together with the support for long type.
> 
> With this patch, the following expressions:
>    (x * inv1) * inv2
>    (x | inv1) | inv2
>    (x & inv1) & inv2
>    (x ^ inv1) ^ inv2         ; inv1, inv2 are invariants
> 
> can be re-associated to:
>    x * (inv1 * inv2)         ; "inv1 * inv2" can be hoisted
>    x | (inv1 | inv2)         ; "inv1 | inv2" can be hoisted
>    x & (inv1 & inv2)         ; "inv1 & inv2" can be hoisted
>    x ^ (inv1 ^ inv2)         ; "inv1 ^ inv2" can be hoisted
> 
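
For readers skimming the thread, the transformation quoted above can be pictured at the source level. The following is only a hand-written sketch: the class and method names are hypothetical, it is not taken from the webrev or from the linked LoopInvariant.java, and C2 performs the rewrite on its internal graph rather than on Java source. It relies on the fact that Java int and long multiplication wraps on overflow, so regrouping the operands does not change the result.

```java
public class ReassociationSketch {
    // Shape before re-association: two multiplies per iteration.
    static int mulBefore(int[] a, int inv1, int inv2) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] * inv1) * inv2;
        }
        return sum;
    }

    // Shape after re-association, written by hand: "inv1 * inv2" is
    // loop invariant, so it is computed once outside the loop.
    static int mulAfter(int[] a, int inv1, int inv2) {
        int hoisted = inv1 * inv2;
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * hoisted;
        }
        return sum;
    }

    // Same idea for a bitwise operation on the long type.
    static long xorBefore(long[] a, long inv1, long inv2) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] ^ inv1) ^ inv2;
        }
        return sum;
    }

    static long xorAfter(long[] a, long inv1, long inv2) {
        long hoisted = inv1 ^ inv2;
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] ^ hoisted;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] ints = {1, -7, 123456789, Integer.MAX_VALUE};
        long[] longs = {1L, -7L, 1234567890123L, Long.MIN_VALUE};
        // Both comparisons check that the two shapes agree, including on overflow.
        System.out.println(mulBefore(ints, 31, -17) == mulAfter(ints, 31, -17));
        System.out.println(xorBefore(longs, 0x55L, -3L) == xorAfter(longs, 0x55L, -3L));
    }
}
```

The "after" methods do one operation per iteration instead of two, which is the saving the benchmark numbers below are measuring.
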
> Performance:
> Here is the micro benchmark:
> http://cr.openjdk.java.net/~xgong/rfr/8250808/LoopInvariant.java
> 
> And the results on X86_64:
> Before:
> Benchmark             (length)  Mode  Cnt     Score    Error  Units
> loopInvariantAddLong      1024  avgt   15   988.142 ±  0.110  ns/op
> loopInvariantAndInt       1024  avgt   15   843.850 ±  0.522  ns/op
> loopInvariantAndLong      1024  avgt   15   990.551 ± 10.458  ns/op
> loopInvariantMulInt       1024  avgt   15  1209.003 ±  0.247  ns/op
> loopInvariantMulLong      1024  avgt   15  1213.923 ±  0.438  ns/op
> loopInvariantOrInt        1024  avgt   15   843.908 ±  0.132  ns/op
> loopInvariantOrLong       1024  avgt   15   990.710 ± 10.484  ns/op
> loopInvariantSubLong      1024  avgt   15   988.170 ±  0.159  ns/op
> loopInvariantXorInt       1024  avgt   15   806.949 ±  7.860  ns/op
> loopInvariantXorLong      1024  avgt   15   990.963 ±  8.321  ns/op
> 
> After:
> Benchmark             (length)  Mode  Cnt     Score    Error  Units
> loopInvariantAddLong      1024  avgt   15   842.854 ±  9.036  ns/op
> loopInvariantAndInt       1024  avgt   15   698.097 ±  0.916  ns/op
> loopInvariantAndLong      1024  avgt   15   841.120 ±  0.118  ns/op
> loopInvariantMulInt       1024  avgt   15   691.000 ±  7.696  ns/op
> loopInvariantMulLong      1024  avgt   15   846.907 ±  0.189  ns/op
> loopInvariantOrInt        1024  avgt   15   698.423 ±  4.969  ns/op
> loopInvariantOrLong       1024  avgt   15   843.465 ± 10.196  ns/op
> loopInvariantSubLong      1024  avgt   15   841.314 ±  2.906  ns/op
> loopInvariantXorInt       1024  avgt   15   652.529 ±  0.556  ns/op
> loopInvariantXorLong      1024  avgt   15   841.860 ±  2.491  ns/op
> 
> Results on AArch64:
> Before:
> Benchmark             (length)  Mode  Cnt     Score    Error  Units
> loopInvariantAddLong      1024  avgt   15   514.437 ±  0.351  ns/op
> loopInvariantAndInt       1024  avgt   15   435.301 ±  0.415  ns/op
> loopInvariantAndLong      1024  avgt   15   572.437 ±  0.057  ns/op
> loopInvariantMulInt       1024  avgt   15  1154.544 ±  0.030  ns/op
> loopInvariantMulLong      1024  avgt   15  1188.109 ±  0.299  ns/op
> loopInvariantOrInt        1024  avgt   15   435.605 ±  0.977  ns/op
> loopInvariantOrLong       1024  avgt   15   572.475 ±  0.093  ns/op
> loopInvariantSubLong      1024  avgt   15   514.340 ±  0.154  ns/op
> loopInvariantXorInt       1024  avgt   15   426.186 ±  0.105  ns/op
> loopInvariantXorLong      1024  avgt   15   572.505 ±  0.259  ns/op
> 
> After:
> Benchmark             (length)  Mode  Cnt     Score    Error  Units
> loopInvariantAddLong      1024  avgt   15   508.179 ±  0.108  ns/op
> loopInvariantAndInt       1024  avgt   15   394.706 ±  0.199  ns/op
> loopInvariantAndLong      1024  avgt   15   434.443 ±  0.247  ns/op
> loopInvariantMulInt       1024  avgt   15   762.477 ±  0.079  ns/op
> loopInvariantMulLong      1024  avgt   15   775.975 ±  0.159  ns/op
> loopInvariantOrInt        1024  avgt   15   394.657 ±  0.156  ns/op
> loopInvariantOrLong       1024  avgt   15   434.428 ±  0.282  ns/op
> loopInvariantSubLong      1024  avgt   15   507.475 ±  0.151  ns/op
> loopInvariantXorInt       1024  avgt   15   396.000 ±  0.011  ns/op
> loopInvariantXorLong      1024  avgt   15   434.255 ±  0.099  ns/op
> 
> Tests:
> Tested jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1
> and jcstress:tests-custom; all tests pass with no new failures.
> 
> Thanks,
> Xiaohong Gong
> 


More information about the hotspot-compiler-dev mailing list