RFR: 8250808: Re-associate loop invariants with other associative operations

Tobias Hartmann tobias.hartmann at oracle.com
Mon Aug 10 07:52:35 UTC 2020


+1

Best regards,
Tobias

On 07.08.20 18:45, Vladimir Ivanov wrote:
> 
>> Webrev: http://cr.openjdk.java.net/~xgong/rfr/8250808/webrev.00/
> 
> Looks good.
> 
> So far, testing results look good (hs-tier1/2 are clean, tier1-4 are in progress).
> 
> Best regards,
> Vladimir Ivanov
> 
>> C2 already re-associates loop invariants. However, the current implementation
>> only supports re-association for add and subtract on the 32-bit integer type.
>> Re-association also applies to other associative expressions, such as
>> multiplication and the logical operations, and to operations on the long type.
>>
>> This patch adds the missing re-associations for other associative operations
>> together with the support for long type.
>>
>> With this patch, the following expressions:
>>    (x * inv1) * inv2
>>    (x | inv1) | inv2
>>    (x & inv1) & inv2
>>    (x ^ inv1) ^ inv2         ; inv1, inv2 are invariants
>>
>> can be re-associated to:
>>    x * (inv1 * inv2)         ; "inv1 * inv2" can be hoisted
>>    x | (inv1 | inv2)         ; "inv1 | inv2" can be hoisted
>>    x & (inv1 & inv2)         ; "inv1 & inv2" can be hoisted
>>    x ^ (inv1 ^ inv2)         ; "inv1 ^ inv2" can be hoisted
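
For readers following along, here is a minimal source-level sketch of what the rewrite listed above amounts to. It is written by hand for illustration only: C2 does the equivalent on its IR, and the class and method names are hypothetical, not taken from the webrev or from LoopInvariant.java. The two forms agree because Java int and long multiplication wrap on overflow and are therefore associative, as are the bitwise operations.

```java
// Hand-written illustration; names are hypothetical, not from the patch.
final class ReassociationSketch {

    // Shape before re-association: two multiplies per iteration.
    static int mulBefore(int[] a, int inv1, int inv2) {
        int sum = 0;
        for (int x : a) {
            sum += (x * inv1) * inv2;
        }
        return sum;
    }

    // Shape after re-association: "inv1 * inv2" is computed once, outside the loop.
    static int mulAfter(int[] a, int inv1, int inv2) {
        int hoisted = inv1 * inv2;
        int sum = 0;
        for (int x : a) {
            sum += x * hoisted;
        }
        return sum;
    }

    // Same idea for a logical operation on the long type.
    static long xorBefore(long[] a, long inv1, long inv2) {
        long sum = 0;
        for (long x : a) {
            sum += (x ^ inv1) ^ inv2;
        }
        return sum;
    }

    static long xorAfter(long[] a, long inv1, long inv2) {
        long hoisted = inv1 ^ inv2;
        long sum = 0;
        for (long x : a) {
            sum += x ^ hoisted;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] ints = {1, -7, Integer.MAX_VALUE, 12345};
        long[] longs = {1L, -7L, Long.MAX_VALUE, 12345L};
        // Both comparisons hold even when the intermediate products overflow.
        System.out.println(mulBefore(ints, Integer.MAX_VALUE, 31) == mulAfter(ints, Integer.MAX_VALUE, 31));
        System.out.println(xorBefore(longs, 0x5DEECE66DL, -1L) == xorAfter(longs, 0x5DEECE66DL, -1L));
    }
}
```

The same reasoning would not carry over to float or double, where multiplication is not associative.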
>>
>> Performance:
>> Here is the microbenchmark:
>> http://cr.openjdk.java.net/~xgong/rfr/8250808/LoopInvariant.java
>>
>> And the results on X86_64:
>> Before:
>> Benchmark             (length)  Mode  Cnt     Score    Error  Units
>> loopInvariantAddLong      1024  avgt   15   988.142 ±  0.110  ns/op
>> loopInvariantAndInt       1024  avgt   15   843.850 ±  0.522  ns/op
>> loopInvariantAndLong      1024  avgt   15   990.551 ± 10.458  ns/op
>> loopInvariantMulInt       1024  avgt   15  1209.003 ±  0.247  ns/op
>> loopInvariantMulLong      1024  avgt   15  1213.923 ±  0.438  ns/op
>> loopInvariantOrInt        1024  avgt   15   843.908 ±  0.132  ns/op
>> loopInvariantOrLong       1024  avgt   15   990.710 ± 10.484  ns/op
>> loopInvariantSubLong      1024  avgt   15   988.170 ±  0.159  ns/op
>> loopInvariantXorInt       1024  avgt   15   806.949 ±  7.860  ns/op
>> loopInvariantXorLong      1024  avgt   15   990.963 ±  8.321  ns/op
>>
>> After:
>> Benchmark             (length)  Mode  Cnt     Score    Error  Units
>> loopInvariantAddLong      1024  avgt   15   842.854 ±  9.036  ns/op
>> loopInvariantAndInt       1024  avgt   15   698.097 ±  0.916  ns/op
>> loopInvariantAndLong      1024  avgt   15   841.120 ±  0.118  ns/op
>> loopInvariantMulInt       1024  avgt   15   691.000 ±  7.696  ns/op
>> loopInvariantMulLong      1024  avgt   15   846.907 ±  0.189  ns/op
>> loopInvariantOrInt        1024  avgt   15   698.423 ±  4.969  ns/op
>> loopInvariantOrLong       1024  avgt   15   843.465 ± 10.196  ns/op
>> loopInvariantSubLong      1024  avgt   15   841.314 ±  2.906  ns/op
>> loopInvariantXorInt       1024  avgt   15   652.529 ±  0.556  ns/op
>> loopInvariantXorLong      1024  avgt   15   841.860 ±  2.491  ns/op
>>
>> Results on AArch64:
>> Before:
>> Benchmark             (length)  Mode  Cnt     Score    Error  Units
>> loopInvariantAddLong      1024  avgt   15   514.437 ±  0.351  ns/op
>> loopInvariantAndInt       1024  avgt   15   435.301 ±  0.415  ns/op
>> loopInvariantAndLong      1024  avgt   15   572.437 ±  0.057  ns/op
>> loopInvariantMulInt       1024  avgt   15  1154.544 ±  0.030  ns/op
>> loopInvariantMulLong      1024  avgt   15  1188.109 ±  0.299  ns/op
>> loopInvariantOrInt        1024  avgt   15   435.605 ±  0.977  ns/op
>> loopInvariantOrLong       1024  avgt   15   572.475 ±  0.093  ns/op
>> loopInvariantSubLong      1024  avgt   15   514.340 ±  0.154  ns/op
>> loopInvariantXorInt       1024  avgt   15   426.186 ±  0.105  ns/op
>> loopInvariantXorLong      1024  avgt   15   572.505 ±  0.259  ns/op
>>
>> After:
>> Benchmark             (length)  Mode  Cnt     Score    Error  Units
>> loopInvariantAddLong      1024  avgt   15   508.179 ±  0.108  ns/op
>> loopInvariantAndInt       1024  avgt   15   394.706 ±  0.199  ns/op
>> loopInvariantAndLong      1024  avgt   15   434.443 ±  0.247  ns/op
>> loopInvariantMulInt       1024  avgt   15   762.477 ±  0.079  ns/op
>> loopInvariantMulLong      1024  avgt   15   775.975 ±  0.159  ns/op
>> loopInvariantOrInt        1024  avgt   15   394.657 ±  0.156  ns/op
>> loopInvariantOrLong       1024  avgt   15   434.428 ±  0.282  ns/op
>> loopInvariantSubLong      1024  avgt   15   507.475 ±  0.151  ns/op
>> loopInvariantXorInt       1024  avgt   15   396.000 ±  0.011  ns/op
>> loopInvariantXorLong      1024  avgt   15   434.255 ±  0.099  ns/op
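
To put the tables above in relative terms, a small sketch like the following could be used. The scores are copied from the quoted results (avgt in ns/op, lower is better); the class and method names are hypothetical and the choice of rows is arbitrary.

```java
// Hypothetical helper for reading the quoted JMH tables; not part of the patch.
final class ScoreReduction {

    // Fraction by which the average time per operation dropped.
    static double reduction(double before, double after) {
        return (before - after) / before;
    }

    public static void main(String[] args) {
        // Scores copied from the Before/After tables quoted in this mail.
        System.out.printf("x86_64  MulInt : %.1f%%%n", 100 * reduction(1209.003, 691.000));
        System.out.printf("x86_64  AddLong: %.1f%%%n", 100 * reduction(988.142, 842.854));
        System.out.printf("AArch64 MulInt : %.1f%%%n", 100 * reduction(1154.544, 762.477));
        System.out.printf("AArch64 AddLong: %.1f%%%n", 100 * reduction(514.437, 508.179));
    }
}
```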
>>
>> Tests:
>> Ran jtreg hotspot::hotspot_all_no_apps, jdk::jdk_core, langtools::tier1
>> and jcstress:tests-custom; all tests pass with no new failures.
>>
>> Thanks,
>> Xiaohong Gong
>>


More information about the hotspot-compiler-dev mailing list