RFR: 8250808: Re-associate loop invariants with other associative operations

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Mon Aug 10 08:30:59 UTC 2020


>> Webrev: http://cr.openjdk.java.net/~xgong/rfr/8250808/webrev.00/
> 
> Looks good.
> 
> So far, testing results look good (hs-tier1/2 are clean, tier1-4 are in 
> progress).

FYI test results are clean.

Best regards,
Vladimir Ivanov

>> C2 already re-associates loop invariants. However, the current
>> implementation only supports re-association of add and subtract on
>> 32-bit integers. The same transformation applies to other associative
>> operations, such as multiplication and the bitwise logical operations,
>> and to all of these operations on long values.
>>
>> This patch adds the missing re-associations for these operations,
>> together with support for the long type.
>>
>> With this patch, the following expressions:
>>    (x * inv1) * inv2
>>    (x | inv1) | inv2
>>    (x & inv1) & inv2
>>    (x ^ inv1) ^ inv2         ; inv1, inv2 are invariants
>>
>> can be re-associated to:
>>    x * (inv1 * inv2)         ; "inv1 * inv2" can be hoisted
>>    x | (inv1 | inv2)         ; "inv1 | inv2" can be hoisted
>>    x & (inv1 & inv2)         ; "inv1 & inv2" can be hoisted
>>    x ^ (inv1 ^ inv2)         ; "inv1 ^ inv2" can be hoisted
>>
>> Performance:
>> Here is the micro benchmark:
>> http://cr.openjdk.java.net/~xgong/rfr/8250808/LoopInvariant.java
>>
>> And the results on X86_64:
>> Before:
>> Benchmark             (length)  Mode  Cnt     Score      Error  Units
>> loopInvariantAddLong      1024  avgt   15   988.142 ±  0.110  ns/op
>> loopInvariantAndInt       1024  avgt   15   843.850 ±  0.522  ns/op
>> loopInvariantAndLong      1024  avgt   15   990.551 ± 10.458  ns/op
>> loopInvariantMulInt       1024  avgt   15  1209.003 ±  0.247  ns/op
>> loopInvariantMulLong      1024  avgt   15  1213.923 ±  0.438  ns/op
>> loopInvariantOrInt        1024  avgt   15   843.908 ±  0.132  ns/op
>> loopInvariantOrLong       1024  avgt   15   990.710 ± 10.484  ns/op
>> loopInvariantSubLong      1024  avgt   15   988.170 ±  0.159  ns/op
>> loopInvariantXorInt       1024  avgt   15   806.949 ±  7.860  ns/op
>> loopInvariantXorLong      1024  avgt   15   990.963 ±  8.321  ns/op
>>
>> After:
>> Benchmark             (length)  Mode  Cnt     Score      Error  Units
>> loopInvariantAddLong      1024  avgt   15   842.854 ±  9.036  ns/op
>> loopInvariantAndInt       1024  avgt   15   698.097 ±  0.916  ns/op
>> loopInvariantAndLong      1024  avgt   15   841.120 ±  0.118  ns/op
>> loopInvariantMulInt       1024  avgt   15   691.000 ±  7.696  ns/op
>> loopInvariantMulLong      1024  avgt   15   846.907 ±  0.189  ns/op
>> loopInvariantOrInt        1024  avgt   15   698.423 ±  4.969  ns/op
>> loopInvariantOrLong       1024  avgt   15   843.465 ± 10.196  ns/op
>> loopInvariantSubLong      1024  avgt   15   841.314 ±  2.906  ns/op
>> loopInvariantXorInt       1024  avgt   15   652.529 ±  0.556  ns/op
>> loopInvariantXorLong      1024  avgt   15   841.860 ±  2.491  ns/op
>>
>> Results on AArch64:
>> Before:
>> Benchmark             (length)  Mode  Cnt     Score      Error  Units
>> loopInvariantAddLong      1024  avgt   15   514.437 ±  0.351  ns/op
>> loopInvariantAndInt       1024  avgt   15   435.301 ±  0.415  ns/op
>> loopInvariantAndLong      1024  avgt   15   572.437 ±  0.057  ns/op
>> loopInvariantMulInt       1024  avgt   15  1154.544 ±  0.030  ns/op
>> loopInvariantMulLong      1024  avgt   15  1188.109 ±  0.299  ns/op
>> loopInvariantOrInt        1024  avgt   15   435.605 ±  0.977  ns/op
>> loopInvariantOrLong       1024  avgt   15   572.475 ±  0.093  ns/op
>> loopInvariantSubLong      1024  avgt   15   514.340 ±  0.154  ns/op
>> loopInvariantXorInt       1024  avgt   15   426.186 ±  0.105  ns/op
>> loopInvariantXorLong      1024  avgt   15   572.505 ±  0.259  ns/op
>>
>> After:
>> Benchmark             (length)  Mode  Cnt     Score      Error  Units
>> loopInvariantAddLong      1024  avgt   15   508.179 ±  0.108  ns/op
>> loopInvariantAndInt       1024  avgt   15   394.706 ±  0.199  ns/op
>> loopInvariantAndLong      1024  avgt   15   434.443 ±  0.247  ns/op
>> loopInvariantMulInt       1024  avgt   15   762.477 ±  0.079  ns/op
>> loopInvariantMulLong      1024  avgt   15   775.975 ±  0.159  ns/op
>> loopInvariantOrInt        1024  avgt   15   394.657 ±  0.156  ns/op
>> loopInvariantOrLong       1024  avgt   15   434.428 ±  0.282  ns/op
>> loopInvariantSubLong      1024  avgt   15   507.475 ±  0.151  ns/op
>> loopInvariantXorInt       1024  avgt   15   396.000 ±  0.011  ns/op
>> loopInvariantXorLong      1024  avgt   15   434.255 ±  0.099  ns/op
>>
>> Tests:
>> Ran jtreg hotspot::hotspot_all_no_apps,jdk::jdk_core,langtools::tier1
>> and jcstress:tests-custom; all tests pass with no new failures.
>>
>> Thanks,
>> Xiaohong Gong
>>
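
For readers skimming the thread, the re-association described in the quoted
message can be illustrated at the source level. The sketch below is not the
linked LoopInvariant.java benchmark and not the C2 implementation; the class
and method names are hypothetical. It only writes out by hand, for the int
multiply and long xor cases, the "before" shape and the shape that results
once the invariant sub-expression is hoisted. Java int and long arithmetic
wraps on overflow, so both forms compute the same value.

```java
// Hand-written illustration only; every name in this class is hypothetical.
final class ReassociationSketch {

    // Original shape: (x * inv1) * inv2, two multiplies per iteration.
    static int mulBefore(int[] xs, int inv1, int inv2) {
        int sum = 0;
        for (int x : xs) {
            sum += (x * inv1) * inv2;
        }
        return sum;
    }

    // Re-associated shape: x * (inv1 * inv2), with the invariant product
    // computed once outside the loop, leaving one multiply per iteration.
    static int mulAfter(int[] xs, int inv1, int inv2) {
        int hoisted = inv1 * inv2;
        int sum = 0;
        for (int x : xs) {
            sum += x * hoisted;
        }
        return sum;
    }

    // Same idea for a long-typed logical operation: (x ^ inv1) ^ inv2.
    static long xorBefore(long[] xs, long inv1, long inv2) {
        long acc = 0;
        for (long x : xs) {
            acc += (x ^ inv1) ^ inv2;
        }
        return acc;
    }

    // Re-associated shape: x ^ (inv1 ^ inv2), invariant part hoisted.
    static long xorAfter(long[] xs, long inv1, long inv2) {
        long hoisted = inv1 ^ inv2;
        long acc = 0;
        for (long x : xs) {
            acc += x ^ hoisted;
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] ints = {1, -7, 123_456_789, Integer.MAX_VALUE};
        long[] longs = {1L, -7L, Long.MIN_VALUE, 0x123456789ABCDEFL};
        // Both comparisons hold because wrapping multiply and xor are associative.
        System.out.println(mulBefore(ints, 31, 1_000_003) == mulAfter(ints, 31, 1_000_003));
        System.out.println(xorBefore(longs, 0xFFL, 0xF0F0L) == xorAfter(longs, 0xFFL, 0xF0F0L));
    }
}
```

With the patch applied, the expectation from the quoted description is that
C2 performs this rewrite itself on the "before" shape, so source code does
not need to be changed by hand.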

