RFR: 8281453: New optimization: convert "c-(~x)" into "x+(c+1)" and "~(c-x)" into "x+(-c-1)" [v7]

Vladimir Kozlov kvn at openjdk.java.net
Wed Apr 13 18:00:20 UTC 2022


On Wed, 13 Apr 2022 16:56:40 GMT, Zhiqiang Zang <duke at openjdk.java.net> wrote:

>> Similar to `(~x)+c` -> `(c-1)-x` and `~(x+c)` -> `(-c-1)-x` in #6858, we can also introduce similar optimizations for subtraction, `c-(~x)` -> `x+(c+1)` and `~(c-x)` -> `x+(-c-1)`.
>> 
>> The results of the microbenchmark are as follows:
>> 
>> Baseline:
>> Benchmark                        Mode  Cnt  Score   Error  Units
>> SubIdealCMinusNotX.baselineInt   avgt   60  0.504 ± 0.011  ns/op
>> SubIdealCMinusNotX.baselineLong  avgt   60  0.484 ± 0.004  ns/op
>> SubIdealCMinusNotX.testInt1      avgt   60  0.779 ± 0.004  ns/op
>> SubIdealCMinusNotX.testInt2      avgt   60  0.896 ± 0.004  ns/op
>> SubIdealCMinusNotX.testLong1     avgt   60  0.722 ± 0.004  ns/op
>> SubIdealCMinusNotX.testLong2     avgt   60  0.720 ± 0.005  ns/op
>> 
>> Patch:
>> Benchmark                        Mode  Cnt  Score   Error  Units
>> SubIdealCMinusNotX.baselineInt   avgt   60  0.487 ± 0.009  ns/op
>> SubIdealCMinusNotX.baselineLong  avgt   60  0.486 ± 0.009  ns/op
>> SubIdealCMinusNotX.testInt1      avgt   60  0.372 ± 0.010  ns/op
>> SubIdealCMinusNotX.testInt2      avgt   60  0.365 ± 0.003  ns/op
>> SubIdealCMinusNotX.testLong1     avgt   60  0.369 ± 0.004  ns/op
>> SubIdealCMinusNotX.testLong2     avgt   60  0.399 ± 0.016  ns/op
>
> Zhiqiang Zang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:
> 
>  - merge master.
>  - clean.
>  - merge tests into XXXINodeIdealizationTests
>  - clean.
>  - Merge branch 'master'.
>  - convert ~x into -1-x when ~x is part of Add and Sub.
>  - include bug id.
>  - include a microbenmark.
>  - Convert c-(~x) into x+(c+1) in SubNode and convert ~(c-x) into x+(-c-1) in XorNode.

The optimization you proposed does not match the RFE description and title.

You only do: `~x` or `(x ^ (-1))` -> `(-1 - x)`
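
For reference, the identities under discussion can be checked with a small standalone sketch. This is only an illustration in plain Java, not the C2 code from the patch; the class and method names (`NotIdentityCheck`, `notAsSub`, `cMinusNotX`, `notCMinusX`) are made up for this example. It relies on Java `int` arithmetic wrapping in two's complement, so the equalities also hold at the overflow boundaries.

```java
// Illustrative only: names here are hypothetical and not part of the patch.
class NotIdentityCheck {
    // ~x rewritten as a subtraction: in two's complement, ~x == -1 - x.
    static int notAsSub(int x) {
        return -1 - x;
    }

    // c - (~x) rewritten as x + (c + 1).
    static int cMinusNotX(int c, int x) {
        return x + (c + 1);
    }

    // ~(c - x) rewritten as x + (-c - 1).
    static int notCMinusX(int c, int x) {
        return x + (-c - 1);
    }

    public static void main(String[] args) {
        // Includes the extremes so that wrap-around cases are exercised.
        int[] samples = {0, 1, -1, 42, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : samples) {
            if (~x != notAsSub(x)) {
                throw new AssertionError("~x != -1 - x for x=" + x);
            }
            for (int c : samples) {
                if (c - (~x) != cMinusNotX(c, x)) {
                    throw new AssertionError("c-(~x) mismatch: c=" + c + ", x=" + x);
                }
                if (~(c - x) != notCMinusX(c, x)) {
                    throw new AssertionError("~(c-x) mismatch: c=" + c + ", x=" + x);
                }
            }
        }
        System.out.println("all identities hold");
    }
}
```

Once `~x` is expressed as `-1 - x`, both forms from the title follow from ordinary constant folding of the surrounding add or subtract.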

As a result, this should be an ideal transformation on Xor nodes. I don't even think you need such a transformation if `rhs` and `lhs` are not constants, because I assume the `XOR` and `SUB` hardware instructions have the same latency.

I suggest you redo the performance testing after you merge the #7795 changes.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7376

