RFR: 8303238: Create generalizations for existing LShift ideal transforms [v3]

Jasmine K. duke at openjdk.org
Thu Mar 9 22:47:52 UTC 2023


> Hello,
> I would like to generalize two ideal transforms for bitwise shifts. Left shift nodes currently perform the transformations `(x >> C1) << C2 => x & (-1 << C2)` and `((x >> C1) & Y) << C2 => x & (Y << C2)`, but only in the case where `C1 == C2`. However, both rules can also improve cases where the constants aren't equal, by removing one of the shifts and replacing it with a bitwise and. This transformation is profitable because typically more bitwise ands can be dispatched per cycle than bit shifts. In addition, the strength reduction from a shift to a bitwise and can allow more profitable transformations to occur. These patterns are found throughout the JDK, mainly around strings and OW2 ASM. I've attached some profiling results from my (Zen 2) machine below:
> 
>                                                  Baseline                           Patch              Improvement
> Benchmark                            Mode  Cnt    Score     Error  Units      Score    Error  Units
> LShiftNodeIdealize.testRgbaToAbgr    avgt   15    63.287 ±  1.770  ns/op  /  54.199 ±  1.408  ns/op     + 14.36%
> LShiftNodeIdealize.testShiftAndInt   avgt   15   874.564 ± 15.334  ns/op  / 538.408 ± 11.768  ns/op     + 38.44%
> LShiftNodeIdealize.testShiftAndLong  avgt   15  1017.466 ± 29.010  ns/op  / 701.356 ± 18.258  ns/op     + 31.07%
> LShiftNodeIdealize.testShiftInt      avgt   15   663.865 ± 14.226  ns/op  / 533.588 ±  9.949  ns/op     + 19.63%
> LShiftNodeIdealize.testShiftInt2     avgt   15   658.976 ± 32.856  ns/op  / 649.871 ± 10.598  ns/op     +  1.38%
> LShiftNodeIdealize.testShiftLong     avgt   15   815.540 ± 14.721  ns/op  / 689.270 ± 14.028  ns/op     + 15.48%
> LShiftNodeIdealize.testShiftLong2    avgt   15   817.936 ± 23.573  ns/op  / 810.185 ± 14.983  ns/op     +  0.95%
> 
> 
> In addition, in the process of making this PR I've found a missing ideal transform for `RShiftLNode`, so right shifts by large constants (such as `x >> 65`) are not properly folded down, the way they are for `RShiftINode` and `URShiftLNode`. I'll address this in a future RFR.
> 
> Testing: GHA, tier1 local, and performance testing
> 
> Thanks,
> Jasmine K
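
As a rough illustration of the rewrites described in the quoted message, the following Java sketch compares the two shift patterns against one-shift-plus-mask forms for `int` values. It is one possible reading of the generalization, not code from the patch: the class and method names are hypothetical, the handling of `C1 > C2` and `C1 < C2` is an assumption about how the remaining shift would be formed, and the constants are assumed to lie in [0, 31].

```java
public class ShiftIdentities {
    // Original pattern: (x >> c1) << c2, with c1 and c2 assumed in [0, 31].
    public static int shiftPair(int x, int c1, int c2) {
        return (x >> c1) << c2;
    }

    // Assumed rewrite: at most one shift, followed by a mask that clears
    // the low c2 bits. For c1 == c2 this is the existing x & (-1 << c2).
    public static int shiftPairRewritten(int x, int c1, int c2) {
        int mask = -1 << c2;
        if (c1 == c2) return x & mask;
        if (c1 > c2) return (x >> (c1 - c2)) & mask;
        return (x << (c2 - c1)) & mask;
    }

    // Original pattern: ((x >> c1) & y) << c2.
    public static int shiftAnd(int x, int y, int c1, int c2) {
        return ((x >> c1) & y) << c2;
    }

    // Assumed rewrite: same remaining shift as above, masked with (y << c2).
    // For c1 == c2 this is the existing x & (y << c2).
    public static int shiftAndRewritten(int x, int y, int c1, int c2) {
        int mask = y << c2;
        if (c1 == c2) return x & mask;
        if (c1 > c2) return (x >> (c1 - c2)) & mask;
        return (x << (c2 - c1)) & mask;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x12345678, 0x80000000, 0x7FFFFFFF, -12345};
        // Exhaustively compare both forms over all shift pairs in [0, 31].
        for (int x : samples) {
            for (int y : samples) {
                for (int c1 = 0; c1 < 32; c1++) {
                    for (int c2 = 0; c2 < 32; c2++) {
                        if (shiftPair(x, c1, c2) != shiftPairRewritten(x, c1, c2)) {
                            throw new AssertionError("shiftPair mismatch");
                        }
                        if (shiftAnd(x, y, c1, c2) != shiftAndRewritten(x, y, c1, c2)) {
                            throw new AssertionError("shiftAnd mismatch");
                        }
                    }
                }
            }
        }
        System.out.println("checked " + samples.length + " sample values");
    }
}
```

The sketch only checks value equivalence in plain Java; it says nothing about how C2 represents the nodes or about the performance figures in the table above.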

Jasmine K. has updated the pull request incrementally with one additional commit since the last revision:

  Update comments in IR tests

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/12734/files
  - new: https://git.openjdk.org/jdk/pull/12734/files/bd161561..9a6ff3c4

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12734&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12734&range=01-02

  Stats: 6 lines in 2 files changed: 1 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/12734.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12734/head:pull/12734

PR: https://git.openjdk.org/jdk/pull/12734

