RFR: 8316918: Optimize conversions duplicated across phi nodes [v2]
Jasmine Karthikeyan
jkarthikeyan at openjdk.org
Wed Oct 4 16:46:55 UTC 2023
> Hi all,
>
> I've created this changeset, which introduces a minor optimization that de-duplicates primitive type conversion nodes behind a phi by replacing them with a single conversion after the phi. It also cleans up the conversion node classes by introducing a common superclass to host shared behavior. The transformation is beneficial because it reduces the size of the IR and of the generated code, and the pattern is fairly frequent. Most notably, array creation with a non-constant size parameter contains a duplicated ConvI2L in a branch; once the transformation is applied, the compiler can recognize that there is only one unique input and remove the entire branch. In the future, I would like to apply this transformation more generally to other kinds of pure operations with shared inputs, but I figured this was a good starting point.
>
> Here are some performance benchmarks from my (Zen 3) machine:
>
>                                                      Baseline                    Patch                    Improvement
> Benchmark                                 Mode  Cnt    Score    Error  Units       Score    Error  Units
> PhiDuplicatedConversion.testDouble2Float  avgt   12  679.987 ± 29.678  ns/op  /  592.162 ± 11.354  ns/op  + 12.9%
> PhiDuplicatedConversion.testDouble2Int    avgt   12  737.388 ± 24.690  ns/op  /  651.517 ± 12.950  ns/op  + 11.6%
> PhiDuplicatedConversion.testDouble2Long   avgt   12  685.582 ± 24.236  ns/op  /  662.577 ± 16.498  ns/op  +  3.3%
> PhiDuplicatedConversion.testFloat2Double  avgt   12  670.812 ± 22.945  ns/op  /  641.940 ± 15.954  ns/op  +  4.3%
> PhiDuplicatedConversion.testFloat2Int     avgt   12  703.796 ± 21.627  ns/op  /  652.882 ± 14.300  ns/op  +  7.2%
> PhiDuplicatedConversion.testFloat2Long    avgt   12  682.821 ± 22.023  ns/op  /  651.343 ± 13.281  ns/op  +  4.6%
> PhiDuplicatedConversion.testInt2Double    avgt   12  694.062 ± 15.567  ns/op  /  637.920 ±  8.959  ns/op  +  8.0%
> PhiDuplicatedConversion.testInt2Float     avgt   12  709.544 ± 20.454  ns/op  /  637.696 ±  7.011  ns/op  + 10.1%
> PhiDuplicatedConversion.testInt2Long      avgt   12  660.117 ± 22.712  ns/op  /  637.106 ± 10.776  ns/op  +  3.4%
> PhiDuplicatedConversion.testLong2Double   avgt   12  666.147 ± 18.828  ns/op  /  635.747 ±  6.524  ns/op  +  4.5%
> PhiDuplicatedConversion.testLong2Float    avgt   12  675.239 ± 16.210  ns/op  /  640.328 ±  6.551  ns/op  +  5.1%
> PhiDuplicatedConversion.testLong2Int      avgt   12  665.644 ± 13.507  ns/op  /  637.952 ± 10.94...
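
For readers unfamiliar with the pattern in the quoted description, here is a minimal Java sketch of source code that would plausibly give rise to it. The class and method names are hypothetical and are not taken from the patch or its benchmark. `before` performs the same int-to-long conversion on both sides of a branch, which is the kind of shape where a phi would merge two ConvI2L nodes; `after` is a hand-written equivalent with a single conversion placed after the merge. Whether C2 actually rewrites one form into the other depends on the JVM build, so this only illustrates the idea.

```java
public class PhiDuplicatedConversionSketch {
    // Both arms apply the same int -> long conversion, so the merge point
    // joins two converted values (assumed to map to a phi over two ConvI2L nodes).
    static long before(boolean flag, int a, int b) {
        long result;
        if (flag) {
            result = (long) a;
        } else {
            result = (long) b;
        }
        return result;
    }

    // Hand-written equivalent: select the int first, then convert once.
    // This mirrors "a single conversion after the phi" at the source level.
    static long after(boolean flag, int a, int b) {
        int selected = flag ? a : b;
        return (long) selected;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int a : samples) {
            for (int b : samples) {
                // The two forms must agree for every input, since the
                // conversion is pure and identical on both paths.
                if (before(true, a, b) != after(true, a, b)
                        || before(false, a, b) != after(false, a, b)) {
                    throw new AssertionError("mismatch for a=" + a + ", b=" + b);
                }
            }
        }
        System.out.println("before and after agree on all samples");
    }
}
```

The two methods are interchangeable only because the conversion is the same on both paths and has no side effects, which matches the description's restriction to pure operations with shared behavior.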
Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision:
Fix test failure on UseAVX=0
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/16036/files
- new: https://git.openjdk.org/jdk/pull/16036/files/7eab5657..69e87f71
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=16036&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=16036&range=00-01
Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/16036.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/16036/head:pull/16036
PR: https://git.openjdk.org/jdk/pull/16036