RFR: 8342393: Promote commutative vector IR node sharing [v13]
Jatin Bhateja
jbhateja at openjdk.org
Wed Jan 22 12:29:02 UTC 2025
> This patch promotes the sharing of commutative vector IR nodes that have the same inputs but in a different order.
> Unlike scalar IR, where we swap edges by [sorting inputs](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/addnode.cpp#L122) based on node indices during IR idealization, for vector IR we chose a simpler approach: commutative operations are decorated with a special node-level flag during IR construction,
> obviating any dependency on explicit idealization routines. The flag is later used during GVN hashing to enable node sharing.
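The idea described above can be illustrated with a small, self-contained Java sketch. It is not the patch's C++ code: the class `CommutativeGvnSketch`, its `Node` type, and the string-based table key are hypothetical names invented for illustration, and building an order-insensitive key is only one assumed way a commutative flag could feed into value numbering. The opcode strings are used purely as labels.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of value numbering with a per-node commutative flag.
final class CommutativeGvnSketch {

    // Stand-in for an IR node: an opcode, two input node ids, and the flag.
    static final class Node {
        final String opcode;
        final int in1;
        final int in2;
        final boolean commutative;

        Node(String opcode, int in1, int in2, boolean commutative) {
            this.opcode = opcode;
            this.in1 = in1;
            this.in2 = in2;
            this.commutative = commutative;
        }
    }

    private final Map<String, Node> table = new HashMap<>();

    // Builds the lookup key. For flagged nodes the inputs are ordered,
    // so (a, b) and (b, a) produce the same key. Unflagged nodes keep
    // their original input order.
    static String keyOf(Node n) {
        int first = n.in1;
        int second = n.in2;
        if (n.commutative && first > second) {
            first = n.in2;
            second = n.in1;
        }
        return n.opcode + ":" + first + ":" + second;
    }

    // Returns an existing equivalent node if one is in the table,
    // otherwise records and returns the given node.
    Node valueNumber(Node n) {
        Node existing = table.putIfAbsent(keyOf(n), n);
        return existing != null ? existing : n;
    }

    // True if a node with swapped inputs is replaced by the first node.
    static boolean sharesAfterSwap(String opcode, boolean commutative) {
        CommutativeGvnSketch gvn = new CommutativeGvnSketch();
        Node first = gvn.valueNumber(new Node(opcode, 10, 20, commutative));
        Node swapped = gvn.valueNumber(new Node(opcode, 20, 10, commutative));
        return first == swapped;
    }

    public static void main(String[] args) {
        System.out.println("AddVI shared: " + sharesAfterSwap("AddVI", true));
        System.out.println("SubVI shared: " + sharesAfterSwap("SubVI", false));
    }
}
```

In this sketch no input edges are rewritten on the nodes themselves; only the table lookup treats the two orderings as equal, which mirrors the stated goal of avoiding a separate idealization step.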
>
> The following are the performance stats for the JMH microbenchmark included with the patch.
>
>
> Granite Rapids (P-core Xeon Server)
> Baseline:
> Benchmark                                                                (size)   Mode  Cnt      Score  Error  Units
> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing     1024  thrpt    2   8982.549         ops/ms
> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing      1024  thrpt    2   6072.773         ops/ms
> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing     1024  thrpt    2   2368.856         ops/ms
> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing    1024  thrpt    2  15215.087         ops/ms
>
> With opt:
> Benchmark                                                                (size)   Mode  Cnt      Score  Error  Units
> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing     1024  thrpt    2  11963.554         ops/ms
> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing      1024  thrpt    2   7036.088         ops/ms
> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing     1024  thrpt    2   2906.731         ops/ms
> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing    1024  thrpt    2  17148.131         ops/ms
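For readers who want the relative gain rather than raw throughput, the Granite Rapids numbers above can be turned into speedup ratios with a few lines of Java. The class `SpeedupSketch` and its method are hypothetical helpers, not part of the patch or its JMH benchmark. The scores are copied from the two tables. With Cnt = 2 and no reported error, the ratios are indicative only.

```java
// Hypothetical helper: computes opt/baseline throughput ratios
// from the Granite Rapids scores quoted in the message.
final class SpeedupSketch {

    // Ratio of optimized to baseline throughput (higher is better for thrpt mode).
    static double speedup(double baseline, double optimized) {
        return optimized / baseline;
    }

    public static void main(String[] args) {
        String[] types = {"Byte", "Int", "Long", "Short"};
        double[] baseline = {8982.549, 6072.773, 2368.856, 15215.087};
        double[] withOpt = {11963.554, 7036.088, 2906.731, 17148.131};

        for (int i = 0; i < types.length; i++) {
            System.out.printf("%-5s %.2fx%n", types[i], speedup(baseline[i], withOpt[i]));
        }
    }
}
```

On these figures the ratios fall between roughly 1.13x and 1.33x, with the byte case gaining the most.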
>
> Sierra Forest (E-core Xeon Server)
> Baseline:
> Benchmark                                                                (size)   Mode  Cnt      Score  Error  Units
> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing     1024  thrpt    2   2444.359         ops/ms
> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing      1024  thrpt    2   1710.256         ops/ms
> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing     1024  thrpt    2    308.766         ops/ms
> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing    1024  thrpt ...
Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision:
- Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342393
- Review suggestions incorporated.
- Adding functional verification to test points using Verify.checkEQ
- Review suggestions incorporated
- Generalizing vector size constraints covering different AVX levels and KNLSetting
- GHA fix
- Review comments resolutions.
- removing spaces
- Adding functional and performance tests
- Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342393
- ... and 1 more: https://git.openjdk.org/jdk/compare/7f8f263b...f21e30f1
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/22863/files
- new: https://git.openjdk.org/jdk/pull/22863/files/a5c35a9d..f21e30f1
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=22863&range=12
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=22863&range=11-12
Stats: 52085 lines in 3070 files changed: 19712 ins; 21103 del; 11270 mod
Patch: https://git.openjdk.org/jdk/pull/22863.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/22863/head:pull/22863
PR: https://git.openjdk.org/jdk/pull/22863