RFR: 8342393: Promote commutative vector IR node sharing [v10]
Emanuel Peter
epeter at openjdk.org
Fri Jan 17 06:38:38 UTC 2025
On Thu, 16 Jan 2025 15:04:56 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Patch promotes the sharing of commutative vector IR with the same inputs but different input ordering.
>> Unlike scalar IR where we perform edge swapping by [sorting inputs](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/addnode.cpp#L122) based on node indices during IR idealization, for vector IR we chose a simpler approach to decorate commutative operations with a special node-level flag during IR construction thus
>> obviating any dependency on explicit idealization routines. This flag is later used during GVN hashing to enable node sharing.
>>
>> Following are the performance stats for JMH micro included with the patch.
>>
>>
>> Granite Rapids (P-core Xeon Server)
>> Baseline :
>> Benchmark (size) Mode Cnt Score Error Units
>> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing 1024 thrpt 2 8982.549 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing 1024 thrpt 2 6072.773 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing 1024 thrpt 2 2368.856 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing 1024 thrpt 2 15215.087 ops/ms
>>
>> Withopt:
>> Benchmark (size) Mode Cnt Score Error Units
>> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing 1024 thrpt 2 11963.554 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing 1024 thrpt 2 7036.088 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing 1024 thrpt 2 2906.731 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing 1024 thrpt 2 17148.131 ops/ms
>>
>> Sierra Forest (E-core Xeon Server)
>> Baseline:
>> Benchmark (size) Mode Cnt Score Error Units
>> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing 1024 thrpt 2 2444.359 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing 1024 thrpt 2 1710.256 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing 1024 thrpt 2 308.766 ops/ms
>> VectorCommutativeOperSharingBenc...
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> Review suggestions incorporated
@vnkozlov @jatin-bhateja Ok, then we can leave it at the current `Flag_is_commutative_vector_op`. @jatin-bhateja please file a follow-up RFE and link it with this issue. You can assign it to me if you want - I will try to find someone to take care of the scalar refactoring.
Please add some result verification to the tests, as commented separately.
Once that is done, I can run testing and approve the PR.
test/hotspot/jtreg/compiler/vectorapi/VectorCommutativeOperSharingTest.java line 123:
> 121: public void testVectorIRSharingDriver2() {
> 122: for (int i = 0; i < I_SPECIES.loopBound(LENGTH); i += I_SPECIES.length()) {
> 123: testVectorIRSharing2(i);
Adding some result verification would be nice. Because now you only check that you get the right number of instructions - but that does not mean that they are connected correctly. Result verification could help here.
See:
https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/lib/verify/Verify.java
https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/testlibrary_tests/verify/examples/TestVerifyInCheckMethod.java
Also, for random value generation we hope to use this in the future:
https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/lib/generators/Generators.java
-------------
PR Review: https://git.openjdk.org/jdk/pull/22863#pullrequestreview-2558073506
PR Review Comment: https://git.openjdk.org/jdk/pull/22863#discussion_r1919627031
More information about the hotspot-compiler-dev
mailing list