RFR: 8342393: Promote commutative vector IR node sharing [v26]
Vladimir Ivanov
vlivanov at openjdk.org
Mon Feb 24 06:34:05 UTC 2025
On Thu, 20 Feb 2025 09:47:50 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> This patch promotes sharing of commutative vector IR nodes that have the same inputs but in a different order.
>> This is similar to scalar IR, where we perform edge swapping by [sorting inputs](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/addnode.cpp#L122) based on node indices during IR idealization.
>>
>> The following are the performance stats for the JMH microbenchmark included with the patch.
>>
>>
>> Granite Rapids (P-core Xeon Server)
>> Baseline :
>> Benchmark                                                                (size)   Mode  Cnt      Score  Error   Units
>> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing     1024  thrpt    2   8982.549         ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing      1024  thrpt    2   6072.773         ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing     1024  thrpt    2   2368.856         ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing    1024  thrpt    2  15215.087         ops/ms
>>
>> Withopt:
>> Benchmark                                                                (size)   Mode  Cnt      Score  Error   Units
>> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing     1024  thrpt    2  11963.554         ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing      1024  thrpt    2   7036.088         ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing     1024  thrpt    2   2906.731         ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing    1024  thrpt    2  17148.131         ops/ms
>>
>> Sierra Forest (E-core Xeon Server)
>> Baseline:
>> Benchmark                                                                (size)   Mode  Cnt      Score  Error   Units
>> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing     1024  thrpt    2   2444.359         ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing      1024  thrpt    2   1710.256         ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing     1024  thrpt    2    308.766         ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing    1024  thrpt    2   3902.179         ops/ms
>>
>> Withopt:
>> Benchmark (size) Mode Cnt Score Error Units
>> VectorCommutativeOperSharingBenchmark.com...
>
> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits:
>
> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342393
> - Safety assertion added
> - Review resolutions
> - Lowering feature check to IR annotation level
> - Adding missed feature check
> - Review comments resolutions.
> - Modifed scheme not based over fragile node level flags base solution.
> - Updating comments for clarity
> - Adding a missed check to skip over commoning of predicated vector operations
> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342393
> - ... and 10 more: https://git.openjdk.org/jdk/compare/1e87ff01...acb613da
Looks even better!
src/hotspot/share/opto/vectornode.cpp line 1101:
> 1099:
> 1100: // Sort inputs of commutative non-predicated vector operations to help value numbering.
> 1101: if (should_swap_inputs_to_help_global_value_numbering()) {
It reads way too verbose to me.
I'd just shape it as:
    // Sort inputs of commutative vector operations to help value numbering.
    if (is_commutative()) {
      if (in(1)->_idx > in(2)->_idx) {
        swap_edges(1, 2);
      }
    }
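
To illustrate why sorting inputs helps value numbering, here is a minimal standalone sketch. It is not HotSpot code: `ToyNode`, `ToyGraph`, `ToyOpcode` and the `sort_inputs` flag are hypothetical names made up for this example, and the lookup table is only a rough stand-in for a value-numbering hash table keyed on opcode and input indices.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <tuple>
#include <utility>
#include <vector>

// Toy IR node (hypothetical, not HotSpot's Node): idx is unique per node.
struct ToyNode {
    int idx;
    int opcode;
    ToyNode* in1;
    ToyNode* in2;
};

// Hypothetical opcodes: one commutative vector op, one non-commutative.
enum ToyOpcode { OP_LEAF = 0, OP_ADD_V = 1, OP_SUB_V = 2 };

inline bool is_commutative(int opcode) { return opcode == OP_ADD_V; }

class ToyGraph {
public:
    ToyNode* leaf() { return make(OP_LEAF, nullptr, nullptr); }

    // Returns an existing node with the same (opcode, in1, in2) key if one
    // exists, otherwise creates a new node. With sort_inputs set, inputs of
    // commutative ops are first ordered by index, so that op(a, b) and
    // op(b, a) produce the same key.
    ToyNode* binary(int opcode, ToyNode* a, ToyNode* b, bool sort_inputs) {
        if (sort_inputs && is_commutative(opcode) && a->idx > b->idx) {
            std::swap(a, b);
        }
        auto key = std::make_tuple(opcode, a->idx, b->idx);
        auto it = table_.find(key);
        if (it != table_.end()) {
            return it->second;  // shared: no new node is created
        }
        ToyNode* n = make(opcode, a, b);
        table_[key] = n;
        return n;
    }

    std::size_t size() const { return nodes_.size(); }

private:
    ToyNode* make(int opcode, ToyNode* a, ToyNode* b) {
        int idx = static_cast<int>(nodes_.size());
        nodes_.push_back(std::make_unique<ToyNode>(ToyNode{idx, opcode, a, b}));
        return nodes_.back().get();
    }

    std::vector<std::unique_ptr<ToyNode>> nodes_;
    std::map<std::tuple<int, int, int>, ToyNode*> table_;
};
```

In this sketch, requesting `binary(OP_ADD_V, a, b, true)` and then `binary(OP_ADD_V, b, a, true)` returns the same node, while with `sort_inputs` set to `false` the two requests yield two distinct nodes. The non-commutative `OP_SUB_V` is never reordered, so its two operand orders always stay separate.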
-------------
Marked as reviewed by vlivanov (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/22863#pullrequestreview-2636095051
PR Review Comment: https://git.openjdk.org/jdk/pull/22863#discussion_r1967092266