RFR: 8342393: Promote commutative vector IR node sharing [v14]
Emanuel Peter
epeter at openjdk.org
Thu Jan 23 10:29:50 UTC 2025
On Thu, 23 Jan 2025 10:12:30 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Patch promotes the sharing of commutative vector IR with the same inputs but different input ordering.
>> Unlike scalar IR where we perform edge swapping by [sorting inputs](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/addnode.cpp#L122) based on node indices during IR idealization, for vector IR we chose a simpler approach to decorate commutative operations with a special node-level flag during IR construction thus
>> obviating any dependency on explicit idealization routines. This flag is later used during GVN hashing to enable node sharing.
>>
>> Following are the performance stats for JMH micro included with the patch.
>>
>>
>> Granite Rapids (P-core Xeon Server)
>> Baseline :
>> Benchmark (size) Mode Cnt Score Error Units
>> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing 1024 thrpt 2 8982.549 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing 1024 thrpt 2 6072.773 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing 1024 thrpt 2 2368.856 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing 1024 thrpt 2 15215.087 ops/ms
>>
>> Withopt:
>> Benchmark (size) Mode Cnt Score Error Units
>> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing 1024 thrpt 2 11963.554 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing 1024 thrpt 2 7036.088 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing 1024 thrpt 2 2906.731 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeShortOperationShairing 1024 thrpt 2 17148.131 ops/ms
>>
>> Sierra Forest (E-core Xeon Server)
>> Baseline:
>> Benchmark (size) Mode Cnt Score Error Units
>> VectorCommutativeOperSharingBenchmark.commutativeByteOperationShairing 1024 thrpt 2 2444.359 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeIntOperationShairing 1024 thrpt 2 1710.256 ops/ms
>> VectorCommutativeOperSharingBenchmark.commutativeLongOperationShairing 1024 thrpt 2 308.766 ops/ms
>> VectorCommutativeOperSharingBenc...
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> Adding a missed check to skip over commoning of predicated vector operations
I'm a little sad that `is_commutative_vector_operation` does not mean that it is actually commutative now. Rather, it means:
"it is commutative if it is not predicated"
I think we need to improve the naming somehow. Because it even slipped by you who work with predicated vector ops all the time - how will someone less familiar understand this and not be misled?
src/hotspot/share/opto/phaseX.cpp line 75:
> 73: // the result of the operation is corresponding lane of its first operand.
> 74: // i.e. RES = VEC1.lanewise(OPER, VEC2, MASK) is semantically equivalent to
> 75: // RES = VEC1.BLEND(VEC1.lanewise(OPER, VEC2), MASK)
I think you are missing an argument to the BLEND, right? `VEC1` should be in there for the `false` case.
src/hotspot/share/opto/vectornode.hpp line 77:
> 75:
> 76: virtual uint hash() const {
> 77: if (is_commutative_vector_operation() && !is_predicated_vector()) {
You should either put the comment here again, or even better move out the condition:
`is_commutative_vector_operation() && !is_predicated_vector()`
-------------
PR Comment: https://git.openjdk.org/jdk/pull/22863#issuecomment-2609433460
PR Review Comment: https://git.openjdk.org/jdk/pull/22863#discussion_r1926728680
PR Review Comment: https://git.openjdk.org/jdk/pull/22863#discussion_r1926732251
More information about the hotspot-compiler-dev
mailing list