RFR: 8300865: C2: product reduction in ProdRed_Double is not vectorized [v4]
Sandhya Viswanathan
sviswanathan at openjdk.org
Wed May 31 00:57:56 UTC 2023
On Wed, 31 May 2023 00:47:56 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> This PR fixes the problem with double reduction on x86_64.
>>
>> In the test compiler.loopopts.superword.ProdRed_Double, the product reduction loop in prodReductionImplement() was not getting vectorized when run as follows:
>> jtreg -XX:CompileCommand=PrintAssembly,compiler.loopopts.superword.ProdRed_Double::prodReductionImplement compiler/loopopts/superword/ProdRed_Double.java
>> The PrintAssembly output in the pid-xxx.log file in the JTwork/scratch directory did not show any vector_reduction_double node.
>>
>> This was happening because ReductionNode::implemented was being called with a vector size of one element. To check whether a vector reduction is implemented, the query needs to use a vector size of at least two elements.
>>
>> With this PR the vector_reduction_double node is generated.
>>
>> Please review.
>>
>> Best Regards,
>> Sandhya
>
> Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision:
>
> Add jmh test case
The performance numbers on my desktop are:

Baseline (without the PR): the loop is not vectorized, even with superword enabled:

Benchmark                              (COUNT)  (seed)  Mode  Cnt    Score   Error  Units
VectorReduction.NoSuperword.mulRedD        512       0  avgt    4  435.795 ± 0.082  ns/op
VectorReduction.WithSuperword.mulRedD      512       0  avgt    4  434.154 ± 0.042  ns/op

With the PR, the reduction check succeeds and the loop is vectorized when superword is enabled (434.154 -> 405.479 ns/op, about 6.6% faster):

Benchmark                              (COUNT)  (seed)  Mode  Cnt    Score   Error  Units
VectorReduction.NoSuperword.mulRedD        512       0  avgt    4  435.897 ± 0.137  ns/op
VectorReduction.WithSuperword.mulRedD      512       0  avgt    4  405.479 ± 1.896  ns/op
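
For readers unfamiliar with the loop shape being discussed, here is a minimal sketch of a double product reduction. It is an illustration only, not the code from ProdRed_Double.java or from the JMH benchmark: the class name ProdRedSketch, the method name prodReduction, and the loop body are assumptions chosen to show the pattern (a single double accumulator multiplied on every iteration).

```java
// Hypothetical illustration of a double product reduction loop.
// This is NOT the actual test or benchmark source; names and the
// loop body are assumptions made for the sake of the example.
public class ProdRedSketch {

    // Multiplies the accumulator by one value per iteration. The
    // loop-carried multiply on 'total' is the "product reduction".
    static double prodReduction(double[] a, double[] b, double total) {
        for (int i = 0; i < a.length; i++) {
            total *= a[i] - b[i];
        }
        return total;
    }

    public static void main(String[] args) {
        double[] a = {3.0, 5.0};
        double[] b = {1.0, 2.0};
        // (3 - 1) * (5 - 2), starting from an accumulator of 1.0
        System.out.println(prodReduction(a, b, 1.0));
    }
}
```

Whether C2 actually vectorizes a loop like this depends on the platform and flags; the numbers above only cover the mulRedD benchmark on my desktop.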
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14065#issuecomment-1569336052