RFR: 8300865: C2: product reduction in ProdRed_Double is not vectorized [v4]
Emanuel Peter
epeter at openjdk.org
Wed May 31 06:33:57 UTC 2023
On Wed, 31 May 2023 00:55:22 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Add jmh test case
>
> The performance numbers on my desktop are:
> Baseline runs; no vectorization happens, even with superword enabled:
> Benchmark                              (COUNT)  (seed)  Mode  Cnt    Score   Error  Units
> VectorReduction.NoSuperword.mulRedD        512       0  avgt    4  435.795 ± 0.082  ns/op
> VectorReduction.WithSuperword.mulRedD      512       0  avgt    4  434.154 ± 0.042  ns/op
>
> With the PR, the reduction succeeds and the loop is vectorized when superword is enabled:
> Benchmark                              (COUNT)  (seed)  Mode  Cnt    Score   Error  Units
> VectorReduction.NoSuperword.mulRedD        512       0  avgt    4  435.897 ± 0.137  ns/op
> VectorReduction.WithSuperword.mulRedD      512       0  avgt    4  405.479 ± 1.896  ns/op
@sviswa7 Thanks for adding the benchmark. The win is small (roughly 7% with superword enabled, 434.154 vs. 405.479 ns/op), but that was to be expected: the double reduction has to be performed in linear order, so the chain of dependent multiplications still has quite a large latency.
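
To illustrate why the linear order matters, here is a minimal sketch. The class and method names are hypothetical and are not taken from the PR or from the benchmark source. It compares a strict left-to-right double product with a reassociated version that keeps four independent partial products. The reassociated form has a shorter dependency chain, but floating-point multiplication is not associative, so it is not an equivalent rewrite.

```java
// Sketch only: class and method names are hypothetical, not taken from the PR.
class ProdRedSketch {
    // Strict left-to-right product: every multiply depends on the previous one.
    static double mulRedStrict(double[] a) {
        double r = 1.0;
        for (int i = 0; i < a.length; i++) {
            r *= a[i];
        }
        return r;
    }

    // Reassociated product with four independent partial products.
    // Shorter dependency chain, but not equivalent for doubles in general.
    static double mulRedReassociated(double[] a) {
        double p0 = 1.0, p1 = 1.0, p2 = 1.0, p3 = 1.0;
        int i = 0;
        for (; i + 3 < a.length; i += 4) {
            p0 *= a[i];
            p1 *= a[i + 1];
            p2 *= a[i + 2];
            p3 *= a[i + 3];
        }
        double r = (p0 * p1) * (p2 * p3);
        // Multiply in any leftover elements when the length is not a multiple of 4.
        for (; i < a.length; i++) {
            r *= a[i];
        }
        return r;
    }

    public static void main(String[] args) {
        // Large and small factors alternate, so the strict order never overflows.
        double[] a = {1e200, 1e-200, 1.0, 1.0, 1e200, 1e-200, 1.0, 1.0};
        System.out.println("strict:       " + mulRedStrict(a));
        System.out.println("reassociated: " + mulRedReassociated(a));
    }
}
```

For this input the strict product stays finite, while the reassociated one is NaN: the first partial product overflows to infinity, the second underflows to zero, and infinity times zero is NaN. This is an extreme case chosen to make the difference obvious; in typical inputs the two orders differ only in rounding.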
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14065#issuecomment-1569572955
More information about the hotspot-compiler-dev mailing list