RFR: 8275317: AArch64: Support some type conversion vectorization in SLP
Tobias Hartmann
thartmann at openjdk.java.net
Wed Nov 10 14:24:35 UTC 2021
On Thu, 28 Oct 2021 03:39:42 GMT, Fei Gao <duke at openjdk.java.net> wrote:
> The current SLP vectorizer in the C2 compiler doesn't support type
> conversion operations, but AArch64 has vector type conversion
> instructions in both NEON and SVE.
>
> Type conversion involves two scenarios: conversion between the same
> data sizes and conversion between different data sizes. To support
> casts between different data sizes, we need to amend the code that
> identifies adjacent memory references and the code that judges
> whether the combination is profitable. I suppose it would be easier
> to review if we split the task of supporting type conversion into
> two separate patches, one for the same data sizes and the other for
> different data sizes. The goal of this patch is just to support
> conversions within the same data size, including:
> int -> float
> float -> int
> long -> double
> double -> long
>
> A typical test case:
>
> for (int i = start; i < limit; i++) {
>     b[i] = (float) a[i];
> }
>
> To implement it, the patch adds the necessary instructions and
> matching rules in the backend and the SLP implementation in the
> middle end.
>
> The percentage performance uplift on an AArch64 system:
> Mode: avgt
> Cnt: 15
> Metric: (ns/op)
>
> Benchmark                Percentage change [(After - Before) / Before]
> VectorLoop.convertD2L    -48.46%
> VectorLoop.convertF2I    -55.67%
> VectorLoop.convertI2F    -55.27%
> VectorLoop.convertL2D    -48.75%
That looks good to me, but x86 supports vector instructions for these operations as well, right? Or am I missing something?
https://github.com/openjdk/jdk/blob/55b36c6f3bb7eb066daaf41f9eba46633afedf08/src/hotspot/cpu/x86/x86.ad#L6701
Do you have perf numbers for x86?
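For reference, the four same-size conversion loops from the quoted description could be sketched roughly as below. This is only an illustration, not the benchmark from the PR: the class name `ConvertLoops`, the array size, and the warm-up count are assumptions, and the method names merely mirror the benchmark names quoted above. Whether C2 actually vectorizes these loops depends on the platform and the JDK build.

```java
public class ConvertLoops {
    // int -> float, same 32-bit element size
    static void convertI2F(int[] a, float[] b) {
        for (int i = 0; i < a.length; i++) {
            b[i] = (float) a[i];
        }
    }

    // float -> int, same 32-bit element size (truncates toward zero)
    static void convertF2I(float[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            b[i] = (int) a[i];
        }
    }

    // long -> double, same 64-bit element size
    static void convertL2D(long[] a, double[] b) {
        for (int i = 0; i < a.length; i++) {
            b[i] = (double) a[i];
        }
    }

    // double -> long, same 64-bit element size (truncates toward zero)
    static void convertD2L(double[] a, long[] b) {
        for (int i = 0; i < a.length; i++) {
            b[i] = (long) a[i];
        }
    }

    public static void main(String[] args) {
        final int n = 1024;          // assumed array length
        int[] ints = new int[n];
        float[] floats = new float[n];
        long[] longs = new long[n];
        double[] doubles = new double[n];
        for (int i = 0; i < n; i++) {
            ints[i] = i;
            longs[i] = i;
        }
        // Repeat the calls so the JIT has a chance to compile the loops;
        // the iteration count is an arbitrary assumption.
        for (int iter = 0; iter < 20_000; iter++) {
            convertI2F(ints, floats);
            convertF2I(floats, ints);
            convertL2D(longs, doubles);
            convertD2L(doubles, longs);
        }
        // Print a value derived from the results so the work is not dead code.
        System.out.println(ints[n - 1] + longs[n - 1]);
    }
}
```

A plain `main` like this only exercises the loops; for numbers comparable to the table above, the loops would need to run under a proper benchmark harness.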
-------------
PR: https://git.openjdk.java.net/jdk/pull/6145