RFR: 8283091: Support type conversion between different data sizes in SLP [v3]
Fei Gao
fgao at openjdk.java.net
Thu Jun 2 14:09:21 UTC 2022
On Wed, 25 May 2022 01:13:36 GMT, Fei Gao <fgao at openjdk.org> wrote:
>> @fg1417 Thank you for suggesting this optimization. I see that it was not updated for some time. Do you still intend to work on it?
>>
>> Please update to the latest JDK (Loom was integrated) and run the performance tests again. Also include the % change.
>>
>> I have the same concern as @DamonFool about regressions when vectorizing some conversions. Maybe we should have an additional `Matcher` property we could consult when trying to **auto-vectorize**. I understand that we need `vcvt2Dto2I` when VectorAPI specifically asks for it to be generated, but we should not enforce auto-generation.
>
> @vnkozlov thanks for your review and kind suggestion! I'll update the patch to resolve the potential performance regression.
> @fg1417 I don't see a new update in this PR. Please also show performance numbers with the new changes.
Here is the performance data for the latest patch on different machines, given as the change in ns/op (negative values are improvements).

NEON          perf change (ns/op)
convertB2D    not supported
convertB2F    -45.55%
convertB2L    not supported
convertD2B    not supported
convertD2F    -42.32%
convertD2I    not supported (VectorAPI supported)
convertD2S    not supported
convertF2B    -42.95%
convertF2D    -45.28%
convertF2L    -5.78%
convertF2S    -51.30%
convertI2D    -27.82%
convertI2L    -44.54%
convertL2B    not supported
convertL2F    not supported (VectorAPI supported)
convertL2I    -28.58%
convertL2S    not supported
convertS2D    not supported
convertS2F    -53.37%
convertS2L    not supported

SVE           perf change (ns/op)
convertB2D    -36.15%
convertB2F    -63.48%
convertB2L    -32.48%
convertD2B    0.02%
convertD2F    -47.85%
convertD2I    -46.42%
convertD2S    -32.08%
convertF2B    -59.54%
convertF2D    -60.81%
convertF2L    -61.81%
convertF2S    -67.67%
convertI2D    -60.63%
convertI2L    -57.23%
convertL2B    0.04%
convertL2F    -47.21%
convertL2I    -34.49%
convertL2S    -19.57%
convertS2D    -47.20%
convertS2F    -74.86%
convertS2L    -49.00%

X86           perf change (ns/op)
convertB2D    -64.13%
convertB2F    -79.37%
convertB2L    -70.97%
convertD2B    not supported
convertD2F    -62.69%
convertD2I    not supported
convertD2S    not supported
convertF2B    not supported
convertF2D    -68.90%
convertF2L    not supported
convertF2S    not supported
convertI2D    -87.48%
convertI2L    -69.64%
convertL2B    -3.96%
convertL2F    -0.11%
convertL2I    -49.59%
convertL2S    -24.75%
convertS2D    -84.35%
convertS2F    -86.09%
convertS2L    -70.42%

AVX512        perf change (ns/op)
convertB2D    -78.08%
convertB2F    -86.39%
convertB2L    -79.07%
convertD2B    not supported
convertD2F    -71.86%
convertD2I    not supported
convertD2S    not supported
convertF2B    not supported
convertF2D    -78.17%
convertF2L    not supported
convertF2S    not supported
convertI2D    -90.26%
convertI2L    -79.92%
convertL2B    -70.75%
convertL2F    -86.67%
convertL2I    -80.94%
convertL2S    -71.54%
convertS2D    -90.84%
convertS2F    -83.94%
convertS2L    -80.51%
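
For readers who have not seen the benchmark source, here is a rough sketch of the kind of loop each convertX2Y row refers to: an element-wise copy between two arrays whose element types differ in size, which is the pattern SLP is being taught to vectorize. The class name `ConversionLoops`, the method bodies, and the `percentChange` helper are assumptions for illustration only. The actual benchmark is not included in this message, and the way the percentages were derived is presumed to be (patched - baseline) / baseline.

```java
import java.util.Arrays;

// Hypothetical stand-ins for the benchmark kernels. The method names mirror
// the table rows; the real benchmark code is not shown in this message.
class ConversionLoops {
    // byte -> float: each 1-byte element widens to a 4-byte element.
    static void convertB2F(byte[] src, float[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    // int -> long: each 4-byte element widens to an 8-byte element.
    static void convertI2L(int[] src, long[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    // long -> int: each 8-byte element narrows to 4 bytes (high bits dropped).
    static void convertL2I(long[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = (int) src[i];
        }
    }

    // float -> short: converts to int (rounding toward zero), then narrows to 2 bytes.
    static void convertF2S(float[] src, short[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = (short) src[i];
        }
    }

    // Assumed formula for the "perf change" column: relative change in ns/op,
    // so a negative result means the patched build takes less time per op.
    static double percentChange(double baselineNsPerOp, double patchedNsPerOp) {
        return (patchedNsPerOp - baselineNsPerOp) / baselineNsPerOp * 100.0;
    }

    public static void main(String[] args) {
        int[] ints = {1, -2, 3, 4};
        long[] longs = new long[ints.length];
        convertI2L(ints, longs);
        System.out.println(Arrays.toString(longs));

        // Made-up timings, only to show the sign convention.
        System.out.println(percentChange(200.0, 100.0) + "%");
    }
}
```

Whether C2 actually vectorizes a given loop depends on the platform and on the conversion, which is why some rows above read "not supported".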
-------------
PR: https://git.openjdk.java.net/jdk/pull/7806