RFR: 8277617: Adjust AVX3Threshold for copy/fill stubs [v6]
Jie Fu
jiefu at openjdk.java.net
Thu Dec 2 02:32:29 UTC 2021
On Wed, 1 Dec 2021 23:19:47 GMT, Jie Fu <jiefu at openjdk.org> wrote:
> > Yes, the patch doesn't change behavior on AVX2 and older AVX512 systems.
>
> Thanks for your clarification. But it still remains unknown why the 64-byte instructions shouldn't be used on CPUs which don't support `serialize`.
>
> I will test the 64-byte instructions on older AVX512 systems today and report back here.
Here is the performance data from our older AVX512 platform, which doesn't support `serialize`.
Even without `serialize`, performance improves with 64-byte instructions.
E.g., `ArrayCopy.arrayCopyObjectNonConst` improves by ~15% in two of the three 64-byte runs (from about 27.5 to 23.5 ns/op) and by ~8% in the third.
So it seems unfair to enable 64-byte instructions only for the latest Intel AVX512 platforms.
Still, I would like to know why we don't use 64-byte instructions on platforms without `serialize` support.
Thanks.
---------------------------------------------------
Results with 32-byte instructions.

==> perf32-1.log <==
Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt    5  24.070 ± 0.013  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt    5  27.517 ± 0.023  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt    5  21.127 ± 0.008  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt    5  21.934 ± 0.009  ns/op

==> perf32-2.log <==
Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt    5  24.511 ± 0.027  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt    5  27.240 ± 0.034  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt    5  21.065 ± 0.013  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt    5  21.956 ± 0.161  ns/op

==> perf32-3.log <==
Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt    5  25.357 ± 0.006  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt    5  27.513 ± 1.468  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt    5  20.984 ± 0.024  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt    5  20.945 ± 1.346  ns/op
Results with 64-byte instructions.

==> perf64-1.log <==
Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt    5  23.425 ± 0.003  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt    5  23.530 ± 0.002  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt    5  20.174 ± 0.074  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt    5  19.942 ± 0.134  ns/op

==> perf64-2.log <==
Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt    5  22.429 ± 0.012  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt    5  25.189 ± 0.031  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt    5  20.093 ± 0.004  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt    5  20.400 ± 1.213  ns/op

==> perf64-3.log <==
Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt    5  23.472 ± 0.002  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt    5  23.534 ± 0.031  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt    5  20.232 ± 0.150  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt    5  21.921 ± 0.008  ns/op
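
As a rough cross-check of the ~15% figure, the `ArrayCopy.arrayCopyObjectNonConst` scores from the six logs above can be averaged and compared. The sketch below is only illustrative: the class and method names (`CopyStubSpeedup`, `mean`, `improvementPercent`) are made up for this example and are not part of the JDK or the benchmark suite. The numbers are copied from the logs.

```java
public class CopyStubSpeedup {
    // Scores (ns/op) for ArrayCopy.arrayCopyObjectNonConst, copied from the logs.
    // perf32-1..3 and perf64-1..3, in that order.
    static final double[] SCORES_32 = {27.517, 27.240, 27.513};
    static final double[] SCORES_64 = {23.530, 25.189, 23.534};

    // Arithmetic mean of the per-run scores.
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Relative reduction in average time, in percent.
    // The mode is avgt, so a lower ns/op score is better.
    static double improvementPercent(double baseline, double candidate) {
        return (baseline - candidate) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Single-run comparison: perf32-1 against perf64-1.
        double singleRun = improvementPercent(SCORES_32[0], SCORES_64[0]);
        // Comparison of the means over all three runs of each configuration.
        double overMeans = improvementPercent(mean(SCORES_32), mean(SCORES_64));
        System.out.printf("single run: %.1f%%, means of three runs: %.1f%%%n",
                singleRun, overMeans);
    }
}
```

Comparing single runs (perf32-1 against perf64-1) gives roughly 14.5%, while the means over all three runs differ by about 12%, because perf64-2 is noticeably slower than the other two 64-byte runs.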
-------------
PR: https://git.openjdk.java.net/jdk/pull/6512