Vectorized array mismatch updates
Paul Sandoz
paul.sandoz at oracle.com
Thu Dec 17 15:07:22 UTC 2015
Hi,
The vectorized array mismatch implementation is now fully wired up to Arrays.equals/compare/mismatch in hs-comp and the intrinsic kicks in on x86 for C2.
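For anyone unfamiliar with the three entry points mentioned above, here is a minimal sketch of how they behave on byte[] (assuming a JDK 9 build where these methods are present; the class name MismatchDemo is only for illustration):

```java
import java.util.Arrays;

public class MismatchDemo {
    public static void main(String[] args) {
        byte[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        byte[] b = a.clone();
        b[8] = 42; // the arrays now differ only in the last element

        // mismatch returns the index of the first difference, or -1 if there is none
        System.out.println(Arrays.mismatch(a, a.clone())); // -1
        System.out.println(Arrays.mismatch(a, b));         // 8

        // equals and compare can be answered from the same mismatch index
        System.out.println(Arrays.equals(a, b));           // false
        System.out.println(Arrays.compare(a, b) < 0);      // true, since 9 < 42
    }
}
```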
There are a bunch of follow-up tasks that need to be done (where appropriate I will log issues):
1) wiring up the vectorizedMismatch intrinsic stub in C1 on x86;
2) implementing the vectorizedMismatch intrinsic on other platforms, such as SPARC and ARM (volunteers? the work is likely similar to that for compact string equality/comparison); and
3) using the performance data to clean up edge cases, so that regressions are reduced or eliminated.
With regard to 3), I have uploaded a JMH benchmark project and raw results for:
- two x86 platforms supporting UseAVX=1 (AVX_1) and UseAVX=2 (AVX_2) respectively (thus AVX_1 and AVX_2 results are not directly comparable)
- C2 (-XX:-UseVectorizedMismatchIntrinsic as "Unsafe", and -XX:+UseVectorizedMismatchIntrinsic as "Vectorized")
- C1 (as "Unsafe", implicitly -XX:-UseVectorizedMismatchIntrinsic since there is no intrinsic yet for C1)
- comparing byte[] and long[]
- small (1..16) and large (2^2..12) array lengths where the contents of the two arrays are the same, or only the last element differs (lastNEQ=false/true).
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/perf/
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/perf/results/AVX_1/
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/perf/results/AVX_2/
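To make the lastNEQ cases concrete, here is a rough sketch of how such input pairs could be built. This is not taken from the uploaded benchmark project; the class and method names (MismatchInputs, pair) are hypothetical and only illustrate the two cases.

```java
import java.util.Arrays;

public class MismatchInputs {
    /** Returns {a, b}: same contents, or differing only in the last element when lastNEQ is true. */
    static byte[][] pair(int length, boolean lastNEQ) {
        byte[] a = new byte[length];
        for (int i = 0; i < length; i++) {
            a[i] = (byte) i;
        }
        byte[] b = a.clone();
        if (lastNEQ && length > 0) {
            b[length - 1] ^= 1; // flip one bit so only the last element differs
        }
        return new byte[][] {a, b};
    }

    public static void main(String[] args) {
        byte[][] same = pair(16, false);
        byte[][] diff = pair(16, true);
        System.out.println(Arrays.mismatch(same[0], same[1])); // -1
        System.out.println(Arrays.mismatch(diff[0], diff[1])); // 15
    }
}
```

With lastNEQ=true the whole array has to be scanned before the difference is found, so both cases cost a full pass over the data.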
Observations so far:
(Note: for byte[], vectorizedMismatch does not kick in for array lengths < 8.)
- byte[], AVX_1, C2
  - No regressions for small arrays, good improvements for large arrays.
  - For large arrays the Vectorized performance is marginally better than the Unsafe performance.
    I expect the gap to close once Roland’s fix for JDK-8145322 is pushed (which creates more
    efficient address computation for unrolled Unsafe access loops).
- long[], AVX_1, C2
  - For small arrays there are some regressions for both Vectorized and Unsafe.
  - For large arrays there are some regressions for both Vectorized and Unsafe.
    For Unsafe this is due to JDK-8145322.
    For Vectorized there is some variance that might be due to unlucky alignment of quadwords.
  - Further investigation is required, e.g. having a length threshold above which vectorizedMismatch
    kicks in, or somehow disabling Unsafe and/or Vectorized for UseAVX=1, if we can surface constants
    such as vectorization/register widths in a platform-independent manner.
- byte[], AVX_2, C2
  - For small arrays with Unsafe a small regression is observed at lengths of 11 and 15 when the contents of the arrays are equal.
    This seems like a blip, but might be due to some odd code generation.
  - For small arrays with Vectorized there is no regression.
  - For large arrays performance is good, with Vectorized ~2x Unsafe once the length gets large enough (256/512 or larger).
    This translates into an ~10x improvement compared to an ordinary loop.
- long[], AVX_2, C2
  - For small arrays there are some regressions, as for AVX_1.
  - For large arrays AVX_2 starts to show a 1.5x improvement.
    Again some variance is observable, perhaps due to unlucky alignment.
- byte[], AVX_1/2, C1
  (Note: only Unsafe results are available.)
  - For small arrays there are small regressions for lengths < 8, probably due to the length check and branch to
    the ordinary loop. Not sure if there is much that can be done about it.
  - For large arrays the performance boost is good and can be much better if made intrinsic, e.g. ~5x to 8x.
- long[], AVX_1/2, C1
  (Note: only Unsafe results are available.)
  - For small and large arrays there are noticeable regressions. A C1 intrinsic should improve things.
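The length check and fallback to the ordinary loop mentioned in the observations above can be sketched roughly as follows. This is only an illustration of the idea (compare 8 bytes at a time as a long, then finish with a plain loop), not the actual JDK code or the intrinsic stub; MismatchSketch, THRESHOLD and the use of ByteBuffer instead of Unsafe are my own assumptions for the sake of a self-contained example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class MismatchSketch {
    // Assumption taken from the note above: the wide path is skipped for byte[] lengths < 8.
    static final int THRESHOLD = 8;

    /** Index of the first differing byte in a[0..length) and b[0..length), or -1 if none. */
    static int mismatch(byte[] a, byte[] b, int length) {
        int i = 0;
        if (length >= THRESHOLD) {
            // Little-endian so the lowest-order byte of each long is the lowest array index.
            ByteBuffer ba = ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN);
            ByteBuffer bb = ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN);
            for (; i + 8 <= length; i += 8) {
                long x = ba.getLong(i) ^ bb.getLong(i);
                if (x != 0) {
                    // The first set bit locates the first differing byte within this long.
                    return i + (Long.numberOfTrailingZeros(x) >>> 3);
                }
            }
        }
        // Ordinary loop: short arrays, plus the tail left over by the wide loop.
        for (; i < length; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return -1;
    }
}
```

For short arrays the check against THRESHOLD is pure overhead on top of the ordinary loop, which is consistent with the small regressions seen for lengths < 8.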
Thanks,
Paul.