RFR: 8252847: New AVX512 optimized stubs for both conjoint and disjoint arraycopy
Bhateja, Jatin
jatin.bhateja at intel.com
Thu Sep 10 16:47:25 UTC 2020
Summary:
1) New AVX3 optimized stubs for both conjoint and disjoint arraycopy.
2) Special instruction sequence blocks for copy sizes between 32 and 192
bytes.
3) Block copy operations above 192 bytes are performed using a
destination-address-aligned PRE-MAIN-POST loop. The main loop copies 192
bytes per iteration, and the tail part falls through to the special
instruction sequence blocks.
4) Both the small copy blocks and the aligned loop use 32-byte vector
registers to prevent any frequency penalty for copy sizes less than
AVX3Threshold.
5) For block sizes above AVX3Threshold, both the special blocks and the
loop operate using 64-byte registers.
6) In case the user sets the maximum vector size to 32 bytes, forward copy
(disjoint) operations are done using efficient REP MOVS for copy sizes
above 4096 bytes.
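
The size-based selection described in the summary can be modelled roughly as
follows. This is only an illustrative sketch in plain Java, not the stub
code from the patch: the class, enum, and method names are hypothetical,
the AVX3Threshold value of 4096 is assumed for illustration, the behaviour
at exactly AVX3Threshold and below 32 bytes is not stated in the summary,
and aligning the destination to the vector width is an assumption.

```java
import java.util.Arrays;

// Hypothetical model of the copy-size dispatch; not the actual HotSpot stubs.
final class ArrayCopySketch {
    static final int SMALL_MIN = 32;         // special blocks start here (summary item 2)
    static final int MAIN_LOOP_BYTES = 192;  // bytes per main-loop iteration (item 3)
    static final int AVX3_THRESHOLD = 4096;  // ASSUMED value, for illustration only
    static final int REP_MOVS_MIN = 4096;    // REP MOVS above this size (item 6)

    enum Strategy { NOT_COVERED, SPECIAL_BLOCKS, ALIGNED_LOOP, REP_MOVS }

    // Items 4 and 5: 32-byte vectors below the threshold, 64-byte otherwise.
    // Treating size == AVX3_THRESHOLD as "64-byte" is an assumption.
    static int vectorBytes(int size, int maxVectorSize) {
        if (maxVectorSize <= 32) {
            return 32;
        }
        return size < AVX3_THRESHOLD ? 32 : 64;
    }

    // Items 2, 3 and 6: pick the code path from the copy size.
    static Strategy choose(int size, int maxVectorSize, boolean disjoint) {
        if (size < SMALL_MIN) {
            return Strategy.NOT_COVERED;     // sizes below 32 bytes are not described
        }
        if (size <= MAIN_LOOP_BYTES) {
            return Strategy.SPECIAL_BLOCKS;
        }
        if (maxVectorSize == 32 && disjoint && size > REP_MOVS_MIN) {
            return Strategy.REP_MOVS;
        }
        return Strategy.ALIGNED_LOOP;
    }

    // Item 3: split a copy into PRE (bytes needed to align the destination),
    // MAIN (whole 192-byte iterations) and POST (tail for the special blocks).
    static int[] split(long dstAddr, int size, int vecBytes) {
        int pre = (int) ((vecBytes - (dstAddr % vecBytes)) % vecBytes);
        pre = Math.min(pre, size);
        int main = ((size - pre) / MAIN_LOOP_BYTES) * MAIN_LOOP_BYTES;
        int post = size - pre - main;
        return new int[] { pre, main, post };
    }

    public static void main(String[] args) {
        int size = 1000;
        int vec = vectorBytes(size, 64);
        System.out.println(choose(size, 64, true) + " with " + vec + "-byte vectors");
        System.out.println(Arrays.toString(split(0x1010L, size, vec)));
    }
}
```

In this model the three parts always sum to the requested size, and the
POST part is always smaller than 192 bytes, which is what lets the tail
reuse the small-copy blocks.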
JMH Results:
System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
Baseline :
http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_Baseline.txt
WithOpt :
http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt
-------------
Commit messages:
- 8252847: New AVX512 optimized stubs for both conjoint and disjoint arraycopy.
Changes: https://git.openjdk.java.net/jdk/pull/61/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=61&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8252847
Stats: 1315 lines in 12 files changed: 1213 ins; 69 del; 33 mod
Patch: https://git.openjdk.java.net/jdk/pull/61.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/61/head:pull/61
PR: https://git.openjdk.java.net/jdk/pull/61