RFR: 8252847: New AVX512 optimized stubs for both conjoint and disjoint arraycopy
Jatin Bhateja
jbhateja at openjdk.java.net
Mon Sep 7 14:34:14 UTC 2020
Summary:
1) New AVX3 optimized stubs for both conjoint and disjoint arraycopy.
2) Special instruction sequence blocks for copy sizes between 32 and 192 bytes.
3) Block copy operations above 192 bytes are performed using a destination-address-aligned PRE-MAIN-POST loop. The main
loop copies 192 bytes per iteration, and the tail part falls through to the special instruction sequence blocks.
4) Both the small copy blocks and the aligned loop use 32-byte vector registers to prevent any frequency penalty for
copy sizes less than AVX3Threshold.
5) For block sizes above AVX3Threshold, both the special blocks and the loop operate using 64-byte registers.
6) In case the user sets the maximum vector size to 32 bytes, forward copy (disjoint) operations are done using the
efficient REP MOVS for copy sizes above 4096 bytes.
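The size-based dispatch described in the summary can be pictured with a small Java model. This is only an illustrative sketch, not the stub code from the patch: the class and method names are hypothetical, the treatment of a size exactly equal to AVX3Threshold is an assumption, and aligning the destination to the vector width in the PRE step is also an assumption (the summary only says the loop is destination-address aligned).

```java
// Illustrative model of the copy-size dispatch described in the summary.
// All names here are hypothetical; the real stubs are generated assembly.
final class ArrayCopyPlanSketch {
    static final int SMALL_MIN = 32;      // special blocks start at 32 bytes
    static final int SMALL_MAX = 192;     // special blocks end at 192 bytes
    static final int LOOP_BYTES = 192;    // bytes copied per main-loop iteration
    static final int REP_MOVS_MIN = 4096; // REP MOVS is used above this size

    enum Path { SPECIAL_BLOCKS, ALIGNED_LOOP, REP_MOVS, OTHER }

    // Vector register width in bytes. Assumption: a size equal to the
    // threshold is treated like a size above it.
    static int vectorBytes(long size, long avx3Threshold, int maxVectorSize) {
        return (maxVectorSize >= 64 && size >= avx3Threshold) ? 64 : 32;
    }

    // Which code path handles a copy of 'size' bytes.
    static Path choosePath(long size, int maxVectorSize, boolean forward) {
        if (maxVectorSize <= 32 && forward && size > REP_MOVS_MIN) {
            return Path.REP_MOVS;
        }
        if (size > SMALL_MAX) {
            return Path.ALIGNED_LOOP;
        }
        if (size >= SMALL_MIN) {
            return Path.SPECIAL_BLOCKS;
        }
        return Path.OTHER; // sizes below 32 bytes are not covered by the summary
    }

    // Splits a copy into {preBytes, mainIterations, tailBytes}.
    // Assumption: PRE copies just enough bytes to align the destination
    // to the vector width (vecBytes must be a power of two).
    static long[] split(long dstAddr, long size, int vecBytes) {
        long misalign = dstAddr & (vecBytes - 1);
        long pre = Math.min(size, misalign == 0 ? 0 : vecBytes - misalign);
        long rest = size - pre;
        return new long[] { pre, rest / LOOP_BYTES, rest % LOOP_BYTES };
    }

    public static void main(String[] args) {
        long size = 1000;
        long threshold = 4096; // hypothetical AVX3Threshold value for the demo
        int vec = vectorBytes(size, threshold, 64);
        long[] parts = split(0x1010L, size, vec);
        System.out.println(choosePath(size, 64, true) + " with " + vec + "-byte vectors");
        System.out.println("pre=" + parts[0] + " main=" + parts[1] + " tail=" + parts[2]);
    }
}
```

In this model the tail left over after the main loop is always below 192 bytes, which is why it can be handed to the same special blocks that serve small copies.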
JMH Results:
System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
Baseline : http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_Baseline.txt
WithOpt : http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt
-------------
Commit messages:
- 8252847: New AVX512 optimized stubs for both conjoint and disjoint arraycopy.
Changes: https://git.openjdk.java.net/jdk/pull/61/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=61&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8252847
Stats: 1315 lines in 12 files changed: 1213 ins; 69 del; 33 mod
Patch: https://git.openjdk.java.net/jdk/pull/61.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/61/head:pull/61
PR: https://git.openjdk.java.net/jdk/pull/61