RFR: 8252847: New AVX512 optimized stubs for both conjoint and disjoint arraycopy

Bhateja, Jatin jatin.bhateja at intel.com
Thu Sep 10 16:47:25 UTC 2020


 Summary:
 
 1)  New AVX3 optimized stubs for both conjoint and disjoint arraycopy.
 2)  Special instruction sequence blocks for copy sizes between 32 and 192
     bytes.
 3)  Block copies above 192 bytes use a destination-address-aligned
     PRE-MAIN-POST loop. The main loop copies 192 bytes per iteration, and
     the tail falls through to the special instruction sequence blocks.
 4)  Both the small copy blocks and the aligned loop use 32-byte vector
     registers to prevent any frequency penalty for copy sizes less than
     AVX3Threshold.
 5)  For block sizes above AVX3Threshold, both the special blocks and the
     loop operate on 64-byte registers.
 6)  If the user sets the maximum vector size to 32 bytes, forward copy
     (disjoint) operations use the efficient REP MOVS for copy sizes above
     4096 bytes.
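
 The size-based dispatch described in points 2-6 can be sketched roughly as
 follows. This is an illustrative model only, not the stub code from the
 patch: the names CopyPath, CopyPlan and plan_copy are hypothetical, the
 handling of exact boundary values (for example a copy of exactly
 AVX3Threshold bytes) is an assumption, and aligning the destination to the
 chosen vector width is likewise an assumption. The summary does not state
 what happens below 32 bytes, so the sketch simply routes those sizes to the
 special blocks too.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of the copy strategy; not the HotSpot stub generator.
enum class CopyPath { SpecialBlocks, AlignedLoop, RepMovs };

struct CopyPlan {
    CopyPath    path;
    std::size_t vector_bytes; // 32 or 64; 0 when no vector registers are used
    std::size_t pre_bytes;    // PRE: bytes copied until dst is aligned
    std::size_t main_iters;   // MAIN: number of 192-byte iterations
    std::size_t tail_bytes;   // POST: remainder left to the special blocks
};

CopyPlan plan_copy(std::uintptr_t dst, std::size_t bytes,
                   std::size_t avx3_threshold,
                   std::size_t max_vector_bytes, bool disjoint) {
    const std::size_t kLoopBytes  = 192;  // bytes per main-loop iteration
    const std::size_t kRepMovsMin = 4096; // REP MOVS cutoff from point 6

    // Point 6: with 32-byte vectors, large forward copies use REP MOVS.
    if (max_vector_bytes == 32 && disjoint && bytes > kRepMovsMin) {
        return {CopyPath::RepMovs, 0, 0, 0, 0};
    }

    // Points 4 and 5: 32-byte registers below the threshold, 64-byte at or
    // above it (treating equality as "above" is an assumption).
    const std::size_t vec =
        (max_vector_bytes >= 64 && bytes >= avx3_threshold) ? 64 : 32;

    // Point 2: small copies go straight to the special blocks.
    if (bytes <= kLoopBytes) {
        return {CopyPath::SpecialBlocks, vec, 0, 0, bytes};
    }

    // Point 3: align the destination, run 192-byte iterations, then hand
    // the remainder to the special blocks.
    const std::size_t pre  = (vec - dst % vec) % vec;
    const std::size_t rest = bytes - pre;
    return {CopyPath::AlignedLoop, vec, pre,
            rest / kLoopBytes, rest % kLoopBytes};
}
```

 For example, with a threshold of 4096 and 64-byte vectors enabled, an
 8192-byte copy to a destination 16 bytes past a 64-byte boundary would, in
 this model, spend 48 bytes in the PRE phase, run 42 main-loop iterations,
 and leave 80 bytes for the special blocks.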
 
 JMH Results:
   System     :  CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
   Micros     :  test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
   Baseline   :  http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_Baseline.txt
   WithOpt    :  http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt
 
 -------------
 
 Commit messages:
  - 8252847: New AVX512 optimized stubs for both conjoint and disjoint
 arraycopy.
 
 Changes: https://git.openjdk.java.net/jdk/pull/61/files
  Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=61&range=00
   Issue: https://bugs.openjdk.java.net/browse/JDK-8252847
   Stats: 1315 lines in 12 files changed: 1213 ins; 69 del; 33 mod
   Patch: https://git.openjdk.java.net/jdk/pull/61.diff
   Fetch: git fetch https://git.openjdk.java.net/jdk pull/61/head:pull/61
 
 PR: https://git.openjdk.java.net/jdk/pull/61


More information about the hotspot-compiler-dev mailing list