RFR: 8252847: New AVX512 optimized stubs for both conjoint and disjoint arraycopy

Jatin Bhateja jbhateja at openjdk.java.net
Mon Sep 7 14:34:14 UTC 2020


Summary:

1)  New AVX3 optimized stubs for both conjoint and disjoint arraycopy.
2)  Special instruction sequence blocks for copy sizes between 32 and 192 bytes.
3)  Block copy operations above 192 bytes are performed using a destination-address-aligned PRE-MAIN-POST loop. The
    main loop copies 192 bytes per iteration, and the tail part is handled by the special instruction sequence blocks.
4)  Both the small copy blocks and the aligned loop use 32 byte vector registers to prevent any frequency penalty for
    copy sizes less than AVX3Threshold.
5)  For block sizes above AVX3Threshold, both the special blocks and the loop operate using 64 byte registers.
6)  If the user sets the maximum vector size to 32 bytes, forward copy (disjoint) operations are done using efficient
    REP MOVS for copy sizes above 4096 bytes.
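
The size-based selection in points 2 to 6 can be pictured with a small Java model. This is only an illustrative sketch, not the stub code from the patch (the real stubs are generated assembly). The class name, method names, and strategy labels are hypothetical. It also assumes that the PRE step aligns the destination to the chosen vector width, that sizes of 192 bytes or less all go to the special blocks, and that a size exactly equal to AVX3Threshold uses 64 byte registers; the summary does not state any of these.

```java
// Illustrative model only; all names here are hypothetical, not taken from the patch.
public class CopyPlanSketch {
    static final int MAIN_LOOP_BYTES = 192;     // bytes copied per main loop iteration (point 3)
    static final int REP_MOVS_MIN_BYTES = 4096; // REP MOVS is used above this size (point 6)

    // Vector register width in bytes for a copy of 'size' bytes (points 4 and 5).
    // Assumption: a size equal to the threshold is treated like a size above it.
    static int vectorWidth(long size, long avx3Threshold, int maxVectorSize) {
        if (maxVectorSize <= 32) {
            return 32;
        }
        return size < avx3Threshold ? 32 : 64;
    }

    // Which code path a copy would take under the rules in the summary.
    static String strategy(long size, boolean disjoint, int maxVectorSize) {
        if (maxVectorSize == 32 && disjoint && size > REP_MOVS_MIN_BYTES) {
            return "REP_MOVS";
        }
        if (size <= MAIN_LOOP_BYTES) {
            return "SPECIAL_BLOCKS"; // assumption: sizes below 32 bytes are lumped in here
        }
        return "PRE_MAIN_POST_LOOP";
    }

    // Splits a copy into { pre bytes, main loop iterations, tail bytes }.
    // Assumption: 'alignment' is the vector width the destination is aligned to.
    static long[] split(long destAddress, long size, int alignment) {
        long pre = (alignment - (destAddress % alignment)) % alignment;
        pre = Math.min(pre, size);
        long remaining = size - pre;
        long mainIterations = remaining / MAIN_LOOP_BYTES;
        long tail = remaining % MAIN_LOOP_BYTES; // handled by the special blocks
        return new long[] { pre, mainIterations, tail };
    }

    public static void main(String[] args) {
        long[] parts = split(0x1008L, 1000, 64);
        System.out.println(strategy(1000, true, 64)
                + " pre=" + parts[0] + " iterations=" + parts[1] + " tail=" + parts[2]);
    }
}
```

In this model, a 1000 byte copy to a destination 8 bytes past a 64 byte boundary splits into a 56 byte PRE step, four 192 byte MAIN iterations, and a 176 byte tail.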

JMH Results:
  System     :  CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
  Micros     :  test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
  Baseline   :  http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_Baseline.txt
  WithOpt    :  http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt

-------------

Commit messages:
 - 8252847: New AVX512 optimized stubs for both conjoint and disjoint arraycopy.

Changes: https://git.openjdk.java.net/jdk/pull/61/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=61&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8252847
  Stats: 1315 lines in 12 files changed: 1213 ins; 69 del; 33 mod
  Patch: https://git.openjdk.java.net/jdk/pull/61.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/61/head:pull/61

PR: https://git.openjdk.java.net/jdk/pull/61

