RFR: 8252847: Optimize primitive arrayCopy stubs using AVX-512 masked instructions [v6]
Jatin Bhateja
jbhateja at openjdk.java.net
Mon Sep 28 12:21:01 UTC 2020
> Summary:
>
> 1) New AVX3 optimized stubs for both conjoint and disjoint arraycopy.
> 2) Special instruction sequence blocks for copy sizes between 32 and 192 bytes.
> 3) Block copy operations above 192 bytes are performed using a destination-address-aligned PRE-MAIN-POST loop. The main
> loop copies 192 bytes per iteration, and the tail part is handled by the special instruction sequence blocks.
> 4) Both the small copy blocks and the aligned loop use 32-byte vector registers to prevent any frequency penalty for
> copy sizes less than AVX3Threshold.
> 5) For block sizes above AVX3Threshold, both the special blocks and the loop operate using 64-byte registers.
> 6) If the user sets the maximum vector size to 32 bytes, forward copy (disjoint) operations are done using efficient
> REP MOVS for copy sizes above 4096 bytes.
>
> JMH Results:
> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
> Baseline : http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_Baseline.txt
> WithOpt : http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt
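
The size-based decisions in the summary above can be sketched roughly as follows. This is only an illustrative model in plain Java, not the stub generator code from the patch: the class name, the enum, and all three helper methods are hypothetical, the alignment boundary is passed in because the summary does not state it, and the behaviour for copies under 32 bytes or exactly at AVX3Threshold is an assumption.

```java
// Hypothetical model of the copy strategy described in the summary.
// None of these names come from the HotSpot sources.
class ArrayCopyPlanSketch {
    enum Path { UNSPECIFIED_SMALL, SPECIAL_BLOCKS, ALIGNED_LOOP, REP_MOVS }

    // Picks the code path for a copy of `bytes` bytes.
    static Path choosePath(long bytes, boolean disjoint, int maxVectorSize) {
        // Item 6: with 32-byte vectors, forward (disjoint) copies above 4096 bytes use REP MOVS.
        if (maxVectorSize == 32 && disjoint && bytes > 4096) {
            return Path.REP_MOVS;
        }
        // Item 3: above 192 bytes, the aligned PRE-MAIN-POST loop is used.
        if (bytes > 192) {
            return Path.ALIGNED_LOOP;
        }
        // Item 2: 32 to 192 bytes go through the special instruction sequence blocks.
        if (bytes >= 32) {
            return Path.SPECIAL_BLOCKS;
        }
        // Assumption: the summary does not describe copies below 32 bytes.
        return Path.UNSPECIFIED_SMALL;
    }

    // Picks the vector register width in bytes (items 4 and 5).
    // Assumption: a size exactly equal to the threshold is treated as "above".
    static int vectorWidth(long bytes, int avx3Threshold, int maxVectorSize) {
        if (maxVectorSize == 32) {
            return 32;
        }
        return bytes < avx3Threshold ? 32 : 64;
    }

    // Splits a large copy into {preBytes, mainIterations, postBytes}.
    // `alignment` is an assumed parameter; the main loop moves 192 bytes per iteration.
    static long[] splitLoop(long dstAddress, long bytes, int alignment) {
        long pre = (alignment - (dstAddress % alignment)) % alignment;
        pre = Math.min(pre, bytes);
        long remaining = bytes - pre;
        long mainIterations = remaining / 192;
        long post = remaining % 192; // tail, handled by the special blocks
        return new long[] { pre, mainIterations, post };
    }

    public static void main(String[] args) {
        long bytes = 1000;
        long[] parts = splitLoop(40, bytes, 32);
        System.out.println(choosePath(bytes, true, 64)
                + " width=" + vectorWidth(bytes, 4096, 64)
                + " pre=" + parts[0] + " main=" + parts[1] + " post=" + parts[2]);
    }
}
```

The threshold value 4096 passed to `vectorWidth` in `main` is just an example argument; in a real JVM the value comes from the AVX3Threshold setting.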
Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
8252847 : Review comments resolution
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/61/files
- new: https://git.openjdk.java.net/jdk/pull/61/files/78c4fe73..2a606276
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=61&range=05
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=61&range=04-05
Stats: 493 lines in 9 files changed: 264 ins; 200 del; 29 mod
Patch: https://git.openjdk.java.net/jdk/pull/61.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/61/head:pull/61
PR: https://git.openjdk.java.net/jdk/pull/61