RFR: 8252847: Optimize primitive arrayCopy stubs using AVX-512 masked instructions [v6]
Vladimir Kozlov
kvn at openjdk.java.net
Fri Oct 9 17:59:12 UTC 2020
On Mon, 28 Sep 2020 12:21:01 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Summary:
>>
>> 1) New AVX3 optimized stubs for both conjoint and disjoint arraycopy.
>> 2) Special instruction sequence blocks for copy sizes between 32 and 192 bytes.
>> 3) Block copy operations above 192 bytes are performed using a destination-address-aligned PRE-MAIN-POST loop. The
>> main loop copies 192 bytes per iteration, and the tail part falls through to the special instruction sequence blocks.
>> 4) Both the small copy blocks and the aligned loop use 32-byte vector registers to prevent any frequency penalty for
>> copy sizes less than AVX3Threshold.
>> 5) For block sizes above AVX3Threshold, both the special blocks and the loop operate using 64-byte registers.
>> 6) In case the user sets the maximum vector size to 32 bytes, forward copy (disjoint) operations are done using
>> efficient REP MOVS for copy sizes above 4096 bytes.
>>
>> JMH Results:
>> System : CascadeLake Server, Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
>> Micros : test/micro/org/openjdk/bench/java/lang/ArrayCopy*.java
>> Baseline : http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_Baseline.txt
>> WithOpt : http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> 8252847 : Review comments resolution
Yes, this looks better. Reviewed. Before pushing, let me test it. I will let you know the results.
-------------
Marked as reviewed by kvn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/61
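
For readers following the quoted summary, the size-based dispatch it describes can be modelled roughly as below. This is only an illustrative Java sketch of the decision rules as stated in the summary, not the HotSpot stub code: the class, enum, and method names are hypothetical, the PRE (alignment) phase of the loop is ignored, and the handling of sizes below 32 bytes and of a size exactly equal to AVX3Threshold is an assumption, since the summary does not specify either.

```java
public class ArrayCopyPlanSketch {
    // Hypothetical names for the three paths mentioned in the summary.
    enum Strategy { SPECIAL_BLOCKS, ALIGNED_LOOP, REP_MOVS }

    static final int SPECIAL_BLOCK_LIMIT = 192; // bytes, from the summary
    static final int MAIN_LOOP_STRIDE = 192;    // bytes per main-loop iteration
    static final int REP_MOVS_LIMIT = 4096;     // bytes, from the summary

    // Picks the copy path for a given size in bytes.
    // Assumption: sizes below 32 bytes are also routed to the special blocks.
    static Strategy chooseStrategy(long bytes, boolean disjoint, int maxVectorSize) {
        if (maxVectorSize == 32 && disjoint && bytes > REP_MOVS_LIMIT) {
            return Strategy.REP_MOVS;
        }
        if (bytes <= SPECIAL_BLOCK_LIMIT) {
            return Strategy.SPECIAL_BLOCKS;
        }
        return Strategy.ALIGNED_LOOP;
    }

    // Picks the vector register width in bytes.
    // Assumption: a size exactly equal to the threshold uses 64-byte registers.
    static int vectorWidth(long bytes, int maxVectorSize, long avx3Threshold) {
        if (maxVectorSize < 64 || bytes < avx3Threshold) {
            return 32;
        }
        return 64;
    }

    // Simplified view of the main loop: ignores the PRE alignment phase.
    static long mainLoopIterations(long bytes) {
        return bytes / MAIN_LOOP_STRIDE;
    }

    // Bytes left over for the special blocks after the main loop.
    static long tailBytes(long bytes) {
        return bytes % MAIN_LOOP_STRIDE;
    }

    public static void main(String[] args) {
        long threshold = 4096; // example value for AVX3Threshold
        long[] sizes = {64, 1000, 8192};
        for (long size : sizes) {
            System.out.println(size + " bytes: "
                + chooseStrategy(size, true, 64)
                + ", vector width " + vectorWidth(size, 64, threshold)
                + ", main iterations " + mainLoopIterations(size)
                + ", tail " + tailBytes(size));
        }
    }
}
```

Passing 32 as the maximum vector size with a disjoint copy above 4096 bytes selects the REP MOVS path in this model, which mirrors point 6 of the summary.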