RFR: 8310159: Bulk copy with Unsafe::arrayCopy is slower compared to memcpy

Steve Dohrmann duke at openjdk.org
Wed Nov 15 01:19:32 UTC 2023


On Wed, 15 Nov 2023 00:39:29 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1186:
>> 
>>> 1184:     __ evmovntdquq(Address(dst, index, scale, offset + 0x40), xmm2, Assembler::AVX_512bit);
>>> 1185:     __ evmovntdquq(Address(dst, index, scale, offset + 0x80), xmm3, Assembler::AVX_512bit);
>>> 1186:     __ evmovntdquq(Address(dst, index, scale, offset + 0xC0), xmm4, Assembler::AVX_512bit);
>> 
>> These are non-temporal memory moves. To force eviction from the write-combining buffers we may need to emit additional fences; otherwise a subsequent read from destination memory may see incorrect values.
>
> @jatin-bhateja There is a sfence at line 781.

Thanks. There is a store fence after the main loop completes in the large-size code path:

![image](https://github.com/openjdk/jdk/assets/3858882/3bcea3c6-3bda-458c-aa7c-29ed6010cde2)
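
For readers without access to the screenshot, the general pattern (streaming stores in a loop, then one store fence after the loop) can be sketched with SSE2 intrinsics. This is only an illustration, not the stub generator code from the PR: the helper name `copy_nontemporal` is hypothetical, it uses 128-bit `_mm_stream_si128` rather than the 512-bit `evmovntdquq` moves, and it assumes `dst` is 16-byte aligned and `bytes` is a multiple of 16.

```cpp
#include <cstddef>
#include <emmintrin.h>  // SSE2: _mm_loadu_si128, _mm_stream_si128 (also provides _mm_sfence)

// Hypothetical helper for illustration only; not HotSpot code.
// Assumptions: dst is 16-byte aligned, bytes is a multiple of 16.
void copy_nontemporal(void* dst, const void* src, std::size_t bytes) {
    auto* d = static_cast<__m128i*>(dst);
    auto* s = static_cast<const __m128i*>(src);

    // Main loop: regular (unaligned) loads, non-temporal stores that
    // bypass the cache and go through write-combining buffers.
    for (std::size_t i = 0; i < bytes / 16; ++i) {
        __m128i v = _mm_loadu_si128(s + i);
        _mm_stream_si128(d + i, v);
    }

    // One store fence after the loop: orders the weakly-ordered streaming
    // stores ahead of any store that follows, e.g. a "copy done" flag
    // that another thread reads before touching dst.
    _mm_sfence();
}
```

Placing the fence once after the loop, rather than after every store, keeps the streaming stores free to combine while still ordering them before anything that follows the copy.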

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16575#discussion_r1393511087

