RFR: 8310159: Bulk copy with Unsafe::arrayCopy is slower compared to memcpy
Steve Dohrmann
duke at openjdk.org
Wed Nov 15 01:19:32 UTC 2023
On Wed, 15 Nov 2023 00:39:29 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1186:
>>
>>> 1184: __ evmovntdquq(Address(dst, index, scale, offset + 0x40), xmm2, Assembler::AVX_512bit);
>>> 1185: __ evmovntdquq(Address(dst, index, scale, offset + 0x80), xmm3, Assembler::AVX_512bit);
>>> 1186: __ evmovntdquq(Address(dst, index, scale, offset + 0xC0), xmm4, Assembler::AVX_512bit);
>>
>> These are non-temporal memory moves. To force eviction from the write-combining buffers we may need to emit additional fences; otherwise a subsequent read from destination memory may see incorrect values.
>
> @jatin-bhateja There is a sfence at line 781.
Thanks, there is a store fence upon completion of the main loop for the large-size code.
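
For readers unfamiliar with the pattern, here is a minimal C++ sketch of it using SSE2 intrinsics. It is not the HotSpot stub: the function name nt_copy, the 16-byte store width, and the alignment and size preconditions are assumptions made for illustration. It issues non-temporal (streaming) stores in a loop and then a single sfence after the loop, so the weakly ordered stores become globally visible before any store that follows. On targets without SSE2 the sketch falls back to memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // _mm_loadu_si128, _mm_stream_si128, _mm_sfence
#define HAVE_NT_STORES 1
#endif

// Hypothetical helper, not part of HotSpot.
// Assumes dst is 16-byte aligned and bytes is a multiple of 16.
void nt_copy(void* dst, const void* src, std::size_t bytes) {
    assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
    assert(bytes % 16 == 0);
#ifdef HAVE_NT_STORES
    auto* d = static_cast<__m128i*>(dst);
    const auto* s = static_cast<const __m128i*>(src);
    for (std::size_t i = 0; i < bytes / 16; ++i) {
        __m128i v = _mm_loadu_si128(s + i);  // unaligned load is fine for src
        _mm_stream_si128(d + i, v);          // non-temporal store, bypasses the cache
    }
    // One fence after the whole loop, rather than one per store.
    _mm_sfence();
#else
    std::memcpy(dst, src, bytes);  // portable fallback, no streaming stores
#endif
}
```

Fencing once after the loop keeps the stores inside the loop free to combine in the write-combining buffers, which is the point of using streaming stores for large copies.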

-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16575#discussion_r1393511087
More information about the core-libs-dev mailing list