RFR: 8365290: [perf] x86 ArrayFill intrinsic generates SPLIT_STORE for unaligned arrays [v7]
Vladimir Kozlov
kvn at openjdk.org
Wed Oct 1 21:42:49 UTC 2025
On Wed, 1 Oct 2025 19:44:34 GMT, Vladimir Ivanov <vaivanov at openjdk.org> wrote:
>> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 6014:
>>
>>> 6012: BIND(L_fill_4_bytes);
>>> 6013: subptr(count, 1 << shift);
>>> 6014: jccb(Assembler::greaterEqual, L_fill_4_bytes_loop);
>>
>> I don't think it works correctly, because you can get here from lines 5998-5999, where `count` has become negative.
>
> Testing for tier1, tier2 and tier3 was OK. I will review this part one more time.
> Do you have a test scenario that may reproduce this issue?
Passing tests do not guarantee that this path is exercised. You need to disable AVX512 to reach it; otherwise `generate_fill_avx3()` will be used. I was thinking of a `byte[63]` array with the `-XX:UseAVX=2 -XX:-UseUnalignedLoadStores` flags to hit this path. I did a small experiment, but unfortunately it seems the `arrayof_jbyte_fill` stub is not called with AVX2, so the path is not executed.
I will let you investigate further how to force this path to be executed. Here is my small test:
$ java -XX:-TieredCompilation -Xbatch -XX:CompileOnly=TestFillArray::fill -XX:UseAVX=2 -XX:-UseUnalignedLoadStores TestFillArray
$ cat TestFillArray.java
public class TestFillArray {
    private static byte[] ba;

    static void fill() {
        for (int i = 0; i < ba.length; i++) {
            ba[i] = (byte) 123;
        }
    }

    public static void main(String[] str) {
        ba = new byte[63];
        for (int i = 0; i < 10000; i++) {
            fill();
        }
        ba = new byte[63];
        fill();
        for (int i = 0; i < ba.length; i++) {
            if (ba[i] != (byte) 123) {
                System.out.println("ba[" + i + "] (" + ba[i] + ") != 123");
            }
        }
    }
}
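
A possible variation on the test above, offered only as a sketch: it runs the same fill loop over every array length from 0 to 128 instead of just 63, so that different remainder sizes are checked once a configuration that reaches the stub is found. The class name `TestFillArraySizes`, the helper `countMismatches` and the upper bound of 128 are illustrative choices made here, not part of the PR. Like the original test, it does not by itself guarantee that the stub path is executed.

```java
public class TestFillArraySizes {
    // Same loop shape as TestFillArray.fill(), but taking the array as a
    // parameter so one compiled method serves every length.
    static void fill(byte[] a, byte v) {
        for (int i = 0; i < a.length; i++) {
            a[i] = v;
        }
    }

    // Fills a fresh array of the given length and returns how many elements
    // do not hold the expected value afterwards.
    static int countMismatches(int length, byte v) {
        byte[] a = new byte[length];
        fill(a, v);
        int mismatches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != v) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        final int maxLength = 128;      // arbitrary bound chosen for this sketch
        final byte value = (byte) 123;  // same fill value as the original test

        // Warm up so that the JIT compiles fill() before the checked runs.
        for (int i = 0; i < 10000; i++) {
            fill(new byte[i % (maxLength + 1)], value);
        }

        int failures = 0;
        for (int length = 0; length <= maxLength; length++) {
            int mismatches = countMismatches(length, value);
            if (mismatches != 0) {
                System.out.println("length " + length + ": " + mismatches + " wrong element(s)");
                failures++;
            }
        }
        System.out.println(failures == 0 ? "OK" : failures + " length(s) failed");
    }
}
```

It can be run with the same flags as the command above, with `-XX:CompileOnly=TestFillArraySizes::fill` in place of `TestFillArray::fill`.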
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/26747#discussion_r2395973509