RFR: 8365290: [perf] x86 ArrayFill intrinsic generates SPLIT_STORE for unaligned arrays [v7]

Vladimir Kozlov kvn at openjdk.org
Wed Oct 1 21:42:49 UTC 2025


On Wed, 1 Oct 2025 19:44:34 GMT, Vladimir Ivanov <vaivanov at openjdk.org> wrote:

>> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 6014:
>> 
>>> 6012:   BIND(L_fill_4_bytes);
>>> 6013:   subptr(count, 1 << shift);
>>> 6014:   jccb(Assembler::greaterEqual, L_fill_4_bytes_loop);
>> 
>> I don't think this works correctly, because you can get here from lines 5998-5999, where `count` becomes negative.
>
> Testing for tier1, tier2 and tier3 was OK. I will review this part one more time.
> Do you have a test scenario that may reproduce this issue?

Passing tests do not guarantee that this path is exercised. You need to disable AVX512 to reach it; otherwise `generate_fill_avx3()` will be used. I was thinking of a `byte[63]` array with the `-XX:UseAVX=2 -XX:-UseUnalignedLoadStores` flags to hit this path. I did a small experiment, but unfortunately it seems the `arrayof_jbyte_fill` stub is not called with AVX2, so the path is not executed.

I will let you investigate further how to force this path to be executed. Here is my small test:

$ java -XX:-TieredCompilation -Xbatch -XX:CompileOnly=TestFillArray::fill -XX:UseAVX=2 -XX:-UseUnalignedLoadStores TestFillArray

$ cat TestFillArray.java 
public class TestFillArray {
    private static byte[] ba;

    static void fill() {
        for (int i = 0; i < ba.length; i++) {
            ba[i] = (byte) 123;
        }
    }

    public static void main(String[] str) {
        ba = new byte[63];
        for (int i = 0; i < 10000; i++) {
            fill();
        }
        ba = new byte[63];
        fill();
        for (int i = 0; i < ba.length; i++) {
            if (ba[i] != (byte) 123) {
                System.out.println("ba[" + i + "] (" + ba[i] + ") != 123");
            }
        }
    }
}
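
A possible variation, only a sketch: instead of a single `byte[63]`, sweep every length from 0 to 128 so that each tail size of the fill loop is checked. The class name `TestFillArraySweep`, the helper names and the upper bound 128 are illustrative assumptions, not part of the PR, and it is not established that any of these lengths reaches the non-AVX512 fill path.

```java
// Hypothetical sweep variant of TestFillArray; all names here are illustrative.
public class TestFillArraySweep {
    static final byte VALUE = (byte) 123;

    // Same loop shape as TestFillArray.fill(), but operating on a parameter.
    static void fill(byte[] a) {
        for (int i = 0; i < a.length; i++) {
            a[i] = VALUE;
        }
    }

    // Returns the index of the first element that is not VALUE, or -1 if none.
    static int firstMismatch(byte[] a) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] != VALUE) {
                return i;
            }
        }
        return -1;
    }

    // Fills and checks a fresh array of every length in [0, maxLen].
    // Returns the number of lengths for which a mismatch was found.
    static int sweep(int maxLen) {
        int failures = 0;
        for (int len = 0; len <= maxLen; len++) {
            byte[] a = new byte[len];
            fill(a);
            int bad = firstMismatch(a);
            if (bad >= 0) {
                System.out.println("len=" + len + ": a[" + bad + "] (" + a[bad] + ") != " + VALUE);
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int failures = 0;
        // Repeat the sweep so that fill() gets compiled, as in the original test.
        for (int i = 0; i < 10000; i++) {
            failures += sweep(128);
        }
        System.out.println(failures == 0 ? "OK" : "FAILED: " + failures);
    }
}
```

It would be run with the same flags as above, with only the class name changed:

$ java -XX:-TieredCompilation -Xbatch -XX:CompileOnly=TestFillArraySweep::fill -XX:UseAVX=2 -XX:-UseUnalignedLoadStores TestFillArraySweep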

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26747#discussion_r2395973509

