RFR: 8329077: C2 SuperWord: Add MoveD2L, MoveL2D, MoveF2I, MoveI2F [v4]

Galder Zamarreño galder at openjdk.org
Mon Sep 1 08:19:44 UTC 2025


On Wed, 27 Aug 2025 09:56:29 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

> Can you find out why we don't vectorize with AVX1 here?

This was a fun little rabbit hole. The explanation below is for `test6` but I think the same logic applies to `test9`:

The problem comes from a mismatch between the IR node definition, what JTreg does with it, and what the HotSpot code actually does.

The annotation definition is:

    @IR(counts = {IRNode.LOAD_VECTOR_F, "> 0",


Since the annotation does not specify a vector size, JTreg assumes that the regex should match the maximum vector size for the element type. With `UseAVX=1` and floats, `IRNode.getMaxElementsForTypeOnX86` returns 8, so the constraint is set to:


         * Constraint 1: "(\d+(\s){2}(LoadVector.*)+(\s){2}===.*vector[A-Za-z]<F,8>)"


But the issue is that at runtime the vector size is 4:

  844  LoadVector  === ... #vectorx<F,4>
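
To make the mismatch concrete, here is a minimal sketch that applies the regex from Constraint 1 to the `LoadVector` line above. The class name `IrConstraintSketch`, the helper `matches` and the 8-element contrast line are hypothetical and only for illustration; this is not how the IR framework itself is invoked.

```java
import java.util.regex.Pattern;

public class IrConstraintSketch {
    // Constraint 1 as quoted above, with the maximum float vector size (8) baked in.
    static final Pattern CONSTRAINT =
        Pattern.compile("(\\d+(\\s){2}(LoadVector.*)+(\\s){2}===.*vector[A-Za-z]<F,8>)");

    // Returns true if the given line of compiler output satisfies the constraint.
    static boolean matches(String irLine) {
        return CONSTRAINT.matcher(irLine).find();
    }

    public static void main(String[] args) {
        // Line observed at runtime; the inputs are elided as "..." in the quoted output.
        String actual = "  844  LoadVector  === ... #vectorx<F,4>";
        // Hypothetical line, for contrast only: an 8-element float load.
        String hypothetical = "  844  LoadVector  === ... #vectory<F,8>";

        System.out.println(matches(actual));       // false
        System.out.println(matches(hypothetical)); // true
    }
}
```

The count for the observed line stays at 0, so the `"> 0"` check fails even though the loop was vectorized.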


HotSpot logic is more nuanced, with the key being what happens in `SuperWord::unrolling_analysis`. The thing that JTreg doesn't know is that there are 2 types involved in the loop, float **and** int:


        for (int i = 0; i < a.length; i++) {
            a[i] = Float.floatToRawIntBits(b[i]);
        }


With `UseAVX=1`, the max vector size for floats is 8 elements, but for ints it is only 4. The JVM picks the minimum of the two and uses that. That is why the loop is unrolled by 4, and why the resulting load vector size is also 4.
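
The arithmetic can be sketched as follows. This is only an illustration of the "take the minimum across types" idea, not the actual `SuperWord::unrolling_analysis` code. The class and method names are hypothetical, and the register widths (32 bytes for float, 16 bytes for int) are assumptions chosen to reproduce the 8 and 4 mentioned above.

```java
public class VectorLengthSketch {
    // Number of elements of the given size that fit in a vector of the given width.
    static int maxElements(int vectorBytes, int elementBytes) {
        return vectorBytes / elementBytes;
    }

    // A loop touching several types can only use as many lanes as the
    // most restrictive type allows, so take the minimum.
    static int commonVectorLength(int... maxElementsPerType) {
        int min = Integer.MAX_VALUE;
        for (int n : maxElementsPerType) {
            min = Math.min(min, n);
        }
        return min;
    }

    public static void main(String[] args) {
        // Assumed widths under UseAVX=1: 32 bytes for float, 16 bytes for int.
        int floatMax = maxElements(32, Float.BYTES);   // 8
        int intMax = maxElements(16, Integer.BYTES);   // 4

        System.out.println(commonVectorLength(floatMax, intMax)); // 4
    }
}
```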

IMO the right thing to do would be to fix the annotation to be:


    @IR(counts = {IRNode.LOAD_VECTOR_F, IRNode.VECTOR_SIZE_4, "> 0",


The javadoc should also explain why the expected size is 4.

The same applies to `test9`.

WDYT @eme64?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26457#issuecomment-3241348514


More information about the core-libs-dev mailing list