RFR: 8329077: C2 SuperWord: Add MoveD2L, MoveL2D, MoveF2I, MoveI2F [v4]
Galder Zamarreño
galder at openjdk.org
Mon Sep 1 08:19:44 UTC 2025
On Wed, 27 Aug 2025 09:56:29 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
> Can you find out why we don't vectorize with AVX1 here?
This was a fun little rabbit hole. The explanation below is for `test6` but I think the same logic applies to `test9`:
The problem comes from a mismatch between the IR node definition, what JTreg does with it, and what the HotSpot code actually does.
The annotation definition is:
@IR(counts = {IRNode.LOAD_VECTOR_F, "> 0",
The annotation does not specify a vector size, so JTreg assumes that the regex should match the maximum vector size for the type. With `UseAVX=1` and floats, `IRNode.getMaxElementsForTypeOnX86` returns 8, so the constraint is set to:
* Constraint 1: "(\d+(\s){2}(LoadVector.*)+(\s){2}===.*vector[A-Za-z]<F,8>)"
But the issue is that at runtime the vector size is 4:
844 LoadVector === ... #vectorx<F,4>
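To make the mismatch concrete, here is a small standalone sketch that applies the constraint regex to a node line. The sample line is hypothetical: its inputs are placeholders and its double-space layout is assumed, because the regex requires two whitespace characters around the node name. The class and method names are made up for illustration and are not part of the IR framework.

```java
import java.util.regex.Pattern;

public class ConstraintRegexSketch {
    // Builds the constraint regex quoted above for a given element count.
    static String constraint(int size) {
        return "(\\d+(\\s){2}(LoadVector.*)+(\\s){2}===.*vector[A-Za-z]<F," + size + ">)";
    }

    // True if the constraint for the given size finds a match in the line.
    static boolean matches(int size, String line) {
        return Pattern.compile(constraint(size)).matcher(line).find();
    }

    public static void main(String[] args) {
        // Hypothetical node line; inputs are placeholders, spacing is assumed.
        String line = "844  LoadVector  === _ _ _  #vectorx<F,4>";
        System.out.println("size 8 matches: " + matches(8, line));
        System.out.println("size 4 matches: " + matches(4, line));
    }
}
```

Under these assumptions the size-8 constraint finds nothing in a size-4 node line, so a `"> 0"` count check cannot pass.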
HotSpot's logic is more nuanced, and the key is what happens in `SuperWord::unrolling_analysis`. What JTreg doesn't know is that there are 2 types involved in the loop, float **and** int:
for (int i = 0; i < a.length; i++) {
    a[i] = Float.floatToRawIntBits(b[i]);
}
With `UseAVX=1`, the max vector size for floats is 8, but for ints it is 4. The JVM picks the minimum of the two and uses that. That is why the loop is unrolled by 4, and that carries all the way to the load vector size, which is also 4.
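A minimal sketch of that "take the minimum across types" reasoning follows. This is not the HotSpot code. The class and helper names are hypothetical, and the byte widths (32 bytes for float vectors, 16 bytes for int vectors under AVX1) are assumptions chosen to reproduce the 8 and 4 element counts mentioned above.

```java
public class MinVectorSizeSketch {
    // Hypothetical helper: how many elements fit in a vector of the given width.
    static int maxElements(int vectorBytes, int elementBytes) {
        return vectorBytes / elementBytes;
    }

    // The loop is limited by the type with the fewest elements per vector.
    static int loopVectorElements(int... perTypeMax) {
        int min = Integer.MAX_VALUE;
        for (int m : perTypeMax) {
            min = Math.min(min, m);
        }
        return min;
    }

    public static void main(String[] args) {
        // Assumed AVX1 limits: 256-bit float vectors, 128-bit int vectors.
        int floatMax = maxElements(32, Float.BYTES);
        int intMax = maxElements(16, Integer.BYTES);
        System.out.println("float max = " + floatMax + ", int max = " + intMax);
        System.out.println("loop vector elements = " + loopVectorElements(floatMax, intMax));
    }
}
```

Looking only at the float type, as the default constraint does, gives 8. Taking both types in the loop into account gives 4.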
IMO the right thing to do would be to fix the annotation to be:
@IR(counts = {IRNode.LOAD_VECTOR_F, IRNode.VECTOR_SIZE_4, "> 0",
And explain in the javadoc why the expected size is 4.
The same applies to `test9`.
WDYT @eme64?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26457#issuecomment-3241348514