RFR: 8329077: C2 SuperWord: Add MoveD2L, MoveL2D, MoveF2I, MoveI2F [v4]

Emanuel Peter epeter at openjdk.org
Mon Sep 1 08:43:44 UTC 2025


On Mon, 1 Sep 2025 08:17:08 GMT, Galder Zamarreño <galder at openjdk.org> wrote:

>> @galderz I got a failure in our testing:
>> 
>> With VM flag: `-XX:UseAVX=1`.
>> 
>> 
>> Failed IR Rules (2) of Methods (2)
>> ----------------------------------
>> 1) Method "static java.lang.Object[] compiler.loopopts.superword.TestCompatibleUseDefTypeSize.test6(int[],float[])" - [Failed IR rules: 1]:
>>    * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, applyIfPlatformAnd={}, applyIfCPUFeatureOr={"sse4.1", "true", "asimd", "true", "rvv", "true"}, counts={"_#V#LOAD_VECTOR_F#_", "> 0", "_#STORE_VECTOR#_", "> 0", "_#VECTOR_REINTERPRET#_", "> 0"}, applyIfPlatformOr={}, applyIfPlatform={"64-bit", "true"}, failOn={}, applyIfOr={}, applyIfCPUFeatureAnd={}, applyIf={}, applyIfCPUFeature={}, applyIfAnd={}, applyIfNot={})"
>>      > Phase "PrintIdeal":
>>        - counts: Graph contains wrong number of nodes:
>>          * Constraint 1: "(\\d+(\\s){2}(LoadVector.*)+(\\s){2}===.*vector[A-Za-z]<F,8>)"
>>            - Failed comparison: [found] 0 > 0 [given]
>>            - No nodes matched!
>> 
>> 2) Method "static java.lang.Object[] compiler.loopopts.superword.TestCompatibleUseDefTypeSize.test9(long[],double[])" - [Failed IR rules: 1]:
>>    * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, applyIfPlatformAnd={}, applyIfCPUFeatureOr={"sse4.1", "true", "asimd", "true", "rvv", "true"}, counts={"_#V#LOAD_VECTOR_D#_", "> 0", "_#STORE_VECTOR#_", "> 0", "_#VECTOR_REINTERPRET#_", "> 0"}, applyIfPlatformOr={}, applyIfPlatform={"64-bit", "true"}, failOn={}, applyIfOr={}, applyIfCPUFeatureAnd={}, applyIf={}, applyIfCPUFeature={}, applyIfAnd={}, applyIfNot={})"
>>      > Phase "PrintIdeal":
>>        - counts: Graph contains wrong number of nodes:
>>          * Constraint 1: "(\\d+(\\s){2}(LoadVector.*)+(\\s){2}===.*vector[A-Za-z]<D,4>)"
>>            - Failed comparison: [found] 0 > 0 [given]
>>            - No nodes matched!
>> 
>> 
>> I suspect that `test6` with `floatToRawIntBits` and `test9` with `doubleToRawLongBits` are only supported with `AVX2`. The question is whether that is intended, or whether we should file an RFE to extend support to `AVX1` and lower.
>> 
>> Can you find out why we don't vectorize with `AVX1` here?
>
>> Can you find out why we don't vectorize with AVX1 here?
> 
> This was a fun little rabbit hole. The explanation below is for `test6` but I think the same logic applies to `test9`:
> 
> The problem comes from the IR node definition, what JTreg does with it, and what the HotSpot code actually does.
> 
> The annotation definition is:
> 
>     @IR(counts = {IRNode.LOAD_VECTOR_F, "> 0",
> 
> 
> So JTreg assumes that the regex should match a vector size of 8. With `UseAVX=1` and floats, `IRNode.getMaxElementsForTypeOnX86` returns 8 and so that's how the constraint is set:
> 
> 
>          * Constraint 1: "(\d+(\s){2}(LoadVector.*)+(\s){2}===.*vector[A-Za-z]<F,8>)"
> 
> 
> But the issue is that at runtime the vector size is 4:
> 
>   844  LoadVector  === ... #vectorx<F,4>
> 
> 
> HotSpot logic is more nuanced, with the key being what happens in `SuperWord::unrolling_analysis`. The thing that JTreg doesn't know is that there are 2 types involved in the loop, float **and** int:
> 
> 
>         for (int i = 0; i < a.length; i++) {
>             a[i] = Float.floatToRawIntBits(b[i]);
>         }
> 
> 
> With `UseAVX=1`, the max vector size for floats is 8, but for ints it is 4. So the JVM picks the minimum of the two and uses that. That is why the unrolling factor is 4, and the load vector size ends up being 4 as well.
> 
> IMO the right thing to do would be to fix the annotation to be:
> 
> 
>     @IR(counts = {IRNode.LOAD_VECTOR_F, IRNode.VECTOR_SIZE_4, "> 0",
> 
> 
> And explain in the Javadoc why the expected size is 4.
> 
> The same applies to `test9`.
> 
> WDYT @eme64?

@galderz Ah, maybe we just need to do it like here then:
`test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java:192:50:        counts = {IRNode.VECTOR_CAST_I2F, IRNode.VECTOR_SIZE + "min(max_int, max_float)", ">0"})`

When doing cast/reinterpret/move between types this always happens ;)

I think this should generalize over all platforms.

Does that work?
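
To make the size mismatch concrete, here is a small standalone sketch (it does not use the IR framework). It assumes that with `-XX:UseAVX=1` the widest float vector is 32 bytes and the widest int vector is 16 bytes, which gives the 8 and 4 lanes mentioned above. The class and method names are hypothetical and only for illustration. It takes the minimum of the two lane counts and checks the quoted constraint regex against the `LoadVector` line from the failing run.

```java
import java.util.regex.Pattern;

public class VectorSizeSketch {
    // Assumption: maximum vector widths in bytes with -XX:UseAVX=1
    // (float ops can use 256 bits, int ops only 128 bits).
    static final int MAX_FLOAT_VECTOR_BYTES = 32;
    static final int MAX_INT_VECTOR_BYTES = 16;

    // Number of elements that fit in one vector of the given width.
    static int lanes(int vectorBytes, int elementBytes) {
        return vectorBytes / elementBytes;
    }

    static int maxFloatLanes() {
        return lanes(MAX_FLOAT_VECTOR_BYTES, Float.BYTES);
    }

    static int maxIntLanes() {
        return lanes(MAX_INT_VECTOR_BYTES, Integer.BYTES);
    }

    // Both types appear in the loop, so the smaller lane count wins.
    static int commonLanes() {
        return Math.min(maxIntLanes(), maxFloatLanes());
    }

    // Same shape as the constraint regex quoted in the failure output,
    // with the lane count filled in.
    static boolean matchesLoadVectorF(String irLine, int lanes) {
        Pattern p = Pattern.compile(
            "(\\d+(\\s){2}(LoadVector.*)+(\\s){2}===.*vector[A-Za-z]<F," + lanes + ">)");
        return p.matcher(irLine).find();
    }

    public static void main(String[] args) {
        // The LoadVector line from the failing run, as quoted above.
        String irLine = "844  LoadVector  === ... #vectorx<F,4>";

        System.out.println("max float lanes: " + maxFloatLanes());  // 8
        System.out.println("max int lanes:   " + maxIntLanes());    // 4
        System.out.println("common lanes:    " + commonLanes());    // 4
        System.out.println("matches <F,8>:   " + matchesLoadVectorF(irLine, maxFloatLanes())); // false
        System.out.println("matches <F,4>:   " + matchesLoadVectorF(irLine, commonLanes()));   // true
    }
}
```

Under these assumptions, the rule only matches once the expected size is the minimum over both element types, which is what the `min(max_int, max_float)` form expresses.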

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26457#issuecomment-3241438142


More information about the core-libs-dev mailing list