RFR: 8291809: Convert compiler/c2/cr7200264/TestSSE2IntVect.java to IR verification test [v2]

Emanuel Peter epeter at openjdk.org
Thu Jan 25 14:33:38 UTC 2024


On Thu, 25 Jan 2024 14:16:50 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:

>> Thanks for the clarification @robcasloz and @chhagedorn. I've investigated now, and they do vectorize on my machine as well. I was confused because, before the change below, the IR framework did not register the nodes (wrong vector size of 4 instead of the default of 8). Is that expected, and should we specify something else instead of the catch-all `IRNode.VECTOR_SIZE_ANY`?
>> 
>> 
>>      @Test
>> -    @IR(counts = { IRNode.ADD_VI,    "> 0",
>> -                   IRNode.RSHIFT_VI, "> 0",
>> -                   IRNode.SUB_VI,    "> 0" },
>> +    @IR(counts = { IRNode.ADD_VI,    IRNode.VECTOR_SIZE_ANY, "> 0",
>> +                   IRNode.RSHIFT_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
>> +                   IRNode.SUB_VI,    IRNode.VECTOR_SIZE_ANY, "> 0" },
>>          applyIfCPUFeatureOr = {"sse2", "true", "asimd", "true"})
>>      void test_divc(int[] a0, int[] a1) {
>>          for (int i = 0; i < a0.length; i+=1) {
>> @@ -519,9 +519,9 @@ void test_divc(int[] a0, int[] a1) {
>>      }
>>  
>>      @Test
>> -    @IR(counts = { IRNode.ADD_VI,    "> 0",
>> -                   IRNode.RSHIFT_VI, "> 0",
>> -                   IRNode.SUB_VI,    "> 0" },
>> +    @IR(counts = { IRNode.ADD_VI,    IRNode.VECTOR_SIZE_ANY, "> 0",
>> +                   IRNode.RSHIFT_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
>> +                   IRNode.SUB_VI,    IRNode.VECTOR_SIZE_ANY, "> 0" },
>>          applyIfCPUFeatureOr = {"sse2", "true", "asimd", "true"})
>>      void test_divc_n(int[] a0, int[] a1) {
>>          for (int i = 0; i < a0.length; i+=1) {
>
>> @dlunde do you understand what factors determine the length of the vector? Why is the default of IRNode.VECTOR_SIZE_MAX not working?
> 
> Perhaps C2 hits the loop unrolling limit? @dlunde you can test this by trying out a large value for `-XX:LoopUnrollLimit`. But even if this turned out to be the case, I would still suggest using `IRNode.VECTOR_SIZE_ANY` rather than forcing a higher loop unroll limit value for the tests.

Ah, I see what the issue is here: the loop contains not just `int` vectors but also `long` vectors.
Specifically, I see `VectorCastI2X` and `VectorCastL2X` nodes in the loop, which convert `int` to/from `long`.
Hence, with a `32 byte` vector you can only fit `4 long` elements, and so the loop unrolling is limited to 4x.
As a result, you only see vectors of `4 int`, when you were expecting vectors of `8 int`.
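
To make the arithmetic concrete, here is a minimal standalone sketch. It assumes a 32-byte vector register as in the example above. The class and method names are made up for illustration and are not part of C2 or the IR framework.

```java
// Hypothetical helper, for illustration only: computes how many elements
// fit in one vector and which element count a loop mixing types can use.
public class VectorLaneSketch {

    // Number of elements of the given byte width that fit in one vector.
    static int lanes(int vectorBytes, int elementBytes) {
        return vectorBytes / elementBytes;
    }

    // A loop that mixes int and long vectors is limited by the type with
    // the fewest elements per vector, i.e. min(max_int, max_long).
    static int commonLanes(int vectorBytes) {
        int maxInt = lanes(vectorBytes, Integer.BYTES); // 4-byte elements
        int maxLong = lanes(vectorBytes, Long.BYTES);   // 8-byte elements
        return Math.min(maxInt, maxLong);
    }

    public static void main(String[] args) {
        int vectorBytes = 32; // assumed vector width in bytes
        System.out.println("max_int  = " + lanes(vectorBytes, Integer.BYTES));
        System.out.println("max_long = " + lanes(vectorBytes, Long.BYTES));
        System.out.println("min(max_int, max_long) = " + commonLanes(vectorBytes));
    }
}
```

With these assumptions the `int` vectors end up with 4 elements rather than 8, which would explain why a check that expects the maximum `int` vector size does not match.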

You will probably be able to fix the issue with this:
`IRNode.VECTOR_SIZE + "min(max_int, max_long)"`

For more examples, check out:
`grep "IRNode.VECTOR_SIZE +" test/hotspot/jtreg/ -r`

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17428#discussion_r1466460916

