RFR: 8291809: Convert compiler/c2/cr7200264/TestSSE2IntVect.java to IR verification test [v2]
Emanuel Peter
epeter at openjdk.org
Thu Jan 25 14:33:38 UTC 2024
On Thu, 25 Jan 2024 14:16:50 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:
>> Thanks for the clarification @robcasloz and @chhagedorn. I've investigated now, and they do vectorize on my machine as well. I was confused because, before the change below, the IR framework did not register the nodes (wrong vector size of 4 instead of the default of 8). Is that expected, and should we specify something else instead of the catch-all `IRNode.VECTOR_SIZE_ANY`?
>>
>>
>> @Test
>> - @IR(counts = { IRNode.ADD_VI, "> 0",
>> - IRNode.RSHIFT_VI, "> 0",
>> - IRNode.SUB_VI, "> 0" },
>> + @IR(counts = { IRNode.ADD_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
>> + IRNode.RSHIFT_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
>> + IRNode.SUB_VI, IRNode.VECTOR_SIZE_ANY, "> 0" },
>> applyIfCPUFeatureOr = {"sse2", "true", "asimd", "true"})
>> void test_divc(int[] a0, int[] a1) {
>> for (int i = 0; i < a0.length; i+=1) {
>> @@ -519,9 +519,9 @@ void test_divc(int[] a0, int[] a1) {
>> }
>>
>> @Test
>> - @IR(counts = { IRNode.ADD_VI, "> 0",
>> - IRNode.RSHIFT_VI, "> 0",
>> - IRNode.SUB_VI, "> 0" },
>> + @IR(counts = { IRNode.ADD_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
>> + IRNode.RSHIFT_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
>> + IRNode.SUB_VI, IRNode.VECTOR_SIZE_ANY, "> 0" },
>> applyIfCPUFeatureOr = {"sse2", "true", "asimd", "true"})
>> void test_divc_n(int[] a0, int[] a1) {
>> for (int i = 0; i < a0.length; i+=1) {
>
>> @dlunde do you understand what factors determine the length of the vector? Why is the default of IRNode.VECTOR_SIZE_MAX not working?
>
> Perhaps C2 hits the loop unrolling limit? @dlunde you can test this by trying out a large value for `-XX:LoopUnrollLimit`. But even if this turned out to be the case, I would still suggest using `IRNode.VECTOR_SIZE_ANY` rather than forcing a higher loop unroll limit value for the tests.
Ah, I see what the issue is here: the loop contains not only `int` vectors but also `long` vectors.
Specifically, I see `VectorCastI2X` and `VectorCastL2X` nodes in the loop, which convert between `int` and `long`.
Hence, if you have a `32 byte` vector, it can hold only `4 long`, and so the loop unrolling is limited to 4x.
As a result, you only see `4 int` vectors, when you were expecting `8 int` vectors.
You will probably be able to fix the issue with this:
`IRNode.VECTOR_SIZE + "min(max_int, max_long)"`
For more examples, check out:
`grep "IRNode.VECTOR_SIZE +" test/hotspot/jtreg/ -r`
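To make the lane arithmetic concrete, here is a small standalone sketch. It is plain Java and does not use the IR framework; the class and method names are made up for illustration, and the 32-byte width is just the example from above. It only mirrors the reasoning that the widest element type in the loop bounds the lane count, and is not a description of how the IR framework evaluates `min(max_int, max_long)` internally.

```java
public class VectorLaneSketch {
    // Number of elements of the given size that fit into one vector.
    static int lanes(int vectorBytes, int elementBytes) {
        return vectorBytes / elementBytes;
    }

    // Lane count usable by a loop that mixes int and long vectors:
    // the type with the fewest lanes per vector limits both.
    static int commonLanes(int vectorBytes) {
        int maxInt = lanes(vectorBytes, Integer.BYTES);
        int maxLong = lanes(vectorBytes, Long.BYTES);
        return Math.min(maxInt, maxLong);
    }

    public static void main(String[] args) {
        int vectorBytes = 32; // assumed vector width, taken from the example above
        System.out.println("max int lanes:  " + lanes(vectorBytes, Integer.BYTES));
        System.out.println("max long lanes: " + lanes(vectorBytes, Long.BYTES));
        System.out.println("common lanes:   " + commonLanes(vectorBytes));
    }
}
```

With a 32-byte vector this gives 8 `int` lanes, 4 `long` lanes, and a common lane count of 4, which matches the `4 int` vectors observed in the test.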
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17428#discussion_r1466460916