RFR: 8291809: Convert compiler/c2/cr7200264/TestSSE2IntVect.java to IR verification test [v8]

Mon Jan 29 15:55:46 UTC 2024

On Mon, 29 Jan 2024 14:03:11 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Daniel Lundén has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Change to avx2 CPU feature check
>
> I looked into the failure with `asimd` on aarch64, by writing this test:
> 
> 
> public class Test {
>     static int RANGE = 10_000;
> 
>     public static void main(String[] args) {
>         int[] a = new int[RANGE];
>         int[] b = new int[RANGE];
>         for (int i = 0; i < 10_000; i++) {
>             test1(a, b);
>             test2(a, b, i % 200 - 100);
>         }
>     }
> 
>     static void test1(int[] a, int[] b) {
>         for (int i = 0; i < a.length; i++) {
>             a[i] = b[i] / 15;
>         }
>     }
> 
>     static void test2(int[] a, int[] b, int s) {
>         for (int i = 0; i < a.length; i++) {
>             a[i] = b[i] / 7;
>         }
>     }
> }
> 
> 
> And running this command:
> `./java -XX:CompileCommand=compileonly,Test::test1 -XX:+TraceSuperWord -XX:+TraceLoopOpts -XX:+TraceNewVectors Test.java`
> 
> In the logs, I see it attempts to vectorize, creating packs like this:
> 
> ...
> Pack: 7
>  align: 0 	 678  RShiftI  === _ 679 153  [[ 671 ]]  !orig=561,154 !jvms: Test::test1 @ bci:15 (line 15)
>  align: 4 	 667  RShiftI  === _ 668 153  [[ 660 ]]  !orig=154 !jvms: Test::test1 @ bci:15 (line 15)
>  align: 8 	 561  RShiftI  === _ 562 153  [[ 554 ]]  !orig=154 !jvms: Test::test1 @ bci:15 (line 15)
>  align: 12 	 154  RShiftI  === _ 251 153  [[ 155 ]]  !jvms: Test::test1 @ bci:15 (line 15)
> Pack: 8
>  align: 0 	 676  MulL  === _ 677 144  [[ 675 ]]  !orig=559,146 !jvms: Test::test1 @ bci:15 (line 15)
>  align: 8 	 665  MulL  === _ 666 144  [[ 664 ]]  !orig=146 !jvms: Test::test1 @ bci:15 (line 15)
>  ...
> 
> 
> But then, I also see:
> 
> Unimplemented
>  559  MulL  === _ 560 144  [[ 558 ]]  !orig=146 !jvms: Test::test1 @ bci:15 (line 15)
> 
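
For context on why `MulL` and `RShiftI` nodes appear for a plain `b[i] / 15`: as far as I understand, C2 strength-reduces int division by a constant into a widening multiply by a "magic" constant, followed by shifts and a sign correction. The sketch below only illustrates that arithmetic and is not HotSpot's actual code. The class name, the method name, and the particular constant and shift (ceil(2^35 / 15) and 35) are assumptions chosen for this example.

```java
public class DivBy15Sketch {
    // Assumed constants for illustration: MAGIC = ceil(2^35 / 15), so 15 * MAGIC = 2^35 + 7.
    static final long MAGIC = 2290649225L;
    static final int SHIFT = 35;

    // Computes x / 15 without a division instruction.
    static int div15(int x) {
        long product = (long) x * MAGIC;   // widening 64-bit multiply (compare the MulL pack)
        int q = (int) (product >> SHIFT);  // floor(x * MAGIC / 2^35)
        return q - (x >> 31);              // x >> 31 is -1 for negative x, so this adds 1
                                           // (compare the RShiftI pack)
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 14, 15, 16, -1, -14, -15, -16,
                         Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            if (div15(x) != x / 15) {
                throw new AssertionError("mismatch for " + x);
            }
        }
        System.out.println("all samples match x / 15");
    }
}
```

The point for this thread is that the division only vectorizes if the platform can vectorize the 64-bit multiply, which matches the "Unimplemented" message for `MulL` above.
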
> 
> And in `src/hotspot/cpu/aarch64/aarch64_vector.ad`, I see this:
> 
>   bool Matcher::match_rule_supported_auto_vectorization(int opcode, int vlen, BasicType bt) {
>     if (UseSVE == 0) {
>       // These operations are not profitable to be vectorized on NEON, because no direct
>       // NEON instructions support them. But the match rule support for them is profitable for
>       // Vector API intrinsics.
>       if ((opcode == Op_VectorCastD2X && bt == T_INT) ||
>           (opcode == Op_VectorCastL2X && bt == T_FLOAT) ||
>           (opcode == Op_CountLeadingZerosV && bt == T_LONG) ||
>           (opcode == Op_CountTrailingZerosV && bt == T_LONG) ||
>           // The vector implementation of Op_AddReductionVD/F is for the Vector API only.
>           // It is not suitable for auto-vectorization because it does not add the elements
>           // in the same order as sequential code, and FP addition is ...

Thanks @eme64, updated now. I'll rerun the tests before integrating.

@pfustc @fg1417: @robcasloz recommended that I ask you to check that the `sve` IR checks do not fail for `test_divc` and `test_divc_n` in this changeset. Do you have machines on which you can check this?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17428#issuecomment-1914986863