RFR: 8291809: Convert compiler/c2/cr7200264/TestSSE2IntVect.java to IR verification test [v8]
Daniel Lundén
dlunden at openjdk.org
Mon Jan 29 15:55:46 UTC 2024
On Mon, 29 Jan 2024 14:03:11 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Daniel Lundén has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Change to avx2 CPU feature check
>
> I looked into the failure with `asimd` on aarch64, by writing this test:
>
>
> public class Test {
>     static int RANGE = 10_000;
>
>     public static void main(String[] args) {
>         int[] a = new int[RANGE];
>         int[] b = new int[RANGE];
>         for (int i = 0; i < 10_000; i++) {
>             test1(a, b);
>             test2(a, b, i % 200 - 100);
>         }
>     }
>
>     static void test1(int[] a, int[] b) {
>         for (int i = 0; i < a.length; i++) {
>             a[i] = b[i] / 15;
>         }
>     }
>
>     static void test2(int[] a, int[] b, int s) {
>         for (int i = 0; i < a.length; i++) {
>             a[i] = b[i] / 7;
>         }
>     }
> }
>
>
> And running this command:
> `./java -XX:CompileCommand=compileonly,Test::test1 -XX:+TraceSuperWord -XX:+TraceLoopOpts -XX:+TraceNewVectors Test.java`
>
> In the logs, I see that it attempts to vectorize, creating packs like this:
>
> ...
> Pack: 7
> align: 0 678 RShiftI === _ 679 153 [[ 671 ]] !orig=561,154 !jvms: Test::test1 @ bci:15 (line 15)
> align: 4 667 RShiftI === _ 668 153 [[ 660 ]] !orig=154 !jvms: Test::test1 @ bci:15 (line 15)
> align: 8 561 RShiftI === _ 562 153 [[ 554 ]] !orig=154 !jvms: Test::test1 @ bci:15 (line 15)
> align: 12 154 RShiftI === _ 251 153 [[ 155 ]] !jvms: Test::test1 @ bci:15 (line 15)
> Pack: 8
> align: 0 676 MulL === _ 677 144 [[ 675 ]] !orig=559,146 !jvms: Test::test1 @ bci:15 (line 15)
> align: 8 665 MulL === _ 666 144 [[ 664 ]] !orig=146 !jvms: Test::test1 @ bci:15 (line 15)
> ...
>
>
> But then, I also see:
>
> Unimplemented
> 559 MulL === _ 560 144 [[ 558 ]] !orig=146 !jvms: Test::test1 @ bci:15 (line 15)
>
>
> And in `src/hotspot/cpu/aarch64/aarch64_vector.ad`, I see this:
>
> bool Matcher::match_rule_supported_auto_vectorization(int opcode, int vlen, BasicType bt) {
>   if (UseSVE == 0) {
>     // These operations are not profitable to be vectorized on NEON, because no direct
>     // NEON instructions support them. But the match rule support for them is profitable for
>     // Vector API intrinsics.
>     if ((opcode == Op_VectorCastD2X && bt == T_INT) ||
>         (opcode == Op_VectorCastL2X && bt == T_FLOAT) ||
>         (opcode == Op_CountLeadingZerosV && bt == T_LONG) ||
>         (opcode == Op_CountTrailingZerosV && bt == T_LONG) ||
>         // The vector implementation of Op_AddReductionVD/F is for the Vector API only.
>         // It is not suitable for auto-vectorization because it does not add the elements
>         // in the same order as sequential code, and FP addition is ...
Thanks @eme64, updated now. I'll rerun the tests before integrating.
@pfustc @fg1417: @robcasloz recommended that I ask you to check that the `sve` IR checks do not fail for `test_divc` and `test_divc_n` in this changeset. Do you have machines on which you can check this?
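
For context on why `MulL` and `RShiftI` nodes show up in the quoted packs for `b[i] / 15`: a division by a constant is commonly strength-reduced to a 64-bit multiply by a "magic" constant followed by shifts. The sketch below illustrates that general technique in plain Java. The class name, method name, and constants are hypothetical and chosen for illustration; the exact node sequence C2 emits may differ.

```java
public class DivByConstantSketch {
    // Hypothetical illustration constants: ceil(2^35 / 15) and the matching shift.
    static final long MAGIC_15 = 0x88888889L;
    static final int SHIFT_15 = 35;

    // Computes n / 15 without a division instruction.
    static int divideBy15(int n) {
        long product = (long) n * MAGIC_15;   // 64-bit multiply (compare: MulL)
        int q = (int) (product >> SHIFT_15);  // arithmetic shift, rounds toward -infinity
        return q - (n >> 31);                 // add 1 for negative n to round toward zero
    }

    public static void main(String[] args) {
        int[] samples = {
            Integer.MIN_VALUE, -31, -30, -16, -15, -14, -1,
            0, 1, 14, 15, 16, 29, 30, 1_000_003, Integer.MAX_VALUE
        };
        for (int n : samples) {
            if (divideBy15(n) != n / 15) {
                throw new AssertionError("mismatch for n = " + n);
            }
        }
        System.out.println("all samples match n / 15");
    }
}
```

The multiply operates on 64-bit values, so vectorizing the loop needs a vector multiply on longs. That is consistent with the `Unimplemented` message for `MulL` in the quoted log.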
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17428#issuecomment-1914986863