[aarch64-port-dev ] RFR(M): 8213134: AArch64: vector shift failed with MaxVectorSize=8
Yang Zhang (Arm Technology China)
Yang.Zhang at arm.com
Wed Nov 7 07:07:23 UTC 2018
Hi,
While implementing AArch64 NEON support for the Vector API (http://openjdk.java.net/jeps/338), I found an issue with vector shifts. I have a patch that fixes it. Could someone please review it?
Webrev: http://cr.openjdk.java.net/~yzhang/8213134/webrev.00/
JBS: https://bugs.openjdk.java.net/browse/JDK-8213134
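This patch has been verified on both jdk/jdk master and panama/vectorIntrinsics, and the tests pass.
The only concern is that the patched code needs extra instructions in some cases: the neg that prepares the shift count is emitted once per vector shift instead of once in total.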
For example:
  public static void aShiftRImm(int[] a, int b, int[] c) {
    for (int i = 0; i < a.length; i++) {
      c[i] = (int)(a[i] >> b);
    }
  }
Without this patch, C2 generates the following code for this method:
0x0000ffff84e75864: dup v16.16b, w10
0x0000ffff84e75868: neg v16.16b, v16.16b    <-- only 1 neg instruction
0x0000ffff84e75878: ldr q17, [x12, #16]
0x0000ffff84e7587c: sshl v17.4s, v17.4s, v16.4s
0x0000ffff84e75884: str q17, [x10, #16]
0x0000ffff84e75888: ldr q17, [x12, #32]
0x0000ffff84e7588c: sshl v17.4s, v17.4s, v16.4s
0x0000ffff84e75890: str q17, [x10, #32]
0x0000ffff84e75894: ldr q17, [x12, #48]
0x0000ffff84e75898: sshl v17.4s, v17.4s, v16.4s
0x0000ffff84e7589c: str q17, [x10, #48]
0x0000ffff84e758a0: ldr q17, [x12, #64]
0x0000ffff84e758a4: sshl v17.4s, v17.4s, v16.4s
0x0000ffff84e758ac: str q17, [x10, #64]
With this patch, C2 generates the following code for this method:
0x0000ffff78e708e4: dup v16.16b, w10
0x0000ffff78e708f8: ldr q17, [x12, #16]
0x0000ffff78e708fc: neg v18.16b, v16.16b    <-- 4 neg instructions
0x0000ffff78e70900: sshl v17.4s, v17.4s, v18.4s
0x0000ffff78e70908: str q17, [x10, #16]
0x0000ffff78e7090c: ldr q17, [x12, #32]
0x0000ffff78e70910: neg v18.16b, v16.16b
0x0000ffff78e70914: sshl v17.4s, v17.4s, v18.4s
0x0000ffff78e70918: str q17, [x10, #32]
0x0000ffff78e7091c: ldr q17, [x12, #48]
0x0000ffff78e70920: neg v18.16b, v16.16b
0x0000ffff78e70924: sshl v17.4s, v17.4s, v18.4s
0x0000ffff78e70928: str q17, [x10, #48]
0x0000ffff78e7092c: ldr q17, [x12, #64]
0x0000ffff78e70930: neg v18.16b, v16.16b
0x0000ffff78e70934: sshl v17.4s, v17.4s, v18.4s
0x0000ffff78e7093c: str q17, [x10, #64]
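For readers less familiar with NEON: the neg is there because sshl shifts left by a signed per-lane amount taken from the low byte of each lane, and a negative amount shifts right. A variable right shift is therefore done as dup, neg, sshl. The following is only a rough scalar model of that sequence, not code from the patch. The class and method names are hypothetical, and it assumes 0 <= b <= 31.

```java
// Scalar model of the dup/neg/sshl sequence in the listings above.
// All names here are hypothetical and exist only for illustration.
class SshlSketch {

    // Models one 32-bit lane of SSHL: the shift amount is the signed low
    // byte of the lane in the second operand; a negative amount shifts right.
    static int sshlLane(int value, int shiftLane) {
        int s = (byte) shiftLane;          // signed low byte, -128..127
        if (s >= 0) {
            return s >= 32 ? 0 : value << s;
        }
        int r = -s;                        // right-shift distance, 1..128
        return r >= 32 ? value >> 31 : value >> r;
    }

    // Models "dup v16.16b, w10" followed by "neg v16.16b, v16.16b" as seen
    // from one 32-bit lane: every byte of the lane holds -b.
    static int negatedShiftLane(int b) {
        int nb = (-b) & 0xff;
        return nb | (nb << 8) | (nb << 16) | (nb << 24);
    }

    // Same result as the Java loop c[i] = a[i] >> b, assuming 0 <= b <= 31.
    static void aShiftRViaSshl(int[] a, int b, int[] c) {
        int shiftLane = negatedShiftLane(b);   // computed once, before the loop
        for (int i = 0; i < a.length; i++) {
            c[i] = sshlLane(a[i], shiftLane);
        }
    }

    public static void main(String[] args) {
        int[] a = {-8, 7, 1024, Integer.MIN_VALUE};
        int[] c = new int[a.length];
        aShiftRViaSshl(a, 3, c);
        for (int i = 0; i < a.length; i++) {
            // Print the modelled result next to the plain Java shift.
            System.out.println(c[i] + " " + (a[i] >> 3));
        }
    }
}
```

In this model the negated count is computed once before the loop, which corresponds to the listing without the patch. The listing with the patch corresponds to recomputing it before every sshl.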
Regards
Yang