[aarch64-port-dev ] RFR(S): 8243597: AArch64: Add support for integer vector abs
Andrew Haley
aph at redhat.com
Thu Jun 4 16:09:41 UTC 2020
On 18/05/2020 06:51, Yang Zhang wrote:
> Testing:
> Full jtreg test
> Vector API tests which cover vector abs
>
> Test case:
> public static void absvs(short[] a, short[] b, short[] c) {
>     for (int i = 0; i < a.length; i++) {
>         c[i] = (short)Math.abs((a[i] + b[i]));
>     }
> }
>
> Assembly code generated by C2:
> 0x0000ffffaca3f3ac: ldr q17, [x16, #16]
> 0x0000ffffaca3f3b0: ldr q16, [x15, #16]
> 0x0000ffffaca3f3b4: add v16.8h, v16.8h, v17.8h
> 0x0000ffffaca3f3b8: abs v16.8h, v16.8h
> 0x0000ffffaca3f3c0: str q16, [x12, #16]
>
> Similar test cases for byte/int/long were also tested, and C2 generates the NEON abs instruction for them.

Unfortunately, the test cases you provided do not include the method
absvs(short).

I'm not seeing this result. With your patch applied, all I get for
your test TestScalar

  @Benchmark
  public void testAbsI() {
      for (int n = 0; n < LOOP_CNT; n++) {
          for (int i = 0; i < ia.length; i += 4) {
              ic[i] = Math.abs(ia[i] + ib[i]);
          }
      }
  }

is
;; B18: # out( B18 B19 ) <- in( B17 B18 ) Loop( B18-B18 inner main of N82 strip mined) Freq: 9.69583e+08
0x0000ffff78824da0: sbfiz x11, x4, #2, #32
0x0000ffff78824da4: add x7, x0, x11 ;*iaload {reexecute=0 rethrow=0 return_oop=0}
; - org.openjdk.TestScalar::testAbsI@36 (line 44)
0x0000ffff78824da8: add xmethod, x18, x11 ;*iaload {reexecute=0 rethrow=0 return_oop=0}
; - org.openjdk.TestScalar::testAbsI@30 (line 44)
0x0000ffff78824dac: ldr w2, [x7,#16]
0x0000ffff78824db0: ldr w13, [xmethod,#16]
0x0000ffff78824db4: add w13, w13, w2
0x0000ffff78824db8: cmp w13, wzr
0x0000ffff78824dbc: cneg w1, w13, lt
0x0000ffff78824dc0: add x11, x3, x11
0x0000ffff78824dc4: str w1, [x11,#16]
0x0000ffff78824dc8: ldr w2, [x7,#32]
0x0000ffff78824dcc: ldr w1, [xmethod,#32]
0x0000ffff78824dd0: add w13, w1, w2
0x0000ffff78824dd4: cmp w13, wzr
0x0000ffff78824dd8: cneg w1, w13, lt
0x0000ffff78824ddc: str w1, [x11,#32]
0x0000ffff78824de0: ldr w13, [x7,#48]
0x0000ffff78824de4: ldr w1, [xmethod,#48]
0x0000ffff78824de8: add w1, w1, w13
0x0000ffff78824dec: cmp w1, wzr
0x0000ffff78824df0: cneg w13, w1, lt
0x0000ffff78824df4: str w13, [x11,#48] ;*iastore {reexecute=0 rethrow=0 return_oop=0}
; - org.openjdk.TestScalar::testAbsI@41 (line 44)
0x0000ffff78824df8: ldr w1, [x7,#64]
0x0000ffff78824dfc: ldr w12, [xmethod,#64]
0x0000ffff78824e00: add w12, w12, w1
0x0000ffff78824e04: cmp w12, wzr
0x0000ffff78824e08: cneg w13, w12, lt
0x0000ffff78824e0c: add w4, w4, #0x10 ;*iinc {reexecute=0 rethrow=0 return_oop=0}
; - org.openjdk.TestScalar::testAbsI@42 (line 43)
0x0000ffff78824e10: str w13, [x11,#64] ;*iastore {reexecute=0 rethrow=0 return_oop=0}
; - org.openjdk.TestScalar::testAbsI@41 (line 44)
0x0000ffff78824e14: cmp w4, w6
0x0000ffff78824e18: b.lt 0x0000ffff78824da0 ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
; - org.openjdk.TestScalar::testAbsI@17 (line 43)

Please provide me with a Java program that reproduces the result
above, thanks.
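
A standalone reproducer could look something like the sketch below. It
reuses the absvs method from the quoted test case; the class name
AbsVsRepro, the array length, and the iteration count are hypothetical
choices, not taken from the patch or its tests. The loop here has unit
stride, whereas testAbsI steps i by 4; whether that difference accounts
for the scalar code shown above is an assumption that would need checking.

```java
public class AbsVsRepro {
    // Same kernel as the quoted absvs test case: a unit-stride loop over short arrays.
    public static void absvs(short[] a, short[] b, short[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = (short) Math.abs(a[i] + b[i]);
        }
    }

    public static void main(String[] args) {
        final int len = 1024;          // hypothetical array length
        final int iterations = 20_000; // hypothetical; intended to be enough calls for C2 to compile absvs
        short[] a = new short[len];
        short[] b = new short[len];
        short[] c = new short[len];
        // Fill the inputs with a mix of positive and negative values.
        for (int i = 0; i < len; i++) {
            a[i] = (short) (i - len / 2);
            b[i] = (short) -i;
        }
        for (int n = 0; n < iterations; n++) {
            absvs(a, b, c);
        }
        // Print two results so the work cannot be treated as dead code.
        System.out.println(c[0] + " " + c[len - 1]);
    }
}
```

Running it with something like
java -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,AbsVsRepro::absvs AbsVsRepro
(with the hsdis disassembler library installed) should print the
generated code for absvs, which can then be checked for the NEON abs
instruction.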
--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671