[aarch64-port-dev ] RFR(S): 8243597: AArch64: Add support for integer vector abs

Andrew Haley aph at redhat.com
Fri Jun 5 15:01:48 UTC 2020


On 05/06/2020 11:35, Yang Zhang wrote:
> Hi Andrew
> 
> Please check this Java program:
> http://cr.openjdk.java.net/~yzhang/8243597/TestAbs.java
> absvs is used to generate the AbsVS node.
> abss is used to generate the AbsI node.
> 
> I updated the JMH benchmarks to align them with absvs and abss above. The new results are as follows:
> New vector JMH benchmark:
> http://cr.openjdk.java.net/~yzhang/8243597/TestVectNew.java
> New scalar JMH benchmark:
> http://cr.openjdk.java.net/~yzhang/8243597/TestScalarNew.java
> 
> Before:
> Benchmark                  (size)  Mode  Cnt     Score    Error  Units
> TestVectNew.testVectAbsVB    1024  avgt    5  1221.852 ±  3.336  us/op
> TestVectNew.testVectAbsVI    1024  avgt    5  1450.422 ±  6.344  us/op
> TestVectNew.testVectAbsVL    1024  avgt    5  1429.934 ±  4.901  us/op
> TestVectNew.testVectAbsVS    1024  avgt    5  1227.134 ±  2.901  us/op
> TestScalarNew.testAbsI       1024  avgt    5  3777.007 ± 10.067  us/op
> TestScalarNew.testAbsL       1024  avgt    5  3776.717 ± 13.776  us/op
> TestScalarNew.testAbsS       1024  avgt    5  3153.195 ± 10.175  us/op
> 
> After:
> Benchmark                  (size)  Mode  Cnt     Score    Error  Units
> TestVectNew.testVectAbsVB    1024  avgt    5   147.389 ±  0.921  us/op
> TestVectNew.testVectAbsVI    1024  avgt    5   444.318 ± 14.107  us/op
> TestVectNew.testVectAbsVL    1024  avgt    5   874.074 ±  2.224  us/op
> TestVectNew.testVectAbsVS    1024  avgt    5   224.559 ±  0.902  us/op
> TestScalarNew.testAbsI       1024  avgt    5  3087.172 ± 62.372  us/op
> TestScalarNew.testAbsL       1024  avgt    5  3113.322 ± 10.237  us/op
> TestScalarNew.testAbsS       1024  avgt    5  2723.048 ±  8.338  us/op

I tried TestAbs with a ThunderX2, and it certainly looks nice: great
improvement across the board.

Before:
Benchmark      Mode  Cnt     Score    Error  Units
TestAbs.absvb  avgt    8   971.100 ±  1.544  ns/op
TestAbs.absvs  avgt    8   983.061 ±  1.626  ns/op
TestAbs.absvi  avgt    8  1170.826 ± 11.055  ns/op
TestAbs.absvl  avgt    8  1159.936 ±  3.747  ns/op

After:
Benchmark      Mode  Cnt    Score   Error  Units
TestAbs.absvb  avgt    8  117.981 ± 1.048  ns/op
TestAbs.absvs  avgt    8  174.949 ± 4.158  ns/op
TestAbs.absvi  avgt    8  352.012 ± 0.884  ns/op
TestAbs.absvl  avgt    8  702.076 ± 0.116  ns/op

OK, we're good to go. Thanks, approved.

> The improvement for scalar abs is less pronounced than for vector abs because the new code saves only one instruction.
> Before:
>   0x0000ffff80b763d8:   cmp	w12, #0x0
>   0x0000ffff80b763dc:   neg	w11, w12
>   0x0000ffff80b763e0:   csel	w11, w11, w12, lt  // lt = tstop
> 
> After:
>   0x0000ffffa0bd7a38:   cmp	w12, wzr
>   0x0000ffffa0bd7a3c:   cneg	w13, w12, lt  // lt = tstop

That's interesting, too: we don't have a cneg pattern, which I guess
is an omission.
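
For reference, here is a minimal sketch of the kind of Java source under
discussion. It is not the contents of TestAbs.java; the class and method
names (AbsSketch, absScalar, absSelect, absShorts) are made up for
illustration. absSelect spells out the value that both the cmp/neg/csel
and the cmp/cneg sequences compute, and absShorts is the sort of loop one
would expect C2 to consider for an AbsVS node, although whether it is
actually vectorized depends on the JIT and the hardware.

```java
// Hypothetical illustration only; not taken from TestAbs.java.
final class AbsSketch {

    // Scalar abs on an int via the library call.
    static int absScalar(int x) {
        return Math.abs(x);
    }

    // The same value written as a conditional negate:
    // negate x when it is less than zero, otherwise keep it.
    // Integer.MIN_VALUE maps to itself, exactly as with Math.abs.
    static int absSelect(int x) {
        return x < 0 ? -x : x;
    }

    // Element-wise abs over a short array. Math.abs operates on the
    // int-promoted value, so the result is narrowed back to short;
    // Short.MIN_VALUE therefore wraps back to Short.MIN_VALUE.
    static void absShorts(short[] src, short[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = (short) Math.abs(src[i]);
        }
    }
}
```

The conditional form is written out only to make the semantics of the
two instruction sequences explicit; both versions return the same value
for every int input.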

> P.S. The generated assembly files are also attached.
> Before this patch:
> http://cr.openjdk.java.net/~yzhang/8243597/TestAbs.java.aarch64.ori.asm
> After this patch:
> http://cr.openjdk.java.net/~yzhang/8243597/TestAbs.java.aarch64.asm

Great. Again, sorry for the slow response.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


