[aarch64-port-dev ] RFR: 8182583: AArch64: FMA Vectorization on aarch64

Yang Zhang yang.zhang at linaro.org
Tue Jun 27 06:27:43 UTC 2017


On 27 June 2017 at 01:12, Andrew Haley <aph at redhat.com> wrote:
> On 26/06/17 08:44, Yang Zhang wrote:
>> Webrev:
>> http://cr.openjdk.java.net/~njian/8182583/webrev.00/
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8182583
>>
>> Would you please help to review it?
>
> That looks right, thanks.  How was it tested?


For the correctness test:
I used test code such as
for (int i = 0; i < loop; i++) {
    c[i] = Math.fma(a[i], b[i], c[i]);
}
The arrays a, b and c can be float or double. I also tested the other
combinations:
c[i] = Math.fma(-a[i], b[i], c[i])
c[i] = Math.fma(a[i], -b[i], c[i])
c[i] = Math.fma(a[i], b[i], -c[i])
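
For illustration, a self-contained sketch of such a test could look like
the following. The class name FmaLoopCheck and the method names are
hypothetical and are not taken from the webrev. Only the double variants
are shown; the float ones would be analogous. It assumes JDK 9 or later,
where Math.fma is available.

```java
public class FmaLoopCheck {
    // c[i] = a[i] * b[i] + c[i], with a single rounding step.
    static void fma(double[] a, double[] b, double[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = Math.fma(a[i], b[i], c[i]);
        }
    }

    // c[i] = (-a[i]) * b[i] + c[i]
    static void fmaNegA(double[] a, double[] b, double[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = Math.fma(-a[i], b[i], c[i]);
        }
    }

    // c[i] = a[i] * (-b[i]) + c[i]
    static void fmaNegB(double[] a, double[] b, double[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = Math.fma(a[i], -b[i], c[i]);
        }
    }

    // c[i] = a[i] * b[i] - c[i]
    static void fmaNegC(double[] a, double[] b, double[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = Math.fma(a[i], b[i], -c[i]);
        }
    }

    public static void main(String[] args) {
        // The second element separates fused from unfused arithmetic:
        // (1 + 2^-30) * (1 - 2^-30) = 1 - 2^-60 exactly, so the fused
        // result is -2^-60, while a separate multiply and add gives 0.0.
        double[] a = {2.0, 1.0 + 0x1.0p-30};
        double[] b = {3.0, 1.0 - 0x1.0p-30};
        double[] c = {10.0, -1.0};
        fma(a, b, c);
        System.out.println(c[0] + " " + c[1]);
    }
}
```

Inputs like the second element are useful because they give a wrong
answer if the multiply and add are ever rounded separately.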

First, I checked the generated assembly. Depending on the case, the
following SIMD instructions are generated (.4s for float, .2d for double):
fmla v16.4s, v17.4s, v18.4s
fmls v16.4s, v17.4s, v18.4s
fmla v16.2d, v17.2d, v20.2d
fmls v16.2d, v17.2d, v20.2d

Then I used the x86 results as a reference to verify the aarch64 output
locally. They are the same.
P.S. The patch "AArch64: Intrinsify fused mac operations"
(https://bugs.openjdk.java.net/browse/JDK-8162338) added the test file
test/compiler/floatingpoint/TestFMA.java (hotspot suite). That test
passes too.
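
One possible way to do such a cross-platform comparison is sketched
below. It is an assumption about the procedure, not the actual harness
used here: the class name FmaBitsDump, the seed and the array size are
all hypothetical. The idea is to fill the inputs from a fixed seed, run
the loop, and print each result as its raw IEEE 754 bit pattern, so the
outputs from the two machines can be compared with a plain text diff.

```java
import java.util.Random;

public class FmaBitsDump {
    // One 8-digit hex bit pattern per line. Raw bits are used so that
    // 0.0f and -0.0f (or different NaN payloads) do not compare equal.
    static String dump(float[] values) {
        StringBuilder sb = new StringBuilder();
        for (float v : values) {
            sb.append(String.format("%08x", Float.floatToRawIntBits(v)));
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int n = 1024;                 // hypothetical array size
        Random rnd = new Random(42);  // fixed seed: same inputs everywhere
        float[] a = new float[n];
        float[] b = new float[n];
        float[] c = new float[n];
        for (int i = 0; i < n; i++) {
            a[i] = rnd.nextFloat();
            b[i] = rnd.nextFloat();
            c[i] = rnd.nextFloat();
        }
        for (int i = 0; i < n; i++) {
            c[i] = Math.fma(a[i], b[i], c[i]);
        }
        System.out.print(dump(c));
    }
}
```

Running this on both machines, redirecting the output to a file and
diffing the two files would show any element that differs.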

For the performance test:
I added test cases like the ones above to JMH.

Regards
Yang


>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

