[aarch64-port-dev ] RFR: 8169697: aarch64: vectorized MLA instruction not generated for some test cases

Yang Zhang yang.zhang at linaro.org
Wed Nov 16 09:35:33 UTC 2016


Hi Andrew,

The following function is a test case:
  public static int vectMulAdd(
      int[] a,
      int[] b,
      int[] c,
      int[] d) {
    int total = 0;
    for (int i = 0; i < LENGTH; i++) {
      d[i] = (int)(a[i] * b[i] + c[i]);
      total += d[i];
    }
    return total;
  }
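
For anyone who wants to try this locally, a self-contained harness along the following lines should be enough to get the loop hot so that C2 compiles it. This is only a sketch: the class name, the value of LENGTH, the input values and the iteration count are my assumptions for illustration and are not taken from the patch or the jtreg test.

```java
public class VectMulAddHarness {
    // Assumed array length; the value of LENGTH is not given above.
    static final int LENGTH = 1024;

    // Same loop as the test case: multiply, add, store, and accumulate.
    static int vectMulAdd(int[] a, int[] b, int[] c, int[] d) {
        int total = 0;
        for (int i = 0; i < LENGTH; i++) {
            d[i] = (int)(a[i] * b[i] + c[i]);
            total += d[i];
        }
        return total;
    }

    // Fills the inputs with small deterministic values (hypothetical data)
    // and calls the kernel 'iterations' times so the JIT sees a hot method.
    static int run(int iterations) {
        int[] a = new int[LENGTH];
        int[] b = new int[LENGTH];
        int[] c = new int[LENGTH];
        int[] d = new int[LENGTH];
        for (int i = 0; i < LENGTH; i++) {
            a[i] = i;
            b[i] = 2;
            c[i] = 1;
        }
        int result = 0;
        for (int n = 0; n < iterations; n++) {
            result = vectMulAdd(a, b, c, d);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed iteration count, chosen only to be comfortably large.
        System.out.println(run(20_000));
    }
}
```

Running it with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly (which needs the hsdis disassembler plugin) is one way to inspect the code generated for the loop.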

C2 produces the following code for the loop body, with separate mul and add instructions:
  0x0000007f98af88fc: ldr   q18, [x19,#32]
  0x0000007f98af8900: ldr   q17, [x4,#32]
  0x0000007f98af8904: ldr   q19, [x20,#32]
  0x0000007f98af8908: mul   v17.4s, v17.4s, v18.4s
  0x0000007f98af890c: add   v17.4s, v17.4s, v19.4s

This can be further optimized so that the mul/add pair becomes a single mla:
  0x0000007f843485e0: ldr      q18, [x19,#16]
  0x0000007f843485e4: ldr      q17, [x20,#16]
  0x0000007f843485e8: ldr      q16, [x4,#16]
  0x0000007f843485ec: mla      v18.4s, v16.4s, v17.4s
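
To spell out why the two forms are interchangeable, here is a small scalar model of the lanes. It is only an illustration under my own assumptions: the class, the method names and the sample values are hypothetical. It models mla Vd.4S, Vn.4S, Vm.4S as "each 32-bit lane of Vd accumulates the product of the matching lanes of Vn and Vm" and compares that with a separate multiply followed by an add.

```java
import java.util.Arrays;

public class MlaLaneModel {
    static final int LANES = 4; // a .4s arrangement holds four 32-bit lanes

    // Models the mul + add pair: product first, then add the third operand.
    static int[] mulThenAdd(int[] n, int[] m, int[] addend) {
        int[] result = new int[LANES];
        for (int lane = 0; lane < LANES; lane++) {
            int product = n[lane] * m[lane];
            result[lane] = product + addend[lane];
        }
        return result;
    }

    // Models mla: the destination lane is also the accumulator input.
    static int[] mla(int[] acc, int[] n, int[] m) {
        int[] result = acc.clone();
        for (int lane = 0; lane < LANES; lane++) {
            result[lane] += n[lane] * m[lane];
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, including a lane whose product overflows.
        int[] n = {Integer.MAX_VALUE, -3, 7, 0};
        int[] m = {2, 5, -7, 9};
        int[] addend = {1, 15, 49, -1};
        System.out.println(Arrays.equals(mulThenAdd(n, m, addend), mla(addend, n, m)));
    }
}
```

Since Java int arithmetic wraps on overflow, both forms give the same lane values, so the fused form saves an instruction without changing the result.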


Regards
Yang


On 16 November 2016 at 17:09, Andrew Haley <aph at redhat.com> wrote:

> On 16/11/16 08:28, Ningsheng Jian wrote:
> > Both patches tested with fastdebug jtreg.
>
> Please provide a simple test case which exercises these patterns.
>
> Thanks,
>
> Andrew.
>
>

