[aarch64-port-dev ] [RFR] aarch64: C2 generate vectorized MLA/MLS instructions

felix yang fei.yang0953 at yahoo.com
Thu Dec 3 14:22:26 UTC 2015


Hi,
  Can someone help review and sponsor this code generation improvement for the aarch64 port?
  Bug: https://bugs.openjdk.java.net/browse/JDK-8144587
  Webrev: http://cr.openjdk.java.net/~fyang/8144587/webrev.00/

  hotspot/test/compiler/loopopts/superword/SumRed_Int.java can serve as a test case.  With this patch, the following code snippet generated by C2:
    0x0000007f6cec12cc: mul v19.4s, v16.4s, v17.4s
    0x0000007f6cec12d0: mul v16.4s, v16.4s, v18.4s
    0x0000007f6cec12d4: mul v17.4s, v18.4s, v17.4s
    0x0000007f6cec12d8: add v16.4s, v19.4s, v16.4s
    0x0000007f6cec12dc: add v16.4s, v16.4s, v17.4s
  will be further optimized into: 
    0x0000007f9cdb86dc: mul v19.4s, v16.4s, v17.4s
    0x0000007f9cdb86e0: mla v19.4s, v16.4s, v18.4s
    0x0000007f9cdb86e4: mla v19.4s, v17.4s, v18.4s
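
  For readers without the test source at hand, here is a minimal Java sketch of the kind of reduction loop that could produce the sequences above.  The class and method names are hypothetical and the loop body is an assumption inferred from the three multiplies in the listing, not a copy of SumRed_Int.java.  The second method spells out the multiply-accumulate form that mla computes per lane, to show that both forms yield the same sum.

```java
// Hypothetical sketch; names and loop shape are assumptions, not the actual test.
class MlaSketch {
    // Three products and two adds per element: the shape that maps to
    // the mul/mul/mul/add/add sequence once the loop is vectorized.
    static int sumMulAdd(int[] a, int[] b, int[] c, int total) {
        for (int i = 0; i < a.length; i++) {
            total += (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]);
        }
        return total;
    }

    // Same computation written as one multiply followed by two
    // multiply-accumulates, mirroring mul/mla/mla: acc += x * y.
    static int sumMulAccumulate(int[] a, int[] b, int[] c, int total) {
        for (int i = 0; i < a.length; i++) {
            int acc = a[i] * b[i];   // mul
            acc += a[i] * c[i];      // mla
            acc += b[i] * c[i];      // mla
            total += acc;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = {5, 6, 7, 8};
        int[] c = {9, 10, 11, 12};
        System.out.println(sumMulAdd(a, b, c, 0));
        System.out.println(sumMulAccumulate(a, b, c, 0));
    }
}
```

  Java int arithmetic wraps on overflow, and addition is associative under wrapping, so the two forms agree for all inputs.  Whether C2 actually vectorizes a given loop depends on the JIT and its flags.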

  I see a performance gain of about 13% for the test case on my aarch64 server.
  Tested with the jtreg hotspot & langtools tests.  Results are the same before and after the patch.
  Is it OK to push?  

Thanks for your help.
Felix



