RFR: 8282541: AArch64: Auto-vectorize Math.round API [v2]

Thu Apr 14 17:43:49 UTC 2022

On Wed, 13 Apr 2022 07:01:18 GMT, Ningsheng Jian <njian at openjdk.org> wrote:

> I don't know why we need these rules. Shouldn't all "UseSVE > 0" cases go to the rules in the SVE ad file, which call vector_round_sve()?

The freely available Arm® Neoverse V1 Software Optimization Guide lists instructions such as ASIMD `FRINTA` with a throughput of 2 operations per clock, whereas it lists SVE `FRINTA` with a throughput of 1 operation per clock. The same is true of most of the instructions used in `Math.round()`. I conclude that on V1, for short vectors, using ASIMD rather than the equivalent SVE instructions should come close to doubling throughput. For vectors wider than ASIMD supports, SVE should be a win.
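
For concreteness, here is a minimal sketch of the kind of loop this discussion is about: an element-wise `Math.round()` over a `float` array, which is a natural candidate for auto-vectorization. The class and method names (`RoundLoop`, `roundAll`) are hypothetical and not taken from the patch. Whether the loop is actually vectorized, and with ASIMD or SVE instructions, depends on the JVM build, the flags in use, and the hardware.

```java
public class RoundLoop {
    // Hypothetical helper: rounds each float to the nearest int.
    // A simple counted loop over arrays like this is the shape the
    // JIT compiler's auto-vectorizer typically looks for.
    static void roundAll(float[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Math.round(src[i]);
        }
    }

    public static void main(String[] args) {
        float[] src = new float[1024];
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            src[i] = i * 0.37f - 100.0f;
        }
        // Call repeatedly so the method becomes hot enough to be JIT-compiled.
        for (int iter = 0; iter < 20_000; iter++) {
            roundAll(src, dst);
        }
        System.out.println(dst[0] + " " + dst[dst.length - 1]);
    }
}
```

The scalar semantics are unchanged by vectorization: `Math.round(float)` rounds to the nearest `int`, with ties rounding toward positive infinity.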

At present, I see no reason not to use ASIMD for short vectors on any AArch64 processor. Doing so won't significantly impair performance, and I can't think of any future circumstances in which it might.

-------------

PR: https://git.openjdk.java.net/jdk/pull/8204