RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

Thu Jun 27 11:57:11 UTC 2024

On Thu, 6 Jun 2024 07:52:02 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   update header files for arm
>
> in progress...

Hi @Hamlin-Li, thanks for your work.

I ran the [FloatMaxVector](https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/FloatMaxVector.java#L1068) and [DoubleMaxVector](https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/DoubleMaxVector.java#L1068) benchmarks on different AArch64 machines.

Here is the data I got for `TANH`, with args `-i 5 -f 3 -wi 3 -foe true -jvmArgs -Xms4g -Xmx4g -XX:+AlwaysPreTouch -XX:ObjectAlignmentInBytes=16`:

// NEON machine
Benchmark             (size)   Mode     Cnt  Units     Perf gain
DoubleMaxVector.TANH   1024    thrpt    15   ops/ms     -38%
FloatMaxVector.TANH    1024    thrpt    15   ops/ms     -26%

// 128-bit SVE machine (TANH also implemented with NEON)
Benchmark             (size)   Mode     Cnt  Units     Perf gain
DoubleMaxVector.TANH   1024    thrpt    15   ops/ms     -19%
FloatMaxVector.TANH    1024    thrpt    15   ops/ms     ~0%
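
For anyone cross-checking the tables: "Perf gain" here is presumably the relative change in throughput (ops/ms) of the patched build against the baseline, so a negative value means a regression. A minimal sketch of that calculation, assuming this definition; the class name, method name, and throughput values are hypothetical and are not the measured data:

```java
// Sketch only: assumes "Perf gain" = relative throughput change vs. baseline.
final class PerfGain {

    // Returns the percentage change of patched throughput relative to baseline.
    // Both arguments are throughputs in the same unit (e.g. ops/ms).
    static double percentGain(double baselineOpsPerMs, double patchedOpsPerMs) {
        if (baselineOpsPerMs <= 0.0) {
            throw new IllegalArgumentException("baseline throughput must be positive");
        }
        return 100.0 * (patchedOpsPerMs - baselineOpsPerMs) / baselineOpsPerMs;
    }

    public static void main(String[] args) {
        // Hypothetical throughputs chosen for illustration, not benchmark results.
        double baseline = 100.0;
        double patched = 62.0;
        System.out.printf("Perf gain: %.0f%%%n", percentGain(baseline, patched));
    }
}
```

With throughput as the metric (`thrpt` mode), a lower patched score gives a negative gain, which is how the regressions above should be read.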

The performance of the vector stubs for `TANH` does not look stable across different NEON machines. Since this PR does not provide a `TANH` interface on SVE machines because of [the performance regression](https://github.com/openjdk/jdk/pull/16234/commits/2a7730d6acbac80438a43d1502cff6a476f8b5b5#diff-9112056f732229b18fec48fb0b20a3fe824de49d0abd41fbdb4202cfe70ad114R8521-R8525), how about also disabling it on NEON for the same reason? WDYT?

Thanks.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2194480996