RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

Hamlin Li mli at openjdk.org
Thu Jun 6 07:54:46 UTC 2024


On Wed, 8 May 2024 17:41:23 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hi,
>> Can you help to review the patch?
>> This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234) and [pr 18294](https://github.com/openjdk/jdk/pull/18294).
>> 
>> Compared with the previous prs, the major change in this pr is to integrate the sleef source (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depending on an external sleef (header or lib) at build or run time.
>> Besides this change, the previous changes are also modified accordingly, e.g. some unnecessary files and changes are removed, especially in the make dir of the jdk.
>> 
>> Besides the code changes, one important task is to handle the legal process.
>> 
>> Thanks!
>> 
>> ## Performance
>> NOTE: 
>> * `Src` means the implementation in this pr, i.e. without a dependency on an external sleef.
>> * `Disabled` means disable intrinsics by `-XX:-UseVectorStubs` 
>> * `system_sleef` means the implementation in [previous pr 18294](https://github.com/openjdk/jdk/pull/18294), i.e. the jdk is built and run with a dependency on an external sleef.
>> 
>> Basically, the perf data below shows that 
>> * this implementation has better performance than the previous version in [pr 18294](https://github.com/openjdk/jdk/pull/18294),
>> * and both sleef versions have much better performance than the non-sleef version.
>> 
>> |Benchmark                     |(size)|Src      |Units|system_sleef|(system_sleef-Src)/Src|Disabled |(Disabled-Src)/Src|
>> |------------------------------|------|---------|-----|------------|----------------------|---------|-----------------|
>> |3472:Double128Vector.ACOS     |1024  |8546.842 |ns/op|8516.007    |-0.004                |16799.273|0.966            |
>> |3473:Double128Vector.ASIN     |1024  |6864.656 |ns/op|6987.328    |0.018                 |16602.442|1.419            |
>> |3474:Double128Vector.ATAN     |1024  |11489.255|ns/op|12261.800   |0.067                 |26329.320|1.292            |
>> |3475:Double128Vector.ATAN2    |1024  |16661.170|ns/op|17234.472   |0.034                 |42084.100|1.526            |
>> |3476:Double128Vector.CBRT     |1024  |18999.387|ns/op|20298.458   |0.068                 |35998.688|0.895            |
>> |3477:Double128Vector.COS      |1024  |14081.857|ns/op|14846.117   |0.054                 |24420.692|0.734            |
>> |3478:Double128Vector.COSH     |1024  |12202.306|ns/op|12237.772   |0.003                 |21343.863|0.749            |
>> |3479:Double128Vector.EXP      |1024  |4553.108 |ns/op|4777.638  ...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update header files for arm

in progress...

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2151633043

