RFR: 8345125: Aarch64: Add aarch64 backend for Float16 scalar operations [v2]

Thu Apr 24 15:56:42 UTC 2025

On Wed, 26 Feb 2025 10:23:51 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Hi @shqking, thanks for your review comments. Yes, I added `fabdh` and `fnmulh` to stay aligned with the float and double types.
>> For adding support for FP16 `absd` we need `AbsHF` to be supported (along with SubHF) but `AbsHF` node is not implemented currently. `abs` operation is directly executed from the java code here - https://github.com/openjdk/jdk/blob/037e47112bdf2fa2324f7c58198f6d433f17d9fd/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java#L1464 and is not intrinsified or pattern matched like other FP16 operations. Same with `negate` operation for FP16 - https://github.com/openjdk/jdk/blob/037e47112bdf2fa2324f7c58198f6d433f17d9fd/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java#L1449
>> On the Valhalla repo, while these operations were being developed, I tried adding support for `AbsHF/NegHF`, which emitted `fabs` and `fneg` instructions, but the direct Java code (bit manipulation operations) was much faster (sorry, I don't remember the exact numbers), so we decided to go with the Java implementation instead.
>> I still added `fabd` here because `op21` is 0 only in the `fabd` H variant, and I felt that it'd be better to handle it here as it belongs to this group of instructions. Please let me know your thoughts.
>
> According to the RM, `fabd` is in _Advanced SIMD scalar three same FP16_, but the rest are in _Floating-point data-processing (2 source)_. The decoding scheme looks rather different. `fabd`, then, doesn't really fit here, but in a section with the rest of the three same FP16 instructions.
> The encoding scheme for _Advanced SIMD scalar three same FP16_ is pretty simple, so I suggest you create a new group for them, and put `fabd` in there.

Hi @theRealAph, thanks again for the review, and apologies for the delay in responding.
I moved the three `fabd` instructions out of their current place and added them in two separate sections: one for single and double precision (_Advanced SIMD scalar three same_) and another for FP16 (_Advanced SIMD scalar three same FP16_). Please review the changes. Thank you!
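
For readers following the quoted discussion of `abs` and `negate`: the bit-manipulation approach mentioned there can be sketched roughly as below. This is only an illustration, not the code in `Float16.java`. The class and method names (`Fp16SignBits`, `absBits`, `negateBits`) are hypothetical, and it works on raw IEEE 754 binary16 bits held in a `short`, using `Float.floatToFloat16` / `Float.float16ToFloat` (JDK 20+) only for the demo conversion.

```java
// Hypothetical helper, not part of the JDK: sign-bit tricks on raw binary16 bits.
final class Fp16SignBits {
    // abs: clear the sign bit (bit 15); exponent and significand are untouched.
    static short absBits(short bits) {
        return (short) (bits & 0x7FFF);
    }

    // negate: flip the sign bit (bit 15).
    static short negateBits(short bits) {
        return (short) (bits ^ 0x8000);
    }

    public static void main(String[] args) {
        short minusTwoAndHalf = Float.floatToFloat16(-2.5f); // JDK 20+
        System.out.println(Float.float16ToFloat(absBits(minusTwoAndHalf)));
        System.out.println(Float.float16ToFloat(negateBits(minusTwoAndHalf)));
    }
}
```

Both operations are a single integer AND/XOR on the 16-bit pattern, which is consistent with the observation above that the plain Java path did not benefit from dedicated `fabs`/`fneg` nodes.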

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23748#discussion_r2058766741