RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9]

Thu Jan 9 19:25:53 UTC 2025

On Thu, 9 Jan 2025 13:23:19 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>> 
>>   Updating copyright year of modified files.
>
> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java line 1434:
> 
>> 1432:                     return float16ToRawShortBits(valueOf(product + float16ToFloat(f16c)));
>> 1433:                 });
>> 1434:         return shortBitsToFloat16(res);
> 
> I don't understand what is happening here. But I leave this to @PaulSandoz to review

I'm not sure which bits are unclear, but I am guessing it's mostly the fallback code in the lambda. To avoid the intrinsics operating on Float16 instances, we instead "unpack" the 16-bit carrier values and pass those as arguments to the intrinsic. The fallback (used when intrinsification is not supported) also accepts those carrier values as arguments: we convert the carriers to floats, operate on them, convert the result back to a carrier, and then convert that carrier back to a Float16.
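To make the carrier round trip concrete, here is a minimal standalone sketch of the pattern. It is not the actual `Float16.java` code: `CarrierFmaSketch`, `TernaryCarrierOp`, and this `fma` are hypothetical stand-ins, the sketch always takes the fallback path, and the rounding is simplified. It only relies on `Float.float16ToFloat` and `Float.floatToFloat16` (available since JDK 20).

```java
public class CarrierFmaSketch {

    /** Hypothetical shape of the fallback: three 16-bit carriers in, one carrier out. */
    @FunctionalInterface
    public interface TernaryCarrierOp {
        short apply(short a, short b, short c);
    }

    /**
     * Hypothetical stand-in for the intrinsic entry point. A JIT could replace
     * a call like this with a hardware instruction working directly on the
     * carriers; this sketch always runs the fallback.
     */
    public static short fma(short a, short b, short c, TernaryCarrierOp fallback) {
        return fallback.apply(a, b, c);
    }

    /** Computes a * b + c where all values are float16 bit patterns held in shorts. */
    public static short fmaOnCarriers(short a, short b, short c) {
        return fma(a, b, c, (x, y, z) -> {
            // Carrier -> float: widen each 16-bit pattern to a float.
            float product = Float.float16ToFloat(x) * Float.float16ToFloat(y);
            // Operate in float, then narrow the result back to a carrier.
            // Assumption: rounding is simplified here; this is not claimed to
            // be a correctly rounded fused multiply-add for all inputs.
            return Float.floatToFloat16(product + Float.float16ToFloat(z));
        });
    }

    public static void main(String[] args) {
        short a = Float.floatToFloat16(2.0f);
        short b = Float.floatToFloat16(3.0f);
        short c = Float.floatToFloat16(1.0f);
        short res = fmaOnCarriers(a, b, c);
        // The caller turns the carrier back into a value type; float stands in for Float16 here.
        System.out.println(Float.float16ToFloat(res));
    }
}
```

The point of the shape is that neither the intrinsic path nor the fallback ever sees a boxed object, only `short` values; wrapping the result happens once, outside the lambda.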

The code in the lambda could potentially be simplified if `Float16Math.fma` accepted six arguments: the first three being the carrier values used by the intrinsic, and the subsequent three being the Float16 values used by the fallback. The lambda could then contain the code as written in the original source. I believe that, when intrinsified, there would be no penalty for those extra arguments.
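A rough sketch of what such a six-argument shape might look like follows. Everything here is an assumption made for illustration: `F16` is a tiny hypothetical stand-in for `Float16`, `TernaryF16Op` and this `fma` signature are invented, the trailing lambda parameter and the return type are guesses, and the arithmetic uses simplified rounding.

```java
public class SixArgFmaSketch {

    /** Hypothetical minimal stand-in for Float16: a wrapper around the 16-bit carrier. */
    public record F16(short bits) {
        public static F16 valueOf(float f) {
            return new F16(Float.floatToFloat16(f));
        }

        public float floatValue() {
            return Float.float16ToFloat(bits);
        }
    }

    /** Hypothetical fallback shape that works on F16 values instead of carriers. */
    @FunctionalInterface
    public interface TernaryF16Op {
        F16 apply(F16 a, F16 b, F16 c);
    }

    /**
     * Hypothetical six-argument entry point. An intrinsified version would
     * consume only the carriers ca, cb, cc; the fallback consumes only a, b, c.
     * This sketch always takes the fallback path.
     */
    public static F16 fma(short ca, short cb, short cc,
                          F16 a, F16 b, F16 c,
                          TernaryF16Op fallback) {
        return fallback.apply(a, b, c);
    }

    /** Call site: the lambda reads like ordinary value-level code. */
    public static F16 fma(F16 a, F16 b, F16 c) {
        return fma(a.bits(), b.bits(), c.bits(), a, b, c, (x, y, z) -> {
            // Assumption: simplified rounding, not a correctly rounded fma.
            float product = x.floatValue() * y.floatValue();
            return F16.valueOf(product + z.floatValue());
        });
    }

    public static void main(String[] args) {
        F16 r = fma(F16.valueOf(1.5f), F16.valueOf(2.0f), F16.valueOf(0.5f));
        System.out.println(r.floatValue());
    }
}
```

Compared with a carrier-only fallback, the lambda no longer needs any carrier-to-float conversions of its own; the cost is three extra arguments at every call site.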

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1909327094