RFR: 8370409: Incorrect computation in Float16 reduction loop

Emanuel Peter epeter at openjdk.org
Thu Oct 30 07:14:06 UTC 2025


On Thu, 30 Oct 2025 07:03:42 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> The current floatToFloat16 intrinsic implementation always sign-extends the 16-bit short result to a 32-bit value, in anticipation of safe consumption by subsequent integral (comparison) operations. However, the safest way to compare two Float16 values is to use the Float16.compare/compareTo methods, given that floating-point comparisons can also be unordered.
>> 
>> For example, 64512 and -1024 share the same low 16 bits (0xFC00), which is the Float16 -Inf encoding, yet the two values are not equal under integral comparison.
>> jshell> Float16.compare(Float16.shortBitsToFloat16((short)-1024), Float16.shortBitsToFloat16((short)64512))
>> $3 ==> 0
>> 
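
A minimal sketch of the point above, using plain integer arithmetic rather than the Float16 API so that it runs on any JDK. The class and method names (`Float16BitsDemo`, `isNegativeInfinity`, `sameLow16Bits`) are made up for illustration and do not appear in the patch.

```java
public class Float16BitsDemo {
    // Two 32-bit views of the same 16-bit pattern 0xFC00.
    static final int ZERO_EXTENDED = 64512;  // 0x0000FC00
    static final int SIGN_EXTENDED = -1024;  // 0xFFFFFC00

    // True if the low 16 bits encode IEEE 754 binary16 negative infinity:
    // sign bit set, exponent all ones (0x1F), mantissa zero.
    static boolean isNegativeInfinity(int carrier) {
        int bits = carrier & 0xFFFF;
        int sign = bits >>> 15;
        int exponent = (bits >>> 10) & 0x1F;
        int mantissa = bits & 0x3FF;
        return sign == 1 && exponent == 0x1F && mantissa == 0;
    }

    // Compares only the 16-bit payload, ignoring how it was widened to int.
    static boolean sameLow16Bits(int a, int b) {
        return (short) a == (short) b;
    }

    public static void main(String[] args) {
        // Both carriers decode to -Inf as half-precision values.
        System.out.println(isNegativeInfinity(ZERO_EXTENDED)); // true
        System.out.println(isNegativeInfinity(SIGN_EXTENDED)); // true
        // The 16-bit payloads match, but the 32-bit integers do not.
        System.out.println(sameLow16Bits(ZERO_EXTENDED, SIGN_EXTENDED)); // true
        System.out.println(ZERO_EXTENDED == SIGN_EXTENDED);              // false
    }
}
```

The last comparison is the hazard: once the 16-bit result is widened to an int, an integral comparison sees the extension bits as well as the payload.
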
>> In the scalar intrinsics of Float16.add/sub/mul/div/min/max, we always return a boxed value, which is then operated upon by the subsequent Float16 APIs. The Float.floatToFloat16 intrinsic, in contrast, always returns a 'short' value. This case is special: the carrier type is 'short', but it encodes an IEEE 754 half-precision value. Because the carrier is a short, if the value is exposed to integral operators, then per the JVM specification it must be sign-extended before the operation.
>> 
>> Our Float16 binary operation inference is based on a generic pattern match and is agnostic to how that graph was created, i.e., either through the Float16.* APIs or by explicit Float.float16ToFloat/floatToFloat16 operations, so it is safe to sign-extend the result in all cases.
>> 
>> Kindly review the patch and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> test/hotspot/jtreg/compiler/c2/TestFloat16Reduction.java line 33:
> 
>> 31:  * @library /test/lib /
>> 32:  * @run main/othervm -XX:-TieredCompilation
>> 33:  *                   compiler.c2.TestFloat16Reduction
> 
> Was the flag required for reproducing the issue?
> If it was not required: just remove it
> If it was required: add a run without the flag, in addition to a run with the flag.

Also: the flag `-XX:-TieredCompilation` is now applied to the VM that runs the TestFramework, but that is not necessary. You could just do `framework.addFlags("-XX:-TieredCompilation")`, so that the flag only gets applied to the test VM.
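
Roughly what I have in mind, as a sketch only: it assumes the test's `main` constructs the IR framework's `TestFramework` directly, and it leaves out the existing header tags and `@Test` methods. It needs the jtreg test library on the class path, so it is not runnable standalone.

```java
/*
 * @test
 * @library /test/lib /
 * @run main compiler.c2.TestFloat16Reduction
 */
package compiler.c2;

import compiler.lib.ir_framework.TestFramework;

public class TestFloat16Reduction {
    public static void main(String[] args) {
        // Run once with default flags.
        new TestFramework().start();

        // Run again with the flag passed to the test VM only,
        // not to the driver VM that hosts the TestFramework.
        TestFramework framework = new TestFramework();
        framework.addFlags("-XX:-TieredCompilation");
        framework.start();
    }

    // @Test methods omitted.
}
```

If the flag turns out not to be needed for reproducing the issue, the second run can simply be dropped.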

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27977#discussion_r2476644092

