RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter [v2]

Tue Mar 7 20:28:25 UTC 2023

On Tue, 7 Mar 2023 02:53:48 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> Implemented `Float.floatToFloat16` and `Float.float16ToFloat` intrinsics in Interpreter and C1 compiler to produce the same results as C2 intrinsics on x64, Aarch64 and RISC-V - all platforms where C2 intrinsics for these Java methods were implemented originally.
>> 
>> Replaced `SharedRuntime::f2hf()` and `hf2f()` C runtime functions with calls to runtime stubs which use the same HW instructions as C2 intrinsics. Only for 64-bit x64 because 32-bit x86 stub does not work: result is passed through FPU register and NaN values become different from C2 intrinsic. This runtime stub is only used to calculate constant values during C2 compilation and can be skipped.
>> 
>> I added new tests based on Tobias's `TestAll.java` And copied `jdk/lang/Float/Binary16Conversion*.java` tests to run them with `-Xcomp` to make sure code is compiled by C1 or C2. I modified `Binary16ConversionNaN.java` to compare results from Interpreter, C1 and C2.
>> 
>> Tested tier1-5, Xcomp, stress
>
> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Address review comments

We should not allow JIT compilation of `Float.float16ToFloat` and `Float.floatToFloat16` Java methods as we do for other math methods: [abstractInterpreter.hpp#L144](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/interpreter/abstractInterpreter.hpp#L144)

What happens with @jatin-bhateja test is `Float` class is still not loaded when we trigger compilation for `Foo::main` method with `-Xcomp` flag.
C2 generates Uncommon trap at the very start of `main()` and code is deoptimized and goes to Interpreter which uses intrinsics.
C1 on other hand generates calls to `Float.float16ToFloat` and `Float.floatToFloat16` in compiled `Foo::main` method and then compiles `Float.float16ToFloat` and `Float.floatToFloat16` which are called.

The fix ix simple:

+++ b/src/hotspot/share/interpreter/abstractInterpreter.hpp
@@ -159,6 +159,8 @@ class AbstractInterpreter: AllStatic {
       case vmIntrinsics::_dexp  : // fall thru
       case vmIntrinsics::_fmaD  : // fall thru
       case vmIntrinsics::_fmaF  : // fall thru
+      case vmIntrinsics::_floatToFloat16       : // fall thru
+      case vmIntrinsics::_float16ToFloat       : // fall thru
       case vmIntrinsics::_Continuation_doYield : // fall thru
         return false;


test now produce the same result:

$ java -Xint Foo
2143297536
32257
$ java -Xcomp -XX:-TieredCompilation Foo
2143297536
32257
$ java -Xcomp -XX:TieredStopAtLevel=1 Foo
2143297536
32257

-------------

PR: https://git.openjdk.org/jdk/pull/12869