RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter

Wed Feb 22 21:56:28 UTC 2023

On Wed, 22 Feb 2023 21:21:42 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>>> I'm also a bit concerned that we are rushing in to "fix" this. IIUC we have three mechanisms for implementing this functionality:
>>> 
>>>     1. The interpreted Java code
>>> 
>>>     2. The compiled non-intrinsic sharedRuntime code
>>> 
>>>     3. The compiler intrinsic that uses a hardware instruction.
>>> 
>>> 
>>> Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms.
>>> 
>>> The observed behaviour may be surprising but it seems not to be a bug. And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle NaNs opaquely?
>> 
>> From the spec (https://download.java.net/java/early_access/jdk20/docs/api/java.base/java/lang/Float.html#float16ToFloat(short))
>> 
>> "Returns the float value closest to the numerical value of the argument, a floating-point binary16 value encoded in a short. The conversion is exact; all binary16 values can be exactly represented in float. Special cases:
>> 
>>     If the argument is zero, the result is a zero with the same sign as the argument.
>>     If the argument is infinite, the result is an infinity with the same sign as the argument.
>>     If the argument is a NaN, the result is a NaN. "
>> 
>> If the float argument is a NaN, you are supposed to get a float16 NaN as a result -- that is all the specification requires. However, the implementation makes stronger guarantees to try to preserve some non-zero NaN significand bits if they are set.
>> 
>> "NaN boxing" is a technique used to put extra information into the significand bits of a NaN and pass them around. It is consistent with the intended use of the feature by IEEE 754 and is used in various language runtimes: e.g.,
>> 
>> https://piotrduperas.com/posts/nan-boxing
>> https://leonardschuetz.ch/blog/nan-boxing/ 
>> https://anniecherkaev.com/the-secret-life-of-nan
>> 
>> The Java specs are careful to avoid mentioning quiet vs signaling NaNs in general discussion.
>> 
>> That said, I think it is reasonable on a given JVM invocation if Float.floatToFloat16(f) gave the same result for input f regardless of in what context it was called.
>
>> We don't know that all HW will produce the same NaN "payload", right? Instead, we might need interpreter intrinsics. I assume that is how the trig functions are handled that @jddarcy mentioned.
> 
> Good point. We can't guarantee that the HW of all OpenJDK ports does the same.
> 
> If the CPU has the corresponding instructions, we need to generate a stub with those HW instructions during VM startup and use it in all cases (or use the same instruction directly in JIT-compiled code).
> If the CPU does not have the instruction, we should use the runtime C++ function in all cases to be consistent.

Thanks @vnkozlov @dean-long. One last question before I withdraw the PR: as the QNaN bit is supported for conversion across current architectures such as x86 and ARM, and maybe others as well, couldn't we go ahead with this PR? Architectures that behave differently could then follow the technique suggested by Vladimir Kozlov as and when they implement the intrinsic.
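
For anyone following the thread, here is a rough sketch of what "peeking at the raw bits" of a NaN looks like, and of one way a float NaN could be narrowed to binary16 while keeping part of its payload. The class `NaNPayloadDemo` and the helper `nanBitsToFloat16Bits` are hypothetical names made up for illustration; the helper keeps only the sign and the top 10 significand bits, which is an assumption and not necessarily what the JDK code or any hardware instruction does.

```java
public class NaNPayloadDemo {
    // Hypothetical helper (not the JDK implementation): narrow the raw bits of a
    // float NaN to binary16 bits, keeping the sign and the top 10 significand bits.
    static short nanBitsToFloat16Bits(int floatBits) {
        int sign = (floatBits >>> 16) & 0x8000;          // move sign from bit 31 to bit 15
        int payload = (floatBits & 0x007f_ffff) >>> 13;  // top 10 of the 23 significand bits
        if (payload == 0) {
            payload = 0x200;                             // force a non-zero significand so the result stays a NaN
        }
        return (short) (sign | 0x7c00 | payload);        // 0x7c00 = all-ones binary16 exponent
    }

    public static void main(String[] args) {
        int rawIn = 0x7fc0_2000;                         // quiet NaN with one extra payload bit set
        float nan = Float.intBitsToFloat(rawIn);

        // floatToIntBits collapses every NaN to the canonical 0x7fc00000,
        // so a program has to use floatToRawIntBits to see any payload at all.
        System.out.printf("canonical bits: 0x%08x%n", Float.floatToIntBits(nan));
        System.out.printf("raw bits:       0x%08x%n", Float.floatToRawIntBits(nan));

        short half = nanBitsToFloat16Bits(rawIn);
        System.out.printf("binary16 bits:  0x%04x%n", half & 0xffff);
    }
}
```

A program that only calls Float.isNaN or compares via floatToIntBits would not notice a payload difference between the interpreter, the runtime function and the intrinsic; only code that inspects the raw bits would.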

-------------

PR: https://git.openjdk.org/jdk/pull/12704