[lworld+fp16] RFR: 8329817: Augment prototype Float16 class [v5]

Fri Jun 14 06:22:25 UTC 2024

On Wed, 12 Jun 2024 22:38:19 GMT, Joe Darcy <darcy at openjdk.org> wrote:

>>> Hi @jddarcy , Apart from few minor comments, patch looks good to me, there is a build error due to malformed javadoc comment.
>>> 
>>> Kindly fix and integrate.
>> 
>> Oops; sorry for the build breakage -- no change is so small that you shouldn't build before pushing ;-)
>
>> Hi @jddarcy , Apart from few minor comments, patch looks good to me, there is a build error due to malformed javadoc comment.
>> 
>> Kindly fix and integrate.
> 
> Thanks; let me take a pass at writing at least some basic regression tests before pushing.
> 
> @jatin-bhateja , do you know if promoting the three operands of a Float16 fma to double, doing the operation in double, and rounding to Float16 is sufficient to correctly implement a Float16 fma? I haven't worked through all the cases yet and I'm not certain there cannot be double-rounding issues. (If double rounding turns out to be a problem, I was thinking it would be possible to see if (a*b + c) was exact in double and, if not, add in a sticky bit to make sure the rounding occurs properly, but I haven't developed the details yet.)

> Hi @jddarcy , As per the specification of [Math.fma(float, float, float)](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Math.java#L2494), the internal computation of the constituent operations (mul and add) should be done at infinite precision and only the final result should be rounded. We are now upcasting float16 to double, but that will not prevent rounding from happening in the mul and add.

Right; the fma needs to operate as if it used infinite precision internally. This could be implemented (slowly) using JDK classes by computing a*b+c in BigDecimal and then applying a BigDecimal -> Float16 conversion. I've been considering adding a BigDecimal -> Float16 conversion anyway for completeness in the platform.
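As a rough sketch of that slow path, and not the code in this PR: the class and method names below are hypothetical, Float16 values are stood in for by raw binary16 bit patterns in a short, and the conversions rely on Float.float16ToFloat / Float.floatToFloat16 (JDK 20 or later). Since there is no BigDecimal -> Float16 conversion yet, the sketch rounds the exact value to float with a sticky (round-to-odd) adjustment and lets floatToFloat16 do the one real rounding; that detour is my assumption, not something settled in this thread.

```java
import java.math.BigDecimal;

public class Float16FmaSketch {
    /** Hypothetical reference fma on binary16 bit patterns: exact a*b+c, then one rounding. */
    static short fmaReference(short a, short b, short c) {
        float fa = Float.float16ToFloat(a);
        float fb = Float.float16ToFloat(b);
        float fc = Float.float16ToFloat(c);

        // The product of two binary16 values is exact in float (at most 22
        // significant bits), so plain float arithmetic already yields the right
        // answer for NaN, infinities, and exact (signed) zero results.
        float quick = fa * fb + fc;
        if (!Float.isFinite(fa) || !Float.isFinite(fb) || !Float.isFinite(fc) || quick == 0.0f) {
            return Float.floatToFloat16(quick);
        }

        // Exact value of a*b + c; every finite binary value is an exact decimal.
        BigDecimal exact = new BigDecimal(fa).multiply(new BigDecimal(fb)).add(new BigDecimal(fc));

        // Round to float, then force the low bit to odd if the float is inexact.
        // That low bit acts as a sticky bit for the final rounding to binary16.
        float f = exact.floatValue();
        int cmp = new BigDecimal(f).compareTo(exact);
        if (cmp != 0 && (Float.floatToRawIntBits(f) & 1) == 0) {
            f = (cmp < 0) ? Math.nextUp(f) : Math.nextDown(f);
        }
        return Float.floatToFloat16(f); // round to nearest even, applied once
    }
}
```

One case that separates this from naive float arithmetic: a = 3.0 (0x4200), b = 683/2048 (0x3556), c = 2^-24 (0x0001). The exact result is 1 + 2^-11 + 2^-24, just above a Float16 halfway point, so it should round up to 0x3C01, whereas evaluating a*b + c in float and then converting loses the 2^-24 and lands on 0x3C00.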

My understanding of how fma is implemented in hardware is that for a format with P bits of precision, there is a ~2P wide internal register to hold the exact product as an intermediate result. Then the value being added in can be aligned at the right exponent location and the final rounding back to P bits of precision can occur, with logic for any sticky bit for rounding to nearest even, etc.

There are many cases where double (P = 53) will exactly hold the product and sum of three Float16 (P = 11) operands. However, the product can be so large or so small relative to the third operand that rounding occurs when the third operand is added in.

I haven't worked through whether the potential round-offs are all benign with the final rounding to Float16 or whether some corrective action would need to be taken to get the effect of a sticky bit. For example, if a*b is so large that its highest set bit position is more than 53 positions away from the lowest set bit position of c, and the final result is going to overflow anyway, then the round-off in computing a*b+c in double doesn't matter. It might be problematic if a*b is much smaller than c, but that is another case I haven't fully thought through yet.
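To make the "exact in double or not" question concrete, here is a small check one could use while exploring cases. It is only a sketch under the same assumptions as before: the names are hypothetical, operands are binary16 bit patterns in a short, Float.float16ToFloat needs JDK 20 or later, and the operands are assumed finite (BigDecimal cannot represent NaN or infinities).

```java
import java.math.BigDecimal;

public class DoubleExactnessCheck {
    /** True if a*b + c evaluated in double equals the exact real result (finite operands only). */
    static boolean sumIsExactInDouble(short a, short b, short c) {
        double da = Float.float16ToFloat(a);
        double db = Float.float16ToFloat(b);
        double dc = Float.float16ToFloat(c);

        // The product needs at most 22 significant bits, so it is exact in double;
        // only the addition can round.
        double viaDouble = da * db + dc;

        BigDecimal exact = new BigDecimal(da).multiply(new BigDecimal(db)).add(new BigDecimal(dc));
        return new BigDecimal(viaDouble).compareTo(exact) == 0;
    }
}
```

For instance, 65504 * 65504 + 2^-24 (bit patterns 0x7BFF, 0x7BFF, 0x0001) is not exact in double, since the set bits run from 2^31 down to 2^-24, which is more than 53 positions. That is the benign kind of round-off described above, because the Float16 result overflows regardless.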

I'll give an update on my analysis/research on this fma issue by next week.

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/1117#issuecomment-2167289949