[lworld+fp16] RFR: 8329817: Augment prototype Float16 class [v13]

Jatin Bhateja jbhateja at openjdk.org
Tue Jun 18 06:59:30 UTC 2024


On Tue, 18 Jun 2024 01:11:45 GMT, Joe Darcy <darcy at openjdk.org> wrote:

>>> Why separate entry points for `add` and `sum`?
>> 
>> See previous comment https://github.com/openjdk/valhalla/pull/1117#discussion_r1640103814
>> 
>> tl;dr for this stage of the work, I didn't want to disable the existing `sum` intrinsic, but I think an `add` method works better in a larger context where methods for subtraction, multiplication, and division are being added.
>
>> > > > > Hi @jddarcy , Apart from a few minor comments, the patch looks good to me; there is a build error due to a malformed javadoc comment.
>> > > > > Kindly fix and integrate.
>> > > > 
>> > > > 
>> > > > Thanks; let me take a pass at writing at least some basic regression tests before pushing.
>> > > > @jatin-bhateja , do you know if promoting the three operands of a Float16 fma to double, doing the operation in double, and rounding to Float16 is sufficient to correctly implement a Float16 fma? I haven't worked through all the cases yet and I'm not certain there cannot be double-rounding issues. (If double rounding turns out to be a problem, I was thinking it would be possible to see if (a*b + c) was exact in double and, if not, add in a sticky bit to make sure the rounding occurs properly, but I haven't developed the details yet.)
>> > > 
>> > > 
>> > > Hi @jddarcy , As per the specification of [Math.fma(float, float, float)](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Math.java#L2494), the internal computation of the constituent operations (mul and add) should be done at infinite precision and only the final result should be rounded. We are now upcasting float16 to double, but that will not prevent rounding from happening for mul and add.
>> > 
>> > 
>> > Right; the fma needs to operate as-if it used infinite precision internally. This could be implemented (slowly) using JDK classes by implementing a BigDecimal -> Float16 conversion after doing a*b+c in BigDecimal. I've been considering adding BigDecimal -> Float16 conversion anyway for completeness in the platform.
>> > My understanding of how fma is implemented in hardware is that for a format with P bits of precision, there is a ~2P wide internal register to hold the exact product as an intermediate result. Then the value being added in can be aligned at the right exponent location and the final rounding back to P bits of precision can occur, with logic of any sticky bit for rounding to nearest even, etc.
>> > There are many cases where double (P = 53) will exactly hold the product and sum of three Float16 (P = 11) operands. However, the product can be so large or so small that rounding occurs when the third operand is added in.
>> > I haven't worked through whether the potential round-offs are all benign with the final rounding to Float16 or if some corrective action would need to be taken to get the effect of a sticky bit. For example, if a*b is so large that the highest exponent position set is more than ...
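
For reference, the "slow but exact" approach described in the quoted text above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: the class `Fp16FmaSketch` and the method `fma16` are hypothetical names, and the operands are passed as `float` values that are assumed to already be exactly representable in float16, so the sketch does not depend on the prototype `Float16` class. It forms a*b + c exactly in `BigDecimal`, then rounds once to 11 bits of precision with round-to-nearest-even.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical reference sketch; not part of the patch under review. */
final class Fp16FmaSketch {
    // With finite float16 operands, a*b + c is always an integer multiple of 2^-48
    // (the smallest float16 subnormal is 2^-24, so the smallest product step is 2^-48).
    private static final BigDecimal TWO_POW_48 = new BigDecimal(BigInteger.ONE.shiftLeft(48));
    private static final float MAX_FLOAT16 = 65504.0f;

    /**
     * Assumption: a, b and c each hold a value exactly representable in float16.
     * Returns the float16 result of fma(a, b, c), carried in a float.
     */
    static float fma16(float a, float b, float c) {
        // NaN and infinite operands: float arithmetic already gives the IEEE 754 answer.
        if (Float.isNaN(a) || Float.isNaN(b) || Float.isNaN(c)
                || Float.isInfinite(a) || Float.isInfinite(b) || Float.isInfinite(c)) {
            return a * b + c;
        }
        // new BigDecimal(double) is exact, so this is the infinitely precise a*b + c.
        BigDecimal exact = new BigDecimal((double) a)
                .multiply(new BigDecimal((double) b))
                .add(new BigDecimal((double) c));
        if (exact.signum() == 0) {
            // The float16 product is exact in float, so this yields the correctly signed zero.
            return a * b + c;
        }
        // Scale to an exact integer n, where the true result is n * 2^-48.
        BigInteger n = exact.multiply(TWO_POW_48).toBigIntegerExact();
        boolean negative = n.signum() < 0;
        BigInteger mag = n.abs();

        // Keep at most 11 significant bits, but never go below the subnormal ulp 2^-24.
        int shift = Math.max(mag.bitLength() - 11, 24);
        BigInteger q = mag.shiftRight(shift);
        BigInteger rem = mag.subtract(q.shiftLeft(shift));
        int cmp = rem.compareTo(BigInteger.ONE.shiftLeft(shift - 1));
        // Round to nearest, ties to even.
        if (cmp > 0 || (cmp == 0 && q.testBit(0))) {
            q = q.add(BigInteger.ONE);
        }
        // q <= 2^11, so q * 2^(shift - 48) is exact in float.
        float result = Math.scalb((float) q.intValue(), shift - 48);
        if (result > MAX_FLOAT16) {
            result = Float.POSITIVE_INFINITY; // rounded past the largest finite float16
        }
        return negative ? -result : result;
    }
}
```

A call such as `Fp16FmaSketch.fma16(65504.0f, 1.0f, 16.0f)` hits the halfway point above the largest finite float16 value and rounds to infinity, which is the kind of boundary case a double-based implementation could be compared against.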

Thanks @jddarcy , for the detailed explanations.
Sponsoring the patch.

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/1117#issuecomment-2175285649

