RFR: JDK-8289551: Conversions between bit representations of half precision values and floats [v6]

Raffaello Giulietti duke at openjdk.org
Sat Jul 23 20:07:13 UTC 2022


On Sat, 23 Jul 2022 18:55:03 GMT, Raffaello Giulietti <duke at openjdk.org> wrote:

>> Joe Darcy has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Implement review feedback.
>
> src/java.base/share/classes/java/lang/Float.java line 1122:
> 
>> 1120:             // binary16 (when rounding is done, could still round up)
>> 1121:             int exp = Math.getExponent(f);
>> 1122:             assert -25 <= exp && exp <= 15;
> 
> I think that both the subnormal and the normal case can be unified if we pay closer attention to the positions of the lsb, round and sticky bits in subnormals.
> 
> 
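>         // For binary16 subnormals (exp < -14), force exp to -15 while
>         // retaining the difference expdelta = E_min - exp, with E_min = -14.
>         // This is the excess shift value in addition to 13. The hidden
>         // leading 1 bit of f must take part in the shift as well.
>         int expdelta = 0;
>         int msb = 0;
>         if (exp < -14) {
>             expdelta = -14 - exp;
>             exp = -15;
>             msb = 0x0080_0000;
>         }
>         assert -15 <= exp && exp <= 15;
> 
>         // Original significand, plus the hidden bit for subnormals.
>         int f_signif_bits = (doppel & 0x007f_ffff) | msb;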
>         // Significand bits as if using rounding to zero (truncation).
>         short signif_bits = (short)(f_signif_bits >> (13 + expdelta));
> 
>         // For round to nearest even, determining whether or
>         // not to round up (in magnitude) is a function of the
>         // least significant bit (LSB), the next bit position
>         // (the round position), and the sticky bit (whether
>         // there are any nonzero bits in the exact result to
>         // the right of the round digit). An increment occurs
>         // in three cases:
>         //
>         // LSB  Round Sticky
>         // 0    1     1
>         // 1    1     0
>         // 1    1     1
>         // See "Computer Arithmetic Algorithms," Koren, Table 4.9
> 
>         int lsb    = f_signif_bits & (1 << (13 + expdelta));
>         int round  = f_signif_bits & (1 << (12 + expdelta));
>         int sticky = f_signif_bits & ((1 << (12 + expdelta)) - 1);
> 
>         if (round != 0 && ((lsb | sticky) != 0 )) {
>             signif_bits++;
>         }
> 
>         // No bits set in significand beyond the *first* exponent
>         // bit, not just the significand; quantity is added to the
>         // exponent to implement a carry out from rounding the
>         // significand.
>         assert (0xf800 & signif_bits) == 0x0;
> 
>         return (short)(sign_bit | ( ((exp + 15) << 10) + signif_bits ) );

I haven't tested this variant yet; I'll do so tomorrow when I also review the tests themselves.
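
For experimenting outside the JDK, here is a minimal self-contained sketch of the unified rounding path. It assumes two details for binary16 subnormals: the excess shift is measured from E_min = -14, and the hidden leading 1 bit of the float is restored before shifting. The names Binary16Sketch, floatToBinary16 and hidden are hypothetical, and the NaN handling (always returning one quiet NaN pattern) is a simplification of mine, not the behavior of the patch under review.

```java
// Illustrative sketch only; not the code in Float.java.
final class Binary16Sketch {
    private Binary16Sketch() {}

    /** Converts a float to binary16 bits, rounding to nearest even. */
    static short floatToBinary16(float f) {
        int doppel = Float.floatToRawIntBits(f);
        short signBit = (short) ((doppel & 0x8000_0000) >> 16);

        if (Float.isNaN(f)) {
            // Assumption: a fixed quiet NaN; the payload is not preserved.
            return (short) (signBit | 0x7e00);
        }
        float abs = Math.abs(f);
        // binary16 MAX_VALUE plus half an ulp, and anything above, is infinity.
        if (abs >= 0x1.ffcp15f + 0x0.002p15f) {
            return (short) (signBit | 0x7c00);
        }
        // Half of the smallest subnormal (2^-24), and anything below, is zero.
        if (abs <= 0x1.0p-25f) {
            return signBit;
        }

        int exp = Math.getExponent(f);      // in [-25, 15] at this point
        int expdelta = 0;                   // excess shift for subnormals
        int hidden = 0;                     // implicit leading bit, if needed
        if (exp < -14) {
            expdelta = -14 - exp;
            exp = -15;                      // biased exponent field becomes 0
            hidden = 0x0080_0000;
        }
        int signif = (doppel & 0x007f_ffff) | hidden;
        int shift = 13 + expdelta;

        // Truncate, then decide on the increment from LSB, round and sticky.
        int result = signif >> shift;
        int lsb    = signif & (1 << shift);
        int round  = signif & (1 << (shift - 1));
        int sticky = signif & ((1 << (shift - 1)) - 1);
        if (round != 0 && (lsb | sticky) != 0) {
            result++;
        }

        // Adding (rather than or-ing) lets a carry out of the significand
        // propagate into the exponent field.
        return (short) (signBit | (((exp + 15) << 10) + result));
    }
}
```

Normal and subnormal results share the same shift-and-round code here; only expdelta and the hidden bit differ between the two cases. Calling Binary16Sketch.floatToBinary16(0x1.0p-24f) exercises the deepest subnormal path, and Math.nextDown(0x1.0p-14f) exercises the carry from the largest subnormal into the smallest normal value.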

-------------

PR: https://git.openjdk.org/jdk/pull/9422


More information about the core-libs-dev mailing list