JLS 3.10.2 -- Exposition of hexadecimal f.p. literals

Mon Oct 28 23:15:29 UTC 2019

Thanks Joe. I am unable to spend more time on this issue, so I have 
filed JDK-8233092 to record it for the future.

Alex

On 10/25/2019 5:29 PM, Joe Darcy wrote:
> To provide some additional background on this thread if not the JLS 
> section in question, hexadecimal floating-point literals are a very 
> useful language feature for a narrow range of situations. Those 
> situations include having a straightforward way to set the exact bits of 
> a floating-point value using a roughly human readable format. I commonly 
> use hexadecimal floating-point literals in numerical tests and have 
> added them to the narrative specs for sentinel values such as 
> Double.MAX_VALUE.
> 
> A finite IEEE floating-point value is conceptually a tuple of three 
> values, sign, significand, and exponent, where the significand and 
> exponent have ranges that are a function of the format in question, 
> float or double, etc. Depending on how one wants to formulate the 
> values, the ranges of each of these values can be given in terms of a 
> set of contiguous integers.
> 
> In a hex floating-point literal, the significand value is written in hex 
> but the exponent value is written in *decimal*. However, the decimal 
> value is used as an exponent for base 2 and in that sense is a "binary" 
> exponent. This seemingly conflicting design works for the intended use 
> cases.
> 
> For example, the smallest nonzero double value is numerically equal to 
> 2^-1074. As a hex literal this can be written in a number of ways, including
> 
>      0x1.0p-1074
> 
> in decimal, 1 * 2 ^ -1074. The way to write the value corresponding to 
> the representation of the floating-point value, using some underscores 
> to help grouping, is
> 
>      0x0.0000_0000_0000_1p-1022
> 
> Decoding, this is a subnormal value (leading digit 0 with the lowest 
> exponent value) and only the least significant bit of the significand is 
> set. The double format uses 52 bits for its significand field, which is 
> 13 hex digits.
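
A minimal Java sketch of the two spellings above, assuming nothing beyond
java.lang.Double. The class, field, and method names are made up for
illustration:

```java
public class SmallestDoubleDemo {
    // Value-oriented spelling: 1.0 (hex) times 2^-1074.
    static final double VALUE_FORM = 0x1.0p-1074;

    // Representation-oriented spelling: subnormal, lowest exponent,
    // only the least significant of the 52 significand bits set.
    static final double REPRESENTATION_FORM = 0x0.0000_0000_0000_1p-1022;

    // Raw IEEE 754 bit pattern of the representation-oriented literal.
    static long rawBits() {
        return Double.doubleToLongBits(REPRESENTATION_FORM);
    }

    public static void main(String[] args) {
        // Both literals denote the same double, which is Double.MIN_VALUE.
        System.out.println(VALUE_FORM == REPRESENTATION_FORM);
        System.out.println(REPRESENTATION_FORM == Double.MIN_VALUE);
        // Bit pattern and the hex string the library produces for it.
        System.out.println(Long.toHexString(rawBits()));
        System.out.println(Double.toHexString(Double.MIN_VALUE));
    }
}
```

The raw bit pattern comes out as 1, which is what "only the least
significant bit of the significand is set" amounts to.
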
> 
> Other examples would include how to write "3.0" in a way corresponding 
> to the representation:
> 
>      0x1.8p1
> 
> that is 1.8 as a hex value (namely 1.5 in decimal) multiplied by 2^1 = 2.
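
As a rough illustration of the same point in Java: several hex literals
can denote 3.0, and Double.toHexString returns the spelling that mirrors
the representation. The class and constant names here are hypothetical:

```java
public class HexLiteralExamples {
    // 1.8 (hex) = 1.5 (decimal), times 2^1.
    static final double THREE = 0x1.8p1;

    // Other spellings of the same value: 3 * 2^0 and 12 * 2^-2.
    static final double THREE_ALT_A = 0x3p0;
    static final double THREE_ALT_B = 0xcp-2;

    // Largest finite double: all 52 fraction bits set, largest exponent.
    static final double MAX = 0x1.fffffffffffffp1023;

    public static void main(String[] args) {
        System.out.println(THREE == 3.0);
        System.out.println(THREE_ALT_A == THREE && THREE_ALT_B == THREE);
        System.out.println(MAX == Double.MAX_VALUE);
        // The library's hex rendering of 3.0.
        System.out.println(Double.toHexString(3.0));
    }
}
```
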
> 
> On 10/24/2019 5:04 PM, Alex Buckley wrote:
>> On 10/24/2019 1:38 PM, Joe Darcy wrote:
>>> To make an explicit statement about the value of a floating-point 
>>> literal, I suggest after the sentence
>>>
>>> "A floating-point literal may be expressed in decimal (base 10) or 
>>> hexadecimal (base 16). "
>>>
>>> adding something like
>>>
>>> "The exact numerical value of a decimal floating-point literal is
>>>
>>>      decimal_sequence * 10 ^ exponent
>>>
>>> The exact numerical value of a hexadecimal floating-point literal is
>>>
>>>      hex_sequence * 2 ^ exponent
>>>
>>> The conversion of the exact numerical value to a particular 
>>> floating-point value is handled as if by Float.valueOf or 
>>> Double.valueOf for literals of type float and double, respectively."
>>
>> This is a good start, but needs tightening up. Please consider this 
>> text as if you're seeing it for the first time, bearing in mind that 
>> it's defining terms which map to productions in the grammar 
>> immediately after.
> 
> To be more explicit, "decimal_sequence * 10 ^ exponent" is an informal 
> shorthand for "in each of the possible decimal floating-point literal 
> forms below, collect together the leading digits as a digit sequence, 
> treat it as a normal rational numerical value and multiply it by 10 
> raised to the exponent, where the exponent is implicitly 0 if not 
> syntactically present in the literal."
> 
>      Digits . [Digits] [ExponentPart] [FloatTypeSuffix]
>      . Digits [ExponentPart] [FloatTypeSuffix]
>      Digits ExponentPart [FloatTypeSuffix]
>      Digits [ExponentPart] FloatTypeSuffix
> 
>>
>> -----
>> A floating-point literal may be expressed in decimal (base 10) or 
>> hexadecimal (base 16).
>>
>> For decimal floating-point literals, at least one digit (in either the 
>> whole number or the fraction part) and either a decimal point, an 
>> exponent, or a float type suffix are required. All other parts are 
>> optional. The exponent, if present, is indicated by the ASCII letter e 
>> or E followed by an optionally signed integer.
>>
>> The exact numerical value of a decimal floating-point literal is:
>>   decimal_sequence * 10 ^ exponent
>>
>> For hexadecimal floating-point literals, at least one digit is 
>> required (in either the whole number or the fraction part), and the 
>> exponent is mandatory, and the float type suffix is optional. The 
>> exponent is indicated by the ASCII letter p or P followed by an 
>> optionally signed integer.
>>
>> The exact numerical value of a hexadecimal floating-point literal is:
>>   hex_sequence * 2 ^ exponent
>>
>> Underscores are allowed as separators between digits that denote the 
>> whole-number part, and between digits that denote the fraction part, 
>> and between digits that denote the exponent.
>> -----
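
One possible reading of "hex_sequence * 2 ^ exponent", sketched in Java
with BigDecimal so that no rounding occurs. This is an assumption about
what the informal formula means, not proposed spec text. The helper
hexValue and its parameters are hypothetical, and the caller is assumed
to have already split the literal into whole digits, fraction digits,
and the decimal-written binary exponent, with underscores removed:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class ExactLiteralValue {
    // Exact value of a hex floating-point literal given its pieces.
    // Example: 0x1.8p1 is hexValue("1", "8", 1).
    static BigDecimal hexValue(String wholeHex, String fractionHex, int exponent) {
        // Read all significand digits as one base-16 integer.
        BigInteger digits = new BigInteger(wholeHex + fractionHex, 16);
        // Each fraction digit moves the hex point by 4 binary places.
        int power = exponent - 4 * fractionHex.length();
        BigDecimal two = BigDecimal.valueOf(2);
        BigDecimal value = new BigDecimal(digits);
        // Dividing by a power of two always terminates in decimal,
        // so the unrounded divide cannot throw here.
        return power >= 0
                ? value.multiply(two.pow(power))
                : value.divide(two.pow(-power));
    }

    public static void main(String[] args) {
        System.out.println(hexValue("1", "8", 1));
        // Both spellings of the smallest double give the same exact value.
        BigDecimal a = hexValue("1", "0", -1074);
        BigDecimal b = hexValue("0", "0000000000001", -1022);
        System.out.println(a.compareTo(b) == 0);
    }
}
```

For the decimal case, new BigDecimal("4.9e-324") already yields the exact
value of the digit sequence times ten to the exponent, so no helper is
needed there.
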
>>
>> - What is "decimal_sequence"? The answer must be in terms of the 
>> artifacts mentioned in the immediately preceding paragraph -- or 
>> modify the grammar to introduce new artifacts that can be described in 
>> the narrative.
>>
>> - Similarly for "hex_sequence".
>>
>> - A decimal f-p literal need not include the exponent part, so the 
>> definition can't just assume "exponent" is known.
>>
>> - For a hexadecimal f-p literal, the questioner mentioned that the 
>> (mandatory) exponent is "in base 2", but there is no requirement to 
>> write the exponent in binary. There's lots of potential for confusion 
>> here. What are some examples of hexadecimal f-p literals?
>>
>> In the JLS, it is often the most fundamental descriptions and 
>> operations that are the hardest to phrase. We're not there yet for f-p 
>> literal values.
> 
> There could be value in having a highly condensed floating-point primer 
> in the JLS, but it would be fine to continue to omit such information as 
> well. For example, the statement in the JLS
> 
>      "The smallest positive finite non-zero literal of type double is 
> 4.9e-324."
> 
> is considerably more subtle than it appears at first. All finite binary 
> floating-point values are exactly representable as decimal values since 
> 10 = 2 * 5. There is a range of the number line which converts to 
> Double.MIN_VALUE, and many decimal strings in that range get 
> converted to Double.MIN_VALUE. The string "4.9e-324" is neither the 
> numerically smallest such string nor the numerically largest. The 
> exact decimal string has several hundred digits. The string used is the 
> shortest such string, which is regarded as the canonical one.
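
A small Java sketch of the point about 4.9e-324, using strings and
Double.parseDouble rather than literals. The sample strings are my own
picks from inside and outside the rounding interval, and the class and
method names are hypothetical:

```java
import java.math.BigDecimal;

public class MinValueStrings {
    // True if the decimal string rounds to the smallest positive double.
    static boolean convertsToMinValue(String s) {
        return Double.parseDouble(s) == Double.MIN_VALUE;
    }

    // Exact decimal expansion of 2^-1074; BigDecimal(double) does not round.
    static BigDecimal exactMinValue() {
        return new BigDecimal(Double.MIN_VALUE);
    }

    public static void main(String[] args) {
        // Strings numerically below and above 4.9e-324 that still convert.
        System.out.println(convertsToMinValue("3e-324"));
        System.out.println(convertsToMinValue("4.9e-324"));
        System.out.println(convertsToMinValue("7e-324"));
        // Below half of MIN_VALUE the result rounds to zero instead.
        System.out.println(Double.parseDouble("2e-324") == 0.0);
        // Number of significant decimal digits in the exact value.
        System.out.println(exactMinValue().precision());
        // The short canonical string chosen by the library.
        System.out.println(Double.toString(Double.MIN_VALUE));
    }
}
```
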
> 
> Such subtleties could be alluded to in the JLS; if there is interest, I 
> could work on a few paragraphs describing the situation.
> 
> Cheers,
> 
> -Joe
>