JLS 3.10.2 -- Exposition of hexadecimal f.p. literals
Joe Darcy
joe.darcy at oracle.com
Sat Oct 26 00:29:17 UTC 2019
To provide some additional background on this thread if not the JLS
section in question, hexadecimal floating-point literals are a very
useful language feature for a narrow range of situations. Those
situations include having a straightforward way to set the exact bits of
a floating-point value using a roughly human readable format. I commonly
use hexadecimal floating-point literals in numerical tests and have
added them to the narrative specs for sentinel values such as
Double.MAX_VALUE.
A finite IEEE floating-point value is conceptually a tuple of three
values, sign, significand, and exponent, where the significand and
exponent have ranges that are a function of the format in question,
float or double, etc. Depending on how one wants to formulate the
values, the ranges of each of these values can be given in terms of a
set of contiguous integers.
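As a rough illustration of the tuple view, the sketch below pulls the three fields out of a double with Double.doubleToRawLongBits. The class and method names (FieldsOfDouble, sign, biasedExponent, fraction) are made up for this example; the 1/11/52 bit split is the standard layout of the double format.

```java
public class FieldsOfDouble {
    // Sign bit: 0 for positive values, 1 for negative values.
    static int sign(double d) {
        return (int) (Double.doubleToRawLongBits(d) >>> 63);
    }

    // 11-bit biased exponent field (bias 1023); 0 marks zero or a subnormal.
    static int biasedExponent(double d) {
        return (int) ((Double.doubleToRawLongBits(d) >>> 52) & 0x7FFL);
    }

    // 52-bit fraction field, i.e. the stored part of the significand.
    static long fraction(double d) {
        return Double.doubleToRawLongBits(d) & 0x000F_FFFF_FFFF_FFFFL;
    }

    public static void main(String[] args) {
        double d = -3.0; // -1.5 * 2^1
        System.out.println(sign(d));                        // 1
        System.out.println(biasedExponent(d) - 1023);       // 1
        System.out.println(Long.toHexString(fraction(d)));  // 8000000000000
    }
}
```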
In a hex floating-point literal, the significand value is written in hex
but the exponent value is written in *decimal*. However, the decimal
value is used as an exponent for base 2 and in that sense is a "binary"
exponent. This seemingly conflicting design works for the intended use
cases.
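A few small literals may make the mixed notation concrete. This is only a sketch; the class name HexExponentDemo and the constant names are invented for the example.

```java
public class HexExponentDemo {
    // Significand digits are hexadecimal; exponent digits are decimal,
    // but the exponent scales by powers of 2, not 10 or 16.
    static final double A = 0x1p10;  // 1 * 2^10
    static final double B = 0x10p0;  // hex 10 is sixteen
    static final double C = 0xAp1;   // hex A is ten, times 2^1
    static final double D = 0x.8p0;  // hex 0.8 is 8/16

    public static void main(String[] args) {
        System.out.println(A); // 1024.0
        System.out.println(B); // 16.0
        System.out.println(C); // 20.0
        System.out.println(D); // 0.5
    }
}
```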
For example, the smallest nonzero double value is numerically equal to
2^-1074. As a hex literal this can be written in a number of ways including
0x1.0p-1074
that is, 1 * 2^-1074. The way to write the value so that it corresponds
to the representation of the floating-point value, using some
underscores to help with grouping, is
0x0.0000_0000_0000_1p-1022
Decoding, this is a subnormal value (leading digit 0 with the lowest
exponent value) and only the least significant bit of the significand is
set. The double format uses 52 bits for its significand field, which is
13 hex digits.
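Both spellings can be checked against Double.MIN_VALUE directly. The class and constant names here (MinValueLiterals, NORMALIZED_FORM, REPRESENTATION_FORM) are just labels for this sketch.

```java
public class MinValueLiterals {
    // Same value, two spellings: 1 * 2^-1074, and the subnormal form
    // that mirrors the bit pattern (13 hex fraction digits, exponent -1022).
    static final double NORMALIZED_FORM = 0x1.0p-1074;
    static final double REPRESENTATION_FORM = 0x0.0000_0000_0000_1p-1022;

    public static void main(String[] args) {
        System.out.println(NORMALIZED_FORM == Double.MIN_VALUE);      // true
        System.out.println(REPRESENTATION_FORM == Double.MIN_VALUE);  // true
        // Only the least significant bit of the encoding is set.
        System.out.println(Double.doubleToRawLongBits(REPRESENTATION_FORM)); // 1
        // Double.toHexString uses the representation-style spelling.
        System.out.println(Double.toHexString(Double.MIN_VALUE)); // 0x0.0000000000001p-1022
    }
}
```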
Other examples would include how to write "3.0" in a way corresponding
to the representation:
0x1.8p1
that is 1.8 as a hex value (namely 1.5 in decimal) multiplied by 2^1 = 2.
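The same check works for 3.0, and Double.toHexString and Double.parseDouble round-trip the hex form. The sketch also includes the hex spelling of Double.MAX_VALUE given in the Double javadoc; the class and constant names are illustrative only.

```java
public class ThreeAsHex {
    static final double THREE = 0x1.8p1;                 // hex 1.8 = 1.5, times 2^1
    static final double MAX = 0x1.fffffffffffffp1023;    // all 52 fraction bits set

    public static void main(String[] args) {
        System.out.println(THREE == 3.0);                        // true
        System.out.println(Double.toHexString(3.0));             // 0x1.8p1
        System.out.println(Double.parseDouble("0x1.8p1") == 3.0); // true
        System.out.println(MAX == Double.MAX_VALUE);             // true
    }
}
```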
On 10/24/2019 5:04 PM, Alex Buckley wrote:
> On 10/24/2019 1:38 PM, Joe Darcy wrote:
>> To make an explicit statement about the value of a floating-point
>> literal, I suggest after the sentence
>>
>> "A floating-point literal may be expressed in decimal (base 10) or
>> hexadecimal (base 16). "
>>
>> adding something like
>>
>> "The exact numerical value of a decimal floating-point literal is
>>
>> decimal_sequence * 10 ^ exponent
>>
>> The exact numerical value of a hexadecimal floating-point literal is
>>
>> hex_sequence * 2 ^ exponent
>>
>> The conversion of the exact numerical value to a particular
>> floating-point value is handled as if by Float.valueOf or
>> Double.valueOf for literals of type float and double, respectively."
>
> This is a good start, but needs tightening up. Please consider this
> text as if you're seeing it for the first time, bearing in mind that
> it's defining terms which map to productions in the grammar
> immediately after.
To be more explicit, "decimal_sequence * 10 ^ exponent" is an informal
short-hand for "in each of the possible decimal floating-point literal
forms below, collect together the leading digits as a digit sequence,
treat it as a normal rational numerical value and multiply it by 10
raised to the exponent, where the exponent is implicitly 0 (a scale
factor of 1) if not syntactically present in the literal."
Digits . [Digits] [ExponentPart] [FloatTypeSuffix]
. Digits [ExponentPart] [FloatTypeSuffix]
Digits ExponentPart [FloatTypeSuffix]
Digits [ExponentPart] FloatTypeSuffix
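One way to make the "decimal_sequence * 10 ^ exponent" short-hand concrete is to compute the exact value with BigDecimal and compare it with what Double.valueOf produces. This is a sketch only: exactValue is a hypothetical helper, and a literal without an ExponentPart is treated here as having exponent 0, that is, a scale factor of 10^0 = 1.

```java
import java.math.BigDecimal;

public class DecimalLiteralValue {
    // Exact rational value of digits * 10^exponent; pass 0 when the
    // literal has no exponent part.
    static BigDecimal exactValue(String digits, int exponent) {
        return new BigDecimal(digits).scaleByPowerOfTen(exponent);
    }

    public static void main(String[] args) {
        // Literal 1.25e2: digit sequence 1.25, exponent 2.
        BigDecimal exact = exactValue("1.25", 2);
        System.out.println(exact.compareTo(new BigDecimal("125")) == 0); // true
        System.out.println(Double.valueOf("1.25e2") == 125.0);           // true

        // Literal 0.1: the exact value is one tenth, but the double it
        // converts to is a nearby binary fraction, not one tenth itself.
        BigDecimal tenth = exactValue("0.1", 0);
        BigDecimal asDouble = new BigDecimal(Double.valueOf("0.1"));
        System.out.println(asDouble.compareTo(tenth) != 0);              // true
    }
}
```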
>
> -----
> A floating-point literal may be expressed in decimal (base 10) or
> hexadecimal (base 16).
>
> For decimal floating-point literals, at least one digit (in either the
> whole number or the fraction part) and either a decimal point, an
> exponent, or a float type suffix are required. All other parts are
> optional. The exponent, if present, is indicated by the ASCII letter e
> or E followed by an optionally signed integer.
>
> The exact numerical value of a decimal floating-point literal is:
> decimal_sequence * 10 ^ exponent
>
> For hexadecimal floating-point literals, at least one digit is
> required (in either the whole number or the fraction part), and the
> exponent is mandatory, and the float type suffix is optional. The
> exponent is indicated by the ASCII letter p or P followed by an
> optionally signed integer.
>
> The exact numerical value of a hexadecimal floating-point literal is:
> hex_sequence * 2 ^ exponent
>
> Underscores are allowed as separators between digits that denote the
> whole-number part, and between digits that denote the fraction part,
> and between digits that denote the exponent.
> -----
>
> - What is "decimal_sequence"? The answer must be in terms of the
> artifacts mentioned in the immediately preceding paragraph -- or
> modify the grammar to introduce new artifacts that can be described in
> the narrative.
>
> - Similarly for "hex_sequence".
>
> - A decimal f-p literal need not include the exponent part, so the
> definition can't just assume "exponent" is known.
>
> - For a hexadecimal f-p literal, the questioner mentioned that the
> (mandatory) exponent is "in base 2", but there is no requirement to
> write the exponent in binary. There's lots of potential for confusion
> here. What are some examples of hexadecimal f-p literals?
>
> In the JLS, it is often the most fundamental descriptions and
> operations that are the hardest to phrase. We're not there yet for f-p
> literal values.
There could be value in having a highly condensed floating-point primer
in the JLS, but it would be fine to continue to omit such information as
well. For example, the statement in the JLS
"The smallest positive finite non-zero literal of type double is
4.9e-324."
is considerably more subtle than it appears at first. All finite binary
floating-point values are exactly representable as decimal values since
10 = 2 * 5. There is a range of the number line which converts to
Double.MIN_VALUE and many decimal strings in that range which get
converted to Double.MIN_VALUE. The string "4.9e-324" is neither the
numerically smallest such string nor the numerically largest. The
exact decimal expansion of Double.MIN_VALUE has several hundred digits.
The string used is the shortest such string, which is regarded as the
canonical one.
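A short experiment shows several decimal strings landing on Double.MIN_VALUE, and the length of the exact expansion. The strings "3e-324" and "7e-324" are picked for illustration (they lie between half and one and a half times 2^-1074); parsesToMinValue is a hypothetical helper name.

```java
import java.math.BigDecimal;

public class MinValueStrings {
    // True if the decimal string rounds to the smallest positive double.
    static boolean parsesToMinValue(String s) {
        return Double.parseDouble(s) == Double.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(parsesToMinValue("4.9e-324")); // true
        System.out.println(parsesToMinValue("3e-324"));   // true, numerically smaller
        System.out.println(parsesToMinValue("7e-324"));   // true, numerically larger

        // Double.toString produces the short canonical string.
        System.out.println(Double.toString(Double.MIN_VALUE)); // 4.9E-324

        // new BigDecimal(double) is exact, so precision() counts the
        // significant digits of the full decimal expansion of 2^-1074.
        System.out.println(new BigDecimal(Double.MIN_VALUE).precision());
    }
}
```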
Such subtleties could be alluded to in the JLS; if there is interest, I
could work on a few paragraphs describing the situation.
Cheers,
-Joe