Thoughts on unified integer literal improvements
Bruce Chapman
brucechapman at paradise.net.nz
Mon Jun 29 02:12:52 PDT 2009
Joe Darcy wrote:
> Hello.
>
> On the set of improved integer literal features, I think combining the
> underscores as separators and binary literals is straightforward given
> separately correct grammars for each change.
>
> As an alternate to "y" and "s" suffixes, I suggest considering a
> "u" suffix to mean unsigned. Literals with a trailing "u" would have
> type int; widening conversions of such literals would 0 extend and
> narrowing conversions would range check on the width of set bits. For
> example,
>
All,
I have spent some time considering Joe's suggestions.
While I really like the aesthetics of "u" means "unsigned" compared with
"y" suffix for "byte", I am also aware of the considerable extra
complexity of defining a new primitive type in the JLS. From a first
scan I have identified the most obvious change points, which are recorded
in the Google document http://docs.google.com/View?id=dcvp3mkv_112567xb4n
I have also made a (somewhat subjective) comparison of the "u" vs "y"
proposals, along with a variation that combines the "u" suffix surface
syntax with semantics closer to those of the previous autosizing proposal
(which introduced a new hex prefix), but done as a suffix so that it is
applicable across all number bases.
That comparison is available in the Google document
http://spreadsheets.google.com/ccc?key=rS9MTI5_fP9GwWrAafy_LwQ
In practice the unsigned int and autosizing suffixes behave almost
identically, and where they differ both can be used to achieve the same
goals, with the following exception:
Autosizing cannot be used to declare a decimal literal between
Integer.MAX_VALUE and twice that value which zero
extends to a long. The simple workaround is to use the "L" suffix to
force a long, which more correctly captures the design
intent anyway.
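To make the exception concrete, here is a small sketch in today's Java (no
proposed syntax is used). The class and method names are made up for
illustration. It shows why a value such as 3000000000 held as a 32 bit
pattern does not widen to the intended long, and why the "L" suffix states
the intent directly.

```java
final class ZeroExtendSketch {
    // With the "L" suffix the literal is a long from the start.
    static final long WITH_L_SUFFIX = 3000000000L;

    // The same 32 bit pattern held as an int is negative,
    // because bit 31 is the sign bit.
    static final int SAME_BITS_AS_INT = (int) 3000000000L;

    // Ordinary widening of an int to a long sign extends.
    static long signExtended() {
        return SAME_BITS_AS_INT;
    }

    // Zero extension has to be requested explicitly with a mask.
    static long zeroExtended() {
        return SAME_BITS_AS_INT & 0xFFFFFFFFL;
    }
}
```

Here signExtended() gives -1294967296 while zeroExtended() gives 3000000000,
the same value as WITH_L_SUFFIX.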
My preference at the moment is for the "u" suffix with autosizing semantics.
Some work is needed to define the autosizing rules exactly. I am
currently leaning toward (loosely):
The type of the literal is the smallest integral type capable of
holding all significant bits of the constant, unless the first
digit of the literal (excluding prefix characters) is '0', in which
case the type is the smallest integral type capable of holding one more
than the number of significant bits of the constant value, e.g. 0xFFu is
byte, 0x0FFu is short.
However, my goal is to write a proposal which allows "nice" hexadecimal
byte literals with the maximum likelihood of selection for JDK7, and to
that end, I'd appreciate comments, especially any that run counter to my
following assumptions:
1) "unsigned int" needs JLS changes that are excessive relative to
their value, when compared to the other options.
2) Having a new primitive type "unsigned int" which can only be used for
literals, and cannot be used to declare variables has a high probability
of being viewed negatively by the user community. Some users would wish
for full unsigned support, and others would wish that it wasn't there at
all. The effort of explaining this afterwards will be significant.
3) Explicit use of leading '0' digit in autosizing case to force zero
extension is a good thing.
4) One way to deal with the conflict between octal literals and the
"leading zero widens" rule would be to introduce alternate alphabetic
prefixes for decimal (d) and octal (o), e.g. 0d10 (10) and 0o10 (8), which
could be used in combination with a leading 0 digit to force zero extension
(e.g. 0d0255u is short value 255). However, I assume this is unjustifiable
complexity.
And if you have been explicitly addressed in this email, I'd especially
value your feedback.
regards
Bruce