Thoughts on unified integer literal improvements

Bruce Chapman brucechapman at paradise.net.nz
Mon Jun 29 02:12:52 PDT 2009


Joe Darcy wrote:
> Hello.
>
> On the set of improved integer literal features, I think combining the 
> underscores as separators and binary literals is straightforward given 
> separately correct grammars for each change.
>
> As an alternate to "y" and "s" suffices, I suggesting considering a 
> "u" suffix to mean unsigned.  Literals with a trailing "u" would have 
> type int; widening conversions of such literals would 0 extend and 
> narrowing conversions would range check on the width of set bits.  For 
> example,
>
All,

I have spent some time considering Joe's suggestions.

While I really like the aesthetics of "u" means "unsigned" compared with 
"y" suffix for "byte", I am also aware of the considerable extra 
complexity of defining a new primitive type in the JLS. From a first 
scan I have identified the most obvious change points which are recorded 
in the google document http://docs.google.com/View?id=dcvp3mkv_112567xb4n

I have also made a (somewhat subjective) comparison of the "u" vs "y" 
proposals, along with a variation that combines the surface syntax of 
the "u" suffix with semantics closer to those of the previous autosizing 
proposal (which introduced a new hex prefix), but done as a suffix so 
that it is applicable across all number bases.

A comparison of those is available in the google document 
http://spreadsheets.google.com/ccc?key=rS9MTI5_fP9GwWrAafy_LwQ

In practice the unsigned int and autosizing suffixes behave almost 
identically, and where they differ, either can be used to achieve the 
same goals, with the following exception:

    Autosizing cannot be used to declare a decimal between
    Integer.MAX_VALUE and twice that which zero extends to a long.
    The simple workaround is to use the "L" suffix to force a long,
    which more correctly captures the design intent anyway.
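
To illustrate the exception in today's Java (a sketch only; the class 
and method names are hypothetical, and the reading of the proposals in 
the comments is my interpretation): a decimal such as 3000000000 needs 
all 32 bits, so as a plain int its bit pattern is negative and ordinary 
widening sign extends. The "L" suffix, or an explicit mask, gives the 
zero-extended value.

```java
final class ZeroExtendSketch {
    /** Zero-extends the 32 bits of an int into a long (hypothetical helper). */
    static long zeroExtend(int bits) {
        return bits & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // 3000000000 lies between Integer.MAX_VALUE and twice that value.
        long viaSuffix = 3000000000L;        // the "L" workaround: a long from the start
        int sameBits = (int) 3000000000L;    // the same low 32 bits, negative as a signed int
        long signExtended = sameBits;        // ordinary widening sign extends
        long zeroExtended = zeroExtend(sameBits);

        System.out.println(viaSuffix == zeroExtended);   // true
        System.out.println(signExtended < 0);            // true
    }
}
```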

My preference at the moment is for the "u" suffix with autosizing semantics.

Some work is needed to define the autosizing rules exactly - I am 
currently leaning toward (loosely)

    The type of the literal is the smallest integral type capable of
    holding all significant bits of the constant, unless the first
    digit of the literal (excluding prefix characters) is '0', in which
    case the type is the smallest integral type capable of holding one
    more than the number of significant bits of the constant value.
    eg 0xFFu is byte, 0x0FFu is short.
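
The loose rule above can be sketched in Java as follows. This is only 
an illustration under assumptions: the class and method names are 
hypothetical, the candidate types are taken to be byte, short, int and 
long (char is ignored), and the method receives the digits with any 
prefix and the "u" suffix already stripped.

```java
import java.math.BigInteger;

final class AutosizeSketch {
    /**
     * Returns the name of the type the loose autosizing rule would pick.
     * digits: the literal's digits without prefix or suffix, e.g. "0FF".
     * radix:  the number base implied by the prefix, e.g. 16 for 0x.
     */
    static String autosizedType(String digits, int radix) {
        BigInteger value = new BigInteger(digits, radix);
        int bits = value.bitLength();      // number of significant bits
        if (digits.charAt(0) == '0') {
            bits += 1;                     // leading '0' asks for one extra bit
        }
        if (bits <= 8)  return "byte";
        if (bits <= 16) return "short";
        if (bits <= 32) return "int";
        if (bits <= 64) return "long";
        throw new NumberFormatException("literal too large: " + digits);
    }

    public static void main(String[] args) {
        System.out.println(autosizedType("FF", 16));   // byte  (0xFFu)
        System.out.println(autosizedType("0FF", 16));  // short (0x0FFu)
    }
}
```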

However, my goal is to write a proposal which allows "nice" hexadecimal 
byte literals and has the maximum likelihood of selection for JDK7. To 
that end, I'd appreciate comments, especially any that run counter to 
the following assumptions:

1) "unsigned int" needs JLS changes that are excessive relative to 
their value, when compared to the other options.

2) Having a new primitive type "unsigned int" which can only be used for 
literals, and cannot be used to declare variables, has a high probability 
of being viewed negatively by the user community. Some users would wish 
for full unsigned support, and others would wish that it wasn't there at 
all. The effort of explaining this afterwards will be significant.

3) Explicit use of a leading '0' digit in the autosizing case to force 
zero extension is a good thing.

4) One way to deal with the conflict between octal literals and a 
leading zero that widens would be to introduce alternate alphabetic 
prefixes for decimal (d) and octal (o), eg 0d10 (10) and 0o10 (8), which 
could be used in combination with a leading 0 digit to force zero 
extension (eg 0d0255u is the short value 255). However, I assume this is 
unjustifiable complexity.

And if you have been explicitly addressed in this email, I'd especially 
value your feedback.

regards

Bruce




