Thoughts on unified integer literal improvements

Tue Jun 30 03:47:18 PDT 2009

On Jun 30, 2009, at 5:15 PM, Bruce Chapman wrote:

>
>>> What is happening is that the type of the expression on the right has
>>> some conversions applied before the assignment. Collectively those are
>>> called assignment conversion, and when 0xFFu has a type of byte, then
>>> the particular assignment conversion which is applied is the widening
>>> conversion.
>>>
>>
>> I somehow understand your point. This 0xFFu is a brand new data type.
>>
> Almost. If it's Joe's 0xFFu then yes, but that has been determined
> too big for Coin. If it's autosizing then no, it's not a new type,
> just a new literal syntax for a literal whose type depends on the
> literal value, but those possible types are existing types.

"a new literal syntax for a literal whose type depends on the literal  
value".

So, 0xFFu, being two hex digits (one byte) long, is always a byte? Then
what does this line mean?

>>>> int i = 0xFFu // 127 (should be 255)

Or is this only Joe's 0xFFu? Your i will be -1?
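
For reference, a minimal sketch of what current Java does with the cast
form, without any new literal syntax. It says nothing about how either
proposal would treat 0xFFu; the class and method names are made up for
illustration.

```java
final class LiteralWidening {
    // Widening byte -> int sign-extends: bit 7 is copied into the upper 24 bits.
    static int widenSigned(byte b) {
        return b;
    }

    // Masking with 0xFF keeps only the low 8 bits, giving 0..255.
    static int widenUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        int i = (byte) 0xFF;            // the cast yields byte -1, widened to int -1
        System.out.println(i);                              // -1
        System.out.println(widenSigned((byte) 0xFF));       // -1
        System.out.println(widenUnsigned((byte) 0xFF));     // 255
    }
}
```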

>> Don't know if it can be coined without any new VM bytecode.
>> Implementing it with some syntactic sugar might bring unexpected
>> troubles.
>>
> No new VM support required, just some (quite minor) changes to the
> compiler. Getting the spec right is probably more effort than
> getting the compiler code written.
>
>> What does f(0xFFu) mean when there are both f(byte) and f(int)?
>>
>>
> The same as f((byte)255)
>
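
A small sketch of how today's overload resolution treats the two forms,
assuming the proposed literal really would behave like f((byte)255). The
class name and the string return values are only there for illustration.

```java
final class OverloadDemo {
    static String f(byte b) {
        return "f(byte)";
    }

    static String f(int i) {
        return "f(int)";
    }

    public static void main(String[] args) {
        // A plain hex literal is an int, and method invocation never
        // narrows a constant implicitly, so only f(int) is applicable.
        System.out.println(f(0xFF));         // f(int)

        // With the cast the argument is a byte. Both overloads are
        // applicable and f(byte) is chosen as the more specific one.
        System.out.println(f((byte) 255));   // f(byte)
    }
}
```
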
>>> In practice many operations involving byte or short literals will see
>>> them widened to int first, which gets to your next question
>>>
>>>> What is the most important usage of it?
>>>>
>>>>
>>> byte b = 0xFF;
>>>
>>> currently fails because 0xFF is an int, and when we assign that to a
>>> byte the value (255) is out of range for byte (-128 to 127). The
>>> compiler knows this and won't let us do it. We could write
>>>
>>> byte b = (byte)0xFF;
>>>
>>
>> But you already have
>>
>>     byte b = 0xFFy;
>>
>>
> 0xFFy and 0xFFu are alternative designs being explored; they won't
> BOTH happen, but hopefully one will.

Got it. I would prefer 0xFFy.

How about going a step further, like this:

    byte[] bs = 0x0102030405060708AABBCCDDEEFFy;
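
Without such a literal, the closest spelling today is roughly the sketch
below. fromHex is a made-up helper written out by hand, not a JDK method.

```java
final class HexBytes {
    // Hypothetical helper: decodes consecutive pairs of hex digits into bytes.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit");
            }
            // The narrowing cast keeps the low 8 bits, so 0xAA becomes -86.
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bs = fromHex("0102030405060708AABBCCDDEEFF");
        System.out.println(bs.length);   // 14
    }
}
```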

>
>>> but that gets ugly because sometimes the bit pattern does fit and the
>>> cast is unnecessary.
>>>
>>> By being able to encode byte literals, code which is manipulating them
>>> becomes less cluttered.
>>>
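
To make the clutter concrete, here is a sketch of a byte table in current
Java. The class name and the array contents are invented for illustration.

```java
final class ByteLiteralClutter {
    // 0x00..0x7F fit in a byte, so the int constants are narrowed
    // implicitly. 0x80..0xFF are out of range (128..255 as ints), so
    // each of them needs an explicit (byte) cast.
    static final byte[] HEADER = { 0x01, 0x7F, (byte) 0x80, (byte) 0xFF };

    public static void main(String[] args) {
        for (byte b : HEADER) {
            // Signed value next to the raw bit pattern.
            System.out.printf("%4d  0x%02X%n", b, b & 0xFF);
        }
    }
}
```
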
>>> Here is a (slightly contrived for simplification) example showing how
>>> the sign extension can be confusing.
>>>
>>> int i = (byte)0xFF;
>>>
>>
>> If I really wanted that value, I would write 0xFFFFFFFF. If I wanted
>> 255, I would write 0xFF.
>>
>>
> The problem is not when you want the sign extension to happen; the
> problem is when it happens and you don't expect it, when what you
> really want (in the example) is for the int to have the value 0xFF.
> The example doesn't show it well, but the sign extension on widening
> can quite easily catch you out. I have been writing code doing
> low-level protocol byte munging stuff for years,

Me too, http://hg.openjdk.java.net/jdk7/tl/jdk/rev/7360321c37e3

Thanks
Max

> and I thought I understood it. It wasn't till I started
> investigating this area for Project Coin that I really understood it
> (although even that is a bit of an exaggeration). I thought I
> understood, but in hindsight I just knew a few magic spells that
> gave the right answers. I was constantly seeing surprises and common
> mistakes in code I and colleagues were writing. The Coin proposals
> are aimed at addressing the worst of these pitfalls.
>
> Bruce
>
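
One way this pitfall tends to show up in protocol code, as a sketch with
invented names: assembling an unsigned 16-bit big-endian value from two
bytes.

```java
final class ByteMunging {
    // Buggy: lo is sign-extended to int before the OR, so a low byte
    // of 0x80 or above sets all the upper bits of the result.
    static int readU16Buggy(byte hi, byte lo) {
        return (hi << 8) | lo;
    }

    // Masking each byte with 0xFF discards the sign extension first.
    static int readU16(byte hi, byte lo) {
        return ((hi & 0xFF) << 8) | (lo & 0xFF);
    }

    public static void main(String[] args) {
        byte hi = 0x12;
        byte lo = (byte) 0x80;
        System.out.println(readU16Buggy(hi, lo));   // -128
        System.out.println(readU16(hi, lo));        // 4736, i.e. 0x1280
    }
}
```
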
>> Thanks
>> Max
>>
>>
>