Thoughts on unified integer literal improvements

Tue Jun 30 00:00:27 PDT 2009

Bruce Chapman wrote:
> Weijun Wang wrote:
>> brucechapman at paradise.net.nz wrote:
>>   
>>> Quoting Weijun Wang <Weijun.Wang at Sun.COM>:
>>>
>>>     
>>>>> long ell = 0xFFFFFFFFu; // A positive long value
>>>>>         
>>>> Any particular requirement here? Why not simply
>>>>
>>>>  long ell = 0xFFFFFFFFl;
>>>>
>>>>       
>>>>> I think this approach has some advantages over the "y" suffix; in 
>>>>> particular I think it gives more desirable behavior in cases like
>>>>> this:
>>>>>
>>>>> byte b = 0xFFy // a negative byte value
>>>>> byte b = 0xFFu // also a negative byte value
>>>>>
>>>>> short s = 0xFFy // a negative short value, -1;
>>>>> // byte value is sign extended
>>>>> short s = 0xFFu // a positive short value, +255
>>>>>
>>>>> int i = 0xFFy // -1
>>>>> int i = 0xFFu // 255
>>>>>         
>>>> Does this mean the actual value of 0xFFu is determined by looking at
>>>> the LHS of the assignment? This is terrible.
>>>>       
>>> Yes that would be terrible if it were true, fortunately it is not. The value is
>>> determined by the digits and any radix specified by the prefix.  In the above
>>> examples, the "y" suffix means byte, so when it is widened to a short or int it
>>> sign extends and stays negative. The "u" suffix would create a special type that
>>> would always zero extend when widening giving a positive number.
>>>     
>>
>> I still find it difficult to understand this 'u' thing. When you say
>> "zero extend when widening", isn't that the same as what I said,
>> "depending on the LHS"?
>>   
> 
> No. The JLS defines a number of conversions, one of which is widening,
> and specifies when each one occurs. For example
> 
> byte b = 1;
> 
> int j = b+1;
> 
> In the second line the type of the expression "b+1" is not determined by
> the LHS. The conversion rules say that the expression b, of type byte,
> must be widened to an int before the addition, and the assignment
> happens after that.
> 
> 
> Similarly when we say
> 
> short s = 0xFFu;
> 
> What is happening is that the type of the expression on the right has
> some conversions applied before the assignment. Collectively those are
> called assignment conversion, and when 0xFFu has a type of byte, then
> the particular assignment conversion which is applied is the widening
> conversion.
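
A minimal sketch of the widening behaviour described above, in current Java. The "y" and "u" suffixes are only proposals in this thread and do not compile, so a cast stands in for the byte literal and a mask stands in for zero extension. The class and method names are made up for illustration.

```java
// Sketch only: 0xFFy and 0xFFu are proposed syntax, not valid Java.
class WideningSketch {
    // Widening a byte to int always sign extends, whatever the target variable is.
    static int signExtend(byte b) {
        return b;
    }

    // Zero extension has to be spelled out by hand today.
    static int zeroExtend(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;   // bit pattern 1111 1111, value -1
        short s = b;            // widening conversion: sign extends to -1
        int i = b;              // widening conversion: sign extends to -1
        int j = b + 1;          // b is promoted to int before the addition, so j is 0
        System.out.println(s + " " + i + " " + j);   // -1 -1 0
        System.out.println(zeroExtend(b));           // 255
    }
}
```

The point is that the result depends on the type of the right-hand expression, not on the variable being assigned to.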

I think I understand your point now. This 0xFFu is a brand new data type.

I don't know whether it can be introduced without new VM bytecodes.
Implementing it purely as syntactic sugar might cause unexpected trouble.

What does f(0xFFu) mean when there are both f(byte) and f(int)?
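
For comparison, here is a small sketch of how current Java picks between those two overloads for literals that exist today. How a hypothetical 0xFFu type would be resolved is not specified anywhere in this thread, so the sketch says nothing about that. The class name is made up.

```java
// Sketch only: shows overload choice for today's literals, not for 0xFFu.
class OverloadSketch {
    static String f(byte b) { return "f(byte)"; }
    static String f(int i)  { return "f(int)"; }

    public static void main(String[] args) {
        // An integer literal without a suffix has type int, so f(int) is chosen.
        System.out.println(f(0xFF));
        // The cast gives the argument type byte, so f(byte) is chosen.
        System.out.println(f((byte) 0xFF));
    }
}
```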

> 
> In practice many operations involving byte or short literals will see
> them widened to int first, which gets to your next question
>> What is the most important usage of it?
>>   
> byte b = 0xFF;
> 
> currently fails because 0xFF is an int and when we assign that to a
> byte, we find that the value (255) is out of range for byte (-128 to 127)
> and the compiler knows this and won't let us do it.  We could write
> 
> byte b = (byte)0xFF;

But you already have

    byte b = 0xFFy;

> 
> but that gets ugly because sometimes the bit pattern does fit and the
> cast is unnecessary.
> 
> By being able to encode byte literals, code which manipulates them
> becomes less cluttered.
> 
> Here is a (slightly contrived for simplification)  example showing how
> the sign extension can be confusing.
> 
> int i = (byte)0xFF;

If I really wanted that value, I would write 0xFFFFFFFF. If I wanted 255,
I would write 0xFF.
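
A quick sketch of the values behind these three spellings, using only current Java (the class and method names are made up):

```java
// Sketch only: compares (byte)0xFF, 0xFFFFFFFF and 0xFF as int values.
class CastSketch {
    // The cast narrows 255 to the byte -1, which is then widened back to int.
    static int viaByteCast() {
        return (byte) 0xFF;
    }

    public static void main(String[] args) {
        int i = viaByteCast();
        System.out.println(i);                 // -1
        System.out.println(i == 0xFFFFFFFF);   // true: same 32-bit pattern
        System.out.println(0xFF);              // 255
    }
}
```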

Thanks
Max

> 
> Now you might think that "i" has a value of 255, but in fact it has a value
> of -1, because the byte has a value of -1.   Joe's proposal was a means
> to prevent these non-obvious sign extension situations.
> 
> Bruce
> 
>>   
>>>> I'd rather use something like 0bXX which is itself always a byte
>>>> literal.
>>>>       
>>> But that is inconsistent: prefixes currently determine radix, suffixes determine
>>> type. Also, the binary literals part of the proposal uses 0b as a prefix for
>>> binary literals, so 0bXX is an int.
>>>     
>>
>> I see.
>>
>> Thanks
>> Max
>>
>>   
>>> Bruce
>>>
>>>     
>>>> Max
>>>>
>>>>       
>>>>> -Joe