Thoughts on unified integer literal improvements

Reinier Zwitserloot reinier at zwitserloot.com
Tue Jun 30 04:36:24 PDT 2009


I don't think Artur's issues are particularly tough nuts to crack (in  
that they have an easy and obvious right answer - see end of this  
post), but I do have some rather serious reservations about this byte  
array literal syntax:

1) Is this a context-sensitive literal? In other words, what would:

byte[] bs = 0x01y; do?

or:

byte b = 0x0102y?

or:

void method(byte b) {}
void method(byte[] bs) {}

method(0x001y);

do?

Because that's quite a step, and context-sensitive (e.g. depending on  
the LHS) literals have been mentioned before as iffy. I'm not as  
opposed to context sensitive literals as others on this list, but it  
is nevertheless an important distinction. So, presuming for a moment  
these AREN'T context-sensitive, then:

2) Isn't this rather inconsistent? 0x01y is a byte, but 0x0102y is a  
byte array. What if I want a 1-length byte array? Should byte literals
auto-cast themselves into array form on an as-needed basis
(autoboxing/unboxing is different from being context-sensitive, though
auto-conversion has its own troubles)? Is just a lone '0xy' a
zero-length byte array? What about 0 padding? Do those count? Is
0x000102y a size-2 or a size-3 byte array?
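
For comparison, here is a minimal sketch of how the existing array
initializer syntax answers each of those length questions without any
ambiguity. The class and field names are made up for illustration only.

```java
// Sketch: with today's array initializers the length is always explicit.
class ExplicitLengths {
    // Zero-length array: no elements listed at all.
    static final byte[] EMPTY = new byte[0];

    // One-length array: unmistakably an array, not a lone byte.
    static final byte[] ONE = new byte[] { 0x01 };

    // Leading zero is an element in its own right, so the length is 3.
    static final byte[] PADDED = new byte[] { 0x00, 0x01, 0x02 };

    public static void main(String[] args) {
        // prints "0 1 3"
        System.out.println(EMPTY.length + " " + ONE.length + " " + PADDED.length);
    }
}
```

Whatever the proposed literal ends up meaning, it would need equally
unsurprising answers for these three cases.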

3) Where does it end? Can I make int arrays? Will this be available if  
0b10111001 style binary literals become standard java? Should decimal  
literals pack into an array like this? The amount of slots required  
for a given decimal is non-obvious, so is that a good idea (certainly  
not if slot size determines if it's an array literal or a primitive!)?  
But if not, is this consistent?

4) Weren't we trying to move *AWAY* from arrays? Scala and other  
languages show that it is basically possible to eliminate primitives  
and smartly compile, via duplication for each primitive for example,  
code that is just as fast. Combine such a concept with reification,  
and byte[] is just an obsolete version of List<Byte>. This is a very  
long way away, but we're nevertheless stuck with the unfortunate  
situation that generics and arrays don't play nicely with each other.

5) Where does it end part 2: If byte arrays are 'special' enough that  
it warrants literals, can regexp expressions box to Pattern objects,  
or, better yet, can I make actual Pattern literals? Compiling regexes  
is pricey, and with regexp literals you could do some syntax checking  
on them, so it's not just 'saving some characters'. How about XML  
strings? Aren't map and list literals ***FAR*** more useful than byte  
array literals? Seriously - where does it end?
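
To make the regexp point concrete, this is roughly what we do today, as
a sketch using only java.util.regex (the class name and the isValid
helper are mine, not anything proposed): the compile cost is paid once
via a static final field, and a syntax error only shows up at run time
as a PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class PatternToday {
    // Compiled once when the class is initialized, not once per call.
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns true if the regex is syntactically valid. Today this can
    // only be discovered at run time; a Pattern literal could let the
    // compiler report the same problem.
    static boolean isValid(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }
}
```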

I'd love a proposal for some sort of library literal system, whereby,  
apt-processor-like, a certain format of literal results in a classpath  
check (via ServiceLoader) for some code that knows how to turn the  
stated literal into a primitive object; something java core knows,  
like an actual byte array literal of the type: new byte[] { 0x01,  
0x02, 0x03 }; for example.
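
A very rough sketch of the lookup half of that idea follows.
Everything here is hypothetical: LiteralHandler, its two methods and
LiteralLookup do not exist anywhere, and the lookup is written as plain
run-time code standing in for what a compiler would do at compile time.
Only ServiceLoader itself is real.

```java
import java.util.ServiceLoader;

// Hypothetical SPI: a library would implement this and register it in
// META-INF/services. Nothing like it exists in the JDK.
interface LiteralHandler {
    // Whether this handler understands the given kind of literal,
    // e.g. "hexbytes" or "regex" (names are assumptions).
    boolean handles(String kind);

    // Turns the raw literal text into a value java core already knows
    // how to embed, such as a byte[].
    Object expand(String literalText);
}

class LiteralLookup {
    // Asks every registered handler in turn; returns null when no
    // handler on the classpath claims this kind of literal.
    static Object expand(String kind, String literalText) {
        for (LiteralHandler handler : ServiceLoader.load(LiteralHandler.class)) {
            if (handler.handles(kind)) {
                return handler.expand(literalText);
            }
        }
        return null;
    }
}
```

With no handler registered, the lookup finds nothing, which is the
point at which a real compiler would presumably report an error.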

A complete proposal for such a system is well beyond coin's scope, and  
there are some significant issues you'd have to resolve (one involves  
escaping: If you can stuff any characters inside a literal definition,  
how does the java parser know where it ends? Imagine a regexp literal  
that is delimited by / as is common in other languages - how would the  
parser know to treat any commas, closing parens, even newlines and  
string quotes inside the slashes as NOT closing the literal?).



On 2009/30/06, at 13:05, abies at adres.pl wrote:
>>    byte[] bs = 0x0102030405060708AABBCCDDEEFFy;
>>
>
> Which direction it would be read? byte[0] is 0x01 or 0x0FF ?
>

EVERYTHING in java, where you have to choose between Big-Endian and  
Little-Endian, is Big-Endian. So, this one is easy and should not  
count against the proposal: byte[0] is 0x01.
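
One existing data point, as a small sketch (the class and method names
are just for illustration): java.nio.ByteBuffer defaults to big-endian,
so the most significant byte of an int lands at index 0.

```java
import java.nio.ByteBuffer;

class BigEndianDefault {
    // Writes an int using ByteBuffer's default byte order, which is
    // ByteOrder.BIG_ENDIAN, and returns the 4 backing bytes.
    static byte[] intToBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    public static void main(String[] args) {
        byte[] bs = intToBytes(0x01020304);
        // prints "1 4": most significant byte first
        System.out.println(bs[0] + " " + bs[3]);
    }
}
```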

> What with byte[] bs = 0xabcy; ? Would it be legal - and would it be  
> equivalent to 0x0abcy, 0xa0bcy, 0xabc0y ?

When in doubt, choose not legal. It's a lot easier for somebody to  
clarify by typing a single extra character (a 0) immediately as he  
writes it (provided a cool enough IDE that red-underlines the  
problem), than letting the user assume one way or another and spend an  
hour chasing bugs. Also not a strike against the proposal, presuming  
for a moment that you must have an even number of nibbles in this usage.
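
The same rules can be sketched as a plain library method, which is
about as close as you can get today. HexBytes.parse is a hypothetical
helper, not part of the proposal or the JDK; it assumes the text
arrives without the 0x prefix and y suffix, reads the first pair of
nibbles into index 0, and refuses an odd number of nibbles instead of
guessing where the padding goes.

```java
class HexBytes {
    // Hypothetical helper: "0102ff" becomes { 0x01, 0x02, (byte) 0xff }.
    static byte[] parse(String hex) {
        // When in doubt, not legal: "abc" could mean 0abc, a0bc or abc0.
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits: " + hex);
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit in: " + hex);
            }
            // The first pair of nibbles goes to index 0.
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

Under these assumptions "000102" is a size-3 array, since every pair of
nibbles counts, leading zeros included.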



>
>
> I understand that it would be kind of byte[] literal, so no  
> possibility of overwriting contents of existing byte array, short of  
> using
> System.arraycopy(0x0102030405060708AABBCCDDEEFFy,0,bs,0,14);
> ?
>
>
> Regards,
> Artur
>



