Indexing access for Lists and Maps considered harmful?

Reinier Zwitserloot reinier at zwitserloot.com
Wed Jun 24 07:45:52 PDT 2009


I am beginning to see some consensus on this list that the Neal
semantics are the preferred semantics. That is:

FOO = (a[b] = c);

should be equal to:

a[b] = c;
FOO = a[b];

and thus, both a setter and a getter call.

However, I will again claim that this semantic just plain will not fit  
within the confines of project coin, even if you take the drastic  
measure of hardcoding the setter mechanic onto java.util.List. Even  
with drastic measures that haven't passed in coin before, such as  
extension methods, new keywords, or scala-esque implicit conversions,  
this semantic is not possible in project coin, or at least - is not
going to be consistent, for this reason:

A) Expressions that are being used as statements in java are not  
changed - any side effects of the notion that the expression returns a  
value aren't eliminated just because you don't use the return value  
for anything. In other words, the correct semantics for:

FOO = BAR;

which is an expression that may be legally used as a statement, is to  
first calculate the value of BAR, then assign it to FOO, then  
calculate the value of FOO, and toss it. Fortunately, in all legal  
assignment statements that exist in java today, calculating the value  
of the LHS of an assignment is side-effect free, and therefore the  
compiler itself already optimizes this out, and doesn't shove FOO on  
the stack just to remove it again without doing anything with it.

B) The only way to get to Neal's semantics for SetIndex behaviour is
to run a subsequent get operation immediately after the set. In other
words, this:

FOO = list[idx] = v;

should desugar to:

list.set(idx, v);
FOO = list.get(idx);

As so many on this list have said, this is very very important,  
because consistency is some sort of holy grail.

So let's roll with this consistency argument, and be consistent. I  
thus posit that the only consistent conclusion is:

C) As expressions-turned-into-statements do not stop side-effects, the  
expression:

list[idx] = v;

should desugar not to the obvious:

list.set(idx, v);

Because that would be inconsistent. It should desugar to:

list.set(idx, v);
list.get(idx);

That last statement is NOT side effect free! Who knows what I stuff
into my class's get() method. It's an interface, I can do what I
please! Log something, print something, start up a GUI - make a
network connection - it doesn't matter. As a practical matter, running  
a useless get() operation is quite a performance hit, and for the few  
scenarios where people did write side effects into their get method,  
while it is technically consistent, I bet it's going to cause no end  
of surprises. If you step away from the consistency and technical  
semantics for a second, "list[idx] = v" calling the getter right after  
the setter is nuts. Completely unexpected.
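
To make the cost of that extra call concrete, here is a minimal sketch in today's Java. The names GetterSideEffectDemo, CountingList, setOnly and setThenGet are made up for illustration and are not part of any proposal; the two helper methods simply spell out the two candidate desugarings of "list[idx] = v;" by hand, and the side effect is just a call counter.

```java
import java.util.ArrayList;
import java.util.List;

public class GetterSideEffectDemo {
    /** Hypothetical list whose get() has an observable side effect: it counts calls. */
    static class CountingList<E> extends ArrayList<E> {
        int getCalls = 0;

        @Override
        public E get(int index) {
            getCalls++;
            return super.get(index);
        }
    }

    /** "list[idx] = v;" desugared to the setter alone. */
    static <E> void setOnly(List<E> list, int idx, E v) {
        list.set(idx, v);
    }

    /** "list[idx] = v;" desugared to set-then-get, with the value thrown away. */
    static <E> void setThenGet(List<E> list, int idx, E v) {
        list.set(idx, v);
        list.get(idx); // result is discarded, but the call still runs
    }

    public static void main(String[] args) {
        CountingList<String> list = new CountingList<String>();
        list.add("old");

        setOnly(list, 0, "a");
        System.out.println("get() calls after set only: " + list.getCalls);

        setThenGet(list, 0, "b");
        System.out.println("get() calls after set-then-get: " + list.getCalls);
    }
}
```

The counter stays at 0 after setOnly and goes to 1 after setThenGet, so the discarded get() is visible to any class that cares to observe it.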


So, to put it in layman's terms: Consistency is an unattainable pipe
dream - akin to appreciation for a painting, not a quantifiable
measure. Please move away from it. I think putting yourself into the
mind of Joe Average Java is a far more fruitful exercise, and, if
needed - quantifiable (go out, find some Joe Averages, and ask them -
or just use your common sense. Puzzlers are defined as things which
surprise Joe Average. Nothing would be a 'puzzler' if everyone knew
chapter and verse of the JLS, and yet we all just know a thing is a
puzzler when we see one).

So, let's be Joe Average. Imagine he sees:

void example() {
     Map<Integer, Integer> map = new HashMap<Integer, Integer>() {
         // Map.get takes Object; get(Integer) would only overload it.
         @Override public Integer get(Object key) {
             System.out.println("FOO!");
             return super.get(key);
         }
     };
     map.put(10, 0);
     // Proposed index-assignment syntax; not legal Java today.
     callSomeOverloadedMethod(map[10] = 1);
}

void callSomeOverloadedMethod(Object x) {
     System.out.println("BAR: " + x);
}

void callSomeOverloadedMethod(int x) {
     System.out.println("BAZ: " + x);
}

in a code base somewhere, and think of what he'd think is going to  
happen.

Conclusion: Any of 4 incompatible choices, at least:

A) BAZ: 1 (assignment evaluates to evaluation of RHS)

B) BAR: 1 (assignment evaluates to post-capture evaluation of RHS)

C) FOO! BAR: 1 (assignment evaluates to LHS - e.g. call the getter)

D) BAR: 0 (assignment desugars to map.put(), which returns the value  
that previously occupied the slot of the key)
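
Since map[10] = 1 does not compile today, the four readings can only be checked by writing each desugaring out by hand. The sketch below does that; the class name AssignmentReadings and the methods readingA through readingD are made up for illustration, and output goes to a StringBuilder instead of System.out so the four results can be compared side by side.

```java
import java.util.HashMap;
import java.util.Map;

public class AssignmentReadings {
    // Collected here instead of printed, so each reading yields one string.
    static final StringBuilder log = new StringBuilder();

    static void callSomeOverloadedMethod(Object x) { log.append("BAR: ").append(x); }
    static void callSomeOverloadedMethod(int x)    { log.append("BAZ: ").append(x); }

    /** The map from the example: key 10 holds 0, and get() announces itself. */
    static Map<Integer, Integer> freshMap() {
        log.setLength(0);
        Map<Integer, Integer> map = new HashMap<Integer, Integer>() {
            @Override public Integer get(Object key) {
                log.append("FOO! ");
                return super.get(key);
            }
        };
        map.put(10, 0);
        return map;
    }

    /** A) The assignment evaluates to the RHS as written (an int literal). */
    static String readingA() {
        Map<Integer, Integer> map = freshMap();
        map.put(10, 1);
        callSomeOverloadedMethod(1);
        return log.toString(); // "BAZ: 1"
    }

    /** B) The assignment evaluates to the RHS after boxing to the slot type. */
    static String readingB() {
        Map<Integer, Integer> map = freshMap();
        Integer boxed = 1;
        map.put(10, boxed);
        callSomeOverloadedMethod(boxed);
        return log.toString(); // "BAR: 1"
    }

    /** C) The assignment evaluates to the LHS: set, then call the getter. */
    static String readingC() {
        Map<Integer, Integer> map = freshMap();
        map.put(10, 1);
        callSomeOverloadedMethod(map.get(10));
        return log.toString(); // "FOO! BAR: 1"
    }

    /** D) The assignment is just map.put(), which returns the old value. */
    static String readingD() {
        Map<Integer, Integer> map = freshMap();
        callSomeOverloadedMethod(map.put(10, 1));
        return log.toString(); // "BAR: 0"
    }

    public static void main(String[] args) {
        System.out.println("A) " + readingA());
        System.out.println("B) " + readingB());
        System.out.println("C) " + readingC());
        System.out.println("D) " + readingD());
    }
}
```

Each method is ordinary Java with well-defined behaviour; the disagreement is only over which of them the sugar should stand for.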


We can talk about what's the most consistent choice till the cows come  
home, and we can even pick a winner (which so happens to appear to be  
the most complicated one, which is C - just to make things worse), but  
in the end Joe Average Java just is so not going to get that.

The only winning answer is not to choose at all, and return void. That
way, the 25% of people who would have guessed right are inconvenienced
all of the 20 times it sees usage in a multi-man-year project, and the
remaining 75% don't spend an hour or two chasing down why it doesn't
work. I take it as a given that if you fail to guess the right semantic
for 'list[a] = b', the resulting bug is NOT all that obvious, and
certainly does not clearly tell you right as you write/compile your
code. It'll eventually show up as a runtime issue, probably not even
an exception, just errant behaviour in your program. No stack trace to  
help you, the problem appearing miles away in both time and distance  
from the offending code line. In other words, in a bad scenario, many  
many hours worth of bug hunting.


And all this trouble for Joe Average just because we're all
chasing the mythical 'consistency' white whale that is clearly in the
eye of the beholder, and not a quantifiable entity.


--Reinier Zwitserloot




More information about the coin-dev mailing list