Indexing access for Lists and Maps considered harmful?
Reinier Zwitserloot
reinier at zwitserloot.com
Wed Jun 24 07:45:52 PDT 2009
I am beginning to see some consensus on this list that the Neal
semantics are the most preferred semantics. That is:
FOO = (a[b] = c);
should be equal to:
a[b] = c;
FOO = a[b];
and thus, both a setter and a getter call.
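For reference, plain arrays in Java already behave this way: the value of an assignment expression is the value held by the array slot after the store, not the raw right-hand side. A minimal sketch of that, where the class and method names are made up for illustration:

```java
// Sketch: what an assignment expression evaluates to for plain arrays today.
class ArrayAssignmentValue {
    // Returns the value of the expression (a[0] = rhs) where a is a float[].
    static double storeAndRead(int rhs) {
        float[] a = new float[1];
        // The int is converted to float when stored; the expression's value
        // is that stored float, which is then widened to double.
        double foo = (a[0] = rhs);
        return foo;
    }

    public static void main(String[] args) {
        // 16777217 (2^24 + 1) has no exact float representation, so the
        // stored value differs from the right-hand side.
        System.out.println(storeAndRead(16777217) == 16777217.0); // false
        System.out.println(storeAndRead(16777217) == 16777216.0); // true
    }
}
```

With arrays, reading the slot back is free of side effects, so nobody can tell whether a "getter" ran. That stops being true once the read is a method call.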
However, I will again claim that these semantics just plain will not
fit within the confines of Project Coin, even if you take the drastic
measure of hardcoding the setter mechanic onto java.util.List. Even
with drastic measures that haven't passed in Coin before, such as
extension methods, new keywords, or Scala-esque implicit conversions,
these semantics are not possible in Project Coin, or at least will not
be consistent, for this reason:
A) Expressions that are being used as statements in Java are not
changed - any side effects of the notion that the expression returns a
value aren't eliminated just because you don't use the return value
for anything. In other words, the correct semantics for:
FOO = BAR;
which is an expression that may be legally used as a statement, is to
first calculate the value of BAR, then assign it to FOO, then
calculate the value of FOO, and toss it. Fortunately, in all legal
assignment statements that exist in Java today, calculating the value
of the LHS of an assignment is side-effect free, and therefore the
compiler itself already optimizes this out, and doesn't shove FOO on
the stack just to remove it again without doing anything with it.
B) The only way to get to Neal's semantics for SetIndex behaviour is
to run a subsequent get operation immediately after the set. In other
words, this:
FOO = list[idx] = v;
should desugar to:
list.set(idx, v);
FOO = list.get(idx);
As so many on this list have said, this is very very important,
because consistency is some sort of holy grail.
So let's roll with this consistency argument, and be consistent. I
thus posit that the only consistent conclusion is:
C) As expressions-turned-into-statements do not stop side-effects, the
expression:
list[idx] = v;
should desugar not to the obvious:
list.set(idx, v);
Because that would be inconsistent. It should desugar to:
list.set(idx, v);
list.get(idx);
That last statement is NOT side effect free! Who knows what I stuff
into my class's get() method. It's an interface, I can do what I
please! Log something, print something, start up a GUI - make a
network connection - it doesn't matter. As a practical matter, running
a useless get() operation is quite a performance hit, and for the few
scenarios where people did write side effects into their get method,
while it is technically consistent, I bet it's going to cause no end
of surprises. If you step away from the consistency and technical
semantics for a second, "list[idx] = v" calling the getter right after
the setter is nuts. Completely unexpected.
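To make the difference observable, here is a minimal sketch of the two desugarings side by side. CountingList, setOnly and setThenGet are hypothetical names invented for this illustration; the counter stands in for whatever side effect a real get() might have.

```java
import java.util.ArrayList;
import java.util.List;

class DesugarSketch {
    // Hypothetical list whose get() has an observable side effect:
    // it counts how often it is called.
    static class CountingList<E> extends ArrayList<E> {
        int getCalls = 0;

        @Override
        public E get(int index) {
            getCalls++;
            return super.get(index);
        }
    }

    // The "obvious" desugaring of the statement list[idx] = v;
    static <E> void setOnly(List<E> list, int idx, E v) {
        list.set(idx, v);
    }

    // The set-then-get desugaring: store, then read the slot back.
    static <E> E setThenGet(List<E> list, int idx, E v) {
        list.set(idx, v);
        return list.get(idx);
    }

    public static void main(String[] args) {
        CountingList<String> list = new CountingList<String>();
        list.add("old");

        setOnly(list, 0, "a");
        System.out.println("get() calls after set only: " + list.getCalls);

        setThenGet(list, 0, "b"); // result discarded, as in a statement
        System.out.println("get() calls after set + get: " + list.getCalls);
    }
}
```

The second call still runs get() even though its result is thrown away, which is exactly the extra work the statement form would pay for.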
So, to put it in layman's terms: Consistency is an unattainable pipe
dream - akin to appreciation for a painting, not a quantifiable
measure. Please move away from it. I think putting yourself into the
mind of Joe Average Java is a far more fruitful exercise, and, if
needed - quantifiable (go out, find some Joe Averages, and ask them -
or just use your common sense. Puzzlers are defined as things which
surprise Joe Average. No puzzler would be a 'puzzler' if everyone knew
chapter and verse of the JLS, and yet we all just know a thing is a
puzzler when we see one).
So, let's be Joe Average. Imagine he sees:
import java.util.HashMap;
import java.util.Map;

void example() {
    Map<Integer, Integer> map = new HashMap<Integer, Integer>() {
        // Map.get takes an Object key; get(Integer) would only overload it.
        @Override public Integer get(Object key) {
            System.out.println("FOO!");
            return super.get(key);
        }
    };
    map.put(10, 0);
    callSomeOverloadedMethod(map[10] = 1);  // the proposed indexing syntax
}

void callSomeOverloadedMethod(Object x) {
    System.out.println("BAR: " + x);
}

void callSomeOverloadedMethod(int x) {
    System.out.println("BAZ: " + x);
}
in a code base somewhere, and think of what he'd think is going to
happen.
Conclusion: Any of 4 incompatible choices, at least:
A) BAZ: 1 (assignment evaluates to evaluation of RHS)
B) BAR: 1 (assignment evaluates to post-capture evaluation of RHS)
C) FOO! BAR: 1 (assignment evaluates to LHS - e.g. call the getter)
D) BAR: 0 (assignment desugars to map.put(), which returns the value
that previously occupied the slot of the key)
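Each of the four readings can be spelled out in Java as it exists today, without the proposed syntax. The following is a sketch only: FourReadings, loudMap and the readingA..readingD methods are names invented here, and each method hand-writes one possible desugaring of `callSomeOverloadedMethod(map[10] = 1)`.

```java
import java.util.HashMap;
import java.util.Map;

class FourReadings {
    static String call(Object x) { return "BAR: " + x; }
    static String call(int x)    { return "BAZ: " + x; }

    // Map holding 10 -> 0 whose get() records "FOO! " in the given log.
    static Map<Integer, Integer> loudMap(final StringBuilder log) {
        Map<Integer, Integer> map = new HashMap<Integer, Integer>() {
            @Override public Integer get(Object key) {
                log.append("FOO! ");
                return super.get(key);
            }
        };
        map.put(10, 0);
        return map;
    }

    // A) the expression is the raw RHS, an int literal
    static String readingA() {
        StringBuilder log = new StringBuilder();
        Map<Integer, Integer> map = loudMap(log);
        map.put(10, 1);
        String out = call(1);
        return log + out;
    }

    // B) the expression is the RHS after boxing to the map's value type
    static String readingB() {
        StringBuilder log = new StringBuilder();
        Map<Integer, Integer> map = loudMap(log);
        Integer boxed = 1;
        map.put(10, boxed);
        String out = call(boxed);
        return log + out;
    }

    // C) the expression is the LHS, read back through the getter
    static String readingC() {
        StringBuilder log = new StringBuilder();
        Map<Integer, Integer> map = loudMap(log);
        map.put(10, 1);
        String out = call(map.get(10));
        return log + out;
    }

    // D) the expression is whatever put() returns: the previous value
    static String readingD() {
        StringBuilder log = new StringBuilder();
        Map<Integer, Integer> map = loudMap(log);
        String out = call(map.put(10, 1));
        return log + out;
    }

    public static void main(String[] args) {
        System.out.println(readingA()); // BAZ: 1
        System.out.println(readingB()); // BAR: 1
        System.out.println(readingC()); // FOO! BAR: 1
        System.out.println(readingD()); // BAR: 0
    }
}
```

Readings B, C and D all pick the Object overload because an Integer argument matches it by subtyping before unboxing is even considered; only the int literal in A reaches the int overload.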
We can talk about what's the most consistent choice till the cows come
home, and we can even pick a winner (which so happens to appear to be
the most complicated one, which is C - just to make things worse), but
in the end Joe Average Java just is so not going to get that.
The only winning answer is not to choose at all, and return void. That
way, the 25% of people who guess right are inconvenienced each of the
20 times it sees usage in a multi-man-year project, and the remaining
75% don't spend an hour or two chasing down why it doesn't work. I
take it as a given that if you fail to guess the right semantic for
'list[a] = b', the resulting bug is NOT all that obvious, and
certainly does not announce itself right as you write/compile your
code. It'll eventually show up as a runtime issue, probably not even
an exception, just errant behaviour in your program. No stack trace to
help you, the problem appearing miles away in both time and distance
from the offending code line. In other words, in a bad scenario, many
many hours worth of bug hunting.
And all this trouble for Joe Average, just because we're all
chasing the mythical 'consistency' white whale that is clearly in the
eye of the beholder, and not a quantifiable entity.
--Reinier Zwitserloot