A simple solution for [] access that can trivially be added to List and Map.
Reinier Zwitserloot
reinier at zwitserloot.com
Thu Jun 25 12:46:07 PDT 2009
Step 1: "Allow returning anything when overriding or hiding a method
that returns void".
This change involves rule 8.4.8.3 of JLS 3. Rule 8.4.8.3 allows you
to override a method with e.g. signature "Number foo()" with a method
that returns Integer. The clue here is that Integers are also Numbers
(because Integer is a subtype of Number).
We expand rule 8.4.8.3 to add:
If a method declaration d1 overrides or hides another method d2,
where d2 has return type 'void', then d1 may return any type,
including primitives and void.
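As a rough sketch of what the expanded rule would and would not change: the first override below is already legal under 8.4.8.3, and the commented-out one is what the new clause would permit. Counter, IntCounter and Step1Demo are hypothetical names used only for illustration; the proposed form stays in a comment because current compilers reject it.

```java
// Hypothetical classes for illustration; none of these names come from the proposal.
class Counter {
    Number current() { return 0; }   // returns a supertype
    void reset() { }                 // returns void
}

class IntCounter extends Counter {
    private int value = 41;

    // Legal today under JLS3 8.4.8.3: Integer is a subtype of Number.
    @Override
    Integer current() { return value; }

    // Legal today: a void method overridden by a void method.
    @Override
    void reset() { value = 0; }

    // What the expanded rule would allow (rejected by current compilers):
    //
    //   @Override
    //   int reset() { int old = value; value = 0; return old; }
}

class Step1Demo {
    static int run() {
        IntCounter c = new IntCounter();
        Counter viewedAsParent = c;
        viewedAsParent.reset();   // this caller only knows about 'void reset()'
        return c.current();       // covariant return: Integer, unboxed to int
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A caller holding a Counter reference never sees the tighter return type, which is why the relaxation should not affect existing call sites.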
Step 2: "Create SetIndex, GetIndex, SetWithKey, and GetWithKey
interfaces".
We create the java.lang.operators package, which will hold interfaces
that let types (re)define the meaning of operators employed on
instances of their type. The java.lang namespace is 'special', in that
all source files import it implicitly, so adding new items to it
should not be undertaken lightly. Yet, the java.lang.operators package
clearly shows that it is a fundamental aspect of the java language by
being a sub-package of java.lang; see also
java.lang.annotation.Annotation which is similarly 'magic' (in that
any public @interface silently implements that interface).
Inside this package we add the following 4 interfaces:
public interface SetIndex<V> {
    public void set(int idx, V value);
}
public interface GetIndex<V> {
    public V get(int idx);
}
public interface SetWithKey<K, V> {
    public void put(K key, V value);
}
public interface GetWithKey<K, V> {
    public V get(K key);
}
java.util.Map will then add 'GetWithKey<K, V>, SetWithKey<K, V>' to
its extends list (Map is itself an interface).
java.util.List will then add 'GetIndex<E>, SetIndex<E>' to its
extends list.
These additions are backwards and migration compatible thanks to step
1, which legally allows the existing set/put methods of Map and List
to return something even though set/put return void in the
SetIndex/SetWithKey interfaces. All existing class files that
implement List and/or Map will seamlessly support the indexing
operations, and any existing source code that implements Map or List,
or a child interface/class thereof, continues to compile to the same
semantic meaning without introducing any (new) errors or warnings.
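A minimal sketch of a type written directly against two of these interfaces may help. The interfaces are declared locally as stand-ins, since java.lang.operators does not exist in any released JDK, and FixedSlots and Step2Demo are hypothetical names. Because set() returns void here, the class compiles under today's rules; List.set, which returns the previous element, is the case that needs step 1.

```java
// Local stand-ins for two of the proposed java.lang.operators interfaces.
interface GetIndex<V> { V get(int idx); }
interface SetIndex<V> { void set(int idx, V value); }

// Hypothetical fixed-size container implementing the interfaces directly.
class FixedSlots<V> implements GetIndex<V>, SetIndex<V> {
    private final Object[] slots;

    FixedSlots(int size) { slots = new Object[size]; }

    @Override
    @SuppressWarnings("unchecked")
    public V get(int idx) { return (V) slots[idx]; }

    @Override
    public void set(int idx, V value) { slots[idx] = value; }
}

class Step2Demo {
    static String run() {
        FixedSlots<String> s = new FixedSlots<>(2);
        s.set(0, "hello");   // what 's[0] = "hello"' would be translated to
        return s.get(0);     // what 's[0]' would be translated to
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```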
Step 3: "Define how operator overloading interacts with the language".
(The method calls in this section are given as how they are
represented in the class file format; that is, the methods are fully
typed).
We now expand the meaning of:
foo[bar]
Its meaning depends on context: on the LHS of any assignment operation
('=', '+=', '|=', etcetera) the context is 'set'; anywhere else the
context is 'get'.
'set' context:
foo[bar] = RHS is translated to 2 potential method calls:
(ONLY if foo is assignable to SetIndex)
method: java/lang/operators/SetIndex.set
receiver: foo
args: bar, RHS
and:
(ONLY if foo is assignable to SetWithKey)
method: java/lang/operators/SetWithKey.put
receiver: foo
args: bar, RHS
In the event foo implements *BOTH* SetIndex and SetWithKey, resolution
proceeds as if both methods had the same name; there is no method
selection ambiguity because SetWithKey's first argument is
necessarily a class type, whereas SetIndex's first arg is a
primitive, int. In other words, resolution would occur analogously to
how resolution occurs in the following class:
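class Example<T> {
    void foo(int x) {}   // plays the role of SetIndex.set
    void foo(T x) {}     // plays the role of SetWithKey.put
    public static void main(String[] args) {
        Example<Number> ex = new Example<Number>();
        ex.foo(10);                   // int argument: selects foo(int)
        ex.foo(Integer.valueOf(10));  // Integer argument: selects foo(T)
    }
}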
The specifics for how such calls are resolved can be found in the
JLS3, for example section 5.1.8, "Unboxing conversion" (as the only
possible confusion here is when an Integer object may or may not unbox
to int to call set, or an int primitive boxes to Integer to call put).
'get' context:
Analogous to the 'set' context, except translates to GetIndex.get and
GetWithKey.get.
See Appendix B for a defense of why 4 methods are needed and not 2.
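The translation can be sketched by hand for a type that implements all four interfaces. Again the interfaces are local stand-ins, and Both and Step3Demo are hypothetical names. In 'set' context the sketch picks set or put manually based on the static type of the index, which is the choice the compiler would make under this proposal; in 'get' context both candidates really are named get, so ordinary overload resolution does the work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Local stand-ins for the four proposed interfaces.
interface SetIndex<V> { void set(int idx, V value); }
interface GetIndex<V> { V get(int idx); }
interface SetWithKey<K, V> { void put(K key, V value); }
interface GetWithKey<K, V> { V get(K key); }

// Hypothetical type that is both index-addressable and key-addressable.
class Both implements SetIndex<String>, GetIndex<String>,
        SetWithKey<Integer, String>, GetWithKey<Integer, String> {
    private final List<String> byIndex = new ArrayList<>(Arrays.asList("i0", "i1"));
    private final Map<Integer, String> byKey = new HashMap<>();

    @Override public void set(int idx, String value) { byIndex.set(idx, value); }
    @Override public String get(int idx) { return byIndex.get(idx); }
    @Override public void put(Integer key, String value) { byKey.put(key, value); }
    @Override public String get(Integer key) { return byKey.get(key); }
}

class Step3Demo {
    static String run() {
        Both foo = new Both();

        // foo[1] = "a": the index is an int, so the SetIndex form applies.
        foo.set(1, "a");

        // foo[Integer.valueOf(1)] = "b": the index is an Integer, so SetWithKey applies.
        foo.put(Integer.valueOf(1), "b");

        // foo[1] and foo[Integer.valueOf(1)] in 'get' context:
        // get(int) is chosen for the int, get(Integer) for the Integer.
        return foo.get(1) + "," + foo.get(Integer.valueOf(1));
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints "a,b"
    }
}
```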
APPENDIX A: Defense of allowing the return of any value when
overriding methods of void.
Currently, rule 8.4.8.3 is implemented by javac by producing 2
methods: The actual defined method with the tightened return type, and
a synthetic wrapper method with the original signature, that simply
forwards the call to the real method. This way, any callers that don't
know about the return type tightening still end up with the right
result. See the JVM spec section on 'method descriptors' on how the
JVM resolves method invocations, which shows why rule 8.4.8.3, which
was introduced in java 1.5, is backwards and migration compatible.
The exact same strategy can be implemented in order to let a subclass
provide a return type on a method when the parent class returns void:
The actual method with the return type is put in the class file as
normal, then a synthetic wrapper method with return type void is also
added, which simply calls the real method. The only difference is that
this method just calls the real method, and returns nothing, whereas
the current 8.4.8.3 rule calls the real method and returns what it
returns.
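The existing wrapper can be observed through reflection; javac marks it as a bridge method. The sketch below, with hypothetical classes Base, Derived and BridgeDemo, counts the bridge methods generated for a covariant override. It only shows today's 8.4.8.3 behaviour; the void-returning wrapper proposed here cannot be demonstrated on a current compiler.

```java
import java.lang.reflect.Method;

class Base {
    Number value() { return 1; }
}

class Derived extends Base {
    @Override
    Integer value() { return 2; }   // covariant return, legal under 8.4.8.3
}

class BridgeDemo {
    // Counts compiler-generated bridge methods named 'value' declared on Derived.
    static int countBridges() {
        int bridges = 0;
        for (Method m : Derived.class.getDeclaredMethods()) {
            if (m.getName().equals("value") && m.isBridge() && m.isSynthetic()) {
                bridges++;
            }
        }
        return bridges;
    }

    // Calls through the parent type; dispatch still reaches the real method.
    static int callThroughBase() {
        Base b = new Derived();
        return b.value().intValue();
    }

    public static void main(String[] args) {
        for (Method m : Derived.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName() + " " + m.getName()
                    + "() bridge=" + m.isBridge());
        }
    }
}
```

Under the proposal the wrapper for a void parent method would have the same shape, except that it would discard the result instead of returning it.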
The 'void' type is effectively a supertype of everything: After all,
'being a subtype of' implies that any instances of that type can be
substituted for the supertype. Clearly any object or primitive can be
'substituted' for the void type: Simply bitbucket the primitive or
object.
In this sense, rule 8.4.8.3's current exclusion of the 'void' mechanic
is in fact an inconsistency in the java language spec.
Entirely analogous to 8.4.8.3, the purpose of this relaxation of the
rules is to increase the flexibility of API design. We let APIs
tighten the return type in order to have each level in a hierarchy of
types have the most appropriate semantics. The void case is no
different.
Lastly, there is no existing code, compiled or in source form, that is
legal now that would not be legal after this change, or that would
change semantics. After all, any method that overrides a 'returns
void' method must currently have the 'void' return type, or it
wouldn't compile.
APPENDIX B: Why 4 types and not 2?
The pragmatic reason of fitting both Map's put and List's set method
is the most important, but the operation of putting an entry in a
mapping and accessing the index of a list-like construct are
semantically not the same and should therefore not be lumped into a
single interface. Consider for example that 'List.set' is documented
to throw an IndexOutOfBoundsException, which seems to be a good fit
even for 'SetIndex', whereas it makes absolutely no sense for a put
operation.
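The difference is easy to see with the existing collections. The following sketch (IndexVersusKeyDemo is a hypothetical name) writes to position 3 of an empty ArrayList and to key 3 of an empty HashMap: the first fails, the second simply creates the entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IndexVersusKeyDemo {
    // List.set on an index that does not exist is an error.
    static boolean listSetRejectsUnknownIndex() {
        List<String> list = new ArrayList<>();
        try {
            list.set(3, "x");
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    // Map.put on a key that does not exist creates the mapping.
    static boolean mapPutAcceptsUnknownKey() {
        Map<Integer, String> map = new HashMap<>();
        map.put(3, "x");
        return "x".equals(map.get(3));
    }

    public static void main(String[] args) {
        System.out.println(listSetRejectsUnknownIndex());
        System.out.println(mapPutAcceptsUnknownKey());
    }
}
```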
Semantically speaking then, foo[x] where foo is a Map, and foo[x]
where foo is a List, just aren't the same operation. They merely have
a similar feel. This situation is not unprecedented; in java, the +
operation can mean either string concatenation or numeric addition.
While they feel somewhat similar and both operations are routinely
represented with the '+' symbol in many languages, the operations
aren't the same. String concatenation isn't commutative, whereas
numeric addition is!
The third reason is that no ambiguity can arise; the compiler will
always know which form is intended (WithKey, or Index). Because
indexed operations work with primitives, and map key storage works
with objects, any given 'foo[bar]' style assignment or retrieval
operation can always be resolved to be either SetIndex/GetIndex or
SetWithKey/GetWithKey, even in the unlikely event that 'foo'
implements both interfaces.
MAJOR DISADVANTAGE:
The notion that assignments, 'for consistency's sake', should
themselves be an expression that returns a value, even though there's
considerable confusion as to what ought to be returned, seems to be an
obvious candidate for a disadvantage of this proposal, as this
proposal returns 'void' for assignment operators.
The author believes this is in fact desired behaviour: *because* of
the confusion around what such an assignment ought to return, it is
expected that the nature of the returned value will often be
misunderstood. By returning nothing, any misunderstanding is caught
early with a clear error message, instead of late, without an error
message, and without even a stack trace to aid in debugging.
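The confusion can be illustrated with code that is legal today. In the sketch below (AssignmentValueDemo is a hypothetical name) an array assignment expression evaluates to the newly assigned value, whereas List.set and Map.put both return the value that was previously stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AssignmentValueDemo {
    static String run() {
        int[] arr = {1};
        List<Integer> list = new ArrayList<>(Arrays.asList(1));
        Map<String, Integer> map = new HashMap<>();
        map.put("k", 1);

        int fromArray = (arr[0] = 5);        // the new value
        Integer fromList = list.set(0, 5);   // the previous element
        Integer fromMap = map.put("k", 5);   // the previous value

        return fromArray + "," + fromList + "," + fromMap;
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints "5,1,1"
    }
}
```

With a void result, an expression such as 'x = (foo[bar] = y)' would be rejected at compile time instead of silently picking one of these meanings.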
This proposal does not aim to make list access and array access
consistent. In order to do this, list access must also be retrofitted
so that:
foo.length
where foo is a list, actually calls the size() method, which seems to
be far beyond Project Coin's stated goals. Some sort of 'autoboxing'
of lists into arrays and vice versa when calling methods that expect
lists or arrays is also required to make them consistent, as well as
reification of generics, AND fixing array's different (more
permissive) behaviour for covariance. Clearly such a goal is
utterly unattainable within the confines of Project Coin, and therefore
this proposal does not take such consistency into consideration.
If the disadvantage of returning nothing is deemed too great, an
alternative can be posited: The SetWithKey and SetIndex interfaces do
not change, but the desugaring of foo[bar] does change:
foo[bar] = RHS
will first retrofit RHS so that it 'fits' into foo (e.g. apply
autoboxing or unboxing), then this converted value is passed on to the
set/put method, and is duplicated on the stack as well for further
consumption. This solution has the exact same effect as Neal Gafter's
suggestion to desugar '[]' to a static method which you must then
import, along with the following static method in java.util.Collections:
public static <K,V> V operator_index_put(Map<K,V> map, K key, V value) {
    map.put(key, value);
    return value;
}
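As a usage sketch of that static method: the holder class IndexOperators and the demo class AlternativeDemo are hypothetical (the method is placed in a local class because it is not part of java.util.Collections), and the nested call spells out by hand what a chained 'a["k"] = b["k"] = "v"' would mean under this alternative.

```java
import java.util.HashMap;
import java.util.Map;

class IndexOperators {
    // Method body as given in the text above; hosted in a local class for this sketch.
    public static <K, V> V operator_index_put(Map<K, V> map, K key, V value) {
        map.put(key, value);
        return value;
    }
}

class AlternativeDemo {
    static String run() {
        Map<String, String> a = new HashMap<>();
        Map<String, String> b = new HashMap<>();

        // a["k"] = b["k"] = "v", written out as nested calls.
        String result = IndexOperators.operator_index_put(a, "k",
                IndexOperators.operator_index_put(b, "k", "v"));

        return result + a.get("k") + b.get("k");
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints "vvv"
    }
}
```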
--Reinier Zwitserloot