A simple solution for [] access that can trivially be added to List and Map.

Reinier Zwitserloot reinier at zwitserloot.com
Thu Jun 25 12:46:07 PDT 2009


Step 1: "Allow returning anything when overriding or hiding a method  
that returns void".

This change involves rule 8.4.8.3 of JLS 3. Rule 8.4.8.3 allows you
to override a method with e.g. signature "Number foo()" with a method
that returns Integer. The key here is that Integers are also Numbers
(because Integer is a subtype of Number).

We expand rule 8.4.8.3 to add:

  If a method declaration d1 overrides or hides another method d2,  
where d2 has return type 'void', then d1 may return any type,  
including primitives and void.
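
For reference, a minimal sketch of the case rule 8.4.8.3 already allows
today. The class names Base, Derived and CovariantReturnDemo are made up
for illustration. The proposed void case appears only as a comment,
because current compilers reject it.

```java
// Illustration only: Base, Derived and CovariantReturnDemo are hypothetical names.
class Base {
    Number foo() { return Double.valueOf(1.5); }
    void bar() {}
}

class Derived extends Base {
    // Legal today under 8.4.8.3: Integer is a subtype of Number.
    @Override
    Integer foo() { return Integer.valueOf(42); }

    // Under the proposed expansion the following would become legal too.
    // Today it is a compile-time error, so it stays commented out:
    // @Override
    // int bar() { return 7; }
}

class CovariantReturnDemo {
    public static void main(String[] args) {
        Base b = new Derived();
        Number viaBase = b.foo();             // dispatches to Derived.foo()
        Integer direct = new Derived().foo(); // tightened type, no cast needed
        System.out.println(viaBase + " " + direct); // prints "42 42"
    }
}
```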

Step 2: "Create SetIndex, GetIndex, SetWithKey, and GetWithKey  
interfaces".

We create the java.lang.operators package, which will hold interfaces  
that let types (re)define the meaning of operators employed on  
instances of their type. The java.lang namespace is 'special', in that  
all source files import it implicitly, so adding new items to it  
should not be undertaken lightly. Yet, the java.lang.operators package  
clearly shows that it is a fundamental aspect of the java language by  
being a sub-package of java.lang; see also  
java.lang.annotation.Annotation which is similarly 'magic' (in that  
any public @interface silently implements that interface).

Inside this package we add the following 4 interfaces:

public interface SetIndex<V> {
     public void set(int idx, V value);
}

public interface GetIndex<V> {
     public V get(int idx);
}

public interface SetWithKey<K, V> {
     public void put(K key, V value);
}

public interface GetWithKey<K, V> {
     public V get(K key);
}

java.util.Map will then add 'GetWithKey<K, V>, SetWithKey<K, V>' to
its extends list.
java.util.List will then add 'GetIndex<E>, SetIndex<E>' to its
extends list.
(Both are interfaces, hence 'extends' rather than 'implements'.)

Those additions are backwards and migration compatible, due to step 1,
which legally allows both Map's and List's existing set/put methods to
return something, even though in the SetIndex/SetWithKey interfaces,
set/put return void. All existing class files that implement List
and/or Map will seamlessly support the indexing operations, and any
existing source code that implements Map or List or a child
interface/class thereof continues to compile to the same semantic
meaning without introducing any (new) errors or warnings.
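
To make the dependency on step 1 concrete, here is a rough sketch of
what has to be written today without it. The two interfaces are
redeclared locally as stand-ins for the proposed java.lang.operators
types, and IndexedListView is a hypothetical adapter name. The adapter
is only needed because List.set returns the previous element, so a List
cannot currently satisfy a 'void set' declaration directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Local stand-ins for the proposed java.lang.operators interfaces.
interface SetIndex<V> { void set(int idx, V value); }
interface GetIndex<V> { V get(int idx); }

// Hypothetical adapter: wraps a List so it can be used through the
// void-returning SetIndex.set, discarding List.set's return value.
final class IndexedListView<V> implements SetIndex<V>, GetIndex<V> {
    private final List<V> backing;

    IndexedListView(List<V> backing) { this.backing = backing; }

    @Override public void set(int idx, V value) { backing.set(idx, value); }
    @Override public V get(int idx) { return backing.get(idx); }
}

class IndexedListViewDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<String>(Arrays.asList("a", "b"));
        IndexedListView<String> view = new IndexedListView<String>(names);
        view.set(1, "z");                // writes through to the backing list
        System.out.println(names.get(1)); // prints "z"
    }
}
```

With step 1 in place, the adapter would be unnecessary: List could
extend the interfaces itself and keep its existing return type.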

Step 3: "Define how operator overloading interacts with the language".

(The method calls in this section are given as how they are  
represented in the class file format; that is, the methods are fully  
typed).

We now expand the meaning of:

foo[bar]

depending on context: on the LHS of any assignment operation ('=',
'+=', '|=', etcetera) the context is 'set', and anywhere else the
context is 'get'.

'set' context:

foo[bar] = RHS is translated to 2 potential method calls:

(ONLY if foo is assignable to SetIndex)
method: java/lang/operators/SetIndex.set
receiver: foo
args: bar, RHS

and:

(ONLY if foo is assignable to SetWithKey)
method: java/lang/operators/SetWithKey.put
receiver: foo
args: bar, RHS

In the event foo implements *BOTH* SetIndex and SetWithKey, resolution
proceeds as if both methods had the same name; there is no method
selection ambiguity because SetWithKey's first argument is
necessarily a reference type, whereas SetIndex's first argument is a
primitive, int. In other words, resolution would occur analogous to
how resolution occurs in the following class:

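class Example<T> {
     void foo(int x) {}
     void foo(T x) {}
     public static void main(String[] args) {
         // Parameterized rather than raw, so T is Number at the call sites.
         Example<Number> ex = new Example<Number>();
         ex.foo(10);                  // selects foo(int): no boxing needed
         ex.foo(Integer.valueOf(10)); // selects foo(T): no unboxing needed
     }
}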

The specifics for how such calls are resolved can be found in the
JLS3, for example section 15.12.2, "Compile-Time Step 2: Determine
Method Signature", and section 5.1.8, "Unboxing Conversion" (as the
only possible confusion here is when an Integer object may or may not
unbox to int to call set, or an int primitive boxes to Integer to call
put).
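
The selection behaviour described above can be observed with current
compilers. The sketch below uses a hypothetical class OverloadProbe
whose two overloads report which one was chosen; it only models the
"same name" analogy and is not part of the proposal itself.

```java
// Hypothetical probe class: each overload reports which one was selected.
class OverloadProbe<T> {
    String pick(int x) { return "int"; }
    String pick(T x)   { return "object"; }
}

class OverloadProbeDemo {
    public static void main(String[] args) {
        OverloadProbe<Number> probe = new OverloadProbe<Number>();

        // An int argument matches pick(int) without boxing, so the first
        // resolution phase already settles on it.
        System.out.println(probe.pick(10));                  // prints "int"

        // An Integer argument matches pick(T) by subtyping, without
        // unboxing, so pick(int) is never considered.
        System.out.println(probe.pick(Integer.valueOf(10))); // prints "object"
    }
}
```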

'get' context:

Analogous to the 'set' context, except translates to GetIndex.get and  
GetWithKey.get.

See Appendix B for a defense of why 4 interfaces are needed and not 2.


APPENDIX A: Defense of allowing the return of any value when  
overriding methods of void.

Currently, rule 8.4.8.3 is implemented by javac by producing 2
methods: The actual defined method with the tightened return type, and
a synthetic wrapper method (a 'bridge method') with the original
signature, that simply forwards the call to the real method. This way,
any callers that don't know about the return type tightening still end
up with the right result. See the JVM spec section on 'method
descriptors' on how the JVM resolves method invocations, which shows
why the return type tightening of rule 8.4.8.3, which was introduced
in java 1.5, is backwards and migration compatible.

The exact same strategy can be implemented in order to let a subclass  
provide a return type on a method when the parent class returns void:

The actual method with the return type is put in the class file as  
normal, then a synthetic wrapper method with return type void is also  
added, which simply calls the real method. The only difference is that  
this method just calls the real method, and returns nothing, whereas  
the current 8.4.8.3 rule calls the real method and returns what it  
returns.
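
The existing two-method strategy can be observed through reflection.
The sketch below assumes the classes are compiled with javac; Parent,
Child and BridgeDemo are hypothetical names. It lists the methods that
end up in the subclass and counts those flagged as bridge methods.

```java
import java.lang.reflect.Method;

// Hypothetical classes: Child tightens the return type of value().
class Parent {
    Number value() { return Double.valueOf(1.0); }
}

class Child extends Parent {
    @Override
    Integer value() { return Integer.valueOf(1); }
}

class BridgeDemo {
    // Counts the bridge methods with the given name declared directly in 'type'.
    static int countBridges(Class<?> type, String name) {
        int count = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(name) && m.isBridge()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Child declares one value() in source, but the class file holds two:
        // the real Integer-returning method and a Number-returning bridge.
        for (Method m : Child.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName() + " "
                    + m.getName() + "() bridge=" + m.isBridge());
        }
        System.out.println("bridges in Child: " + countBridges(Child.class, "value"));
    }
}
```

Under the proposal, the generated wrapper for a void parent method would
look the same, except that its descriptor returns void and it drops the
real method's result.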

The 'void' type is effectively a supertype of everything: After all,  
'being a subtype of' implies that any instances of that type can be  
substituted for the supertype. Clearly any object or primitive can be  
'substituted' for the void type: Simply bitbucket the primitive or  
object.

In this sense, rule 8.4.8.3's current exclusion of the 'void' mechanic  
is in fact an inconsistency in the java language spec.

Entirely analogous to 8.4.8.3, the purpose of this relaxation of the  
rules is to increase the flexibility of API design. We let APIs  
tighten the return type in order to have each level in a hierarchy of  
types have the most appropriate semantics. The void case is no  
different.

Lastly, there is no existing code, compiled or in source form, that is  
legal now that would not be legal after this change, or that would  
change semantics. After all, any method that overrides a 'returns  
void' method must currently have the 'void' return type, or it  
wouldn't compile.

APPENDIX B: Why 4 types and not 2?

The pragmatic reason of fitting both Map's put and List's set method  
is the most important, but the operation of putting an entry in a  
mapping and accessing the index of a list-like construct are  
semantically not the same and should therefore not be lumped into a  
single interface. Consider for example that 'List.set' is documented  
to throw an IndexOutOfBoundsException, which seems to be a good fit  
even for 'SetIndex', whereas it makes absolutely no sense for a put  
operation.
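
A small sketch of that semantic difference, using only the existing
java.util types (SemanticsDemo is a hypothetical class name): setting
index 0 of an empty list fails, whereas putting key 0 into an empty map
simply creates the entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SemanticsDemo {
    // List.set addresses an existing position; on an empty list there is none.
    static boolean setOnEmptyListThrows() {
        List<String> list = new ArrayList<String>();
        try {
            list.set(0, "x");
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    // Map.put creates the mapping if it does not exist yet.
    static boolean putOnEmptyMapSucceeds() {
        Map<Integer, String> map = new HashMap<Integer, String>();
        map.put(0, "x");
        return "x".equals(map.get(0));
    }

    public static void main(String[] args) {
        System.out.println("set throws: " + setOnEmptyListThrows());   // prints "set throws: true"
        System.out.println("put works: " + putOnEmptyMapSucceeds());   // prints "put works: true"
    }
}
```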

Semantically speaking then, foo[x] where foo is a Map, and foo[x]  
where foo is a List, just aren't the same operation. They merely have  
a similar feel. This situation is not unprecedented; in java, the +  
operation can mean either string concatenation or numeric addition.  
While they feel somewhat similar and both operations are routinely  
represented with the '+' symbol in many languages, the operations  
aren't the same. String concatenation isn't commutative, whereas
numeric addition is!

The third reason is that no ambiguity can arise; the compiler will  
always know which form is intended (WithKey, or Index). Because  
indexed operations work with primitives, and map key storage works  
with objects, any given 'foo[bar]' style assignment or retrieval  
operation can always be resolved to be either SetIndex/GetIndex or  
SetWithKey/GetWithKey, even in the unlikely event that 'foo'  
implements both interfaces.

MAJOR DISADVANTAGE:

The notion that assignments, 'for consistency's sake', should
themselves be an expression that returns a value, even though there's
considerable confusion as to what ought to be returned, seems to be an
obvious candidate for a disadvantage of this proposal, as this
proposal returns 'void' for assignment operators.

The author believes this is in fact desired behaviour: *because* of  
the confusion around what such an assignment ought to return, it is  
expected that the nature of the returned value will often be  
misunderstood. By returning nothing, any misunderstanding is caught  
early with a clear error message, instead of late, without an error  
message, and without even a stack trace to aid in debugging.
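
The confusion is easy to demonstrate with what exists today. In this
sketch (AssignmentResultDemo is a hypothetical class name) three
superficially similar 'store 5' operations each hand back something
different: the array assignment yields the new value, List.set yields
the previous element, and Map.put yields the previous mapping or null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AssignmentResultDemo {
    // An array assignment expression evaluates to the value just assigned.
    static int arrayResult() {
        int[] arr = {1};
        return (arr[0] = 5);
    }

    // List.set returns the element that was previously at that position.
    static Integer listResult() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(1));
        return list.set(0, Integer.valueOf(5));
    }

    // Map.put returns the previous value, or null if there was no mapping.
    static Integer mapResult() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        return map.put("k", Integer.valueOf(5));
    }

    public static void main(String[] args) {
        System.out.println(arrayResult()); // prints "5"
        System.out.println(listResult());  // prints "1"
        System.out.println(mapResult());   // prints "null"
    }
}
```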

This proposal does not aim to make list access and array access  
consistent. In order to do this, list access must also be retrofitted  
so that:

foo.length

where foo is a list, actually calls the size() method, which seems to
be far beyond Project Coin's stated goals. Some sort of 'autoboxing'
of lists into arrays and vice versa when calling methods that expect
lists or arrays is also required to make them consistent, as well as
reification of generics, AND fixing arrays' different (more
permissive) behaviour regarding covariance. Clearly such a goal is
utterly unattainable within the confines of Project Coin, and therefore
this proposal does not take such consistency into consideration.


If the disadvantage of returning nothing is deemed too great, an
alternative can be posited: The SetWithKey and SetIndex interfaces do
not change, but the desugaring of foo[bar] does change:

foo[bar] = RHS

will first convert RHS so that it 'fits' into foo (e.g. apply
autoboxing or unboxing), then this converted value is passed on to the
set/put method, and is duplicated on the stack as well for further
consumption. This solution has the exact same effect as Neal Gafter's
suggestion to desugar '[]' to a static method which you must then
import, along with the following static method in java.util.Collections:

public static <K,V> V operator_index_put(Map<K,V> map, K key, V value) {
     map.put(key, value);
     return value;
}
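
A sketch of how that helper behaves when called by hand. Since nothing
can be added to java.util.Collections here, the method is copied into a
hypothetical class IndexOps; the point is that it yields the newly
stored value, where a direct Map.put call would yield the old one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder class standing in for java.util.Collections.
class IndexOps {
    public static <K, V> V operator_index_put(Map<K, V> map, K key, V value) {
        map.put(key, value);
        return value;
    }
}

class IndexOpsDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("k", Integer.valueOf(1));

        // Roughly what 'Integer stored = (map["k"] = 2);' would desugar to
        // under the alternative described above.
        Integer stored = IndexOps.operator_index_put(map, "k", Integer.valueOf(2));

        System.out.println(stored);       // prints "2"
        System.out.println(map.get("k")); // prints "2"
    }
}
```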

  --Reinier Zwitserloot




