diamond operator & implementation strategies (v3)

Maurizio Cimadamore Maurizio.Cimadamore at Sun.COM
Sat Aug 22 03:20:34 PDT 2009


Neal Gafter wrote:
> I believe there are situations that they give different, incompatible
> results.  The "complex" method is currently (Java 5 and later) used for
> inference of method type arguments, so the language and compiler must
> implement it already.  I feel it's better to have one inference algorithm
> rather than two.
>   
Hi Neal (and others)
I'd like to provide a couple of clarifications; let's start with the 
easy one (implementation): javac's type inference (and JLS 
type-inference, in general) is made up of two *mostly independent* steps: 

*) inference from actual vs. formal arguments (see JLS 15.12.2.7)
*) inference from return type vs. expected type (see JLS 15.12.2.8)

In javac these two steps are unrelated, as the first one (actual vs. 
formal) is performed during method resolution (that is, when javac has 
to find a method symbol suitable for a given call site); the latter 
step is performed at a later point, when the result type of the method 
is checked against the expected type (caveat: this step is actually 
skipped if there's no expected type).
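
As a small illustration of the two steps (not taken from the patches - 
just plain Java 5 code), consider:

import java.util.*;

class TwoStepsDemo {
    void demo() {
        // step 1 (JLS 15.12.2.7): T is inferred from the actual arguments
        List<Integer> ints = Arrays.asList(1, 2, 3);

        // step 2 (JLS 15.12.2.8): no argument constrains T, so T is
        // inferred from the expected (assignment) type
        List<String> strings = Collections.emptyList();
    }
}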

As you can see from the code, both approaches tend to reuse a lot of 
what is already available in the compiler internals - that's also why 
the patches are not so big in terms of lines of code.

The simple approach only applies the second step of javac's type 
inference (which is a call to Infer.instantiateExpr). No new 'inference' 
code is required in order to implement that approach.
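
As a sketch (not code from the patch), under the simple approach a 
diamond call site behaves much like the return-type step above - only 
the expected type drives inference:

import java.util.*;

class SimpleApproachSketch {
    // the expected type on the LHS is the only constraint, so here <>
    // would stand for <String, List<Integer>> (assuming the simple
    // approach as described above)
    Map<String, List<Integer>> cache = new HashMap<>();
}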

On the other hand, the complex approach needs both steps (which means a 
call to Infer.instantiateMethod followed by an indirect call to 
Infer.instantiateExpr).
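
Again as a sketch (my reading of the above, not patch code), under the 
complex approach the actual arguments also take part in inference, as 
they do for a generic method call, with the expected type still 
consulted afterwards:

import java.util.*;

class ComplexApproachSketch {
    void demo(List<String> names) {
        // the actual argument (of type List<String>) feeds the first
        // inference step, just like a generic method invocation would
        Collection<String> copy = new ArrayList<>(names);

        // with no arguments, the second step falls back on the expected type
        Map<String, Integer> index = new HashMap<>();
    }
}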

Bottom line: both approaches re-use what is already available inside the 
compiler.

Why is the complex approach 'complex'? The complex approach requires a 
'fake' constructor symbol to be synthesized by the compiler. This 
constructor should replace all the class type variables with 'holes' 
that can be filled during the inference process. In other words, the 
complex approach doesn't work on the class as it is in the source code; 
it requires a source transformation - so that, e.g., the following 
constructor:

class Foo<X> {
    Foo(X x) { ... }
}

gets rewritten as follows:

<X> translated_foo_constr Foo<X> (X x) { ... }

[The actual implementation is optimized, so that no source translation 
occurs - the compiler just adds synthetic symbols to the current class, 
so that the resolution process can succeed].
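
A rough way to picture the synthetic symbol (purely an illustration - 
the factory name below is made up, and this is not what the compiler 
actually emits) is as a generic static factory whose type variable is 
the 'hole' to be filled:

class Foo<X> {
    Foo(X x) { }

    // hypothetical factory mirroring the synthetic constructor symbol:
    // X is a fresh inference variable, filled from the actual argument
    // (and, if needed, from the expected type)
    static <X> Foo<X> makeFoo(X x) {
        return new Foo<X>(x);
    }
}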

The difference between the two approaches is that with the simple 
approach the compiler can discover the type of the class to be 
instantiated earlier in the process - that is, without having to apply a 
method resolution round. With the complex approach, on the other hand, 
it's impossible to say which class type will be instantiated before 
method resolution takes place.
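
To make the timing difference concrete (a hypothetical example, not one 
of the benchmarks):

class Box<T> {
    Box(T t) { }
}

class TimingSketch {
    // Box<Number> b = new Box<>(1);
    //
    // simple:  Box<Number> is known from the LHS before any constructor
    //          lookup takes place
    // complex: the synthetic constructor symbols must first be resolved
    //          against the actual argument; only then is the class type known
}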

We believe that this subtle difference is what makes the simple approach 
more compelling and more in the spirit of the Java language.

About evolution: the simple approach is not a proper subset of the 
complex approach - which means that the results given by the two 
algorithms may vary (even if not too often, as the benchmarks show). As I 
said, there are situations in which one approach is better than the 
other, and vice versa. However, both approaches are proper subsets of 
what I would call the 'full complex' approach - which is essentially the 
complex approach on steroids (that is, augmented with some javac 
type-inference changes).

Let's revisit the two examples from my earlier email in order to see if 
the 'full complex' approach has an answer to both of them:

from DefaultMXBeanMappingFactory.java

private static <T extends Enum<T>> MXBeanMapping
            makeEnumMapping(Class<?> enumClass, Class<T> fake) {
        return new EnumMapping<T>(Util.<Class<T>>cast(enumClass));
        // would work with complex but not with basic
    }

[...]

private static final class EnumMapping<T extends Enum<T>> [...]



In this case the 'full complex' approach will obviously win - as the 
actual argument type would be used in order to infer T'==Class<T> 
(the simple approach alone would fail because of the recursive bound on T).
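
A minimal, self-contained analogue of this case (hypothetical names, not 
the actual JDK sources):

class Mapping { }

class RecMapping<T extends Enum<T>> extends Mapping {
    RecMapping(Class<T> c) { }
}

class RecDemo {
    static <T extends Enum<T>> Mapping make(Class<T> c) {
        // diamond form:  return new RecMapping<>(c);
        // complex / full complex: the argument type Class<T> pins the class
        //   type variable, so inference succeeds
        // simple: only the expected type Mapping is available, which says
        //   nothing about the recursive bound on T, so inference fails
        return new RecMapping<T>(c);
    }
}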

The other example was:

from Snapshot.java

    private SoftReference<Vector> finalizablesCache;

    [...]

    Vector<JavaHeapObject> finalizables = new Vector<JavaHeapObject>();

    [...]

    finalizablesCache = new SoftReference<Vector>(finalizables);
    // would work with basic but not with complex

    [...]


This used to fail with complex, as it inferred a type not compatible 
with the LHS. If we could join the two type-inference steps described 
above, so as to feed the return type vs. expected type constraint to 
inference earlier in the process, we would have an additional constraint 
in the inference algorithm forcing the compiler to come up with a type 
compatible with the LHS (SoftReference<Vector>).
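
A self-contained sketch of that situation (with JavaHeapObject replaced 
by Object so the snippet stands alone):

import java.lang.ref.SoftReference;
import java.util.Vector;

class SnapshotAnalogue {
    private SoftReference<Vector> finalizablesCache;   // note the raw Vector

    void demo() {
        Vector<Object> finalizables = new Vector<Object>();
        // diamond form:  finalizablesCache = new SoftReference<>(finalizables);
        // basic:   T is taken from the expected type, i.e. the raw Vector,
        //          so the assignment compiles
        // complex: T is inferred from the argument as Vector<Object>, giving
        //          SoftReference<Vector<Object>>, which is not assignable to
        //          SoftReference<Vector>
        // full complex: the expected-type constraint would also be fed to
        //          inference, steering the result back towards the LHS type
        finalizablesCache = new SoftReference<Vector>(finalizables);
    }
}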

In other words, the two approaches are currently not fully compatible 
because they are both (in their own way) incomplete. This also means 
that both can be subsumed by a more complete approach, one able to 
combine the advantages of the two.

Maurizio
> On Fri, Aug 21, 2009 at 12:54 PM, Paul Benedict <pbenedict at apache.org>wrote:
>
>   
>> Maurizio and all,
>>
>> Do you believe that the "simple approach" is a subset of the "complex
>> approach"? For the changes recommended regarding the JLS, I hope they would
>> nicely pave the way for the complex approach later on. It would be a shame
>> if these type-inference rules cornered javac from later expanding its
>> rules.
>>
>> Paul
>>
>>
>>     
>
>   



