improved return type-inference: request for review

Tue Feb 24 09:43:18 PST 2009

Hi,
this is the fix [1] for an outstanding javac/JLS inference bug: [3] and
[4] respectively. Those CRs both have to do with recursive bounds being
included in the set of constraints generated by 15.12.2.8 (but not by
javac). It's important to fix it both because it's preventing other
important inference fixes (6638712) and because it might lead to
problems when implementing an item of the project coin (namely 'Concise
initialization of generic variables' [2]).

Let's try to keep things simple: let's start from the example recently
submitted by Martin, which I think it's the simpler test case I've seen
so far:

class Test<X> {

  <K extends Test<K>> void m() {}

  void test() {
    m();
  }
}

When you compile this, the compiler complains with the following message
(afaik this affects all compilers, jdk 5, 6, 6-open, 7):

TestX.java:6: incompatible types; inferred type argument(s)
java.lang.Object do not conform to bounds of type variable(s) K
found   : <K>void
required: void
    m();
     ^
1 error

The message is somewhat messy, and hard to understand, but the meaning
it's simple: the type inferred for K is Object; unfortunately Object is
not compatible with K's declared bound; in fact Object should be a
subtype of [K:=Object]Test<K> = Test<Object> which is obviously not true.

How happened that jaac inferred Object for K? According to JLS
15.12.2.8, the following set of constraints should be derived for the
above method call:

Test<K> >> K

K should then be inferred as glb(Test<K>) = Test<K>.

This is unfortunate at best for two reasons: first, the inferred type
for K is defined in terms of K itself; secondly, Test<K> does not
conform to K's declared bound, as Test<K> <: [K:=Test<K>]Test<K> =
Test<Test<K>> does not hold.

However, javac impl does not seem to follow JLS here; in fact, javac is
deliberately removing any recursive constraint from the set derived from
15.12.2.8. Since there's just one constraint here, this leave us with
the answer K = Object, which is slightly better than the solution
provided by the JLS as it does not contains further references to K -
but still does not satisfy K's declared bound.

[Sidebar: there other examples in which javac follows more closely the
algorithm described by the JLS - in other words, sometimes javac fails
to exclude recursive bounds from the set of generated constraints,
because of minor trivial bugs - such bugs are however crucial when it
comes to compile the code of some compiler regression tests!]

Now the though question: how should JLS deal with this? I think there
are at least two available options:

1) Give up inference (which means that javac is right) and inferring
Object for K (which then causes a compilation error because Object does
not conform to K's bound).

2) Adopt a complex inference scheme that would allow 15.12.2.8 to yield
an inferred type that (i) does not depend on K and (ii) respect K's
declared (recursive) bound.

I went for (2), and here's why: in our running example (and in similar
examples we have in our regression codebase), one could be tempted to
say: ''this is just rubbish, fix your code, please''. On the other hand,
last year I noticed that there are several use cases that would start to
fail if we choosed (1):

Object value = Enum.valueOf((Class)Test.class, "");

where Enum.valueof is defined as follows
<T extends Enum<T>> T valueOf(Class<T>, String)

In this case we have that T cannot be inferred from actual arguments (as
the only argument is raw here). It follows that T must be inferred in
15.12.2.8; here we have the following constraints:

T <: Object (for the return type)
T <: Enum<T> (declared bound)

In other words we are in the same situation as above - only, this coding
pattern is way more common (here inferring Object for T would cause the
above code to be rejected). At this point you might have a question: how
does javac deal with this code? We showed how javac is not capable of
handling recursive bounds; the trick here is that javac (very
surprisingly) never applies 15.12.2.8 (since this is an unchecked call).
Which also means that javac never has to find an answer to the above
question. Note that one day javac will be corrected so that 15.12.2.8
will be applied even in this case - the fact that javac doesn't apply
15.12.2.8 can be regarded as a bug (one of the biggest consequences of
this is CR 6638712).

Back to the original question, given the only constraint (derived from
15.12.2.8):

Test<K> >> K

What type should we infer for K?

I have shown that Object is not a viable answer (does not respect K's
declared bound). At first it would seem that we should pick a type among
the following candidates:

-Test<?>
-Test<? extends Test<?>>

Those types are surely better than Object, but all have the problem of
not satisfying K's bound: e.g.

Test<?> <: [K:=Test<?>]Test<K> = Test<Test<?>> (false because
type-argument Test<?> does not contain ?)

I think there's only one solution to this problem, which is the following:

K inferred as #1, where #1 is a captured type variable whose upper bound
is Test<#1>. Let's see how this works w.r.t. to bound checking:

#1 <; [K:=#1]Test<K> = Test<#1>
ub(#1) = Test<#1> <: Test<#1> (true!)

This solution has all the required properties, in that (i) the inferred
type does not contain reference to the type to be inferred (T) and (ii)
the inferred type satisfy K's declared bound.

As you can see, the right answer is not that trivial - moreover as Alex
mentioned in the related JLS CR [3], an acceptable solution should be
able to deal with more complex cases like:

class Foo {
    <T extends List<U>, U extends List<T>> U foo() {return null;}
    List<?> s =  foo();
}

Here we have two mutually referring recursive bounds (aaargh!). My
solution works by inferring #1 (extends List<#2>) for T and #2 (extends
List<#1>) for U.

Final note:
Since my approach makes use of some synthetically generated
captured-type variables with recursive bounds, sometimes javac might
crash because  of a problem in the toString routine for CapturedTypes
(infinite loop). A recent, separate diagnostic work, ease this problem
by truncating the infinite regression - what I'm saying is that, wthout
proper diagnostic support, it won't be possible to support this fix as
it is. The problem is that each captured type in the javac world is
associated to an underlying wildcard type argument (the one that
originated the captured type). In this case it's obvious that, given the
fact that the captured type variable I'm introducing is synthetic,
there's no such wildcard type argument - however I believe that
displaying a meaningful wildcard to the user could help him understand
what went wrong in case of errors - but one could always replace the
synthetic captured type's wildcard argument with a simpler type, such as
Test<?> (instead of the more precise Test<? extends #1> which leads to
infinite recursion).

Maurizio

[1] http://cr.openjdk.java.net/~mcimadamore/6369605/
[2] http://bugs.sun.com/view_bug.do?bug_id=4879776
[3] http://bugs.sun.com/view_bug.do?bug_id=6369608
[4] http://bugs.sun.com/view_bug.do?bug_id=6369605