Type Inference Question
Dan Smith
daniel.smith at oracle.com
Wed Jun 6 12:39:42 PDT 2012
On May 30, 2012, at 6:12 AM, Jan Finis wrote:
> Hi compiler-dev Team,
>
> I am currently implementing parts of a Java compiler and I am stuck with the type inference of method type parameters, as specified in paragraph 15.12.2.7 of the Java Spec.
>
> I have already tried to ask my questions on stackoverflow, but they are so in-depth that only people who have already gone the trouble implementing this weird piece of text can help me.
>
>
> Actually, I have two questions, the first one was already posed on stackoverflow. I copy its text here:
>
> My problem is this line in the spec:
>
> lcta(U) = ? if U's upper bound is Object, otherwise ? extends lub(U,Object)
>
> U is an arbitrary type expression. What is the upper bound of a type expression? In addition, why is the lcta always a wildcard?
>
> The spec defines
>
> CandidateInvocation(G) = lci(Inv(G)).
>
> Now, for example, consider the case that Inv(G) = { List<String> }, i.e., the only possible candidate invocation is a single parameterized type. Now, due to the rule
>
> lci(G<X1, ..., Xn>) = G<lcta(X1), ..., lcta(Xn)>,
>
> the result of CandidateInvocation( G ) = lci( { List<String> } ) would be defined as:
>
> List<lcta(String)>
>
> in my opinion, lcta should simply return String here, because if List<String> is the only possible invocation, it is a good idea to infer List<String> as the argument. However, the definition of lcta(U) dictates that the result is either ? or ? extends lub(...), so the result IS ALWAYS a wildcard. This seems strange. What am I misinterpreting here?
The JLS rule is clearly wrong here, as you say. It should simply be lcta(U) = U (with similar rules for wildcards, following the pattern used in the binary definition of lcta). Or, better, lci should never bother to invoke lcta with a single type argument.
There are a handful of problems with the definition of lub, so I would take everything here with a large grain of salt, and make sure it's consistent with common sense.
For example:
- There's no mention of unbounded wildcards in the definition of lcta
- The rule for lcta(? extends U, ? super V) is not type-safe (lub(List<? extends Number>, List<? super Number>) = List<Number>!)
- "It is possible that the process above yields an infinite type": huge can of worms, never mentioned again in the spec. javac ignores this and goes two levels deep.
> The second problem is about type inference of capture converted arguments. This code is taken from the openJDK collections library:
>
> class UnmodifiableMap<K,V> {
> public static <T> Set<T> unmodifiableSet(Set<? extends T> s) {
> return null;
> }
>
> private final Map<? extends K, ? extends V> m;
> private transient Set<K> keySet;
> public Set<K> keySet() {
> if (keySet==null)
> keySet = UnmodifiableMap.unmodifiableSet(m.keySet()); //This line is the problem
> return keySet;
> }
>
>
> }
>
> the problem is the type inference for the unmodifiableSet method call. The actual type of the argument m.keySet() is Set<Capture of ? extends K> and the formal type argument is Set<? extends T>. This imposes the subtype constraint Set<Capture of ? extends K> <: Set<? extends T> which is simplified to Capture of ? extends K <: ? extends T and finally T :> Capture of ? extends K. According to the spec, this final constraint yields lub(Capture of ? extends K) as inferred type for T. If I read the spec correctly, lub(Capture of ? extends K) is Capture of ? extends K, so this very capture is inferred for T. My compiler does this and it produces a compile error, since the result of the method is then Set<Capture of ? extends K>. This result cannot be applied to the variable keySet which is of type Set<K>. Thus, the "correct" type inference should yield K instead of Capture of ? extends K. Since this is code from the standard library which should compile fine, it seems that the usual javac infers that correct bound. But how? According to the spec, Capture of ? extends K should be inferred. What am I misinterpreting here?
Agreed that your interpretation is correct, and this should be a type error.
Maurizio has explained the javac voodoo that goes on here, but in this case there's a clear, valid description of what the type should be in the JLS, and I see no reason to depart from it. As he said, though, there's a two-way evolution that goes on in order to hopefully get us to a point where the JLS and javac are completely consistent.
You might be interested to know that we're significantly revising inference for Lambda (being standardized via JSR 335 and ultimately Java SE 8), and so most likely this particular issue will go away in 8. The new strategy would be to look at the return type constraint, "Set<T> <: Set<K>", before settling on an instantiation for T.
Feel free to ask more questions and provide more feedback. It's always enlightening to hear how things go for an independent language implementation.
—Dan
More information about the compiler-dev
mailing list