Comparator
John Rose
john.r.rose at oracle.com
Mon Jul 22 18:07:38 PDT 2013
On Jul 16, 2013, at 3:08 PM, Dan Smith <daniel.smith at oracle.com> wrote:
> On Jul 15, 2013, at 2:33 PM, John Rose <john.r.rose at oracle.com> wrote:
>
>> ...I would hope that the inferencer can help the user in many cases like Marc's.
I'm still hoping, but will try hard to do this in the context of the very reasonable account you have given. (Thanks!)
> Without ruling out the possibility of enhancements that address situations like this, let me provide some general guidance/motivation for the status quo.
>
> 1) Certain generic methods have to-be-inferred type variables in their return types. These are special entities: invocations can be poly expressions -- using context to influence type inference. When you're analyzing inference, mentally highlight these methods, because they will get special treatment. The same goes for diamond class instance creation expressions.
The special treatment I'm hoping for here, I think, is that receiver expressions can be partially typed to include type inference variables that are later solved with constraints contributed by non-receiver arguments. That's what (I think) I mean by "parity" for receiver arguments.
Maybe an extra-special treatment would be required, in order to retain type inference variables arising from a receiver expression and *not* resolve them (make them concrete) to their bounds? Is it only poly expressions that are exempt from having their type variables "resolved" to the bounds? In that case, the necessary fact that receivers not be poly expressions would prevent them from "exporting" type inference variables to be later combined with the other arguments.
> 2) In Java 7, the special treatment for these methods applied only when i) the invocation is in an assignment context, and ii) the arguments to the invocation don't provide any lower-bound information about the variables that appear in the return type. In Java 8, this is supported much more broadly: in any *assignment* or *invocation* context, we use the target type to influence inference. We even let multiple levels of invocation contexts push down a target type (e.g., "String s = id(id(id(someMethod())))"), because all the variables are essentially solved at the same time.
What I am (or was) hoping for is similar treatment for something like String s = someFactory().id().id().id(). But if someFactory() is solved before anything else, then the contribution from String is not possible. If (somehow) someFactory() were only partially reduced to Supplier<T#alpha_42>, then T#alpha_42 could eventually be married up with String.
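A minimal sketch of the contrast, assuming a hypothetical Box<T> class standing in for Supplier<T> (the real Supplier has no id() method). The names Box, someFactory, and both id methods are illustrative only. The nested static form lets the target type flow all the way down; the chained form needs an explicit type argument because the receiver is typed on its own.

```java
final class ReceiverVersusArgument {
    /** Hypothetical container, standing in for Supplier<T> in the discussion. */
    static final class Box<T> {
        private final T value;
        Box(T value) { this.value = value; }
        Box<T> id() { return this; }   // instance form: the Box is the receiver
        T get() { return value; }
    }

    /** Hypothetical factory: T appears only in the return type. */
    static <T> Box<T> someFactory() { return new Box<>(null); }

    /** Static analogue of Box.id(): the receiver becomes an ordinary argument. */
    static <T> Box<T> id(Box<T> box) { return box; }

    static Box<String> nested() {
        // Every call sits in an invocation or assignment context, so the
        // target type Box<String> is pushed down through all three levels.
        Box<String> b = id(id(id(someFactory())));
        return b;
    }

    static Box<String> chained() {
        // The receiver gets no target type. Without the explicit type argument,
        // someFactory() is typed as Box<Object> and this line is rejected:
        // Box<String> bad = someFactory().id().id().id();
        Box<String> b = ReceiverVersusArgument.<String>someFactory().id().id().id();
        return b;
    }

    public static void main(String[] args) {
        System.out.println(nested().get() == chained().get());
    }
}
```

The commented-out line is the shape of call being wished for here; under the Java 8 rules as shipped it does not compile.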
> 3) A method invocation receiver is _not_ in an assignment or invocation context. Thus, there is no context to influence the typing, and the expression must be typed on its own. Why can't we just pretend that an instance method is actually a static method, thus giving the receiver a target type? Because method resolution needs a "type to search" as a prerequisite, and this is the type of the receiver.
Yes. And so I reach for a notion (perhaps chimerical, perhaps inherently complicated) of "partially reduced" receiver, whose type C<T#...> is neither a proper type, nor a pure inference variable. As a JVM guy, I say to myself, "this one has an erasure C, so surely it can be used as a scope and searched for methods."
> It is hard, and in some cases impossible, to do everything else in method resolution (searching for applicable methods, picking the most specific one), when we haven't even worked out what the type to search is yet.
Yes, it feels like there are corner cases here to stumble over.
If I have a proper type for the receiver, then I can pick a method out from the receiver which has no type variables, except those on the method itself. In particular, the type parameters for the receiver class (like T in Supplier<T>) have been given the "no-poly-expression" treatment, which I think means they reduce to their bounds.
If I allow the receiver to "float" as an improper type (Supplier<T#alpha_42>), then all of its methods will have type inference variables "out of the box", not just from inside their individual type parameter bindings (<U#beta_96> from public <U> U id(U)). If the selected method is not overloaded, then it seems like any extra per-method type parameter it has (U) would work indistinguishably from the up-level type parameter (T).
If the selected method is overloaded (there are >1 methods initially selected from the scope), then the enclosing type parameter may play an odd role in the logic for selecting the applicable methods and most specific method. We have adjusted those selection passes to tolerate type inference variables, but perhaps having a single TIV binding apply to all the methods simultaneously would cause some problem? (If so, we could make some sort of ad hoc definition, such as splitting the type variable T and imputing it separately to all of the methods, as if it were a sibling of each U.)
So here's a boldly uninformed proposal: Allow receivers to be poly expressions. Forbid them from typing to anything other than a proper type or improper type with a proper head, C<T...> or C<T#gamma...>. (Reduce a pure type inference variable to its class bound which must exist, as is done now. Disallow member refs and lambdas on any convenient grounds.) When testing method application and specificity, treat any residual TIV arising from the receiver as if it arose from each tested method, in a separate copy. Applicability or specificity testing might re-unify the splits of the residual TIV, or might not; let the chips fall. Maybe it's useful and not too complex?
Anyway, I have heard before about the impossible corner cases, but I don't have a clear idea of what they are like. Do you have one handy in email that you can forward or that I can look up on lambda-dev or somewhere? The impossible corner case would be some problem of applicability or specificity testing which is not soluble when a type variable in the receiver's class is left unresolved; and it would be not soluble despite the fact that a static method call analogue of the expression is tolerant of the same unresolved type variables. ... I'll stop guessing about what the problem is, but I'm very curious.
> Implication: receivers are _not_ interchangeable with method arguments. (This isn't new: arguments can be boxed but receivers cannot; arguments can have constraints on their type variables -- see Collections.sort -- but receivers cannot.)
I know it's not new; this is why there are workarounds in the Java programming literature for using static methods to gain control over erring type variables. It has always irked me that non-static methods can't reliably do the same tricks. Now that we are investing heavily in fluent APIs (rightly) I am venting about it, in hopes that something can be done. (There's that "hope" word again.)
I suppose we could force them to be interchangeable, with suitable adjustment for scoping, application, and selection as suggested above. The "suitable adjustment" might then be ugly, complex, unintuitive, and with surprising consequences for the user.
I would hope, though, that a suitable adjustment could be found which would do exactly the needful thing and no more: Allow the receiver to export type inference variables but still contribute a scope with applicable methods.
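One way to picture the static-method workaround, as a sketch only: it uses the Comparator and Collections APIs as they shipped in Java 8, which may differ from the draft API under discussion in this thread. Collections.reverseOrder(Comparator) plays the role of a static analogue of the instance method reversed(), so the comparator being reversed is an argument rather than a receiver and the return type can influence inference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class StaticWorkaround {
    static Comparator<String> byLengthDescending() {
        // comparing(...) is an argument here, so the return type
        // Comparator<String> reaches it and the lambda parameter s is a String.
        return Collections.reverseOrder(Comparator.comparing(s -> s.length()));

        // The fluent form is rejected, because the receiver is typed alone
        // and s is treated as Object:
        // return Comparator.comparing(s -> s.length()).reversed();
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("bb", "a", "ccc"));
        words.sort(byLengthDescending());
        System.out.println(words);
    }
}
```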
> 4) We could certainly do a better job with error messages.
For example, perhaps the error engine could "roll back" resolved/bound type variables on the receiver to the previous TIVs, then see if there is a solution, and finally suggest a cast or explicit type argument somewhere in the erring method call which would lead to the same solution. That's about what we will do manually and slowly by email.
> TLDR: the status quo is that when you type a dot, the compiler has to completely type-check the stuff to the left before it can proceed. Inference can do a lot of cool tricks with context, but will not be influenced by "context" that occurs after the dot.
Which implies that if a user is baffled by unsolved type variables in a long fluent method chain, an explicit cast or type argument should be placed near the leftmost dot in the chain.
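As a sketch of that advice, again assuming the Comparator API as it shipped in Java 8 rather than the draft of the time: the hint goes on the call at the leftmost dot, either as explicit type arguments or as an explicitly typed lambda parameter. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class LeftmostDotHint {
    static Comparator<String> withTypeArguments() {
        // Explicit type arguments on the first call in the chain.
        return Comparator.<String, Integer>comparing(s -> s.length()).reversed();
    }

    static Comparator<String> withTypedLambda() {
        // An explicitly typed lambda parameter gives inference the same information.
        return Comparator.comparing((String s) -> s.length()).reversed();
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("bb", "a", "ccc"));
        words.sort(withTypeArguments());
        System.out.println(words);
        words.sort(withTypedLambda());
        System.out.println(words);
    }
}
```

Either hint fixes the receiver's type before the dot, so nothing to the right of it has to contribute to inference.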
> —Dan
Thank you for your help with this. Watching the draft spec. come to life in pleasantly usable APIs is marvelous!
— John