heads-up: biggie overload rewrite

Fri Jul 26 10:44:29 PDT 2013

On Fri, Jul 26, 2013 at 6:25 AM, Maurizio Cimadamore
<maurizio.cimadamore at oracle.com> wrote:
> On 26/07/13 00:47, Zhong Yu wrote:
>>
>> On Thu, Jul 25, 2013 at 2:49 PM, Dan Smith <daniel.smith at oracle.com>
>> wrote:
>>>
>>> It's a question of what the API should look like.  To overload methods
>>> like this is to make a calculated choice that only non-overloaded method
>>> references and explicitly-typed lambdas can be used as arguments.  I'm not
>>> sure if that choice has actually been made in this case, or if the library
>>> folks were waiting for the dust to settle before deciding how to handle
>>> 'comparing'.
>>>
>>> From a language perspective, the problem is simple: we can't type check
>>> 's.length()' until we know that 's' is a String, and we can't figure that
>>> out until we look at the assignment target, and we can't do that until
>>> overload resolution is
>>
>> From a human's standpoint, things on the left should put constraints
>> on things on the right, because things on the left come first. When a
>> writer/reader writes/reads this partial code
>>
>>      Comparator<String> comparator = Comparator.comparing(
>>
>> it's crystal clear to him that T=String; it feels as if the programmer
>> has instructed the compiler with an explicit type argument. It will be
>> very surprising to him that the compiler fails to see that, because the
>> compiler treats information on the left (the assignment target) with a
>> lower priority, contrary to human parsing habits.
>>
>> In Java, assignment targets happen to be on the left, so I believe they
>> should take precedence in inference constraints. (In an imaginary
>> language where the target is on the right, I would be against
>> including the target in inference.)
>
> Well, humans do that all the time when reading expressions like
>
>     1 + 3 * 2

We can instantly identify the structure of this expression only
because of intense training and practice at a young age. What about this
Java expression?

    1 + 2 & 3

Most Java programmers cannot parse it offhand. A better language design
would be to put "+", "&" and most other operators at the same precedence
level, so that programmers only need to parse from left to right.
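For reference, a minimal sketch of how Java groups such an expression today. Additive operators bind tighter than bitwise `&`, so `1 + 2 & 3` is read as `(1 + 2) & 3`. That particular expression happens to evaluate to 3 under either grouping, so the sketch uses `1 + 2 & 4` instead (my own choice of operands, not from the thread) to make the difference visible. The class and method names are made up for illustration.

```java
// Illustrative only: class and method names are hypothetical.
class PrecedenceDemo {
    // Java parses this as (1 + 2) & 4, because '+' binds tighter than '&'.
    static int unparenthesized() {
        return 1 + 2 & 4;
    }

    // The grouping Java actually uses: 3 & 4 == 0.
    static int additiveFirst() {
        return (1 + 2) & 4;
    }

    // The other possible reading: 1 + (2 & 4) == 1 + 0 == 1.
    static int bitwiseFirst() {
        return 1 + (2 & 4);
    }

    public static void main(String[] args) {
        System.out.println(unparenthesized()); // 0
        System.out.println(additiveFirst());   // 0
        System.out.println(bitwiseFirst());    // 1
    }
}
```

The two readings disagree, which is the kind of thing a reader has to know the precedence table to predict.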

>
> Don't they?
>
> The rules of the game here have always been that you look at the RHS first,
> you get a type for it, and then check compatibility against the LHS, which I
> think is how 99.9% of imperative languages work.
>
> Another example is with method selection:
>
>     m(expr)

(0)
If the signature of `m` is yet to be determined because of overloading
and/or free type variables, and the type of `expr` is determined free of
context, we use the type of `expr` to determine the signature of `m`.
This is child-to-parent, not left-to-right (or top-to-bottom) parsing.
I guess we are OK with it because (i) `m` and `expr` are very close, and
(ii) multiple applicable `m` methods must all do roughly the same
thing, so it doesn't bother us very much that the exact signature of
`m` is uncertain when we scan up to `m(`.
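A small sketch of case (0), assuming three made-up overloads of `m` (none of these names come from the thread). The type of the argument expression is known without any context, and that type alone picks the overload:

```java
// Illustrative only: the overloads of m and the class name are hypothetical.
class ChildToParentDemo {
    static String m(int x)    { return "m(int)"; }
    static String m(long x)   { return "m(long)"; }
    static String m(String x) { return "m(String)"; }

    // 1 + 2 has type int, so m(int) is selected.
    static String callWithInt() {
        return m(1 + 2);
    }

    // 1L + 2 has type long, so m(long) is selected.
    static String callWithLong() {
        return m(1L + 2);
    }

    // "1" + 2 is string concatenation, so m(String) is selected.
    static String callWithString() {
        return m("1" + 2);
    }

    public static void main(String[] args) {
        System.out.println(callWithInt());    // m(int)
        System.out.println(callWithLong());   // m(long)
        System.out.println(callWithString()); // m(String)
    }
}
```

In each call the reader has to type the argument first and only then learns which `m` is meant, which is the child-to-parent direction described above.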

Let's also look at the other three cases:

(1)
Signature of `m` is determined free of context (no overloading or free
type variables); type of `expr` is also determined free of context. No
problem here; neither puts a constraint on the other. However, if the
argument type doesn't match the parameter type, there are two ways to
report the conflict:
    left-to-right: "method m(X) cannot be applied to Y"
    child-to-parent: "there is no such method m(Y)"
I think the left-to-right version sounds more natural.

(2)
Signature of `m` is determined; type of `expr` is to be determined.
Should we use the parameter type to constrain the argument type? Why
not? I've seen many questions of this nature on stackoverflow.com. It
is very surprising to programmers that the compiler cannot take
advantage of its knowledge of `m` to infer `expr`. It seems very
reasonable to think that the method invocation is equivalent to
    X x = expr;
    m(x);
I think this is definitely a deficiency of the language.
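As a hedged illustration of case (2), here is a sketch using a lambda as the argument, which is one place where released Java 8 does let the parameter type of a non-overloaded, non-generic method type the argument. The method `measure` and the class name are hypothetical; `java.util.function.ToIntFunction` is the real JDK interface.

```java
import java.util.function.ToIntFunction;

// Illustrative only: measure(...) and the class name are hypothetical.
class TargetTypedArgumentDemo {
    // Not overloaded, not generic: the signature is known before the argument is examined.
    static int measure(ToIntFunction<String> f, String input) {
        return f.applyAsInt(input);
    }

    // The parameter type ToIntFunction<String> tells the compiler that s is a String.
    static int direct() {
        return measure(s -> s.length(), "lambda");
    }

    // The "X x = expr; m(x);" form: the local variable's declared type plays the same role.
    static int viaLocal() {
        ToIntFunction<String> x = s -> s.length();
        return measure(x, "lambda");
    }

    public static void main(String[] args) {
        System.out.println(direct());   // 6
        System.out.println(viaLocal()); // 6
    }
}
```

Both forms type-check the same way here, which matches the intuition that the invocation and the two-statement version ought to be equivalent.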

(3)
Neither is determined. I believe no inference should be done; it's
too magical, and it will be hard for humans to understand. The programmer
needs to provide type info for at least one of them.
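A sketch of what "providing type info" can look like in practice. It uses `Comparator.comparingInt` from the released Java 8 API rather than the overloaded `comparing` under discussion in this thread, so it only illustrates where the programmer can write the type down, not how the overloaded case resolves. The class and method names are hypothetical.

```java
import java.util.Comparator;

// Illustrative only: the class and method names are hypothetical.
class ProvideTypeInfoDemo {
    // Type info comes from the assignment target: T = String.
    static Comparator<String> byTarget() {
        Comparator<String> c = Comparator.comparingInt(s -> s.length());
        return c;
    }

    // Type info comes from the lambda itself (explicitly typed parameter).
    static Comparator<String> byExplicitLambda() {
        return Comparator.comparingInt((String s) -> s.length());
    }

    // Type info comes from an explicit type argument on the method call.
    static Comparator<String> byTypeArgument() {
        return Comparator.<String>comparingInt(s -> s.length());
    }

    public static void main(String[] args) {
        System.out.println(byTarget().compare("a", "bb") < 0);         // true
        System.out.println(byExplicitLambda().compare("a", "bb") < 0); // true
        System.out.println(byTypeArgument().compare("a", "bb") < 0);   // true
    }
}
```

The three variants build the same comparator; they differ only in where the `String` is spelled out.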

I'm not in favor of lavish type inference. The main point of static
typing is that we *want* to write down the types explicitly. We may
wish to eliminate some apparently redundant type info in close
vicinity, but it's not our general intention to reduce explicit type
info in our code.

Zhong Yu

>
> In most languages you need to know what 'expr' is before you can look at
> what m is, even though this order of operations goes 'against' the human
> parser's abilities.
>
> If you change the rules of the game and make overload selection depend on
> the target type, then understanding the meaning of expressions like
>
>     g(f(m()))
>
> will be much harder for humans, compilers and IDEs alike.
>
> Maurizio
>
>
>>
>> Zhong Yu
>>
>>
>>> done.  In other words, there is no context-independent way to
>>> figure out that 'Comparator.comparing(s->s.length())' is best treated
>>> as an invocation of 'comparing(ToIntFunction)'.
>>>
>>> —Dan
>>>
>>> On Jul 25, 2013, at 1:40 PM, Sam Pullara <spullara at gmail.com> wrote:
>>>
>>>> Bummer. That is going to be surprising for people, especially with
>>>> comparators.
>>>>
>>>> Sam
>>>>
>>>> On Jul 25, 2013, at 12:37 PM, Maurizio Cimadamore
>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>
>>>>> On 25/07/13 20:25, Maurizio Cimadamore wrote:
>>>>>>
>>>>>> On 25/07/13 20:22, Maurizio Cimadamore wrote:
>>>>>>>
>>>>>>> On 25/07/13 19:15, Sam Pullara wrote:
>>>>>>>>
>>>>>>>> I just did a build this morning and still see these fail:
>>>>>>>>
>>>>>>>>     Comparator<String> comparator = Comparator.comparing(String::length);
>>>>>>>>     Comparator<String> comparator = Comparator.comparing(s -> s.length());
>>>>>>>>
>>>>>>>> Are they still supposed to result in:
>>>>>>>>
>>>>>>>> java: reference to comparing is ambiguous
>>>>>>>>     both method <T>comparing(java.util.function.ToLongFunction<?
>>>>>>>> super T>) in java.util.Comparator and method
>>>>>>>> <T>comparing(java.util.function.ToDoubleFunction<? super T>) in
>>>>>>>> java.util.Comparator match
>>>>>>>
>>>>>>> Possible that you didn't pick up the changes?
>>>>>>
>>>>>> The first works for me (haven't tried the second, but it should be OK).
>>>>>
>>>>> Sorry - was doing the wrong experiment; the first one WILL work - but
>>>>> not the second; the problem is that the signature:
>>>>>
>>>>> public static <T> Comparator<T> comparing(ToIntFunction<? super T>
>>>>> keyExtractor) {
>>>>>
>>>>> doesn't play by the rules illustrated in my email (T is included in the
>>>>> return type).
>>>>>
>>>>> Maurizio
>>>>>>
>>>>>> Maurizio
>>>>>>>
>>>>>>> Maurizio
>>>>>>>>
>>>>>>>> Sam
>>>>>>>>
>>>>>>>> On Jul 25, 2013, at 9:47 AM, Maurizio Cimadamore
>>>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>>>>
>>>>>>>>> Dear lambdians,
>>>>>>>>> I've just pushed a patch [1] that enhances javac overload
>>>>>>>>> resolution/most specific story in several ways. One of the most
>>>>>>>>> notable effects is the removal of the dreaded 'inference loop'
>>>>>>>>> message [2]. The logic behind that message was noble: at the time
>>>>>>>>> we thought it would have been better to report an error when the
>>>>>>>>> compiler was forced to infer a variable to some 'default'
>>>>>>>>> instantiation (such as j.l.Object), as this could cause severe
>>>>>>>>> downstream problems when type-checking a lambda whose body
>>>>>>>>> depended on that choice. However, this was before we added the
>>>>>>>>> more complex inference support; now that we have a more capable
>>>>>>>>> inference engine, with all bells and whistles, we also have a
>>>>>>>>> bigger degree of complexity and, because inference constraints are
>>>>>>>>> propagated transitively, the line between default and non-default
>>>>>>>>> instantiation has become a lot fuzzier than it used to be. Hence
>>>>>>>>> the decision of getting rid of that logic (and related error
>>>>>>>>> message) - which also makes the language more consistent (as
>>>>>>>>> inference typically only gave such errors when lambdas/method
>>>>>>>>> references are present).
>>>>>>>>>
>>>>>>>>> Other improvements are in sight for the structural most specific
>>>>>>>>> logic; many of you [3,4,5] have reported cases in which the
>>>>>>>>> compiler was unable to distinguish between several signatures,
>>>>>>>>> where the 'right' choice seemed really easy. The problem in those
>>>>>>>>> cases was that the structural most specific check would only kick
>>>>>>>>> in if the compiler could prove that all target types agreed on the
>>>>>>>>> parameter types to be inferred for, e.g., an implicit lambda. In
>>>>>>>>> certain cases (when generic methods were used), the compiler
>>>>>>>>> couldn't do that, so it basically went back to the old most
>>>>>>>>> specific logic. Here's an example:
>>>>>>>>>
>>>>>>>>> <T,R> Stream<R> map(Stream<T> s, Function<T,R> f) { }
>>>>>>>>> <T> IntStream map(Stream<T> s, ToIntFunction<T> f) { }
>>>>>>>>>
>>>>>>>>> map(ss, s->s.length()); //now ok - used to be ambiguous
>>>>>>>>>
>>>>>>>>> There are rules to this game though; if the variable that must be
>>>>>>>>> inferred in order to type-check the lambda (T in the above
>>>>>>>>> example) happens to depend on one of the inference variables
>>>>>>>>> mentioned in the method return type, the most specific check
>>>>>>>>> fails and the compiler again reports an ambiguity. The reason for
>>>>>>>>> this is that it's not possible to guarantee that the eager
>>>>>>>>> instantiation of T would remain the same after looking at the
>>>>>>>>> target type (and we want overload selection to be independent
>>>>>>>>> from the target type, as we believe it's crucial to keep the
>>>>>>>>> model tractable for developers).
>>>>>>>>>
>>>>>>>>> The last improvement is related to the way in which method
>>>>>>>>> arguments are type-checked; javac is now able to reason about the
>>>>>>>>> subtle dependencies that arise when a lambda is passed as an
>>>>>>>>> argument to a generic method; in the above case for instance,
>>>>>>>>> javac will detect that there's a dependency between T and R in
>>>>>>>>> the first method. In fact, if we had an instantiation for T, we
>>>>>>>>> would then be able to type-check the lambda and we will most
>>>>>>>>> likely be able to derive new constraints for R. So, it would be
>>>>>>>>> mad for the compiler to go and try to infer R _before_ looking at
>>>>>>>>> the lambda expression.
>>>>>>>>>
>>>>>>>>> I think those improvements go a long way in terms of polishing the
>>>>>>>>> overall overload resolution story that the language presents to
>>>>>>>>> developers; they get rid of several outstanding issues, and make
>>>>>>>>> the whole overload selection process more streamlined and
>>>>>>>>> consistent. I'm looking forward to hearing your feedback (and bug
>>>>>>>>> reports :-)) as you start using the next promoted lambda bits.
>>>>>>>>>
>>>>>>>>> Enjoy the ride!
>>>>>>>>>
>>>>>>>>> [1] - http://hg.openjdk.java.net/lambda/lambda/langtools/rev/d34073d069c8
>>>>>>>>> [2] - http://mail.openjdk.java.net/pipermail/lambda-dev/2013-July/010352.html
>>>>>>>>> [3] - http://mail.openjdk.java.net/pipermail/lambda-dev/2013-July/010476.html
>>>>>>>>> [4] - http://mail.openjdk.java.net/pipermail/lambda-dev/2013-June/010088.html
>>>>>>>>> [5] - http://mail.openjdk.java.net/pipermail/lambda-dev/2013-July/010590.html
>>>>>>>>>
>>>>>>>>> Maurizio
>>>>>>>>>
>>>>
>>>
>