Overload resolution simplification

Sat Aug 31 06:01:08 PDT 2013

+1

On Sat, Aug 31, 2013 at 3:42 PM, Remi Forax <forax at univ-mlv.fr> wrote:

> I disagree that the actual overload resolution is independent on the
> context,
> the choice of the most specific method is but the choice of applicable
> methods is not independent of the context.
>
> so when there are several overloads and an implicitly typed lambda,
> we can first try to select a subset of all overloads (like we currently
> select applicable methods) and then
> use the fact that the inferred signature must be the same for all
> applicable overloads.
>
> if we do that, it think it solves the problem we have with comparing().
>
> cheers,
> Rémi
>
>
> On 08/26/2013 11:54 PM, Dan Smith wrote:
>
>> Recall that I outlined a simplification to overload resolution a few
>> weeks ago.  There has been some useful discussion about the implications;
>> I'd like now to finalize the decision.  I do not sense a general pushback
>> from EG members, but please speak up if at this point you're uncomfortable
>> with following the approach I presented.
>>
>> There's been a long discussion on the lambda-spec-observers list related
>> to this, which Maurizio has been kind enough to provide an expert voice in.
>>  Those who dislike the proposed overloading behavior generally express
>> discomfort with some of the compiler's limitations:
>> 1) can't disambiguate based on parameter usage in the lambda body
>> 2) can't see "obvious" information from return context and check lambdas
>> early
>> 3) can't see "obvious" matching parameter types and check lambdas early
>>
>> Of course, there are trade-offs between simplicity and expressiveness,
>> and the proposal sides with simplicity in each case.  We can examine them
>> more closely:
>>
>> 1) I've gotten pretty definitive feedback from the EG that we do not want
>> overload resolution to depend on how a lambda parameter is (or is not) used
>> in a lambda body.  Too many Bad Things happen: simple refactorings can
>> break things, users struggle to understand multiple typings of the same
>> expression, lots of technical problems.
>>
>> 2) The canonical example here is Comparators.comparing.  Java has always
>> made overload resolution choices independent of context, and we do not
>> propose to change that basic design principle now.  Thus, there is simply
>> no way to choose between overloads when the only clue for disambiguating is
>> buried in an implicit lambda body that can't yet be type-checked.  (To be
>> clear: _this is not new_.  The overloading limitation has always been
>> there, and no concrete proposal explored by this JSR has done away with it.)
>>
>> 3) This one is more subtle.  The idea is that, in principle, we could
>> carve out exceptions for (1) and (2), but still rely on implicit lambdas
>> for overload disambiguation if neither one applies.  I proposed an analysis
>> at the August 1 meeting that would do this: carving out (1) by raising an
>> error if an attempt was made to type the same lambda with multiple sets of
>> parameter types, and carving out (2) by distinguishing between
>> invocation-context-dependent lambdas and invocation-context-independent
>> lambdas.  The feedback I got is that we needed something simpler.  Hence,
>> the simplified proposal that defers all implicit lambda typing until after
>> overload resolution.
>>
>> Is the sacrifice of expressiveness important?  Zhong Yu described three
>> reasonable overloading patterns, to give a sense of what is being lost [1]:
>>
>>  1. primitive vs boxed
>>>
>>>     map( T -> Integer )
>>>     map( T -> int )
>>>
>>> 2. flat map
>>>
>>>     Future<T>
>>>         Future<S> then( T -> S );
>>>         Future<S> then( T -> Future<S> );
>>>
>>> 3. void vs Void
>>>
>>>         Future<S> then( T -> S ); // where S=Void
>>>         Future<Void> then( T->void );
>>>     // in the 2nd version, the lambda body does not need to return
>>> (Void)null
>>>
>> I argued in my initial mail that our experience with the Stream APIs
>> suggests to us that, even without the language restriction, it's often
>> helpful to provide disambiguating names anyway.  Otherwise, the
>> distinctions might be too subtle.  (For example, we renamed 'map' to
>> 'mapToInt', etc., _without_ being required to do so by the compiler, in
>> part because it provided useful information about otherwise-subtle
>> behavior.)
>>
>> We also found in the Stream API that deceptively-similar patterns can
>> arise in cases in which (1) or (2) apply, and so i) it's not obvious when
>> overloaded declarations like this are "okay", and ii) it's hard to avoid
>> sometimes having to use a set of distinct names for a (conceptually) single
>> overloaded operation, leading to naming inconsistencies.  (For example,
>> 'flatMap' and 'comparing' had to be renamed.  And then it seemed silly to
>> have a 'map' that didn't match 'flatMap'.)
>>
>> One suggestion in lambda-spec-observers was that we could support
>> examples like these three, while avoiding some of the complexity of a more
>> general approach to (3), by checking implicit lambdas "early" if all
>> remaining overloads agree on the parameter types for the lambdas (and never
>> when inference variables appear in parameter types).  While reducing some
>> complexity, this still has the two problems I described in the previous
>> paragraph.
>>
>> Maurizio (Aug 19) had some additional critiques [2]:
>>
>>  the restriction that I've seen popping up more frequently is: only do
>>> type-checking if _all_ overloads agree on the implicit lambda parameter
>>> types, correct? There would be no combinatorial explosion then.
>>>
>>> This is generally a good strategy (in fact the one I've been proposing
>>> since the beginning as a possible incremental update). But it's not
>>> issue free. It requires global reasoning for the user. To answer the
>>> question: will this lambda be checked during overload (so that most
>>> specific will get better treatment)? You really have to work hard, look
>>> at all overloads, and see if they agree on the lambda parameter type. As
>>> I was trying to explain last week, this is way harder than it looks on
>>> the surface, as those target types can be nested within other generic
>>> types, type-variable declarations can be subtly swapped, you can have
>>> wildcards which require their own handling. All this _will_ make things
>>> more complex for the user (in the current model a lambda is type-checked
>>> during overload only if it has explicit parameters - rather easy to see,
>>> no?).
>>>
>>> Another problem is that the approach will create an asymmetry between
>>> generic and non-generic method overloads
>>>
>> ---
>>
>> Last, there's been some confusion about how method references fit into
>> this story, so I thought I'd try to clarify.
>>
>> We put method references in two categories: "exact" and "inexact" (trying
>> out this terminology; feel free to suggest alternatives).
>>
>> Exact method references always refer to the same method, with the same
>> arity and parameter/return types, independent of context.  To meet this
>> standard, they must not be:
>> - Overloaded (some other method exists in the same type with the same
>> name)
>> - Generic (declares type parameters)
>> - Varargs
>>
>> (This design is informed by the fact that a high percentage of method
>> declarations meet this standard.  The remaining cases might eventually be
>> handled with some combination of i) stronger inference, and/or ii) explicit
>> parameter type syntax.)
>>
>> (A method reference like Foo::m is a special case -- it could be an
>> unbound instance method reference or a static method reference; but since
>> there's only one method, we can determine the arity of this reference in a
>> context-independent way by seeing whether the method is declared static or
>> not.)
>>
>> Exact method references are analogous to explicit lambdas: the "meaning"
>> of the expression is context-independent, even though its type (is it a
>> Function? a Predicate?) is not.
>>
>> During overload resolution, exact method references can be used to
>> disambiguate, like explicit lambdas.  Inference constraints can be produced
>> from the referenced method's parameter and return types.  Most-specific
>> logic that prefers ToIntFunction<Foo> over Function<Foo,Object> can be
>> employed.
>>
>> Inexact method references behave like implicit lambdas.  The only thing
>> checked during overload resolution is that they have the right "shape" (an
>> arity check -- but note that, unlike implicit lambdas, an inexact method
>> reference can support multiple arities, so we test whether any possible
>> referenced declaration has an appropriate arity).  If an inexact method
>> reference is passed as an argument to an overloaded method where multiple
>> targeted functional interfaces have a compatible arity, an ambiguity
>> generally occurs, unless some other invocation argument disambiguates.  The
>> "meaning" of the invocation, like the meaning of a lambda body, remains
>> unknown until a set of concrete parameter types can be provided by a target
>> type.
>>
>> While we don't have the benefit of two different syntaxes to visually
>> distinguish the two forms, the goal is for the same rules that apply to
>> implicit/explicit lambdas to also apply to method references.  Hopefully
>> this makes reasoning about overloading behavior in the presence of method
>> references tractable for programmers.
>>
>> ---
>>
>> Conclusion: the discussion (and time to mull over and experiment with
>> things) has not dissuaded us Oracle folks from thinking the proposed path
>> is a good idea.  Nor do I sense substantial pushback from the EG, while at
>> the same time this solves some of the hard complexity problems that the EG
>> was uncomfortable with.  The prototype (in the OpenJDK Lambda repository)
>> seems to be working.  We have a much cleaner story to tell users.  And,
>> finally, this conservative path leaves us room to change our mind and add
>> extra power in a later version, if needed.  So it looks like all signs
>> point to adopting this plan.
>>
>> Please chime in if you feel like there's anything we've overlooked...
>>
>> —Dan
>>
>> [1] http://mail.openjdk.java.net/**pipermail/lambda-spec-**
>> observers/2013-August/000422.**html<http://mail.openjdk.java.net/pipermail/lambda-spec-observers/2013-August/000422.html>
>> [2] http://mail.openjdk.java.net/**pipermail/lambda-spec-**
>> observers/2013-August/000474.**html<http://mail.openjdk.java.net/pipermail/lambda-spec-observers/2013-August/000474.html>
>>
>> On Aug 8, 2013, at 6:19 PM, Dan Smith <daniel.smith at oracle.com> wrote:
>>
>>  We spent some time at the EG meeting last week talking about the
>>> overload resolution story in the presence of lambdas/method references
>>> (and, why not, type argument inference).  There are a lot of tricky
>>> dependencies here, and the goal is to find a balance between expressivity
>>> and simplicity.
>>>
>>> The sense I got from the meeting is that, despite our efforts to refine
>>> the story (there have been a few iterations), we're still not there yet in
>>> terms of simplicity.  In particular, I think what's crucial about the model
>>> I presented is that users can identify the difference between implicit
>>> lambdas that get type checked pre-overload-resolution and
>>> post-overload-resolution; the sanity check I got is that nobody will be
>>> able to make that distinction.
>>>
>>> A couple of days later, Maurizio pointed out that, as we've iterated on
>>> our libraries, we've largely abandoned the space of programs that requires
>>> some of the more complex overload disambiguation machinery.  And looking
>>> more closely at those use cases, we agreed that we've probably been
>>> focusing too much on some atypical patterns.
>>>
>>> So, let me propose a greatly simplified but probably not-very-noticeably
>>> less expressive approach:
>>>
>>> Overload resolution will only check the arity of all implicit lambdas
>>> and will ignore overloaded method references.  If the body of a lambda is
>>> important for disambiguation, it must have explicit parameter types.
>>>
>>> Benefits of this approach:
>>> - Very easy to understand -- it's mostly a syntactic distinction
>>> - Consistent between all different patterns of overloading that were
>>> previously treated differently
>>> - Facilitates a simple declaration-site warning check when method
>>> signatures conflict
>>> - Encourages use of explicit lambdas -- clearly acknowledges that we
>>> can't solve all inference problems with implicit lambdas
>>> - Avoids re-checking lambdas with different parameter types which means:
>>> -- Typing of lambda bodies is easier for users to process
>>> -- Implementations don't have to do speculative checking of arbitrary
>>> blocks of code
>>> -- Bad theoretical complexity goes away
>>>
>>> We've thought about it for a few days and think this is a much better
>>> scenario for users and more in line with the EG's expectations (based on
>>> feedback both this year and last).
>>>
>>> Any questions/concerns?
>>>
>>> ---
>>>
>>> Here's an example of something we would stop disambiguating:
>>>
>>> interface I<T> {
>>> <R> R map(Function<T,R> f);
>>> int map(ToIntFunction<T> f);
>>> long map(ToLongFunction<T> f);
>>> double map(ToDoubleFunction<T> f);
>>> }
>>>
>>> someIofString.map(s -> s.length());
>>>
>>> Declaration-site workaround: rename the methods.
>>>
>>> Use-site workaround: explicit parameter type:
>>> someIofString.map((String s) -> s.length());
>>>
>>> ---
>>>
>>> Here's an example of something else we would stop disambiguating:
>>>
>>> static void m(Function<String,Integer> f);
>>> static void m(ToIntFunction<Object> f);
>>>
>>> m(x -> x.length() > 10 ? 5 : 10);
>>>
>>> ---
>>>
>>> And here's something that we never could disambiguate in the first place
>>> (due to fundamental design constraints):
>>>
>>> interface Comparators {
>>> <T, R extends Comparable<? super R>> Comparator<T> comparing(Function<T,
>>> R> f);
>>> <T> Comparator<T> comparing(ToIntFunction<T> f);
>>> }
>>>
>>> Comparator<String> cs = Comparators.comparing(s -> -s.length());
>>>
>>> ---
>>>
>>> —Dan
>>>
>>
>