Deconstructor (and pattern) overload selection

Wed Apr 3 12:15:41 UTC 2024

Hello, 
I would be even more brutal here, because I think *alternative representations* are better served by factory methods than by constructors.

In the same way, in terms of deconstruction, a named pattern method is better than a deconstructor for *alternative representations*.
So instead of one deconstructor for (A,B) and another one for (X,Y), I think we should steer users toward two methods, asAB() and asXY().
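
As a rough illustration of the asAB()/asXY() idea, here is a sketch that emulates named views with today's records and record patterns (Java 21 or later), since pattern methods do not exist yet. The names Point, Cartesian, Polar, ofCartesian, ofPolar, asCartesian and asPolar are all invented for this example; they stand in for the (A,B) and (X,Y) representations and are not part of any proposal.

```java
// Hypothetical sketch: every name here is invented for illustration.
final class NamedViewsDemo {
    // One record per alternative representation.
    record Cartesian(double x, double y) {}
    record Polar(double r, double theta) {}

    static final class Point {
        private final double x;
        private final double y;

        private Point(double x, double y) {
            this.x = x;
            this.y = y;
        }

        // Factory methods name the representation instead of overloading a constructor.
        static Point ofCartesian(double x, double y) {
            return new Point(x, y);
        }

        static Point ofPolar(double r, double theta) {
            return new Point(r * Math.cos(theta), r * Math.sin(theta));
        }

        // Named views play the role of asAB() and asXY().
        Cartesian asCartesian() {
            return new Cartesian(x, y);
        }

        Polar asPolar() {
            return new Polar(Math.hypot(x, y), Math.atan2(y, x));
        }
    }

    // The use site picks a representation by name, so no overload selection is involved.
    static String describe(Point p) {
        if (p.asCartesian() instanceof Cartesian(double x, double y)
                && p.asPolar() instanceof Polar(double r, double theta)) {
            return "x=" + x + ", y=" + y + ", r=" + r;
        }
        return "unmatched";
    }

    public static void main(String[] args) {
        System.out.println(describe(Point.ofCartesian(3, 4)));
    }
}
```

Both views here have the shape (double, double), so two overloaded deconstructors could not be told apart by arity or by type; the names do the disambiguation.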

For me, it is enough, at least for a first preview, to disambiguate only on the deconstructor arity.

Note that the deconstructor arity is important because adding a deconstructor with a supplementary binding is conceptually equivalent to adding a getter to a class;
we need that to be able to enhance a class in a backward-compatible way.

regards, 
Rémi 

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Monday, April 1, 2024 6:34:49 PM
> Subject: Deconstructor (and pattern) overload selection

> The next big pattern matching JEP will be about deconstruction patterns. (Static
> and instance patterns will likely come separately.) Now that we've got the
> bikeshed painting underway, there are a few other loose ends here, and one of
> them is overload selection.

> We explored taking the existing overload selection algorithm and turning it
> inside out, but after going down that road a bit, I think this is both
> unnecessarily complex for not enough value, and also potentially
> fraught with nasty corner cases. I think there is a much simpler answer here
> which is entirely good enough.

> First, let's remind ourselves, why do we have constructor overloading in the
> first place? There are three main reasons:

> - Concision. If a fully-general constructor takes many parameters, but not all
> are essential to the use case, then the construction site becomes a site of
> accidental complexity. Being able to handle common groupings of parameters
> simplifies use sites.

> - Flexibility. Related to the above, not only might the user not need to specify
> a given constructor parameter, but they want the flexibility of saying "let the
> implementation pick the best value". Constructors with fewer parameters reserve
> more flexibility for the implementation.

> - Alternative representations. Some objects may take multiple representations as
> input, such as accepting a Date, a LocalDate, or a LocalDateTime.

> The first two cases are generally handled with "telescoping constructor nests",
> where we have:

> Foo(A a)
> Foo(A a, B b)
> Foo(A a, B b, C c, D d)

> Sometimes the telescopes don't fold perfectly, and become "trees":

> Foo(A a)
> Foo(A a, B b)
> Foo(A a, C c, D d)
> Foo(A a, B b, C c, D d)

> Which constructors to include are subjective judgments on the part of class
> authors to find good tradeoffs between code size and concision/flexibility.

> We had initially assumed that each constructor overload would have a
> corresponding deconstructor, but further experimentation suggests this is not
> an ideal assumption.

> Clue One that it is not a good assumption comes from the asymmetry between
> constructors and deconstructors; if we have constructors and deconstructors of
> shape C(List), then it is OK to invoke C's constructor with List or its
> subtypes, but we can invoke C's deconstructor with List or its subtypes or its
> supertypes.

> Clue Two is that applicability for constructors is based on method invocation
> context, but applicability for deconstructors is based on cast context, which
> has different rules. It seems unlikely that we will ever get symmetry given
> this.

> The "Flexibility" requirement does not really apply to deconstructors; having a
> deconstructor that accepts additional bindings does not constrain anything, not
> in the same way as a constructor taking needlessly specific arguments. Imagine
> if ArrayList had only constructors that take int (for array capacity); this is
> terrible for the constructor, because it forces a resource management decision
> onto users who will not likely make a very good decision, and one that is hard
> to change later, but pretty much harmless for deconstructors.

> The "Concision" requirement does not really apply as much to deconstructors as
> constructors; matching with `Foo(var a, _, _)` is not nearly as painful as
> invoking with lots of parameters, each of which requires an explicit choice by
> the user.

> So the main reason for overloading deconstructors is to match representations
> with the constructor overloads -- but with a given "representation set", there
> probably do not need to be as many deconstructors as constructors. What we
> really need is to match the "maximal" constructor in a telescoping nest with a
> corresponding deconstructor, or for a tree-shaped set, one for each "maximal"
> representation.

> So for a class with constructors

> Foo()
> Foo(A a)
> Foo(A a, B b)
> Foo(X x)
> Foo(X x, Y y)

> we would want dtors for (A,B) and (X,Y), but don't really need the others.

> So, let's start fresh on overload selection. Deconstructors have a set of
> applicability rules based on arity first (eventually, varargs, but not yet) and
> then on applicability of type patterns, which is in turn rooted in castability.
> Because we don't have the compatibility problem introduced by autoboxing, we
> can ignore the distinction between phase 1 and 2 of overload selection (we will
> have this problem with varargs later, though.)

> Given this, the main question we have to resolve is to what degree -- if any --
> we may deem one overload "more applicable" than others. I think there is one
> rule here that is forced: an exact type match (modulo erasure) is more
> applicable than an inexact type match. So given:

> D(Object o)
> D(String s)

> then

> case D(String s)

> should choose the latter. This allows the client to (mostly) steer to a specific
> overload just by using the right types (rather than `var` or a subtype.) It is
> not clear to me whether we need anything more here; in the event of ambiguity,
> a client can pick the right overload with the right type patterns. (Nested
> patterns may need to be manually unrolled to subsequent clauses in some cases.)

> So basically (on a per-binding basis): an exact match is more applicable than an
> inexact match, and ... that's it. Users can steer towards a particular overload
> by selecting exact matches on enough bindings. Libraries can provide their own
> "joins" if they want to disambiguate problematic overloads like:

> D(Object o, String s)
> D(String s, Object o)
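
The rule quoted above (arity first, then per-binding exact match beats inexact match) can be modeled with a small sketch. This is only one possible reading of that rule, not the proposed specification: the names DtorSelectionSketch, Dtor, applicable, atLeastAsExact and select are invented, castability is crudely approximated by "one type is assignable to the other", erasure is ignored, and `var` patterns and varargs are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical model of deconstructor overload selection; all names are invented.
final class DtorSelectionSketch {
    // A deconstructor is represented only by the types of its bindings.
    record Dtor(List<Class<?>> bindings) {
        static Dtor of(Class<?>... types) {
            return new Dtor(List.of(types));
        }
    }

    // Applicable: same arity, and each pattern type is cast-compatible with the binding.
    // ASSUMPTION: castability is approximated as assignability in either direction.
    static boolean applicable(Dtor d, List<Class<?>> pattern) {
        if (d.bindings().size() != pattern.size()) {
            return false;
        }
        for (int i = 0; i < pattern.size(); i++) {
            Class<?> binding = d.bindings().get(i);
            Class<?> type = pattern.get(i);
            if (!binding.isAssignableFrom(type) && !type.isAssignableFrom(binding)) {
                return false;
            }
        }
        return true;
    }

    // a is at least as exact as b if a matches exactly on every binding where b does.
    static boolean atLeastAsExact(Dtor a, Dtor b, List<Class<?>> pattern) {
        for (int i = 0; i < pattern.size(); i++) {
            boolean aExact = a.bindings().get(i) == pattern.get(i);
            boolean bExact = b.bindings().get(i) == pattern.get(i);
            if (bExact && !aExact) {
                return false;
            }
        }
        return true;
    }

    // Returns the single candidate that is strictly more exact than every other
    // applicable candidate, or empty if there is none (no match, or ambiguity).
    static Optional<Dtor> select(List<Dtor> candidates, List<Class<?>> pattern) {
        List<Dtor> usable = new ArrayList<>();
        for (Dtor d : candidates) {
            if (applicable(d, pattern)) {
                usable.add(d);
            }
        }
        for (int i = 0; i < usable.size(); i++) {
            boolean best = true;
            for (int j = 0; j < usable.size() && best; j++) {
                if (i == j) {
                    continue;
                }
                Dtor d = usable.get(i);
                Dtor other = usable.get(j);
                best = atLeastAsExact(d, other, pattern) && !atLeastAsExact(other, d, pattern);
            }
            if (best) {
                return Optional.of(usable.get(i));
            }
        }
        return Optional.empty();
    }
}
```

Under these assumptions, the pattern types (String) against D(Object) and D(String) select D(String), while (String, String) against D(Object, String) and D(String, Object) select nothing, because each candidate is exact on a different binding. Writing (String, Object) steers to the second overload.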