Pattern method names based on existing library conventions [was Re: Pattern matching for nested List with additional size constraints]

Thu Jan 28 16:40:00 UTC 2021

This is all a great first attempt to navigate this new territory. Having 
spent a lot of time there already, there's some I agree with, and some 
for which I'd say "understandable first thought, but keep going."  It 
will surely take us a while to discover the ideal set of idioms for this 
new language, just as it did in the early days of Java.

If you don't read any further, the takeaway of all this is: patterns are 
not just "better getters", and our API design idioms should not merely 
be extrapolated from what worked with getters, but should reflect the 
richness of what patterns enable.  The old rules will encourage us 
towards some bad habits for the new world, and we should be on the lookout 
for that.

> On Wed, 27 Jan 2021 at 20:03, Brian Goetz <brian.goetz at oracle.com> wrote:
>>        if (p instanceof Profile(Rules(List.of(var rule, ...)))) {...}
> With my library design hat on, I don't think "of" is a suitable method
> name here.

There are surely lots of possible library idioms, and I'm sure we've not 
thought of all of them yet, so it's good to brainstorm about what some 
others might be.  For the record, here's the rationale behind this 
particular naming choice.

A key design goal for pattern matching in Java is that however you make 
an object, there is a complementary way to take it apart, and, absent a 
good reason to the contrary, that should be as syntactically and 
semantically similar as possible.

If we construct with constructor:

     Foo f = new Foo(a);

we can take apart with the same-descriptor deconstructor:

     if (f instanceof Foo(var a)) { ... }

If we construct with a factory:

     Foo f = Foo.of(a);

we can take apart with the same-named, same-descriptor deconstructor:

     if (f instanceof Foo.of(var a)) { ... }

or, perhaps a more evident example:

     Optional<String> a = Optional.of(x);
     Optional<String> b = Optional.empty();

     switch (a) {
         case Optional.of(var s): ...
         case Optional.empty(): ...
     }

In this case (heh), the library designer (hi!) chose to use factory 
methods _for the names_; the deconstruction benefits from the same thing.

(There are other forms of aggregation, which should have corresponding 
patterns -- array initializers and patterns, and eventually, if we ever 
have collection literals, we'll have corresponding collection patterns.)

A good intuition here when reading this might be: "if f could have come 
from invoking Foo.of(...), what parameters would have been provided?"
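
Static member patterns such as Optional.of(var x) cannot be compiled 
today, so what follows is only a rough approximation, assuming Java 21 
record patterns.  The names Opt, Some, Empty and OptDemo are made up for 
illustration and are not java.util.Optional; the point is only that the 
names chosen for the factories carry over to the deconstruction side.

```java
import java.util.Objects;

// Hypothetical stand-in for Optional; not the real java.util.Optional.
sealed interface Opt<T> permits Some, Empty {
    // Factories named after Optional.of / Optional.empty.
    static <T> Opt<T> of(T value) { return new Some<>(value); }
    static <T> Opt<T> empty() { return new Empty<>(); }
}

record Some<T>(T value) implements Opt<T> {
    Some { Objects.requireNonNull(value); }
}

record Empty<T>() implements Opt<T> {}

class OptDemo {
    // Takes an Opt apart; each case names the shape it matches.
    static String describe(Opt<String> o) {
        return switch (o) {
            case Some<String>(var s) -> "of(" + s + ")";
            case Empty<String>() -> "empty()";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(Opt.of("x")));   // prints of(x)
        System.out.println(describe(Opt.empty()));   // prints empty()
    }
}
```

Here the record names stand in for the factory names, which is the 
closest the current language gets to "same name in both directions."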

> 1) Factory methods that should be constructors
>
> Some factory methods exist simply because a constructor was not an
> acceptable alternative - LocalDate.of(int, int, int),
> Integer.valueOf(int) and List.of(T...) being three examples for
> different reasons. Given this, it seems desirable that the language
> should allow *some* pattern methods to have a use-site syntax that
> looks exactly like a deconstructor.
>
>    if (p instanceof Profile(Rules(List(var rule, ...)))) {...}
>    if (p instanceof LocalDate(var year, month, day)) { ..}
>    if (p instanceof Integer(var value)) { ..}
>    if (p instanceof Optional(var value)) { ..}
>    if (p instanceof Optional()) { ..}
>    for (Map.Entry(var key, var value) : map.entrySet()) {...}
>
> In all these situations, the binding is effectively taking the raw
> contents and binding them, as records do, and that is what the
> use-site syntax should be.

I think this is overstated, but there's a reasonable design option (for 
*some* classes) buried in here.  Whether you provide a constructor or 
factory is an API design choice; similarly, whether you provide a 
corresponding deconstructor or static member pattern is also an API 
design choice.   If you choose factories one way, but deconstructors the 
other, that's an asymmetry; there can be cases where such an asymmetry 
is justified, but in others will just be a gratuitous inconsistency, so 
there needs to be a corresponding benefit to make up for the asymmetry.

One of the benefits of factories over constructors is "could return a 
cached instance/subclass", which is not in play for deconstructors.  So 
if a class has _only_ one name of factory method (make, newFoo), I think 
the idiom of "deconstruct those with a deconstructor" might be a 
reasonable option, because deconstructors don't need this degree of 
flexibility, and the name chosen by the API designer isn't really adding 
any value, since they're all the same.  This goes especially so if the 
name is something like "make" rather than "of", since "of" works fine in 
both directions, whereas "make" does not.

On the other hand, for classes that have multiple names of factories, 
the story is very different.  For example, deconstructing Optional with 
a pair of deconstructors would be a bad choice.   The names in 
Optional.of / Optional.empty are *the whole point*, and add tremendously 
to the readability, so losing these is pretty bad (and the inconsistency 
of mechanism is just a dis-bonus.)  From a readability perspective, will 
it really be obvious that

     case Optional():

means "empty optional" rather than "yep, it's an optional"?  I don't 
think so -- we chose factories in part because names are useful. So, I 
think treating Optional this way would be a truly bad move.

If we're trying to extract rules, a candidate rule might be "if there's 
only likely to ever be one name of factory, and it is not suited for use 
in both directions, consider a deconstruction pattern instead."

> 2) Extract everything and Transform.
>
> This is where the inbound factory method performs a meaningful
> transformation, not just a straightforward assignment. A typical
> example is  `LocalDate.ofYearDay(int, int)` vs `LocalDate.of(int, int,
> int)`. The pattern method will typically always match, but it doesn't
> have to. Examples so far suggest these kind of transformation
> factories would look like:
>
>    if (p instanceof LocalDate.ofYearDay(var year, var dayOfYear)) { ..}
>
> But I'm not creating a LocalDate, I'm extracting it.

The pattern of mirroring the factory name depends a lot on how 
"bidirectional" the factory name is.  Some factory names are terrible 
this way ("make" or "newFoo"), and some are just fine ("of").  List.of() 
works well in both directions; "make me a list of these elements", vs 
"are you a list of things matching these patterns." Names like ofX or 
withX also are probably pretty good ("make me a Foo with mustard" / "are 
you a Foo with mustard").
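
To make the "both directions" reading concrete for LocalDate.ofYearDay, 
here is a small sketch in today's Java (assuming Java 21 record 
patterns).  YearDay and YearDayDemo are hypothetical names, not part of 
java.time; the record just carries the arguments that a call to 
LocalDate.ofYearDay would have needed to produce the given date.

```java
import java.time.LocalDate;

// Hypothetical carrier for the bindings an ofYearDay pattern would produce.
record YearDay(int year, int dayOfYear) {
    // Projection: which arguments to LocalDate.ofYearDay give this date?
    static YearDay from(LocalDate date) {
        return new YearDay(date.getYear(), date.getDayOfYear());
    }

    // Embedding: delegate to the existing factory.
    LocalDate toLocalDate() {
        return LocalDate.ofYearDay(year, dayOfYear);
    }
}

class YearDayDemo {
    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2021, 1, 28);
        if (YearDay.from(date) instanceof YearDay(var year, var dayOfYear)) {
            System.out.println(year + " day " + dayOfYear);   // prints 2021 day 28
            // Feeding the bindings back to the factory gives the original date.
            System.out.println(LocalDate.ofYearDay(year, dayOfYear).equals(date)); // prints true
        }
    }
}
```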

Of course, pre-existing APIs that were designed before we knew patterns 
were coming might not have such reversible names; that's to be 
expected.  I suspect that, now that patterns are part of the language, 
we'll give more thought to whether names are reversible and will 
gravitate towards those names.  But pre-existing APIs might have to be 
more creative with their inverses.

But, I think it's important to note that this is *a migration concern*, 
not a language design or "API style guide" issue; existing APIs that 
haven't left room for patterns (because, no one knew they were coming) 
should be, in the long run, the exception rather than the rule.  Let's 
not burden new APIs with compromises necessitated by old language 
limitations.  The old guidelines grew up relative to the old language; 
some of the considerations that were relevant in the old guidelines no 
longer apply.

A word of warning: the intuition of "I'm extracting, not creating" is 
probably pulling you in the wrong direction.  Of course it's true, but we 
will be much better off when we design APIs that reflect the inherent 
round-tripping enabled by matched ctor/dtor (or factory/static pattern) 
pairs.  To the extent we are building APIs that can be taken apart with 
patterns, we should be striving to build APIs that are "symmetric" and 
"transparent".

I get how it is super-tempting to think of patterns as just being 
"better getters", and it's true to a degree, but it is also missing a big 
part of the point.

<mathematical excursion>
This pairing appeals to the mathematical concepts of 
embedding-projection (alternately, adjunction).  The idea is that you 
have two domains A and B, and a pair of functions e : A->B and p: B->A.  
Embedding appeals to going from a "smaller" domain to a bigger one; 
projection is the opposite.  Things we know about the smaller domain are 
true in the bigger one, and, if they can be projected from the bigger to 
the smaller, the opposite is also true.  It's allowable for the 
projection to throw away information, but not the reverse.  So composing 
e-then-p is an identity, and composing p-then-e is a "normalization" (or 
may not map a value).

As an example, consider int and Integer.  The latter is "one bigger" 
(null).  The embedding of int into Integer is obvious: new Integer(i).  
The projection from Integer to int is similarly obvious: for non-null 
Integer, use Integer.intValue(), otherwise there's no mapping (NPE.)  
You can go int -> Integer -> int as many times as you like, and end up 
where you start; once you go one hop in the other direction and don't 
fall off (Integer -> int), you can continue riding the merry-go-round.  
And arithmetic is isomorphic in the two domains.

A richer example is the relationship between the pairs of integers (n, 
d) and a Rational class.  A well-behaved ctor/dtor pair (and similar for 
factory/pattern pair) forms an embedding-projection between these two 
domains; the Rational constructor might reject zero denominator, and 
might normalize numerator and denominator by factoring out GCD, but once 
this is done, the two domains ((n,d) pair, vs Rational object) behave 
identically and can be freely interconverted without further loss of 
information.  (Adhering to this behavior gives us, among other things, 
free marshalling to XML.)   This is a behavior worth aspiring to.
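
A minimal sketch of the Rational case, assuming a record with a 
normalizing compact constructor and Java 21 record patterns.  The field 
names, the gcd helper and RationalDemo are illustrative choices, and the 
sketch ignores overflow at Integer.MIN_VALUE.

```java
record Rational(int num, int den) {
    // Embedding from (n, d) pairs: reject d == 0, then normalize.
    Rational {
        if (den == 0) {
            throw new IllegalArgumentException("zero denominator");
        }
        int g = gcd(Math.abs(num), Math.abs(den));
        if (den < 0) {
            g = -g;            // keep the sign on the numerator
        }
        num /= g;
        den /= g;
    }

    private static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }
}

class RationalDemo {
    // Projection back to an (n, d) pair, followed by re-embedding.
    static Rational roundTrip(Object o) {
        if (o instanceof Rational(int n, int d)) {
            return new Rational(n, d);
        }
        throw new IllegalArgumentException("not a Rational");
    }

    public static void main(String[] args) {
        Rational r = new Rational(2, 4);              // normalized on the way in
        System.out.println(r);                        // prints Rational[num=1, den=2]
        System.out.println(roundTrip(r).equals(r));   // prints true
    }
}
```

The pair (2, 4) is normalized once on the way in; after that, taking 
the object apart and putting it back together changes nothing.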
</mathematical excursion>

> It seems to me that there is already a method prefix that already
> represents something like "extract and transform" - the method prefix
> "as". eg. "I want the LocalDate as a YearDay"

... in some particular APIs with some characteristics that already use 
certain naming conventions.   Surely let's identify suitable conversions 
for migrating those, but let's not extrapolate too much from the APIs we 
built before we knew where we were going.  This is old-think; treating 
patterns as merely "better getters."  They're more than that.

> 3) Check and Extract.
>
> This is like the `Rule.dynamic(var dc)` example in this thread, where
> there is not always a match. Here, there are two distinct parts to the
> problem - matching and extraction, but the existing proposed syntax
> merges those two parts:
>
>    if (p instanceof Map.withMapping(key)(var value)) {...}
>
> Again, I don't think "withMapping" works well as a method name. Nor do
> I believe this flows well when trying to read what it says.

Again, I think this bothers you because you're still trying to think of 
a pattern as a "conditional multi-getter", rather than as a projection.  
I will agree that `withMapping` is a little clunky, but it does work in 
both directions -- you could use it to make a map "with a given 
mapping", and identify whether a map is one "with a given mapping."
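
As a rough illustration of the two directions, here is how the pair 
might be approximated with ordinary methods in current Java.  Mappings, 
withMapping and matchMapping are hypothetical names, not part of 
java.util.Map, and returning an Optional is only a stand-in for what a 
real pattern would express as a match with a binding.  In this sketch a 
key mapped to null counts as no match.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class Mappings {
    // "Make me a map with this mapping": a copy of base that also maps key to value.
    static <K, V> Map<K, V> withMapping(Map<K, V> base, K key, V value) {
        Map<K, V> copy = new HashMap<>(base);
        copy.put(key, value);
        return copy;
    }

    // "Are you a map with a mapping for this key?": the bound value if so.
    static <K, V> Optional<V> matchMapping(Map<K, V> map, K key) {
        return Optional.ofNullable(map.get(key));
    }

    public static void main(String[] args) {
        Map<String, Integer> m = withMapping(new HashMap<String, Integer>(), "k", 1);
        System.out.println(matchMapping(m, "k"));   // prints Optional[1]
        System.out.println(matchMapping(m, "z"));   // prints Optional.empty
    }
}
```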

> But it seems to me that there is already a boolean method on Map that
> performs the desired match. It is called `containsKey`. I believe this
> will be true of the vast majority of pattern methods, eg.
> `String.contains()`, `Class.isArray()` and many more. Typically these
> are `isXxx()` methods.

I don't disagree that `containsKey` is a good name, or even a better 
name, for this pattern.  But, you might find it useful to know that I 
deliberately chose *not to* use containsKey in these examples -- because 
I was concerned it would play into exactly the wrong mental models for 
patterns that I've been pushing back on in this very mail, so I chose to 
stick with the more symmetric naming style for purposes of 
illustration.  So while in this case, I might actually agree that this 
is a better name, and might advocate for that in the end, I think that 
may be more the exception than the rule of borrowing from the old 
intuitions.  The old intuitions need to be set aside to make room for 
new ones, and then we can see what we want to keep of the old ones.

> So, my key observation is that pattern methods should be building on
> the existing knowledge developers have of libraries by using these
> existing boolean method names in patterns.

While I totally understand how you got here, I disagree with this almost 
completely.  This burdens a better tool with the compromises we had to 
make to make the bad old tools work.

The "almost" part is: existing patterns of API naming came from a 
reasonable thought process, and it might be reasonable to re-run some of 
that thought process in the context of the newer, more powerful language 
we have now, and some of those intuitions will be helpful in building 
symmetric / transparent / round-trippable APIs. But let's not try to 
cram patterns into the idioms we developed to approximate them.

In short: look ahead, not backwards.

> Proposal:

While I am sure that people will code as you have suggested in this 
proposal (for the same reason: seeking the comfort of the old rules), I 
think most of these would be seriously suboptimal choices.  They seem to 
mostly flow from "let's cram patterns into the mental model we have 
now", rather than broadening the model.

So, I think these are good first thoughts, just not the place we want to 
land.  My advice is: broaden your perspective, try to embrace the degree 
to which patterns are not "just" better getters, and iterate!  That's 
how we discover better API design idioms.