Pattern method names based on existing library conventions [was Re: Pattern matching for nested List with additional size constraints]
Brian Goetz
brian.goetz at oracle.com
Thu Jan 28 16:40:00 UTC 2021
This is all a great first attempt to navigate this new territory. Having
spent a lot of time there already, there's some I agree with, and some
for which I'd say "understandable first thought, but keep going." It
will surely take us a while to discover the ideal set of idioms for this
new language, just as it did in the early days of Java.
If you don't read any further, the takeaway of all this is: patterns are
not just "better getters", and our API design idioms should not merely
be extrapolated from what worked with getters, but should reflect the
richness of what patterns enable. The old rules will encourage us
towards some bad habits for the new world, and we should be on the lookout
for that.
> On Wed, 27 Jan 2021 at 20:03, Brian Goetz <brian.goetz at oracle.com> wrote:
>> if (p instanceof Profile(Rules(List.of(var rule, ...)))) {...}
> With my library design hat on, I don't think "of" is a suitable method
> name here.
There are surely lots of possible library idioms, and I'm sure we've not
thought of all of them yet, so it's good to brainstorm about what some
others might be. For the record, here's the rationale behind this
particular naming choice.
A key design goal for pattern matching in Java is that however you make
an object, there is a complementary way to take it apart, and, absent a
good reason to the contrary, that should be as syntactically and
semantically similar as possible.
If we construct with a constructor:
Foo f = new Foo(a);
we can take apart with the same-descriptor deconstructor:
if (f instanceof Foo(var a)) { ... }
If we construct with a factory:
Foo f = Foo.of(a);
we can take apart with the same-named, same-descriptor deconstructor:
if (f instanceof Foo.of(var a)) { ... }
or, perhaps a more evident example:
Optional<String> a = Optional.of(x);
Optional<String> b = Optional.empty();
switch (a) {
case Optional.of(var y): ...
case Optional.empty(): ...
}
In this case (heh), the library designer (hi!) chose to use factory
methods _for the names_; the deconstruction benefits from the same thing.
(There are other forms of aggregation, which should have corresponding
patterns -- array initializers and patterns, and eventually, if we ever
have collection literals, we'll have corresponding collection patterns.)
A good intuition here when reading this might be: "if f could have come
from invoking Foo.of(...), what parameters would have been provided?"
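Static member patterns such as Optional.of(var x) are not available in Java as of this writing, so the following is only a rough sketch of the same symmetry using features that do exist (sealed interfaces and record patterns, assuming Java 21 or later). The names Opt, Of and Empty are hypothetical stand-ins invented for the illustration, not part of any library:

```java
public class OptSketch {
    // Hypothetical stand-in for Optional: each factory name is mirrored
    // by a record whose pattern takes the value apart again.
    sealed interface Opt<T> permits Of, Empty {
        static <T> Opt<T> of(T value) { return new Of<>(value); }
        static <T> Opt<T> empty() { return new Empty<>(); }
    }
    record Of<T>(T value) implements Opt<T> {}
    record Empty<T>() implements Opt<T> {}

    // Answers: "if o could have come from of(...) or empty(),
    // what arguments would have been provided?"
    static String describe(Opt<String> o) {
        return switch (o) {
            case Of<String>(var value) -> "of(" + value + ")";
            case Empty<String>() -> "empty()";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(Opt.of("hello"))); // prints "of(hello)"
        System.out.println(describe(Opt.empty()));     // prints "empty()"
    }
}
```

The case labels read almost like the construction expressions, which is the property the same-named deconstructor is meant to deliver.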
> 1) Factory methods that should be constructors
>
> Some factory methods exist simply because a constructor was not an
> acceptable alternative - LocalDate.of(int, int, int),
> Integer.valueOf(int) and List.of(T...) being three examples for
> different reasons. Given this, it seems desirable that the language
> should allow *some* pattern methods to have a use-site syntax that
> looks exactly like a deconstructor.
>
> if (p instanceof Profile(Rules(List(var rule, ...)))) {...}
> if (p instanceof LocalDate(var year, month, day)) { ..}
> if (p instanceof Integer(var value)) { ..}
> if (p instanceof Optional(var value)) { ..}
> if (p instanceof Optional()) { ..}
> for (Map.Entry(var key, var value) : map.entrySet() {...}
>
> In all these situations, the binding is effectively taking the raw
> contents and binding them, as records do, and that is what the
> use-site syntax should be.
I think this is overstated, but there's a reasonable design option (for
*some* classes) buried in here. Whether you provide a constructor or
factory is an API design choice; similarly, whether you provide a
corresponding deconstructor or static member pattern is also an API
design choice. If you choose factories one way, but deconstructors the
other, that's an asymmetry; there can be cases where such an asymmetry
is justified, but in others will just be a gratuitous inconsistency, so
there needs to be a corresponding benefit to make up for the asymmetry.
One of the benefits of factories over constructors is "could return a
cached instance/subclass", which is not in play for deconstructors. So
if a class has _only_ one name of factory method (make, newFoo), I think
the idiom of "deconstruct those with a deconstructor" might be a
reasonable option, because deconstructors don't need this degree of
flexibility, and the name chosen by the API designer isn't really adding
any value, since they're all the same. This goes especially so if the
name is something like "make" rather than "of", since "of" works fine in
both directions, whereas "make" does not.
On the other hand, for classes that have multiple names of factories,
the story is very different. For example, deconstructing Optional with
a pair of deconstructors would be a bad choice. The names in
Optional.of / Optional.empty are *the whole point*, and add tremendously
to the readability, so losing these is pretty bad (and the inconsistency
of mechanism is just a dis-bonus.) From a readability perspective, will
it really be obvious that
case Optional():
means "empty optional" rather than "yep, it's an optional"? I don't
think so -- we chose factories in part because names are useful. So, I
think treating Optional this way would be a truly bad move.
If we're trying to extract rules, a candidate rule might be "if there's
only likely to ever be one name of factory, and it is not suited for use
in both directions, consider a deconstruction pattern instead."
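As a small illustration of that candidate rule, here is a sketch assuming Java 21 record patterns. Temperature and its factory make are hypothetical names chosen for the example; the point is only that a single factory called "make" reads poorly as a pattern name, while the plain deconstruction pattern reads fine:

```java
public class MakeSketch {
    // Hypothetical class with exactly one factory, whose name ("make")
    // does not read well in the matching direction.
    record Temperature(double degrees) {
        static Temperature make(double degrees) {
            return new Temperature(degrees);
        }
    }

    // Taken apart with the deconstruction pattern instead of a
    // hypothetical "Temperature.make(var degrees)" pattern.
    static String show(Object o) {
        if (o instanceof Temperature(var degrees)) {
            return "temperature " + degrees;
        }
        return "something else";
    }

    public static void main(String[] args) {
        System.out.println(show(Temperature.make(21.5))); // prints "temperature 21.5"
        System.out.println(show("not a temperature"));    // prints "something else"
    }
}
```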
> 2) Extract everything and Transform.
>
> This is where the inbound factory method performs a meaningful
> transformation, not just a straightforward assignment. A typical
> example is `LocalDate.ofYearDay(int, int)` vs `LocalDate.of(int, int,
> int)`. The pattern method will typically always match, but it doesn't
> have to. Examples so far suggest these kind of transformation
> factories would look like:
>
> if (p instanceof LocalDate.ofYearDay(var year, var dayOfYear)) { ..}
>
> But I'm not creating a LocalDate, I'm extracting it.
The pattern of mirroring the factory name depends a lot on how
"bidirectional" the factory name is. Some factory names are terrible
this way ("make" or "newFoo"), and some are just fine ("of"). List.of()
works well in both directions; "make me a list of these elements", vs
"are you a list of things matching these patterns." Names like ofX or
withX also are probably pretty good ("make me a Foo with mustard" / "are
you a Foo with mustard").
Of course, pre-existing APIs that were designed before we knew patterns
were coming might not have such reversible names; that's to be
expected. I suspect that, now that patterns are part of the language,
we'll give more thought to whether names are reversible and will
gravitate towards those names. But pre-existing APIs might have to be
more creative with their inverses.
But, I think it's important to note that this is *a migration concern*,
not a language design or "API style guide" issue; existing APIs that
haven't left room for patterns (because, no one knew they were coming)
should be, in the long run, the exception rather than the rule. Let's
not burden new APIs with compromises necessitated by old language
limitations. The old guidelines grew up relative to the old language;
some of the considerations that were relevant in the old guidelines no
longer apply.
A word of warning: the intuition of "I'm extracting, not creating" is
probably pulling you in the wrong direction. Of course it's true, but we
will be much better off when we design APIs that reflect the inherent
round-tripping enabled by matched ctor/dtor (or factory/static pattern)
pairs. To the extent we are building APIs that can be taken apart with
patterns, we should be striving to build APIs that are "symmetric" and
"transparent".
I get how it is super-tempting to think of patterns as just being
"better getters", and it's true to a degree, but it is also missing a big
part of the point.
<mathematical excursion>
This pairing appeals to the mathematical concepts of
embedding-projection (alternately, adjunction). The idea is that you
have two domains A and B, and a pair of functions e : A->B and p: B->A.
Embedding appeals to going from a "smaller" domain to a bigger one;
projection is the opposite. Things we know about the smaller domain are
true in the bigger one, and, if they can be projected from the bigger to
the smaller, the opposite is also true. It's allowable for the
projection to throw away information, but not the reverse. So composing
e-then-p is an identity, and composing p-then-e is a "normalization" (or
may not map a value).
As an example, consider int and Integer. The latter is "one bigger"
(null). The embedding of int into Integer is obvious: new Integer(i).
The projection from Integer to int is similarly obvious: for non-null
Integer, use Integer.intValue(), otherwise there's no mapping (NPE.)
You can go int -> Integer -> int as many times as you like, and end up
where you start; once you go one hop in the other direction and don't
fall off (Integer -> int), you can continue riding the merry-go-round.
And arithmetic is isomorphic in the two domains.
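A minimal sketch of that int/Integer round trip, with hypothetical helper names embed and project. It uses Integer.valueOf rather than the (now deprecated) constructor, which is an assumption of the sketch and does not change the argument:

```java
public class BoxingSketch {
    // Embedding: every int has an Integer counterpart.
    static Integer embed(int i) {
        return Integer.valueOf(i);
    }

    // Projection: partial, because null has no int counterpart
    // (calling intValue() on null throws NullPointerException).
    static int project(Integer boxed) {
        return boxed.intValue();
    }

    public static void main(String[] args) {
        int start = 42;
        int roundTrip = project(embed(project(embed(start))));
        System.out.println(roundTrip == start); // prints "true"
    }
}
```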
A richer example is the relationship between the pairs of integers (n,
d) and a Rational class. A well-behaved ctor/dtor pair (and similar for
factory/pattern pair) forms an embedding-projection between these two
domains; the Rational constructor might reject zero denominator, and
might normalize numerator and denominator by factoring out GCD, but once
this is done, the two domains ((n,d) pair, vs Rational object) behave
identically and can be freely interconverted without further loss of
information. (Adhering to this behavior gives us, among other things,
free marshalling to XML.) This is a behavior worth aspiring to.
</mathematical excursion>
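The Rational example can be sketched with a record (Java 16 or later). This is one possible shape under stated assumptions, not a reference implementation: the component names num and den, the choice to keep the denominator positive, and the use of int (ignoring overflow at Integer.MIN_VALUE) are all choices made for the illustration:

```java
public class RationalSketch {
    record Rational(int num, int den) {
        // Compact constructor: rejects a zero denominator and normalizes
        // by dividing out the GCD, keeping the denominator positive.
        Rational {
            if (den == 0) {
                throw new IllegalArgumentException("zero denominator");
            }
            int g = gcd(Math.abs(num), Math.abs(den));
            if (den < 0) {
                g = -g;
            }
            num /= g;
            den /= g;
        }

        private static int gcd(int a, int b) {
            return b == 0 ? a : gcd(b, a % b);
        }
    }

    public static void main(String[] args) {
        // (n, d) -> Rational -> (n, d) is a normalization on pairs ...
        Rational r = new Rational(2, 4);
        System.out.println(r.num() + "/" + r.den()); // prints "1/2"

        // ... and Rational -> (n, d) -> Rational is an identity.
        Rational again = new Rational(r.num(), r.den());
        System.out.println(again.equals(r)); // prints "true"
    }
}
```

Once a pair has passed through the constructor, taking the object apart and rebuilding it loses nothing further, which is the round-tripping behavior described above.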
> It seems to me that there is already a method prefix that already
> represents something like "extract and transform" - the method prefix
> "as". eg. "I want the LocalDate as a YearDay"
... in some particular APIs with some characteristics that already use
certain naming conventions. Surely let's identify suitable conversions
for migrating those, but let's not extrapolate too much from the APIs we
built before we knew where we were going. This is old-think; treating
patterns as merely "better getters." They're more than that.
> 3) Check and Extract.
>
> This is like the `Rule.dynamic(var dc)` example in this thread, where
> there is not always a match. Here, there are two distinct parts to the
> problem - matching and extraction, but the existing proposed syntax
> merges those two parts:
>
> if (p instanceof Map.withMapping(key)(var value)) {...}
>
> Again, I don't think "withMapping" works well as a method name. Nor do
> I believe this flows well when trying to read what it says.
Again, I think this bothers you because you're still trying to think of
a pattern as a "conditional multi-getter", rather than as a projection.
I will agree that `withMapping` is a little clunky, but it does work in
both directions -- you could use it to make a map "with a given
mapping", and identify whether a map is one "with a given mapping."
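Java has no parameterized pattern like Map.withMapping(key)(var value) today, so the two directions can only be approximated with ordinary methods. In the sketch below, withMapping and matchWithMapping are hypothetical helpers, and the Optional-returning form inevitably looks like a conditional getter; it is shown only to make the "make one" and "are you one" readings concrete:

```java
import java.util.Map;
import java.util.Optional;

public class MappingSketch {
    // "Make me a map with the given mapping."
    static <K, V> Map<K, V> withMapping(K key, V value) {
        return Map.of(key, value);
    }

    // "Are you a map with a mapping for this key? If so, to what?"
    // A null value is treated the same as a missing mapping here.
    static <K, V> Optional<V> matchWithMapping(Map<K, V> map, K key) {
        return Optional.ofNullable(map.get(key));
    }

    public static void main(String[] args) {
        Map<String, Integer> m = withMapping("answer", 42);
        System.out.println(matchWithMapping(m, "answer").isPresent());  // prints "true"
        System.out.println(matchWithMapping(m, "missing").isPresent()); // prints "false"
    }
}
```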
> But it seems to me that there is already a boolean method on Map that
> performs the desired match. It is called `containsKey`. I believe this
> will be true of the vast majority of pattern methods, eg.
> `String.contains()`, `Class.isArray()` and many more. Typically these
> are `isXxx()` methods.
I don't disagree that `containsKey` is a good name, or even a better
name, for this pattern. But, you might find it useful to know that I
deliberately chose *not to* use containsKey in these examples -- because
I was concerned it would play into exactly the wrong mental models for
patterns that I've been pushing back on in this very mail, so I chose to
stick with the more symmetric naming style for purposes of
illustration. So while in this case, I might actually agree that this
is a better name, and might advocate for that in the end, I think that
may be more the exception than the rule of borrowing from the old
intuitions. The old intuitions need to be set aside to make room for
new ones, and then we can see what we want to keep of the old ones.
> So, my key observation is that pattern methods should be building on
> the existing knowledge developers have of libraries by using these
> existing boolean method names in patterns.
While I totally understand how you got here, I disagree with this almost
completely. This burdens a better tool with the compromises we had to
make to make the bad old tools work.
The "almost" part is: existing patterns of API naming came from a
reasonable thought process, and it might be reasonable to re-run some of
that thought process in the context of the newer, more powerful language
we have now, and some of those intuitions will be helpful in building
symmetric / transparent / round-trippable APIs. But let's not try to
cram patterns into the idioms we developed to approximate them.
In short: look ahead, not backwards.
> Proposal:
While I am sure that people will code as you have suggested in this
proposal (for the same reason: seeking the comfort of the old rules), I
think most of these would be seriously suboptimal choices. They seem to
mostly flow from "let's cram patterns into the mental model we have
now", rather than broadening the model.
So, I think these are good first thoughts, just not the place we want to
land. My advice is: broaden your perspective, try to embrace the degree
to which patterns are not "just" better getters, and iterate! That's
how we discover better API design idioms.