Member Patterns -- the bikeshed
Brian Goetz
brian.goetz at oracle.com
Sat Mar 30 20:53:14 UTC 2024
On 3/30/2024 3:23 PM, Victor Nazarov wrote:
> I have two points that I think may be good to consider in the list of
> options.
>
> 1. I'm not sure if this was considered, but I find explicit lists of
> covering patterns
> rather natural and more flexible than using case as a pattern-modifier.
Agreed (this is how F# does it), and we tried that, but it is so
contrary to how members are done in Java. (One might think that one
could declare a "sealed" pattern, which "permits" a list of other
patterns, and this sounds perfectly natural, but it looks pretty weird.)
> The important feature of explicit lists is that there may be more than
> one covering set of patterns.
Yes, been down this road too, but the reality is that this is not likely
to come up nearly as often as one might imagine.
> 2. I think that there is a middle ground between functional and
> imperative pattern body definition style that may look cumbersome at
> first, but nevertheless gives you best of both worlds:
The `match` block is an interesting idea, will consider.
>
> * deconstructor patterns look dual to constructors
> * names from the list of pattern variables are actually used and
> checked by the compiler
> * control flow is still functional, which is more natural
>
> The downside that is retained from the imperative style is the need
> for alpha-renaming,
> but I think we still have to deal with shadowing and renaming
> local-variable seems natural and easy.
>
> Middle ground may be used like a special form that can be used in the
> pattern body.
> This form works mostly the same way as `with`-clause as defined in the
> "Derived Record Instances" JEP.
>
> Here is the long list of examples to fully illustrate different
> interactions:
>
> ````
> class Optional<T> matches (of|empty) {
> public static <T> pattern<Optional<T>> of(T value) {
> if (that.isPresent()) {
> match {
> value = that.get();
> }
> }
> }
>
> public static <T> pattern<Optional<T>> empty() {
> if (that.isEmpty())
> match {}
> }
> }
>
> class Pattern {
> public pattern<String> regexMatch(String... groups) {
> Matcher m = this.matcher(that);
> if (m.matches()) {
> match {
> groups =
> IntStream.rangeClosed(1, m.groupCount())
> .mapToObj(m::group)
> .toArray(String[]::new);
> }
> }
> }
> }
>
> class A {
> private final int a;
>
> public A(int a) {
> this.a = a;
> }
> public pattern A(int a) {
> match {
> a = that.a;
> }
> }
> }
>
> class B extends A {
> private final int b;
>
> public B(int a, int b) {
> super(a);
> this.b = b;
> }
>
> public pattern B(int a, int b) {
> if (that instanceof super(var aa)) {
> match {
> a = aa;
> b = that.b;
> }
> }
> }
> }
>
> interface Converter<T,U> {
> pattern<T> convert(U u);
> }
> Converter<Integer, Short> c =
> pattern (s) -> {
> if (that >= Short.MIN_VALUE && that <= Short.MAX_VALUE)
> match {
> s = (short) that;
> }
> };
> ````
>
> --
> Victor Nazarov
>
>
> On Fri, Mar 29, 2024 at 10:59 PM Brian Goetz <brian.goetz at oracle.com>
> wrote:
>
> We now come to the long-awaited bikeshed discussion on what member
> patterns should look like.
>
> Bikeshed disclaimer for EG:
> - This is likely to evoke strong opinions, so please take pains
> to be especially constructive
> - Long reply-to-reply threads should be avoided even more than usual
> - Holistic, considered replies preferred
> - Please change subject line if commenting on a sub-topic or
> tangential
> concern
>
> Special reminders for Remi:
> - Use of words like "should", "must", "shouldn't", "mistake",
> "wrong", "broken"
> are strictly forbidden.
> - If in doubt, ask questions first.
>
> Notes for external observers:
> - This is a working document for the EG; the discussion may
> continue for a
> while before there is an official proposal. Please be patient.
>
>
> # Pattern declaration: the bikeshed
>
> We've largely identified the model for what kinds of patterns we
> need to
> express, but there are still several degrees of freedom in the syntax.
>
> As the model has simplified during the design process, the space
> of syntax
> choices has been pruned back, which is a good thing. However,
> there are still
> quite a few smaller decisions to be made. Not all of the
> considerations are
> orthogonal, so while they are presented individually, this is not
> a "pick one
> from each column" menu.
>
> Some of these simplifications include:
>
> - Patterns with "input arguments" have been removed; another way
> to get to what
> this gave us may come back in another form.
> - I have grown increasingly skeptical of the value of the
> imperative `match`
> statement. With better totality analysis, I think it can be
> eliminated.
>
> We can discuss these separately but I would like to sync first on
> the broad
> strokes for how patterns are expressed.
>
> ## Object model requirements
>
> As outlined in "Towards Member Patterns", the basic model is that
> patterns are
> the dual of other executable members (constructors, static
> methods, instance
> methods.) While they are like methods in that they have inputs,
> outputs, names,
> and an imperative body, they have additional degrees of freedom that
> constructors and methods lack:
>
> - Patterns are, in general, _conditional_ (they can succeed or
> fail), and only
> produce bindings (outputs) when they succeed. This
> conditionality is
> understood by the language's flow analysis, and is used for
> computing scoping
> and definite assignment.
> - Methods can return at most one value; when a pattern completes
> successfully,
> it may bind multiple values.
> - All patterns have a _match candidate_, which is a distinguished,
> possibly-implicit parameter. Some patterns also have a
> receiver, which is
> also a distinguished, possibly-implicit parameter. In some
> such cases the
> receiver and match candidate are aliased, but in others these
> may refer to
> different objects.
>
> So a pattern is a named executable member that takes a _match
> candidate_ as a
> possibly-implicit parameter, maybe takes a receiver as an implicit
> parameter,
> and has zero or more conditional _bindings_. Its body can perform
> imperative
> computation, and can terminate either with match failure or
> success. In the
> success case, it must provide a value for each binding.
>
> Deconstruction patterns are special in many of the same ways
> constructors are:
> they are constrained in their name, inheritance, and probably their
> conditionality (they should probably always succeed). Just as the
> syntax for
> constructors differs slightly from that of instance methods, the
> syntax for
> deconstructors may differ slightly from that of instance
> patterns. Static
> patterns, like static methods, have no receiver and do not have
> access to the
> type parameters of the enclosing class.
>
> Like constructors and methods, patterns can be overloaded, but in
> accordance
> with their duality to constructors and methods, the overloading
> happens on the
> _bindings_, not the inputs.
>
> ## Use-site syntax
>
> There are several kinds of type-driven patterns built into the
> language: type
> patterns and record patterns. A type pattern in a `switch` looks
> like:
>
> case String s: ...
>
> And a record pattern looks like:
>
> case MyRecord(P1, P2, ...): ...
>
> where `P1..Pn` are nested patterns that are recursively matched to the
> components of the record. This use-site syntax for record
> patterns was chosen
> for its similarity to the construction syntax, to highlight that a
> record
> pattern is the dual of record construction.
>
> **Deconstruction patterns.** The simplest kind of member pattern, a
> deconstruction pattern, will have the same use-site syntax as a
> record pattern;
> record patterns can be thought of as a deconstruction pattern
> "acquired for
> free" by records, just as records do with constructors, accessors,
> object
> methods, etc. So the use of a deconstruction pattern for `Point`
> looks like:
>
> case Point(var x, var y): ...
>
> whether `Point` is a record or an ordinary class equipped with a
> suitable
> deconstruction pattern.
>
> **Static patterns.** Continuing with the idea that the
> destructuring syntax
> should evoke the aggregation syntax, there is an obvious candidate
> for the
> use-site syntax for static patterns:
>
> case Optional.of(var e): ...
> case Optional.empty(): ...
>
> **Instance patterns.** Uses of instance patterns will likely come
> in two forms,
> analogous to bound and unbound instance method references,
> depending on whether
> the receiver and the match candidate are the same object. In the
> unbound form,
> used when the receiver is the same object as the match candidate,
> the pattern
> name is qualified by a _type_:
>
> ```
> Class<?> k = ...
> switch (k) {
> // Qualified by type
> case Class.arrayClass(var componentType): ...
> }
> ```
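For comparison, here is a rough sketch of what the unbound `arrayClass` use has to look like in today's Java, without member patterns. The class name `ArrayClassSketch` and the `Optional`-returning helper are illustrative assumptions, not part of the proposal; only `Class::isArray` and `Class::getComponentType` are existing API. An empty result stands in for "no match" and a present result carries the single binding.

```java
import java.util.Optional;

// Hypothetical helper, only meant to emulate the proposed pattern.
final class ArrayClassSketch {
    // The match candidate (which is also the receiver in the unbound form)
    // is passed explicitly as `candidate`.
    static Optional<Class<?>> arrayClass(Class<?> candidate) {
        if (!candidate.isArray())
            return Optional.empty();                    // no match
        return Optional.of(candidate.getComponentType()); // match, one binding
    }

    public static void main(String[] args) {
        System.out.println(arrayClass(int[].class));
        System.out.println(arrayClass(String.class));
    }
}
```

The caller has to test and unwrap by hand, which is the ceremony that `case Class.arrayClass(var componentType)` would fold into the pattern match.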
>
> This means that we _resolve_ the pattern `arrayClass` starting at
> `Class` and
> _select_ the pattern using the receiver, `k`. We may also be able
> to omit the
> class qualifier if the static type of the match candidate is
> sufficient to
> resolve the desired pattern.
>
> In the bound form, used when the receiver is distinct from the
> match candidate,
> the pattern name is qualified with an explicit _receiver
> expression_. As an
> example, consider an interface that captures primitive widening
> and narrowing
> conversions, such as those between `int` and `long`. In the
> widening direction,
> conversion is unconditional, so this can be modeled as a method
> from `int` to
> `long`. In the other direction, conversion is conditional, so
> this is better
> modeled as a _pattern_ whose match candidate is `long` and which
> binds an `int`
> on success. Since these are instance methods of some class (say,
> `NumericConversion<T,U>`), we need to provide the receiver
> instance in order to
> resolve the pattern:
>
> ```
> NumericConversion<int, long> nc = ...
>
> switch (aLong) {
> case nc.narrowed(int i):
> ...
> }
> ```
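The method/pattern asymmetry in the conversion example can be approximated in current Java as follows. This is only a sketch: `NarrowingSketch`, `widen`, and `narrowed` are made-up names, the receiver is dropped for brevity (static methods instead of a `NumericConversion` instance), and `OptionalInt` stands in for "conditionally binds an `int`".

```java
import java.util.OptionalInt;

// Hypothetical stand-in for the int/long conversion pair.
final class NarrowingSketch {
    // Widening always succeeds, so an ordinary method is enough.
    static long widen(int i) {
        return i;
    }

    // Narrowing is conditional: it "matches" only if the long value
    // survives a round trip through int.
    static OptionalInt narrowed(long candidate) {
        int i = (int) candidate;
        return (i == candidate) ? OptionalInt.of(i) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(narrowed(widen(42)));
        System.out.println(narrowed(1L << 40));
    }
}
```

Note that `narrowed(widen(i))` always matches and yields `i` again, which is the "could this value have come from my dual, and with what argument" relationship in miniature.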
>
> The explicit receiver syntax would also be used if we exposed
> regular expression
> matching as a pattern on the `j.u.r.Pattern` object (the name
> collision on
> `Pattern` is unfortunate). Imagine we added a `matching` instance
> pattern to
> `j.u.r.Pattern`; then we could use it in `instanceof` as follows:
>
> ```
> static final java.util.regex.Pattern P = Pattern.compile("(a*)(b*)");
> ...
> if (aString instanceof P.matching(String as, String bs)) { ... }
> ```
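To make the bound form concrete, here is roughly what one writes today against the existing `java.util.regex` API to get the same effect. The `matching` helper and the class name are assumptions for illustration; `Pattern.compile`, `Matcher.matches`, `Matcher.groupCount`, and `Matcher.group(int)` are the real API. The receiver (the compiled regex) and the match candidate (the string) are clearly different objects here.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical emulation of a `matching` instance pattern on j.u.r.Pattern.
final class RegexMatchingSketch {
    static final Pattern P = Pattern.compile("(a*)(b*)");

    // receiver = the regex, candidate = the string being matched;
    // the "bindings" are the capture groups 1..groupCount().
    static Optional<String[]> matching(Pattern receiver, String candidate) {
        Matcher m = receiver.matcher(candidate);
        if (!m.matches())
            return Optional.empty();           // no match, no bindings
        String[] groups = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++)
            groups[i - 1] = m.group(i);        // group 0 is the whole match
        return Optional.of(groups);
    }

    public static void main(String[] args) {
        matching(P, "aaabb").ifPresent(g ->
            System.out.println(g[0] + " / " + g[1]));
    }
}
```

With the proposed syntax, the test, the group extraction, and the naming of `as` and `bs` would all collapse into the single `instanceof` shown above.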
>
> Each of these use-site syntaxes is modeled after the use-site
> syntax for a
> method invocation or method reference.
>
> ## Declaration-site syntax
>
> To avoid being biased by the simpler cases, we're going to work
> all the cases
> concurrently rather than starting with the simpler cases and
> working up. (It
> might seem sensible to start with deconstructors, since they are
> the "easy"
> case, but if we did that, we would likely be biased by their
> simplicity and then
> find ourselves painted into a corner.) As our example gallery, we
> will consider:
>
> - Deconstruction pattern for `Point`;
> - Static patterns for `Optional::of` and `Optional::empty`;
> - Static pattern for "power of two" (illustrating a computation
> where success
> or failure, and computation of bindings, cannot easily be
> separated);
> - Instance pattern for `Class::arrayClass` (used unbound);
> - Instance pattern for `Pattern::matching` on regular expressions
> (used bound).
>
> Member patterns, like methods, have _names_. (We can think of
> constructors as
> being named for their enclosing classes, and the same for
> deconstructors.) All
> member patterns have a (possibly empty) ordered list of
> _bindings_, which are
> the dual of constructor or method parameters. Bindings, in turn,
> have names and
> types. And like constructors and methods, member patterns have a
> _body_ which
> is a block statement. Member patterns also have a _match
> candidate_, which is a
> likely-implicit method parameter.
>
> ### Member patterns as inverse methods and constructors
>
> Regardless of syntax, let us remind ourselves that
> deconstructors are the
> categorical dual to constructors (coconstructors), and pattern
> methods are the
> categorical dual to methods (comethods). They are dual in their
> structure: a
> constructor or method takes N arguments and produces a result, the
> corresponding
> member pattern consumes a match candidate and (conditionally)
> produces N
> bindings.
>
> Moreover, they are semantically dual: the return value produced by
> construction
> or factory invocation is the match candidate for the corresponding
> member
> pattern, and the bindings produced by a member pattern are the
> answers to the
> _Pattern Question_ -- "could this object have come from an
> invocation of my
> dual, and if so, with what arguments."
>
> ### What do we call them?
>
> Given the significant overlap between methods and patterns, the
> first question
> about the declaration we need to settle is how to identify a
> member pattern
> declaration as distinct from a method or constructor declaration.
> _Towards
> Member Patterns_ tried out a syntax that recognized these as
> _inverse_ methods
> and constructors:
>
> public Point(int x, int y) { ... }
> public inverse Point(int x, int y) { ... }
>
> While this is a principled choice which clearly highlights the
> duality, and one
> that might be good for specification and verbal description, it is
> questionable
> whether this would be a great syntax for reading and writing
> programs.
>
> A more traditional option is to choose a "noun" (contextual)
> keyword, such as
> `pattern`, `matcher`, `extractor`, `view`, etc:
>
> public pattern Point(int x, int y) { ... }
>
> If we are using a noun keyword to identify pattern declarations,
> we could use
> the same noun for all of them, or we could choose a different one for
> deconstruction patterns:
>
> public deconstructor Point(int x, int y) { ... }
>
> Alternately, we could reach for a symbol to indicate that we are
> talking about
> an inverted member. C++ fans might suggest
>
> public ~Point(int x, int y) { ... }
>
> but this is too cryptic (it's evocative once you see it, but then
> it becomes
> less evocative as we move away from deconstructors towards
> instance patterns.)
>
> If we wish to offer finer-grained control over conditionality, we
> might
> additionally need a `total` / `partial` modifier, though I would
> prefer to avoid
> that.
>
> Of the keyword candidates, there is one that stands out (for good
> and bad)
> because it connects to something that is already in the language:
> `pattern`. On
> the one hand, using the term `pattern` for the declaration is a
> slight abuse; on
> the other, users will immediately connect it with "ah, so that's
> how I make a
> new pattern" or "so that's what happens when I match against this
> pattern."
> (Lisps would resolve this tension by calling it `defpattern`.)
>
> The others (`matcher`, `view`, `extractor`, etc) are all made-up
> terms that
> don't connect to anything else in the language, for better or
> worse. If we pick
> one of these, we are asking users to sort out _three_ separate new
> things in
> their heads: (use-site) patterns, (declaration-site) matchers, and
> the rules of
> how patterns and matchers are connected. Calling them both
> "patterns", despite
> the mild abuse of terminology, ties them together in a way that
> recognizes their
> connection.
>
> My personal position: `pattern` is the strongest candidate here,
> despite some
> flaws.
>
> ### Binding lists and match candidates
>
> There are two obvious alternatives for describing the binding list
> and match
> candidate of a pattern declaration, both with their roots in the
> constructor and
> method syntax:
>
> - Pretend that a pattern declaration is like a method with
> multiple return, and
> put the binding list in the "return position", and make the
> match candidate
> an ordinary parameter;
> - Lean into the inverse relationship with constructors and
> methods (and
> consistency with the use-site syntax), and put the binding list
> in the
> "parameter list position". For static patterns and some
> instance patterns,
> which need to explicitly identify the match candidate type,
> there are several
> sub-options:
> - Lean further into the duality, putting the match candidate
> type in the
> "return position";
> - Put the match candidate type somewhere else, where it is less
> likely to be
> confused for a method return.
>
> The "method-like" approach might look like this:
>
> ```
> class Point {
> // Constructor and deconstructor
> public Point(int x, int y) { ... }
> public pattern (int x, int y) Point(Point target) { ... }
> ...
> }
>
> class Optional<T> {
> // Static factory and pattern
> public static<T> Optional<T> of(T t) { ... }
> public static<T> pattern (T t) of(Optional<T> target) { ... }
> ...
> }
> ```
>
> The "inverse" approach might look like:
>
> ```
> class Point {
> // Constructor and deconstructor
> public Point(int x, int y) { ... }
> public pattern Point(int x, int y) { ... }
> ...
> }
>
> class Optional<T> {
> // Static factory and pattern (using the first sub-option)
> public static<T> Optional<T> of(T t) { ... }
> public static<T> pattern Optional<T> of(T t) { ... }
> ...
> }
> ```
>
> With the "method-like" approach, the match candidate gets an
> explicit name
> selected by the author; with the inverse approach, we can go with
> a predefined
> name such as `that`. (Because deconstructors do not have
> receivers, we could by
> abuse of notation arrange for the keyword `this` to refer instead
> to the match
> candidate within the body of a deconstructor. While this might
> seem to lead to
> a more familiar notation for writing deconstructors, it would create a
> gratuitous asymmetry between the bodies of deconstruction patterns
> and those of
> other patterns.)
>
> Between these choices, nearly all the considerations favor the
> "inverse"
> approach:
>
> - The "inverse" approach makes the declaration look like the use
> site. This
> highlights that `pattern Point(int x, int y)` is what gets
> invoked when you
> match against the pattern use `Point(int x, int y)`. (This
> point is so
> strong that we should probably just stop here.)
> - The "inverse" members also look like their duals; the only
> difference is the
> `pattern` keyword (and possibly the placement of the match
> candidate type).
> This makes matched pairs much more obvious, and such matched
> pairs will be
> critical both for future language features and for library idioms.
> - The method-like approach is suggestive of multiple return or
> tuples, which is
> probably helpful for the first few minutes but actually harmful
> in the long
> term. This feature is _not_ (much as some people would like to
> believe) about
> multiple return or tuples, and playing into this misperception
> will only make
> it harder to truly understand. So this suggestion ends up
> propping up the
> wrong mental model.
>
> The main downside of the "inverse" approach is the one-time speed
> bump of the
> unfamiliarity of the inverted syntax. (The "method-like" syntax
> also has its
> own speed bumps; it is just unfamiliar in different ways.) But
> unlike the
> advantages of the inverse approach, which continue to add value
> forever, this
> speed bump is a one-time hurdle to get over.
>
> To smooth out the speed bumps of the inverse approach, we can
> consider moving
> the position of the match candidate for static and (suitable)
> instance pattern
> declarations, such as:
>
> ```
> class Optional<T> {
> // the usual static factory
> public static<T> Optional<T> of(T t) { ... }
>
> // Various ways of writing the corresponding pattern
> public static<T> pattern of(T t) for Optional<T> { ... }
> // or ...
> public static<T> pattern(Optional<T>) of(T t) { ... }
> // or ...
> public static<T> pattern(Optional<T> that) of(T t) { ... }
> // or ...
> public static<T> pattern<Optional<T>> of(T t) { ... }
> ...
> }
> ```
>
> (The deconstructor example looks the same with either variant.)
> Of these,
> treating the match candidate like a "parameter" of "pattern" is
> probably the
> most evocative:
>
> ```
> public static<T> pattern(Optional<T> that) of(T t) { ... }
> ```
>
> as it can be read as "pattern taking the parameter `Optional<T>
> that`, called
> `of`, binding `T`", and is a short departure from the inverse syntax.
>
> The main value of the various rearrangements is that users don't
> need to think
> about things operating in reverse to parse the syntax. This trades
> some of the
> secondary point (patterns looking almost exactly like their
> inverses) for a
> certain reduction in cognitive load, while maintaining the most important
> consideration: that the declaration site look like the use site.
>
> For instance pattern declarations, if the match candidate type is
> the same as
> the receiver type, the match candidate type can be elided as it is
> with
> deconstructors.
>
> My personal position: the "multiple return" version is terrible;
> all the
> sub-variants of the inverse version are probably workable.
>
> ### Naming the match candidate
>
> We've been assuming so far that the match candidate always has a
> fixed name,
> such as `that`; this is an entirely workable approach. Some of the
> variants are
> also amenable to allowing authors to explicitly select a name for
> the match
> candidate. For example, if we put the match candidate as a
> "parameter" to the `pattern` keyword, there is an obvious place to
> put the name:
>
> ```
> static<T> pattern(Optional<T> target) of(T t) { ... }
> ```
>
> My personal opinion: I don't think this degree of freedom buys us
> much, and in
> the long run readability probably benefits by picking a fixed name
> like `that`
> and sticking with it. Even with a fixed name, if there is a
> sensible position
> for the name, allowing users to type `that` for explicitness is
> fine (as we do
> with instance methods, though many people don't know this.) We
> may even want to
> require it.
>
> ## Body types
>
> Just as there are two obvious approaches for the declaration,
> there are two
> obvious approaches we could take for the body (though there is
> some coupling
> between them.) We'll call the two body approaches _imperative_ and
> _functional_.
>
> The imperative approach treats bindings as initially-DU variables
> that must be
> DA on successful completion, getting their value through ordinary
> assignment;
> the functional approach sets all the bindings at once,
> positionally. Either
> way, member patterns (except maybe deconstructors) also need a way to
> differentiate a successful match from a failed match.
>
> Here is the `Point` deconstructor with both imperative and
> functional style. The
> functional style uses a placeholder `match` statement to indicate
> a successful
> match and provision of bindings:
>
> ```
> class Point {
> int x, y;
>
> Point(int x, int y) {
> this.x = x;
> this.y = y;
> }
>
> // Imperative style, deconstructor always succeeds
> pattern Point(int x, int y) {
> x = that.x;
> y = that.y;
> }
>
> // Functional style
> pattern Point(int x, int y) {
> match(that.x, that.y);
> }
> }
> ```
>
> There are some obvious differences here. In the imperative style,
> the dtor body
> looks much more like the reverse of the ctor body. The functional
> style is more
> concise (and amenable to further concision via the "concise method
> bodies"
> mechanism in the future). There are also a number of less obvious
> differences. For
> deconstructors, the imperative approach is likely to feel more
> natural because
> of the obvious symmetry with constructors.
>
> In reality, it is _premature at this point to have an opinion_,
> because we
> haven't yet seen the full scope of the problem; deconstructors are
> a special
> case in many ways, which almost surely is distorting our initial
> opinion. As we
> move towards conditional patterns (and pattern lambdas), our
> opinions may flip.
>
> Regardless of which we pick, there are some additional syntactic
> choices to be
> made -- what syntax to use to indicate success (we used `match` in
> the above
> example) or failure. (We should be especially careful around
> trying to reuse
> words like `return`, `break`, or `yield` because, in the case
> where there are
> zero bindings (which is allowable), it becomes unclear whether
> they mean "fail"
> or "succeed with zero bindings".)
>
> ### Success and failure
>
> Except for possibly deconstructors, which we may require to be
> total, a pattern
> declaration needs a way to indicate success and failure. In the
> examples above,
> we posited a `match` statement to indicate success in the
> functional approach,
> and in both examples leaned on the "implicit success" of
> deconstructors (under
> the assumption they always succeed). Now let's look at the more
> general case to
> figure out what else is needed.
>
> For a static pattern like `Optional::of`, success is conditional.
> Using
> `match-fail` as a placeholder for "the match failed", this might
> look like
> (functional version):
>
> ```
> public static<T> pattern(Optional<T> that) of(T t) {
> if (that.isPresent())
> match (that.get());
> else
> match-fail;
> }
> ```
>
> The imperative version is less pretty, though. Using
> `match-success` as a
> placeholder:
>
> ```
> public static<T> pattern(Optional<T> that) of(T t) {
> if (that.isPresent()) {
> t = that.get();
> match-success;
> }
> else
> match-fail;
> }
> ```
>
> Both arms of the `if` feel excessively ceremonial here. And if we
> chose to not
> make all deconstruction patterns unconditional, deconstructors
> would likely need
> some explicit success as well:
>
> ```
> pattern Point(int x, int y) {
> x = that.x;
> y = that.y;
> match-success;
> }
> ```
>
> It might be tempting to try and eliminate the need for explicit
> success by
> inferring it from whether or not the bindings are DA, but
> this is
> error-prone, is less type-checkable, and falls apart completely
> for patterns
> with no bindings.
>
> ### Implicit failure in the functional approach
>
> One of the ceremonial-seeming aspects of `Optional::of` above is
> having to say
> `else match-fail`, which doesn't feel like it adds a lot of
> value. Perhaps we
> can be more concise without losing clarity.
>
> Most conditional patterns will have a predicate to determine
> matching, and then
> some conditional code to compute the bindings and claim success.
> Having to say
> "and if the predicate didn't hold, then I fail" seems like
> ceremony for the
> author and noise for the reader. Instead, if a conditional
> pattern falls off
> the end without matching, we could treat that as simply not matching:
>
> ```
> public static<T> pattern(Optional<T> that) of(T t) {
> if (that.isPresent())
> match (that.get());
> }
> ```
>
> This says what we mean: if the optional is present, then this
> pattern succeeds
> and binds the contents of the `Optional`. As long as our "succeed"
> construct
> strongly enough connotes that we are terminating abruptly and
> successfully, this
> code is perfectly clear. And most conditional patterns will look
> a lot like
> `Optional::of`; do some sort of test and if it succeeds, extract
> the state and
> bind it.
>
> At first glance, this "implicit fail" idiom may seem error-prone
> or sloppy. But
> after writing a few dozen patterns, one quickly tires of saying "else
> match-fail" -- and the reader doesn't necessarily appreciate
> reading it either.
>
> Implicit failure also simplifies the selection of how we
> explicitly indicate
> failure; using `return` in a pattern for "no match" becomes pretty
> much a forced
> move. We observe that (in a void method), "return" and "falling
> off the end"
> are equivalent; if "falling off the end" means "no match", then so
> should an
> explicit `return`. So in those few cases where we need to
> explicitly signal "no
> match", we can just use `return`. It won't come up that often,
> but here's an
> example where it does:
>
> ```
> static pattern(int that) powerOfTwo(int exp) {
> int exp = 0;
>
> if (that < 1)
> return; // explicit fail
>
> while (that > 1) {
> if (that % 2 == 0) {
> that /= 2;
> ++exp;
> }
> else
> return; // explicit fail
> }
> match (exp);
> }
> ```
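The control flow of `powerOfTwo` carries over almost line for line to a plain method in current Java, which may help in judging whether `return` reads naturally as "no match". This is a sketch under assumptions: `PowerOfTwoSketch` is a made-up class, and `OptionalInt.empty()` / `OptionalInt.of(exp)` stand in for the proposed explicit-fail `return` and `match (exp)` respectively.

```java
import java.util.OptionalInt;

// Hypothetical emulation of the powerOfTwo pattern as an ordinary method.
final class PowerOfTwoSketch {
    // `candidate` plays the role of the match candidate `that`.
    static OptionalInt powerOfTwo(int candidate) {
        int exp = 0;

        if (candidate < 1)
            return OptionalInt.empty();       // explicit fail

        while (candidate > 1) {
            if (candidate % 2 == 0) {
                candidate /= 2;
                ++exp;
            } else {
                return OptionalInt.empty();   // explicit fail
            }
        }
        return OptionalInt.of(exp);           // stands in for `match (exp)`
    }

    public static void main(String[] args) {
        System.out.println(powerOfTwo(8));
        System.out.println(powerOfTwo(12));
    }
}
```

The success test and the computation of the binding are interleaved in the loop, so neither could be factored out into a separate predicate without repeating the work.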
>
> As a bonus, if `return` as match failure is a forced move, we need
> only select a
> term for "successful match" (which obviously can't be `return`).
> We could use
> `match` as we have in the examples, or a variant like `matched` or
> `matches`.
> But rather than just creating a new control operator, we have an
> opportunity to
> lean into the duality a little harder, by including the pattern
> syntax in the
> match:
>
> ```
> matches of(that.get());
> ```
>
> or the (optionally?) qualified (inferring type arguments, as we do
> at the use
> site):
>
> ```
> matches Optional.of(that.get());
> ```
>
> These "use the name" approaches trade a small amount of verbosity
> to gain a
> higher degree of fidelity to the pattern use site (and to evoke
> the comethod
> completion.)
>
> If we don't choose "implicit fail", we would have to invent _two_
> new control
> flow statements to indicate "success" and "failure".
>
> My personal position: for the functional approach, implicit
> failure both makes
> the code simpler and clearer, and after you get used to it, you
> don't want to go
> back. The spellings `match`, `matches`, and `matches
> <pattern-name>` are all
> workable, though I like some variant that names the pattern.
>
> ### Implicit success in the imperative approach
>
> In the imperative approach, we can be implicit as well, but it
> feels more
> natural (at least, initially) to choose implicit success rather
> than failure.
> This works great for unconditional patterns:
>
> ```
> pattern Point(int x, int y) {
> x = that.x;
> y = that.y;
> // implicit success
> }
> ```
>
> but not quite as well for conditional patterns:
>
> ```
> static<T> pattern(Optional<T> that) of(T t) {
> if (that.isPresent()) {
> t = that.get();
> }
> else
> match-fail;
> // implicit success
> }
> ```
>
> We can eliminate one of the arms of the if, with the more concise (but
> convoluted) inversion:
>
> ```
> static<T> pattern(Optional<T> that) of(T t) {
> if (!that.isPresent())
> match-fail;
> t = that.get();
> // implicit success
> }
> ```
>
> Just as with the functional approach, if we choose imperative and
> "implicit
> success", using `return` to indicate success is pretty much a
> forced move.
>
> ### Imperative is a trap
>
> If we assume that functional implies implicit failure, and
> imperative implies
> implicit success, then our choices become:
>
> ```
> class Optional<T> {
> public static<T> Optional<T> of(T t) { ... }
>
> // imperative, implicit success
> public static<T> pattern(Optional<T> that) of(T t) {
> if (that.isPresent()) {
> t = that.get();
> }
> else
> match-fail;
> }
>
> // functional, implicit failure
> public static<T> pattern(Optional<T> that) of(T t) {
> if (that.isPresent())
> matches of(that.get());
> }
> }
> ```
>
> Once we get past deconstructors, the imperative approach looks
> worse by
> comparison because we need to assign all the bindings (which is _O(n)_
> assignments) _and also_ indicate success or failure somehow,
> whereas in the
> functional style all can be done together with a single `matches`
> statement.
>
> Looking at the alternatives, except maybe for unconditional
> patterns, the
> functional example above seems a lot more natural. The imperative
> approach
> works with deconstructors (assuming they are not conditional), but
> does not
> scale so well to conditionality -- which is the essence of patterns.
>
> From a theoretical perspective, the method-comethod duality also
> gives us a
> forceful nudge towards the functional approach. In a method, the
> method
> arguments are specified as a positional list of expressions at the
> use site:
>
> m(a, b, c)
>
> and these values are invisibly copied into the parameter slots of the
> method prior to frame activation. The dual to that is for a comethod to
> similarly convey the bindings in a positional list of expressions (as
> they must either all be produced or none), where they are copied into
> the slots provided at the use site, as is indicated by `matches` in the
> above examples.
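
To make the "all bindings or none" point concrete, here is a rough emulation in today's Java. It is only a sketch: `ComethodSketch`, `XY`, and the conditional pattern `firstQuadrant` are hypothetical names invented for illustration, and an `Optional` of a record stands in for the proposed `matches` and implicit failure.

```java
import java.util.Optional;

// Hypothetical emulation, not the proposed feature: a "pattern" is modelled as
// a function from the match candidate to either nothing (failure) or a single
// value carrying every binding at once (success).
final class ComethodSketch {
    record Point(int x, int y) {}

    // The positional binding list (int x, int y), conveyed as one unit.
    record XY(int x, int y) {}

    // Functional style: success produces all bindings in one expression.
    static Optional<XY> firstQuadrant(Point that) {
        if (that.x() >= 0 && that.y() >= 0)
            return Optional.of(new XY(that.x(), that.y())); // plays the role of `matches`
        return Optional.empty();                             // plays the role of implicit failure
    }
}
```

There is no state in which `x` has been produced but `y` has not, which is the property an imperative body has to establish separately, one assignment at a time.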
>
> My personal position: the imperative style feels like a trap. It
> seems
> "obvious" at first if we start with deconstructors, but becomes
> increasingly
> difficult when we get past this case, and gets in the way of other
> opportunities. The last gasp before acceptance is the discomfort
> that dtor and
> ctor bodies are written in different styles, but in the rear-view
> mirror, this
> feels like a non-issue.
>
> ### Derive imperative from functional?
>
> If we start with "functional with implicit failure", we can
> possibly rescue
> imperative by deriving a version of imperative from functional, by
> "overloading"
> the match-success operator.
>
> If we have a pattern whose binding names are `b1..bn` of types
> `B1..Bn`, then the `matches` operator must take a list of expressions
> `e1..en` whose arity and types are compatible with `B1..Bn`. But we
> could allow `matches` to also have a nullary form, which would have the
> effect of being shorthand for
>
> matches <pattern-name>(b1, b2, ..., bn)
>
> where each of `b1..bn` must be DA at the point of matching. This
> means that we
> could express patterns in either form:
>
> ```
> class Optional<T> {
> public static<T> Optional<T> of(T t) { ... }
>
> // imperative, derived from functional with implicit failure
> public static<T> pattern(Optional<T> that) of(T t) {
> if (that.isPresent()) {
> t = that.get();
> matches of;
> }
> }
>
> public static<T> pattern(Optional<T> that) of(T t) {
> if (that.isPresent())
> matches of(that.get());
> }
> }
> ```
>
> This flexibility allows users to select a more verbose expression
> in exchange
> for a clearer association of expressions and bindings, though as
> we'll see, it
> does come with some additional constraints.
>
> ### Wrapping an existing API
>
> Nearly every library has methods (sometimes sets of methods) that
> are patterns
> in disguise, such as the pair of methods `isArray` and
> `getComponentType` in
> `Class`, or the `Matcher` helper type in `java.util.regex`.
> Library maintainers
> will likely want to wrap (or replace) these with real patterns, so
> these can
> participate more effectively in conditional contexts, and in some
> cases,
> highlight their duality with factory methods.
>
> Matching a string against a `j.u.r.Pattern` regular expression has
> all the same
> elements as a pattern, just with an ad-hoc API (and one that I
> have to look up
> every time). But we can fairly easily wrap a true pattern around
> the existing
> API. To match against a `Pattern` today, we pass the match
> candidate to
> `Pattern::matcher`, which returns a `Matcher` with accessors
> `Matcher::matches`
> (did it match) and `Matcher::group` (conditionally extract a
> particular capture
> group.) If we want to wrap this with a pattern called `regexMatch`:
>
> ```
> pattern(String that) regexMatch(String... groups) {
>     Matcher m = this.matcher(that);
>     if (m.matches())
>         // capture groups are numbered 1..groupCount(), inclusive
>         matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount())
>                                             .mapToObj(m::group)
>                                             .toArray(String[]::new));
>     // whole lotta matchin' goin' on
> }
> ```
>
> This says that a `j.u.r.Pattern` has an instance pattern called
> `regexMatch`, whose match candidate is `String`, and which binds a
> varargs of `String` corresponding to the capture groups. The
> implementation simply delegates to the existing `j.u.r.Matcher` API.
> This means that `j.u.r.Pattern` becomes a sort of "pattern object", and
> we can use it as a receiver at the use site:
>
> ```
> static Pattern As = Pattern.compile("(a*)");
> static Pattern Bs = Pattern.compile("(b*)");
> ...
> switch (string) {
> case As.regexMatch(var as): ...
> case Bs.regexMatch(var bs): ...
> ...
> }
> ```
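
For comparison, the same extraction can be written against the existing `java.util.regex` API today. This is a minimal sketch: `RegexMatchSketch` and its static `regexMatch` helper are hypothetical names, and the `Optional<String[]>` return value is only a stand-in for "match succeeded, with these bindings".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

final class RegexMatchSketch {
    // Hypothetical helper: empty means "did not match"; otherwise the array
    // holds capture groups 1..groupCount() (group 0, the whole match, is skipped).
    static Optional<String[]> regexMatch(Pattern pattern, String that) {
        Matcher m = pattern.matcher(that);
        if (!m.matches())
            return Optional.empty();
        String[] groups = IntStream.rangeClosed(1, m.groupCount())
                                   .mapToObj(m::group)
                                   .toArray(String[]::new);
        return Optional.of(groups);
    }
}
```

Note the closed range: `Matcher::groupCount` does not count group 0, so the last capture group has index `groupCount()` itself.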
>
> ### Odds and ends
>
> There are a number of loose ends here. We could choose other
> names for the
> match-success and match-fail operations, including trying to reuse
> `break` or
> `yield`. But, this reuse is tricky; it must be very clear whether
> a given form
> of abrupt completion means "success" or "failure", because in the
> case of
> patterns with no bindings, we will have no other syntactic cues to
> help
> disambiguate. (I think having a single `matches`, with implicit
> failure and
> `return` meaning failure, is the sweet spot here.)
>
> Another question is whether the binding list introduces
> corresponding variables
> into the scope of the body. For imperative, the answer is "surely
> yes"; for
> functional, the answer is "maybe" (unless we want to do the trick
> where we
> derive imperative from functional, in which case the answer is
> "yes" again.)
>
> If the binding list does not correspond to variables in the body, this
> may be initially discomforting; because the bindings do not declare
> program elements, they may feel as though they are left "dangling". But
> even if they are not declaring _program_ elements, they are still
> declaring _API_ elements (similar to the return type of a method.) We
> will want to provide Javadoc on the bindings, just like with parameters;
> we will want to match up binding names in deconstructors with parameter
> names in constructors; we may even someday want to support by-name
> binding at the use site (e.g., `case Foo(a: var a)`). The names are
> needed for all of these, just not for the body. Names still matter. My
> take here is that this is a transient "different is scary" reaction, one
> that we would get over quickly.
>
> A final question is whether we should consider unqualified names
> as implicitly
> qualified by `that` (and also `this`, for instance patterns, with
> some conflict
> resolution). Users will probably grow tired of typing `that.` all
> the time, and most of the time, the unqualified use is perfectly
> readable.
>
> ## Exhaustiveness
>
> There is one last syntax question in front of us: how to indicate
> that a set of
> patterns are (claimed to be) exhaustive on a given match candidate
> type. We see
> this with `Optional::of` and `Optional::empty`; it would be sad if
> the compiler
> did not realize that these two patterns together were exhaustive
> on `Optional`.
> This is not a feature that will be used often, but not having it
> at all will be
> a repeated irritant.
>
> The best I've come up with is to call these `case` patterns, where
> a set of
> `case` patterns for a given match candidate type in a given class
> are asserted
> to be an exhaustive set:
>
> ```
> class Optional<T> {
> static<T> Optional<T> of(T t) { ... }
> static<T> Optional<T> empty() { ... }
>
> static<T> case pattern(Optional<T> that) of(T t) { ... }
> static<T> case pattern(Optional<T> that) empty() { ... }
> }
> ```
>
> Because they may not be truly exhaustive, `switch` constructs will
> have to back
> up the static assumption of exhaustiveness with a dynamic check,
> as we do for
> other sets of exhaustive patterns that may have remainder.
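
A sketch of what "static claim plus dynamic backstop" amounts to, written by hand in today's Java. `ExhaustiveSketch` and `describe` are hypothetical names, and the `if` chain merely stands in for a `switch` over the two `case` patterns; the final `throw` models the dynamic check.

```java
import java.util.Optional;

final class ExhaustiveSketch {
    // Hand-written stand-in for:
    //   switch (that) { case Optional.of(var t): ...  case Optional.empty(): ... }
    static <T> String describe(Optional<T> that) {
        if (that.isPresent())            // models `case Optional.of(var t)`
            return "of(" + that.get() + ")";
        if (!that.isPresent())           // models `case Optional.empty()`
            return "empty";
        // Dynamic backstop: reached only if the claimed-exhaustive set left a
        // remainder. For Optional it never fires, but in general the claim is
        // asserted by the pattern author rather than proven by the compiler.
        throw new IllegalStateException("no case pattern matched: " + that);
    }
}
```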
>
> I've experimented with variants of `sealed` but it felt more
> forced, so this is
> the best I've come up with.
>
> ## Example: patterns delegating to other patterns
>
> Pattern implementations must compose. Just as a subclass
> constructor delegates
> to a superclass constructor, the same should be true for
> deconstructors.
> Here's a typical superclass-subclass pair:
>
> ```
> class A {
> private final int a;
>
> public A(int a) { this.a = a; }
> public pattern A(int a) { matches A(that.a); }
> }
>
> class B extends A {
> private final int b;
>
> public B(int a, int b) {
> super(a);
> this.b = b;
> }
>
> // Imperative style
> public pattern B(int a, int b) {
> if (that instanceof super(var aa)) {
> a = aa;
> b = that.b;
> matches B;
> }
> }
>
> // Functional style
> public pattern B(int a, int b) {
> if (that instanceof super(var a))
> matches B(a, b);
> }
> }
> ```
>
> (Ignore the flow analysis and totality for the time being; we'll
> come back to
> this in a separate document.)
>
> The first thing that jumps out at us is that, in the imperative version,
> we had to create a "garbage" variable `aa` to receive the binding,
> because `a` was already in scope, and then we had to copy the garbage
> variable into the real binding variable. Users will surely balk at this,
> and rightly so. In the functional version (depending on the choices from
> "Odds and Ends") we are free to use the more natural name and avoid the
> roundabout locution.
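
The shape of the functional version can be approximated in today's Java, with records standing in for binding lists. This is a sketch under assumptions: `DelegationSketch`, `ABindings`, `BBindings`, `deconstructA`, and `deconstructB` are all hypothetical names, and ordinary methods play the role of the deconstructors.

```java
final class DelegationSketch {
    // Hypothetical carriers for the binding lists of A and B.
    record ABindings(int a) {}
    record BBindings(int a, int b) {}

    static class A {
        private final int a;
        A(int a) { this.a = a; }

        // Stand-in for `pattern A(int a)`.
        ABindings deconstructA() { return new ABindings(a); }
    }

    static class B extends A {
        private final int b;
        B(int a, int b) { super(a); this.b = b; }

        // Stand-in for `pattern B(int a, int b)`: delegate to the superclass
        // and yield every binding in one expression, with no temporary `aa`.
        BBindings deconstructB() {
            return new BBindings(deconstructA().a(), b);
        }
    }
}
```

The superclass result flows straight into the subclass result, which is the same data flow that `matches B(a, b)` expresses.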
>
> We might be tempted to fix the "garbage variable" problem by
> inventing another
> sub-feature: the ability to use an existing variable as the target
> of a binding,
> such as:
>
> ```
> pattern B(int a, int b) {
>     if (that instanceof A(__bind a))
>         b = that.b;
> }
> ```
>
> But, I think the language is stronger without this feature, for two
> reasons. First, having to reason about whether a pattern match
> introduces a new binding or assigns to an existing variable is
> additional cognitive load for users, and second, having assignment to
> locals happen through something other than assignment introduces
> additional complexity in finding where a variable is modified. While we
> can argue about the general utility of this feature, bringing it in just
> to solve the garbage-variable problem is particularly unattractive.
>
> ## Pattern lambdas
>
> One final consideration is that patterns may also have a lambda form.
> Given a single-abstract-pattern (SAP) interface:
>
> ```
> interface Converter<T,U> {
> pattern(T t) convert(U u);
> }
> ```
>
> one can implement such a pattern with a lambda. Such a lambda has
> one parameter
> (the match candidate), and its body looks like the body of a
> declared pattern:
>
> ```
> Converter<Integer, Short> c =
> i -> {
> if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
> matches Converter.convert((short) i);
> };
> ```
>
> Because the bindings of the pattern lambda are defined in the
> interface, not in
> the lambda, this is one more reason not to like the imperative
> version: it is
> brittle, and alpha-renaming bindings in the interface would be a
> source-incompatible change.
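
The closest thing expressible today is an ordinary functional interface whose result is optional. A minimal sketch, assuming a hypothetical `Converter` whose single method returns `Optional<U>` in place of a pattern with one binding (`ConverterSketch` and `INT_TO_SHORT` are likewise invented names):

```java
import java.util.Optional;

final class ConverterSketch {
    // Hypothetical stand-in for the single-abstract-pattern interface:
    // an empty result models match failure.
    @FunctionalInterface
    interface Converter<T, U> {
        Optional<U> convert(T t);
    }

    // The lambda names only its parameter (the match candidate); nothing in
    // the body refers to a binding name declared by the interface.
    static final Converter<Integer, Short> INT_TO_SHORT =
        i -> (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
                ? Optional.of(i.shortValue())
                : Optional.empty();
}
```

Because the body never mentions the interface's binding names, renaming them in the interface would not break this lambda, which is the robustness the functional form is after.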
>
> ## Example gallery
>
> Here's all the pattern examples so far, and a few more, using the
> suggested
> style (functional, implicit fail, implicit `that`-qualification):
>
> ```
> // Point dtor
> pattern Point(int x, int y) {
>     matches Point(x, y);
> }
>
> // Optional -- static patterns for Optional::of, Optional::empty
> static<T> case pattern(Optional<T> that) of(T t) {
>     if (isPresent())
>         matches of(get());
> }
>
> static<T> case pattern(Optional<T> that) empty() {
>     if (!isPresent())
>         matches empty();
> }
>
> // Class -- instance pattern for arrayClass (match candidate type inferred)
> pattern arrayClass(Class<?> componentType) {
>     if (that.isArray())
>         matches arrayClass(that.getComponentType());
> }
>
> // regular expression -- instance pattern in j.u.r.Pattern
> pattern(String that) regexMatch(String... groups) {
>     Matcher m = matcher(that);
>     if (m.matches())
>         matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount())
>                                             .mapToObj(m::group)
>                                             .toArray(String[]::new));
> }
>
> // power of two (somewhere)
> static pattern(int that) powerOfTwo(int exp) {
>     int exp = 0;
>
>     if (that < 1)
>         return;
>
>     while (that > 1) {
>         if (that % 2 == 0) {
>             that /= 2;
>             exp++;
>         }
>         else
>             return;
>     }
>     matches powerOfTwo(exp);
> }
> ```
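
Two of these translate almost line for line into today's Java, which makes their logic easy to check. A sketch only: `GallerySketch` is a hypothetical holder class, and `Optional` / `OptionalInt` stand in for "matched, with this binding" versus "did not match".

```java
import java.util.Optional;
import java.util.OptionalInt;

final class GallerySketch {
    // Emulates the arrayClass pattern using the existing Class API.
    static Optional<Class<?>> arrayClass(Class<?> that) {
        if (that.isArray())
            return Optional.of(that.getComponentType());
        return Optional.empty();
    }

    // Emulates the powerOfTwo pattern: binds exp such that that == 2^exp.
    static OptionalInt powerOfTwo(int that) {
        if (that < 1)
            return OptionalInt.empty();      // `return` in the pattern body: failure
        int exp = 0;
        while (that > 1) {
            if (that % 2 != 0)
                return OptionalInt.empty();  // odd factor found: failure
            that /= 2;
            exp++;
        }
        return OptionalInt.of(exp);          // `matches powerOfTwo(exp)`
    }
}
```

In both methods every early exit is a failure and there is a single success point, mirroring "implicit fail" with one `matches`.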
>
> ## Closing thoughts
>
> I came out of this exploration with very different conclusions
> than I expected
> when going in. At first, the "inverse" syntax seemed stilted, but
> over time it
> started to seem more obvious. Similarly, I went in expecting to
> prefer the
> imperative approach for the body, but over time, started to warm
> to the
> functional approach, and eventually concluded it was basically a
> forced move if
> we want to support more than just deconstructors. And I started
> out skeptical
> of "implicit fail", but after writing a few dozen patterns with
> it, going back
> to fully explicit felt painful. All of this is to say, you should
> hold your
> initial opinions at arm's length, and give the alternatives a
> chance to sink in.
>
> For most _conditional_ patterns (and conditionality is at the heart of
> pattern matching), the functional approach cleanly highlights both the
> match predicate and the flow of values, and is considerably less fussy
> than the imperative approach in the same situation; `Optional::of`,
> `Class::arrayClass`, and `regexMatch` look great here, much better than
> they would with imperative. None of these illustrate delegation, but in
> the presence of delegation, the gap gets even wider.
>