<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<br>
<br>
<div class="moz-cite-prefix">On 3/30/2024 3:23 PM, Victor Nazarov
wrote:<br>
</div>
<blockquote type="cite" cite="mid:CAFOkWZZbAq1mmmHiHEFdEQHbnDg_priA=z5ic1Wu4FX+CRopSg@mail.gmail.com">
<div dir="ltr">
<div>I have two points that I think may be good to consider in
the list of options.<br>
<br>
1. I'm not sure if this was considered, but I find explicit
lists of covering patterns<br>
rather natural and more flexible than using case as a
pattern-modifier.<br>
</div>
</div>
</blockquote>
<br>
Agreed (this is how F# does it), and we tried that, but it is so
contrary to how members are done in Java. (One might think that
one could declare a "sealed" pattern, which "permits" a list of
other patterns, and this sounds perfectly natural, but it looks
pretty weird.) <br>
<br>
<blockquote type="cite" cite="mid:CAFOkWZZbAq1mmmHiHEFdEQHbnDg_priA=z5ic1Wu4FX+CRopSg@mail.gmail.com">
<div dir="ltr">
<div>The important feature of explicit lists is that there may
be more than one covering set of patterns.<br>
</div>
</div>
</blockquote>
<br>
Yes, been down this road too, but the reality is that this is not
likely to come up nearly as often as one might imagine. <br>
<br>
<blockquote type="cite" cite="mid:CAFOkWZZbAq1mmmHiHEFdEQHbnDg_priA=z5ic1Wu4FX+CRopSg@mail.gmail.com">
<div dir="ltr">
<div>2. I think that there is a middle ground between functional
and imperative pattern body definition style that may look
cumbersome at first, but nevertheless gives you the best of both
worlds:<br>
</div>
</div>
</blockquote>
<br>
The `match` block is an interesting idea, will consider.<br>
<br>
<br>
<blockquote type="cite" cite="mid:CAFOkWZZbAq1mmmHiHEFdEQHbnDg_priA=z5ic1Wu4FX+CRopSg@mail.gmail.com">
<div dir="ltr">
<div><br>
* deconstructor patterns look dual to constructors<br>
* names from the list of pattern variables are actually
used and checked by the compiler<br>
* control flow is still functional, which is more natural<br>
<br>
The downside that is retained from the imperative style is the
need for alpha-renaming,<br>
but I think we still have to deal with shadowing, and renaming
local variables seems natural and easy.<br>
<br>
Middle ground may be used like a special form that can be used
in the pattern body.<br>
This form works mostly the same way as `with`-clause as
defined in the "Derived Record Instances" JEP.<br>
<br>
Here is the long list of examples to fully illustrate
different interactions:<br>
<br>
````<br>
class Optional<T> matches (of|empty) { <br>
public static <T>
pattern<Optional<T>> of(T value) {<br>
if (that.isPresent()) {<br>
match {<br>
value = that.get();<br>
}<br>
}<br>
}<br>
<br>
public static <T>
pattern<Optional<T>> empty() {<br>
if (that.isEmpty())<br>
match {}<br>
}<br>
}<br>
<br>
class Pattern {<br>
public pattern<String> regexMatch(String...
groups) {<br>
Matcher m = this.matcher(that);<br>
if (m.matches()) {<br>
match {<br>
groups =<br>
IntStream.rangeClosed(1, m.groupCount())<br>
.mapToObj(m::group)<br>
.toArray(String[]::new);<br>
}<br>
}<br>
}<br>
}<br>
<br>
class A {<br>
private final int a;<br>
<br>
public A(int a) {<br>
this.a = a;<br>
}<br>
public pattern A(int a) {<br>
match {<br>
a = that.a;<br>
}<br>
}<br>
}<br>
<br>
class B extends A {<br>
private final int b;<br>
<br>
public B(int a, int b) {<br>
super(a);<br>
this.b = b;<br>
}<br>
<br>
public pattern B(int a, int b) {<br>
if (that instanceof super(var aa)) {<br>
match {<br>
a = aa;<br>
b = that.b;<br>
}<br>
}<br>
}<br>
}<br>
<br>
interface Converter<T,U> {<br>
pattern<T> convert(U u);<br>
}<br>
Converter<Integer, Short> c =<br>
pattern (s) -> {<br>
if (that >= Short.MIN_VALUE && that
<= Short.MAX_VALUE)<br>
match {<br>
s = (short) that;<br>
}<br>
};<br>
````<br>
</div>
<div><br>
</div>
<div>--<br>
</div>
<div>
<div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>Victor Nazarov<br>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Mar 29, 2024 at
10:59 PM Brian Goetz <<a href="mailto:brian.goetz@oracle.com" moz-do-not-send="true" class="moz-txt-link-freetext">brian.goetz@oracle.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div> <font size="4" face="monospace">We now come to the
long-awaited bikeshed discussion on what member patterns
should look like. <br>
<br>
Bikeshed disclaimer for EG: <br>
- This is likely to evoke strong opinions, so please
take pains to be especially constructive<br>
- Long reply-to-reply threads should be avoided even
more than usual<br>
- Holistic, considered replies preferred<br>
- Please change subject line if commenting on a
sub-topic or tangential<br>
concern<br>
<br>
Special reminders for Remi:<br>
- Use of words like "should", "must", "shouldn't",
"mistake", "wrong", "broken"<br>
are strictly forbidden. <br>
- If in doubt, ask questions first. <br>
<br>
Notes for external observers:<br>
- This is a working document for the EG; the discussion
may continue for a<br>
while before there is an official proposal. Please be
patient.<br>
<br>
<br>
# Pattern declaration: the bikeshed<br>
<br>
We've largely identified the model for what kinds of
patterns we need to<br>
express, but there are still several degrees of freedom in
the syntax.<br>
<br>
As the model has simplified during the design process, the
space of syntax<br>
choices has been pruned back, which is a good thing.
However, there are still<br>
quite a few smaller decisions to be made. Not all of the
considerations are<br>
orthogonal, so while they are presented individually, this
is not a "pick one<br>
from each column" menu. <br>
<br>
Some of these simplifications include:<br>
<br>
- Patterns with "input arguments" have been removed;
another way to get to what<br>
this gave us may come back in another form. <br>
- I have grown increasingly skeptical of the value of the
imperative `match`<br>
statement. With better totality analysis, I think it
can be eliminated.<br>
<br>
We can discuss these separately but I would like to sync
first on the broad<br>
strokes for how patterns are expressed.<br>
<br>
## Object model requirements<br>
<br>
As outlined in "Towards Member Patterns", the basic model
is that patterns are<br>
the dual of other executable members (constructors, static
methods, instance<br>
methods.) While they are like methods in that they have
inputs, outputs, names,<br>
and an imperative body, they have additional degrees of
freedom that<br>
constructors and methods lack: <br>
<br>
- Patterns are, in general, _conditional_ (they can
succeed or fail), and only<br>
produce bindings (outputs) when they succeed. This
conditionality is<br>
understood by the language's flow analysis, and is used
for computing scoping<br>
and definite assignment.<br>
- Methods can return at most one value; when a pattern
completes successfully,<br>
it may bind multiple values.<br>
- All patterns have a _match candidate_, which is a
distinguished,<br>
possibly-implicit parameter. Some patterns also have a
receiver, which is<br>
also a distinguished, possibly-implicit parameter. In
some such cases the<br>
receiver and match candidate are aliased, but in others
these may refer to<br>
different objects.<br>
<br>
So a pattern is a named executable member that takes a
_match candidate_ as a<br>
possibly-implicit parameter, maybe takes a receiver as an
implicit parameter,<br>
and has zero or more conditional _bindings_. Its body can
perform imperative<br>
computation, and can terminate either with match failure
or success. In the<br>
success case, it must provide a value for each binding.<br>
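<br>
For comparison, the built-in type patterns already show this conditionality. The sketch below is ordinary Java 16+ code, not member-pattern syntax, and the class and method names are made up for illustration. The binding `s` is in scope only where flow analysis can prove the match succeeded.<br>
<br>
<pre>
```java
public class FlowScopingDemo {
    // Returns the length of o if it is a String, or -1 otherwise.
    static int lengthOrMinusOne(Object o) {
        if (!(o instanceof String s)) {
            // 's' is not in scope here: the match failed on this path.
            return -1;
        }
        // 's' is in scope and definitely assigned here, because the only
        // way to reach this point is for the instanceof test to have matched.
        return s.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthOrMinusOne("hello"));
        System.out.println(lengthOrMinusOne(42));
    }
}
```
</pre>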
<br>
Deconstruction patterns are special in many of the same
ways constructors are:<br>
they are constrained in their name, inheritance, and
probably their<br>
conditionality (they should probably always succeed).
Just as the syntax for<br>
constructors differs slightly from that of instance
methods, the syntax for<br>
deconstructors may differ slightly from that of instance
patterns. Static<br>
patterns, like static methods, have no receiver and do not
have access to the<br>
type parameters of the enclosing class. <br>
<br>
Like constructors and methods, patterns can be overloaded,
but in accordance<br>
with their duality to constructors and methods, the
overloading happens on the<br>
_bindings_, not the inputs. <br>
<br>
## Use-site syntax<br>
<br>
There are several kinds of type-driven patterns built into
the language: type<br>
patterns and record patterns. A type pattern in a
`switch` looks like:<br>
<br>
case String s: ...<br>
<br>
And a record pattern looks like:<br>
<br>
case MyRecord(P1, P2, ...): ...<br>
<br>
where `P1..Pn` are nested patterns that are recursively
matched to the<br>
components of the record. This use-site syntax for record
patterns was chosen<br>
for its similarity to the construction syntax, to
highlight that a record<br>
pattern is the dual of record construction. <br>
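<br>
As a concrete illustration of nesting, here is a minimal sketch using the record patterns available since Java 21. The records `Point` and `Segment` and the method `describe` are hypothetical names chosen for the example.<br>
<br>
<pre>
```java
public class RecordPatternDemo {
    record Point(int x, int y) {}
    record Segment(Point from, Point to) {}

    static String describe(Object o) {
        return switch (o) {
            // Nested record patterns: each component is matched recursively.
            case Segment(Point(var x1, var y1), Point(var x2, var y2)) ->
                "segment (" + x1 + "," + y1 + ")-(" + x2 + "," + y2 + ")";
            // The pattern mirrors the construction expression new Point(x, y).
            case Point(var x, var y) -> "point (" + x + "," + y + ")";
            default -> "other";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2)));
        System.out.println(describe(new Segment(new Point(0, 0), new Point(3, 4))));
        System.out.println(describe("not a record"));
    }
}
```
</pre>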
<br>
**Deconstruction patterns.** The simplest kind of member
pattern, a<br>
deconstruction pattern, will have the same use-site syntax
as a record pattern;<br>
record patterns can be thought of as a deconstruction
pattern "acquired for<br>
free" by records, just as records do with constructors,
accessors, object<br>
methods, etc. So the use of a deconstruction pattern for
`Point` looks like:<br>
<br>
case Point(var x, var y): ...<br>
<br>
whether `Point` is a record or an ordinary class equipped
with a suitable<br>
deconstruction pattern. <br>
<br>
**Static patterns.** Continuing with the idea that the
destructuring syntax<br>
should evoke the aggregation syntax, there is an obvious
candidate for the<br>
use-site syntax for static patterns: <br>
<br>
case Optional.of(var e): ...<br>
case Optional.empty(): ...<br>
<br>
**Instance patterns.** Uses of instance patterns will
likely come in two forms,<br>
analogous to bound and unbound instance method references,
depending on whether<br>
the receiver and the match candidate are the same object.
In the unbound form,<br>
used when the receiver is the same object as the match
candidate, the pattern<br>
name is qualified by a _type_:<br>
<br>
```<br>
Class<?> k = ...<br>
switch (k) { <br>
// Qualified by type<br>
case Class.arrayClass(var componentType): ...<br>
}<br>
```<br>
<br>
This means that we _resolve_ the pattern `arrayClass`
starting at `Class` and<br>
_select_ the pattern using the receiver, `k`. We may also
be able to omit the<br>
class qualifier if the static type of the match candidate
is sufficient to<br>
resolve the desired pattern.<br>
<br>
In the bound form, used when the receiver is distinct from
the match candidate,<br>
the pattern name is qualified with an explicit _receiver
expression_. As an<br>
example, consider an interface that captures primitive
widening and narrowing<br>
conversions, such as those between `int` and `long`. In
the widening direction,<br>
conversion is unconditional, so this can be modeled as a
method from `int` to<br>
`long`. In the other direction, conversion is
conditional, so this is better<br>
modeled as a _pattern_ whose match candidate is `long` and
which binds an `int`<br>
on success. Since these are instance methods of some
class (say,<br>
`NumericConversion<T,U>`), we need to provide the
receiver instance in order to<br>
resolve the pattern:<br>
<br>
```<br>
NumericConversion<int, long> nc = ...<br>
<br>
switch (aLong) { <br>
case nc.narrowed(int i): <br>
...<br>
}<br>
```<br>
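<br>
Absent member patterns, the same split can be approximated in current Java: a plain method for the unconditional widening, and a method returning `OptionalInt` for the conditional narrowing. This is only a sketch; `NarrowingDemo`, `widened`, and `narrowed` are illustrative names, not an existing API.<br>
<br>
<pre>
```java
import java.util.OptionalInt;

public class NarrowingDemo {
    // Widening int -> long always succeeds, so a plain method is enough.
    static long widened(int value) {
        return value;
    }

    // Narrowing long -> int is conditional: an empty result plays the role
    // of "no match", and a present result carries the single binding.
    static OptionalInt narrowed(long value) {
        if (value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE) {
            return OptionalInt.of((int) value);
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(narrowed(widened(123)));
        System.out.println(narrowed(Long.MAX_VALUE));
    }
}
```
</pre>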
<br>
The explicit receiver syntax would also be used if we
exposed regular expression<br>
matching as a pattern on the `j.u.r.Pattern` object (the
name collision on<br>
`Pattern` is unfortunate). Imagine we added a `matching`
instance pattern to<br>
`j.u.r.Pattern`; then we could use it in `instanceof` as
follows: <br>
<br>
```<br>
static final java.util.regex.Pattern P =
Pattern.compile("(a*)(b*)"); <br>
...<br>
if (aString instanceof P.matching(String as, String bs)) {
... }<br>
```<br>
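<br>
For reference, this is roughly what the same test costs with the existing `java.util.regex` API. The helper `matching` below is a hypothetical stand-in written for this sketch (it is not a method of `Pattern`), and it uses `null` to signal "no match" purely to keep the example short.<br>
<br>
<pre>
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexBindingsDemo {
    static final Pattern P = Pattern.compile("(a*)(b*)");

    // Hypothetical helper: returns the capture groups if the whole candidate
    // matches the receiver pattern, or null if it does not match.
    static String[] matching(Pattern receiver, String candidate) {
        Matcher m = receiver.matcher(candidate);
        if (!m.matches()) {
            return null;
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 0; i != groups.length; i++) {
            groups[i] = m.group(i + 1); // capture groups are numbered from 1
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] g = matching(P, "aaabb");
        if (g != null) {
            System.out.println(g[0] + " / " + g[1]);
        }
    }
}
```
</pre>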
<br>
Each of these use-site syntaxes is modeled after the
use-site syntax for a<br>
method invocation or method reference.<br>
<br>
## Declaration-site syntax<br>
<br>
To avoid being biased by the simpler cases, we're going to
work all the cases<br>
concurrently rather than starting with the simpler cases
and working up. (It<br>
might seem sensible to start with deconstructors, since
they are the "easy"<br>
case, but if we did that, we would likely be biased by
their simplicity and then<br>
find ourselves painted into a corner.) As our example
gallery, we will consider:<br>
<br>
- Deconstruction pattern for `Point`;<br>
- Static patterns for `Optional::of` and
`Optional::empty`;<br>
- Static pattern for "power of two" (illustrating a
computation where success<br>
or failure, and computation of bindings, cannot easily
be separated);<br>
- Instance pattern for `Class::arrayClass` (used
unbound);<br>
- Instance pattern for `Pattern::matching` on regular
expressions (used bound).<br>
<br>
Member patterns, like methods, have _names_. (We can
think of constructors as<br>
being named for their enclosing classes, and the same for
deconstructors.) All<br>
member patterns have a (possibly empty) ordered list of
_bindings_, which are<br>
the dual of constructor or method parameters. Bindings,
in turn, have names and<br>
types. And like constructors and methods, member patterns
have a _body_ which<br>
is a block statement. Member patterns also have a _match
candidate_, which is a<br>
likely-implicit method parameter. <br>
<br>
### Member patterns as inverse methods and constructors<br>
<br>
Regardless of syntax, let us remind ourselves that
deconstructors are the<br>
categorical dual to constructors (coconstructors), and
pattern methods are the<br>
categorical dual to methods (comethods). They are dual in
their structure: a<br>
constructor or method takes N arguments and produces a
result, the corresponding<br>
member pattern consumes a match candidate and
(conditionally) produces N<br>
bindings. <br>
<br>
Moreover, they are semantically dual: the return value
produced by construction<br>
or factory invocation is the match candidate for the
corresponding member<br>
pattern, and the bindings produced by a member pattern are
the answers to the<br>
_Pattern Question_ -- "could this object have come from an
invocation of my<br>
dual, and if so, with what arguments." <br>
<br>
### What do we call them?<br>
<br>
Given the significant overlap between methods and
patterns, the first question<br>
about the declaration we need to settle is how to identify
a member pattern<br>
declaration as distinct from a method or constructor
declaration. _Towards<br>
Member Patterns_ tried out a syntax that recognized these
as _inverse_ methods<br>
and constructors:<br>
<br>
public Point(int x, int y) { ... }<br>
public inverse Point(int x, int y) { ... }<br>
<br>
While this is a principled choice which clearly highlights
the duality, and one<br>
that might be good for specification and verbal
description, it is questionable<br>
whether this would be a great syntax for reading and
writing programs. <br>
<br>
A more traditional option is to choose a "noun"
(contextual) keyword, such as<br>
`pattern`, `matcher`, `extractor`, `view`, etc:<br>
<br>
public pattern Point(int x, int y) { ... }<br>
<br>
If we are using a noun keyword to identify pattern
declarations, we could use<br>
the same noun for all of them, or we could choose a
different one for<br>
deconstruction patterns:<br>
<br>
public deconstructor Point(int x, int y) { ... }<br>
<br>
Alternately, we could reach for a symbol to indicate that
we are talking about<br>
an inverted member. C++ fans might suggest<br>
<br>
public ~Point(int x, int y) { ... }<br>
<br>
but this is too cryptic (it's evocative once you see it,
but then it becomes<br>
less evocative as we move away from deconstructors towards
instance patterns.)<br>
<br>
If we wish to offer finer-grained control over
conditionality, we might<br>
additionally need a `total` / `partial` modifier, though I
would prefer to avoid<br>
that.<br>
<br>
Of the keyword candidates, there is one that stands out
(for good and bad)<br>
because it connects to something that is already in the
language: `pattern`. On<br>
the one hand, using the term `pattern` for the declaration
is a slight abuse; on<br>
the other, users will immediately connect it with "ah, so
that's how I make a<br>
new pattern" or "so that's what happens when I match
against this pattern."<br>
(Lisps would resolve this tension by calling it
`defpattern`.)<br>
<br>
The others (`matcher`, `view`, `extractor`, etc) are all
made-up terms that<br>
don't connect to anything else in the language, for better
or worse. If we pick<br>
one of these, we are asking users to sort out _three_
separate new things in<br>
their heads: (use-site) patterns, (declaration-site)
matchers, and the rules of<br>
how patterns and matchers are connected. Calling them
both "patterns", despite<br>
the mild abuse of terminology, ties them together in a way
that recognizes their<br>
connection.<br>
<br>
My personal position: `pattern` is the strongest candidate
here, despite some<br>
flaws.<br>
<br>
### Binding lists and match candidates<br>
<br>
There are two obvious alternatives for describing the
binding list and match<br>
candidate of a pattern declaration, both with their roots
in the constructor and<br>
method syntax: <br>
<br>
- Pretend that a pattern declaration is like a method
with multiple return, and<br>
put the binding list in the "return position", and make
the match candidate<br>
an ordinary parameter;<br>
- Lean into the inverse relationship between constructors
and methods (and<br>
consistency with the use-site syntax), and put the
binding list in the<br>
"parameter list position". For static patterns and some
instance patterns,<br>
which need to explicitly identify the match candidate
type, there are several<br>
sub-options:<br>
- Lean further into the duality, putting the match
candidate type in the<br>
"return position";<br>
- Put the match candidate type somewhere else, where it
is less likely to be<br>
confused for a method return.<br>
<br>
The "method-like" approach might look like this:<br>
<br>
```<br>
class Point { <br>
// Constructor and deconstructor<br>
public Point(int x, int y) { ... }<br>
public pattern (int x, int y) Point(Point target) {
... }<br>
...<br>
}<br>
<br>
class Optional<T> { <br>
// Static factory and pattern<br>
public static<T> Optional<T> of(T t) { ...
}<br>
public static<T> pattern (T t)
of(Optional<T> target) { ... }<br>
...<br>
}<br>
```<br>
<br>
The "inverse" approach might look like:<br>
<br>
```<br>
class Point { <br>
// Constructor and deconstructor<br>
public Point(int x, int y) { ... }<br>
public pattern Point(int x, int y) { ... }<br>
...<br>
}<br>
<br>
class Optional<T> { <br>
// Static factory and pattern (using the first
sub-option)<br>
public static<T> Optional<T> of(T t) { ...
}<br>
public static<T> pattern Optional<T> of(T
t) { ... }<br>
...<br>
}<br>
```<br>
<br>
With the "method-like" approach, the match candidate gets
an explicit name<br>
selected by the author; with the inverse approach, we can
go with a predefined<br>
name such as `that`. (Because deconstructors do not have
receivers, we could by<br>
abuse of notation arrange for the keyword `this` to refer
instead to the match<br>
candidate within the body of a deconstructor. While this
might seem to lead to<br>
a more familiar notation for writing deconstructors, it
would create a<br>
gratuitous asymmetry between the bodies of deconstruction
patterns and those of<br>
other patterns.)<br>
<br>
Between these choices, nearly all the considerations favor
the "inverse"<br>
approach:<br>
<br>
- The "inverse" approach makes the declaration look like
the use site. This<br>
highlights that `pattern Point(int x, int y)` is what
gets invoked when you<br>
match against the pattern use `Point(int x, int y)`.
(This point is so<br>
strong that we should probably just stop here.)<br>
- The "inverse" members also look like their duals; the
only difference is the<br>
`pattern` keyword (and possibly the placement of the
match candidate type).<br>
This makes matched pairs much more obvious, and such
matched pairs will be<br>
critical both for future language features and for
library idioms.<br>
- The method-like approach is suggestive of multiple
return or tuples, which is<br>
probably helpful for the first few minutes but actually
harmful in the long<br>
term. This feature is _not_ (much as some people would
like to believe) about<br>
multiple return or tuples, and playing into this
misperception will only make<br>
it harder to truly understand. So this suggestion ends
up propping up the<br>
wrong mental model. <br>
<br>
The main downside of the "inverse" approach is the
one-time speed bump of the<br>
unfamiliarity of the inverted syntax. (The "method-like"
syntax also has its<br>
own speed bumps, it is just unfamiliar in different
ways.) But unlike the<br>
advantages of the inverse approach, which continue to add
value forever, this<br>
speed bump is a one-time hurdle to get over. <br>
<br>
To smooth out the speed bumps of the inverse approach, we
can consider moving<br>
the position of the match candidate for static and
(suitable) instance pattern<br>
declarations, such as:<br>
<br>
```<br>
class Optional<T> { <br>
// the usual static factory<br>
public static<T> Optional<T> of(T t) { ...
}<br>
<br>
// Various ways of writing the corresponding pattern<br>
public static<T> pattern of(T t) for
Optional<T> { ... }<br>
// or ...<br>
public static<T> pattern(Optional<T>) of(T
t) { ... }<br>
// or ...<br>
public static<T> pattern(Optional<T> that)
of(T t) { ... }<br>
// or ...<br>
public static<T>
pattern<Optional<T>> of(T t) { ... }<br>
...<br>
}<br>
```<br>
<br>
(The deconstructor example looks the same with either
variant.) Of these,<br>
treating the match candidate like a "parameter" of
"pattern" is probably the<br>
most evocative:<br>
<br>
```<br>
public static<T> pattern(Optional<T> that)
of(T t) { ... }<br>
```<br>
<br>
as it can be read as "pattern taking the parameter
`Optional<T> that` called<br>
`of`, binding `T`", and is a short departure from the
inverse syntax.<br>
<br>
The main value of the various rearrangements is that users
don't need to think<br>
about things operating in reverse to parse the syntax.
This trades some of the<br>
secondary point (patterns looking almost exactly like
their inverses) for a<br>
certain amount of cognitive load, while maintaining the
most important<br>
consideration: that the declaration site look like the use
site. <br>
<br>
For instance pattern declarations, if the match candidate
type is the same as<br>
the receiver type, the match candidate type can be elided
as it is with<br>
deconstructors. <br>
<br>
My personal position: the "multiple return" version is
terrible; all the<br>
sub-variants of the inverse version are probably workable.<br>
<br>
### Naming the match candidate<br>
<br>
We've been assuming so far that the match candidate always
has a fixed name,<br>
such as `that`; this is an entirely workable approach.
Some of the variants are<br>
also amenable to allowing authors to explicitly select a
name for the match<br>
candidate. For example, if we put the match candidate as
a "parameter" to the `pattern` keyword, there is an
obvious place to put the name:<br>
<br>
```<br>
static<T> pattern(Optional<T> target) of(T t)
{ ... }<br>
```<br>
<br>
My personal opinion: I don't think this degree of freedom
buys us much, and in<br>
the long run readability probably benefits by picking a
fixed name like `that`<br>
and sticking with it. Even with a fixed name, if there is
a sensible position<br>
for the name, allowing users to type `that` for
explicitness is fine (as we do<br>
with instance methods, though many people don't know
this.) We may even want to<br>
require it.<br>
<br>
## Body types<br>
<br>
Just as there are two obvious approaches for the
declaration, there are two<br>
obvious approaches we could take for the body (though
there is some coupling<br>
between them.) We'll call the two body approaches
_imperative_ and<br>
_functional_. <br>
<br>
The imperative approach treats bindings as initially-DU
variables that must be<br>
DA on successful completion, getting their value through
ordinary assignment;<br>
the functional approach sets all the bindings at once,
positionally. Either<br>
way, member patterns (except maybe deconstructors) also
need a way to<br>
differentiate a successful match from a failed match. <br>
<br>
Here is the `Point` deconstructor with both imperative and
functional style. The<br>
functional style uses a placeholder `match` statement to
indicate a successful<br>
match and provision of bindings:<br>
<br>
```<br>
class Point {<br>
int x, y;<br>
<br>
Point(int x, int y) {<br>
this.x = x;<br>
this.y = y;<br>
}<br>
<br>
// Imperative style, deconstructor always succeeds<br>
pattern Point(int x, int y) {<br>
x = that.x;<br>
y = that.y;<br>
}<br>
<br>
// Functional style<br>
pattern Point(int x, int y) {<br>
match(that.x, that.y);<br>
}<br>
}<br>
```<br>
<br>
There are some obvious differences here. In the
imperative style, the dtor body<br>
looks much more like the reverse of the ctor body. The
functional style is more<br>
concise (and amenable to further concision via the
"concise method bodies"<br>
mechanism in the future). There are also a number of less
obvious differences. For<br>
deconstructors, the imperative approach is likely to feel
more natural because<br>
of the obvious symmetry with constructors.<br>
<br>
In reality, it is _premature at this point to have an
opinion_, because we<br>
haven't yet seen the full scope of the problem;
deconstructors are a special<br>
case in many ways, which almost surely is distorting our
initial opinion. As we<br>
move towards conditional patterns (and pattern lambdas),
our opinions may flip.<br>
<br>
Regardless of which we pick, there are some additional
syntactic choices to be<br>
made -- what syntax to use to indicate success (we used
`match` in the above<br>
example) or failure. (We should be especially careful
around trying to reuse<br>
words like `return`, `break`, or `yield` because, in the
case where there are<br>
zero bindings (which is allowable), it becomes unclear
whether they mean "fail"<br>
or "succeed with zero bindings".) <br>
<br>
### Success and failure<br>
<br>
Except for possibly deconstructors, which we may require
to be total, a pattern<br>
declaration needs a way to indicate success and failure.
In the examples above,<br>
we posited a `match` statement to indicate success in the
functional approach,<br>
and in both examples leaned on the "implicit success" of
deconstructors (under<br>
the assumption they always succeed). Now let's look at
the more general case to<br>
figure out what else is needed.<br>
<br>
For a static pattern like `Optional::of`, success is
conditional. Using<br>
`match-fail` as a placeholder for "the match failed", this
might look like<br>
(functional version):<br>
<br>
```<br>
public static<T> pattern(Optional<T> that)
of(T t) { <br>
if (that.isPresent())<br>
match (that.get());<br>
else<br>
match-fail;<br>
}<br>
```<br>
<br>
The imperative version is less pretty, though. Using
`match-success` as a<br>
placeholder:<br>
<br>
```<br>
public static<T> pattern(Optional<T> that)
of(T t) { <br>
if (that.isPresent()) {<br>
t = that.get();<br>
match-success;<br>
}<br>
else<br>
match-fail;<br>
}<br>
```<br>
<br>
Both arms of the `if` feel excessively ceremonial here.
And if we chose to not<br>
make all deconstruction patterns unconditional,
deconstructors would likely need<br>
some explicit success as well:<br>
<br>
```<br>
pattern Point(int x, int y) {<br>
x = that.x;<br>
y = that.y;<br>
match-success;<br>
}<br>
```<br>
<br>
It might be tempting to try and eliminate the need for
explicit success by<br>
inferring it from whether or not the bindings are DA or
not, but this is<br>
error-prone, is less type-checkable, and falls apart
completely for patterns<br>
with no bindings.<br>
<br>
### Implicit failure in the functional approach<br>
<br>
One of the ceremonial-seeming aspects of `Optional::of`
above is having to say<br>
`else match-fail`, which doesn't feel like it adds a lot
of value. Perhaps we<br>
can be more concise without losing clarity. <br>
<br>
Most conditional patterns will have a predicate to
determine matching, and then<br>
some conditional code to compute the bindings and claim
success. Having to say<br>
"and if the predicate didn't hold, then I fail" seems like
ceremony for the<br>
author and noise for the reader. Instead, if a
conditional pattern falls off<br>
the end without matching, we could treat that as simply
not matching:<br>
<br>
```<br>
public static<T> pattern(Optional<T> that)
of(T t) { <br>
if (that.isPresent())<br>
match (that.get());<br>
}<br>
```<br>
<br>
This says what we mean: if the optional is present, then
this pattern succeeds<br>
and binds the contents of the `Optional`. As long as our
"succeed" construct<br>
strongly enough connotes that we are terminating abruptly
and successfully, this<br>
code is perfectly clear. And most conditional patterns
will look a lot like<br>
`Optional::of`; do some sort of test and if it succeeds,
extract the state and<br>
bind it.<br>
<br>
At first glance, this "implicit fail" idiom may seem
error-prone or sloppy. But<br>
after writing a few dozen patterns, one quickly tires of
saying "else<br>
match-fail" -- and the reader doesn't necessarily
appreciate reading it either. <br>
<br>
Implicit failure also simplifies the selection of how we
explicitly indicate<br>
failure; using `return` in a pattern for "no match"
becomes pretty much a forced<br>
move. We observe that (in a void method), "return" and
"falling off the end"<br>
are equivalent; if "falling off the end" means "no match",
then so should an<br>
explicit `return`. So in those few cases where we need to
explicitly signal "no<br>
match", we can just use `return`. It won't come up that
often, but here's an<br>
example where it does: <br>
<br>
```<br>
static pattern(int that) powerOfTwo(int exp) {<br>
int exp = 0;<br>
<br>
if (that < 1)<br>
return; // explicit fail<br>
<br>
while (that > 1) {<br>
if (that % 2 == 0) {<br>
that /= 2;<br>
++exp;<br>
}<br>
else<br>
return; // explicit fail<br>
}<br>
match (exp);<br>
}<br>
```<br>
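<br>
To make the control flow concrete, here is a sketch of the same computation in current Java, with `OptionalInt.empty()` standing in for "no match" and `OptionalInt.of(exp)` standing in for `match (exp)`. The class name and the local `rest` (a copy, so the candidate itself is not mutated) are assumptions of this sketch.<br>
<br>
<pre>
```java
import java.util.OptionalInt;

public class PowerOfTwoDemo {
    // Emulates the powerOfTwo pattern: a present result is a successful
    // match binding the exponent; an empty result is a failed match.
    static OptionalInt powerOfTwo(int candidate) {
        if (candidate < 1) {
            return OptionalInt.empty(); // explicit fail
        }
        int exp = 0;
        int rest = candidate;
        while (rest > 1) {
            if (rest % 2 != 0) {
                return OptionalInt.empty(); // explicit fail
            }
            rest /= 2;
            ++exp;
        }
        return OptionalInt.of(exp); // success, binding exp
    }

    public static void main(String[] args) {
        int[] samples = {1, 6, 8, 0};
        for (int sample : samples) {
            System.out.println(sample + " : " + powerOfTwo(sample));
        }
    }
}
```
</pre>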
<br>
As a bonus, if `return` as match failure is a forced move,
we need only select a<br>
term for "successful match" (which obviously can't be
`return`). We could use<br>
`match` as we have in the examples, or a variant like
`matched` or `matches`.<br>
But rather than just creating a new control operator, we
have an opportunity to<br>
lean into the duality a little harder, by including the
pattern syntax in the<br>
match:<br>
<br>
```<br>
matches of(that.get());<br>
```<br>
<br>
or the (optionally?) qualified (inferring type arguments,
as we do at the use<br>
site):<br>
<br>
```<br>
matches Optional.of(that.get());<br>
```<br>
<br>
These "use the name" approaches trade a small amount of
verbosity to gain a<br>
higher degree of fidelity to the pattern use site (and to
evoke the comethod<br>
completion.) <br>
<br>
If we don't choose "implicit fail", we would have to
invent _two_ new control<br>
flow statements to indicate "success" and "failure". <br>
<br>
My personal position: for the functional approach,
implicit failure both makes<br>
the code simpler and clearer, and after you get used to
it, you don't want to go<br>
back. `match`, `matches`, and `matches
<pattern-name>` are all<br>
workable, though I like some variant that names the
pattern.<br>
<br>
### Implicit success in the imperative approach<br>
<br>
In the imperative approach, we can be implicit as well,
but it feels more<br>
natural (at least, initially) to choose implicit success
rather than failure.<br>
This works great for unconditional patterns:<br>
<br>
```<br>
pattern Point(int x, int y) {<br>
x = that.x;<br>
y = that.y;<br>
// implicit success<br>
}<br>
```<br>
<br>
but not quite as well for conditional patterns:<br>
<br>
```<br>
static<T> pattern(Optional<T> that) of(T t) {
<br>
if (that.isPresent()) {<br>
t = that.get();<br>
}<br>
else<br>
match-fail;<br>
// implicit success<br>
}<br>
```<br>
<br>
We can eliminate one of the arms of the if, with the more
concise (but<br>
convoluted) inversion:<br>
<br>
```<br>
static<T> pattern(Optional<T> that) of(T t) {
<br>
if (!that.isPresent()) <br>
match-fail;<br>
t = that.get();<br>
// implicit success<br>
}<br>
```<br>
<br>
Just as with the functional approach, if we choose
imperative and "implicit<br>
success", using `return` to indicate success is pretty
much a forced move. <br>
<br>
### Imperative is a trap<br>
<br>
If we assume that functional implies implicit failure, and
imperative implies<br>
implicit success, then our choices become: <br>
<br>
```<br>
class Optional<T> { <br>
public static<T> Optional<T> of(T t) { ...
}<br>
<br>
// imperative, implicit success<br>
public static<T> pattern(Optional<T> that)
of(T t) { <br>
if (that.isPresent()) {<br>
t = that.get();<br>
}<br>
else<br>
match-fail;<br>
}<br>
<br>
// functional, implicit failure<br>
public static<T> pattern(Optional<T> that)
of(T t) { <br>
if (that.isPresent())<br>
matches of(that.get());<br>
}<br>
}<br>
```<br>
<br>
Once we get past deconstructors, the imperative approach
looks worse by<br>
comparison because we need to assign all the bindings
(which is _O(n)_<br>
assignments) _and also_ indicate success or failure
somehow, whereas in the<br>
functional style all can be done together with a single
`matches` statement.<br>
<br>
Looking at the alternatives, except maybe for
unconditional patterns, the<br>
functional example above seems a lot more natural. The
imperative approach<br>
works with deconstructors (assuming they are not
conditional), but does not<br>
scale so well to conditionality -- which is the essence of
patterns.<br>
<br>
From a theoretical perspective, the method-comethod
duality also gives us a<br>
forceful nudge towards the functional approach. In a
method, the method<br>
arguments are specified as a positional list of
expressions at the use site: <br>
<br>
m(a, b, c)<br>
<br>
and these values are invisibly copied into the parameter
slots of the method<br>
prior to frame activation. The dual to that is for a
comethod to similarly convey<br>
the bindings in a positional list of expressions (as they
must either all be<br>
produced or none), where they are copied into the slots
provided at the use<br>
site, as is indicated by `matches` in the above examples.
<br>
<br>
My personal position: the imperative style feels like a
trap. It seems<br>
"obvious" at first if we start with deconstructors, but
becomes increasingly<br>
difficult when we get past this case, and gets in the way
of other<br>
opportunities. The last gasp before acceptance is the
discomfort that dtor and<br>
ctor bodies are written in different styles, but in the
rear-view mirror, this<br>
feels like a non-issue. <br>
<br>
### Derive imperative from functional?<br>
<br>
If we start with "functional with implicit failure", we
can possibly rescue<br>
imperative by deriving a version of imperative from
functional, by "overloading"<br>
the match-success operator. <br>
<br>
If we have a pattern whose binding names are `b1..bn` of
types `B1..Bn`, then<br>
the `matches` operator must take a list of expressions
`e1..en` whose arity and<br>
types are compatible with `B1..Bn`. But we could allow
`matches` to also have a<br>
nullary form, which would be
shorthand for <br>
<br>
matches <pattern-name>(b1, b2, ..., bn)<br>
<br>
where each of `b1..bn` must be DA at the point of
matching. This means that we<br>
could express patterns in either form:<br>
<br>
```<br>
class Optional<T> { <br>
public static<T> Optional<T> of(T t) { ...
}<br>
<br>
// imperative, derived from functional with implicit
failure<br>
public static<T> pattern(Optional<T> that)
of(T t) { <br>
if (that.isPresent()) {<br>
t = that.get();<br>
matches of;<br>
}<br>
}<br>
<br>
public static<T> pattern(Optional<T> that)
of(T t) { <br>
if (that.isPresent())<br>
matches of(that.get());<br>
}<br>
}<br>
```<br>
<br>
This flexibility allows users to select a more verbose
expression in exchange<br>
for a clearer association of expressions and bindings,
though as we'll see, it<br>
does come with some additional constraints.<br>
<br>
### Wrapping an existing API<br>
<br>
Nearly every library has methods (sometimes sets of
methods) that are patterns<br>
in disguise, such as the pair of methods `isArray` and
`getComponentType` in<br>
`Class`, or the `Matcher` helper type in
`java.util.regex`. Library maintainers<br>
will likely want to wrap (or replace) these with real
patterns, so these can<br>
participate more effectively in conditional contexts, and
in some cases,<br>
highlight their duality with factory methods.<br>
<br>
Matching a string against a `j.u.r.Pattern` regular
expression has all the same<br>
elements as a pattern, just with an ad-hoc API (and one
that I have to look up<br>
every time). But we can fairly easily wrap a true pattern
around the existing<br>
API. To match against a `Pattern` today, we pass the
match candidate to<br>
`Pattern::matcher`, which returns a `Matcher` with
accessors `Matcher::matches`<br>
(did it match) and `Matcher::group` (conditionally extract
a particular capture<br>
group.) If we want to wrap this with a pattern called
`regexMatch`:<br>
<br>
```<br>
pattern(String that) regexMatch(String... groups) {<br>
Matcher m = this.matcher(that);<br>
if (m.matches())<br>
matches Pattern.regexMatch(IntStream.rangeClosed(1,
m.groupCount())<br>
.mapToObj(m::group)<br>
.toArray(String[]::new));<br>
// whole lotta matchin' goin' on<br>
}<br>
```<br>
<br>
This says that a `j.u.r.Pattern` has an instance pattern
called `regexMatch`, whose<br>
match candidate is `String`, and which binds a varargs of
`String` corresponding<br>
to the capture groups. The implementation simply
delegates to the existing<br>
`j.u.r.Matcher` API. This means that `j.u.r.Pattern`
becomes a sort of "pattern<br>
object", and we can use it as a receiver at the use site:
<br>
<br>
```<br>
static Pattern As = Pattern.compile("(a*)");<br>
static Pattern Bs = Pattern.compile("(b*)");<br>
...<br>
switch (string) { <br>
case As.regexMatch(var as): ...<br>
case Bs.regexMatch(var bs): ...<br>
...<br>
}<br>
```<br>
<br>
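For reference, here is a rough sketch of what such a use
site has to spell out with<br>
the existing `Matcher` API today. The helper name
`regexMatch` and the<br>
`Optional<String[]>` return shape are assumptions for
illustration only; the calls<br>
on `Pattern`, `Matcher`, and `IntStream` are the existing
library methods.<br>
<br>
```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

class RegexMatchSketch {
    // Hypothetical helper: empty means "no match"; otherwise the
    // capture groups 1..groupCount() of a whole-input match.
    static Optional<String[]> regexMatch(Pattern pattern, String candidate) {
        Matcher m = pattern.matcher(candidate);
        if (!m.matches())
            return Optional.empty();
        String[] groups = IntStream.rangeClosed(1, m.groupCount())
                                   .mapToObj(m::group)
                                   .toArray(String[]::new);
        return Optional.of(groups);
    }

    public static void main(String[] args) {
        Pattern as = Pattern.compile("(a*)");
        // The caller must test for a match and index into the groups by hand.
        regexMatch(as, "aaa").ifPresent(groups -> System.out.println(groups[0]));
    }
}
```
<br>
Note that group 0 (the whole match) is deliberately
skipped, so the array holds<br>
only the explicit capture groups.<br>
<br>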
### Odds and ends<br>
<br>
There are a number of loose ends here. We could choose
other names for the<br>
match-success and match-fail operations, including trying
to reuse `break` or<br>
`yield`. But, this reuse is tricky; it must be very clear
whether a given form<br>
of abrupt completion means "success" or "failure", because
in the case of<br>
patterns with no bindings, we will have no other syntactic
cues to help<br>
disambiguate. (I think having a single `matches`, with
implicit failure and<br>
`return` meaning failure, is the sweet spot here.)<br>
<br>
Another question is whether the binding list introduces
corresponding variables<br>
into the scope of the body. For imperative, the answer is
"surely yes"; for<br>
functional, the answer is "maybe" (unless we want to do
the trick where we<br>
derive imperative from functional, in which case the
answer is "yes" again.)<br>
<br>
If the binding list does not correspond to variables in
the body, this may be<br>
initially discomforting; because the bindings do not
declare program elements, they may<br>
feel as though they are left "dangling". But even if they
are not declaring<br>
_program_ elements, they are still declaring _API_
elements (similar to the<br>
return type of a method.) We will want to provide Javadoc
on the bindings, just<br>
like with parameters; we will want to match up binding
names in deconstructors<br>
with parameter names in constructors; we may even someday
want to support<br>
by-name binding at the use site (e.g., `case Foo(a: var
a)`). The names are<br>
needed for all of these, just not for the body. Names
still matter. My take<br>
here is that this is a transient "different is scary"
reaction, one that we<br>
would get over quickly.<br>
<br>
A final question is whether we should consider unqualified
names as implicitly<br>
qualified by `that` (and also `this`, for instance
patterns, with some conflict<br>
resolution). Users will probably grow tired of typing
`that.` all the time, and most of the time, the
unqualified use is perfectly readable.<br>
<br>
## Exhaustiveness <br>
<br>
There is one last syntax question in front of us: how to
indicate that a set of<br>
patterns are (claimed to be) exhaustive on a given match
candidate type. We see<br>
this with `Optional::of` and `Optional::empty`; it would
be sad if the compiler<br>
did not realize that these two patterns together were
exhaustive on `Optional`.<br>
This is not a feature that will be used often, but not
having it at all will be<br>
a repeated irritant.<br>
<br>
The best I've come up with is to call these `case`
patterns, where a set of<br>
`case` patterns for a given match candidate type in a
given class are asserted<br>
to be an exhaustive set:<br>
<br>
```<br>
class Optional<T> { <br>
static<T> Optional<T> of(T t) { ... }<br>
static<T> Optional<T> empty() { ... }<br>
<br>
static<T> case pattern(Optional<T> that) of(T t)
{ ... }<br>
static<T> case pattern(Optional<T> that) empty()
{ ... }<br>
}<br>
```<br>
<br>
Because they may not be truly exhaustive, `switch`
constructs will have to back<br>
up the static assumption of exhaustiveness with a dynamic
check, as we do for<br>
other sets of exhaustive patterns that may have remainder.<br>
<br>
I've experimented with variants of `sealed` but it felt
more forced, so this is<br>
the best I've come up with.<br>
<br>
## Example: patterns delegating to other patterns<br>
<br>
Pattern implementations must compose. Just as a subclass
constructor delegates<br>
to a superclass constructor, the same should be true for
deconstructors.<br>
Here's a typical superclass-subclass pair: <br>
<br>
```<br>
class A { <br>
private final int a;<br>
<br>
public A(int a) { this.a = a; }<br>
public pattern A(int a) { matches A(that.a); }<br>
}<br>
<br>
class B extends A { <br>
private final int b;<br>
<br>
public B(int a, int b) { <br>
super(a);<br>
this.b = b; <br>
}<br>
<br>
// Imperative style <br>
public pattern B(int a, int b) {<br>
if (that instanceof super(var aa)) {<br>
a = aa;<br>
b = that.b;<br>
matches B;<br>
}<br>
}<br>
<br>
// Functional style<br>
public pattern B(int a, int b) {<br>
if (that instanceof super(var a)) <br>
matches B(a, that.b);<br>
}<br>
}<br>
```<br>
<br>
(Ignore the flow analysis and totality for the time being;
we'll come back to<br>
this in a separate document.)<br>
<br>
The first thing that jumps out at us is that, in the
imperative version, we had<br>
to create a "garbage" variable `aa` to receive the
binding, because `a` was<br>
already in scope, and then we have to copy the garbage
variable into the real<br>
binding variable. Users will surely balk at this, and
rightly so. In the<br>
functional version (depending on the choices from "Odds
and Ends") we are free<br>
to use the more natural name and avoid the roundabout
locution.<br>
<br>
We might be tempted to fix the "garbage variable" problem
by inventing another<br>
sub-feature: the ability to use an existing variable as
the target of a binding,<br>
such as:<br>
<br>
```<br>
pattern B(int a, int b) {<br>
if (that instanceof A(__bind a))<br>
b = that.b;<br>
}<br>
```<br>
<br>
But, I think the language is stronger without this
feature, for two reasons.<br>
First, having to reason about whether a pattern match
introduces a new binding<br>
or assigns to an existing variable is additional
cognitive load for<br>
users, and second, having assignment to locals
happening through<br>
something other than assignment introduces additional
complexity in finding<br>
where a variable is modified. While we can argue about
the general utility of<br>
this feature, bringing it in just to solve the
garbage-variable problem is<br>
particularly unattractive. <br>
<br>
## Pattern lambdas<br>
<br>
One final consideration is that patterns may also have
a lambda form. Given<br>
a single-abstract-pattern (SAP) interface:<br>
<br>
```<br>
interface Converter<T,U> { <br>
pattern(T t) convert(U u);<br>
}<br>
```<br>
<br>
one can implement such a pattern with a lambda. Such a
lambda has one parameter<br>
(the match candidate), and its body looks like the body of
a declared pattern:<br>
<br>
```<br>
Converter<Integer, Short> c = <br>
i -> { <br>
if (i >= Short.MIN_VALUE && i <=
Short.MAX_VALUE)<br>
matches Converter.convert((short) i);<br>
};<br>
```<br>
<br>
Because the bindings of the pattern lambda are defined in
the interface, not in<br>
the lambda, this is one more reason not to like the
imperative version: it is<br>
brittle, and alpha-renaming bindings in the interface
would be a<br>
source-incompatible change.<br>
<br>
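As a point of comparison, a pattern lambda can be
approximated in today's Java by<br>
a functional interface whose single method returns an
`Optional`. This is only a<br>
sketch under that assumption: the `Optional`-returning
`Converter` below is a<br>
hypothetical stand-in, not the SAP interface described
above, and an empty result<br>
plays the role of implicit failure.<br>
<br>
```java
import java.util.Optional;

class PatternLambdaSketch {
    // Hypothetical stand-in for a single-abstract-pattern interface:
    // empty means "no match", present carries the single binding.
    interface Converter<T, U> {
        Optional<U> convert(T t);
    }

    // Matches only when the int value fits in a short.
    static final Converter<Integer, Short> INT_TO_SHORT = i -> {
        if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
            return Optional.of(i.shortValue());
        return Optional.empty();
    };

    public static void main(String[] args) {
        System.out.println(INT_TO_SHORT.convert(42).isPresent());     // fits in a short
        System.out.println(INT_TO_SHORT.convert(70000).isPresent());  // does not fit
    }
}
```
<br>
Here the lambda never mentions a binding name, so renaming
the interface's<br>
parameter or result would not break it; that is the
property the functional<br>
style preserves.<br>
<br>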
## Example gallery<br>
<br>
Here are all the pattern examples so far, and a few more,
using the suggested<br>
style (functional, implicit fail, implicit
`that`-qualification):<br>
<br>
```<br>
// Point dtor <br>
pattern Point(int x, int y) {<br>
matches Point(x, y);<br>
}<br>
<br>
// Optional -- static patterns for Optional::of,
Optional::empty<br>
static<T> case pattern(Optional<T> that) of(T
t) { <br>
if (isPresent())<br>
matches of(get());<br>
}<br>
<br>
static<T> case pattern(Optional<T> that)
empty() { <br>
if (!isPresent())<br>
matches empty();<br>
}<br>
<br>
// Class -- instance pattern for arrayClass (match
candidate type inferred)<br>
pattern arrayClass(Class<?> componentType) { <br>
if (that.isArray())<br>
matches arrayClass(that.getComponentType());<br>
}<br>
<br>
// regular expression -- instance pattern in j.u.r.Pattern<br>
pattern(String that) regexMatch(String... groups) {<br>
Matcher m = matcher(that);<br>
if (m.matches())<br>
matches Pattern.regexMatch(IntStream.rangeClosed(1,
m.groupCount())<br>
.mapToObj(m::group)<br>
.toArray(String[]::new));<br>
}<br>
<br>
// power of two (somewhere)<br>
static pattern(int that) powerOfTwo(int exp) {<br>
int exp = 0;<br>
<br>
if (that < 1)<br>
return;<br>
<br>
while (that > 1) {<br>
if (that % 2 == 0) {<br>
that /= 2;<br>
exp++;<br>
}<br>
else<br>
return;<br>
}<br>
matches powerOfTwo(exp);<br>
}<br>
```<br>
<br>
## Closing thoughts<br>
<br>
I came out of this exploration with very different
conclusions than I expected<br>
when going in. At first, the "inverse" syntax seemed
stilted, but over time it<br>
started to seem more obvious. Similarly, I went in
expecting to prefer the<br>
imperative approach for the body, but over time, started
to warm to the<br>
functional approach, and eventually concluded it was
basically a forced move if<br>
we want to support more than just deconstructors. And I
started out skeptical<br>
of "implicit fail", but after writing a few dozen patterns
with it, going back<br>
to fully explicit felt painful. All of this is to say,
you should hold your<br>
initial opinions at arm's length, and give the
alternatives a chance to sink in.<br>
<br>
For most _conditional_ patterns (and conditionality is at
the heart of pattern<br>
matching), the functional approach cleanly highlights both
the match predicate<br>
and the flow of values, and is considerably less fussy
than the imperative<br>
approach in the same situation; `Optional::of`,
`Class::arrayClass`, and `regexMatch`<br>
look great here, much better than they would with
imperative. None of these<br>
illustrate delegation, but in the presence of delegation,
the gap gets even<br>
wider.<br>
<br>
</font> </div>
</blockquote>
</div>
</blockquote>
<br>
</body>
</html>