Towards member patterns
Brian Goetz
brian.goetz at oracle.com
Fri Jan 26 12:31:54 UTC 2024
> I think your proposal solves the cases where the type you are switching on is closed (final, sealed) but not if the type is open (non-sealed).
A bold claim! Let's see how this stacks up.
> Let's take an example, let suppose I've the following hierarchy
>
> public sealed interface Tree {
... snip ... sealed class, private implementation classes, public static
factories, public static patterns ... check.
> If I want to have a static method children that returns all the children of the Tree, using the pattern matching I would like to write
>
> static List<Tree> children(Tree tree) {
> return switch(tree) {
> case Tree.none() -> List.of();
> case Tree.cons(Tree child) -> List.of(child);
> };
> }
Full disclosure: we're not totally there yet. This switch isn't (yet)
exhaustive; we need a way to mark none+cons as being an exhaustive set.
That's on the list, but was looking to sync on the broad strokes first.
> As I said, it works great with a closed hierarchy, but now let suppose the hierarchy is not sealed, if the hierarchy is not sealed, having static factories make less sense because we do not know all the subtypes.
I don't see this. (As one example, consider List: it is open, yet there
are static factories like List.of(...)). We had static factories long
before we had sealed hierarchies. But let's keep going.
> So we have
>
> public interface Tree {}
> public enum None implemnts Tree { NONE }
> public class Cons implements Tree {
> private final Tree tree;
>
> public Cons(Tree tree) { this.tree = tree; }
> }
>
> and in the future, someone may add
> public class Node {
> private final Tree, left, right;
>
> public Node(Tree left, Tree right) { this.left = left; this.right = right; }
> }
>
> Because the hierarchy is open, we need to use the late binding here.
> So i may rewrite children like this
> static List<Tree> children(Tree tree) {
> return switch(tree) {
> case that.extract(List<Tree> list) -> list; // wrong syntax, it's just to convey the semantics
> };
> }
I'm not sure what this example is supposed to say, since `that` is only
defined inside the body of a pattern method. Are you trying to do
child-extraction as a pattern, rather than as an accessor? (This is a
modeling question.) I'm not sure this is a great modeling for a Tree,
but let's look past that. If so, Tree needs an _abstract pattern_ that
binds a List<Tree>. That's easy:
interface Tree<T> {
public __inverse Tree withChildren(List<T> children);
}
and the subclasses can each override it:
class Empty<T> implements Tree<T> {
public __inverse Tree withChildren(List<T> children) {
yield Collections.emptyList();
}
}
...
and the client can take an arbitrary Tree and match it:
case Tree.withChildren(var children) -> ...
So I don't see that this doesn't work, but I think I see where you got
confused.
> Here, we we want to call an abstract pattern method that will be implemented differently for each subclasses, but your proposal does not allow that (sorry for the pun).
Yes, it does. (This conversation would be easier if you could frame
this as a question ("Can I ...") rather than an statement ("It is not
possible...") which turns out to be incorrect.)
> Inside a pattern, there are two implicit values, we have 'this' as usual and we have 'that' (we call it that way) that represent the value actually matched.
Correct. Let's talk about the role of these two context variables.
Every pattern has a match candidate. This is the thing on the RHS of
the instanceof, or the selector in the switch. It is the thing about
which we ask "does the thing match the pattern."
Every pattern has a _primary type_. It is the minimal type for which
the match candidate could possibly match the pattern. For a record
pattern like `Point(int x, int y)`, the primary type is Point. (A
pattern is rejected at compile time as inapplicable if the type of the
match candidate is not cast-convertible to the primary type of the pattern.)
In the body of a pattern method, the match candidate is denoted with the
context variable `that`, whose type is the primary type of the pattern.
The compiler may have to make up some of the difference between the type
of the match candidate and the primary type:
Object o = ...
switch (o) {
case Foo(int x) -> ...
}
Here, the primary type of the Foo pattern is Foo, so to test if the case
matches, the compiler inserts an `instanceof Foo`, and if that succeeds,
casts `o` to `Foo`, and invokes the Foo pattern with that.
Not every pattern has a receiver, just like not every method has a
receiver. Constructors and instance methods have receivers; same with
their pattern counterparts. For deconstructors, both the receiver and
the match candidate are the same object. This is not true for all
instance patterns.
A receiver plays two roles in a pattern match, just as it does in a
method invocation:
- Finding the code to invoke by searching the class hiearchy
- Associating the implementing code with the state of the object, in
case the implementation of the pattern needs some state from the object
that declares it
Let's go through two examples to see the cases.
AN easy example is regular expressions. We have a class
j.u.regex.Pattern, which represents a compiled regex. A regular
expression match is a form of pattern match (there's a match candidate,
it is conditional, if it succeeds we extract the capture groups.)
Surely we should expose a "match" pattern on Pattern.
class Pattern {
public __inverse String regexMatch(String... groups) {
Matcher m = matcher(that);
if (m.matches())
__yield IntStream.range(1, m.groupCount())
.map(Matcher::group)
.toArray(String[]::new); }
}
We match it with an explicit receiver:
final Pattern As = Pattern.compile("([aA]*)");
...
if (aString instanceof As.regexMatch(String as)) { ... }
The body uses both `this` and `that`. When it goes to do the actual
matching, it takes the match candidate, `that`, and passes it to
`matcher()`; we are matching against the match candidate, not the
receiver. But it also uses the receiver in the same line of code,
quietly; the locution `matcher(that)` is really `this.matcher(that)`.
It is using the state of _this regex_ to determine the match logic. The
pattern needs both, and they are different objects.
In our `instanceof` test, there are two "parameters", though neither of
them looks like one: the match candidate (on the LHS of the instanceof)
and the receiver. These are packaged up as `that` and `this` for the
pattern invocation.
The other example is a conditional behavior on an object, such as "does
this List have any elements, and if so, give me one." We put an
abstract pattern on List:
interface List<T> {
public __inverse List<T> withElement(T element);
}
(It could also be a default pattern; works the same as default
methods.) The implementation in emptyList always fails. The
implementation in ArrayList might look like:
public __inverse List<T> withElement(T element) {
if (that.size > 0)
_yield that.elements[0];
}
Now, implementing this guy gets tricky, since we have two context
variables which are both of the same type, ArrayList<T>. (Maybe we have
to explicitly use a covariant "override" here; TBD.) But as it turns
out, the two will usually be the same object:
switch (aCollection) {
case List.withElement(var t): ...
}
How does this match work? Well, the primary type of List.withElement is
List<T>, so the compiler tests `aCollection instanceof List`, and if so,
casts the match candidate to List. Since there is no explicit receiver,
it uses the match candidate as the receiver also (this is like an
unbound method reference), and does the virtual method search, and finds
ArrayList::withElement, and invokes it. Different types of collections
will use different implementations of the pattern.
> Now, to finish the example, using '::' instead of '.', children in the first example should be written like this
Remember you're not supposed to use words like "should" ;)
> static List<Tree> children(Tree tree) {
> return switch(tree) {
> case Tree::extract(List<Tree> list) -> list;
case Tree.extract, but yes.
> I really think that not using 'that' as the receiver when calling an inverse instance method is a missing opportunity because without that (again :) ), there is no way to call an inverse abstract method, so no way to pattern match on an open hierarchy.
Hopefully I've cleared up part of the confusion; there are two ways to
denote an instance pattern in a match: bound and unbound, and when it is
unbound, it uses the match candidate as the receiver.
So if your statement is "there should also be a way to ...", it is
correct, but if your statement is "the receiver must be the match
candidate", then that is catastrophically wrong, because then you can't
do regex, type class witnesses, pattern objects, etc.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-spec-experts/attachments/20240126/e80767d7/attachment-0001.htm>
More information about the amber-spec-experts
mailing list