Towards member patterns

Brian Goetz brian.goetz at oracle.com
Fri Jan 26 12:31:54 UTC 2024


> I think your proposal solves the cases where the type you are switching on is closed (final, sealed) but not if the type is open (non-sealed).

A bold claim!  Let's see how this stacks up.

> Let's take an example. Suppose I have the following hierarchy:
>
>    public sealed interface Tree {

... snip ... sealed class, private implementation classes, public static 
factories, public static patterns ... check.

> If I want to have a static method children that returns all the children of the Tree, using the pattern matching I would like to write
>
>    static List<Tree> children(Tree tree) {
>      return switch(tree) {
>        case Tree.none() -> List.of();
>        case Tree.cons(Tree child) -> List.of(child);
>      };
>    }

Full disclosure: we're not totally there yet.  This switch isn't (yet) 
exhaustive; we need a way to mark none+cons as being an exhaustive set.  
That's on the list, but I was looking to sync on the broad strokes first.

> As I said, it works great with a closed hierarchy, but now let's suppose the hierarchy is not sealed. If the hierarchy is not sealed, having static factories makes less sense because we do not know all the subtypes.

I don't see this. (As one example, consider List: it is open, yet there 
are static factories like List.of(...)).  We had static factories long 
before we had sealed hierarchies.  But let's keep going.

> So we have
>
>    public interface Tree {}
>    public enum None implements Tree { NONE }
>    public class Cons implements Tree {
>      private final Tree tree;
>
>      public Cons(Tree tree) { this.tree = tree; }
>    }
>
> and in the future, someone may add
>    public class Node {
>      private final Tree left, right;
>
>      public Node(Tree left, Tree right) { this.left = left; this.right = right; }
>    }
>
> Because the hierarchy is open, we need to use the late binding here.
> So I may rewrite children like this
>    static List<Tree> children(Tree tree) {
>      return switch(tree) {
>        case that.extract(List<Tree> list) -> list;   // wrong syntax, it's just to convey the semantics
>      };
>    }

I'm not sure what this example is supposed to say, since `that` is only 
defined inside the body of a pattern method.  Are you trying to do 
child-extraction as a pattern, rather than as an accessor?  (This is a 
modeling question.)  I'm not sure this is a great modeling for a Tree, 
but let's look past that.  If so, Tree needs an _abstract pattern_ that 
binds a List<Tree>.  That's easy:

     interface Tree<T> {
         public __inverse Tree withChildren(List<T> children);
     }

and the subclasses can each override it:

     class Empty<T> implements Tree<T> {
         public __inverse Tree withChildren(List<T> children) {
             __yield Collections.emptyList();
         }
     }
     ...

and the client can take an arbitrary Tree and match it:

     case Tree.withChildren(var children) -> ...
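
As a rough illustration only, the same shape can be emulated in today's Java by letting an ordinary abstract method return an Optional, where an empty Optional stands in for "the pattern did not match".  This is a sketch, not the proposed feature: the class OpenTreeDemo, the non-generic Tree (taken from the quoted hierarchy), and the Optional encoding are all assumptions made for the example.

```java
import java.util.List;
import java.util.Optional;

final class OpenTreeDemo {
    // Open (non-sealed) hierarchy.  withChildren() emulates an abstract
    // pattern that binds a List<Tree>; Optional.empty() would mean "no match".
    interface Tree {
        Optional<List<Tree>> withChildren();
    }

    enum None implements Tree {
        NONE;

        @Override
        public Optional<List<Tree>> withChildren() {
            return Optional.of(List.of());
        }
    }

    static final class Cons implements Tree {
        private final Tree tree;

        Cons(Tree tree) { this.tree = tree; }

        @Override
        public Optional<List<Tree>> withChildren() {
            return Optional.of(List.of(tree));
        }
    }

    // A subtype added later; the client code below does not change.
    static final class Node implements Tree {
        private final Tree left, right;

        Node(Tree left, Tree right) { this.left = left; this.right = right; }

        @Override
        public Optional<List<Tree>> withChildren() {
            return Optional.of(List.of(left, right));
        }
    }

    // Client: late-bound dispatch picks the implementation for each subtype.
    static List<Tree> children(Tree tree) {
        return tree.withChildren()
                   .orElseThrow(() -> new IllegalStateException("no match"));
    }

    public static void main(String[] args) {
        Tree t = new Node(new Cons(None.NONE), None.NONE);
        System.out.println(children(t).size());
    }
}
```

The point of the emulation is only that the client names the pattern on Tree, and the implementation is found by the usual virtual lookup on the object being matched.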

So I don't see why this wouldn't work, but I think I see where you got 
confused.

> Here, we want to call an abstract pattern method that will be implemented differently for each subclasses, but your proposal does not allow that (sorry for the pun).

Yes, it does.  (This conversation would be easier if you could frame 
this as a question ("Can I ...") rather than a statement ("It is not 
possible...") which turns out to be incorrect.)

> Inside a pattern, there are two implicit values, we have 'this' as usual and we have 'that' (we call it that way) that represents the value actually matched.

Correct.  Let's talk about the role of these two context variables.

Every pattern has a match candidate.  This is the thing on the LHS of 
the instanceof, or the selector in the switch.  It is the thing about 
which we ask "does the thing match the pattern."

Every pattern has a _primary type_.  It is the minimal type for which 
the match candidate could possibly match the pattern.  For a record 
pattern like `Point(int x, int y)`, the primary type is Point.  (A 
pattern is rejected at compile time as inapplicable if the type of the 
match candidate is not cast-convertible to the primary type of the pattern.)

In the body of a pattern method, the match candidate is denoted with the 
context variable `that`, whose type is the primary type of the pattern.  
The compiler may have to make up some of the difference between the type 
of the match candidate and the primary type:

     Object o = ...
     switch (o) {
         case Foo(int x) -> ...
     }

Here, the primary type of the Foo pattern is Foo, so to test if the case 
matches, the compiler inserts an `instanceof Foo`, and if that succeeds, 
casts `o` to `Foo`, and invokes the Foo pattern with that.
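
Written out by hand, the steps the compiler performs look roughly like the following.  This is a sketch in current Java: Foo is a hypothetical record standing in for the class in the example, and the local variable named `that` only mimics the context variable of the proposal.

```java
final class PrimaryTypeDemo {
    // Hypothetical stand-in for the Foo in the example above.
    record Foo(int x) {}

    // Hand-written equivalent of `case Foo(int x) -> ...` when the
    // selector is only known to be an Object.
    static String describe(Object o) {
        if (o instanceof Foo) {      // 1. test against the primary type
            Foo that = (Foo) o;      // 2. cast the match candidate
            int x = that.x();        // 3. run the deconstruction logic
            return "Foo with x=" + x;
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Foo(3)));
        System.out.println(describe("hello"));
    }
}
```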

Not every pattern has a receiver, just like not every method has a 
receiver.  Constructors and instance methods have receivers; same with 
their pattern counterparts.  For deconstructors, both the receiver and 
the match candidate are the same object.  This is not true for all 
instance patterns.

A receiver plays two roles in a pattern match, just as it does in a 
method invocation:

  - Finding the code to invoke by searching the class hierarchy
  - Associating the implementing code with the state of the object, in 
case the implementation of the pattern needs some state from the object 
that declares it

Let's go through two examples to see the cases.

An easy example is regular expressions.  We have a class 
j.u.regex.Pattern, which represents a compiled regex.  A regular 
expression match is a form of pattern match (there's a match candidate, 
it is conditional, if it succeeds we extract the capture groups.)  
Surely we should expose a "match" pattern on Pattern.

     class Pattern {
         public __inverse String regexMatch(String... groups) {
             Matcher m = matcher(that);
             if (m.matches())
                 // capture groups are numbered 1..groupCount(), inclusive
                 __yield IntStream.rangeClosed(1, m.groupCount())
                                  .mapToObj(m::group)
                                  .toArray(String[]::new);
         }
     }

We match it with an explicit receiver:

     final Pattern As = Pattern.compile("([aA]*)");
     ...
     if (aString instanceof As.regexMatch(String as)) { ... }

The body uses both `this` and `that`.  When it goes to do the actual 
matching, it takes the match candidate, `that`, and passes it to 
`matcher()`; we are matching against the match candidate, not the 
receiver.  But it also uses the receiver in the same line of code, 
quietly; the locution `matcher(that)` is really `this.matcher(that)`.  
It is using the state of _this regex_ to determine the match logic.  The 
pattern needs both, and they are different objects.
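
To make the two roles concrete, here is an approximation in current Java, where the match candidate is passed as an ordinary parameter named `that` and failure to match is modelled as an empty Optional.  The wrapper class Regex and the Optional encoding are assumptions for the sketch; only java.util.regex.Pattern and Matcher are real APIs.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

final class RegexPatternDemo {
    // Hypothetical wrapper around a compiled regex.
    static final class Regex {
        private final Pattern pattern;   // state of the receiver ("this")

        Regex(String regex) { this.pattern = Pattern.compile(regex); }

        // Emulates an instance pattern: `that` is the match candidate.
        Optional<String[]> regexMatch(String that) {
            Matcher m = this.pattern.matcher(that);   // uses both this and that
            if (!m.matches()) {
                return Optional.empty();              // pattern does not match
            }
            // Capture groups are numbered 1..groupCount(), inclusive.
            return Optional.of(IntStream.rangeClosed(1, m.groupCount())
                                        .mapToObj(m::group)
                                        .toArray(String[]::new));
        }
    }

    public static void main(String[] args) {
        Regex as = new Regex("([aA]*)");
        as.regexMatch("aAa").ifPresent(groups -> System.out.println(groups[0]));
        System.out.println(as.regexMatch("abc").isPresent());
    }
}
```

The receiver (`as`) supplies the compiled regex, and the candidate string is what gets matched; they are necessarily different objects here.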

In our `instanceof` test, there are two "parameters", though neither of 
them looks like one: the match candidate (on the LHS of the instanceof) 
and the receiver.  These are packaged up as `that` and `this` for the 
pattern invocation.

The other example is a conditional behavior on an object, such as "does 
this List have any elements, and if so, give me one."   We put an 
abstract pattern on List:

     interface List<T> {
         public __inverse List<T> withElement(T element);
     }

(It could also be a default pattern; works the same as default 
methods.)  The implementation in emptyList always fails.  The 
implementation in ArrayList might look like:

     public __inverse List<T> withElement(T element) {
         if (that.size > 0)
             __yield that.elements[0];
     }

Now, implementing this guy gets tricky, since we have two context 
variables which are both of the same type, ArrayList<T>. (Maybe we have 
to explicitly use a covariant "override" here; TBD.) But as it turns 
out, the two will usually be the same object:

     switch (aCollection) {
         case List.withElement(var t): ...
     }

How does this match work?  Well, the primary type of List.withElement is 
List<T>, so the compiler tests `aCollection instanceof List`, and if so, 
casts the match candidate to List.  Since there is no explicit receiver, 
it uses the match candidate as the receiver also (this is like an 
unbound method reference), and does the virtual method search, and finds 
ArrayList::withElement, and invokes it.  Different types of collections 
will use different implementations of the pattern.
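
The sequence just described (test the primary type, cast, use the candidate as the receiver, dispatch virtually) can be sketched in current Java as follows.  Seq, EmptySeq and ArraySeq are hypothetical stand-ins for List, emptyList and ArrayList, and an Optional return value again stands in for match success or failure.

```java
import java.util.Optional;

final class UnboundPatternDemo {
    // Hypothetical stand-in for List<T>.
    interface Seq<T> {
        // Emulates a default pattern that always fails to match.
        default Optional<T> withElement() { return Optional.empty(); }
    }

    // Inherits the always-failing default.
    static final class EmptySeq<T> implements Seq<T> {}

    static final class ArraySeq<T> implements Seq<T> {
        private final T[] elements;

        @SafeVarargs
        ArraySeq(T... elements) { this.elements = elements; }

        @Override
        public Optional<T> withElement() {
            return elements.length > 0 ? Optional.of(elements[0])
                                       : Optional.empty();
        }
    }

    // Hand-written version of `case Seq.withElement(var t) -> ...`.
    static String firstElement(Object aCollection) {
        if (aCollection instanceof Seq) {               // primary type test
            Seq<?> that = (Seq<?>) aCollection;         // cast the candidate
            Optional<?> result = that.withElement();    // candidate is also the receiver
            if (result.isPresent()) {
                return "element: " + result.get();
            }
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(firstElement(new ArraySeq<>("a", "b")));
        System.out.println(firstElement(new EmptySeq<String>()));
    }
}
```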

> Now, to finish the example, using '::' instead of '.', children in the first example should be written like this

Remember you're not supposed to use words like "should" ;)

>    static List<Tree> children(Tree tree) {
>      return switch(tree) {
>        case Tree::extract(List<Tree> list) -> list;

case Tree.extract, but yes.

> I really think that not using 'that' as the receiver when calling an inverse instance method is a missing opportunity because without that (again :) ), there is no way to call an inverse abstract method, so no way to pattern match on an open hierarchy.

Hopefully I've cleared up part of the confusion; there are two ways to 
denote an instance pattern in a match: bound and unbound, and when it is 
unbound, it uses the match candidate as the receiver.

So if your statement is "there should also be a way to ...", it is 
correct, but if your statement is "the receiver must be the match 
candidate", then that is catastrophically wrong, because then you can't 
do regex, type class witnesses, pattern objects, etc.
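
For readers who want an analogy in existing Java, the bound/unbound distinction is the same one method references already have.  The class and method names below are made up for the illustration; the method references themselves are standard Java.

```java
import java.util.function.Predicate;

final class BoundUnboundDemo {
    // Unbound: no receiver is given, so the argument itself becomes the
    // receiver (like an unbound instance pattern using the match candidate).
    static boolean unboundIsEmpty(String candidate) {
        Predicate<String> p = String::isEmpty;
        return p.test(candidate);
    }

    // Bound: the receiver is fixed up front and differs from the argument
    // (like As.regexMatch, where the regex is the receiver).
    static boolean boundContains(String receiver, String candidate) {
        Predicate<String> p = receiver::contains;
        return p.test(candidate);
    }

    public static void main(String[] args) {
        System.out.println(unboundIsEmpty(""));
        System.out.println(boundContains("hello world", "world"));
    }
}
```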
