[pattern-switch] Totality

Remi Forax forax at univ-mlv.fr
Sat Aug 29 23:50:15 UTC 2020


I've implemented of proof of concept of the semantics of the pattern matching at runtime,

The semantics of the different patterns is defined here [1]
As Brian said, matching is equivalent to implementing Optional<Result> match(Object o), but here given that we want also deconstruction, we need to have a supplementary argument to carry all the bound values, John has called it the carrier record in the past so i've kept that name.
Note that it not only carry all the bound variables, it also carry a special value __index__ which correspond to the index of the case of the switch.
  
So a pattern is defined like this: 
  public sealed interface Pattern {
    Optional<Record> match(Record carrier, Object o);
  }  

To deal with the deconstruction, i need 3 primitives
- deconstruct(Object value) that calls the deconstructor if the deconstructed object is a class
- extract(Record record, String name) that can extract the value of a record component from its name
- with(Record record, String name, Object value) that creates a new record with the value of the record component corresponding to the name changed

The implementation of those primitives relies on the reflection so it's slow and wrong in term of security, but it's enough for a prototype [2]

I've re-written several builders, really some glorified typed s-expressions, to help me to compose the different patterns [3].

With that if i have this hierarchy
  sealed interface Shape {}
  record Rect(Point start, Point end) implements Shape {}
  record Circle(Point center, int radius) implements Shape {}
  record Box<T>(T content) {}

I'm able to express a switch like this
    switch(x) {
      case Box(Rect(Point point1, Point point2)) -> 0;
      case Box(Circle(Point point, _ _)) -> 1;
    }
        
using this tree of patterns
    var pattern =
        _switch()
            ._case(_instanceof(Box.class,
                _destruct(b -> b.bind("content", "$content"),
                    _with("$content",
                        _or(b -> b
                            ._case(_instanceof(Rect.class,
                                _destruct(b2 -> b2.bind("start", "point1").bind("end", "point2"),
                                    _index(0))))
                            ._case(_instanceof(Circle.class,
                                _destruct(b2 -> b2.bind("center", "point"),
                                    _index(1))))))
                        )))
            .toPattern();

(_with() means takes a value from the carrier object and use it as the matching value for the downstream pattern)

and this record carrier
    record Carrier(int __index__, Shape $content, Point point1, Point point2, Point point) {
      Carrier() { this(-1, null, null, null, null); }
    }

I can write
    var box1 = new Box<>(new Rect(new Point(1, 2), new Point(3, 4)));
    pattern.match(new Carrier(), box1).orElseThrow()

the result is a Carrier instance containing 0 as __index__, the content of the box as $content, the start of the rectangle as point1 and the end of the rectangle as point3, the field 'point' will still be null.

There are more examples in [4].


The deconstruction is based on the names of the record components and not their declared positions in the record. 
You may think that this way of doing the deconstruction is not typesafe but the fact that all the fields of the carrier record are typed means that all the bound variables have the correct type.

Rémi

[1] https://github.com/forax/pattern-matching-runtime/blob/master/src/main/java/com/github/forax/pmr/Pattern.java
[2] https://github.com/forax/pattern-matching-runtime/blob/master/src/main/java/com/github/forax/pmr/GoryDetails.java
[3] https://github.com/forax/pattern-matching-runtime/blob/master/src/main/java/com/github/forax/pmr/PatternBuilder.java
[4] https://github.com/forax/pattern-matching-runtime/blob/master/src/main/java/com/github/forax/pmr/PatternTest.java

----- Mail original -----
> De: "Remi Forax" <forax at univ-mlv.fr>
> À: "Guy Steele" <guy.steele at oracle.com>
> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Envoyé: Samedi 29 Août 2020 02:50:13
> Objet: Re: [pattern-switch] Totality

> ----- Mail original -----
>> De: "Guy Steele" <guy.steele at oracle.com>
>> À: "Remi Forax" <forax at univ-mlv.fr>
>> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts"
>> <amber-spec-experts at openjdk.java.net>
>> Envoyé: Samedi 29 Août 2020 00:49:26
>> Objet: Re: [pattern-switch] Totality
> 
>>> On Aug 28, 2020, at 5:59 PM, forax at univ-mlv.fr wrote:
>>> 
>>> . . .
>>> Again, it should work like a cascade of if ... instanceof, so
>>>   case Pixel(var x, var y, var color) -> color
>>> should be equivalent to
>>>   if x instanceof Pixel p { yield p.color() }
>> 
>> But I do not believe that at all.  I do believe that
>> 
>>  case Pixel(var x, var y, var color) -> color
>> 
>> should be equivalent to
>> 
>>  if x instanceof Pixel(var x, var y, var color) p { yield p.color() }
>> 
>> or, if you prefer, to
>> 
>>  if x instanceof Pixel(var x, var y, var color) { yield color }
>> 
>> The point is that the switch label `case Pixel(var x, var y, var color)` does
>> not merely demand that the selector value be a Pixel; it demands that it be a
>> Pixel having a specific three-argument destructor.  It can be equivalent only
>> to an instanceof expression that makes those same demands.
>> 
>> If you want a switch clause that is equivalent to
>> 
>>  if x instanceof Pixel p { yield p.color() }
>> 
>> then you should write
>> 
>>   case Pixel p -> p.color()
> 
> 
> It doesn't have to be that way.
> 
> Let say Pixel have a deconstructor, something like
>  deconstructor Pixel { return this; }
> because Pixel is already a record, it can return itself
> 
> and when a pattern is used, you have something like this
>  x instanceof Pixel(var foo, var bar, var baz)
> 
> so for the compiler accessing to foo is equivalent of extracting Pixel::x,
> accessing to bar is equivalent to accessing to Pixel::y and accessing to baz is
> equivalent to accessing to Pixel::color,
> the compiler matches the positional parameters of the destructor with the
> position of the destructuring pattern
> 
> so for
>  if x instanceof Pixel(var foo, var bar, var baz) { yield baz }
> the equivalent code at runtime is
>  if x instanceof Pixel p { yield p.color(); }
> 
> This is very similar to the way enums works, when you declare an enum, each
> constant has a position but when you match an enum using a switch, at runtime
> the constants are matched by name not by position.
> Similarly the serialization of enums, which is a form of adhoc matching, is also
> done by name not by position.
> 
> Another way of seeing it, it's very similar to the way the VM resolve fields,
> it's a two step process, in the bytecode you have a name, and at runtime the VM
> find the corresponding offset.
> Here, the compiler matches the record resulting from the deconstructor with the
> destructuring pattern and insert in the bytecode the corresponding name,
> at runtime, we do the opposite, we call the deconstructor (because the matching
> is done at runtime, you don't need more than one) and extract the value from
> the name.
> 
> Rémi


More information about the amber-spec-experts mailing list