New pattern matching doc

Fri Jan 15 00:06:57 UTC 2021

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, January 13, 2021 21:24:55
> Subject: Re: New pattern matching doc

>> You should also talk about the last tier of the onion, the translation strategy,
>> because I may be wrong but I think it will ripple to the two other parts
>> (syntax and semantics).
> 
> Of course we have to talk about that, eventually.  But there's a fine
> line between "pick a model that can be efficiently translated" and "let
> the translation-tail wag the model-dog."  I would like to focus on the
> object model first, and language model second, because a perfect
> translation cannot save a bad design, but a good design can tolerate an
> imperfect translation.  (And it is not the case that we haven't been
> thinking about translation as we go.)

A good design with an imperfect translation also leads to burnout and depression. Remember the Scala war stories from Paul Phillips at the 2013 JVM Language Summit [1]?

> 
>> At the definition site, a pattern method is obviously a method and, as Brian
>> proposed earlier, bindings can be represented by a synthetic record as the return
>> type, so something like
> 
> A synthetic record is obviously a good candidate, but not the only one.
> For patterns with only one binding, no record "box" is needed; for some
> patterns, having a synthetic "box" is just wasted motion (e.g., when the
> object can act as its own witness to its bindings.)
> So we should keep synthetic records as possibilities, but we should not jump to this
> as The Answer.

That looks like premature optimization to me: inline/primitive records are free (they live on the stack).

> 
> There's an older doc at:
> https://github.com/openjdk/amber-docs/blob/master/site/design-notes/pattern-match-translation.md
> 
> that outlines an alternative strategy, which is potentially richer and
> more efficient, and which was discussed at the JVMLS talk that year.
> There, we have the method return not the matched values, but a Pattern
> (which is a constant!), where a Pattern is a bundle of method handles.
> The first method handle is a function from Target to Carrier (where
> Carrier is a runtime implementation detail), where a null carrier means
> "no match", and the other method handles are functions from Carrier to
> Binding_i.  These are all amenable to loading and composing via condy.
> In this world, no classfile has to utter the name of the carrier class,
> you just get it from one MH and feed it back to another.
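
For readers who have not seen that older doc, the scheme quoted above can be sketched roughly as follows. This is only an illustration under assumptions: the names PatternBundleSketch, Pattern, tryParse, value and describe are hypothetical, the carrier is simply a boxed Integer, and the bundle is built with a plain lookup at run time instead of being loaded as a constant via condy.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.List;

final class PatternBundleSketch {
    // Hypothetical shape of a "Pattern": one handle Target -> Carrier (null means
    // "no match") plus one handle Carrier -> Binding_i per binding.
    record Pattern(MethodHandle tryMatch, List<MethodHandle> components) {}

    // Target -> Carrier. The carrier is a boxed Integer here (an assumption);
    // any runtime-chosen representation would do.
    static Integer tryParse(String text) {
        try {
            return Integer.valueOf(text);
        } catch (NumberFormatException e) {
            return null; // null carrier == no match
        }
    }

    // Carrier -> Binding_0
    static int value(Integer carrier) {
        return carrier;
    }

    // Builds the bundle with a plain lookup; the real proposal would load it via condy.
    static Pattern parseIntPattern() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle tryMatch = lookup.findStatic(PatternBundleSketch.class, "tryParse",
                    MethodType.methodType(Integer.class, String.class));
            MethodHandle value = lookup.findStatic(PatternBundleSketch.class, "value",
                    MethodType.methodType(int.class, Integer.class));
            return new Pattern(tryMatch, List.of(value));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Use site: the carrier is only ever seen as Object; it comes out of one
    // method handle and is fed back into another.
    static String describe(String text) {
        Pattern pattern = parseIntPattern();
        try {
            Object carrier = pattern.tryMatch().invoke(text);
            if (carrier == null) {
                return "no match";
            }
            int v = (int) pattern.components().get(0).invoke(carrier);
            return "matched " + v;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("42"));
        System.out.println(describe("abc"));
    }
}
```

Note that the use site never mentions Integer: swapping the carrier for another representation would only change tryParse and value.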

Updated with what we know from Valhalla: the pattern method takes as parameters a carrier object, a method handle (a combination of withfields) that is called if the pattern matches, and Carrier.default, which is returned if it doesn't match.
This is the CPS (Continuation-Passing Style) solution, very similar to how a Promise works in JS: you have a function to call if it works and a function to call if it doesn't (here the latter is a constant).

With the same example
  public static pattern
    __name("parseInt")
    __target(String text)
    __inputArgs()
    __bindings(int value)
    try {
      int result = Integer.parseInt(text);
      return match(result);
    } catch(NumberFormatException e) {  // obviously, there is a better implementation,
      return no-match;                  // this is just an example
    }

can be translated to
  public static Object parseInt(String text, Object carrier, MethodHandle match, Object no_match) {
    try {
      int result = Integer.parseInt(text);
      return invokedynamic (match, carrier, result);
    } catch(NumberFormatException e) {
      return no_match;
    }
  }

You need an indy when it matches because of separate compilation: if the types of the bindings have changed, you don't want to throw a WrongMethodTypeException but the appropriate error.
At the use site, this is exactly what I described in my previous mail, provided the Carrier object is opaque and created dynamically (pattern tree in condy, binding extractor as indy or constant method handle).
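
To make the CPS shape concrete, here is a rough sketch that runs on a current JDK. It rests on several assumptions: a plain record stands in for the opaque Valhalla carrier, the static method Carrier.match plays the role of the combined withfields, a Carrier(false, 0) instance stands in for Carrier.default, and MethodHandle.invoke replaces the invokedynamic call. The names CpsPatternSketch, Carrier and describe are made up for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class CpsPatternSketch {
    // Hypothetical carrier: a plain record standing in for an opaque inline class.
    record Carrier(boolean matched, int value) {
        // Stands in for the "combination of withfields" applied on a match.
        static Carrier match(Carrier carrier, int value) {
            return new Carrier(true, value);
        }
    }

    // Stands in for Carrier.default (all fields zeroed).
    static final Carrier DEFAULT = new Carrier(false, 0);

    // Pattern method in continuation-passing style: it calls `match` on success
    // and returns the `noMatch` constant on failure.
    static Object parseInt(String text, Object carrier, MethodHandle match, Object noMatch)
            throws Throwable {
        int result;
        try {
            result = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return noMatch;
        }
        // MethodHandle.invoke adapts the call-site type (Object, int)Object to the
        // handle's type; the real translation would use invokedynamic here.
        return match.invoke(carrier, result);
    }

    // Use site: supplies the carrier, the match continuation and the no-match constant.
    static String describe(String text) {
        try {
            MethodHandle match = MethodHandles.lookup().findStatic(Carrier.class, "match",
                    MethodType.methodType(Carrier.class, Carrier.class, int.class));
            Carrier outcome = (Carrier) parseInt(text, DEFAULT, match, DEFAULT);
            return outcome.matched() ? "matched " + outcome.value() : "no match";
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("42"));
        System.out.println(describe("abc"));
    }
}
```

The pattern method itself only sees Object and MethodHandle, so the shape of the carrier stays a private agreement between the use site and whatever creates the match handle.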

For me, the main difference is that even calling a simple deconstructor to deconstruct something forces you to use an invokedynamic (in fact, two).
  Point(var x, var y) = point; // this has to be an indy call
I believe the translation I propose is leaner, at the price of creating more records (a bigger static footprint).

The real point, as I said previously, is that we can introduce tuples as being typed by a record (like a lambda is typed by a functional interface). A pattern method is then equivalent to a method returning the tuple that contains the binding values, so there is less magic and, in terms of features, the pattern method decomposes nicely into tuples + two new keywords, match and no-match.

  public record Result(boolean result, int value) {}  // this can be seen as the tuple (boolean, int)
  public static Result parseInt(String text) {
    try {
      int result = Integer.parseInt(text);
      return (true, result);
    } catch(NumberFormatException e) {
      return (false, 0);
    }
  }
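
Without tuple syntax, the same idea can already be written with a plain record today. A minimal sketch, assuming explicit constructor calls in place of the tuple literals; the class name TupleResultSketch and the describe method are only there for illustration:

```java
final class TupleResultSketch {
    // The record plays the role of the tuple (boolean, int).
    record Result(boolean result, int value) {}

    // The pattern method simply returns its bindings, plus a flag saying
    // whether the match succeeded.
    static Result parseInt(String text) {
        try {
            return new Result(true, Integer.parseInt(text));   // match(result)
        } catch (NumberFormatException e) {
            return new Result(false, 0);                       // no-match
        }
    }

    // Use site: a plain method call followed by reads of the record components,
    // no invokedynamic involved.
    static String describe(String text) {
        Result r = parseInt(text);
        return r.result() ? "matched " + r.value() : "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe("42"));
        System.out.println(describe("abc"));
    }
}
```

The cost is the one mentioned above: one more record class per pattern method.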

That's why I think it's important to talk about the translation strategy: just as we introduced the notion of records, I think we should introduce the notion of tuples + deconstructors (to do variable deconstruction) before starting to introduce pattern methods.

[...]

Rémi

[1] https://www.youtube.com/watch?v=v1wrWQcqLpo