Pattern Matching

Mon Mar 27 13:26:46 UTC 2017

Hi Andrey, 

> De: "Andrey Breslav" <andrey.breslav at jetbrains.com>
> À: forax at univ-mlv.fr, "John Rose" <john.r.rose at oracle.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Envoyé: Lundi 27 Mars 2017 13:04:36
> Objet: Re: Pattern Matching

> My two cents from the Kotlin camp:

> 1. We still don't have full structural matching in Kotlin, because smart casts +
> unconditional destructuring work well enough for most cases (unless you are a
> compiler person, but we realize that such people are a minority :)). Yes, there
> are use cases outside compilers, but my point is that smart casts hit the sweet
> spot.
> Plus, we didn't come up with a syntax for structural matching that would be both
> clear (variable usage vs definition) and clean (inserting val's everywhere
> makes complex patterns rather scary-looking).
> So, my take on this would be: better not do it for now, and wait until it gets
> wide adoption in C#, since we have the luxury of someone else being brave and
> taking the risks already.

yes, one way to do that is to do simple pattern matching for 10 and add structural matching for 11 (see below). 

> 2. We've been using per-component getter methods for destructuring, which is a
> huge performance win compared to Scala's unapply that creates a box on every
> (successful) match.

it's worst that that is you have more than one component, the return type of the extractor (the de-constructor) is an Option of a tuple, so you have two boxes :( 

> C++17 has gone roughly the same direction, but leveraged the templates with
> values as arguments. In our "open world" with static extension functions having
> independent functions for every component raises some questions, but if Java
> requires members there, it should be simpler.

getters and unapply are two faces of the same coin, 
unapply allows to share code when processing the extracted values, getters allows to avoid to calculate extracted values if you do not need them (if there are _ in the pattern), 
you can try to unify them by having an unapply that takes a representation of the variable you want, something like unapply(["x", "z"]), but trying to come with the right signature for unapply, at least until we do not have generics over primitive. 

As you said, a simpler solution is just to use getters, Java dukes are familiar with them after all. 

> 3. And to share a little bit about smart casts (I'm flattered that you are using
> our terminology here ;) ): as soon as you have smart casts, the urge to get
> intersection types into the language strengthens, because of cases like this:

> if (x is A) {
> if (x is B) {
> // x is A&B here
> }
> }

> We still get away without making intersection types explicit, and will probably
> not add them in the very nearest future, but they become a lot less exotic with
> smart casts.

Given the way the pattern matching pattern is constructed, you can not type the same variable more than once, so no intersection types can be created using the pattern matching. 
About adding 'is' in the language as an instanceof + smart cast, i think it's another issue that is not directly related to the pattern matching. 

regards, 
Rémi 

> On Sun, Mar 19, 2017 at 1:32 AM < forax at univ-mlv.fr > wrote:

>> ----- Mail original -----
>> > De: "John Rose" < john.r.rose at oracle.com >
>> > À: "Rémi Forax" < forax at univ-mlv.fr >
>> > Cc: "amber-spec-experts" < amber-spec-experts at openjdk.java.net >
>> > Envoyé: Samedi 18 Mars 2017 19:50:08
>> > Objet: Re: Pattern Matching

>> > On Mar 18, 2017, at 10:22 AM, Remi Forax < forax at univ-mlv.fr > wrote:

>> >> Hi guys,
>> >> i've already implemented a kind of pattern matching in a language that run on
>> >> the JVM (still closed source :( ),
>> >> so i've said to Brian that i will send a summary of how it is done.

>>> Thanks, Remi. This is useful. At many points it parallels thinking that we have
>> > been doing.

>>> I'm glad to see you use indy. As with the string-concat work, there is an
>> > opportunity to perform
>>> run-time optimizations that way, which an eager bytecode spinning approach can't
>> > touch.

>> yes !

>>> For example, the static compiler can inspect type relations among different case
>> > branches,
>> > and it can make requirements such as no dead cases, but it's only at runtime
>>> that the final type relations are available. And even if the static compiler
>> > ignores some type
>>> relations, the BSM for the switch can use such information to flatten the
>> > decision tree
>> > semi-statically.

>> yes

>>> (My thought ATM is to forbid dead code at compile time but allow it at runtime
>> > if
>>> the type relations have changed. I think that's in keeping with the way we
>> > manage
>> > other binary compatibility questions.

>> yes,
>> we have decided to throw an error at runtime in case of dead code mostly to
>> report a compilation bug because we do not have a real IDE/incremental
>> compilation.

>> > The runtime optimization can ignore the
>> > static presuppositions. This means that the semantics of switch need to be
>> > "as if" the switch is "really just" a linear decision chain. We save performance
>> > by using a declarative formulation to the BSM which allows the BSM code
>> > generator to reorganize the chain as a true switch, or shallow cascade of
>> > switches. BTW, we forgot the switch combinator. Maybe we can fix this in
>> > the next release? More indy leftovers! It should be used to compile switching
>> > on strings and enums, as well as any future pattern matches on constants.
>> > It should be optionally lazy and open, which is what I think you are also
>> > calling for at the end of your message.)

>> yes,
>> switching on enums should be done using the indy too, currently swapping two
>> constants is not backward compatible because they is maybe a switch somewhere.
>> and +1 for adding the switch combinator in 10. I would really like to be able to
>> tell the JIT that the tests has no side effect (maybe it can be another
>> combinator too) even if some part of the test are not inlinable.

>> > The theory of type safety of multiple dispatch has moved forward with Fortress.
>>> Alternatively, if you can sugar up a visitor deployment as if it were
>> > multimethods
>>> added as extensions, you could prove type-safety and still define apparent
>> > methods,
>> > outside the type capsule. When we get value types visitors will become cheaper.
>> > Maybe at that point we can think about doing multi-method sugar that compiles
>> > to invisible visitor classes. (I'm not suggesting we do this now!)

>> The main issue of the visitor is that you have to add the accept methods on the
>> class of the hierarchy before being able to use the visitor, this is equivalent
>> to be able to insert a method or a constant into a class, which is also
>> equivalent to be able to inject the implementation of a not yet existing
>> interface.

>> > Maybe one of our lambda leftovers will help with the problem of flow-typing.
>> > switch (val) { case A val: … } // relax shadowing rule for some bindings??
>> > It's a fraught question, because the anti-shadowing rule prevents confusions
>> > which a relaxation re-introduces, if users misuse the freedom of expression.
>> > The goal would be to guide users to shadow variables only when the shadowing
>> > binding is somehow a logical repeat of the original binding. This cannot be
>> > done automatically in all cases we care about. Tricky tradeoffs.

>> For lambdas, a lambda is an anonymous function, and even in JavaScript, opening
>> a function open a new scope.
>> If the switch as a syntax close to the lambda one (and less close to the C) it
>> may not be a big deal.

>> switch(val) {
>> A val -> ...
>> }

>> but given that usually you have more that one case, it will rapidly become
>> unreadable

>> switch(val) {
>> A val -> foo(val)
>> B val -> foo(val)
>> ...
>> }

>> > We are also struggling with unapply. There are several hard parts, including
>> > the encapsulation model and the API for delivering multiple components.
>> > The low-level parts probably require more indy-level combinators with
>> > special JVM optimizations. I wish we had full value types today so we
>> > could just do tuples for multiple-value packages. But even that is a
>> > simulation of the true shape of the problem, and simulation has overheads
>> > even if we stay out of the heap. A lower-level solution which requires
>> > no companion types at all (not even tuples) would be to reify argument
>> > lists per se, at the same level of abstraction as method handle types.

>> Will it work if the pattern support nested types ?
>> case A(B(x y), _): ...

>> You will have to have a kind of growable arguments list very similar to the one
>> you need to implement the method handle combinators.
>> Technically, for the mh combinators you also need to shrink the arguments list
>> but you can cheat if you are able to extract a sub part when doing the calls to
>> the direct method handle.
>> Jerome has used that trick when implementing invokedynamic on android.

>> > That's a building block I am currently experimenting with, as a value-based
>> > class which can be upgraded to a value type. The type information is
>> > encoded as (wait for it) a MethodType, interpreted with the arrows reversed.

>> The arrow reversed MethodType is a way to represent a flatten tuple of types to
>> a named tuple (the value type) with the same component types.

>> So in order to represent a tuple of multiple values which are the structural
>> value of a class, you need a reified tuple of types and you use a MethodType as
>> a hack.

>> Another way to see unapply is to see it as a kind 'maybe tailcall' in a
>> continuation,
>> if the pattern doesn't match, it tailcalls and checks the next pattern
>> otherwise, it drops the unnecessary parameters and calls the action.
>> In case of embedded component, the action itself is to do a 'maybe tailcall'
>> after the component values are on stack but if the pattern doesn't match it now
>> has to pop two stack frames.

>> class A {
>> int value;
>> B b;

>> void unapply(MethodHandle[int, B] match, MethodHandle reject) { // act as a
>> continuation
>> if (value == 0) {
>> reject.invokeExact(); // tail call to the next pattern
>> }
>> match.invokeExact(value, b); // insert the argument into the previous call,
>> calls unapply on b if nested, calls the action otherwise
>> }
>> }

>> > Thanks for the brain dump!

>> > — John

>> Rémi

> --
> Andrey Breslav
> Project Lead of Kotlin
> JetBrains
> http://kotlinlang.org/
> The Drive to Develop
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20170327/db95d4a6/attachment-0001.html>