Pattern Matching

Andrey Breslav andrey.breslav at jetbrains.com
Mon Mar 27 11:04:36 UTC 2017


My two cents from the Kotlin camp:

1. We still don't have full structural matching in Kotlin, because smart
casts + unconditional destructuring work well enough for most cases (unless
you are a compiler person, but we realize that such people are a minority
:)). Yes, there are use cases outside compilers, but my point is that smart
casts hit the sweet spot.
Plus, we didn't come up with a syntax for structural matching that would be
both clear (variable usage vs definition) and clean (inserting val's
everywhere makes complex patterns rather scary-looking).
So, my take on this would be: better not do it for now, and wait until it
gets wide adoption in C#, since we have the luxury of someone else being
brave and taking the risks already.

2. We've been using per-component getter methods for destructuring, which
is a huge performance win compared to Scala's unapply that creates a box on
every (successful) match. C++17 has gone roughly the same direction, but
leveraged the templates with values as arguments. In our "open world" with
static extension functions having independent functions for every component
raises some questions, but if Java requires members there, it should be
simpler.

3. And to share a little bit about smart casts (I'm flattered that you are
using our terminology here ;) ): as soon as you have smart casts, the urge
to get intersection types into the language strengthens, because of cases
like this:

if (x is A) {
    if (x is B) {
        // x is A&B here
    }
}

We still get away without making intersection types explicit, and will
probably not add them in the very nearest future, but they become a lot
less exotic with smart casts.

On Sun, Mar 19, 2017 at 1:32 AM <forax at univ-mlv.fr> wrote:

> ----- Mail original -----
> > De: "John Rose" <john.r.rose at oracle.com>
> > À: "Rémi Forax" <forax at univ-mlv.fr>
> > Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> > Envoyé: Samedi 18 Mars 2017 19:50:08
> > Objet: Re: Pattern Matching
>
> > On Mar 18, 2017, at 10:22 AM, Remi Forax <forax at univ-mlv.fr> wrote:
> >>
> >> Hi guys,
> >> i've already implemented a kind of pattern matching in a language that
> run on
> >> the JVM (still closed source :( ),
> >> so i've said to Brian that i will send a summary of how it is done.
> >
> > Thanks, Remi. This is useful.  At many points it parallels thinking that
> we have been doing.
> >
> > I'm glad to see you use indy.  As with the string-concat work, there is
> an opportunity to perform
> > run-time optimizations that way, which an eager bytecode spinning
> approach can't touch.
>
> yes !
>
> > For example, the static compiler can inspect type relations among
> different case branches,
> > and it can make requirements such as no dead cases, but it's only at
> runtime
> > that the final type relations are available.  And even if the static
> compiler ignores some type
> > relations, the BSM for the switch can use such information to flatten
> the decision tree
> > semi-statically.
>
> yes
>
> >
> > (My thought ATM is to forbid dead code at compile time but allow it at
> runtime if
> > the type relations have changed.  I think that's in keeping with the way
> we manage
> > other binary compatibility questions.
>
> yes,
> we have decided to throw an error at runtime in case of dead code mostly
> to report a compilation bug because we do not have a real IDE/incremental
> compilation.
>
> >  The runtime optimization can ignore the
> > static presuppositions.  This means that the semantics of switch need to
> be
> > "as if" the switch is "really just" a linear decision chain.  We save
> performance
> > by using a declarative formulation to the BSM which allows the BSM code
> > generator to reorganize the chain as a true switch, or shallow cascade of
> > switches.  BTW, we forgot the switch combinator.  Maybe we can fix this
> in
> > the next release?  More indy leftovers!  It should be used to compile
> switching
> > on strings and enums, as well as any future pattern matches on constants.
> > It should be optionally lazy and open, which is what I think you are also
> > calling for at the end of your message.)
>
> yes,
> switching on enums should be done using the indy too, currently swapping
> two constants is not backward compatible because they is maybe a switch
> somewhere.
> and +1 for adding the switch combinator in 10. I would really like to be
> able to tell the JIT that the tests has no side effect (maybe it can be
> another combinator too) even if some part of the test are not inlinable.
>
> >
> > The theory of type safety of multiple dispatch has moved forward with
> Fortress.
> > Alternatively, if you can sugar up a visitor deployment as if it were
> multimethods
> > added as extensions, you could prove type-safety and still define
> apparent methods,
> > outside the type capsule.  When we get value types visitors will become
> cheaper.
> > Maybe at that point we can think about doing multi-method sugar that
> compiles
> > to invisible visitor classes.  (I'm not suggesting we do this now!)
>
> The main issue of the visitor is that you have to add the accept methods
> on the class of the hierarchy before being able to use the visitor, this is
> equivalent to be able to insert a method or a constant into a class, which
> is also equivalent to be able to inject the implementation of a not yet
> existing interface.
>
> >
> > Maybe one of our lambda leftovers will help with the problem of
> flow-typing.
> >   switch (val) { case A val: … }  // relax shadowing rule for some
> bindings??
> > It's a fraught question, because the anti-shadowing rule prevents
> confusions
> > which a relaxation re-introduces, if users misuse the freedom of
> expression.
> > The goal would be to guide users to shadow variables only when the
> shadowing
> > binding is somehow a logical repeat of the original binding.  This
> cannot be
> > done automatically in all cases we care about.  Tricky tradeoffs.
>
> For lambdas, a lambda is an anonymous function, and even in JavaScript,
> opening a function open a new scope.
> If the switch as a syntax close to the lambda one (and less close to the
> C) it may not be a big deal.
>
>  switch(val) {
>    A val -> ...
>  }
>
> but given that usually you have more that one case, it will rapidly become
> unreadable
>
>   switch(val) {
>    A val -> foo(val)
>    B val -> foo(val)
>    ...
>  }
>
>
> >
> > We are also struggling with unapply.  There are several hard parts,
> including
> > the encapsulation model and the API for delivering multiple components.
> > The low-level parts probably require more indy-level combinators with
> > special JVM optimizations.  I wish we had full value types today so we
> > could just do tuples for multiple-value packages.  But even that is a
> > simulation of the true shape of the problem, and simulation has overheads
> > even if we stay out of the heap.  A lower-level solution which requires
> > no companion types at all (not even tuples) would be to reify argument
> > lists per se, at the same level of abstraction as method handle types.
>
> Will it work if the pattern support nested types ?
>   case A(B(x y), _): ...
>
> You will have to have a kind of growable arguments list very similar to
> the one you need to implement the method handle combinators.
> Technically, for the mh combinators you also need to shrink the arguments
> list but you can cheat if you are able to extract a sub part when doing the
> calls to the direct method handle.
> Jerome has used that trick when implementing invokedynamic on android.
>
> > That's a building block I am currently experimenting with, as a
> value-based
> > class which can be upgraded to a value type.  The type information is
> > encoded as (wait for it) a MethodType, interpreted with the arrows
> reversed.
>
> The arrow reversed MethodType is a way to represent a flatten tuple of
> types to a named tuple (the value type) with the same component types.
>
> So in order to represent a tuple of multiple values which are the
> structural value of a class, you need a reified tuple of types and you use
> a MethodType as a hack.
>
> Another way to see unapply is to see it as a kind 'maybe tailcall' in a
> continuation,
> if the pattern doesn't match, it tailcalls and checks the next pattern
> otherwise, it drops the unnecessary parameters and calls the action.
> In case of embedded component, the action itself is to do a 'maybe
> tailcall' after the component values are on stack but if the pattern
> doesn't match it now has to pop two stack frames.
>
> class A {
>   int value;
>   B b;
>
>   void unapply(MethodHandle[int, B] match, MethodHandle reject) {  // act
> as a continuation
>      if (value == 0) {
>        reject.invokeExact(); // tail call to the next pattern
>      }
>      match.invokeExact(value, b);    // insert the argument into the
> previous call, calls unapply on b if nested, calls the action otherwise
>   }
> }
>
>
> >
> > Thanks for the brain dump!
> >
> > — John
>
> Rémi
>
-- 
Andrey Breslav
Project Lead of Kotlin
JetBrains
http://kotlinlang.org/
The Drive to Develop
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20170327/03a2c9a6/attachment.html>


More information about the amber-spec-experts mailing list