combinators for pattern matching

Sun Mar 19 00:05:20 UTC 2017

On Mar 18, 2017, at 3:32 PM, forax at univ-mlv.fr wrote:
> 
> switching on enums should be done using the indy too, currently swapping two constants is not backward compatible because they is maybe a switch somewhere.

Actually it's OK to permute enums because each switch uses a local mapping table.
But that causes overheads (partly because arrays are not constant-foldable, another
place we need frozen arrays).  A switch combinator for enums would optimize the
case where caller and callee agreed on enum order, while covering the corner cases
with a @Stable indirection table.  That's better in two ways from what we do now.
In general, the static compiler would "guess" at some sort of enumeration (or perfect
hash function) of the inputs and trust the runtime to correct any flaws.

> and +1 for adding the switch combinator in 10.

This would be a perfect non-Oracle contribution from the mlvm crowd.  Hint, hint.
(Sadly, Michael Haupt is not available for the next round of combinators.)
Let's follow up on mlvm-dev if we want to discuss that combinator apart from
its use in Amber.

> I would really like to be able to tell the JIT that the tests has no side effect (maybe it can be another combinator too) even if some part of the test are not inlinable. 

That's a tough one.  It's not like Java to allow the user to assert something that cannot be
proven by the JVM (usually by static types or runtime checks).

/** Returns a method handle that operates exactly like target,
 * but passes all side effect attempts to the indicated collector. */
MethodHandle MHs.asPureMethod(MethodHandle target, SideEffectCollector effects);

That's a research project.  I'd almost settle for a hardwired collector
which just throws IllegalSideEffectException or some such.
(This is a version of freezing for arrays or other objects, but for methods.)

Since it's a research project, it's not something I'm going to say more about
in this venue.  We can talk on mlvm-dev about the details if you want.

>> 
>> The theory of type safety of multiple dispatch has moved forward with Fortress.
>> Alternatively, if you can sugar up a visitor deployment as if it were multimethods
>> added as extensions, you could prove type-safety and still define apparent methods,
>> outside the type capsule.  When we get value types visitors will become cheaper.
>> Maybe at that point we can think about doing multi-method sugar that compiles
>> to invisible visitor classes.  (I'm not suggesting we do this now!)
> 
> The main issue of the visitor is that you have to add the accept methods on the class of the hierarchy before being able to use the visitor, this is equivalent to be able to insert a method or a constant into a class, which is also equivalent to be able to inject the implementation of a not yet existing interface.

But you can usually write the accept methods once, universally, right?
The multi-method part is often just the specific visit* methods of the visitor.
The parts of the multi-methods which switch on the acceptor object types
can be factored out just once.

Here's the pattern I'm thinking of in multimethods:

multimethod WalkY.Result walkY(NodeTypeX node, WalkY walker);
==> value class WalkY$Walker implements NodeTypeTop.Visitor { …
           visit(NodeTypeX node) { return walkY(node, this.walker); } … }

The hooks for all the NodeTypeX things would go once into the node hierarchy,
assuming such a hierarchy can be pre-configured in a uniform manner, while
the various ad hoc methods would go into an ad hoc WalkY walker value class.
At this point I would need to study the "expr problem" literature to see how this
ties in (I'm sure it's not a new idea), so I'll just say it seems useful as a FUTURE
(not NOW) option for sugary multi-methods, and leave it there.  No way are we
doing multi-methods any time soon.

But:  The combined pattern of visitor + factory is sometimes called a metamorphism.
I think that what we are after, in the end, is a full set of hooks for building ad hoc
metamorphisms, without having to add more logic to the participating data classes.
A matcher hook is really a co-constructor, which provides a mirror-image computation
to the constructor.  One takes some values and produces an object from the value,
while the other takes an object and produces some values from the object.
Both halves are important, and it is also important to make the two halves
work together gracefully.  IMO that's the deep reason why notations for patterns
look like notations for object construction expressions.  I am hoping we can
exploit this mirror structure in Amber designs.

> 
>> 
>> The goal would be to guide users to shadow variables only when the shadowing
>> binding is somehow a logical repeat of the original binding.  This cannot be
>> done automatically in all cases we care about.  Tricky tradeoffs.
> 
> For lambdas, a lambda is an anonymous function, and even in JavaScript, opening a function open a new scope.  
> If the switch as a syntax close to the lambda one (and less close to the C) it may not be a big deal.
> 
> switch(val) {
>   A val -> ...
> }
> 
> but given that usually you have more that one case, it will rapidly become unreadable
> 
>  switch(val) {
>   A val -> foo(val)
>   B val -> foo(val)
>   …
> }

Actually, I find that a reasonable compromise between pure flow typing and explicit rebinding.

Yes, the names are stuttery, but there's a logical reason for it; we are explicitly instructing the
compiler to rebind the name after each case label.  Compare similar stuttering in "this.foo = foo"
or "map.computeIfAbsent(k, k -> k.toString())".

>> 
>> A lower-level solution which requires
>> no companion types at all (not even tuples) would be to reify argument
>> lists per se, at the same level of abstraction as method handle types.
> 
> Will it work if the pattern support nested types ?
>  case A(B(x y), _): …

Yes.  You need a composite MH which takes a possible A (with possible B)
and returns an argument list of x,y.  The type of the argument list is totally
ad hoc.  Consider:
   case A(B(x,_),C(y)): …
In that case, the argument list of (x,y) might have types that never appear
together, except at this one place in the source code.

> 
> You will have to have a kind of growable arguments list very similar to the one you need to implement the method handle combinators.
> Technically, for the mh combinators you also need to shrink the arguments list but you can cheat if you are able to extract a sub part when doing the calls to the direct method handle.

I think that means you need argument list utilities (are they combinators?)
which perform for argument lists what dropArgs, insertArgs, collectArgs,
spreadArgs, etc., do for MHs.  Actually, the existing MH combinators
will do all of this, given two more MH hooks which allow method handles
to map between the two forms of argument lists: normal, and "collected
into a single slug".  With a little forcing, collectArguments and spreadArguments
could be trained to look for the argument list type as well as array types,
et voila.

> Jerome has used that trick when implementing invokedynamic on android.

Ignoring arguments is a good trick.  If you have an argument list pointer,
you can ignore both a prefix and a suffix.

> 
>> That's a building block I am currently experimenting with, as a value-based
>> class which can be upgraded to a value type.  The type information is
>> encoded as (wait for it) a MethodType, interpreted with the arrows reversed.
> 
> The arrow reversed MethodType is a way to represent a flatten tuple of types to a named tuple (the value type) with the same component types.

Hold on, there are no names here, just positions.  That's important!  We call Math.max on (DD) not on (Dx;Dy;).

> So in order to represent a tuple of multiple values which are the structural value of a class, you need a reified tuple of types and you use a MethodType as a hack.

The MT just provides the dynamic type framework.  (The return type is always void.class.)
The List API provides weakly-typed access, and there also has to be a way to get
strongly-typed MH-based access if you know the MT in advance.

> Another way to see unapply is to see it as a kind 'maybe tailcall' in a continuation,

Yes, but the JVM is very bad at tailcalls, so that's not practical.  And in any case,
even if tailcalls were great, you'd leave behind your local state, which will make
compilers unhappy and lead to the boxing of frame-state (in closures or whatnot).

It's the difference between internal and external iterators all over again.
I think we need both options.  And in this case I think we want to build the
internal (CPS flavor) on top of the external (accessor flavor), and not vice versa.

The CPU flavor suffers from bad support for tailcall, loss of local state, and one
more thing:  It's a little too eager.  It assumes that the client wants *all* of the
component values from the match.  It would be nice (though it's not required)
if you could say "_" for a component pattern, and avoid the cost of reifying
the component.  Why is this important? Well, the component might actually
cost something to reify; standard examples are defensive copying of array
components, or creation of invariant-enforcing view objects.  We can't handwave
these away by saying that pattern-matchable objects will always be so simple
that component extraction will always be cheap.  (This is a limitation of
the argument-list approach also, so we might want something more here.)

None of this bears on the surface syntax of the language, but it's
important for some of us to think about since classfiles containing
compiled Amber code need to be compact and fully optimizable
by the Java runtime.  In particular, if we didn't have indy to perform
boilerplate generation for us at runtime, our class files would become
terribly bloated by the boilerplate features we are adding.  A one
line source file could expand by 100x if all the boilerplate had
to be statically elaborated into the classfile.  With indy-style
bootstrapping we can avoid paying the cost for boilerplate
functions until they are actually used.

— John
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20170318/c597cee8/attachment-0001.html>