Translation of pattern matching
Remi Forax
forax at univ-mlv.fr
Sun Aug 6 18:12:20 UTC 2017
Ok,
here is my take on the translation to bytecode:
https://github.com/forax/mjolnir/tree/master/src/main/java/fr.umlv.mjolnir.amber/fr/umlv/mjolnir/amber
With a simple test:
https://github.com/forax/mjolnir/blob/master/src/test/java/fr.umlv.mjolnir.amber/fr/umlv/mjolnir/amber/PatternMatchingTests.java
The translation is more or less the one of described in the document pattern-match-translation with the difference that it uses one carrier (the uber carrier) for the whole pattern matching instead of one by pattern.
The current limitations are:
- it doesn't allow matching on the result of an extraction (Pattern.and() is not implemented)
- it supposes that the switch takes an Object as argument.
- you can have only one de-constructor by class (i feel lucky).
- the escape analysis fails because the result of invokedynamic acts as a join of several creations of the uber carrier, something the VM is not able to cope.
- the construction of the MethodHandle tree is not lazy (it doesn't work like an inlining cache but more like an if instanceof chain)
So in the test, i share the same condy with the invokedynamic and the methods that extract a value inside the switch. This allow every parties to share the same pattern thus agree on the shape of the uber carrier
The uber carrier is a dynamically generated tuple, the class TupleHandle takes a MethodType (with void as return type) and provides a constructor method handle to create a tuple and a way to get a method handle for each components.
Internally the a tuple is erased as a concatenation of object fields and long fields represented by the class TupleHandle.Form. The idea is that for one form, there is one dynamically generated class, so every tuples derived from a form will uses the same class.
A Pattern is a Form and a lambda that construct a method handle tree that will match the pattern from the Form of the uber tuple. When the pattern is constructed, it calculates the Form that will contains all extractions.
>From the Form corresponding to the uber carrier, it is possible to as for a TupleHandle from a MethodType that map the tuple virtual field to the Form class actual field (and initialises the remaining fields to 0, false, etc).
The protocol uses to match an object is to add two fields, a selector and a boolean, to each TupleHandle, the selector is bound to the number of the case (and -1 for default) and the boolean indicate if the matching succeed or not.
By example, the class Point is defined like this.
public static class Point {
private final int x;
private final int y;
...
@Deconstruct({ int.class, int.class })
public Object deconstructor(MethodHandle carrier) throws Throwable {
if (x == y) {
return carrier.invokeExact(false, 0, 0); // reject
}
return carrier.invokeExact(true, x, y);
}
}
The de-constructor is a method annotation with @Deconstruct. The value of @Deconstruct indicates the intended signature. When the MethodHandle tree is constructed , the MethodHandle taken as argument of the de-constructor is mapped to the constructor TupleHandle able to read the MethodType corresponding to the value of the de-constructor.
At that point, the selector index, is also bind, so it will be available after matching. if a tuple is constructed with true at first argument, it means that the pattern match, and the following arguments are the de-constructed value.
So in the example, if x equals y , the pattern does not match, otherwise it matches and returns the value of x as first argument and y as second argument.
For the follwowing pattern matching switch:
Object carrier = indy [$0] (o);
switch(component [[int.class, boolean.class], 0, $0] () {
case Point p(x, y):
return "Point " + x + ' ' + y;
case User(name, _):
return "User " + name;
default:
return "no match";
}
The corresponding pseudo bytecode is:
[$0 = condy(Point.class, User.class)]
Object carrier = indy [$0] (o);
switch(component [[int.class, boolean.class], 0, $0] () {
case 0:
int x = component [[int.class, boolean.class, int.class, int.class], 2, $0] ();
int y = component [[int.class, boolean.class, int.class, int.class], 3, $0] ();
return "Point " + x + ' ' + y;
case 1:
String name = component [[String.class], 2, $0] ();
return "User " + name;
default:
return "no match";
}
The Java code that uses Mjolnir is:
Bootstrap<Pattern> condyLocation = lookup -> condy(lookup, Point.class, User.class);
Object carrier = get(lookup -> indy(lookup, get(condyLocation))).invokeExact(o);
switch((int)get(lookup -> component(lookup, type(), 0, get(condyLocation))).invokeExact(carrier)) {
case 0: {
int x = (int)get(lookup -> component(lookup, type(int.class, int.class), 2, get(condyLocation))).invokeExact(carrier);
int y = (int)get(lookup -> component(lookup, type(int.class, int.class), 3, get(condyLocation))).invokeExact(carrier);
return "Point " + x + ' ' + y;
}
case 1: {
String name = (String)get(lookup -> component(lookup, type(String.class), 2, get(condyLocation))).invokeExact(carrier);
return "User " + name;
}
default:
return "no match";
}
The second round (after my vacation) is to:
- create only one uber carrier for the whole pattern empty (like vdefault) at the starts and update it (vwith) when extracting
- allow TupleHandle to see a part of the uber carrier that doesn't start at the beginning (to implement Pattern.and()).
cheers,
Rémi
----- Mail original -----
> De: "Remi Forax" <forax at univ-mlv.fr>
> À: "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>, "amber-dev" <amber-dev at openjdk.java.net>, "Gavin Bierman"
> <gavin.bierman at oracle.com>, "Brian Goetz" <brian.goetz at oracle.com>
> Envoyé: Samedi 5 Août 2017 03:55:16
> Objet: Re: Translation of pattern matching
> Thanks Maurizio,
> so there is no need to an uber carrier, because you compose the carrier (as far
> as I understand).
>
> You still need a way to pass the carrier from indy to the switch.
>
> I think I will test the ArgumentList translation to see how it goes, did I
> mention that prototyping things with mjolnir was easier :)
>
> Remi
>
>
>
>
>
> On August 4, 2017 1:19:32 PM PDT, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>>Maybe I'm missing something - but I think there is a disconnect between
>>
>>this code and what the DtorHandle really looks like (described here
>>[1]). More specifically, operations like 'component' do not give you
>>access to bindings, they return method handles which can be used to
>>access the binding that is stored on some opaque carrier (Brian please
>>correct me if wrong).
>>
>>[1] -
>>http://cr.openjdk.java.net/~briangoetz/amber/pattern-match-translation.html
>>
>>http://cr.openjdk.java.net/~briangoetz/amber/pattern-match-translation.html
>>
>>
>>On 04/08/17 12:49, Remi Forax wrote:
>>> Ok, i do not understand the translation of pattern matching to
>>bytecode,
>>> it seems that something is missing.
>>>
>>> Here is the example of translation that Brian uses in its talk last
>>Wednesday,
>>>
>>> static final DtorHandle p1 = ... LDC dtor for NegNode(NegNode(var n))
>>...
>>> static final DtorHandle p2 = ... LDC dtor for NegNode(var n) ...
>>> static final DtorHandle p3 = ... LDC dtor for AddNode(IntNode(0), var
>>right) ...
>>>
>>> Node simplify(Node n) {
>>> int selector = indy[ bsm=Switchomatic, args=[p1, p2, p3 ... ]
>>](n)
>>> return switch(selector) {
>>> case 0 -> n;
>>>
>>> case 1 -> let int n=p1.component(0) in simplify(n);
>>> case 2 -> let int n=p2.component(0) in simplify(new
>>NegNode(simplify(n)));
>>>
>>> case 3 -> let right=p3.component(0) in simplify(right);
>>> ...
>>> };
>>> }
>>>
>>> I do not see how the exploded values (stored in the carrier object)
>>can be stored inside the DtorHandle, (making them mutable BTW) can
>>work.
>>> Perhaps, there is a need for an uber-carrier object, which is mutable
>>(or not if the return of invokedynamic is conceptually a pair
>>(selector+uber-carrier)),
>>> to store the the carrier object as a field so the switch (the one
>>just after indy i the translation) can access to the carrier object.
>>>
>>> Like the carrier object, the uber-carrier has to be created at
>>runtime in order to avoid the same binary compatible issue discuss by
>>Brian about the carrier object.
>>>
>>> The other solution is that all carrier type of the DtorHandle as to
>>be the same type (like the ArgumentList object proposed by John), in
>>that case, the return value of an indy is a pair selector+argumentList
>>(
>>> The ArgumentList being a kind of dynamically sized tuple)
>>>
>>> I've solved the problem by disallowing the action part of a pattern
>>matching being able to do side effects on local variables, so all the
>>actions can be pass as boostrap arguments of the indy the same way the
>>LambdaMetaFactory works (an action is a static method seen as a
>>constant MethodHandle, with the difference that the captured value do
>>not need to be bound, they can be passed as argument of indy). This is
>>something that can be done for the pattern matching that uses the
>>expression switch but i doubt it's possible to explain to people that
>>you can not do side effects if they use the classical switch syntax.
>>> Anyway, it may mean that it should exist two translation strategies,
>>one for the classical switch and one for the expression switch.
>>>
>>> Rémi
>>>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
More information about the amber-dev
mailing list