New pattern matching doc
forax at univ-mlv.fr
Wed Jan 13 19:11:58 UTC 2021
----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Tuesday, January 12, 2021 18:31:30
> Subject: Re: New pattern matching doc
You should also talk about the last tier of the onion, the translation strategy, because I may be wrong but I think it will ripple through to the two other parts
(syntax and semantics).
So we have
- a definition site that defines a deconstructor, an instance pattern method or a static pattern method and
- a use site inside a switch or an instanceof that uses the deconstruction pattern, the qualified pattern, the unqualified pattern, etc.
Unlike with a lambda, in the general case the definition site and the use site are not in the same class, so we will have to deal with separate compilation issues, but I will brush that aside for now.
At the definition site, a pattern method is obviously a method and, as Brian has proposed earlier, bindings can be represented by a synthetic record used as the return type, so something like
public static pattern
  __name("parseInt")
  __target(String text)
  __inputArgs()
  __bindings(int value)
  try {
    int result = Integer.parseInt(text);
    return match(result);
  } catch(NumberFormatException e) { // obviously, there is a better implementation,
    return no-match;                 // this is just an example
  }
can be translated to
public record $Bindings$__int(boolean match, int value)
public static $Bindings$__int parseInt(String text) {
  try {
    int result = Integer.parseInt(text);
    return new $Bindings$__int(true, result);
  } catch(NumberFormatException e) {
    return $Bindings$__int.default;
  }
}
The first component of the record $Bindings$__int, the boolean "match", indicates whether the pattern matched; it is needed because the pattern method is not total and thus can say that there is no match.
The synthetic record must have the same visibility as the pattern method (so an overriding method can see the record).
The name of the synthetic record has to be stable across recompilations, so mangling the types of the bindings into the name can be a solution.
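To make the translation above concrete, here is a minimal sketch that compiles with today's Java (16 or later). The names ParseIntPatternSketch, IntBindings and NO_MATCH are hypothetical stand-ins: IntBindings plays the role of $Bindings$__int, and NO_MATCH plays the role of $Bindings$__int.default, which I assume to mean "all components at their default values".

```java
public class ParseIntPatternSketch {
    // Hand-written stand-in for the synthetic bindings record (hypothetical name).
    record IntBindings(boolean match, int value) {
        // Assumed meaning of `$Bindings$__int.default`: match = false, value = 0.
        static final IntBindings NO_MATCH = new IntBindings(false, 0);
    }

    // The pattern method after translation: an ordinary static method
    // whose return type is the bindings record.
    static IntBindings parseInt(String text) {
        try {
            return new IntBindings(true, Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return IntBindings.NO_MATCH;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseInt("42"));        // a match carrying the value 42
        System.out.println(parseInt("forty-two")); // no match
    }
}
```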
Given that we want to support overloading, the record corresponding to the bindings has to be present as the return type, so overloads are pattern methods with the same name, maybe the same parameters, but a different return type. To support covariance of binding types, we generate a bridge method that converts the values of the more precise record to the less precise one.
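A rough sketch of what such a bridge could do, written in plain Java. Everything here is hypothetical (the nonBlank pattern, both record names, the $bridge suffix). In bytecode the bridge could share the name and parameters of the precise method and differ only by return type; Java source cannot express that, hence the distinct name.

```java
public class BindingsBridgeSketch {
    // Less precise and more precise bindings records (hypothetical names).
    record ObjectBindings(boolean match, Object value) {}
    record StringBindings(boolean match, String value) {}

    // A pattern method with the more precise binding type (illustrative only).
    static StringBindings nonBlank(Object target) {
        if (target instanceof String s && !s.isBlank()) {
            return new StringBindings(true, s);
        }
        return new StringBindings(false, null);
    }

    // What the generated bridge would do: call the precise method and copy
    // each component into the less precise record.
    static ObjectBindings nonBlank$bridge(Object target) {
        StringBindings precise = nonBlank(target);
        return new ObjectBindings(precise.match(), precise.value());
    }

    public static void main(String[] args) {
        System.out.println(nonBlank$bridge("hello"));
        System.out.println(nonBlank$bridge("   "));
    }
}
```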
At the use site, for an instanceof,
if (o instanceof Integer.parseInt(var i)) { ... }
can be translated to
private record Carrier(boolean match, int i)
var carrier = invokedynamic (o)Carrier "instanceof" [TypePattern(String.class, MethodPattern(MethodHandle($Bindings$__int parseInt(String), int Carrier.i)))]
if (carrier.match) {
  var i = carrier.i;
  ...
}
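For readers who want to run something, here is a sketch of the same instanceof translation with the invokedynamic call replaced by an ordinary static method. The match method is an assumption: it does by hand what the TypePattern/MethodPattern tree is supposed to describe. The names InstanceofUseSiteSketch, IntBindings and describe are hypothetical.

```java
public class InstanceofUseSiteSketch {
    record IntBindings(boolean match, int value) {}   // stand-in for $Bindings$__int
    record Carrier(boolean match, int i) {}           // the use-site carrier

    static IntBindings parseInt(String text) {
        try {
            return new IntBindings(true, Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return new IntBindings(false, 0);
        }
    }

    // Stand-in for the invokedynamic call site of type (Object)Carrier.
    static Carrier match(Object o) {
        if (!(o instanceof String text)) {        // TypePattern(String.class, ...)
            return new Carrier(false, 0);
        }
        IntBindings bindings = parseInt(text);    // MethodPattern: call the pattern method
        return new Carrier(bindings.match(), bindings.value()); // copy bindings into the carrier
    }

    // The code the compiler would emit around the call site.
    static String describe(Object o) {
        var carrier = match(o);
        if (carrier.match()) {
            var i = carrier.i();
            return "matched, i = " + i;
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe("42"));
        System.out.println(describe("abc"));
        System.out.println(describe(42));
    }
}
```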
and for a switch
switch(o) {
  case Integer.parseInt(var i) -> ...
  default -> ...
}
can be translated to
private record Carrier(int index, int i0)
var carrier = invokedynamic (o)Carrier "switch" [0: TypePattern(String.class, MethodPattern(MethodHandle($Bindings$__int parseInt(String), int Carrier.i0)))]
switch (carrier.index) {
  case 0 -> {
    var i = carrier.i0;
    ...
  }
  default -> ...
}
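The switch translation can be sketched the same way, again with a plain static method standing in for the invokedynamic call. The select method and the convention of returning index -1 when no case matches are assumptions on my part; the mail only says that the carrier holds the index of the selected case.

```java
public class SwitchUseSiteSketch {
    record IntBindings(boolean match, int value) {}   // stand-in for $Bindings$__int
    record Carrier(int index, int i0) {}              // index of the selected case + bindings

    static IntBindings parseInt(String text) {
        try {
            return new IntBindings(true, Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return new IntBindings(false, 0);
        }
    }

    // Stand-in for the invokedynamic call site: tries pattern 0, otherwise
    // reports "no case" with the (assumed) index -1.
    static Carrier select(Object o) {
        if (o instanceof String text) {
            IntBindings bindings = parseInt(text);
            if (bindings.match()) {
                return new Carrier(0, bindings.value());
            }
        }
        return new Carrier(-1, 0);
    }

    // The switch the compiler would emit on the carrier index.
    static String describe(Object o) {
        var carrier = select(o);
        return switch (carrier.index()) {
            case 0 -> {
                var i = carrier.i0();
                yield "case 0, i = " + i;
            }
            default -> "default";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe("7"));
        System.out.println(describe("seven"));
    }
}
```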
A tree of patterns is represented by a tree of Pattern objects. What invokedynamic does is call a bootstrap method that derives a MethodHandle for each pattern (or list of patterns in the case of a switch) and installs that method handle as the target. The MethodPattern takes a method handle, which is the pattern method to invoke, and a list of references to record components of the Carrier record (a reference can be empty to implement the empty binding, '_').
At runtime, in the case of the MethodPattern, the method handle calls the corresponding pattern method, loads the bindings from the return value and stores them in the Carrier object.
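Here is one way such a method handle could be assembled with the java.lang.invoke combinators, for the instanceof example. This is only a sketch under assumptions: there is no Pattern tree and no bootstrap method, the handle is built directly, and the names MethodPatternSketch, IntBindings, TARGET and match are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class MethodPatternSketch {
    public record IntBindings(boolean match, int value) {}  // stand-in for $Bindings$__int
    public record Carrier(boolean match, int i) {}

    public static IntBindings parseInt(String text) {
        try {
            return new IntBindings(true, Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return new IntBindings(false, 0);
        }
    }

    // The handle a bootstrap method could install as the call site target: (Object)Carrier.
    static final MethodHandle TARGET = build();

    static MethodHandle build() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle pattern = lookup.findStatic(MethodPatternSketch.class, "parseInt",
                    MethodType.methodType(IntBindings.class, String.class));
            MethodHandle getMatch = lookup.findVirtual(IntBindings.class, "match",
                    MethodType.methodType(boolean.class));
            MethodHandle getValue = lookup.findVirtual(IntBindings.class, "value",
                    MethodType.methodType(int.class));
            MethodHandle newCarrier = lookup.findConstructor(Carrier.class,
                    MethodType.methodType(void.class, boolean.class, int.class));

            // Load the bindings and store them in a Carrier: (IntBindings, IntBindings)Carrier ...
            MethodHandle store = MethodHandles.filterArguments(newCarrier, 0, getMatch, getValue);
            // ... then feed the same record to both accessors: (IntBindings)Carrier
            store = MethodHandles.permuteArguments(store,
                    MethodType.methodType(Carrier.class, IntBindings.class), 0, 0);
            // Call the pattern method first: (String)Carrier
            MethodHandle onString = MethodHandles.filterReturnValue(pattern, store);

            // The enclosing type pattern: String.class.isInstance(o), (Object)boolean
            MethodHandle isString = lookup.findVirtual(Class.class, "isInstance",
                    MethodType.methodType(boolean.class, Object.class)).bindTo(String.class);
            // Fallback when the type test fails: (Object)Carrier returning a no-match carrier
            MethodHandle noMatch = MethodHandles.dropArguments(
                    MethodHandles.constant(Carrier.class, new Carrier(false, 0)), 0, Object.class);

            return MethodHandles.guardWithTest(isString,
                    onString.asType(MethodType.methodType(Carrier.class, Object.class)), noMatch);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Convenience wrapper standing in for the invokedynamic call.
    static Carrier match(Object o) {
        try {
            return (Carrier) TARGET.invokeExact(o);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(match("42"));
        System.out.println(match("abc"));
        System.out.println(match(42));
    }
}
```

The handle is built once and reused, which is what installing it as the target of a call site buys you.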
As we discussed earlier, the Carrier record can be generated at runtime and hidden at compile time behind an Object; in that case the reference to Carrier.i0 is replaced by an index (and a type) in the Pattern tree, and the value needs to be retrieved using invokedynamic too.
Now the questions:
- should we allow users to define their own binding record? It's less magic and the name is obviously stable, but it's more boilerplate.
- if the Carrier records are defined at runtime, maybe the record component types (at least the non-primitive ones) should be erased, to enable better sharing?
Rémi