New pattern matching doc
Brian Goetz
brian.goetz at oracle.com
Tue Jan 12 17:31:30 UTC 2021
> yes,
> do you have a use case for a pattern with input arguments ?
Yes!
1. Regex. A regular expression is "morally" a pattern, and we would
like to expose it as such. Doing so has many benefits, not only
unifying matching mechanisms, but also because patterns compose better.
A regex `(a*)(b*)` is like a partial pattern which binds a String[2].
The simplest way to get there is to be able to wrap things like regex
with an actual pattern. For example (using the deliberately awful syntax):
    static pattern
        __name = "regex"
        __target(String stringToMatch)
        __inputArgs(String regexString)  // or Pattern
        __bindings(String[] groups)
So this is an ordinary pattern that takes as input a regex string, and
produces as bindings a String[]. You could write this pattern in just a
few lines of code, and now any regex can be pattern matched. This
connects with the array patterns we talked about this week:
    if (string instanceof regex("(a*)(b*)", String[] { var asString, var bsString }))
                                ^ input arg
                                             ^ String array pattern
(There is a syntax bikeshed to paint for the client here too, but we're
not painting that today.)
2. The pattern version of Map::get. Map.get is like a pattern with an
input argument; there's a map that is the target, an input argument for
the key, and a binding for the value.
3. In the JSON example I gave in the document, the `stringKey()`
patterns take key names as input args (like Map.get.)
There are more but I think these examples support the idea that this is
not a wild or esoteric thing.
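
To make the first two examples concrete, here is a rough sketch in today's Java of what such patterns compute. It is not the proposed feature: the class and method names (`InputArgPatterns`, `regex`, `mapGet`) are made up for illustration, and `Optional` stands in for "matched, with these bindings" versus "did not match". In both methods the first parameter plays the role of the target and the second the role of the input argument.

```java
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InputArgPatterns {

    // Hypothetical stand-in for the "regex" pattern:
    // target = the string to match, input arg = the regex, bindings = the groups.
    public static Optional<String[]> regex(String target, String regexString) {
        Matcher m = Pattern.compile(regexString).matcher(target);
        if (!m.matches()) {
            return Optional.empty();          // the pattern does not match
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 0; i < groups.length; i++) {
            groups[i] = m.group(i + 1);       // group 0 is the whole match, so skip it
        }
        return Optional.of(groups);
    }

    // Hypothetical stand-in for the pattern version of Map::get:
    // target = the map, input arg = the key, binding = the value.
    // Assumption: a key that is absent or mapped to null counts as "no match".
    public static <K, V> Optional<V> mapGet(Map<K, V> target, K key) {
        return Optional.ofNullable(target.get(key));
    }

    public static void main(String[] args) {
        regex("aaabb", "(a*)(b*)").ifPresent(g ->
                System.out.println(g[0] + " / " + g[1]));   // prints "aaa / bb"
        mapGet(Map.of("name", "Amber"), "name")
                .ifPresent(System.out::println);            // prints "Amber"
    }
}
```

What a real pattern would add over this is composition: the `String[]` result could be destructured by a nested array pattern in the same expression, instead of by hand in the lambda.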
>> so on a simple example, if the bindings are only types (here, in between brackets)
>>
>> with this declaration
>>   class Foos {
>>     static pattern [int, CharSequence] bar() { ... }
>>   }
>>
>> and this pattern matching
>>   switch(o) {
>>     case Foos.bar(var a, var b) -> ...
>>   }
>>
>> as you said given that a and b are declared as var, the compiler will only check the arity of the bindings and infer that the type of a is int and the type of b is CharSequence.
>>
>> If there is overriding between instance pattern methods, we may allow
>> - the type of the binding to be covariant (like the return type of an instance method)
>> - more bindings in the overridden method than in the base method
For the first, we might; the bindings are like returns, so this is
possible (though it will introduce complexity that should be evaluated
for its return.) For the second, I'm not sure that's an override; it
feels more like an overload in the subclass.
Pattern implementations can delegate to other pattern implementations,
just as methods can call other methods, and constructors can call their
super constructors. This turns out to be a useful idiom, so in the case
of your "overload with more bindings", it seems more likely we'd have
(using your syntax idiom):
    class Super {
        pattern [ int ] foo() { ... }
    }
    class Sub extends Super {
        @Override
        pattern [ int ] foo() { ... }
        pattern [ int, int ] foo() {
            ... invoke single-int foo to bind first binding ...
            ... bind second binding ....
        }
    }
Much like a telescoping constructor.
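
The delegation idiom can be approximated in current Java, as a sketch only. Everything here is hypothetical: `Optional` models "match or no match", the records `One` and `Two` model the binding lists, and the two-binding version is named `fooPair` because Java methods cannot be overloaded on return type alone (a restriction that pattern declarations might not share).

```java
import java.util.Optional;

public class TelescopingSketch {

    // Hypothetical carriers for the binding lists [ int ] and [ int, int ].
    public record One(int first) {}
    public record Two(int first, int second) {}

    public static class Super {
        protected final int x;

        public Super(int x) { this.x = x; }

        // Stand-in for: pattern [ int ] foo()
        public Optional<One> foo() {
            return Optional.of(new One(x));
        }
    }

    public static class Sub extends Super {
        private final int y;

        public Sub(int x, int y) { super(x); this.y = y; }

        // Overrides the single-binding version; here it only matches non-negative x.
        @Override
        public Optional<One> foo() {
            return x >= 0 ? Optional.of(new One(x)) : Optional.empty();
        }

        // Stand-in for: pattern [ int, int ] foo()
        // Delegates to the single-int version for the first binding,
        // then adds the second binding itself.
        public Optional<Two> fooPair() {
            return foo().map(one -> new Two(one.first(), y));
        }
    }

    public static void main(String[] args) {
        System.out.println(new Sub(7, 9).fooPair());   // a match carrying 7 and 9
        System.out.println(new Sub(-1, 9).fooPair());  // no match: the delegate failed
    }
}
```

If the delegate fails to match, the wider version fails too, which is the same shape as a telescoping constructor relying on the narrower one.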
>>
>> We also talk about requiring names when declaring the bindings
>>   class Foos {
>>     static pattern [length: int, text: CharSequence] bar() { ... }
>>   }
>> but given that the matching of the bindings is positional, technically we don't need names but we may still keep them
If we are modeling the body as imperative code, like constructors, we
need names for the bindings and targets, to assign to them. If we are
modeling the body as something that "calls" a terminal "method" like
`match(a,b,c)`, then the names are technically optional, but ...
omitting them causes other problems (like, what would the specification
say? "The third `int` parameter is ...")