Patterns: declaration
Victor Nazarov
asviraspossible at gmail.com
Fri Jan 22 23:04:31 UTC 2021
Hello,
I like the main concept proposal, i.e. symmetry between methods and
patterns and where this is moving generally. I'd like to comment on two
places that doesn't look undisputable: static pattern declaration and usage
of patterns with input arguments.
1) static pattern declaration
Implicit magic target parameter is not ideal. It's not obvious what target
is by looking at the declaration and the symmetry between declaration and
usage is broken.
I'd like to emphasize the difference between pattern declaration and
methods references. For methods references you can declare and implement
methods however you like and later reference it somewhere. You don't need
to know that first argument is somewhat special when you*implement* the
method. For patterns the situation is different, you need to know where the
target is when you implement the pattern.
Because of this difference I like the alternative proposal with special
keyword better:
static <T> pattern<Optional<T>>(T value) of() {...} // Uses `this` or
`target` keyword.
2) passing input arguments
I think two consequetive parenthes-pairs are problematic
* the first pair brings method call mental model and it looks like the
special Pattern-object is the result of this call, but it's very different
from how it works.
* Another is that users expect syntax to be compostable. When a user see
case pattern1(var a, var b) -> ...;
case pattern2(x, y)(var w, var v) -> ...;
It's reasonable to expect that
pattern1(var a, var b)
Works the same way as
pattern2(x, y)
But these two pairs of parentheses are very different.
I'd like to instead propose another syntax that mimics already existing
construct:
String A_AND_B = "(a*)(b*)";
switch (s) {
case StringUtils.(A_AND_B)regex(var as, var bs) -> ...;
}
The syntax is analogues to explicit type-argument application
(List.<Integer>.of()), but uses parentheses and applies values instead of
types.
--
Victor Nazarov
P. S. Sorry if something is off. I've typed this message from my phone...
On Fri, Jan 22, 2021, 17:10 Brian Goetz <brian.goetz at oracle.com> wrote:
> Here's the first half of an *extremely rough* doc-in-progress that
> starts to show more of the syntax of the declaration of a pattern -- but
> does yet not dive into the _body_ of a pattern (which we should still
> leave for a later discussion). This is still more about object model
> than directly about syntax; many people have been asking "I'm having a
> hard time imagining what this looks like in a class file", so this
> should help a bit. It also fleshes out some of the requirements,
> specifically patterns delegating to other patterns.
>
>
>
> ## Beyond built-in patterns
>
> Related documents have explored a number of built-in pattern types,
> where the
> compiler knows enough about the pattern semantics to implement them
> directly,
> such as type patterns or record deconstruction patterns. However, we
> also wish
> to support direct declaration of patterns as class members, such as
> deconstruction patterns for non-record classes, static patterns, and
> instance
> patterns.
>
> Given a pattern match like:
>
> ```
> if (p instanceof Point(int x, int y)) { ... }
> ```
>
> assuming the pattern matches, we need to capture somewhere how the
> bindings `x`
> and `y` are extracted from the `Point`. If `Point` is a record, we can
> do this
> without help from the user, but otherwise, someone is going to need to
> write
> some code somewhere to capture the applicability test and state extraction
> corresponding to this pattern. (One can imagine various "tricks" for
> doing so
> (i.e., extracting fields, calling getters, etc), but these tricks
> quickly run
> out of gas for classes whose state is not trivially and transparently
> coupled to
> its API.) We'll call the various forms of this _member patterns_, which
> are
> expressed as a new kind of class member.
>
> > Member patterns are executable class members, just like constructors and
> > methods.
>
> Abstractly, a pattern operates on a single target parameter (which for
> instance
> and deconstruction patterns is just the receiver), and either succeeds
> or fails
> based on some condition. If it succeeds, it produces zero or more
> output values
> (the destructured components, or _bindings_.)
>
> #### Pattern declarations in other languages
>
> We may be able to obtain some inspiration by looking at what other
> languages do.
> By the late 1990s, pattern matching over structural types (sequences and
> tuples)
> was a common feature of functional languages; languages that supported
> nominal
> product types (e.g., Haskell) also supported pattern matching over nominal
> product types. In the mid-2000s, languages like [F#][fs-patterns] and
> [Scala][scala-patterns] extended the notion to additionally apply to
> abstract
> data types (those whose concrete representation was not necessary known
> to the
> client.) However, in both F# and Scala, patterns remain distinctly `static`
> entities; they are not a first-class part of the object hierarchy, and
> remain at
> the periphery of the object model.
>
> Scala takes the approach of modeling patterns as static methods with the
> name
> `unapply`. An `unapply` method takes a single parameter (type checked
> against
> the matchee) and has some flexibility in its return type; it can return
> `boolean` if there are no bindings, or `Option[T]` (where `T` is often a
> tuple
> type) to carry the binding parameters. A matcher for `Point(x,y)` would
> look
> like:
>
> ```
> object Point {
> Option[Tuple[int, int]] unapply() { ... }
> }
> ```
>
> This is a pragmatic approach, with pros and cons. It is nice that it is
> just an
> ordinary method, which can be called either directly or through a
> pattern match.
> On the other hand, there is lots of magic -- the language imputes special
> semantics based on method name, enclosing class, argument list, and
> return type.
> Also, because these methods are effectively static, there is no ability to
> overload, override, have abstract patterns, etc.
>
> C# takes a more principled, but also more limited, approach. As of C# 9, a
> class can write one or more `Deconstruct` methods (magic name), using `out`
> parameters to receive the bindings (a feature already present in C#). A
> `Deconstruct` method must be total (the deconstruction cannot fail), and
> there
> is no way to write instance or abstract patterns (though the equivalent
> of some
> static patterns can be achieved with extension methods.)
>
> Both of these approaches are adding patterns "around the edges"; we are
> proposing a pattern matching model that integrates much more deeply into
> the
> object model.
>
> #### Deconstruction patterns in Java
>
> The simplest form of pattern is a _deconstruction pattern_, which is the
> dual of
> a constructor. And, just as constructors are an odd mix of instance and
> static
> behavior -- they are instance members (they have a receiver) but are not
> inherited and can only be accessed through their class name -- their
> duals have
> the exact same odd mix. Another oddity of constructors is that they
> don't have
> names -- the name supplied at the instantiate site is the name of a
> class, not
> a class member. It makes sense for deconstruction patterns to also
> share this
> characteristic.
>
> The target of a deconstruction pattern is obvious -- it is the receiver.
> Deconstruction patterns are _total_ on their target type -- as long as the
> target is valid as a receiver, the pattern always succeeds. We'd like
> for the
> arity, types, order, and names of the binding variables to be manifest
> in the
> declaration code, as this is part of the classes API, and for the
> similarity
> with the corresponding constructor, if there is one, to be obvious both
> at the
> declaration and use site. This might look like:
>
> ```{.java}
> class Point {
> int x, y;
>
> Point(int x, int y) {
> // constructor body
> }
>
> deconstructor(int x, int y) {
> // deconstructor body
> }
> }
> ```
>
> There is a strong symmetry between the declarations of the constructor and
> deconstructor. The constructor takes as arguments `x` and `y` and
> produces a
> `Point`; the deconstructor takes a `Point` and produces as bindings `x`
> and `y`,
> and the declaration of the argument list of the constructor and the
> binding list
> of the deconstructor are syntactically similar. The locution
> `deconstructor(BINDINGS)` is meant to evoke the notion of "returning" the
> specified set of bindings. For classes that model entities more or less
> transparently, as `Point` does, it is entirely reasonable to match
> constructors
> with deconstructors whose binding lists correspond to the constructor
> parameter
> list.
>
> There is also a symmetry between the creation of an object with a
> constructor,
> and the destructuring of an object with a deconstruction pattern. A client
> matches against a deconstruction pattern as follows:
>
> ```
> case Point(var x, var y):
> ```
>
> Here, we interpret `Point` as a class name, and look in the `Point`
> class for a
> deconstruction pattern with compatible binding arity and types --
> analogously to
> how we interpret the instance creation expression `new Point(x, y)`.
>
> We've left out the declaration of the body for now; we'll return to this
> after
> our tour of the member pattern model. But it's important to note that,
> just as
> the constructor is free to perform defensive copying on the inputs, the
> deconstructor needs a way to do the same with the bindings.
>
> Deconstruction patterns are unusual in that they are always _total_; if the
> target is of the correct type, the deconstruction pattern should always
> complete
> successfully. However, not all patterns are total; a pattern like
> `Optional.of(var x)` will only match a subset of instances of
> `Optional`. As a
> simplifying approximation, we'll say that all other member pattern are
> partial;
> the bodies of these patterns will need a way to denote this as well.
>
> There's also no reason that deconstruction patterns cannot appear in
> interfaces,
> as long the interface supports enough API to implement it. For example, we
> might want `Map.Entry` to support a deconstruction pattern that yields
> key and
> value:
>
> ```
> interface Entry<K,V> {
> K getKey();
> V getValue();
>
> deconstructor(K key, V value) {
> // invoke getKey() and getValue() and bind them appropriately
> }
> }
> ```
>
> This would allow us to iterate through the elements of a `Map` more
> directly:
>
> ```
> for (Map.Entry(var k, var v) : map.entrySet()) {
> // use k and v
> }
> ```
>
> #### Instance patterns
>
> The next simplest form of pattern declarations are _instance patterns_,
> which
> are duals of instance methods. As with deconstruction patterns, the
> target of
> an instance pattern is the receiver, but unlike deconstruction patterns,
> instance patterns can be partial. An instance pattern declaration looks
> like:
>
> ```
> class Class<T> {
> public pattern(Class<?> componentType) arrayClass() {
> ... body ...
> }
> }
> ```
>
> A significant difference from deconstruction patterns is that instance
> patterns
> have names (just as instance methods do); later, we'll see that they can
> also
> have input arguments.
>
> At the use site, instance patterns are usually unqualified, referring to an
> instance pattern on the static type of the match target. So if we have a
> `Class` in hand, we can match on those classes that represent arrays as:
>
> ```
> switch (clazz) {
> case arrayClass(var componentType):
> }
> ```
>
> #### Static patterns
>
> Static patterns are slightly more complicated than instance patterns,
> because
> they lack a receiver -- and so the match target must be identified some
> other
> way. We can do this by indicating the target type and name as a pattern
> parameter. The duality between the "receiver" and the first parameter of a
> static method is not only well-known to the JVM), but also has precedent
> in the
> language: the method reference `Foo::bar` could refer to a static method of
> `Foo` or to an instance method (an "unbound method reference"), in which
> case
> the first argument becomes the receiver.
>
> ```
> class Optional<T> {
> static<T> Optional<T> of(T t) { ... }
>
> static<T> pattern(T t) of(Optional<T> target) { ... }
>
> static<T> Optional<T> empty() { ... }
>
> static<T> pattern() empty(Optional<T> target) { ... }
> }
> ```
>
> This leads to the familiar symmetry at the use site: we construct
> `Optional`
> instances with `Optional.empty()` and `Optional.of(t)`, and match
> against them
> like:
>
> ```
> switch (opt) {
> case Optional.of(var t): ...
> case Optional.empty(): ...
> }
> ```
>
> #### Patterns with input arguments
>
> Pattern matching fuse a test with zero or more conditional extractions,
> which is
> something we already do all the time. Regular expression parsing is an
> example
> of this combination -- when we match against a regular expression, we
> get both a
> boolean result (did it match), and, if it did, we can then extract the
> "groups"
> bounded by `(...)` in the regular expression. It would be nice if we could
> represent a regular expression as an ordinary pattern match, rather
> than using
> an ad-hoc calling sequence, while still delegating to the same underlying
> regular expression library. So how do we turn a regular expression into a
> pattern?
>
> We already have almost all the tools needed; a regular expression is like a
> pattern which binds a `String...`, whose target is a `String`. We could,
> for
> each regular expression, write a static pattern:
>
> ```
> Pattern asbs = Pattern.compile("(a*)(b*)");
>
> static pattern(String...) asbsMatch(String target) { ...
> ... use asbs to match target, and extract the groups ...
> }
> ```
>
> This works -- we could ask if a string matches the pattern
> `asbsMatch()`, and
> bind the groups if so (further refining with nested patterns if we
> like.) But
> it seems a little clunky to have to write this method for every regular
> expression; we'd like to write it once, and parameterize it with the
> regular
> expression we want. This means we need to be able to specify an additional
> input argument beyond the match target. This might be declared:
>
> ```
> static pattern(String... groups) regex(String target, String regex) {
> ... use regex library to match against regex, extract bindings ...
> }
> ```
>
> Here, the first parameter denotes the target of the match (as before); the
> remaining parameters must be passed at the use site. This might be
> written:
>
> ```
> String AS_AND_BS = "(a*)(b*)";
> case regex(AS_AND_BS, String[] { var aString, var bString }):
> ```
>
> where the first "parameter" is the input parameter, and the second is a
> nested
> pattern that matches the sole binding variable. This is easy enough for
> compilers to work out, but not so great for humans; it is not always
> obvious
> where the data ends and the patterns begin. (This gets worse if there are
> syntactic constructs that could be either arguments or nested patterns,
> such as
> denoting constant patterns with literals.) So it is probably preferable
> to
> syntactically reflect the two different sorts of arguments, such as:
>
> ```
> case regex(AS_AND_BS)(String[] { var aString, var bString }):
> ```
>
> If a pattern has two argument lists, the first is the input arguments,
> and the
> second receives the bindings. If the first list is empty, as it usually
> is,
> it can be omitted. Since our regex pattern is a varargs pattern, we
> could also
> elide the `String[]` pattern scaffolding:
>
> ```
> case regex(AS_AND_BS)(String aString, String bString):
> ```
>
> We would likely want to enforce the requirement that expressions used as
> input
> arguments be _effectively final_, to eliminate any confusion about the
> timing
> of when they are evaluated.
>
> #### Alternatives to Map::get
>
> The `Map::get` method takes a key and returns the corresponding value,
> if the
> mapping is present in the map, or `null` if the mapping is not present.
> This is
> a problematic API choice for a number of reasons. First, if the map
> supports
> null values, it is impossible to distinguish a mapping to null from
> mapping-not-present; if we try to distinguish between these cases with
> `Map::containsKey`, we risk a possible race condition. And, as we
> integrate
> inline classes and primitives into the generic type system as we are in
> Project Valhalla, for some types, `null` will not be available as a
> sentinel.
>
> If we had patterns in the language when we designed Collections, we
> might have
> reached for a pattern, which captures both the "is the mapping present",
> and
> "if so, what is the key mapped to" in one go:
>
> ```
> interface Map {
> pattern(V value) withMapping(K key);
> }
> ```
>
> and clients would be required to confront this conditionality directly:
>
> ```
> if (m instanceof Map.withMapping(k)(var v)) { ... use v ... }
> ```
>
> ## Composition
>
> Just as constructors compose (a subclass constructor can delegate to a
> superclass constructor, and constructors/factories can delegate to
> constructors/factories for their components), patterns should compose as
> well.
> Let's gather some requirements for composition of pattern declarations.
>
> ```
> class A {
> private int a;
>
> public A(int a) { this.a = a; }
> public deconstructor(int a) { ... }
> }
>
> class B extends A {
> private int b;
>
> public B(int a, int b) { super(a); this.b = b; }
>
> public pattern(int a, int b) {
> // want to delegate to superclass deconstructor
> }
> }
> ```
>
> The class `A` was designed for subclassing, and subclass constructors will
> delegate to constructors in `A`. Since deconstruction is the dual of
> construction, it stands to reason that the deconstruction pattern in `B`
> can
> delegate to the deconstruction pattern in `A`, and then add whatever
> B-specific
> deconstruction logic is needed.
>
> If we use composition instead of inheritance, we have a similar
> requirement. In
> this next example, class `D` has two members of type `C`, and `D`'s
> constructor
> delegates to the `B` constructor; we would want its deconstructor to be
> able
> to do the same thing.
>
> ```
> class C {
> private int c;
>
> public C(int c) { this.c = c; }
> public deconstructor(int c) { ... }
> }
>
> class D {
> C c1;
> C c2;
>
> public D(int x, y) {
> this.c1 = new C(x);
> this.c2 = new C(y);
> }
>
> public deconstructor(int x, int y) {
> // delegate to C pattern to extract x from c1 and y from c2
> }
> }
> ```
>
> In these examples, we've composed deconstruction (total) patterns into
> bigger
> patterns, so we didn't have to worry about what happens when the inner
> pattern
> fails. But we also want to be able to compose partial patterns into
> partial
> patterns. For example, suppose we have:
>
> ```
> interface Shape {
>
> // if component is visible, provide bounds
> public pattern(Rect bound) visibleShape();
> }
> ```
>
> And suppose we have a class that represents the composition of two
> shapes; we
> would like it to implement the `visibleShape()` pattern by delegating to
> the
> `visibleShape()` pattern for each of the shapes, fail if either of these
> fail,
> and if both succeed, return the bounding rectangle for both shapes:
>
> ```
> class CompoundShape implements Shape {
> Shape s1 = ...
> Shape s2 = ...
>
> public pattern(Rect r) visibleShape() {
> ...
> if (s1 instanceof visibleShape(var r1)
> && s2 instanceof visibleShape(var r2)) {
> // succeed with boundingRect(r1, r2)
> }
> else {
> // fail
> }
> }
> }
> ```
>
>
More information about the amber-spec-comments
mailing list