From forax at univ-mlv.fr Tue Jan 2 12:11:37 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 2 Jan 2024 13:11:37 +0100 (CET)
Subject: Blessed modifier order does not include sealed/non-sealed
In-Reply-To:
References:
Message-ID: <1044527186.92730576.1704197497523.JavaMail.zimbra@univ-eiffel.fr>

----- Original Message -----
> From: "Pavel Rappo"
> To: "core-libs-dev"
> Sent: Tuesday, January 2, 2024 12:56:08 PM
> Subject: Blessed modifier order does not include sealed/non-sealed

> I couldn't find any prior discussions on this matter.
>
> I noticed that bin/blessed-modifier-order.sh has not been updated for the
> [recently introduced](https://openjdk.org/jeps/409) `sealed` and `non-sealed`
> keywords. I also note that we already have cases in OpenJDK where those
> keywords are ordered differently. If we have a consensus on how to extend the
> "blessed order" onto those new keywords, I can create a PR to update the
> script.
>
> -Pavel

[amber-spec-experts added]

Hello,
For me, sealed, non-sealed and final are (mutually exclusive) modifiers that control the subtypes of a class. Given that there is already a blessed order for final, sealed and non-sealed should be at the same place as final.

regards,
Rémi

From brian.goetz at oracle.com Mon Jan 22 19:45:35 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 22 Jan 2024 14:45:35 -0500
Subject: Towards member patterns
Message-ID: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com>

# Towards member patterns

Time to check in on where things are in the bigger picture of patterns as class members. Note: while this document may have illustrative examples, you should not take that as a definitive statement of syntax, and Remi will not be commenting on the syntax at this time.

We've already dipped our toes in the water with _record patterns_. A record pattern looks like:

    case R(p1, p2, ... pn):

where `R` is a record type and `p1..pn` are nested patterns that are matched to its components.
Because records are defined by their state description, we can automatically derive record patterns "for free", just as we derive record constructors, accessors, etc.

There are many other classes that would benefit from being deconstructible with patterns. To that end, we will generalize record patterns to _deconstruction patterns_, where any class can declare an explicit deconstruction pattern and participate in pattern matching like records do.

Deconstruction patterns are not the end of the user-declared pattern story. Just as some classes prefer to expose static factories rather than constructors, they will be able to expose corresponding static patterns. And there is also a role for "instance patterns" and "pattern objects" as well.

Looking only at record and deconstruction patterns, it might be tempting to think that patterns are "just" methods with multiple return. But this would be extrapolating from a special case. Pattern matching is intrinsically _conditional_; the extraction of values from a target is conditioned on whether the target _matches_ the pattern. For the patterns we've seen so far -- type patterns and record patterns -- matching can be determined entirely by types. But more sophisticated patterns can also depend on other aspects of object state. For example, a pattern corresponding to the static factory `Optional::of` requires not only that the match candidate be of type `Optional`, but that the match candidate is an `Optional` that actually holds a value. Similarly, a pattern corresponding to a regular expression requires the match candidate to not only be a `String`, but to match the regular expression.

## The key intuition around patterns

A key capability of objects is _aggregation_; the combination of component values into a higher-level composite that incorporates those components. Java facilitates a variety of idioms for aggregation, including constructors, factories, builders, etc.
The dual of aggregation is _destructuring_ or _decomposition_, which takes an aggregate and attempts to recover its "ingredients". However, Java's support for destructuring has historically been far more ad-hoc, largely limited to "write some getters". Pattern matching seeks to put destructuring on the same firm foundation as aggregation.

Deconstruction patterns (such as record patterns) are the dual of construction. If we construct an object:

    Object o = new Point(x, y);

we can deconstruct it with a deconstruction pattern:

    if (o instanceof Point(var x, var y)) { ... }

> Intuitively, this pattern match asks "could this object have come from
> invoking the constructor `new Point(x, y)` for some `x` and `y`, and if so,
> tell me what they are."

While not all patterns exist in direct correspondence to another constructor or method, this intuition that a pattern reconstructs the ingredients to an aggregation operation is central to the design; we'll explore the limitations of this intuition in greater detail later.

## Use cases for declared patterns

Before turning to how patterns fit into the object model, let's look at some of the potential use cases for patterns in APIs.

### Recovering construction arguments

Deconstruction patterns are the dual of constructors; where a constructor takes N arguments and aggregates them into an object, a deconstruction pattern takes an aggregate and decomposes it into its components. Constructors are unusual in that they are instance behavior (they have an implicit `this` argument), but are not inherited; deconstruction patterns are the same. For deconstruction patterns (but not for all instance patterns), the match candidate is always the receiver. Tentatively, we've decided that deconstruction patterns are always unconditional; that a deconstruction pattern for class `Foo` should match any instance of `Foo`. At the use site, deconstruction patterns use the same syntax as record patterns:

    case Point(int x, int y):

Just as constructors can be overloaded, so can deconstruction patterns. However, the reasons we might overload deconstruction patterns are slightly different than for constructors, and so it may well be the case that we end up with fewer overloads of deconstruction patterns than we do of constructors. Constructors often form _telescoping sets_, both for reasons of syntactic convenience at the use site (fewer arguments to specify) and to avoid brittleness (clients can let the class implementation pick the defaults rather than hard-coding them.) This motivation is less pronounced for deconstruction patterns (unwanted bindings can be ignored with `_`), so it is quite possible that authors will choose to have one deconstruction pattern overload per telescoping constructor _set_, rather than one per constructor.

There is no requirement for deconstruction patterns to expose the exact same API as constructors, but we expect this will be common, at least for classes for which the construction process is effectively an aggregation operation on the constructor arguments.

### Recovering static factory arguments

Not all classes want to expose their constructors; sometimes classes prefer to expose static factories instead. In this case, the class should be able to expose corresponding static patterns as well.

For a class like `Optional`, which exposes factories `Optional::of` and `Optional::empty`, the object state incorporates not only the factory arguments, but which factory was chosen. Accordingly, it makes sense to deconstruct the object in the same way:

    switch (optional) {
        case Optional.of(var payload): ...
        case Optional.empty(): ...
    }

Such patterns are necessarily conditional, asking the Pattern Question: "could this `Optional` have come from the `Optional::of` factory, and if so, with what argument?" Static patterns, like static methods, lack a receiver, so `this` is not defined in the body of a static pattern.
However, we will need a way to denote the match candidate, so its state can be examined by the pattern body.

Another feature of static methods is that they can be used to put a factory for a class `C` in _another_ class, whether one in the same maintenance domain (such as `Collections`) or in some other package. This feature is shared by static patterns.

### Conversions and queries

Another application for static patterns is the dual of static methods for conversions. For a static method like `Integer::toString`, which converts an `int` to its `String` representation, a corresponding static pattern `Integer::toString` can ask the Pattern Question: "could this `String` have come from converting an integer to `String`, and if so, what integer".

Some groups of query methods in existing APIs are patterns in disguise. The class `java.lang.Class` has a pair of instance methods, `Class::isArray` and `Class::getComponentType`, that work together to determine if the `Class` describes an array type, and if so, provide its component type. This question is much better framed as a single pattern:

    case Class.arrayClass(var componentType):

The two existing methods are made more complicated by their relationship to each other; `Class::getComponentType` has a precondition (the `Class` must describe an array type) and therefore has to specify and implement what to do if the precondition fails, and the relationship between the methods is captured only in documentation. By combining them into a single pattern, it becomes impossible to misuse (because of the inherent conditionality of patterns) and easier to understand (because it can all be documented in one place.)

This hypothetical `Class::arrayClass` pattern also has a sensible dual as a factory method:

    static Class arrayClass(Class componentType)

which produces the array `Class` for the array type whose component type is provided.
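As a rough illustration of the `isArray` / `getComponentType` pairing discussed above, here is a minimal sketch in today's Java that folds the two queries into one conditional result and pairs it with the factory direction. The class name `ArrayClassSketch` and the helper names `componentTypeOf` and `arrayClass` are assumptions made for this example; they are not existing JDK members, and `Optional` merely stands in for the conditionality a real pattern would provide.

```java
import java.util.Optional;

public class ArrayClassSketch {

    // Hypothetical stand-in for the conditional direction: yields the
    // component type only when the candidate describes an array type.
    static Optional<Class<?>> componentTypeOf(Class<?> candidate) {
        if (candidate.isArray()) {
            return Optional.of(candidate.getComponentType());
        }
        return Optional.empty();
    }

    // Hypothetical stand-in for the factory direction.
    // Class::arrayType is available from Java 12 onward.
    static Class<?> arrayClass(Class<?> componentType) {
        return componentType.arrayType();
    }

    public static void main(String[] args) {
        System.out.println(componentTypeOf(int[].class));   // present
        System.out.println(componentTypeOf(String.class));  // empty
        System.out.println(arrayClass(String.class) == String[].class);
    }
}
```

The caller can no longer ask for a component type without first learning whether one exists, which is the misuse the single-pattern framing is meant to rule out.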
An API need not provide both directions of a conversion, but if it does, the two generally strengthen each other. This method/pattern pair could be either static or instance members, depending on API design choice.

Another form of "conversion" method / pattern pair, even though both types are the same, is "power of two". A `powerOfTwo` method takes an exponent and returns the resulting power of two; a `powerOfTwo` pattern asks if its match candidate is a power of two, and if so, binds the base-two logarithm.

### Numeric conversions

As Project Valhalla gives us the ability to declare new numeric types, we will want to be able to convert these new types to other numeric types. For unconditional conversions (such as widening half-float to float), an ordinary method will suffice:

    float widen(HalfFloat f);

But the reverse is unlikely to be unconditional; narrowing conversions can fail if the value cannot be represented in the narrower type. This is better represented as a pattern which asks the Pattern Question: "could this `float` have come from widening a `HalfFloat`, and if so, tell me what `HalfFloat` that is." A widening conversion (or boxing conversion) is best represented by a _pair_ of members, an ordinary method for the unconditional direction, and a pattern for the conditional direction.

### Conditional extraction

Some operations, such as matching a string to a regular expression with capture groups, are pattern matches in disguise. We should be able to take a regular expression R and match against it with `instanceof` or `switch`, binding capture groups (using varargs patterns) if it matches.

## Member patterns in the object model

We currently have three kinds of executable class members: constructors, static methods, and instance methods. (Actually constructors are not members, but we will leave this pedantic detail aside for now.) As the above examples show, each of these can be amenable to a dual member which asks the Pattern Question about it.
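The widening/narrowing pairing from the numeric conversions discussion above can be approximated in current Java. The sketch below is only an analogy: it uses `int` and `long` in place of `HalfFloat` and `float` (no `HalfFloat` type is assumed to exist), and the names `NumericConversionSketch`, `widen`, and `inverseWiden` are invented for illustration. An `OptionalLong`/`OptionalInt` result plays the role that conditional bindings would play in a real pattern.

```java
import java.util.OptionalInt;

public class NumericConversionSketch {

    // Total direction: every int can be widened to a long.
    static long widen(int value) {
        return value;
    }

    // Conditional direction: "could this long have come from widening an int,
    // and if so, which int?"
    static OptionalInt inverseWiden(long candidate) {
        if (candidate >= Integer.MIN_VALUE && candidate <= Integer.MAX_VALUE) {
            return OptionalInt.of((int) candidate);
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(inverseWiden(widen(7)));  // holds 7
        System.out.println(inverseWiden(1L << 40));  // empty: not representable
    }
}
```

Widening and then applying the inverse always recovers the original value, while the inverse declines any `long` that no `int` could have produced.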
Patterns are dual to constructors and methods in two ways: structurally and semantically. Structurally, patterns invert the relationship between inputs and outputs: a method takes N arguments as input and produces a single result, and the corresponding pattern takes a candidate result (the "match candidate") and conditionally produces N bindings. Semantically, patterns ask the Pattern Question: could this result have originated by some invocation of the dual operation.

### Patterns as inverse methods and constructors

One way to frame patterns in the object model is as _inverse constructors_ and _inverse methods_. For purposes of this document, I will use an illustrative syntax that directly evokes this duality (but remember, we're not discussing syntax now):

```
class Point {
    final int x, y;

    // Constructor
    Point(int x, int y) { ... }

    // Deconstruction pattern
    inverse Point(int x, int y) { ... }
}

class Optional {
    // Static factories
    static Optional of(T t) { ... }
    static Optional empty() { ... }

    // Static patterns
    static inverse Optional of(T t) { ... }
    static inverse Optional empty() { ... }
}
```

`Point` has a constructor and an inverse constructor (deconstruction pattern) for the external representation `(int x, int y)`; in an inverse constructor, the binding list appears where the parameter list does in the constructor. `Optional` has static factories and corresponding patterns for `empty` and `of`. As with inverse constructors, the binding list of a pattern appears in the position that the parameters appear in a method declaration; additionally, the _match candidate type_ appears in the position that the return value appears in a method declaration. In both cases, the declaration site and use site of the pattern use the same syntax.

In the body of an inverse constructor or method, we need to be able to talk about the match candidate.
In this model, the match candidate has a type determined by the declaration (for an inverse constructor, the class; for an inverse method, the type specified in the "return position" of the inverse method declaration), and there is a predefined context variable (e.g., `that`) that refers to the match candidate. For inverse constructors, the receiver (`this`) is aliased to the match candidate (`that`), but not necessarily so for inverse methods.

### Do all methods potentially have inverses?

We've seen examples of constructors, static methods, and instance methods that have sensible inverses, but not all methods do. For example, methods that operate primarily by side effects (such as mutative methods like setters or `List::add`) are not suitable candidates for inverses. Similarly, pure functions that "co-mingle" their arguments (such as arithmetic operators) are also not suitable candidates for inverses, because the ingredients to the operation typically can't be recovered from the result (i.e., `4` could be the result of `plus(2, 2)` or `plus(1, 3)`).

Intuitively, the methods that are invertible are the ones that are _aggregative_. The constructor of a (well-behaved) record is aggregative, since all the information passed to the constructor is preserved in the result. Factories like `Optional::of` are similarly aggregative, as are non-lossy conversions such as widening or boxing conversions.

Ideally, an aggregation operation and its corresponding inverse form an _embedding-projection pair_ between the aggregate and a component space. Intuitively, an embedding-projection pair is an algebraic structure defined by a pair of functions between two sets such that composing in one direction (embed-then-project) is an identity, and composing in the other direction (project-then-embed) is a well-behaved approximation.

### Conversions

Conversion methods are a frequent candidate for inversion. We already have

    // Integer.java
    static String toString(int i) { ... }

to which the obvious inverse is

    static inverse String toString(int i) { ... }

and we can inspect a string to see if it is the string representation of an integer with

    if (s instanceof Integer.toString(int i)) { ... }

This composes nicely with deconstruction patterns; if we have a `Box` and want to ask whether the contained string is really the string representation of an integer, we can ask:

    case Box(Integer.toString(int i)):

which conveniently looks just like the composition of constructors or factories used to create such an instance (`new Box(Integer.toString(3))`).

When it comes to user-definable numeric conversions, the most likely strategy involves combining related operators in a single _witness_ object. For example, numeric conversion might be modeled as:

```
interface NumericConversion {
    TO convert(FROM from);
    inverse TO convert(FROM from);
}
```

which reflects the fact that conversion is total in one direction (widening, boxing) and conditional in the other (narrowing, unboxing.)

### Regular expression matching

Regular expressions are a form of ad-hoc pattern; a given string might match a given regex, or not, and if it does, it might produce multiple bindings (the capture groups.) It would be nice to be able to express regular expression matches as ordinary pattern matches.

Conveniently, we already have an object representation of regular expressions -- `java.util.regex.Pattern`. Which is an ideal place to put an instance pattern:

```
// varargs pattern
public inverse String match(String... groups) {
    Matcher m = matcher(that);    // *that* is the match candidate
    if (m.matches())              // receiver for matcher() is the Pattern
        __yield IntStream.rangeClosed(1, m.groupCount())
                         .mapToObj(m::group)
                         .toArray(String[]::new);
}
```

And now, we want to express "does string s match any of these regular expressions":

```
static final Pattern As = Pattern.compile("([aA]*)");
static final Pattern Bs = Pattern.compile("([bB]*)");
static final Pattern Cs = Pattern.compile("([cC]*)");

...

switch (aString) {
    case As.match(String as) -> ...
    case Bs.match(String bs) -> ...
    case Cs.match(String cs) -> ...
    ...
}
```

Essentially, `j.u.r.Pattern` becomes a _pattern object_, where the state of the object is used to determine whether or not it matches any given input. (There is nothing stopping a class from having multiple patterns, just as it can have multiple methods.)

## Pattern resolution

When we invoke a method, sometimes we are able to refer to the method with an _unqualified_ name (e.g., `m(3)`), and sometimes the method must be _qualified_ with a type name, package name, or a receiver object. The same is true for declared patterns.

Constructors for classes that are in the same package, or have been imported, can be referred to with an unqualified name; constructors can also be qualified with a package name. The same is true for deconstruction patterns:

```
case Foo(int x, int y):         // unqualified
case com.foo.Bar(int x, int y): // qualified by package
```

Static methods that are declared in the current class or an enclosing class, or are statically imported, can be referred to with an unqualified name; static methods can also be qualified with a type name. The same is true for static patterns:

```
case powerOfTwo(int exp):  // unqualified
case Optional.of(var e):   // qualified by class
```

Instance methods invoked on the current object can be referred to with an unqualified name; instance methods can also be qualified by a receiver object. The same is true for instance patterns:

```
case match(String s):    // unqualified
case As.match(String s): // qualified by receiver
```

In a qualified pattern `x.y`, `x` might be a package name, a class name, or an (effectively final) receiver variable; we use the same rules for choosing how to interpret a qualifier for patterns as we do for method invocations.

## Benefits of explicit duality

Declaring method-pattern pairs whose structure and name are the same yields many benefits. It means that we take things apart using the same abstractions used to put them together, which makes code more readable and less error-prone.

Referring to an _inverse pair_ of operations by a single name is simpler than having separate names for each direction; not only don't we need to come up with a name for the other direction, we also don't need to teach clients that "these two names are inverses", because the inverses have the same name already. What we know about the method `Integer::toString` immediately carries over to its inverse.

Further, thinking about a method-pattern pair provides a normalizing force toward actually ensuring the two are inverses; if we just had two related methods `xToY` and `yToX`, they might diverge subtly because the connection between the two members is not very strong.

Finally, this gives the language permission to treat the _pair_ of members as a thing in some cases, such as the use of ctor-dtor pairs in "withers" or serialization.

The explicit duality takes a little time to get used to. We have many years of experience of naming a method for its directionality, so people's first reaction is often "the pattern should be called `Integer.fromString`, not `Integer.toString`". So people will initially bristle at giving both directions the same name, especially when one implies a directionality such as `toString`. (In these cases, we can fall back on a convention that says that we should name it for the total direction.)
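To illustrate how two independently written conversion methods can "diverge subtly", here is a minimal sketch in today's Java of what a strict inverse of `Integer::toString` has to check. The class `ToStringInverseSketch` and the method `inverseToString` are hypothetical names for this example, not part of any proposal. The sketch relies on the documented behavior that `Integer.parseInt` accepts inputs such as `"+42"` and `"007"` that `Integer.toString` never produces, so it re-renders the parsed value and compares.

```java
import java.util.OptionalInt;

public class ToStringInverseSketch {

    // Hypothetical strict inverse: answers "could this String have come from
    // Integer.toString(i) for some i, and if so, which i?"
    static OptionalInt inverseToString(String candidate) {
        try {
            int parsed = Integer.parseInt(candidate);
            // parseInt is more lenient than toString is productive, so confirm
            // that rendering the value reproduces the candidate exactly.
            return Integer.toString(parsed).equals(candidate)
                    ? OptionalInt.of(parsed)
                    : OptionalInt.empty();
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(inverseToString("42"));   // holds 42
        System.out.println(inverseToString("+42"));  // empty
        System.out.println(inverseToString("007"));  // empty
    }
}
```

A `fromString`-style method built directly on `parseInt` would quietly accept more strings than `toString` can emit; thinking of the two as one inverse pair makes that mismatch visible.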
## Pattern lambdas, pattern objects, pattern references

Interfaces with a single abstract method (SAM) are called _functional interfaces_ and we support a conversion (historically called SAM conversion) from lambdas to functional interfaces. Interfaces with a single abstract pattern can benefit from a similar conversion (call this "SAP" conversion.)

In the early days of Streams, people complained about processing a stream using instanceof and cast:

```
Stream<Object> objects = ...
Stream<String> strings = objects.filter(x -> x instanceof String)
                                .map(x -> (String) x);
```

This locution is disappointing both for its verbosity (saying the same thing in two different ways) and its efficiency (doing the same work basically twice.) Later, it became possible to slightly simplify this using `mapMulti`:

```
objects.mapMulti((x, sink) -> { if (x instanceof String s) sink.accept(s); })
```

But, ultimately this stream pipeline is a pattern match; we want to match the elements to the pattern `String s`, and get a stream of the matched string bindings. We are now in a position to expose this more directly. Suppose we had the following SAP interface:

```
interface Match {
    inverse U match(T t);
}
```

then `Stream` could expose a `match` method:

```
Stream match(Match pattern);
```

We can SAP-convert a lambda whose yielded bindings are compatible with the sole abstract pattern in the SAP interface:

```
Match m = o -> { if (o instanceof String s) __yield(s); };
... stream.match(m) ...
```

And we can do the same with _pattern references_ to existing patterns that are compatible with the sole pattern in a SAP interface. As a special case, we can also support a conversion from type patterns to a compatible SAP type with an `instanceof` pattern reference (analogous to a `new` method reference):

```
objects.match(String::instanceof)
```

where `String::instanceof` means the same as the previous lambda example.
This means that APIs like `Stream` can abstract over conditional behavior as well as unconditional.

From brian.goetz at oracle.com Tue Jan 23 19:57:51 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 23 Jan 2024 14:57:51 -0500
Subject: Towards member patterns
In-Reply-To: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com>
Message-ID:

I was told that the formatting of the earlier version was borked, so re-sending, hopefully without any formatting this time....

# Towards member patterns

Time to check in on where things are in the bigger picture of patterns as class members. Note: while this document may have illustrative examples, you should not take that as a definitive statement of syntax, and Remi will not be commenting on the syntax at this time.

We've already dipped our toes in the water with _record patterns_. A record pattern looks like:

    case R(p1, p2, ... pn):

where `R` is a record type and `p1..pn` are nested patterns that are matched to its components. Because records are defined by their state description, we can automatically derive record patterns "for free", just as we derive record constructors, accessors, etc.

There are many other classes that would benefit from being deconstructible with patterns. To that end, we will generalize record patterns to _deconstruction patterns_, where any class can declare an explicit deconstruction pattern and participate in pattern matching like records do.

Deconstruction patterns are not the end of the user-declared pattern story. Just as some classes prefer to expose static factories rather than constructors, they will be able to expose corresponding static patterns. And there is also a role for "instance patterns" and "pattern objects" as well.

Looking only at record and deconstruction patterns, it might be tempting to think that patterns are "just" methods with multiple return. But this would be extrapolating from a special case.
Pattern matching is intrinsically _conditional_; the extraction of values from a target is conditioned on whether the target _matches_ the pattern. For the patterns we've seen so far -- type patterns and record patterns -- matching can be determined entirely by types. But more sophisticated patterns can also depend on other aspects of object state. For example, a pattern corresponding to the static factory `Optional::of` requires not only that the match candidate be of type `Optional`, but that the match candidate is an `Optional` that actually holds a value. Similarly, a pattern corresponding to a regular expression requires the match candidate to not only be a `String`, but to match the regular expression.

## The key intuition around patterns

A key capability of objects is _aggregation_; the combination of component values into a higher-level composite that incorporates those components. Java facilitates a variety of idioms for aggregation, including constructors, factories, builders, etc. The dual of aggregation is _destructuring_ or _decomposition_, which takes an aggregate and attempts to recover its "ingredients". However, Java's support for destructuring has historically been far more ad-hoc, largely limited to "write some getters". Pattern matching seeks to put destructuring on the same firm foundation as aggregation.

Deconstruction patterns (such as record patterns) are the dual of construction. If we construct an object:

    Object o = new Point(x, y);

we can deconstruct it with a deconstruction pattern:

    if (o instanceof Point(var x, var y)) { ... }

> Intuitively, this pattern match asks "could this object have come from
> invoking the constructor `new Point(x, y)` for some `x` and `y`, and if so,
> tell me what they are."
While not all patterns exist in direct correspondence to another constructor or method, this intuition that a pattern reconstructs the ingredients to an aggregation operation is central to the design; we'll explore the limitations of this intuition in greater detail later.

## Use cases for declared patterns

Before turning to how patterns fit into the object model, let's look at some of the potential use cases for patterns in APIs.

### Recovering construction arguments

Deconstruction patterns are the dual of constructors; where a constructor takes N arguments and aggregates them into an object, a deconstruction pattern takes an aggregate and decomposes it into its components. Constructors are unusual in that they are instance behavior (they have an implicit `this` argument), but are not inherited; deconstruction patterns are the same. For deconstruction patterns (but not for all instance patterns), the match candidate is always the receiver. Tentatively, we've decided that deconstruction patterns are always unconditional; that a deconstruction pattern for class `Foo` should match any instance of `Foo`. At the use site, deconstruction patterns use the same syntax as record patterns:

    case Point(int x, int y):

Just as constructors can be overloaded, so can deconstruction patterns. However, the reasons we might overload deconstruction patterns are slightly different than for constructors, and so it may well be the case that we end up with fewer overloads of deconstruction patterns than we do of constructors. Constructors often form _telescoping sets_, both for reasons of syntactic convenience at the use site (fewer arguments to specify) and to avoid brittleness (clients can let the class implementation pick the defaults rather than hard-coding them.)
This motivation is less pronounced for deconstruction patterns (unwanted bindings can be ignored with `_`), so it is quite possible that authors will choose to have one deconstruction pattern overload per telescoping constructor _set_, rather than one per constructor.

There is no requirement for deconstruction patterns to expose the exact same API as constructors, but we expect this will be common, at least for classes for which the construction process is effectively an aggregation operation on the constructor arguments.

### Recovering static factory arguments

Not all classes want to expose their constructors; sometimes classes prefer to expose static factories instead. In this case, the class should be able to expose corresponding static patterns as well.

For a class like `Optional`, which exposes factories `Optional::of` and `Optional::empty`, the object state incorporates not only the factory arguments, but which factory was chosen. Accordingly, it makes sense to deconstruct the object in the same way:

    switch (optional) {
        case Optional.of(var payload): ...
        case Optional.empty(): ...
    }

Such patterns are necessarily conditional, asking the Pattern Question: "could this `Optional` have come from the `Optional::of` factory, and if so, with what argument?" Static patterns, like static methods, lack a receiver, so `this` is not defined in the body of a static pattern. However, we will need a way to denote the match candidate, so its state can be examined by the pattern body.

Another feature of static methods is that they can be used to put a factory for a class `C` in _another_ class, whether one in the same maintenance domain (such as `Collections`) or in some other package. This feature is shared by static patterns.

### Conversions and queries

Another application for static patterns is the dual of static methods for conversions.
For a static method like `Integer::toString`, which converts an `int` to its `String` representation, a corresponding static pattern `Integer::toString` can ask the Pattern Question: "could this `String` have come from converting an integer to `String`, and if so, what integer".

Some groups of query methods in existing APIs are patterns in disguise. The class `java.lang.Class` has a pair of instance methods, `Class::isArray` and `Class::getComponentType`, that work together to determine if the `Class` describes an array type, and if so, provide its component type. This question is much better framed as a single pattern:

    case Class.arrayClass(var componentType):

The two existing methods are made more complicated by their relationship to each other; `Class::getComponentType` has a precondition (the `Class` must describe an array type) and therefore has to specify and implement what to do if the precondition fails, and the relationship between the methods is captured only in documentation. By combining them into a single pattern, it becomes impossible to misuse (because of the inherent conditionality of patterns) and easier to understand (because it can all be documented in one place.)

This hypothetical `Class::arrayClass` pattern also has a sensible dual as a factory method:

    static Class arrayClass(Class componentType)

which produces the array `Class` for the array type whose component type is provided. An API need not provide both directions of a conversion, but if it does, the two generally strengthen each other. This method/pattern pair could be either static or instance members, depending on API design choice.

Another form of "conversion" method / pattern pair, even though both types are the same, is "power of two". A `powerOfTwo` method takes an exponent and returns the resulting power of two; a `powerOfTwo` pattern asks if its match candidate is a power of two, and if so, binds the base-two logarithm.
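The `powerOfTwo` pairing just described can be sketched in current Java as a total method plus a conditional inverse. This is only an approximation under stated assumptions: the class `PowerOfTwoSketch` and the method `inversePowerOfTwo` are invented names, the exponent is restricted to the range an `int` can represent, and `OptionalInt` stands in for the conditional binding a real pattern would produce.

```java
import java.util.OptionalInt;

public class PowerOfTwoSketch {

    // Method direction: exponent in, power of two out.
    static int powerOfTwo(int exp) {
        if (exp < 0 || exp > 30) {
            throw new IllegalArgumentException("exponent out of int range: " + exp);
        }
        return 1 << exp;
    }

    // Pattern direction: "is this a power of two, and if so, what is its
    // base-two logarithm?"
    static OptionalInt inversePowerOfTwo(int candidate) {
        // A positive int is a power of two exactly when one bit is set.
        if (candidate > 0 && Integer.bitCount(candidate) == 1) {
            return OptionalInt.of(Integer.numberOfTrailingZeros(candidate));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(inversePowerOfTwo(powerOfTwo(5)));  // holds 5
        System.out.println(inversePowerOfTwo(12));             // empty
    }
}
```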

### Numeric conversions

As Project Valhalla gives us the ability to declare new numeric types, we will
want to be able to convert these new types to other numeric types.  For
unconditional conversions (such as widening half-float to float), an ordinary
method will suffice:

    float widen(HalfFloat f);

But the reverse is unlikely to be unconditional; narrowing conversions can fail
if the value cannot be represented in the narrower type.  This is better
represented as a pattern which asks the Pattern Question: "could this `float`
have come from widening a `HalfFloat`, and if so, tell me what `HalfFloat` that
is."  A widening conversion (or boxing conversion) is best represented by a
_pair_ of members, an ordinary method for the unconditional direction, and a
pattern for the conditional direction.

### Conditional extraction

Some operations, such as matching a string to a regular expression with capture
groups, are pattern matches in disguise.  We should be able to take a regular
expression R and match against it with `instanceof` or `switch`, binding capture
groups (using varargs patterns) if it matches.

## Member patterns in the object model

We currently have three kinds of executable class members: constructors, static
methods, and instance methods.  (Actually constructors are not members, but we
will leave this pedantic detail aside for now.)  As the above examples show,
each of these can be amenable to a dual member which asks the Pattern Question
about it.

Patterns are dual to constructors and methods in two ways: structurally and
semantically.  Structurally, patterns invert the relationship between inputs and
outputs: a method takes N arguments as input and produces a single result, and
the corresponding pattern takes a candidate result (the "match candidate") and
conditionally produces N bindings.  Semantically, patterns ask the Pattern
Question: could this result have originated by some invocation of the dual
operation.
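
The structural duality can be sketched in today's Java with a method for the
total direction and an `Optional`-returning method standing in for the pattern.
Since `HalfFloat` does not exist yet, this sketch assumes `byte` and `int` as
stand-ins for the narrow and wide types; `ByteConversion`, `widen` and `narrow`
are hypothetical names, not part of any proposal.

```java
import java.util.Optional;

final class ByteConversion {

    // Unconditional direction: every byte widens to an int.
    static int widen(byte b) {
        return b;
    }

    // Conditional direction, emulated with Optional: "could this int have come
    // from widening a byte, and if so, which byte?"
    static Optional<Byte> narrow(int candidate) {
        byte b = (byte) candidate;
        // The candidate matches only if widening the narrowed value gives it back.
        return widen(b) == candidate ? Optional.of(b) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(narrow(widen((byte) 42)));
        System.out.println(narrow(300));
    }
}
```

Here `widen` takes one argument and always produces a result, while `narrow`
takes a candidate result and only conditionally produces the single "binding",
which is the inversion of inputs and outputs described above for N = 1.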

### Patterns as inverse methods and constructors

One way to frame patterns in the object model is as _inverse constructors_ and
_inverse methods_.  For purposes of this document, I will use an illustrative
syntax that directly evokes this duality (but remember, we're not discussing
syntax now):

```
class Point {
    final int x, y;

    // Constructor
    Point(int x, int y) { ... }

    // Deconstruction pattern
    inverse Point(int x, int y) { ... }
}

class Optional<T> {
    // Static factories
    static<T> Optional<T> of(T t) { ... }
    static<T> Optional<T> empty() { ... }

    // Static patterns
    static<T> inverse Optional<T> of(T t) { ... }
    static<T> inverse Optional<T> empty() { ... }
}
```

`Point` has a constructor and an inverse constructor (deconstruction pattern)
for the external representation `(int x, int y)`; in an inverse constructor, the
binding list appears where the parameter list does in the constructor.
`Optional` has static factories and corresponding patterns for `empty` and
`of`.  As with inverse constructors, the binding list of a pattern appears in
the position that the parameters appear in a method declaration; additionally,
the _match candidate type_ appears in the position that the return value appears
in a method declaration.  In both cases, the declaration site and use site of
the pattern use the same syntax.

In the body of an inverse constructor or method, we need to be able to talk
about the match candidate.  In this model, the match candidate has a type
determined by the declaration (for an inverse constructor, the class; for an
inverse method, the type specified in the "return position" of the inverse
method declaration), and there is a predefined context variable (e.g., `that`)
that refers to the match candidate.  For inverse constructors, the receiver
(`this`) is aliased to the match candidate (`that`), but not necessarily so for
inverse methods.

### Do all methods potentially have inverses?

We've seen examples of constructors, static methods, and instance methods that
have sensible inverses, but not all methods do.  For example, methods that
operate primarily by side effects (such as mutative methods like setters or
`List::add`) are not suitable candidates for inverses.  Similarly, pure
functions that "co-mingle" their arguments (such as arithmetic operators) are
also not suitable candidates for inverses, because the ingredients to the
operation typically can't be recovered from the result (e.g., `4` could be the
result of `plus(2, 2)` or `plus(1, 3)`).

Intuitively, the methods that are invertible are the ones that are
_aggregative_.  The constructor of a (well-behaved) record is aggregative, since
all the information passed to the constructor is preserved in the result.
Factories like `Optional::of` are similarly aggregative, as are non-lossy
conversions such as widening or boxing conversions.

Ideally, an aggregation operation and its corresponding inverse form an
_embedding-projection pair_ between the aggregate and a component space.
Intuitively, an embedding-projection pair is an algebraic structure defined by a
pair of functions between two sets such that composing in one direction
(embed-then-project) is an identity, and composing in the other direction
(project-then-embed) is a well-behaved approximation.

### Conversions

Conversion methods are a frequent candidate for inversion.  We already have

    // Integer.java
    static String toString(int i) { ... }

to which the obvious inverse is

    static inverse String toString(int i) { ... }

and we can inspect a string to see if it is the string representation of an
integer with

    if (s instanceof Integer.toString(int i)) { ... }

This composes nicely with deconstruction patterns; if we have a `Box<String>`
and want to ask whether the contained string is really the string representation
of an integer, we can ask:

    case Box(Integer.toString(int i)):

which conveniently looks just like the composition of constructors or factories
used to create such an instance (`new Box(Integer.toString(3))`).

When it comes to user-definable numeric conversions, the most likely strategy
involves combining related operators in a single _witness_ object.  For example,
numeric conversion might be modeled as:

```
interface NumericConversion<FROM, TO> {
    TO convert(FROM from);
    inverse TO convert(FROM from);
}
```

which reflects the fact that conversion is total in one direction (widening,
boxing) and conditional in the other (narrowing, unboxing.)

### Regular expression matching

Regular expressions are a form of ad-hoc pattern; a given string might match a
given regex, or not, and if it does, it might produce multiple bindings (the
capture groups.)  It would be nice to be able to express regular expression
matches as ordinary pattern matches.

Conveniently, we already have an object representation of regular expressions,
`java.util.regex.Pattern`, which is an ideal place to put an instance pattern:

```
// varargs pattern
public inverse String match(String... groups) {
    Matcher m = matcher(that);    // *that* is the match candidate
    if (m.matches())              // receiver for matcher() is the Pattern
        // capture groups are numbered 1..groupCount(), inclusive
        __yield IntStream.rangeClosed(1, m.groupCount())
                         .mapToObj(m::group)
                         .toArray(String[]::new);
}
```

And now, we want to express "does string s match any of these regular
expressions":

```
static final Pattern As = Pattern.compile("([aA]*)");
static final Pattern Bs = Pattern.compile("([bB]*)");
static final Pattern Cs = Pattern.compile("([cC]*)");

...

switch (aString) {
    case As.match(String as) -> ...
    case Bs.match(String bs) -> ...
    case Cs.match(String cs) -> ...
    ...
}
```

Essentially, `j.u.r.Pattern` becomes a _pattern object_, where the state of the
object is used to determine whether or not it matches any given input.
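
For comparison, the same match can be approximated in today's Java, without any
new syntax, by a helper that returns the capture groups only when the whole
candidate matches.  This is a sketch under stated assumptions:
`RegexMatchSketch`, `match` and `classify` are hypothetical names, and
`Optional<String[]>` stands in for the conditional bindings of a declared
pattern.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

final class RegexMatchSketch {

    static final Pattern AS = Pattern.compile("([aA]*)");
    static final Pattern BS = Pattern.compile("([bB]*)");

    // Hypothetical helper: the capture groups if the whole candidate matches
    // the regex, otherwise empty.
    static Optional<String[]> match(Pattern pattern, String candidate) {
        Matcher m = pattern.matcher(candidate);
        if (!m.matches()) {
            return Optional.empty();
        }
        // Capture groups are numbered 1..groupCount(), inclusive.
        String[] groups = IntStream.rangeClosed(1, m.groupCount())
                                   .mapToObj(m::group)
                                   .toArray(String[]::new);
        return Optional.of(groups);
    }

    // Tries each "pattern object" in turn, like the cases of the switch above.
    static String classify(String s) {
        Optional<String[]> as = match(AS, s);
        if (as.isPresent()) {
            return "as: " + as.get()[0];
        }
        Optional<String[]> bs = match(BS, s);
        if (bs.isPresent()) {
            return "bs: " + bs.get()[0];
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(classify("aAa"));
        System.out.println(classify("bb"));
        System.out.println(classify("abc"));
    }
}
```

The chain of `isPresent()` checks is the boilerplate that an instance pattern on
`Pattern` would absorb into `switch`.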
(There is nothing stopping a class from having multiple patterns, just as it can
have multiple methods.)

## Pattern resolution

When we invoke a method, sometimes we are able to refer to the method with an
_unqualified_ name (e.g., `m(3)`), and sometimes the method must be _qualified_
with a type name, package name, or a receiver object.  The same is true for
declared patterns.

Constructors for classes that are in the same package, or have been imported,
can be referred to with an unqualified name; constructors can also be qualified
with a package name.  The same is true for deconstruction patterns:

```
case Foo(int x, int y):         // unqualified
case com.foo.Bar(int x, int y): // qualified by package
```

Static methods that are declared in the current class or an enclosing class, or
are statically imported, can be referred to with an unqualified name; static
methods can also be qualified with a type name.  The same is true for static
patterns:

```
case powerOfTwo(int exp):  // unqualified
case Optional.of(var e):   // qualified by class
```

Instance methods invoked on the current object can be referred to with an
unqualified name; instance methods can also be qualified by a receiver object.
The same is true for instance patterns:

```
case match(String s):    // unqualified
case As.match(String s): // qualified by receiver
```

In a qualified pattern `x.y`, `x` might be a package name, a class name, or an
(effectively final) receiver variable; we use the same rules for choosing how to
interpret a qualifier for patterns as we do for method invocations.

## Benefits of explicit duality

Declaring method-pattern pairs whose structure and name are the same yields many
benefits.  It means that we take things apart using the same abstractions used
to put them together, which makes code more readable and less error-prone.
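
Record patterns, which Java 21 already supports, give a small taste of this
benefit.  In the sketch below (the records `Point` and `Box` and the class
`TakeApartLikeYouBuild` are made up for illustration), the nested pattern
`Box(Point(int x, int y))` has the same shape as the expression
`new Box(new Point(1, 2))` that built the value.

```java
record Point(int x, int y) {}

record Box(Object content) {}

final class TakeApartLikeYouBuild {

    // Deconstructs with the same nesting that was used for construction.
    static String describe(Object o) {
        if (o instanceof Box(Point(int x, int y))) {
            return "boxed point " + x + "," + y;
        }
        if (o instanceof Box(String s)) {
            return "boxed string " + s;
        }
        return "something else";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Box(new Point(1, 2))));
        System.out.println(describe(new Box("hello")));
        System.out.println(describe(42));
    }
}
```

Declared patterns would extend the same symmetry from records to arbitrary
classes, factories and conversions.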

Referring to an _inverse pair_ of operations by a single name is simpler than
having separate names for each direction; not only don't we need to come up with
a name for the other direction, we also don't need to teach clients that "these
two names are inverses", because the inverses have the same name already.  What
we know about the method `Integer::toString` immediately carries over to its
inverse.

Further, thinking about a method-pattern pair provides a normalizing force
toward actually ensuring the two are inverses; if we just had two related
methods `xToY` and `yToX`, they might diverge subtly because the connection
between the two members is not very strong.

Finally, this gives the language permission to treat the _pair_ of members as a
thing in some cases, such as the use of ctor-dtor pairs in "withers" or
serialization.

The explicit duality takes a little time to get used to.  We have many years of
experience of naming a method for its directionality, so people's first reaction
is often "the pattern should be called `Integer.fromString`, not
`Integer.toString`".  So people will initially bristle at giving both directions
the same name, especially when one implies a directionality such as `toString`.
(In these cases, we can fall back on a convention that says that we should name
it for the total direction.)

## Pattern lambdas, pattern objects, pattern references

Interfaces with a single abstract method (SAM) are called _functional
interfaces_ and we support a conversion (historically called SAM conversion)
from lambdas to functional interfaces.  Interfaces with a single abstract
pattern can benefit from a similar conversion (call this "SAP" conversion.)

In the early days of Streams, people complained about processing a stream using
instanceof and cast:

```
Stream<Object> objects = ...
Stream<String> strings = objects.filter(x -> x instanceof String)
                                .map(x -> (String) x);
```

This locution is disappointing both for its verbosity (saying the same thing in
two different ways) and its efficiency (doing the same work basically twice.)
Later, it became possible to slightly simplify this using `mapMulti`:

```
objects.mapMulti((x, sink) -> { if (x instanceof String s) sink.accept(s); })
```

But, ultimately this stream pipeline is a pattern match; we want to match the
elements to the pattern `String s`, and get a stream of the matched string
bindings.  We are now in a position to expose this more directly.  Suppose we
had the following SAP interface:

```
interface Match {
    inverse U match(T t);
}
```

then `Stream` could expose a `match` method:

```
Stream match(Match pattern);
```

We can SAP-convert a lambda whose yielded bindings are compatible with the sole
abstract pattern in the SAP interface:

```
Match m = o -> { if (o instanceof String s) __yield(s); };
... stream.match(m) ...
```

And we can do the same with _pattern references_ to existing patterns that are
compatible with the sole pattern in a SAP interface.  As a special case, we can
also support a conversion from type patterns to a compatible SAP type with an
`instanceof` pattern reference (analogous to a `new` method reference):

```
objects.match(String::instanceof)
```

where `String::instanceof` means the same as the previous lambda example.  This
means that APIs like `Stream` can abstract over conditional behavior as well as
unconditional.

From forax at univ-mlv.fr Wed Jan 24 10:38:03 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 24 Jan 2024 11:38:03 +0100 (CET)
Subject: Towards member patterns
In-Reply-To:
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com>
Message-ID: <969518960.110944933.1706092683452.JavaMail.zimbra@univ-eiffel.fr>

Hello,
I agree until the section 'Recovering static factory arguments', because at that
point, it's the tail wagging the dog.
In this section you introduce a new syntax with little justification ("it makes
sense"), which would not be a big problem if it were not your way of justifying
that pattern methods are static; they are static only because the syntax you
have chosen looks like static calls.

As you said in the sections above, case Point(int x, int y) is two things: first
an instanceof of Point, followed by the destructuring of a Point, and those two
operations are co-mingled into one syntax.

We can try to do the same with pattern methods, i.e. having one syntax that
expresses an instanceof followed by a call, on the result of that instanceof, to
a method that extracts the bindings.

I think that choosing a syntax at that point is premature given the semantics is
still in flux, especially if that syntax is used as a justification to say that
pattern methods are static (*).  At that point, it's a trap.

As you know, you can use static methods instead of instance methods but you are
losing the polymorphism.  In the case of pattern methods, it's an issue because
it means that if you have a pattern method on an interface, an implementation
cannot override it.

Here is an example using the syntax you propose: let's say I want to do the
inverse of Map.entry(); if I write an inverse pattern method inside Map, then
the only way to write it is to use the getters getKey() and getValue().

  interface Map {
    static Entry entry(K key, V value) { ... }

    static inverse Entry entry(K key, V value) {
      // here, we are morons, we have to use getters :(
      __yield (that.getKey(), that.getValue());
    }
  }

So a thread-safe implementation of Map cannot write an implementation of
Map.Entry that destructures itself under a lock.

I think as an exercise, you should try to use a syntax that does not look like a
static method call for the pattern (e.g. case Point p && match p.polar(double
rho, double theta) ?)
and ask yourself if it still make sense for the pattern method to be always static and for the concept of inverse method to exist. R?mi * I know it's not fully static, it can be an instance method if the pattern is itself used inside an instance method (just to add side effects in the mix). ----- Original Message ----- > From: "Brian Goetz" > To: "amber-spec-experts" > Sent: Tuesday, January 23, 2024 8:57:51 PM > Subject: Re: Towards member patterns > I was told that the formatting of the earlier version was borked, so > re-sending, hopefully without any formatting this time.... > > # Towards member patterns > > Time to check in on where things are in the bigger picture of patterns > as class > members.? Note: while this document may have illustrative examples, you > should > not take that as a definitive statement of syntax, and Remi will not be > commenting on the syntax at this time. > > We've already dipped our toes in the water with _record patterns_. A record > pattern looks like: > > ??? case R(p1, p2, ... pn): > > where `R` is a record type and `p1..pn` are nested patterns that are > matched to > its components.? Because records are defined by their state description, > we can > automatically derive record patterns "for free", just as we derive record > constructors, accessors, etc. > > There are many other classes that would benefit from being > deconstructible with > patterns.? To that end, we will generalize record patterns to > _deconstruction > patterns_, where any class can declare an explicit deconstruction > pattern and > participate in pattern matching like records do. > > Deconstruction patterns are not the end of the user-declared pattern > story. Just > as some classes prefer to expose static factories rather than > constructors, they > will be able to expose corresponding static patterns.? And there is also > a role > for "instance patterns" and "pattern objects" as well. 
> > Looking only at record and deconstruction patterns, it might be tempting to > think that patterns are "just" methods with multiple return.?? But this > would be > extrapolating from a special case.? Pattern matching is intrinsically > _conditional_; the extraction of values from a target is conditioned on > whether > the target _matches_ the pattern.? For the patterns we've seen so far -- > type > patterns and record patterns -- matching can be determined entirely by > types. > But more sophisticated patterns can also depend on other aspects of object > state.? For example, a pattern corresponding to the static factory > `Optional::of` requires not only that the match candidate be of type > `Optional`, > but that the match candidate is an `Optional` that actually holds a value. > Similarly, a pattern corresponding a regular expression requires the match > candidate to not only be a `String`, but to match the regular expression. > > ## The key intuition around patterns > > A key capability of objects is _aggregation_; the combination of component > values into a higher-level composite that incorporates those > components.? Java > facilitates a variety of idioms for aggregation, including constructors, > factories, builders, etc.? The dual of aggregation is _destructuring_ or > _decomposition_, which takes an aggregate and attempts to recover its > "ingredients".? However, Java's support for destructuring has > historically been > far more ad-hoc, largely limited to "write some getters".? Pattern matching > seeks to put destructuring on the same firm foundation as aggregation. > > Deconstruction patterns (such as record patterns) are the dual of > construction. > If we construct an object: > > ??? Object o = new Point(x, y); > > we can deconstruct it with a deconstruction pattern: > > ??? if (o instanceof Point(var x, var y)) { ... 
} > > > Intuitively, this pattern match asks "could this object have come from > > invoking the constructor `new Point(x, y)` for some `x` and `y`, and > if so, > > tell me what they are." > > While not all patterns exist in direct correspondence to another > constructor or > method, this intuition that a pattern reconstructs the ingredients to an > aggregation operation is central to the design; we'll explore the > limitations of > this intuition in greater detail later. > > ## Use cases for declared patterns > > Before turning to how patterns fit into the object model, let's look at > some of > the potential use cases for patterns in APIs. > > ### Recovering construction arguments > > Deconstruction patterns are the dual of constructors; where a > constructor takes > N arguments and aggregates them into an object, a deconstruction pattern > takes > an aggregate and decomposes it into its components.? Constructors are > unusual in > that they are instance behavior (they have an implicit `this` argument), > but are > not inherited; deconstruction patterns are the same.? For deconstruction > patterns (but not for all instance patterns), the match candidate is > always the > receiver.? Tentatively, we've decided that deconstruction patterns are > always > unconditional; that a deconstruction pattern for class `Foo` should > match any > instance of `Foo`.? At the use site, deconstruction patterns use the > same syntax > as record patterns: > > ??? case Point(int x, int y): > > Just as constructors can be overloaded, so can deconstruction patterns. > However, > the reasons we might overload deconstruction patterns are slightly different > than for constructors, and so it may well be the case that we end up > with fewer > overloads of deconstruction patterns than we do of constructors. 
> Constructors > often form _telescoping sets_, both for reasons of syntactic convenience > at the > use site (fewer arguments to specify) and to avoid brittleness (clients > can let > the class implementation pick the defaults rather than hard-coding > them.)? This > motivation is less pronounced for deconstruction patterns (unwanted > bindings can > be ignored with `_`), so it is quite possible that authors will choose > to have > one deconstruction pattern overload per telescoping constructor _set_, > rather > than one per constructor. > > There is no requirement for deconstruction patterns to expose the exact > same API > as constructors, but we expect this will be common, at least for classes for > which the construction process is effectively an aggregation operation > on the > constructor arguments. > > ### Recovering static factory arguments > > Not all classes want to expose their constructors; sometimes classes > prefer to > expose static factories instead.? In this case, the class should be able to > expose corresponding static patterns as well. > > For a class like `Optional`, which exposes factories `Optional::of` and > `Optional::empty`, the object state incorporates not only the factory > arguments, > but which factory was chosen.? Accordingly, it makes sense to > deconstruct the > object in the same way: > > ??? switch (optional) { > ??????? case Optional.of(var payload): ... > ??????? case Optional.empty(): ... > ??? } > > Such patterns are necessarily conditional, asking the Pattern Question: > "could > this `Optional` have come from the `Optional::of` factory, and if so, > with what > argument?"? Static patterns, like static methods, lack a receiver, so > `this` is > not defined in the body of a static pattern.? However, we will need a way to > denote the match candidate, so its state can be examined by the pattern > body. 
> > Another feature of static methods is that they can be used to put a > factory for > a class `C` in _another_ class, whether one in the same maintenance > domain (such > as the `Collections`) or in some other package.? This feature is shared by > static patterns. > > ### Conversions and queries > > Another application for static patterns is the dual of static methods for > conversions.? For a static method like `Integer::toString`, which > converts an > `int` to its `String` representation, a corresponding static pattern > `Integer::toString` can ask the Pattern Question: "could this `String` > have come > from converting an integer to `String`, and if so, what integer". > > Some groups of query methods in existing APIs are patterns in disguise.? The > class `java.lang.Class` has a pair of instance methods, `Class::isArray` and > `Class::getComponentType`, that work together to determine if the `Class` > describes an array type, and if so, provide its component type. This > question > is much better framed as a single pattern: > > ??? case Class.arrayClass(var componentType): > > The two existing methods are made more complicated by their relationship > to each > other; `Class::getComponentType` has a precondition (the `Class` must > describe > an array type) and therefore has to specify and implement what to do if the > precondition fails, and the relationship between the methods is captured > only in > documentation.? By combining them into a single pattern, it become > impossible to > misuse (because of the inherent conditionality of patterns) and easier to > understand (because it can all be documented in one place.) > > This hypothetical `Class::arrayClass` pattern also has a sensible dual as a > factory method: > > ??? static Class arrayClass(Class componentType) > > which produces the array `Class` for the array type whose component type is > provided. 
An API need not provide both directions of a conversion, but if it > does, the two generally strengthen each other.? This method/pattern pair > could > be either static or instance members, depending on API design choice. > > Another form of "conversion" method / pattern pair, even though both > types are > the same, is "power of two".? A `powerOfTwo` method takes an exponent and > returns the resulting power of two; a `powerOfTwo` pattern asks if its match > candidate is a power of two, and if so, binds the base-two logarithm. > > ### Numeric conversions > > As Project Valhalla gives us the ability to declare new numeric types, > we will > want to be able to convert these new types to other numeric types. For > unconditional conversions (such as widening half-float to float), an > ordinary > method will suffice: > > ??? float widen(HalfFloat f); > > But the reverse is unlikely to be unconditional; narrowing conversions > can fail > if the value cannot be represented in the narrower type. This is better > represented as a pattern which asks the Pattern Question: "could this > `float` > have come from widening a `HalfFloat`, and if so, tell me what > `HalfFloat` that > is."? A widening conversion (or boxing conversion) is best represented by a > _pair_ of members, an ordinary method for the unconditional direction, and a > pattern for the conditional direction. > > ### Conditional extraction > > Some operations, such as matching a string to a regular expression with > capture > groups, are pattern matches in disguise.? We should be able to take a > regular > expression R and match against it with `instanceof` or `switch`, binding > capture > groups (using varargs patterns) if it matches. > > ## Member patterns in the object model > > We currently have three kinds of executable class members: constructors, > static > methods, and instance methods.? (Actually constructors are not members, > but we > will leave this pedantic detail aside for now.)? 
As the above examples show, > each of these can be amenable to a dual member which asks the Pattern > Question > about it. > > Patterns are dual to constructors and methods in two ways: structurally and > semantically.? Structurally, patterns invert the relationship between > inputs and > outputs: a method takes N arguments as input and produces a single > result, and > the corresponding pattern takes a candidate result (the "match > candidate") and > conditionally produces N bindings.? Semantically, patterns ask the Pattern > Question: could this result have originated by some invocation of the dual > operation. > > ### Patterns as inverse methods and constructors > > One way to frame patterns in the object model is as _inverse > constructors_ and > _inverse methods_.? For purposes of this document, I will use an > illustrative > syntax that directly evokes this duality (but remember, we're not discussing > syntax now): > > ``` > class Point { > ??? final int x, y; > > ??? // Constructor > ??? Point(int x, int y) { ... } > > ??? // Deconstruction pattern > ??? inverse Point(int x, int y) { ... } > } > > class Optional { > ??? // Static factories > ??? static Optional of(T t) { ... } > ??? static Optional empty() { ... } > > ??? // Static patterns > ??? static inverse Optional of(T t) { ... } > ??? static inverse Optional empty() { ... } > } > ``` > > `Point` has a constructor an an inverse constructor (deconstruction > pattern) for > the external representation `(int x, int y)`; in an inverse constructor, the > binding list appears where the parameter list does in the constructor. > `Optional` has static factories and corresponding patterns for > `empty` and > `of`.? As with inverse constructors, the binding list of a pattern > appears in > the position that the parameters appear in a method declaration; > additionally, > the _match candidate type_ appears in the position that the return value > appears > in a method declaration.? 
In both cases, the declaration site and use > site of > the pattern uses the same syntax. > > In the body of an inverse constructor or method, we need to be able to talk > about the match candidate.? In this model, the match candidate has a type > determined by the declaration (for an inverse constructor, the class; for an > inverse method, the type specified in the "return position" of the inverse > method declaration), and there is a predefined context variable (e.g., > `that`) > that refers to the match candidate.? For inverse constructors, the receiver > (`this`) is aliased to the match candidate (`that`), but not necessarily > so for > inverse methods. > > ### Do all methods potentially have inverses? > > We've seen examples of constructors, static methods, and instance > methods that > have sensible inverses, but not all methods do.? For example, methods that > operate primarily by side effects (such as mutative methods like setters or > `List::add`) are not suitable candidates for inverses.? Similarly, pure > functions that "co-mingle" their arguments (such as arithmetic > operators) are > also not suitable candidates for inverses, because the ingredients to the > operation typically can't be recovered from the result (i.e., `4` could > be the > result of `plus(2, 2)` or `plus(1, 3)`). > > Intuitively, the methods that are invertible are the ones that are > _aggregative_.? The constructor of a (well-behaved) record is > aggregative, since > all the information passed to the constructor is preserved in the result. > Factories like `Optional::of` are similarly aggregative, as are non-lossy > conversions such as widening or boxing conversions. > > Ideally, an aggregation operation and its corresponding inverse form an > _embedding projection pair_ between the aggregate and a component space. 
> Intuitively, an embedding-projection pair is an algebraic structure > defined by a > pair of functions between two sets such that composing in one direction > (embed-then-project) is an identity, and composing in the other direction > (project-then-embed) is a well-behaved approximation. > > ### Conversions > > Conversion methods are a frequent candidate for inversion.? We already have > > ??? // Integer.java > ??? static String toString(int i) { ... } > > to which the obvious inverse is > > ??? static inverse String toString(int i) { ... } > > and we can inspect a string to see if it is the string representation of > an integer with > > ??? if (s instanceof Integer.toString(int i)) { ... } > > This composes nicely with deconstruction patterns; if we have a > `Box` > and want to ask whether the contained string is really the string > representation > of an integer, we can ask: > > ??? case Box(Integer.toString(int i)): > > which conveniently looks just like the composition of constructors or > factories > used to create such an instance (`new Box(Integer.toString(3))`). > > When it comes to user-definable numeric conversions, the most likely > strategy > involves combining related operators in a single _witness_ object. For > example, > numeric conversion might be modeled as: > > ``` > interface NumericConversion { > ??? TO convert(FROM from); > ??? inverse TO convert(FROM from); > } > ``` > > which reflects the fact that conversion is total in one direction (widening, > boxing) and conditional in the other (narrowing, unboxing.) > > ### Regular expression matching > > Regular expressions are a form of ad-hoc pattern; a given string might > match a > given regex, or not, and if it does, it might product multiple bindings (the > capture groups.)? It would be nice to be able to express regular expression > matches as ordinary pattern matches. > > Conveniently, we already have an object representation of regular > expressions -- > `java.util.Pattern`.? 
> This is an ideal place to put an instance pattern:
>
> ```
> // varargs pattern
> public inverse String match(String... groups) {
>     Matcher m = matcher(that);    // *that* is the match candidate
>     if (m.matches())              // receiver for matcher() is the Pattern
>         __yield IntStream.rangeClosed(1, m.groupCount())
>                          .mapToObj(m::group)
>                          .toArray(String[]::new);
> }
> ```
>
> And now, we want to express "does string s match any of these regular
> expressions":
>
> ```
> static final Pattern As = Pattern.compile("([aA]*)");
> static final Pattern Bs = Pattern.compile("([bB]*)");
> static final Pattern Cs = Pattern.compile("([cC]*)");
>
> ...
>
> switch (aString) {
>     case As.match(String as) -> ...
>     case Bs.match(String bs) -> ...
>     case Cs.match(String cs) -> ...
>     ...
> }
> ```
>
> Essentially, `j.u.r.Pattern` becomes a _pattern object_, where the state of
> the object is used to determine whether or not it matches any given input.
> (There is nothing stopping a class from having multiple patterns, just as
> it can have multiple methods.)
>
> ## Pattern resolution
>
> When we invoke a method, sometimes we are able to refer to the method with
> an _unqualified_ name (e.g., `m(3)`), and sometimes the method must be
> _qualified_ with a type name, package name, or a receiver object.  The same
> is true for declared patterns.
>
> Constructors for classes that are in the same package, or have been
> imported, can be referred to with an unqualified name; constructors can
> also be qualified with a package name.  The same is true for deconstruction
> patterns:
>
> ```
> case Foo(int x, int y):         // unqualified
> case com.foo.Bar(int x, int y): // qualified by package
> ```
>
> Static methods that are declared in the current class or an enclosing
> class, or are statically imported, can be referred to with an unqualified
> name; static methods can also be qualified with a type name.  The same is
> true for static patterns:
>
> ```
> case powerOfTwo(int exp):  // unqualified
> case Optional.of(var e):   // qualified by class
> ```
>
> Instance methods invoked on the current object can be referred to with an
> unqualified name; instance methods can also be qualified by a receiver
> object.  The same is true for instance patterns:
>
> ```
> case match(String s):    // unqualified
> case As.match(String s): // qualified by receiver
> ```
>
> In a qualified pattern `x.y`, `x` might be a package name, a class name, or
> a (effectively final) receiver variable; we use the same rules for choosing
> how to interpret a qualifier for patterns as we do for method invocations.
>
> ## Benefits of explicit duality
>
> Declaring method-pattern pairs whose structure and name are the same yields
> many benefits.  It means that we take things apart using the same
> abstractions used to put them together, which makes code more readable and
> less error-prone.
>
> Referring to an _inverse pair_ of operations by a single name is simpler
> than having separate names for each direction; not only don't we need to
> come up with a name for the other direction, we also don't need to teach
> clients that "these two names are inverses", because the inverses have the
> same name already.  What we know about the method `Integer::toString`
> immediately carries over to its inverse.
>
> Further, thinking about a method-pattern pair provides a normalizing force
> to actually ensuring the two are inverses; if we just had two related
> methods `xToY` and `yToX`, they might diverge subtly because the connection
> between the two members is not very strong.
>
> Finally, this gives the language permission to treat the _pair_ of members
> as a thing in some cases, such as the use of ctor-dtor pairs in "withers"
> or serialization.
>
> The explicit duality takes a little time to get used to.  We have many years
> of experience of naming a method for its directionality, so people's first
> reaction is often "the pattern should be called `Integer.fromString`, not
> `Integer.toString`".  So people will initially bristle at giving both
> directions the same name, especially when one implies a directionality such
> as `toString`.  (In these cases, we can fall back on a convention that says
> that we should name it for the total direction.)
>
> ## Pattern lambdas, pattern objects, pattern references
>
> Interfaces with a single abstract method (SAM) are called _functional
> interfaces_ and we support a conversion (historically called SAM
> conversion) from lambdas to functional interfaces.  Interfaces with a single
> abstract pattern can benefit from a similar conversion (call this "SAP"
> conversion).
>
> In the early days of Streams, people complained about processing a stream
> using instanceof and cast:
>
> ```
> Stream<Object> objects = ...
> Stream<String> strings = objects.filter(x -> x instanceof String)
>                                 .map(x -> (String) x);
> ```
>
> This locution is disappointing both for its verbosity (saying the same thing
> in two different ways) and its efficiency (doing the same work basically
> twice).  Later, it became possible to slightly simplify this using
> `mapMulti`:
>
> ```
> objects.mapMulti((x, sink) -> { if (x instanceof String s) sink.accept(s); })
> ```
>
> But, ultimately this stream pipeline is a pattern match; we want to match
> the elements to the pattern `String s`, and get a stream of the matched
> string bindings.  We are now in a position to expose this more directly.
> Suppose we had the following SAP interface:
>
> ```
> interface Match<T, U> {
>     inverse U match(T t);
> }
> ```
>
> then `Stream` could expose a `match` method:
>
> ```
> Stream match(Match pattern);
> ```
>
> We can SAP-convert a lambda whose yielded bindings are compatible with the
> sole abstract pattern in the SAP interface:
>
> ```
> Match m = o -> { if (o instanceof String s) __yield(s); };
> ... stream.match(m) ...
> ```
>
> And we can do the same with _pattern references_ to existing patterns that
> are compatible with the sole pattern in a SAP interface.  As a special case,
> we can also support a conversion from type patterns to a compatible SAP
> type with an `instanceof` pattern reference (analogous to a `new` method
> reference):
>
> ```
> objects.match(String::instanceof)
> ```
>
> where `String::instanceof` means the same as the previous lambda example.
> This means that APIs like `Stream` can abstract over conditional behavior as
> well as unconditional.

From brian.goetz at oracle.com Wed Jan 24 14:55:28 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 24 Jan 2024 09:55:28 -0500
Subject: Towards member patterns
In-Reply-To: <969518960.110944933.1706092683452.JavaMail.zimbra@univ-eiffel.fr>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com>
 <969518960.110944933.1706092683452.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <7e1bff48-5029-426a-8c40-633434bf10b5@oracle.com>

> Hello,
> I agree until the section 'Recovering static factory arguments', because at
> that point, it's the tail wagging the dog.

I'm not sure which is tail and which is dog here, but it sure sounds like you
are saying "the entire premise of this exercise is flawed, so we should throw
all this work in the trash."  (After all, you only got two paragraphs into
the motivation before declaring divergence.)  I hope you're not saying that,
but let's try to be more clear about what you are saying?

> In this section you introduce a new syntax with little justification ("it
> makes sense") which should not be a big problem if it was not your way to
> justify the introduction of pattern method being static, which are static
> only because the syntax you have chosen looks like static calls.

If you're talking about the use-site syntax, we've been using these examples
for years, so it is a little surprising to hear an objection now.  But let's
back up and go over the motivations for this broader effort.

Java offers many means for aggregation, both formal (constructors) and
informal (factories, builders, etc), and may offer more in the future (e.g.,
collection literals).  Further, these mechanisms of aggregation compose
freely; I can say

    Optional<Shape> os = Optional.of(Shape.redBall(1));

without Shape and Optional having to coordinate in any way.

Java offers no formal means for disaggregation, instead dumping it in the
user's lap to write ad-hoc APIs such as getters.  To ask "does os contain a
red ball of radius 1", we would currently have to:

    Shape s = os.orElse(null);
    boolean isRedUnitBall = s != null
                            && s.isBall()
                            && (s.color() == RED)
                            && s.size() == 1;
    if (isRedUnitBall) { ... }

The flaws here are many: I have to use a different API to take apart Optional
than to take apart Shape, the two do not compose, and the take-apart code
looks nothing like the put-together code.  This adds cognitive friction both
when reading and writing, and is error-prone.

So, goals:

 - It should be as easy to take apart as to put together
 - It should be as easy to compose take-apart operations as it is to compose
   put-together operations
 - Taking apart should _look like_ putting together

I don't think you are really objecting to this last bullet, but it sure
sounds like you are.
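
The put-together versus take-apart contrast described in this message can be
tried with today's Java. The following is only a minimal sketch: the thread
shows no definition of `Shape`, so the record, the `Color` enum, and the class
name `TakeApartDemo` are assumptions made up for illustration; only the calls
`Shape.redBall(1)`, `isBall()`, `color()` and `size()` come from the message.

```java
import java.util.Optional;

public class TakeApartDemo {
    // Hypothetical types: the thread never defines Shape or its colors.
    enum Color { RED, BLUE }

    record Shape(boolean isBall, Color color, int size) {
        // Factory named after the one used in the message.
        static Shape redBall(int size) { return new Shape(true, Color.RED, size); }
    }

    // Ad-hoc disaggregation: one API for Optional, another for Shape.
    static boolean isRedUnitBall(Optional<Shape> os) {
        Shape s = os.orElse(null);
        return s != null
            && s.isBall()
            && s.color() == Color.RED
            && s.size() == 1;
    }

    public static void main(String[] args) {
        // Putting together: factories compose by simple nesting.
        Optional<Shape> os = Optional.of(Shape.redBall(1));

        // Taking apart: looks nothing like the line above.
        System.out.println(isRedUnitBall(os));                              // true
        System.out.println(isRedUnitBall(Optional.of(Shape.redBall(2))));   // false
        System.out.println(isRedUnitBall(Optional.empty()));                // false
    }
}
```

The aggregation is one nested expression, while the query needs a null check
plus three accessor calls, which is the asymmetry the goals above aim to remove.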

To address your question of "why do we like this use-site pattern syntax for
static patterns", the answer has been stated over and over: taking apart
should look like putting together, because taking apart asks the Pattern
Question about a particular form of putting together.

    case Point(int x, int y):

asks "could this object have come from `new Point(x, y)` for some (x, y)."
But since not all classes expose constructors, we should be able to ask the
same question about a corresponding factory form as well (and questions about
de-constructor and questions about de-factory *must* compose with nesting.)
The composed:

    case Optional.of(Shape.redBall(1)):

asks the Pattern Question again: Could this object have come from
`Optional.of(Shape.redBall(1))`.

> I think that choosing a syntax at that point is premature given the
> semantics is still in flux, especially if that syntax is used as a
> justification to say that pattern methods are static (*). At that point,
> it's a trap.

Whoa, did you *read* the document?  There's nothing inherently static about
"pattern methods" (if we want to call them that), any more than there is
anything inherently static about methods.  I think you should go back and
read the whole document again, and if you still think that pattern methods
are "static", let's address that conception directly before proceeding?

> In case of method pattern, it's an issue because it means that if you have
> a pattern method on an interface, an implementation cannot override it.

Incorrect (and I think this should have been clear both from this document
and from the earlier "Patterns in the Object Model" document).  Just as an
interface can declare both static and instance methods, it can have both
static and instance patterns.  Instance patterns are overridable, can be
abstract, etc.

> Here is an example using the syntax you propose, let's say I want to do the
> inverse of Map.entry(), if I write an inverse pattern method inside Map then
> the only way to write it is to use the getters getKey() and getValue().

We are getting far ahead of ourselves, but if you wanted this behavior to be
overridable, but still added retroactively, you could write a default
pattern.  But given how many layers of mis-assumption we have here, let's
prune this particular branch of discussion until it is clear you understand
what is being proposed.

From gavin.bierman at oracle.com Wed Jan 24 17:16:33 2024
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Wed, 24 Jan 2024 17:16:33 +0000
Subject: Draft JEP: Derived Record Creation (Preview)
Message-ID: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>

Dear Spec Experts,

We have discussed on this list a new `with` expression form to derive new
record values from existing record values. A draft JEP for this feature is now
available:

https://bugs.openjdk.org/browse/JDK-8321133

Please take a look at this new JEP and give us your feedback (either on this
list or directly to me).

Thanks,
Gavin

From forax at univ-mlv.fr Wed Jan 24 20:19:38 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 24 Jan 2024 21:19:38 +0100 (CET)
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
References: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
Message-ID: <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>

Hello Gavin,
nice job, some small remarks.

> "The derived instance creation expression ... is logically equivalent to the
> following switch expression".

This is not stricto sensu true because the switch expression exhaustiveness
allows remainder. For example, the switch expression allows "oldLoc" to be a
sealed super type of Point, but not the derived instance creation.
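
The remark about the switch accepting a sealed supertype can be checked with
released Java. This is a sketch assuming Java 21 or later (pattern matching
for switch and record patterns); `Loc`, `Point`, `doubled` and the class name
are hypothetical names chosen for illustration, not taken from the draft JEP.

```java
public class SealedSwitchDemo {
    // Hypothetical hierarchy: Point is the only permitted subtype of Loc.
    sealed interface Loc permits Point {}
    record Point(int x, int y) implements Loc {}

    // The selector has static type Loc, not Point, yet the switch is
    // exhaustive because the single record pattern covers the sealed type.
    static Point doubled(Loc oldLoc) {
        return switch (oldLoc) {
            case Point(int x, int y) -> new Point(x * 2, y * 2);
        };
    }

    public static void main(String[] args) {
        System.out.println(doubled(new Point(1, 2)));   // Point[x=2, y=4]

        // null is not matched by any case; it is part of the remainder
        // and is rejected at run time rather than at compile time.
        try {
            doubled(null);
        } catch (NullPointerException e) {
            System.out.println("null is in the remainder");
        }
    }
}
```

So a switch of this shape is accepted for a selector that is not statically a
record, which is the difference from derived creation being pointed out here.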

> "If the left-hand side of the assignment is an unqualified name, that name
> must be either that of one of the component variables, or that of a local
> variable that is declared within the transformation block."

Does this propagate to lambdas/local classes defined inside the
transformation block? I hope not.

> "... If it is empty then the result of the derived instance creation
> expression is a _copy_ of the value of the source expression (the
> expression on the left-hand side)"

I would remove the example after that paragraph, because Record::equals() has
its semantics defined in terms of a call to the canonical constructor (see
the javadoc), so the spec would depend on equals(), which itself depends on
the canonical constructor. I think it's better to skip using equals() as an
intermediary and define that the copy is the result of a call to the
canonical constructor with all the record component values extracted from the
accessors.

And as a general remark, I hope there will be a following JEP about record
instance creation that allows using the syntax of a transformation block to
initialize a record.
Because, as this has already been discussed on several mailing lists, if we
only give the derived instance creation syntax, people will twist it to be
able to initialize a record by component names, by adding an empty
constructor that does nothing on the record, defeating the idea that
constructors should ensure that an invalid instance is not possible to
create.

regards,
Rémi

From brian.goetz at oracle.com Wed Jan 24 20:33:34 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 24 Jan 2024 15:33:34 -0500
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To: <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>
References: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
 <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <0de8b4f3-e533-48e6-af8a-568788d44a4d@oracle.com>

> And as a general remark, I hope there will be a following JEP about record
> instance creation that allows using the syntax of a transformation block to
> initialize a record.
> Because, as this has already been discussed on several mailing lists, if we
> only give the derived instance creation syntax, people will twist it to be
> able to initialize a record by component names, by adding an empty
> constructor that does nothing on the record, defeating the idea that
> constructors should ensure that an invalid instance is not possible to
> create.

Two things about this.

1.  In a sense, there _already is_ a way to create a record instance using
the syntax of a transformation block: it is called a compact constructor.  If
you look carefully, the body of a compact constructor, and RHS of a
with-expression, are the same thing -- they are blocks for which N mutable
locals magically appear, the block gets run, the final values of those locals
are observed, and fed to the canonical constructor of a record.

But I know this is not what you mean.

2.  You are hoping that this can be turned into something like invoking
constructor parameters by name rather than positionally.
But it seems that your argument here is not "because that would be a really
good thing", but more "people want it so badly that they will distort their
code to do it".  But that's never a good reason to add a language feature.

I think many of the "turn records into builders" proposals (of which there
are many) leave out an important consideration: that the real value of
by-name initialization is when you have an aggregate with a large number of
components, most of which are optional.  Initializing with

    new R(a: 1, b: 2, c: 3)

is not materially better than

    new R(1, 2, 3)

when R only has three components.  It is when R has 26 components, 24 of
which are optional, that makes things like:

    new R(a: 1, z: 26)

more tempting.  But the suggestion above doesn't move us towards having an
answer for that, and having to write out

    new R(a: 1, b: , c: , ... z: 26)

isn't much of an improvement.

For records for which most parameters _do_ have reasonable defaults, then a
slight modification of the trick you suggest actually works, and also
captures useful semantics in the programming model:

    record R(int a /* required */,
             int b /* optional, default = 0 */,
             ...
             int z /* required */) {

        public R(int a, int z) { this(a, 0, 0, ..., z); }
    }

and you can construct an R with

    new R(1, 26) with { h = 8; };

where the alternate constructor takes the required parameters and fills in
defaults for the rest, and then you can use withers from there.  (People will
complain "but then you are creating a record twice, think of the cost", to
which the rejoinder is "then use a value record.")

From forax at univ-mlv.fr Wed Jan 24 21:43:18 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 24 Jan 2024 22:43:18 +0100 (CET)
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To: <0de8b4f3-e533-48e6-af8a-568788d44a4d@oracle.com>
References: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
 <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>
 <0de8b4f3-e533-48e6-af8a-568788d44a4d@oracle.com>
Message-ID: <1167345170.111534464.1706132598310.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "Remi Forax", "Gavin Bierman"
> Cc: "amber-spec-experts"
> Sent: Wednesday, January 24, 2024 9:33:34 PM
> Subject: Re: Draft JEP: Derived Record Creation (Preview)

>> And as a general remark, I hope there will be a following JEP about record
>> instance creation that allows using the syntax of a transformation block to
>> initialize a record.
>> Because, as this has already been discussed on several mailing lists, if we
>> only give the derived instance creation syntax, people will twist it to be
>> able to initialize a record by component names, by adding an empty
>> constructor that does nothing on the record, defeating the idea that
>> constructors should ensure that an invalid instance is not possible to
>> create.

> Two things about this.

> 1. In a sense, there _already is_ a way to create a record instance using the
> syntax of a transformation block: it is called a compact constructor. If you
> look carefully, the body of a compact constructor, and RHS of a
> with-expression, are the same thing -- they are blocks for which N mutable
> locals magically appear, the block gets run, the final values of those locals
> are observed, and fed to the canonical constructor of a record.

> But I know this is not what you mean.

> 2. You are hoping that this can be turned into something like invoking
> constructor parameters by name rather than positionally.
> But it seems that your argument here is not "because that would be a really
> good thing", but more "people want it so badly that they will distort their
> code to do it". But that's never a good reason to add a language feature.

I agree, but it may be a reasonable reason to *not* introduce a feature if
the main way it is used teaches people to avoid preconditions in record
constructors.

I do not hope anything, I'm not one of those who write records with a dozen
fields for a living. But seeing how far people (and my students) are willing
to go to have classes initialized by names, i.e. write a full builder class
per record, add a dependency on an annotation processor like record-builder
or lombok, etc, it's easy to see how this feature will be abused.

Data classes usually:
- can have a lot of components,
- are updated because business requirement changes modify the data,
- are application specific, so unlike methods of the JDK/libraries, it's hard
  to remember them.

so having a way to create them by spelling each component by name is actually
a good way to make the code readable.
That's why people go to great lengths to use named parameters.

And for the anecdote, a recurrent question of my students with a C background
is to ask how to initialize a class with the field names like C99 (*).

> I think many of the "turn records into builders" proposals (of which there
> are many) leave out an important consideration: that the real value of
> by-name initialization is when you have an aggregate with a large number of
> components, most of which are optional. Initializing with
>
>     new R(a: 1, b: 2, c: 3)
>
> is not materially better than
>
>     new R(1, 2, 3)
>
> when R only has three components. It is when R has 26 components, 24 of
> which are optional, that makes things like:
>
>     new R(a: 1, z: 26)
>
> more tempting. But the suggestion above doesn't move us towards having an
> answer for that, and having to write out
>
>     new R(a: 1, b: , c: , ... z: 26)
>
> isn't much of an improvement.

> For records for which most parameters _do_ have reasonable defaults, then a
> slight modification of the trick you suggest actually works, and also
> captures useful semantics in the programming model:
>
>     record R(int a /* required */,
>              int b /* optional, default = 0 */,
>              ...
>              int z /* required */) {
>
>         public R(int a, int z) { this(a, 0, 0, ..., z); }
>     }
>
> and you can construct an R with
>
>     new R(1, 26) with { h = 8; };
>
> where the alternate constructor takes the required parameters and fills in
> defaults for the rest, and then you can use withers from there. (People will
> complain "but then you are creating a record twice, think of the cost", to
> which the rejoinder is "then use a value record.")

It is nice but in a way it does not solve the problem fully, because people
may still want to initialize the required parameters of R with named
parameters.

Rémi

(*) The same way my students with a Python background (all my students
nowadays, because in France, Python is now mandatory in high school) ask how
to create a tuple in Java.

From dan.heidinga at oracle.com Thu Jan 25 01:56:54 2024
From: dan.heidinga at oracle.com (Dan Heidinga)
Date: Thu, 25 Jan 2024 01:56:54 +0000
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To: <1167345170.111534464.1706132598310.JavaMail.zimbra@univ-eiffel.fr>
References: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
 <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>
 <0de8b4f3-e533-48e6-af8a-568788d44a4d@oracle.com>
 <1167345170.111534464.1706132598310.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

Remi, is the issue that this design doesn't address 100% of the use cases you
think should be addressed?

With expressions provide a way to express clone-but-update-these-fields
without requiring the user to manually code the update methods.  It's a great
tradeoff to get more expressiveness.
The fact that it's not named parameters isn't a drawback of the design.

Additionally, I think there's some confusion about what a with expression
does.  You say "the main way it is used teach people to avoid preconditions
in record constructor" but it doesn't avoid preconditions...  The canonical
constructor is still called.

--Dan

From maurizio.cimadamore at oracle.com Thu Jan 25 12:41:57 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 25 Jan 2024 12:41:57 +0000
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To: <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>
References: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
 <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <623756f3-bd7c-4881-aafc-d04991a07a88@oracle.com>

Looking from another angle, I think an important distinction between creation
and _derived_ creation is that in the latter case you have some "fallback"
values to use if the `with` block doesn't specify transforms for all of them.
In the plain creation case, since the object did not exist before, there is
nothing to fall back to - other than the default value of course, which might
be a surprising/lousy choice in some cases.

So perhaps the similarity between these two cases is more superficial than it
looks.

Maurizio

On 24/01/2024 20:19, Remi Forax wrote:
> And as a general remark, I hope there will be a following JEP about
> record instance creation that allows using the syntax of a
> transformation block to initialize a record.

From forax at univ-mlv.fr Thu Jan 25 09:36:33 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 25 Jan 2024 10:36:33 +0100 (CET)
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To:
References: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
 <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>
 <0de8b4f3-e533-48e6-af8a-568788d44a4d@oracle.com>
 <1167345170.111534464.1706132598310.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <624827760.111801201.1706175393020.JavaMail.zimbra@univ-eiffel.fr>

> From: "Dan Heidinga"
> To: "Remi Forax", "Brian Goetz"
> Cc: "Gavin Bierman", "amber-spec-experts"
> Sent: Thursday, January 25, 2024 2:56:54 AM
> Subject: Re: Draft JEP: Derived Record Creation (Preview)

> Remi, is the issue that this design doesn't address 100% of the use cases
> you think should be addressed?

I think it's a little worse, because people will use that design for
something not intended and by that will weaken the concept of records.

> With expressions provide a way to express clone-but-update-these-fields
> without requiring the user to manually code the update methods. It's a great
> tradeoff to get more expressiveness. The fact that it's not named parameters
> isn't a drawback of the design.

I agree. The problem I see is that users will use it to have named parameters
anyway.
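
To make the "clone-but-update-these-fields" idiom from the quoted paragraph
concrete, here is a rough sketch of the update methods one writes by hand
today. The `Person` record, its checks, and the `withName`/`withAge` method
names are hypothetical, chosen only for illustration; nothing here is taken
from the draft JEP.

```java
import java.util.Objects;

public class WitherDemo {
    // Hypothetical record; the compact constructor holds the preconditions.
    record Person(String name, int age) {
        Person {
            Objects.requireNonNull(name, "name");
            if (age < 0) throw new IllegalArgumentException("age must be >= 0");
        }

        // Hand-written "withers": copy every component, replace one.
        Person withName(String newName) { return new Person(newName, age); }
        Person withAge(int newAge)      { return new Person(name, newAge); }
    }

    public static void main(String[] args) {
        Person p = new Person("Ada", 36);
        System.out.println(p.withAge(37));   // Person[name=Ada, age=37]

        // Every wither goes through the canonical constructor,
        // so the preconditions are checked again on the new values.
        try {
            p.withAge(-1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

One such method per component is the boilerplate under discussion; because
each wither calls `new Person(...)`, the validation in the compact constructor
runs for the derived value as well.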
> Additionally, I think there's some confusion about what a with expression does.
> You say "the main way it is used teaches people to avoid preconditions in record
> constructors" but it doesn't avoid preconditions... The canonical constructor is
> still called.

In practice it does, but I explained poorly the steps that lead to that
conclusion. Let's say I have a record Person with a name and an age; the
constructor should check that the name is not null and that the age is
positive. But if this record is used as a way to have named parameters, it also
needs an empty constructor, and because record constructors must delegate to
the canonical constructor, the preconditions will be removed.

> --Dan

Rémi

> From: amber-spec-experts on behalf of forax at univ-mlv.fr
> Sent: January 24, 2024 4:43 PM
> To: Brian Goetz
> Cc: Gavin Bierman; amber-spec-experts
> Subject: Re: Draft JEP: Derived Record Creation (Preview)

>> From: "Brian Goetz"
>> To: "Remi Forax", "Gavin Bierman"
>> Cc: "amber-spec-experts"
>> Sent: Wednesday, January 24, 2024 9:33:34 PM
>> Subject: Re: Draft JEP: Derived Record Creation (Preview)

>>> And as a general remark, I hope there will be a following JEP about record
>>> instance creation that allows to use the syntax of a transformation block to
>>> initialize a record.

>>> Because, as has already been discussed on several mailing lists, if we only
>>> give the derived instance creation syntax, people will twist it to be able to
>>> initialize a record by component names, by adding an empty constructor that
>>> does nothing on the record, defeating the idea that constructors should ensure
>>> that an invalid instance is not possible to create.

>> Two things about this.

>> 1. In a sense, there _already is_ a way to create a record instance using the
>> syntax of a transformation block: it is called a compact constructor.
>> If you look carefully, the body of a compact constructor, and RHS of a
>> with-expression, are the same thing -- they are blocks for which N mutable
>> locals magically appear, the block gets run, the final values of those locals
>> are observed, and fed to the canonical constructor of a record.

>> But I know this is not what you mean.

>> 2. You are hoping that this can be turned into something like invoking
>> constructor parameters by name rather than positionally. But it seems that your
>> argument here is not "because that would be a really good thing", but more
>> "people want it so badly that they will distort their code to do it". But
>> that's never a good reason to add a language feature.

> I agree, but it may be a reasonable reason to *not* introduce a feature if the
> main way it is used teaches people to avoid preconditions in record
> constructors.

> I am not hoping for anything; I'm not one of those who write records with a
> dozen fields for a living. But seeing how far people (and my students) are
> willing to go to have classes initialized by names, i.e. write a full builder
> class per record, add a dependency on an annotation processor like
> record-builder or lombok, etc., it's easy to see how this feature will be
> abused.

> Data classes usually:
> - can have a lot of components,
> - are updated because business requirement changes modify the data,
> - are application specific, so unlike methods of the JDK/libraries, it's hard to
>   remember them.
> so having a way to create them by spelling each component by name is actually a
> good way to make the code readable.

> That's why people go to great lengths to use named parameters.

> And for the anecdote, a recurrent question of my students with a C background is
> to ask how to initialize a class with the field names like C99 (*).
>> I think many of the "turn records into builders" proposals (of which there are
>> many) leave out an important consideration: that the real value of by-name
>> initialization is when you have an aggregate with a large number of components,
>> most of which are optional. Initializing with

>>     new R(a: 1, b: 2, c: 3)

>> is not materially better than

>>     new R(1, 2, 3)

>> when R only has three components. It is when R has 26 components, 24 of which
>> are optional, that makes things like:

>>     new R(a: 1, z: 26)

>> more tempting. But the suggestion above doesn't move us towards having an answer
>> for that, and having to write out

>>     new R(a: 1, b : , c: , ... z: 26)

>> isn't much of an improvement.

>> For records for which most parameters _do_ have reasonable defaults, then a
>> slight modification of the trick you suggest actually works, and also captures
>> useful semantics in the programming model:

>>     record R(int a /* required */,
>>              int b /* optional, default = 0 */,
>>              ...
>>              int z /* required */) {
>>         public R(int a, int z) { this(a, 0, 0, ..., z); }
>>     }

>> and you can construct an R with

>>     new R(1, 26) with { h = 8; };

>> where the alternate constructor takes the required parameters and fills in
>> defaults for the rest, and then you can use withers from there. (People will
>> complain "but then you are creating a record twice, think of the cost", to
>> which the rejoinder is "then use a value record.")

> It is nice, but it does not fully solve the problem, because people may still
> want to initialize the required parameters of R with named parameters.

> Rémi

> (*) The same way my students with a Python background (all my students nowadays,
> because in France, Python is now mandatory in high school) ask how to create a
> tuple in Java.
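To make the precondition concern from the Person example earlier in this
message concrete, here is a minimal sketch in current Java. The record, its
checks, and the helper class `PreconditionDemo` are illustrative assumptions,
not code from the draft JEP:

```java
// Hypothetical record; the names Person, name and age follow the prose example.
record Person(String name, int age) {
    // Compact canonical constructor: runs for every instance, including
    // instances created through any other constructor of the record.
    Person {
        if (name == null) throw new NullPointerException("name");
        if (age < 0) throw new IllegalArgumentException("age must be >= 0");
    }

    // An "empty" constructor must delegate to the canonical one, so the
    // placeholder values it passes have to get through the checks above.
    Person() {
        this(null, 0);   // rejected by the null check
    }
}

class PreconditionDemo {
    // Returns true when the empty constructor is rejected by the precondition.
    static boolean emptyConstructorFails() {
        try {
            new Person();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(emptyConstructorFails());   // prints true
    }
}
```

For `new Person()` to succeed as a starting point for by-name initialization,
the author would have to drop or weaken the null check, which is the erosion of
preconditions described above.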
From forax at univ-mlv.fr Thu Jan 25 13:13:22 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 25 Jan 2024 14:13:22 +0100 (CET)
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To: <623756f3-bd7c-4881-aafc-d04991a07a88@oracle.com>
References: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
 <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>
 <623756f3-bd7c-4881-aafc-d04991a07a88@oracle.com>
Message-ID: <894149300.112135100.1706188402504.JavaMail.zimbra@univ-eiffel.fr>

> From: "Maurizio Cimadamore"
> To: "Remi Forax", "Gavin Bierman"
> Cc: "amber-spec-experts"
> Sent: Thursday, January 25, 2024 1:41:57 PM
> Subject: Re: Draft JEP: Derived Record Creation (Preview)

> Looking from another angle, I think an important distinction between creation
> and _derived_ creation is that in the latter case you have some "fallback"
> values to use if the `with` block doesn't specify transforms for all of them.
> In the plain creation case, since the object did not exist before, there is
> nothing to fall back to - other than the default value of course, which might
> be a surprising/lousy choice in some cases. So perhaps the similarity between
> these two cases is more superficial than it looks.

Let's take a look at the cousins of Java that have a syntax equivalent to
derived record creation. As far as I know, we have C# and Rust / JavaScript.

In the case of C#, the syntax is very similar to the one proposed for Java, but
the block of code uses ',' instead of ';' (*).

  point with { x = 3, y = 4 }

The syntax for creating and initializing an object in C# is
new Point { x = 3, y = 4 }.
As you can see, the syntax is very similar.

In the case of Rust (or JavaScript), the syntax uses a splat/spread operator at
the end of the object initialization syntax,

  Point {
    x: 3,
    y: 4,
    ..point  // spread operator
  }

In all cases, the same syntax is used for the creation and the derived creation.
As you said, the semantics are slightly different, but in an obvious way:
the creation of an object requires all components to be initialized, the
derived creation doesn't.

regards,
Rémi

(*) given that the transformation block is just code in Java, it makes sense to
use '=' and ';', given those are just variable assignments.

> Maurizio

> On 24/01/2024 20:19, Remi Forax wrote:
>> And as a general remark, I hope there will be a following JEP about
>> record instance creation that allows to use the syntax of a
>> transformation block to initialize a record.

From brian.goetz at oracle.com Thu Jan 25 15:27:41 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 25 Jan 2024 10:27:41 -0500
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To: <894149300.112135100.1706188402504.JavaMail.zimbra@univ-eiffel.fr>
References: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
 <747149434.111514854.1706127578940.JavaMail.zimbra@univ-eiffel.fr>
 <623756f3-bd7c-4881-aafc-d04991a07a88@oracle.com>
 <894149300.112135100.1706188402504.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

The comparison to C# may be syntactically short, but is pretty different
culturally. The block of a C# `with` expression is restricted to _property
assignments_. There, they are aligning the `with` syntax to the
"initialization by properties" that they already had. So this makes total
sense in C#, because doing it differently would be a glaring difference. You
could describe this comparison as "C# did it this way to put the last block in
the wall, but in Java, this would be the first block."

In any case, we're deeply in danger of doing the thing that we're not supposed
to do here, which is:

 - Gavin posts a document for review
 - A completely tangential observation hijacks the discussion

Your point is: "users may try to abuse withers to get another feature that
they want", and we should be mindful of that.
Point taken, but I think this discussion has played out.

On 1/25/2024 8:13 AM, forax at univ-mlv.fr wrote:
>
>
> ------------------------------------------------------------------------
>
> *From: *"Maurizio Cimadamore"
> *To: *"Remi Forax", "Gavin Bierman"
> *Cc: *"amber-spec-experts"
> *Sent: *Thursday, January 25, 2024 1:41:57 PM
> *Subject: *Re: Draft JEP: Derived Record Creation (Preview)
>
> Looking from another angle, I think an important distinction
> between creation and _derived_ creation is that in the latter case
> you have some "fallback" values to use if the `with` block doesn't
> specify transforms for all of them. In the plain creation case,
> since the object did not exist before, there is nothing to fall
> back to - other than the default value of course, which might be a
> surprising/lousy choice in some cases. So perhaps the similarity
> between these two cases is more superficial than it looks.
>
>
> Let's take a look at the cousins of Java that have a syntax equivalent
> to derived record creation. As far as I know, we have C# and Rust
> / JavaScript.
>
> In the case of C#, the syntax is very similar to the one proposed for
> Java, but the block of code uses ',' instead of ';' (*).
>   point with { x = 3, y = 4 }
> The syntax for creating and initializing an object in C# is new Point
> { x = 3, y = 4 }.
> As you can see, the syntax is very similar.
>
> In the case of Rust (or JavaScript), the syntax uses a splat/spread
> operator at the end of the object initialization syntax,
>   Point {
>     x: 3,
>     y: 4,
>     ..point  // spread operator
>   }
>
> In all cases, the same syntax is used for the creation and the derived
> creation. As you said, the semantics are slightly different, but in an
> obvious way: the creation of an object requires all components to be
> initialized, the derived creation doesn't.
>
> regards,
> Rémi
>
> (*) given that the transformation block is just code in Java, it makes
> sense to use '=' and ';', given those are just variable assignments.
>
>
> Maurizio
>
> On 24/01/2024 20:19, Remi Forax wrote:
>
> And as a general remark, I hope there will be a following JEP about
> record instance creation that allows to use the syntax of a
> transformation block to initialize a record.
>
>

From gavin.bierman at oracle.com Thu Jan 25 22:59:15 2024
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Thu, 25 Jan 2024 22:59:15 +0000
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
References: <4F29A578-749F-45C3-A6D2-EC68DFDB235C@oracle.com>
Message-ID: <1ECE05E6-316B-4FAE-A26F-9657F942347D@oracle.com>

I made some changes based on the feedback received (thanks!). The (better) URL
is:

https://openjdk.org/jeps/8321133

Thanks,
Gavin

> On 24 Jan 2024, at 17:16, Gavin Bierman wrote:
>
> Dear Spec Experts,
>
> We have discussed on this list a new `with` expression form to derive new record
> values from existing record values. A draft JEP for this feature is now
> available:
>
> https://bugs.openjdk.org/browse/JDK-8321133
>
> Please take a look at this new JEP and give us your feedback (either on this
> list or directly to me).
>
> Thanks,
> Gavin

From forax at univ-mlv.fr Fri Jan 26 11:08:02 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 26 Jan 2024 12:08:02 +0100 (CET)
Subject: Towards member patterns
In-Reply-To:
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com>
Message-ID: <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr>

Let's retry.

I think your proposal solves the cases where the type you are switching on is
closed (final, sealed) but not if the type is open (non-sealed).
Let's take an example. Suppose I have the following hierarchy:

  public sealed interface Tree {
    static Tree none() { return None.NONE; }
    static Tree cons(Tree tree) { return new Cons(tree); }
  }
  private enum None implements Tree { NONE }
  private class Cons implements Tree {
    private final Tree tree;
    private Cons(Tree tree) { this.tree = tree; }
  }

If I want to have a static method children that returns all the children of
the Tree, using pattern matching I would like to write

  static List<Tree> children(Tree tree) {
    return switch(tree) {
      case Tree.none() -> List.of();
      case Tree.cons(Tree child) -> List.of(child);
    };
  }

And inside Tree, I can add the following inverse methods

  static inverse Tree none() { if (that == None.NONE) __yield (); }
  static inverse Tree cons(Tree tree) {
    if (that instanceof Cons cons) __yield (cons.tree);
  }

As I said, it works great with a closed hierarchy. But now let's suppose the
hierarchy is not sealed. If the hierarchy is not sealed, having static
factories makes less sense because we do not know all the subtypes. So we have

  public interface Tree {}
  public enum None implements Tree { NONE }
  public class Cons implements Tree {
    private final Tree tree;
    public Cons(Tree tree) { this.tree = tree; }
  }

and in the future, someone may add

  public class Node implements Tree {
    private final Tree left, right;
    public Node(Tree left, Tree right) { this.left = left; this.right = right; }
  }

Because the hierarchy is open, we need to use late binding here. So I may
rewrite children like this

  static List<Tree> children(Tree tree) {
    return switch(tree) {
      case that.extract(List<Tree> list) -> list;  // wrong syntax, it's just to convey the semantics
    };
  }

Here, we want to call an abstract pattern method that will be implemented
differently for each subclass, but your proposal does not allow that (sorry
for the pun).

Inside a pattern, there are two implicit values: we have 'this' as usual and we
have 'that' (we call it that way), which represents the value actually matched.
So inside a pattern, we can call inverse static methods like none() and cons()
or an inverse instance method like extract().

This is not what you propose: you propose that an inverse instance method
should be called with 'this' as receiver instead of 'that', disallowing
polymorphic inverse method calls, and thus not supporting open hierarchies.

A pattern like Foo.bar() does not just call the inverse static method 'bar'
inside of 'Foo'; it calls either the inverse static method 'bar' inside of
'Foo' or the inverse instance method 'bar' on the instance of 'Foo'. It works
more like '::' than '.' (if we restrict the syntax of :: to work only on
types).

There are other solutions, like having a different syntax for calling on 'this'
or calling on 'that', but I believe this is part of the broader discussion
about how to pass values to inverse methods.

Now, to finish the example, using '::' instead of '.', children in the first
example should be written like this

  static List<Tree> children(Tree tree) {
    return switch(tree) {
      case Tree::none() -> List.of();
      case Tree::cons(Tree child) -> List.of(child);
    };
  }

and the second example should be something like this

  public interface Tree {
    inverse abstract Tree extract(List<Tree> list);
  }
  public enum None implements Tree {
    NONE;
    inverse Tree extract(List<Tree> list) { __yield (List.of()); }
  }
  public class Cons implements Tree {
    private final Tree tree;
    public Cons(Tree tree) { this.tree = tree; }
    inverse Tree extract(List<Tree> list) { __yield (List.of(tree)); }
  }

  static List<Tree> children(Tree tree) {
    return switch(tree) {
      case Tree::extract(List<Tree> list) -> list;
    };
    // or
    let Tree::extract(List<Tree> list) = tree;
    return list;
  }

I really think that not using 'that' as the receiver when calling an inverse
instance method is a missed opportunity, because without that (again :) ),
there is no way to call an inverse abstract method, so no way to pattern match
on an open hierarchy.
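For comparison, this is roughly what late binding over the open hierarchy
looks like with an ordinary abstract instance method in current Java. It is
only a sketch: the method `children()` is a hypothetical stand-in for the
abstract inverse `extract` above, and nothing here uses the proposed inverse
methods.

```java
import java.util.List;

// Open hierarchy: Tree is not sealed, so new subtypes can be added later.
interface Tree {
    // Ordinary abstract instance method; the implementation is selected
    // by late binding on the receiver.
    List<Tree> children();
}

enum None implements Tree {
    NONE;
    @Override public List<Tree> children() { return List.of(); }
}

final class Cons implements Tree {
    private final Tree tree;
    Cons(Tree tree) { this.tree = tree; }
    @Override public List<Tree> children() { return List.of(tree); }
}

// A subtype added "in the future"; no existing code has to change.
final class Node implements Tree {
    private final Tree left, right;
    Node(Tree left, Tree right) { this.left = left; this.right = right; }
    @Override public List<Tree> children() { return List.of(left, right); }
}

class OpenHierarchyDemo {
    public static void main(String[] args) {
        Tree tree = new Node(None.NONE, new Cons(None.NONE));
        // Dispatches to Node.children() without the caller naming Node.
        System.out.println(tree.children().size());   // prints 2
    }
}
```

The abstract inverse method is meant to give the pattern side the same
per-subtype dispatch that `children()` gets here from the receiver.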
regards,
Rémi

----- Original Message -----
> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Tuesday, January 23, 2024 8:57:51 PM
> Subject: Re: Towards member patterns

> I was told that the formatting of the earlier version was borked, so
> re-sending, hopefully without any formatting this time....
>
> # Towards member patterns
>
> Time to check in on where things are in the bigger picture of patterns as
> class members. Note: while this document may have illustrative examples, you
> should not take that as a definitive statement of syntax, and Remi will not
> be commenting on the syntax at this time.
>
> We've already dipped our toes in the water with _record patterns_. A record
> pattern looks like:
>
>     case R(p1, p2, ... pn):
>
> where `R` is a record type and `p1..pn` are nested patterns that are matched
> to its components. Because records are defined by their state description,
> we can automatically derive record patterns "for free", just as we derive
> record constructors, accessors, etc.
>
> There are many other classes that would benefit from being deconstructible
> with patterns. To that end, we will generalize record patterns to
> _deconstruction patterns_, where any class can declare an explicit
> deconstruction pattern and participate in pattern matching like records do.
>
> Deconstruction patterns are not the end of the user-declared pattern story.
> Just as some classes prefer to expose static factories rather than
> constructors, they will be able to expose corresponding static patterns. And
> there is also a role for "instance patterns" and "pattern objects" as well.
>
> Looking only at record and deconstruction patterns, it might be tempting to
> think that patterns are "just" methods with multiple return. But this would
> be extrapolating from a special case.
> Pattern matching is intrinsically _conditional_; the extraction of values
> from a target is conditioned on whether the target _matches_ the pattern.
> For the patterns we've seen so far -- type patterns and record patterns --
> matching can be determined entirely by types. But more sophisticated patterns
> can also depend on other aspects of object state. For example, a pattern
> corresponding to the static factory `Optional::of` requires not only that the
> match candidate be of type `Optional`, but that the match candidate is an
> `Optional` that actually holds a value. Similarly, a pattern corresponding to
> a regular expression requires the match candidate to not only be a `String`,
> but to match the regular expression.
>
> ## The key intuition around patterns
>
> A key capability of objects is _aggregation_; the combination of component
> values into a higher-level composite that incorporates those components. Java
> facilitates a variety of idioms for aggregation, including constructors,
> factories, builders, etc. The dual of aggregation is _destructuring_ or
> _decomposition_, which takes an aggregate and attempts to recover its
> "ingredients". However, Java's support for destructuring has historically
> been far more ad-hoc, largely limited to "write some getters". Pattern
> matching seeks to put destructuring on the same firm foundation as
> aggregation.
>
> Deconstruction patterns (such as record patterns) are the dual of
> construction. If we construct an object:
>
>     Object o = new Point(x, y);
>
> we can deconstruct it with a deconstruction pattern:
>
>     if (o instanceof Point(var x, var y)) { ... }
>
> > Intuitively, this pattern match asks "could this object have come from
> > invoking the constructor `new Point(x, y)` for some `x` and `y`, and if so,
> > tell me what they are."
>
> While not all patterns exist in direct correspondence to another constructor
> or method, this intuition that a pattern reconstructs the ingredients to an
> aggregation operation is central to the design; we'll explore the limitations
> of this intuition in greater detail later.
>
> ## Use cases for declared patterns
>
> Before turning to how patterns fit into the object model, let's look at some
> of the potential use cases for patterns in APIs.
>
> ### Recovering construction arguments
>
> Deconstruction patterns are the dual of constructors; where a constructor
> takes N arguments and aggregates them into an object, a deconstruction
> pattern takes an aggregate and decomposes it into its components.
> Constructors are unusual in that they are instance behavior (they have an
> implicit `this` argument), but are not inherited; deconstruction patterns are
> the same. For deconstruction patterns (but not for all instance patterns),
> the match candidate is always the receiver. Tentatively, we've decided that
> deconstruction patterns are always unconditional; that a deconstruction
> pattern for class `Foo` should match any instance of `Foo`. At the use site,
> deconstruction patterns use the same syntax as record patterns:
>
>     case Point(int x, int y):
>
> Just as constructors can be overloaded, so can deconstruction patterns.
> However, the reasons we might overload deconstruction patterns are slightly
> different than for constructors, and so it may well be the case that we end
> up with fewer overloads of deconstruction patterns than we do of
> constructors. Constructors often form _telescoping sets_, both for reasons of
> syntactic convenience at the use site (fewer arguments to specify) and to
> avoid brittleness (clients can let the class implementation pick the defaults
> rather than hard-coding them.)
> This motivation is less pronounced for deconstruction patterns (unwanted
> bindings can be ignored with `_`), so it is quite possible that authors will
> choose to have one deconstruction pattern overload per telescoping
> constructor _set_, rather than one per constructor.
>
> There is no requirement for deconstruction patterns to expose the exact same
> API as constructors, but we expect this will be common, at least for classes
> for which the construction process is effectively an aggregation operation on
> the constructor arguments.
>
> ### Recovering static factory arguments
>
> Not all classes want to expose their constructors; sometimes classes prefer
> to expose static factories instead. In this case, the class should be able to
> expose corresponding static patterns as well.
>
> For a class like `Optional`, which exposes factories `Optional::of` and
> `Optional::empty`, the object state incorporates not only the factory
> arguments, but which factory was chosen. Accordingly, it makes sense to
> deconstruct the object in the same way:
>
>     switch (optional) {
>         case Optional.of(var payload): ...
>         case Optional.empty(): ...
>     }
>
> Such patterns are necessarily conditional, asking the Pattern Question:
> "could this `Optional` have come from the `Optional::of` factory, and if so,
> with what argument?" Static patterns, like static methods, lack a receiver,
> so `this` is not defined in the body of a static pattern. However, we will
> need a way to denote the match candidate, so its state can be examined by the
> pattern body.
>
> Another feature of static methods is that they can be used to put a factory
> for a class `C` in _another_ class, whether one in the same maintenance
> domain (such as the `Collections`) or in some other package. This feature is
> shared by static patterns.
>
> ### Conversions and queries
>
> Another application for static patterns is the dual of static methods for
> conversions. For a static method like `Integer::toString`, which converts an
> `int` to its `String` representation, a corresponding static pattern
> `Integer::toString` can ask the Pattern Question: "could this `String` have
> come from converting an integer to `String`, and if so, what integer".
>
> Some groups of query methods in existing APIs are patterns in disguise. The
> class `java.lang.Class` has a pair of instance methods, `Class::isArray` and
> `Class::getComponentType`, that work together to determine if the `Class`
> describes an array type, and if so, provide its component type. This question
> is much better framed as a single pattern:
>
>     case Class.arrayClass(var componentType):
>
> The two existing methods are made more complicated by their relationship to
> each other; `Class::getComponentType` has a precondition (the `Class` must
> describe an array type) and therefore has to specify and implement what to do
> if the precondition fails, and the relationship between the methods is
> captured only in documentation. By combining them into a single pattern, it
> becomes impossible to misuse (because of the inherent conditionality of
> patterns) and easier to understand (because it can all be documented in one
> place.)
>
> This hypothetical `Class::arrayClass` pattern also has a sensible dual as a
> factory method:
>
>     static Class arrayClass(Class componentType)
>
> which produces the array `Class` for the array type whose component type is
> provided. An API need not provide both directions of a conversion, but if it
> does, the two generally strengthen each other. This method/pattern pair could
> be either static or instance members, depending on API design choice.
>
> Another form of "conversion" method / pattern pair, even though both types
> are the same, is "power of two".
> A `powerOfTwo` method takes an exponent and returns the resulting power of
> two; a `powerOfTwo` pattern asks if its match candidate is a power of two,
> and if so, binds the base-two logarithm.
>
> ### Numeric conversions
>
> As Project Valhalla gives us the ability to declare new numeric types, we
> will want to be able to convert these new types to other numeric types. For
> unconditional conversions (such as widening half-float to float), an ordinary
> method will suffice:
>
>     float widen(HalfFloat f);
>
> But the reverse is unlikely to be unconditional; narrowing conversions can
> fail if the value cannot be represented in the narrower type. This is better
> represented as a pattern which asks the Pattern Question: "could this `float`
> have come from widening a `HalfFloat`, and if so, tell me what `HalfFloat`
> that is." A widening conversion (or boxing conversion) is best represented by
> a _pair_ of members, an ordinary method for the unconditional direction, and
> a pattern for the conditional direction.
>
> ### Conditional extraction
>
> Some operations, such as matching a string to a regular expression with
> capture groups, are pattern matches in disguise. We should be able to take a
> regular expression R and match against it with `instanceof` or `switch`,
> binding capture groups (using varargs patterns) if it matches.
>
> ## Member patterns in the object model
>
> We currently have three kinds of executable class members: constructors,
> static methods, and instance methods. (Actually constructors are not members,
> but we will leave this pedantic detail aside for now.) As the above examples
> show, each of these can be amenable to a dual member which asks the Pattern
> Question about it.
>
> Patterns are dual to constructors and methods in two ways: structurally and
> semantically.
> Structurally, patterns invert the relationship between inputs and outputs: a
> method takes N arguments as input and produces a single result, and the
> corresponding pattern takes a candidate result (the "match candidate") and
> conditionally produces N bindings. Semantically, patterns ask the Pattern
> Question: could this result have originated by some invocation of the dual
> operation.
>
> ### Patterns as inverse methods and constructors
>
> One way to frame patterns in the object model is as _inverse constructors_
> and _inverse methods_. For purposes of this document, I will use an
> illustrative syntax that directly evokes this duality (but remember, we're
> not discussing syntax now):
>
> ```
> class Point {
>     final int x, y;
>
>     // Constructor
>     Point(int x, int y) { ... }
>
>     // Deconstruction pattern
>     inverse Point(int x, int y) { ... }
> }
>
> class Optional<T> {
>     // Static factories
>     static<T> Optional<T> of(T t) { ... }
>     static<T> Optional<T> empty() { ... }
>
>     // Static patterns
>     static<T> inverse Optional<T> of(T t) { ... }
>     static<T> inverse Optional<T> empty() { ... }
> }
> ```
>
> `Point` has a constructor and an inverse constructor (deconstruction pattern)
> for the external representation `(int x, int y)`; in an inverse constructor,
> the binding list appears where the parameter list does in the constructor.
> `Optional<T>` has static factories and corresponding patterns for `empty` and
> `of`. As with inverse constructors, the binding list of a pattern appears in
> the position that the parameters appear in a method declaration;
> additionally, the _match candidate type_ appears in the position that the
> return value appears in a method declaration. In both cases, the declaration
> site and use site of the pattern uses the same syntax.
>
> In the body of an inverse constructor or method, we need to be able to talk
> about the match candidate.
> In this model, the match candidate has a type determined by the declaration
> (for an inverse constructor, the class; for an inverse method, the type
> specified in the "return position" of the inverse method declaration), and
> there is a predefined context variable (e.g., `that`) that refers to the
> match candidate. For inverse constructors, the receiver (`this`) is aliased
> to the match candidate (`that`), but not necessarily so for inverse methods.
>
> ### Do all methods potentially have inverses?
>
> We've seen examples of constructors, static methods, and instance methods
> that have sensible inverses, but not all methods do. For example, methods
> that operate primarily by side effects (such as mutative methods like setters
> or `List::add`) are not suitable candidates for inverses. Similarly, pure
> functions that "co-mingle" their arguments (such as arithmetic operators) are
> also not suitable candidates for inverses, because the ingredients to the
> operation typically can't be recovered from the result (i.e., `4` could be
> the result of `plus(2, 2)` or `plus(1, 3)`).
>
> Intuitively, the methods that are invertible are the ones that are
> _aggregative_. The constructor of a (well-behaved) record is aggregative,
> since all the information passed to the constructor is preserved in the
> result. Factories like `Optional::of` are similarly aggregative, as are
> non-lossy conversions such as widening or boxing conversions.
>
> Ideally, an aggregation operation and its corresponding inverse form an
> _embedding projection pair_ between the aggregate and a component space.
> Intuitively, an embedding-projection pair is an algebraic structure defined
> by a pair of functions between two sets such that composing in one direction
> (embed-then-project) is an identity, and composing in the other direction
> (project-then-embed) is a well-behaved approximation.
>
> ### Conversions
>
> Conversion methods are a frequent candidate for inversion.  We already have
>
>     // Integer.java
>     static String toString(int i) { ... }
>
> to which the obvious inverse is
>
>     static inverse String toString(int i) { ... }
>
> and we can inspect a string to see if it is the string representation of an
> integer with
>
>     if (s instanceof Integer.toString(int i)) { ... }
>
> This composes nicely with deconstruction patterns; if we have a `Box<String>`
> and want to ask whether the contained string is really the string
> representation of an integer, we can ask:
>
>     case Box(Integer.toString(int i)):
>
> which conveniently looks just like the composition of constructors or
> factories used to create such an instance (`new Box(Integer.toString(3))`).
>
> When it comes to user-definable numeric conversions, the most likely strategy
> involves combining related operators in a single _witness_ object.  For
> example, numeric conversion might be modeled as:
>
> ```
> interface NumericConversion<FROM, TO> {
>     TO convert(FROM from);
>     inverse TO convert(FROM from);
> }
> ```
>
> which reflects the fact that conversion is total in one direction (widening,
> boxing) and conditional in the other (narrowing, unboxing).
>
> ### Regular expression matching
>
> Regular expressions are a form of ad-hoc pattern; a given string might match a
> given regex, or not, and if it does, it might produce multiple bindings (the
> capture groups).  It would be nice to be able to express regular expression
> matches as ordinary pattern matches.
>
> Conveniently, we already have an object representation of regular
> expressions -- `java.util.regex.Pattern` -- which is an ideal place to put an
> instance pattern:
>
> ```
> // varargs pattern
> public inverse String match(String... groups) {
>     Matcher m = matcher(that);    // *that* is the match candidate
>     if (m.matches())              // receiver for matcher() is the Pattern
>         __yield IntStream.rangeClosed(1, m.groupCount())
>                          .mapToObj(m::group)
>                          .toArray(String[]::new);
> }
> ```
>
> And now, we want to express "does string s match any of these regular
> expressions":
>
> ```
> static final Pattern As = Pattern.compile("([aA]*)");
> static final Pattern Bs = Pattern.compile("([bB]*)");
> static final Pattern Cs = Pattern.compile("([cC]*)");
>
> ...
>
> switch (aString) {
>     case As.match(String as) -> ...
>     case Bs.match(String bs) -> ...
>     case Cs.match(String cs) -> ...
>     ...
> }
> ```
>
> Essentially, `j.u.r.Pattern` becomes a _pattern object_, where the state of
> the object is used to determine whether or not it matches any given input.
> (There is nothing stopping a class from having multiple patterns, just as it
> can have multiple methods.)
>
> ## Pattern resolution
>
> When we invoke a method, sometimes we are able to refer to the method with an
> _unqualified_ name (e.g., `m(3)`), and sometimes the method must be
> _qualified_ with a type name, package name, or a receiver object.  The same is
> true for declared patterns.
>
> Constructors for classes that are in the same package, or have been imported,
> can be referred to with an unqualified name; constructors can also be
> qualified with a package name.  The same is true for deconstruction patterns:
>
> ```
> case Foo(int x, int y):         // unqualified
> case com.foo.Bar(int x, int y): // qualified by package
> ```
>
> Static methods that are declared in the current class or an enclosing class,
> or are statically imported, can be referred to with an unqualified name;
> static methods can also be qualified with a type name.  The same is true for
> static patterns:
>
> ```
> case powerOfTwo(int exp):  // unqualified
> case Optional.of(var e):   // qualified by class
> ```
>
> Instance methods invoked on the current object can be referred to with an
> unqualified name; instance methods can also be qualified by a receiver object.
> The same is true for instance patterns:
>
> ```
> case match(String s):    // unqualified
> case As.match(String s): // qualified by receiver
> ```
>
> In a qualified pattern `x.y`, `x` might be a package name, a class name, or a
> (effectively final) receiver variable; we use the same rules for choosing how
> to interpret a qualifier for patterns as we do for method invocations.
>
> ## Benefits of explicit duality
>
> Declaring method-pattern pairs whose structure and name are the same yields
> many benefits.  It means that we take things apart using the same abstractions
> used to put them together, which makes code more readable and less
> error-prone.
>
> Referring to an _inverse pair_ of operations by a single name is simpler than
> having separate names for each direction; not only don't we need to come up
> with a name for the other direction, we also don't need to teach clients that
> "these two names are inverses", because the inverses have the same name
> already.  What we know about the method `Integer::toString` immediately
> carries over to its inverse.
>
> Further, thinking about a method-pattern pair provides a normalizing force to
> actually ensuring the two are inverses; if we just had two related methods
> `xToY` and `yToX`, they might diverge subtly because the connection between
> the two members is not very strong.
>
> Finally, this gives the language permission to treat the _pair_ of members as
> a thing in some cases, such as the use of ctor-dtor pairs in "withers" or
> serialization.
>
> The explicit duality takes a little time to get used to.
> We have many years of experience of naming a method for its directionality, so
> people's first reaction is often "the pattern should be called
> `Integer.fromString`, not `Integer.toString`".  So people will initially
> bristle at giving both directions the same name, especially when one implies a
> directionality such as `toString`.  (In these cases, we can fall back on a
> convention that says that we should name it for the total direction.)
>
> ## Pattern lambdas, pattern objects, pattern references
>
> Interfaces with a single abstract method (SAM) are called _functional
> interfaces_ and we support a conversion (historically called SAM conversion)
> from lambdas to functional interfaces.  Interfaces with a single abstract
> pattern can benefit from a similar conversion (call this "SAP" conversion).
>
> In the early days of Streams, people complained about processing a stream
> using instanceof and cast:
>
> ```
> Stream<Object> objects = ...
> Stream<String> strings = objects.filter(x -> x instanceof String)
>                                 .map(x -> (String) x);
> ```
>
> This locution is disappointing both for its verbosity (saying the same thing
> in two different ways) and its efficiency (doing the same work basically
> twice).  Later, it became possible to slightly simplify this using
> `mapMulti`:
>
> ```
> objects.mapMulti((x, sink) -> { if (x instanceof String s) sink.accept(s); })
> ```
>
> But, ultimately this stream pipeline is a pattern match; we want to match the
> elements to the pattern `String s`, and get a stream of the matched string
> bindings.  We are now in a position to expose this more directly.  Suppose we
> had the following SAP interface:
>
> ```
> interface Match<T, U> {
>     inverse U match(T t);
> }
> ```
>
> then `Stream` could expose a `match` method:
>
> ```
> Stream match(Match pattern);
> ```
>
> We can SAP-convert a lambda whose yielded bindings are compatible with the
> sole abstract pattern in the SAP interface:
>
> ```
> Match m = o -> { if (o instanceof String s) __yield(s); };
> ... stream.match(m) ...
> ```
>
> And we can do the same with _pattern references_ to existing patterns that are
> compatible with the sole pattern in a SAP interface.  As a special case, we
> can also support a conversion from type patterns to a compatible SAP type with
> an `instanceof` pattern reference (analogous to a `new` method reference):
>
> ```
> objects.match(String::instanceof)
> ```
>
> where `String::instanceof` means the same as the previous lambda example.
> This means that APIs like `Stream` can abstract over conditional behavior as
> well as unconditional.

From brian.goetz at oracle.com Fri Jan 26 12:31:54 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 26 Jan 2024 07:31:54 -0500
Subject: Towards member patterns
In-Reply-To: <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com> <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <294e8d9d-2391-4f30-b91a-f2c83652b4cf@oracle.com>

> I think your proposal solves the cases where the type you are switching on is
> closed (final, sealed) but not if the type is open (non-sealed).

A bold claim!  Let's see how this stacks up.

> Let's take an example, let's suppose I've the following hierarchy
>
> public sealed interface Tree {

... snip ... sealed class, private implementation classes, public static
factories, public static patterns ... check.

> If I want to have a static method children that returns all the children of
> the Tree, using the pattern matching I would like to write
>
>   static List<Tree> children(Tree tree) {
>     return switch(tree) {
>       case Tree.none() -> List.of();
>       case Tree.cons(Tree child) -> List.of(child);
>     };
>   }

Full disclosure: we're not totally there yet.  This switch isn't (yet)
exhaustive; we need a way to mark none+cons as being an exhaustive set.  That's
on the list, but I was looking to sync on the broad strokes first.

> As I said, it works great with a closed hierarchy, but now let's suppose the
> hierarchy is not sealed, if the hierarchy is not sealed, having static
> factories makes less sense because we do not know all the subtypes.

I don't see this.  (As one example, consider List: it is open, yet there are
static factories like List.of(...)).  We had static factories long before we
had sealed hierarchies.  But let's keep going.

> So we have
>
>   public interface Tree {}
>   public enum None implements Tree { NONE }
>   public class Cons implements Tree {
>     private final Tree tree;
>
>     public Cons(Tree tree) { this.tree = tree; }
>   }
>
> and in the future, someone may add
>   public class Node implements Tree {
>     private final Tree left, right;
>
>     public Node(Tree left, Tree right) { this.left = left; this.right = right; }
>   }
>
> Because the hierarchy is open, we need to use the late binding here.
> So I may rewrite children like this
>   static List<Tree> children(Tree tree) {
>     return switch(tree) {
>       case that.extract(List<Tree> list) -> list;  // wrong syntax, it's just to convey the semantics
>     };
>   }

I'm not sure what this example is supposed to say, since `that` is only defined
inside the body of a pattern method.  Are you trying to do child-extraction as
a pattern, rather than as an accessor?  (This is a modeling question.)  I'm not
sure this is a great modeling for a Tree, but let's look past that.  If so,
Tree needs an _abstract pattern_ that binds a List.  That's easy:

    interface Tree {
        public __inverse Tree withChildren(List<Tree> children);
    }

and the subclasses can each override it:

    class Empty implements Tree {
        public __inverse Tree withChildren(List<Tree> children) {
            __yield Collections.emptyList();
        }
    }
    ...

and the client can take an arbitrary Tree and match it:

    case Tree.withChildren(var children) -> ...

So I don't see that this doesn't work, but I think I see where you got
confused.

> Here, we want to call an abstract pattern method that will be implemented
> differently for each subclass, but your proposal does not allow that (sorry
> for the pun).

Yes, it does.  (This conversation would be easier if you could frame this as a
question ("Can I ...") rather than a statement ("It is not possible...") which
turns out to be incorrect.)

> Inside a pattern, there are two implicit values, we have 'this' as usual and
> we have 'that' (we call it that way) that represents the value actually
> matched.

Correct.  Let's talk about the role of these two context variables.

Every pattern has a match candidate.  This is the thing on the LHS of the
instanceof, or the selector in the switch.  It is the thing about which we ask
"does the thing match the pattern."

Every pattern has a _primary type_.  It is the minimal type for which the match
candidate could possibly match the pattern.  For a record pattern like
`Point(int x, int y)`, the primary type is Point.  (A pattern is rejected at
compile time as inapplicable if the type of the match candidate is not
cast-convertible to the primary type of the pattern.)

In the body of a pattern method, the match candidate is denoted with the
context variable `that`, whose type is the primary type of the pattern.  The
compiler may have to make up some of the difference between the type of the
match candidate and the primary type:

    Object o = ...
    switch (o) {
        case Foo(int x) -> ...
    }

Here, the primary type of the Foo pattern is Foo, so to test if the case
matches, the compiler inserts an `instanceof Foo`, and if that succeeds, casts
`o` to `Foo`, and invokes the Foo pattern with that.

Not every pattern has a receiver, just like not every method has a receiver.
Constructors and instance methods have receivers; same with their pattern
counterparts.  For deconstructors, both the receiver and the match candidate
are the same object.  This is not true for all instance patterns.

A receiver plays two roles in a pattern match, just as it does in a method
invocation:

 - Finding the code to invoke by searching the class hierarchy
 - Associating the implementing code with the state of the object, in case the
   implementation of the pattern needs some state from the object that
   declares it

Let's go through two examples to see the cases.

An easy example is regular expressions.  We have a class j.u.regex.Pattern,
which represents a compiled regex.  A regular expression match is a form of
pattern match (there's a match candidate, it is conditional, if it succeeds we
extract the capture groups.)  Surely we should expose a "match" pattern on
Pattern.

    class Pattern {
        public __inverse String regexMatch(String... groups) {
            Matcher m = matcher(that);
            if (m.matches())
                __yield IntStream.rangeClosed(1, m.groupCount())
                                 .mapToObj(m::group)
                                 .toArray(String[]::new);
        }
    }

We match it with an explicit receiver:

    final Pattern As = Pattern.compile("([aA]*)");
    ...
    if (aString instanceof As.regexMatch(String as)) { ... }

The body uses both `this` and `that`.  When it goes to do the actual matching,
it takes the match candidate, `that`, and passes it to `matcher()`; we are
matching against the match candidate, not the receiver.
But it also uses the receiver in the same line of code, quietly; the locution
`matcher(that)` is really `this.matcher(that)`.  It is using the state of _this
regex_ to determine the match logic.  The pattern needs both, and they are
different objects.

In our `instanceof` test, there are two "parameters", though neither of them
looks like one: the match candidate (on the LHS of the instanceof) and the
receiver.  These are packaged up as `that` and `this` for the pattern
invocation.

The other example is a conditional behavior on an object, such as "does this
List have any elements, and if so, give me one."  We put an abstract pattern on
List:

    interface List<T> {
        public __inverse List<T> withElement(T element);
    }

(It could also be a default pattern; works the same as default methods.)  The
implementation in emptyList always fails.  The implementation in ArrayList
might look like:

    public __inverse List<T> withElement(T element) {
        if (that.size > 0)
            __yield that.elements[0];
    }

Now, implementing this guy gets tricky, since we have two context variables
which are both of the same type, ArrayList<T>.  (Maybe we have to explicitly
use a covariant "override" here; TBD.)  But as it turns out, the two will
usually be the same object:

    switch (aCollection) {
        case List.withElement(var t): ...
    }

How does this match work?  Well, the primary type of List.withElement is List,
so the compiler tests `aCollection instanceof List`, and if so, casts the match
candidate to List.  Since there is no explicit receiver, it uses the match
candidate as the receiver also (this is like an unbound method reference), and
does the virtual method search, and finds ArrayList::withElement, and invokes
it.  Different types of collections will use different implementations of the
pattern.
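
The dispatch steps just described can be imitated in current Java to make them concrete. This is a rough sketch under stated assumptions, not the proposed feature: `Seq`, `EmptySeq`, `ArraySeq` and `UnboundMatchDemo` are hypothetical names, and the conditional pattern is emulated as an ordinary instance method returning `Optional` (empty meaning the match fails).

```java
import java.util.List;
import java.util.Optional;

// Hypothetical List-like type; withElement() emulates the abstract pattern.
interface Seq<T> {
    Optional<T> withElement();
}

final class EmptySeq<T> implements Seq<T> {
    // The "pattern" never matches on an empty sequence.
    public Optional<T> withElement() { return Optional.empty(); }
}

final class ArraySeq<T> implements Seq<T> {
    private final List<T> elements;
    ArraySeq(List<T> elements) { this.elements = List.copyOf(elements); }
    // Matches only when there is at least one element, binding the first.
    public Optional<T> withElement() {
        return elements.isEmpty() ? Optional.empty() : Optional.of(elements.get(0));
    }
}

class UnboundMatchDemo {
    // Roughly what an unbound `case Seq.withElement(var t)` would have to do.
    static String describe(Object candidate) {
        if (candidate instanceof Seq<?> seq) {        // test against the primary type, then cast
            Optional<?> bound = seq.withElement();    // candidate doubles as receiver: virtual dispatch
            if (bound.isPresent()) {
                return "element: " + bound.get();
            }
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new ArraySeq<>(List.of("a", "b"))));
        System.out.println(describe(new EmptySeq<String>()));
        System.out.println(describe("not a Seq"));
    }
}
```

The call site never names `ArraySeq` or `EmptySeq`; which implementation runs is decided by the runtime class of the match candidate, as with any virtual method call.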

> Now, to finish the example, using '::' instead of '.', children in the first
> example should be written like this

Remember you're not supposed to use words like "should" ;)

>   static List<Tree> children(Tree tree) {
>     return switch(tree) {
>       case Tree::extract(List<Tree> list) -> list;

case Tree.extract, but yes.

> I really think that not using 'that' as the receiver when calling an inverse
> instance method is a missing opportunity because without that (again :) ),
> there is no way to call an inverse abstract method, so no way to pattern
> match on an open hierarchy.

Hopefully I've cleared up part of the confusion; there are two ways to denote
an instance pattern in a match: bound and unbound, and when it is unbound, it
uses the match candidate as the receiver.

So if your statement is "there should also be a way to ...", it is correct, but
if your statement is "the receiver must be the match candidate", then that is
catastrophically wrong, because then you can't do regex, type class witnesses,
pattern objects, etc.

From gavin.bierman at oracle.com Fri Jan 26 12:36:22 2024
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 26 Jan 2024 12:36:22 +0000
Subject: Towards member patterns
In-Reply-To: <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com> <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <5E04337E-58EF-49C3-8A8B-CFD73FF9E270@oracle.com>

Hi Remi,

On 26 Jan 2024, at 11:08, Remi Forax wrote:

Let's retry.

I think your proposal solves the cases where the type you are switching on is
closed (final, sealed) but not if the type is open (non-sealed).

Let's take an example, let's suppose I've the following hierarchy

  public sealed interface Tree {
    static Tree none() { return None.NONE; }
    static Tree cons(Tree tree) { return new Cons(tree); }
  }
  private enum None implements Tree { NONE }
  private class Cons implements Tree {
    private final Tree tree;

    private Cons(Tree tree) { this.tree = tree; }
  }

If I want to have a static method children that returns all the children of
the Tree, using the pattern matching I would like to write

  static List<Tree> children(Tree tree) {
    return switch(tree) {
      case Tree.none() -> List.of();
      case Tree.cons(Tree child) -> List.of(child);
    };
  }

And inside Tree, I can add the following inverse methods

  static inverse Tree none() { if (that == None.NONE) __yield (); }
  static inverse Tree cons(Tree tree) { if (that instanceof Cons cons) __yield (cons.tree); }

As I said, it works great with a closed hierarchy, but now let's suppose the
hierarchy is not sealed, if the hierarchy is not sealed, having static
factories makes less sense because we do not know all the subtypes.

So we have

  public interface Tree {}
  public enum None implements Tree { NONE }
  public class Cons implements Tree {
    private final Tree tree;

    public Cons(Tree tree) { this.tree = tree; }
  }

and in the future, someone may add

  public class Node implements Tree {
    private final Tree left, right;

    public Node(Tree left, Tree right) { this.left = left; this.right = right; }
  }

Because the hierarchy is open, we need to use the late binding here.
So I may rewrite children like this

  static List<Tree> children(Tree tree) {
    return switch(tree) {
      case that.extract(List<Tree> list) -> list;  // wrong syntax, it's just to convey the semantics
    };
  }

Already here I would disagree. I think you have missed the abstraction. You
want all `Tree` instances to support an instance pattern member (I think of an
instance pattern member as a *view*, which I find quite suggestive)?
Then you need to say it, e.g.:

  public interface Tree {
    pattern Parent(List<Tree> children); // all trees can be viewed as a
                                         // parent with children
  }

Now your `None` and `Cons` classes will be required to implement this instance
pattern member, i.e.

  public enum None implements Tree {
    NONE;

    pattern Parent(List<Tree> children) {
      children = List.of();
    }
  }

  public class Cons implements Tree {
    private final Tree tree;

    public Cons(Tree tree) { this.tree = tree; }

    pattern Parent(List<Tree> children) {
      children = List.of(tree);
    }
  }

Then you can rewrite your `children` static method (although it is perhaps a
little defunct):

  static List<Tree> children(Tree tree) {
    return switch(tree) {
      case Parent(List<Tree> children) -> children;
    };
  }

The switch is exhaustive because the `Parent` view is total (by the absence of
the `partial` modifier - maybe we'll insist on `total`, TBD).

Now you can freely extend the hierarchy, and your `children` static method
will work without modification:

  public class Node implements Tree {
    private final Tree left, right;

    public Node(Tree left, Tree right) { this.left = left; this.right = right; }

    pattern Parent(List<Tree> children) {
      children = List.of(left, right);
    }
  }

(Perhaps a better implementation would be to declare a *partial* `Parent` view
and then have `None` fail, but I leave that to your imagination. This approach
still works.)

Or did I misunderstand your example?

Gavin

From brian.goetz at oracle.com Fri Jan 26 13:33:12 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 26 Jan 2024 08:33:12 -0500
Subject: Towards member patterns
In-Reply-To: <294e8d9d-2391-4f30-b91a-f2c83652b4cf@oracle.com>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com> <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr> <294e8d9d-2391-4f30-b91a-f2c83652b4cf@oracle.com>
Message-ID: <7f56d8ae-070c-46b0-8787-6b0609b839d6@oracle.com>

>
> An easy example is regular expressions.
> We have a class j.u.regex.Pattern, which represents a compiled regex.  A
> regular expression match is a form of pattern match (there's a match
> candidate, it is conditional, if it succeeds we extract the capture
> groups.)  Surely we should expose a "match" pattern on Pattern.
>
>     class Pattern {
>         public __inverse String regexMatch(String... groups) {
>             Matcher m = matcher(that);
>             if (m.matches())
>                 __yield IntStream.rangeClosed(1, m.groupCount())
>                                  .mapToObj(m::group)
>                                  .toArray(String[]::new);
>         }
>     }
>
> We match it with an explicit receiver:
>
>     final Pattern As = Pattern.compile("([aA]*)");
>     ...
>     if (aString instanceof As.regexMatch(String as)) { ... }

Note that even in the regex-like examples, there is still virtual dispatch
going on, it's just harder to see because j.u.r.Pattern is a final class.  But
if it were an interface, with multiple implementations, then

    case As.regexMatch(String as)

would do a virtual dispatch _on the regex implementation_, and then hand the
`String` match candidate to it.

From forax at univ-mlv.fr Fri Jan 26 17:07:51 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 26 Jan 2024 18:07:51 +0100 (CET)
Subject: Towards member patterns
In-Reply-To: <294e8d9d-2391-4f30-b91a-f2c83652b4cf@oracle.com>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com> <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr> <294e8d9d-2391-4f30-b91a-f2c83652b4cf@oracle.com>
Message-ID: <405841076.113655778.1706288871551.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Friday, January 26, 2024 1:31:54 PM
> Subject: Re: Towards member patterns

>> I think your proposal solves the cases where the type you are switching on is
>> closed (final, sealed) but not if the type is open (non-sealed).

> A bold claim! Let's see how this stacks up.

>> Let's take an example, let's suppose I've the following hierarchy

>> public sealed interface Tree {

> ... snip ... sealed class, private implementation classes, public static
> factories, public static patterns ... check.

>> If I want to have a static method children that returns all the children of the
>> Tree, using the pattern matching I would like to write

>> static List<Tree> children(Tree tree) {
>> return switch(tree) {
>> case Tree.none() -> List.of();
>> case Tree.cons(Tree child) -> List.of(child);
>> };
>> }

> Full disclosure: we're not totally there yet. This switch isn't (yet)
> exhaustive; we need a way to mark none+cons as being an exhaustive set.

Agree.

> That's on the list, but was looking to sync on the broad strokes first.

>> As I said, it works great with a closed hierarchy, but now let's suppose the
>> hierarchy is not sealed, if the hierarchy is not sealed, having static
>> factories makes less sense because we do not know all the subtypes.

> I don't see this. (As one example, consider List: it is open, yet there are
> static factories like List.of(...)). We had static factories long before we had
> sealed hierarchies.

Yes, here it's one static factory per subtype, which makes little sense since
the number of subtypes is unknown.

> But let's keep going.

>> So we have

>> public interface Tree {}
>> public enum None implements Tree { NONE }
>> public class Cons implements Tree {
>> private final Tree tree;

>> public Cons(Tree tree) { this.tree = tree; }
>> }

>> and in the future, someone may add
>> public class Node implements Tree {
>> private final Tree left, right;

>> public Node(Tree left, Tree right) { this.left = left; this.right = right; }
>> }

>> Because the hierarchy is open, we need to use the late binding here.
>> So I may rewrite children like this
>> static List<Tree> children(Tree tree) {
>> return switch(tree) {
>> case that.extract(List<Tree> list) -> list; // wrong syntax, it's just to
>> convey the semantics
>> };
>> }

> I'm not sure what this example is supposed to say, since `that` is only defined
> inside the body of a pattern method. Are you trying to do child-extraction as a
> pattern, rather than as an accessor? (This is a modeling question.) I'm not
> sure this is a great modeling for a Tree, but let's look past that. If so, Tree
> needs an _abstract pattern_ that binds a List. That's easy:

> interface Tree {
> public __inverse Tree withChildren(List<Tree> children);
> }

> and the subclasses can each override it:

> class Empty implements Tree {
> public __inverse Tree withChildren(List<Tree> children) {
> __yield Collections.emptyList();
> }
> }
> ...

> and the client can take an arbitrary Tree and match it:

> case Tree.withChildren(var children) -> ...

> So I don't see that this doesn't work, but I think I see where you got confused.

>> Here, we want to call an abstract pattern method that will be implemented
>> differently for each subclass, but your proposal does not allow that (sorry
>> for the pun).

> Yes, it does. (This conversation would be easier if you could frame this as a
> question ("Can I ...") rather than a statement ("It is not possible...") which
> turns out to be incorrect.)

>> Inside a pattern, there are two implicit values, we have 'this' as usual and we
>> have 'that' (we call it that way) that represents the value actually matched.

> Correct. Let's talk about the role of these two context variables.

> Every pattern has a match candidate. This is the thing on the LHS of the
> instanceof, or the selector in the switch. It is the thing about which we ask
> "does the thing match the pattern."

> Every pattern has a _primary type_. It is the minimal type for which the match
> candidate could possibly match the pattern.
> For a record pattern like `Point(int x, int y)`, the primary type is Point.
> (A pattern is rejected at compile time as inapplicable if the type of the
> match candidate is not cast-convertible to the primary type of the pattern.)

> In the body of a pattern method, the match candidate is denoted with the context
> variable `that`, whose type is the primary type of the pattern. The compiler
> may have to make up some of the difference between the type of the match
> candidate and the primary type:

> Object o = ...
> switch (o) {
> case Foo(int x) -> ...
> }

> Here, the primary type of the Foo pattern is Foo, so to test if the case
> matches, the compiler inserts an `instanceof Foo`, and if that succeeds, casts
> `o` to `Foo`, and invokes the Foo pattern with that.

> Not every pattern has a receiver, just like not every method has a receiver.
> Constructors and instance methods have receivers; same with their pattern
> counterparts. For deconstructors, both the receiver and the match candidate are
> the same object. This is not true for all instance patterns.

yes,

> A receiver plays two roles in a pattern match, just as it does in a method
> invocation:

> - Finding the code to invoke by searching the class hierarchy
> - Associating the implementing code with the state of the object, in case the
> implementation of the pattern needs some state from the object that declares it

> Let's go through two examples to see the cases.

> An easy example is regular expressions. We have a class j.u.regex.Pattern, which
> represents a compiled regex. A regular expression match is a form of pattern
> match (there's a match candidate, it is conditional, if it succeeds we extract
> the capture groups.) Surely we should expose a "match" pattern on Pattern.

> class Pattern {
> public __inverse String regexMatch(String...
> groups) {
> Matcher m = matcher(that);
> if (m.matches())
> __yield IntStream.rangeClosed(1, m.groupCount())
> .mapToObj(m::group)
> .toArray(String[]::new);
> }
> }

> We match it with an explicit receiver:

> final Pattern As = Pattern.compile("([aA]*)");
> ...
> if (aString instanceof As.regexMatch(String as)) { ... }

> The body uses both `this` and `that`. When it goes to do the actual matching, it
> takes the match candidate, `that`, and passes it to `matcher()`; we are
> matching against the match candidate, not the receiver. But it also uses the
> receiver in the same line of code, quietly; the locution `matcher(that)` is
> really `this.matcher(that)`. It is using the state of _this regex_ to determine
> the match logic. The pattern needs both, and they are different objects.

yes,

> In our `instanceof` test, there are two "parameters", though neither of them
> looks like one: the match candidate (on the LHS of the instanceof) and the
> receiver. These are packaged up as `that` and `this` for the pattern
> invocation.

> The other example is a conditional behavior on an object, such as "does this
> List have any elements, and if so, give me one." We put an abstract pattern on
> List:

> interface List<T> {
> public __inverse List<T> withElement(T element);
> }

> (It could also be a default pattern; works the same as default methods.) The
> implementation in emptyList always fails. The implementation in ArrayList might
> look like:

> public __inverse List<T> withElement(T element) {
> if (that.size > 0)
> __yield that.elements[0];
> }

> Now, implementing this guy gets tricky, since we have two context variables
> which are both of the same type, ArrayList<T>. (Maybe we have to explicitly use
> a covariant "override" here; TBD.)

I do not think allowing covariant override is sound. Because if the method is
unbound, yes, 'this' and 'that' are the same object at runtime, but if 'this'
is bound, these are two different objects so no covariant override should be
allowed.

> But as it turns out, the two will usually be the same object:

see above

> switch (aCollection) {
> case List.withElement(var t): ...
> }

> How does this match work? Well, the primary type of List.withElement is List,
> so the compiler tests `aCollection instanceof List`, and if so, casts the match
> candidate to List. Since there is no explicit receiver, it uses the match
> candidate as the receiver also (this is like an unbound method reference), and
> does the virtual method search, and finds ArrayList::withElement, and invokes
> it. Different types of collections will use different implementations of the
> pattern.

>> Now, to finish the example, using '::' instead of '.', children in the first
>> example should be written like this

> Remember you're not supposed to use words like "should" ;)

I think that using '::' instead of '.' is a great simplification, because it
lets the user specify how the linkage is done, unbound or bound.

Also I do not think it's a good idea to have a syntax which is context
dependent, i.e. Type.method() and Type.method() having different
meaning/linkage semantics inside or outside a Pattern.
Is there another syntactical construct in Java that behaves that way?

>> static List<Tree> children(Tree tree) {
>> return switch(tree) {
>> case Tree::extract(List<Tree> list) -> list;

> case Tree.extract, but yes.

>> I really think that not using 'that' as the receiver when calling an inverse
>> instance method is a missing opportunity because without that (again :) ),
>> there is no way to call an inverse abstract method, so no way to pattern match
>> on an open hierarchy.

> Hopefully I've cleared up part of the confusion; there are two ways to denote an
> instance pattern in a match: bound and unbound, and when it is unbound, it uses
> the match candidate as the receiver.
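
For readers less familiar with the bound/unbound distinction being borrowed here, this is how it already works for ordinary method references in Java. The sketch uses only existing `java.util.regex` and `java.util.function` APIs; the class name `BoundUnboundDemo` and the field names are invented for the example, and nothing in it is pattern syntax.

```java
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundUnboundDemo {
    static final Pattern AS = Pattern.compile("([aA]*)");

    // Bound: the receiver (AS) is fixed when the reference is created;
    // only the argument remains to be supplied.
    static final Function<CharSequence, Matcher> BOUND = AS::matcher;

    // Unbound: the receiver becomes the first argument at the call site.
    static final BiFunction<Pattern, CharSequence, Matcher> UNBOUND = Pattern::matcher;

    public static void main(String[] args) {
        System.out.println(BOUND.apply("aAa").matches());
        System.out.println(UNBOUND.apply(AS, "aAa").matches());
        System.out.println(BOUND.apply("abc").matches());
    }
}
```

In the regex case the receiver (the compiled regex) and the argument (the string being tested) are different objects, which is the situation a bound instance pattern covers; the unbound form corresponds to the case where the match candidate itself supplies the receiver.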
> So if your statement is "there should also be a way to ...", it is correct, but > if your statement is "the receiver must be the match candidate", then that is > catastrophically wrong, because then you can't do regex, type class witnesses, > pattern objects, etc. If the section "Pattern resolution" is rewritten in terms of bound and unbound methods, I agree. And as a request, i would like you to reconsider your position about not piggybacking the linkage of a pattern to the method reference semantics, which has the advantage of already existing and being explicit about what is bounded and what is not ? R?mi -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Jan 26 17:30:33 2024 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 26 Jan 2024 12:30:33 -0500 Subject: Towards member patterns In-Reply-To: <405841076.113655778.1706288871551.JavaMail.zimbra@univ-eiffel.fr> References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com> <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr> <294e8d9d-2391-4f30-b91a-f2c83652b4cf@oracle.com> <405841076.113655778.1706288871551.JavaMail.zimbra@univ-eiffel.fr> Message-ID: <08049a21-06f5-4bde-81de-6ada6f462df4@oracle.com> > > (It could also be a default pattern; works the same as default > methods.)? The implementation in emptyList always fails.? The > implementation in ArrayList might look like: > > ??? public __inverse List withElement(T element) { > ??????? if (that.size > 0) > ??????????? _yield that.elements[0]; > ??? } > > Now, implementing this guy gets tricky, since we have two context > variables which are both of the same type, ArrayList.? (Maybe > we have to explicitly use a covariant "override" here; TBD.) > > > I do not think allowing covariant override is sound. 
Because if the
> method is unbound, yes, 'this' and 'that' are the same object at
> runtime, but if 'this' is bound, these are two different objects so
> no covariant override should be allowed.

Maybe, but let's checkpoint first, because we're in the weeds, and before we finish designing this corner of the story, let's first sync on the big picture.

You were concerned that there was no way to have, say, a List implementation provide its own implementation of a pattern. I've shown that the design allows for that, that the machinery is in place, and that there are still a few details to work out. Do we agree that your "all the patterns are static" was a misunderstanding?

> I think that using '::' instead of '.' is a great simplification,
> because it lets the user specify how the linkage is done, unbound or
> bound.

Let's leave syntax aside, but are you saying that both bound and unbound seem desirable, and there should be a way to explicitly specify which it is? Or are you making a different point?

> Also I do not think it's a good idea to have a syntax which is context
> dependent, i.e. Type.method() and Type.method() having different
> meaning/linkage semantics inside or outside a Pattern.
> Is there another syntactic construct in Java that behaves that way?

Please, let's sync on the model before we turn to syntax.
> If the section "Pattern resolution" is rewritten in terms of bound and
> unbound methods, I agree

OK, so where I think we are is:

 - We basically agree that each of { deconstructor, static pattern, instance pattern } are useful and have valid use cases, and to the extent possible, we should be guided by the duality with { constructor, static method, instance method }
 - We basically agree that at the use site, we have all the qualification modes we have with methods { constructor qualified with package, static member qualified with type, instance member qualified with receiver }, and possibly, an additional mode of "instance qualified by type"
 - The details of the last one are not fully worked out, that's fine
 - The details of exhaustiveness are not fully worked out, that's fine

Is that about right?

From forax at univ-mlv.fr Fri Jan 26 17:25:33 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 26 Jan 2024 18:25:33 +0100 (CET)
Subject: Towards member patterns
In-Reply-To: <5E04337E-58EF-49C3-8A8B-CFD73FF9E270@oracle.com>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com> <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr> <5E04337E-58EF-49C3-8A8B-CFD73FF9E270@oracle.com>
Message-ID: <726451882.113680657.1706289933035.JavaMail.zimbra@univ-eiffel.fr>

> From: "Gavin Bierman"
> To: "Remi Forax"
> Cc: "Brian Goetz" , "amber-spec-experts"
> Sent: Friday, January 26, 2024 1:36:22 PM
> Subject: Re: Towards member patterns

> Hi Remi,

>> On 26 Jan 2024, at 11:08, Remi Forax wrote:

>> Let's retry.

>> I think your proposal solves the cases where the type you are switching on is
>> closed (final, sealed) but not if the type is open (non-sealed).
>> Let's take an example; let's suppose I have the following hierarchy

>> public sealed interface Tree {
>>   static Tree none() { return None.NONE; }
>>   static Tree cons(Tree tree) { return new Cons(tree); }
>> }
>> private enum None implements Tree { NONE }
>> private class Cons implements Tree {
>>   private final Tree tree;
>>   private Cons(Tree tree) { this.tree = tree; }
>> }

>> If I want to have a static method children that returns all the children of the
>> Tree, using pattern matching I would like to write

>> static List children(Tree tree) {
>>   return switch(tree) {
>>     case Tree.none() -> List.of();
>>     case Tree.cons(Tree child) -> List.of(child);
>>   };
>> }

>> And inside Tree, I can add the following inverse methods

>> static inverse Tree none() { if (that == None.NONE) __yield (); }
>> static inverse Tree cons(Tree tree) { if (that instanceof Cons cons) __yield
>> (cons.tree); }

>> As I said, it works great with a closed hierarchy, but now let's suppose the
>> hierarchy is not sealed. If the hierarchy is not sealed, having static
>> factories makes less sense because we do not know all the subtypes. So we have

>> public interface Tree {}
>> public enum None implements Tree { NONE }
>> public class Cons implements Tree {
>>   private final Tree tree;
>>   public Cons(Tree tree) { this.tree = tree; }
>> }

>> and in the future, someone may add

>> public class Node implements Tree {
>>   private final Tree left, right;
>>   public Node(Tree left, Tree right) { this.left = left; this.right = right; }
>> }

>> Because the hierarchy is open, we need late binding here.
>> So I may rewrite children like this

>> static List children(Tree tree) {
>>   return switch(tree) {
>>     case that.extract(List list) -> list; // wrong syntax, it's just to convey
>>                                           // the semantics
>>   };
>> }

> Already here I would disagree. I think you have missed the abstraction.
You > want all `Tree` instances to support an instance pattern member (I think of an > instance pattern member as a *view*, which I find quite suggestive)? Then you > need to say it, e.g.: > public interface Tree { > pattern Parent(List children); // all trees can be viewed as a > // parent with children > } > Now your `None` and `Cons` classes will be required to implement this instance > pattern member, i.e. > public enum None implements Tree { NONE > pattern Parent(List children) { > children = List.of(); > } > } > public class Cons implements Tree { > private final Tree tree; > public Cons(Tree tree) { this.tree = tree; } > pattern Parent(List children) { > children = List.of(tree); > } > } > Then you can rewrite your `children` static method (although it is perhaps a > little defunct): > static List children(Tree tree) { > return switch(tree) { > case Parent(List children) -> children; > }; > } > The switch is exhaustive because the `Parent` view is total (by the absence of > the `partial` modifier - maybe we'll insist on `total`, TBD). > Now you can freely extend the hierarchy, and your `children` static method will > work without modification: > public class Node implements Tree { > private final Tree left, right; > public Node(Tree left, Tree right) { this.left = left; this.right = right; } > pattern Parent(List children) { > children = List.of(left, right); > } > } > (Perhaps a better implementation would be to declare a *partial* `Parent` view > and then have `None` fail, but I leave that to your imagination. This approach > still works.) > Or did I misunderstand your example? I don't think so,you got it. What was not clear is how the compiler links the pattern Parent to the pattern method Parent(List) inside Tree. Thanks to the mail of Brian, the answer is either "case Tree.Parent(List children) -> ..." or "case tree.Parent(List children) -> ...", both will work. > Gavin R?mi -------------- next part -------------- An HTML attachment was scrubbed... 
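For comparison, the late-bound, total `Parent` view that Gavin describes can be approximated in Java 21 without any new syntax, by pairing an ordinary instance method with a record pattern. This is only a sketch of that emulation, not the proposed feature: the `asParent()` method and the `Parent` record are hypothetical stand-ins for the `pattern Parent(...)` member, and `Trees` is an illustrative holder class.

```java
import java.util.List;

// Every tree can be viewed as a parent with children. The abstract method
// stands in for the abstract instance pattern member.
interface Tree {
    Parent asParent();
}

// Hypothetical carrier for the bindings of the "Parent" view.
record Parent(List<Tree> children) {}

enum None implements Tree {
    NONE;
    @Override public Parent asParent() { return new Parent(List.of()); }
}

final class Cons implements Tree {
    private final Tree tree;
    Cons(Tree tree) { this.tree = tree; }
    @Override public Parent asParent() { return new Parent(List.of(tree)); }
}

// A subtype added later; children() below needs no modification.
final class Node implements Tree {
    private final Tree left, right;
    Node(Tree left, Tree right) { this.left = left; this.right = right; }
    @Override public Parent asParent() { return new Parent(List.of(left, right)); }
}

final class Trees {
    static List<Tree> children(Tree tree) {
        // Virtual dispatch on asParent() picks the implementation; the
        // record pattern is exhaustive because Parent is a record and the
        // nested type pattern covers its component type.
        return switch (tree.asParent()) {
            case Parent(List<Tree> children) -> children;
        };
    }
}
```

The emulation allocates an intermediate `Parent` object and cannot express a partial view, two of the things a real instance pattern would presumably handle differently.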
From forax at univ-mlv.fr Fri Jan 26 17:54:52 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 26 Jan 2024 18:54:52 +0100 (CET)
Subject: Towards member patterns
In-Reply-To: <08049a21-06f5-4bde-81de-6ada6f462df4@oracle.com>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com> <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr> <294e8d9d-2391-4f30-b91a-f2c83652b4cf@oracle.com> <405841076.113655778.1706288871551.JavaMail.zimbra@univ-eiffel.fr> <08049a21-06f5-4bde-81de-6ada6f462df4@oracle.com>
Message-ID: <1527973645.113720588.1706291692888.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Friday, January 26, 2024 6:30:33 PM
> Subject: Re: Towards member patterns

>>> (It could also be a default pattern; works the same as default methods.) The
>>> implementation in emptyList always fails. The implementation in ArrayList might
>>> look like:

>>> public __inverse List withElement(T element) {
>>>   if (that.size > 0)
>>>     __yield that.elements[0];
>>> }

>>> Now, implementing this guy gets tricky, since we have two context variables
>>> which are both of the same type, ArrayList. (Maybe we have to explicitly use
>>> a covariant "override" here; TBD.)

>> I do not think allowing covariant override is sound. Because if the method is
>> unbound, yes, 'this' and 'that' are the same object at runtime, but if 'this'
>> is bound, these are two different objects so no covariant override should be
>> allowed.

[...]

> Please, let's sync on the model before we turn to syntax.
>> If the section "Pattern resolution" is rewritten in terms of bound and unbound
>> methods, I agree

> OK, so where I think we are is:

> - We basically agree that each of { deconstructor, static pattern, instance
> pattern } are useful and have valid use cases, and to the extent possible, we
> should be guided by the duality with { constructor, static method, instance
> method }

Yes for the former. For the latter, the syntax favors the relationship between the pattern and the corresponding method declaration over the consistency between the declaration and the body of the method. So I suppose it depends on how the body is specified; nevertheless, it's an interesting idea.

> - We basically agree that at the use site, we have all the qualification modes
> we have with methods { constructor qualified with package, static member
> qualified with type, instance member qualified with receiver }, and possibly,
> an additional mode of "instance qualified by type"

For me the special mode is more 'instance member qualified with receiver', but yes.

> - The details of the last one are not fully worked out, that's fine
> - The details of exhaustiveness are not fully worked out, that's fine

> Is that about right?

Yes, and one more item can be added to that list: how to send parameters to the pattern method (and whether this is something we should support).

Rémi
From brian.goetz at oracle.com Fri Jan 26 18:11:29 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 26 Jan 2024 13:11:29 -0500
Subject: Towards member patterns
In-Reply-To: <1527973645.113720588.1706291692888.JavaMail.zimbra@univ-eiffel.fr>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com> <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr> <294e8d9d-2391-4f30-b91a-f2c83652b4cf@oracle.com> <405841076.113655778.1706288871551.JavaMail.zimbra@univ-eiffel.fr> <08049a21-06f5-4bde-81de-6ada6f462df4@oracle.com> <1527973645.113720588.1706291692888.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <155fa139-9391-473e-9593-fef57d45f923@oracle.com>

>     - We basically agree that at the use site, we have all the
>       qualification modes we have with methods { constructor qualified
>       with package, static member qualified with type, instance member
>       qualified with receiver }, and possibly, an additional mode of
>       "instance qualified by type"
>
> for me the special mode is more 'instance member qualified with
> receiver' but yes,

I think you'll find once you start writing code with it that the receiver-qualified version is more common than you think. And, if we only could have one, it has to be the receiver-qualified one, since you can get to the "unbound" version by specifying the receiver explicitly (use the match candidate as an explicit receiver), but you can't go the other way -- if all you have is unbound, you can't do regular expressions, you can't do type classes, you can't do anything virtual where the pattern operates on something other than itself. So receiver-qualified is the "primitive" and the other is a convenience syntax that we can layer on top of it if we need.

>     - The details of the last one are not fully worked out, that's fine
>     - The details of exhaustiveness are not fully worked out, that's fine
>
>     Is that about right?
>
> yes,
>
> and there is also how to send parameters to the pattern method (and if
> this is something we should support) that can be added to that list.

This one is interesting. I was convinced at first we needed it, but I now think we can do without it (in this form), and we can get the effect of it in a more organic way.

In any case, glad that we're on the same page. More docs coming.

From forax at univ-mlv.fr Fri Jan 26 19:00:40 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 26 Jan 2024 20:00:40 +0100 (CET)
Subject: Towards member patterns
In-Reply-To: <155fa139-9391-473e-9593-fef57d45f923@oracle.com>
References: <93f7537a-e744-49ac-a9ad-ed205b947873@oracle.com> <713930030.113261911.1706267282370.JavaMail.zimbra@univ-eiffel.fr> <294e8d9d-2391-4f30-b91a-f2c83652b4cf@oracle.com> <405841076.113655778.1706288871551.JavaMail.zimbra@univ-eiffel.fr> <08049a21-06f5-4bde-81de-6ada6f462df4@oracle.com> <1527973645.113720588.1706291692888.JavaMail.zimbra@univ-eiffel.fr> <155fa139-9391-473e-9593-fef57d45f923@oracle.com>
Message-ID: <38370386.113848136.1706295640566.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Friday, January 26, 2024 7:11:29 PM
> Subject: Re: Towards member patterns

>>> - We basically agree that at the use site, we have all the qualification modes
>>> we have with methods { constructor qualified with package, static member
>>> qualified with type, instance member qualified with receiver }, and possibly,
>>> an additional mode of "instance qualified by type"

>> for me the special mode is more 'instance member qualified with receiver' but
>> yes,

> I think you'll find once you start writing code with it that the
> receiver-qualified version is more common than you think.
And, if we only could
> have one, it has to be the receiver-qualified one, since you can get to the
> "unbound" version by specifying the receiver explicitly (use the match
> candidate as an explicit receiver), but you can't go the other way -- if all
> you have is unbound, you can't do regular expressions, you can't do type
> classes, you can't do anything virtual where the pattern operates on something
> other than itself. So receiver-qualified is the "primitive" and the other is a
> convenience syntax that we can layer on top of it if we need.

By special mode, I mean having the bound receiver specified explicitly, even if it is 'this'.

>>> - The details of the last one are not fully worked out, that's fine
>>> - The details of exhaustiveness are not fully worked out, that's fine

>>> Is that about right?

>> yes,

>> and there is also how to send parameters to the pattern method (and if this is
>> something we should support) that can be added to that list.

> This one is interesting. I was convinced at first we needed it, but I now think
> we can do without it (in this form), and we can get the effect of it in a more
> organic way.

> In any case, glad that we're on the same page. More docs coming.

It just occurs to me that there is another hurdle to figure out: annotations.

Annotations are not really reversible. The way you specify annotations on a return type (on the method) and on parameter types is different; it's not the same @Target.

If you have an annotation that logs the return value (a classical AOP exercise), then for an inverse method, do you put it on the parameters (and in that case, where?) or on the return type, knowing that it is in fact a parameter?

And you have the explicit "this" (to be able to annotate it), which on an inverse method is a true parameter, unlike the others.

class Foo {
  inverse List bar(Foo this, Baz baz) { ... }
}

regards,
Rémi
From daniel.smith at oracle.com Fri Jan 26 21:48:18 2024
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 26 Jan 2024 21:48:18 +0000
Subject: Field initialization before 'super'
In-Reply-To:
References:
Message-ID: <34B96F64-8F04-4CF2-9FD5-EAB2868EE019@oracle.com>

Having worked through the JLS changes (which I'll be sharing at some point soon), here are a few extra details that I think make the most sense:

> On Dec 12, 2023, at 4:27 PM, Dan Smith wrote:
>
> To enable and take advantage of early field initialization, we've envisioned the following changes:
>
> 1) As an exception to the general rule about 'this' usage, a "pre-construction context" allows writes to blank instance fields of the class. (The terminology may need updating, since you're clearly "constructing" the object if you're writing to its fields.) The fields are "write-only" at this stage; you can write into them but can't read them back.

Terminology: maybe "early construction context" rather than "pre-construction context"?

Enabling the capability to write to *blank* instance fields, but not fields with instance initializers, sidesteps any confusion about the timing of initializer execution, while still giving programmers the capability to prevent any unwanted reads of a field's default (null/zero) value, whether the field is final or not. (Would it be nice if fields with initializers could also, in some cases, be initialized early? Yes, but it's an incompatible change, so more work needed to navigate that problem.)

> At a 'this()' call, all final fields must be DU (because the delegated constructor will perform its own writes). No such restriction is needed for non-final fields; but it's an open question whether we should prohibit all writes before 'this()' anyway.

In the interest of not making arbitrary language rules, I prefer not to special-case the early construction context of a 'this'-calling constructor, other than introducing the DU rule for final fields.
> Writes to non-final fields with initializers are disallowed, to avoid confusion about sequencing (the field initializer will always run later, overwriting whatever you put in the constructor prologue.) Yeah, this was the main concern about assignments before 'this()' calls. If you can't write to fields with initializers, then the timing of writes to a mutable field should be clear from the code. Whether it's a good way to structure a program is a stylistic choice. --- And some comments about other proposed features: > 2) If a final field is written before 'super()' via every constructor in the class, it can be considered a "strict final" field. It will never be observed to mutate. > > In the class file, ACC_STRICT is repurposed to indicate a strict final field. javac is responsible for identifying strict final fields. Existing early-initialized capture fields can probably be automatically counted as strict finals. > > ACC_STRICT implies ACC_FINAL and !ACC_STATIC. Verification ensures that a 'putfield' for an ACC_STRICT field of the current class never occurs after the 'super()' call. (Specifically, the receiver type for the putfield must be 'uninitializedThis', not a class type.) Although there are limits to how much we can do with the flag just yet (see below), I think it probably makes sense to identify these fields in class files as ACC_STRICT as part of this JEP. > 3) Immutability of strict finals is a strong guarantee. This jumps the gun. ACC_STRICT is a claim about local code: that the field will not be mutated by the code of the class after the 'super()' call. A global claim about immutability relies on other integrity properties of the JDK as a whole. There's a path to getting those integrity properties, but it's beyond the scope of this JEP. > JVM internals may treat strict final fields as truly immutable, without supporting any deopt paths when unexpected mutation occurs. 
This will be true only after we can make a global claim about immutability (ACC_STRICT, so not mutated by the class itself; plus all off-label mutation paths have been blocked or disavowed.) > The 'Field.setAccessible' method, which provides a standard API mechanism for mutating final fields, considers strict finals to be "non-modifiable", and will not enable reflective writes. (It already does the same for record fields.) This felt too ad hoc. A better path is to follow in the footsteps of "Prepare to Restrict the Use of JNI" (https://openjdk.org/jeps/8307341), gradually limiting the use of 'setAccessible' for final field mutation (ACC_STRICT or not) to users who explicitly opt in. That would be a separate effort. > Standard deserialization ensures strict finals are set, and so their values deserialized, before the object under construction is leaked to any user code. This probably means back references to an object from its own strict final fields are unsupported, and deserialize to 'null'. (Records already behave in this way.) I think this would be nice to have, paired with ACC_STRICT, but we'll see whether we can get there or not. If not, it's an improvement that can come later. > Unsafe and JNI are capable of performing arbitrary, type-unsafe modifications to field storage. Clients who modify strict finals do so at their own risk, and JVM optimizations won't try to account for such usage. I noted the attempts to restrict use of JNI above; but in any case, I think this statement is still fair: JVM internals do not need to account for misuse of Unsafe/JNI. > - Can javac check for me that my fields are strict? I heard a proposed rule that if any constructor early-initializes a final field, the field should automatically be ACC_STRICT and an error should occur if a different constructor fails to early initialize the field. I think that gets the new capability backward: it's not that we want to give people a new kind of field. 
Instead, we want the code in constructor prologues to have the flexibility to write to fields of 'this', because a broad prohibition on all uses of 'this' (including field writes, type variable use, enclosing instance access) is too blunt a restriction. *Secondarily*, we can notice that certain field initializations follow a pattern that represents a useful property, and we can document that in class files. It's reasonable to want a feature that, as a matter of good style, checks fields for early initialization. But that capability isn't the core of the feature, and because there are various ways to get there, we're leaving it for Phase 2. From forax at univ-mlv.fr Sat Jan 27 08:00:20 2024 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 27 Jan 2024 09:00:20 +0100 (CET) Subject: Field initialization before 'super' In-Reply-To: <34B96F64-8F04-4CF2-9FD5-EAB2868EE019@oracle.com> References: <34B96F64-8F04-4CF2-9FD5-EAB2868EE019@oracle.com> Message-ID: <1791381717.113952670.1706342420387.JavaMail.zimbra@univ-eiffel.fr> ----- Original Message ----- > From: "daniel smith" > To: "amber-spec-experts" > Sent: Friday, January 26, 2024 10:48:18 PM > Subject: Re: Field initialization before 'super' > Having worked through the JLS changes (which I'll be sharing at some point > soon), here are a few extra details that I think make the most sense: > >> On Dec 12, 2023, at 4:27?PM, Dan Smith wrote: >> >> To enable and take advantage of early field initialization, we've envisioned the >> following changes: >> >> 1) As an exception to the general rule about 'this' usage, a "pre-construction >> context" allows writes to blank instance fields of the class. (The terminology >> may need updating, since you're clearly "constructing" the object if you're >> writing to its fields.) The fields are "write-only" at this stage?you can write >> into them but can't read them back. > > Terminology: maybe "early construction context" rather than "pre-construction > context"? 
> > Enabling the capability to write to *blank* instance fields, but not fields with > instance initializers, sidesteps any confusion about the timing of initializer > execution, while still giving programmers the capability to prevent any > unwanted reads of a field's default (null/zero) value, whether the field is > final or not. (Would it be nice if fields with initializers could also, in some > cases, be initialized early? Yes, but it's an incompatible change, so more work > needed to navigate that problem.) agree, > >> At a 'this()' call, all final fields must be DU (because the delegated >> constructor will perform its own writes). No such restriction is needed for >> non-final fields; but it's an open question whether we should prohibit all >> writes before 'this()' anyway. > > In the interest of not making arbitrary language rules, I prefer not to > special-case the early construction context of a 'this'-calling constructor, > other than introducing the DU rule for final fields. agree, > >> Writes to non-final fields with initializers are disallowed, to avoid confusion >> about sequencing (the field initializer will always run later, overwriting >> whatever you put in the constructor prologue.) > > Yeah, this was the main concern about assignments before 'this()' calls. If you > can't write to fields with initializers, then the timing of writes to a mutable > field should be clear from the code. Whether it's a good way to structure a > program is a stylistic choice. > > --- > > And some comments about other proposed features: > >> 2) If a final field is written before 'super()' via every constructor in the >> class, it can be considered a "strict final" field. It will never be observed >> to mutate. >> >> In the class file, ACC_STRICT is repurposed to indicate a strict final field. >> javac is responsible for identifying strict final fields. Existing >> early-initialized capture fields can probably be automatically counted as >> strict finals. 
>> >> ACC_STRICT implies ACC_FINAL and !ACC_STATIC. Verification ensures that a >> 'putfield' for an ACC_STRICT field of the current class never occurs after the >> 'super()' call. (Specifically, the receiver type for the putfield must be >> 'uninitializedThis', not a class type.) agree > > Although there are limits to how much we can do with the flag just yet (see > below), I think it probably makes sense to identify these fields in class files > as ACC_STRICT as part of this JEP. > >> 3) Immutability of strict finals is a strong guarantee. > > This jumps the gun. > > ACC_STRICT is a claim about local code: that the field will not be mutated by > the code of the class after the 'super()' call. > > A global claim about immutability relies on other integrity properties of the > JDK as a whole. There's a path to getting those integrity properties, but it's > beyond the scope of this JEP. > I disagree here. As John said, the aim here is safe publication of 'this', under any conditions, i.e. even if 'this' leaked from the constructor. Currently, we have safe publication if 'this' does not leak from the constructor, the idea is with strict final fields is that that it give you the guarantee of safe publication even if 'this' is leaked. I think the JEP should be reworded so allowing codes before super() has two advantages, early checks before field assignments and safe publication of 'this'. For the latter, the compiler should not compile - if one final field is strict but other final fields are not (we still allow all final fields to be non strict for backward compatibility) - the super class final fields should be strict (this one hamper migration of the subclasses without changing of the superclass but it's a necessary evil IMO to get safe publication guarantee) >> JVM internals may treat strict final fields as truly immutable, without >> supporting any deopt paths when unexpected mutation occurs. 
> > This will be true only after we can make a global claim about immutability > (ACC_STRICT, so not mutated by the class itself; plus all off-label mutation > paths have been blocked or disavowed.) > As said above, i prefer to think in terms of safe publication of this than in terms of immutability. Using immutability here is not quite right, because it's only shallow immutability, unmodifiability is better term and this is not exactly unmodifiability because we also allow non final fields. >> The 'Field.setAccessible' method, which provides a standard API mechanism for >> mutating final fields, considers strict finals to be "non-modifiable", and will >> not enable reflective writes. (It already does the same for record fields.) > > This felt too ad hoc. > > A better path is to follow in the footsteps of "Prepare to Restrict the Use of > JNI" (https://openjdk.org/jeps/8307341), gradually limiting the use of > 'setAccessible' for final field mutation (ACC_STRICT or not) to users who > explicitly opt in. That would be a separate effort. > I think it's simpler than that. Strict final field means there is no way see a field with another value than the value set before calling the super constructor. So let's not add special way to escape that guarantee. setAccessible() should not allow to modify a strict field. What we can do later is to retrofit the fields of records and hidden classes to be strict (and also add name and ordinal of java.lang.Enum). So instead of having an hadoc list of when a final field is writable by reflection or not we only have one condition which is if the final field is strict. But this is a separate effort. >> Standard deserialization ensures strict finals are set, and so their values >> deserialized, before the object under construction is leaked to any user code. >> This probably means back references to an object from its own strict final >> fields are unsupported, and deserialize to 'null'. (Records already behave in >> this way.) 
>
> I think this would be nice to have, paired with ACC_STRICT, but we'll see
> whether we can get there or not. If not, it's an improvement that can come
> later.
>

Again, I would like to avoid an asterisk saying that we have safe publication of 'this' except in the case of serialization.

>> Unsafe and JNI are capable of performing arbitrary, type-unsafe modifications to
>> field storage. Clients who modify strict finals do so at their own risk, and
>> JVM optimizations won't try to account for such usage.
>
> I noted the attempts to restrict use of JNI above; but in any case, I think this
> statement is still fair: JVM internals do not need to account for misuse of
> Unsafe/JNI.

To set a field with Unsafe, one needs the offset of that field, and the method that returns a field offset in Unsafe should throw an exception for a strict field, the same way it currently does for record fields.

With that, the compiler, reflection, Unsafe and serialization should all conspire to maintain the guarantee of safe publication of 'this'.

regards,
Rémi
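The hazard that motivates both "early construction" writes and the safe-publication argument can be reproduced in today's Java. The sketch below uses hypothetical classes (`Base`, `Derived`) and no new language feature: the superclass constructor leaks 'this' through an overridable call, and the subclass's final field is observed with its default value.

```java
abstract class Base {
    final String seen;

    Base() {
        // 'this' escapes here: describe() runs in the subclass before the
        // subclass constructor body has assigned any of its fields.
        seen = describe();
    }

    abstract String describe();
}

final class Derived extends Base {
    private final String name;

    Derived(String name) {
        super();            // Base() runs first and reads 'name' as null
        this.name = name;   // too late for the call made inside Base()
    }

    @Override
    String describe() {
        return "name=" + name;
    }
}

final class LeakDemo {
    public static void main(String[] args) {
        Derived d = new Derived("x");
        System.out.println(d.seen);        // prints "name=null"
        System.out.println(d.describe());  // prints "name=x"
    }
}
```

Under the proposal discussed in this thread, the assignment to `name` could be written before the `super()` call, so the call made from `Base()` would see the assigned value rather than the default; that is an expectation drawn from the discussion above, not something this sketch demonstrates.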